Loop to change content of href for many anchors - javascript

The content of my WordPress posts is one big block of markup. It comes from MS Word, so it is text wrapped in nested HTML tags and inline styles.
I have a segment of code that is repeated many times in the content (it represents text footnotes). For the first footnote, for example, this segment is:
<sup><a title="" href="file:///C:/Users/hp/Desktop/file.docx#_ftn1" name="_ftnref1">
<span class="MsoFootnoteReference">
<span dir="LTR">
<span class="MsoFootnoteReference">
<span lang="EN-US" style="font-size: 16pt; line-height: 115%;">
[1]
</span>
</span>
</span>
</span>
</a></sup>
.....
<a title="" href="file:///C:/Users/hp/Desktop/file.docx#_ftnref1" name="_ftn1">
<span class="MsoFootnoteReference">
<span dir="LTR" lang="EN-US" style="font-size: 12.0pt; font-family: 'Simplified Arabic','serif';">
<span class="MsoFootnoteReference">
<span lang="EN-US" style="font-size: 12pt; line-height: 115%;">
[1]
</span>
</span>
</span>
</span>
</a>
My goal is to change the 2 hrefs from:
href="file:///C:/Users/hp/Desktop/file.docx#_ftn1"
href="file:///C:/Users/hp/Desktop/file.docx#_ftnref1"
to:
href="#_ftn1"
href="#_ftnref1"
so that the user can jump from one anchor to the other.
Questions:
1- Is it better to use a server-side language instead of jQuery?
2- How do I loop over the repeated segments and change the href of each pair of anchors?
Thank you very much in advance for your invaluable assistance.
Solution:
Using the regular expression provided by Broxzier together with PHP, the code below works and can be applied to any data before persisting it to the database.
if(preg_match_all('/href\s*=\s*"[^"]+(#[^"]+)"/',get_the_content(),$match))
{
echo preg_replace('/href\s*=\s*"[^"]+(#[^"]+)"/','href="$1"', get_the_content());
}

1- Is it better to use a server-side language instead of jQuery?
Neither. The best and fastest option would be to totally remove the website and page name from the link if they're the same as the current page.
One way would be to use regular expressions. This could be done via JavaScript, but I strongly suggest doing it in a text editor and replacing the old data (WordPress saves revisions anyway).
The following regex will grab the href attribute
href\s*=\s*"[^"]+(#[^"]+)"
Replace this with:
href="\1"
And you're done.
2- How do I loop over the repeated segments and change the href of each pair of anchors?
Use the global flag to do this. Since it's content, I advise you to do it manually, or to change the regex so that it only matches the current URL.
Please note that this will also replace occurrences in the content, if there is any text like href="website#flag" in there. I assumed this was not the case.
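As a rough illustration of the global flag, here is a minimal JavaScript sketch that applies the same regex to an HTML string. The function name stripHrefToFragment is made up for this example, and it assumes double-quoted href attributes, as in the question:

```javascript
// Hypothetical helper: reduces every double-quoted href that contains a
// fragment to just that fragment, e.g. href="file:///...#_ftn1" -> href="#_ftn1".
function stripHrefToFragment(html) {
  // The "g" flag makes replace() process every match, not only the first one.
  return html.replace(/href\s*=\s*"[^"]+(#[^"]+)"/g, 'href="$1"');
}

const sample =
  '<a href="file:///C:/Users/hp/Desktop/file.docx#_ftn1" name="_ftnref1">[1]</a>' +
  '<a href="file:///C:/Users/hp/Desktop/file.docx#_ftnref1" name="_ftn1">[1]</a>';

console.log(stripHrefToFragment(sample));
```

Hrefs that already consist of only a fragment (such as href="#_ftn1") are left alone, because the pattern requires at least one character before the #.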
--

Using jQuery.attr() and hash property of <a>
$('a').has('.MsoFootnoteReference').attr('href',function( idx,oldHref){
return this.hash;
});
You might want to use some html cleaning on your WYSIWYG html submissions that will clean out unwanted classes and modify the href's for you.
For example, the SimpleHtmlDOM PHP library uses CSS-style selectors to modify HTML, and you could use it to rewrite any href that contains file://.
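To see what the hash property gives you without a page at hand, the standard URL object exposes the same kind of value. This is only a sketch: fragmentOf and the placeholder base URL are assumptions made for the example, not part of jQuery.

```javascript
// Hypothetical helper: returns the fragment part of a link, including the "#".
// The base URL is a placeholder that only matters for relative hrefs.
function fragmentOf(href) {
  return new URL(href, 'http://example.invalid/').hash;
}

console.log(fragmentOf('file:///C:/Users/hp/Desktop/file.docx#_ftn1'));
console.log(fragmentOf('file:///C:/Users/hp/Desktop/file.docx#_ftnref1'));
```

An href without a fragment yields an empty string, so a real script would probably want to skip those links rather than blank them out.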

Related

How to read the html tags innerHTML value using css selector or xpath.?

My actual query is: I want to retrieve the inner text of an HTML element using a CSS selector or XPath. I am able to achieve this with document.getElementById, but not with selectors; with those I can only print the element itself, not its text.
For Ex:
<li class="active">
<span id="lastPrice">1,603.35</span>
<span id="CAToday"></span><br>
<span class="up" id="change">28.80</span>
</li>
From the above HTML, I want to print 1,603.35 using either
Css or
xpath
Note: I have already searched the forum but couldn't find the required solution.
This XPath expression
string(/li/span[@id='lastPrice'])
With this well-formed XML
<li class="active">
<span id="lastPrice">1,603.35</span>
<span id="CAToday"></span><br/>
<span class="up" id="change">28.80</span>
</li>
Result
1,603.35
Check in http://www.utilities-online.info/xpath/?save=07d6e884-4f7e-46cc-9aaf-904e6a440f50-xpath
You should be able to use the XPath version 2 matches function:
//div[matches(text(), 'Hello ?\w+ What would you like to do today \w+')]
which does allow regular expressions.
Java Selenium (xpath example):
driver.findElement(By.xpath("//span[@id='lastPrice']")).getAttribute("innerText")
Python Selenium (CSS example):
driver.find_element_by_css_selector("span#lastPrice").get_attribute("innerText")
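If you are in plain browser JavaScript rather than Selenium, the same two lookups might be sketched as below. The helper names textByCss and textByXPath are made up for this example, and doc is assumed to be a browser Document such as window.document.

```javascript
// Hypothetical helpers; `doc` is assumed to be a browser Document.

// CSS route: find the element, then read its text instead of printing the element.
function textByCss(doc, selector) {
  const el = doc.querySelector(selector);
  return el ? el.textContent : null;
}

// XPath route: ask document.evaluate() for a string result.
function textByXPath(doc, expression) {
  const STRING_TYPE = 2; // numeric value of XPathResult.STRING_TYPE
  return doc.evaluate(expression, doc, null, STRING_TYPE, null).stringValue;
}

// In a browser console, with the HTML from the question on the page:
//   textByCss(document, 'span#lastPrice');
//   textByXPath(document, "string(//span[@id='lastPrice'])");
```

The key point in both cases is to read the text (textContent or stringValue) from the result rather than logging the matched node.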

Inject html after review widget loads for schema markup

My webstore uses Kudobuzz for product reviews, but our e-commerce platform (PDG) isn't supported for SEO markup data.
This widget does not support schema markup on its own, so I want to somehow select the relevant pieces and inject the schema markup into the various divs/spans that make up the widget. One problem is figuring out how to inject code that Google can parse, and another is figuring out how to write the actual selectors for this super bloated widget.
Here is a CodePen of the widget and some markup data that is already on the site: http://codepen.io/anon/pen/GpddpO
Here is a link to a product page if you want to see how everything works: https://www.asseenontvhot10.com/product/2835/Professional-Leather--Vinyl-Repair-Kit
This is (roughly) the markup I'm trying to add if it helps:
<div itemscope itemtype="http://schema.org/Review">
<div itemprop="reviewBody">Blah Blah it works 5 star</div>
<div itemprop="author" itemscope itemtype="http://schema.org/Person">
Written by: <span itemprop="name">Author</span></div>
<div itemprop="itemReviewed" itemscope itemtype="http://schema.org/Thing">
<span itemprop="name">Stop Snore</span></div>
<div><meta itemprop="datePublished" content="2015-10-07">Date published: 10/07/2015</div>
<div itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating">
<meta itemprop="worstRating" content="1"><span itemprop="ratingValue">5</span> / <span itemprop="bestRating">5</span> stars</div>
</div>
Theoretically you could write a very small amount of microdata using CSS :before and :after with content, but all spaces and symbols would need to be converted into escaped Unicode code points, e.g.
#name:before { content: "\003cspan\2002itemprop\0022name\2033"; }
#name:after { content: "\2044\003cspan\003e"; }
Even spaces need to be substituted with \2002 or an equivalent whitespace code.
This should wrap the following microdata around any element with the id name:
<span itemprop="name">...</span>
Clearly this can only work if the widget gives the added elements clear ids or class names, and it may be useless unless you know the type of object reviewed first (e.g. Book, Movie), since this needs to go at the start in the example I gave (which is incomplete). The code would need to be nested correctly, so if you want further help, can you edit your question with example HTML for a completed review?
Writing your own JSON-LD script at the top of the page is another option - it would be a different question (if you get stuck) but isn't embedded within the data itself
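As a rough idea of what the JSON-LD route could look like, the sketch below builds the same review data as the microdata in the question. The function name buildReviewJsonLd and the shape of its input object are assumptions for this example, and whether Google picks up markup injected after the widget loads is not something this sketch can confirm.

```javascript
// Hypothetical helper: turns one review into a schema.org Review as JSON-LD text.
// The input field names (body, author, itemName, ...) are made up for this example.
function buildReviewJsonLd(review) {
  return JSON.stringify({
    '@context': 'https://schema.org',
    '@type': 'Review',
    reviewBody: review.body,
    author: { '@type': 'Person', name: review.author },
    itemReviewed: { '@type': 'Thing', name: review.itemName },
    datePublished: review.datePublished,
    reviewRating: {
      '@type': 'Rating',
      ratingValue: review.rating,
      bestRating: 5,
      worstRating: 1,
    },
  });
}

const jsonLd = buildReviewJsonLd({
  body: 'Blah Blah it works 5 star',
  author: 'Author',
  itemName: 'Stop Snore',
  datePublished: '2015-10-07',
  rating: 5,
});

// In a browser, the text could then be attached to the page like this:
//   const script = document.createElement('script');
//   script.type = 'application/ld+json';
//   script.textContent = jsonLd;
//   document.head.appendChild(script);
console.log(jsonLd);
```

The review values would still have to be read out of the widget's elements first, so the selector problem from the question does not go away with this approach.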
Edit
it's a good idea to test the css in a separate environment first, eg setup a jsfiddle

Are there any syntax highlighting plugins that will allow you to embed an ignorable html element into a snippet?

I am trying to make dynamic code examples for our API that can be constructed from input HTML elements.
A pared-down example looks like this: I give the user an input to name the device they would like to create.
<input class="observable-input" data-key="deviceName" type="text" value="deviceKey" />
I would then like that input to update code examples (replacing the device name in the example with the one the user inputs).
<code lang="python">
device = { "name": "<span data-observeKey="deviceName">Name</span>" }
client.createDevicewrite(device)
</code>
I have all of the code set up for observing a change in the input and updating the code examples, and this works great. All of the syntax highlighters I have looked at usually chop the snippet up and re-render the example wrapped in their own HTML (for styling). Is there an option or configurable way to get a syntax highlighter not to strip these tags, or is there a different approach I should be looking at for preserving the syntax highlighting while still supporting dynamic updates, without having to do a full text search of each snippet's rendered tags?
Example output from Pygments (the syntax highlighter I'm currently using):
<li>
<div class="line">
<span class="n">device</span>
<span class="o">=</span>
<span class="n">{</span>
<span class="s">"name"</span>
<span class="p">:</span>
<span class="s">"Name"</span>
<span class="n">}</span>
</div>
</li>
I decided to just go with a brute-force approach, and it ended up being decently performant. I'll leave my code here in case anyone is interested in what I did:
https://gist.github.com/selecsosi/5d41dae843b9dea4888f
Since I use Backbone, Lodash, and jQuery as my base app frameworks, the gist uses those. I have a manager which pushes updates from inputs to spans on the page, which I use to dynamically update the code examples.

jQuery highlight pieces of text in an element across tags

I want to select and return searched text using jQuery.
The problem is; parts of the text may be located in <span> or other inline elements, so when searching for 'waffles are tasty' in this text: 'I'm not sure about <i>cabbages</i>, but <b>waffles</b> <span>are</span> <i>tasty</i>, indeed.', you wouldn't get any matches, while the text appears uninterrupted to people.
Let's use this HTML as an example:
<div id="parent">
<span style="font-size: 1.2em">
I
</span>
like turtles
<span>
quite a
</span>
lot, actually.
<span>
there's loads of
</span>
tortoises over there, OMG
<div id="child">
<span style="font-size: 1.2em">
I
</span>
like turtles
<span>
quite a
</span>
lot, actually.
TURTLES!
</div>
</div>
With this (or similar) JavaScript:
$('div#parent').selectText({query: ['i like', 'turtles', 'loads of tortoises'], caseinsensitive: true}).each(function () {
$(this).css('background-color', '#ffff00');
});
//The (hypothetical) SelectText function would return an array of wrapped elements to chain .each(); on them
You would want to produce this output: (without the comments, obviously)
<div id="parent">
<span style="font-size: 1.2em">
<span class="selected" style="background-color: #ffff00">
I
</span> <!--Wrap in 2 separate selection spans so the original hierarchy is disturbed less (as opposed to wrapping 'I' and 'like' in a single selection span)-->
</span>
<span class="selected" style="background-color: #ffff00">
like
</span>
<span class="selected" style="background-color: #ffff00"> <!--Simple match, because the search query is just the word 'turtles'-->
turtles
</span>
<span>
quite a
</span>
lot, actually.
<span>
there's
<span class="selected" style="background-color: #ffff00">
loads of
</span> <!--Selection span needs to be closed here because of HTML tag order-->
</span>
<span class="selected" style="background-color: #ffff00"> <!--Capture the rest of the found text with a second selection span-->
tortoises
</span>
over there, OMG
<div id="child"> <!--This element's children are not searched because it's not a span-->
<span style="font-size: 1.2em">
I
</span>
like turtles
<span>
quite a
</span>
lot, actually.
TURTLES!
</div>
</div>
The (hypothetical) SelectText function would wrap all selected text in <span class="selected"> tags, regardless of whether parts of the search are located in other inline elements like <span>, <i>, etc. It does not search the child <div>'s contents because that's not an inline element.
Is there a jQuery plugin that does something like this? (wrap search query in span tags and return them, oblivious to whether parts of the found text may be located in other inline elements?)
If not, how would one go about creating such a function? This function's kinda what I'm looking for, but it doesn't return the array of selected spans and breaks when parts of the found text are nested in other inline elements.
Any help would be greatly appreciated!
Piece of cake! See this.
Folded notation:
$.each(
$(...).findText(...),
function (){
...
}
);
In-line notation:
$(...).findText(...).each(function (){
...
}
);
Three options:
1. Use the browser's built-in methods for this. For the finding, IE has TextRange with its findText() method; other browsers (with the exception of Opera, last time I checked, which was a long time ago) have window.find(). However, window.find() may be killed off without being replaced at some point, which is not ideal. For the highlighting, you can use document.execCommand().
2. Use my Rangy library. There's a demo here: http://rangy.googlecode.com/svn/trunk/demos/textrange.html
3. Build your own code to search text content in the DOM and style it.
The first two options are covered in more detail on this answer:
https://stackoverflow.com/a/5887719/96100
Since I just so happened to be working on a similar thing right now, in case you'd like to see the beginnings of my interpretation of "option 3", I thought I'd share this, with the main feature being that all text nodes are traversed, without altering existing tags. Not tested across any unusual browsers yet, so no warranty whatsoever with this one!
function DOMComb2 (oParent) {
  var oNode = oParent.firstChild, oNext;
  while (oNode) {
    oNext = oNode.nextSibling; // Remember the sibling now: replaceChild() detaches oNode
    if (oNode.nodeType == 3 && oNode.nodeValue) { // Add regexp to search the text here
      var highlight = document.createElement('span');
      highlight.className = 'selected';
      oParent.replaceChild(highlight, oNode);
      highlight.appendChild(oNode);
      // Or, the shorter, uglier jQuery hybrid one-liner
      // oParent.replaceChild($('<span class="selected">' + oNode.nodeValue + '</span>')[0], oNode);
    } else if (oNode.nodeType == 1 && oNode.tagName != 'DIV') { // Add any other element you want to avoid
      DOMComb2(oNode);
    }
    oNode = oNext;
  }
}
Then search through things selectively with jQuery perhaps:
$('aside').each(function(){
DOMComb2($(this)[0]);
});
Of course, if you have asides within your asides, strange things might happen.
(DOMComb function adapted from the Mozilla dev reference site
https://developer.mozilla.org/en-US/docs/Web/API/Node)
I wrote a draft as a fiddle. The main steps:
I made a plugin for jQuery
$.fn.selectText = function(params){
var phrases = params.query,
ignorance = params.ignorecase,
wrapper = $(this);
. . .
return(wrapper);
};
Now I can call the selection as $(...).selectText({query:["tortoise"], ignorecase: true, style: 'selection'});
I know you want to have an iterator after the function call, but in your case that is impossible, because the iterator has to return valid jQuery selectors. For example:
word <tag>word word</tag> word is not a valid selector.
After sanitizing the content of the wrapper, makeRegexp() builds a dedicated regular expression for each search phrase.
Each matched piece of HTML source goes to emulateSelection() and then wrapWords().
The main idea is to wrap each piece of the phrase that is not separated by tags in <span class="selection">...</span>, without analyzing the whole tree of nodes.
NOTE:
It does NOT work with <b><i>... tags in the HTML. You have to adjust the regexp string for that.
I do not guarantee it will work with Unicode. But who knows...
As I understand it, we are talking about iterators like $.each($(...).searchText("..."),function (str){...});.
Check the David Herbert Lawrence poem:
<div class="poem"><p class="part">I never saw a wild thing<br />
sorry for itself.<br />
A small bird will drop frozen dead from a bough<br />
without ever having felt sorry for itself.<br /></p></div>
Actually, after rendering, the browser will understand it like this:
<div class="poem">
<p class="part">
<br>I never saw a wild thing</br>
<br>sorry for itself.</br>
<br>A small bird will drop frozen dead from a bough</br>
<br>without ever having felt sorry for itself.</br>
</p>
</div>
For example, say I am looking for the phrase "wild thing sorry for". Therefore, I have to highlight the excerpt:
wild thing</br><br>sorry for
I cannot wrap it like this: <span>wild thing</br><br>sorry for</span>, then create a jQuery selector from some temporary id="search-xxxxxx" and return it, because that is invalid HTML. I can wrap each piece of text like this:
<span search="search-xxxxx">wild thing</span></br><br><span search="search-xxxxx">sorry for</span>
Then I have to call some function and return a jQuery collection of the matching elements:
return($("[search=search-xxxxx]"));
Now we have two "results": a) "wild thing"; b) "sorry for". Is it really what you want?
OR
You have to write your own each() function as another jQuery plugin:
$.fn.eachSearch = function(arr, func){
...
};
where arr will not be an array of selectors, but an array of arrays of selectors, like:
arr = [
{selector as whole search},
{[{selector as first part of search]}, {[selector as second part of search]}},
...
]
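The "array of arrays" idea can be tried out without touching the DOM. The sketch below is only an illustration: findAcrossTextNodes and splitRange are made-up names, and texts is assumed to hold the contents of consecutive text nodes in document order (in a browser you would collect them by walking the tree). Each match comes back as a list of pieces, one per text node it touches, which is what you would wrap in separate spans.

```javascript
// Hypothetical sketch: `texts` is assumed to be the text of consecutive
// text nodes in document order. Returns one array of pieces per match.
function findAcrossTextNodes(texts, query, ignoreCase) {
  const joined = texts.join('');
  // Assumes lower-casing does not change the string length (true for most text).
  const haystack = ignoreCase ? joined.toLowerCase() : joined;
  const needle = ignoreCase ? query.toLowerCase() : query;
  const matches = [];
  if (!needle) return matches;
  let from = 0;
  while (true) {
    const at = haystack.indexOf(needle, from);
    if (at === -1) break;
    matches.push(splitRange(texts, at, at + needle.length));
    from = at + needle.length;
  }
  return matches;
}

// Cuts the range [start, end) of the joined text into per-node pieces.
function splitRange(texts, start, end) {
  const pieces = [];
  let offset = 0;
  texts.forEach((text, node) => {
    const lo = Math.max(start, offset);
    const hi = Math.min(end, offset + text.length);
    if (lo < hi) pieces.push({ node: node, start: lo - offset, end: hi - offset });
    offset += text.length;
  });
  return pieces;
}

// Text nodes of: I'm not sure about <i>cabbages</i>, but <b>waffles</b> <span>are</span> <i>tasty</i>, indeed.
const nodes = ["I'm not sure about ", 'cabbages', ', but ', 'waffles', ' ', 'are', ' ', 'tasty', ', indeed.'];
console.log(findAcrossTextNodes(nodes, 'Waffles are tasty', true));
```

Whitespace between nodes is matched literally here, so a real implementation would probably normalize runs of whitespace and line breaks before searching.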

Regex: Optional HTML tags in HTML?

I need to parse some values from HTML. I'm using the following regex to parse out some groups, but am having difficulty when there are optional tags in the middle of the HTML. I need some rule to pull out the values from repeated version of the HTML page, even when the optional tags are included.
onclick="return raise('SelectFare', new SelectFareEventArgs(1, 3, 'F'))" required="true" requiredError="Please select a flight and fare in every market."></td><td>Regular Fare</td><td>Adult<br></td><td align="right" style="font-size:110%;">91.99 EUR<br><div style="font-style: italic; font-size: 10px;">Only<span style="color: red;"> 4 </span>seats left at this fare</div></td><td></td><td><b>Fri</b>30 Sep 11<br><b>Flight</b>FR 818</td><td>15:10 Depart<br>16:15 Arrive</td></tr><tr id="1_2011_8_30_23_45_00"><td><div class="planeImg1" title="Click to select this fare on this flight"></div></td><td><input
For example, the optional <div style="font-style: italic; font-size: 10px;">Only<span style="color: red;"> 4 </span>seats left at this fare</div> section of this is messing it up.
tr><tr id="1_2011_9_21_16_05_00"><td><div class="planeImg1" title="Click to select this fare on this flight"></div></td><td><input id="AvailabilityInputFRSelectView_RadioButtonMkt1Fare2" type="radio" name="AvailabilityInputFRSelectView$market1" value="H~HDIS1~XXXC~~RoundFrom|FR~ 816~ ~~DUB~10/21/2011 14:55~EDI~10/21/2011 16:05" onclick="return raise('SelectFare', new SelectFareEventArgs(1, 2, 'H'))" required="true" requiredError="Please select a flight and fare in every market."></td><td>No Taxes</td><td>Adult<br></td><td align="right" style="font-size:110%;"><strike style="color:#F00;font-size:80%;"><b style="color: #999;">22.99 EUR</b></strike>
 (-35%)
<br>14.94 EUR<br></td><td></td><td><b>Fri</b>21 Oct 11<br><b>Flight</b>FR 816</td><td>14:55 Depart<br>16:05 Arrive</td></tr><tr id="1_2011_9_21_16_15_00"><td><div class="planeImg1" title="Click
The
<strike . . </strike>. . (-35%). . <br>14.94 EUR<br></td>
part of the HTML above is messing it up as well.
This is the regex I'm trying (and various other versions!!):
"Please select(?:.*?)<td>(.*?)</td><td>(.*?)<br></td><td align=\"right\" style=\"font-size:110%;\">(.*?)<br>(.*?)<br>(?:.*?)</b>(.*?)<br><b>Flight</b>(.*?)</td><td>(.*?)<br>(.*?)</td>"
I'd appreciate any help at all on this, or even a reference to learning how to parse out optional HTML tags altogether.
Thanks.
You can't parse (X)HTML with RegEx, so don't do it. You need to use a proper parser that will build you a Document Object Model (DOM). As you have tagged your question with JavaScript, I recommend that you use jQuery to build an object graph of your HTML, simply like this:
var $document = $(html);
This $document object can now be operated on with methods like $document.find() to dig out the elements you want from the HTML.
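Once the rows are real elements, the optional tags stop mattering, because you read the text of each cell instead of matching the markup. The sketch below is only an illustration under assumptions: rowToCells and lastEurPrice are made-up names, row is assumed to be a parsed <tr> element, and the small regex runs on the extracted cell text, not on the HTML.

```javascript
// Hypothetical helpers; `row` is assumed to be a <tr> element from a parsed document.

// Text of every <td> in the row; nested <div>, <span>, <strike> tags are flattened away.
function rowToCells(row) {
  return Array.from(row.querySelectorAll('td'), (td) => td.textContent.trim());
}

// Last "<amount> EUR" found in a cell's text, or null if there is none.
// With a struck-through original price, the discounted price comes last.
function lastEurPrice(cellText) {
  const prices = cellText.match(/\d+(?:\.\d+)? EUR/g);
  return prices ? prices[prices.length - 1] : null;
}

// In a browser, the rows could be parsed like this (wrapping them in <table>
// so that the <tr>/<td> tags survive parsing):
//   const doc = new DOMParser().parseFromString('<table>' + rowsHtml + '</table>', 'text/html');
//   doc.querySelectorAll('tr').forEach((row) => console.log(rowToCells(row)));
console.log(lastEurPrice('22.99 EUR (-35%) 14.94 EUR'));
console.log(lastEurPrice('91.99 EUROnly 4 seats left at this fare'));
```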
