Replace string - How to replace each word once

Replace string - How to replace each word once - javascript

I have an XML dictionary as shown below.
<word definition="The primary income-earner in a household">bread winner</word>
<word definition="One who wins, or gains by success in competition, contest, or gaming">winner</word>
Whenever a word from the dictionary appears in my HTML, it should be replaced with a link that carries the definition as its title, so that the user sees the definition when hovering over the link.
var allwords = xmlDoc.getElementsByTagName("word");
for (var i=0; i<allwords.length; i++)
{
var name = allwords[i].lastChild.nodeValue;
var linked = '' + allwords[i].lastChild.nodeValue + '';
}
Here is my replacer
function replacer(oldstring, newstring) {
document.body.innerHTML = document.body.innerHTML.replace(oldstring, newstring);
}
The problem is that once bread winner has been converted to its linked form, winner gets converted as well, because bread winner contains winner. So winner is replaced twice and the markup gets mangled.
Is there a way to make sure that once bread winner has been replaced, the winner inside it is not replaced again?
Thanks in advance!

What about something like this:
for (var i=0; i<allwords.length; i++)
{
if (allwords[i].firstChild.nodeName.toLowerCase() === 'a') {
// This word has been linked already, skip it
continue;
}
// your code
}

You need some kind of sentry to prevent processing a term that's already been processed. I'd recommend wrapping the replaced terms with another element (not clear on how your html is structured, so I'm not sure what would work here, but a span would be the simplest way in normal html). Then your logic would just skip replacing words that had a parent element of whatever you decided to wrap it with.

You'll need to iterate through the matching text nodes, and only replace those that don't have an A tag as an ancestor in the DOM.
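A minimal sketch of that idea, assuming only the standard node properties nodeType, nodeName, childNodes and parentNode. The function names isInsideLink and collectLinkableTextNodes are made up for illustration and are not part of the question's code.

```javascript
// Returns true when the node has an <a> element somewhere among its ancestors.
function isInsideLink(node) {
  for (var p = node.parentNode; p; p = p.parentNode) {
    if (p.nodeName && p.nodeName.toLowerCase() === 'a') return true;
  }
  return false;
}

// Collects the text nodes (nodeType 3) below root that are not inside a link.
function collectLinkableTextNodes(root, out) {
  out = out || [];
  for (var i = 0; i < root.childNodes.length; i++) {
    var child = root.childNodes[i];
    if (child.nodeType === 3) {
      if (!isInsideLink(child)) out.push(child);
    } else if (child.nodeType === 1) {
      // Recurse into elements only; comments and other node types are skipped.
      collectLinkableTextNodes(child, out);
    }
  }
  return out;
}
```

In a browser this would be called as collectLinkableTextNodes(document.body), and only the returned nodes would be considered for replacement.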

Try function strtr($text, $fromList, $toList). It should replace each term once.
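Note that strtr is a PHP function, so it is not available in JavaScript as such. A rough JavaScript analogue could look like the sketch below: it builds one regular expression from all terms, longest first, and replaces in a single pass, so text that has already been replaced is never scanned again. The name replaceEachOnce and the shape of the map argument are assumptions for illustration.

```javascript
// Hypothetical helper: replaces every key of `map` found in `text` with its
// value, in a single pass, so replaced text is never matched again.
function replaceEachOnce(text, map) {
  // Longest terms first, so "bread winner" wins over "winner".
  var terms = Object.keys(map).sort(function (a, b) {
    return b.length - a.length;
  });
  if (terms.length === 0) return text;
  // Escape regex metacharacters so the terms are matched literally.
  var pattern = terms.map(function (t) {
    return t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  }).join('|');
  return text.replace(new RegExp(pattern, 'g'), function (match) {
    return map[match];
  });
}
```

This works on a plain string. Applied to innerHTML it would still touch tag names and attributes, so it is best combined with a walk over text nodes.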

Related

Scraping data from HTML using JavaScript RegExp [duplicate]

I'm trying to figure out how to, in raw javascript (no jQuery, etc.), find an element with specific text and modify that text.
My first incarnation of the solution... is less than adequate. What I did was basically:
var x = document.body.innerHTML;
x.replace(/regular-expression/,"text");
document.body.innerHTML = x;
Naively I thought I succeeded with flying colors, especially since it was so simple. So then I added an image to my example and thought I could check every 5 seconds (because this string may enter the DOM dynamically)... and the image flickered every 5 seconds.
Oops.
So, there has to be a correct way to do this. A way that specifically singles out a specific DOM element and updates the text portion of that DOM element.
Now, there's always "recursively search through the children till you find the deepest child with the string" approach, which I want to avoid. And even then, I'm skeptical about "changing the innerHTML to something different" being the correct way to update a DOM element.
So, what's the correct way to search through the DOM for a string? And what's the correct way to update a DOM element's text?

Now, there's always "recursively search through the children till you find the deepest child with the string" approach, which I want to avoid.
I want to search for an element in an unordered random list. Now, there's a "go through all the elements till you find what you're looking for approach", which I want to avoid.
Old-timer magno tape, record, listen, meditate.
Btw, see: Find and replace text with JavaScript on James Padolsey's github
(also his blog articles explaining it)

Edit: Changed querySelectorAll to getElementsByTagName from RobG's suggestion.
You can use the getElementsByTagName function to grab all of the tags on the page. From there, you can check their children and see if they have any Text Nodes as children. If they do, you'd then look at their text and see if it matches what you need. Here is an example that will print out the text of every Text Node in your document with the console object:
var elms = document.getElementsByTagName("*"),
len = elms.length;
for (var ii = 0; ii < len; ii++) {
var myChildren = elms[ii].childNodes,
len2 = myChildren.length;
for (var jj = 0; jj < len2; jj++) {
// nodeType 3 is a Text Node
if (myChildren[jj].nodeType === 3) {
console.log(myChildren[jj].nodeValue);
// example of updating a text node's value
myChildren[jj].nodeValue = myChildren[jj].nodeValue.replace(/test/,"123");
}
}
}
To update a DOM element's text, simply update the nodeValue property of the Text Node.

Don't use innerHTML with a regular expression, it will almost certainly fail for non-trivial content. Also, there are still differences in how browsers generate it from the live DOM. Replacing the innerHTML will also remove any event listeners added as element properties (i.e. like element.onclick = fn).
It is best if you can have the string enclosed in an element with an attribute or property you can search on (id, class, etc.) but failing that, a search of text nodes is the best approach.
Edit
Attempting a general purpose text selection function for an HTML document may result in a very complex algorithm since the string could be part of a complex structure, e.g.:
<h1>Some <span class="foo"><em>s</em>pecial</span> heading</h1>
Searching for the string "special heading" is tricky as it is split over 2 elements. Wrapping it in another element (say for highlighting) is also not trivial since the resulting DOM structure must be valid. For example, the text matching "some special" in the above could be wrapped in a span but not a div.
Any such function must be accompanied by documentation stating its limitations and most appropriate use.

Forget regular expressions.
Iterate over each text node (and doing it recursively will be the most elegant) and modify the text nodes if the text is found. If just looking for a string, you can use indexOf().
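One way this might look, as a sketch rather than a finished utility: a recursive function that descends through childNodes and rewrites nodeValue on text nodes only. The name replaceInTextNodes is hypothetical, and the replacement is a plain string, not markup.

```javascript
// Replaces every occurrence of `search` with `replacement` in all text nodes
// at or below `node`. Elements, attributes and listeners are left untouched.
function replaceInTextNodes(node, search, replacement) {
  if (!search) return; // an empty search string would match everywhere
  if (node.nodeType === 3) {
    // Text node: only write to it when the string is actually present.
    if (node.nodeValue.indexOf(search) !== -1) {
      node.nodeValue = node.nodeValue.split(search).join(replacement);
    }
    return;
  }
  for (var i = 0; i < node.childNodes.length; i++) {
    replaceInTextNodes(node.childNodes[i], search, replacement);
  }
}
```

Called as replaceInTextNodes(document.body, "test", "123"), this would not rebuild any elements, which is why images should not flicker the way they do when innerHTML is reassigned.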

x.replace(/regular-expression/,"text");
returns a new string rather than changing x, so store the result:
var y = x.replace(/regular-expression/,"text");
Now you can assign the new value:
document.body.innerHTML = y;
But think about this: you don't want to rewrite the whole body just to change one small piece of it. Why not work on the content of a div or some other specific element instead?
Example:
<p id='paragraph'>
... some text here ...
</p>
Now you can use JavaScript:
var para = document.getElementById('paragraph');
var newPara = para.innerHTML.replace(/regex/,'new content');
para.innerHTML = newPara;
This should be the simplest way.

Insert a span around nth-word within a div

I'm designing a rudimentary spell checker of sorts. Suppose I have a div with the following content:
<div>This is some text with xyz and other text</div>
My spell checker correctly identifies the div (returning a jQuery object entitled current_object) and an index for the word (in the case of the example, 5 (due to starting at zero)).
What I need to do now, is surround this word with a span e.g.
<span class="spelling-error">xyz</span>
Leaving me with the final structure like this:
<div>
This is some text with
<span class="spelling-error">xyz</span>
and other text
</div>
However, I need to do this without altering the existing user selection / moving the caret / invoking methods that do so e.g.
window.getSelection().getRangeAt(0).cloneRange().surroundContents();
In other words, if the user is working on the 4th div in the contenteditable document, my code would identify issues in the other divs (1st - 3rd) while not removing focus from the 4th div.
Many thanks!

You've tagged this post as jQuery but I don't think it's particularly necessary to use it. I've written you an example.
https://jsfiddle.net/so0jrj2b/2/
// Redefine the innerHTML for our spellcheck target
spellcheck.innerHTML = (function(text)
{
// We're using an IIFE here to keep namespaces tidy.
// words is the text split apart on spaces
var words = text.split(" ");
// newWords is our array of words after spellchecking.
var newWords = [];
// Loop through the words.
for (var i = 0; i < words.length; ++i)
{
// Pull the word from our array.
var word = words[i];
if (i === 5) // spellcheck logic here.
{
// Push this content to the array.
newWords.push("<span class=\"mistake\">" + word + "</span>");
}
else
{
// Push the word back to the array.
newWords.push(word);
}
}
// Return the rejoined text block.
return newWords.join(" ");
})(spellcheck.innerHTML);
Worth noting: my usage of an IIFE here can easily be replaced by moving that logic into its own function declaration, which makes it easier to reuse.
Be aware you also need to account for punctuation in your spellchecking instances.
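As a rough illustration of the punctuation point, the sketch below keeps leading and trailing punctuation outside the span. The helper name wrapNthWord is made up, the input is assumed to be plain text without tags, and \w only covers ASCII letters, digits and underscore.

```javascript
// Hypothetical helper: wraps the word at `index` (zero-based, split on single
// spaces) in a span, leaving surrounding punctuation outside the span.
function wrapNthWord(text, index, className) {
  var words = text.split(' ');
  if (index < 0 || index >= words.length) return text;
  // Groups: leading punctuation, the word itself, trailing punctuation.
  var parts = /^(\W*)(.*?)(\W*)$/.exec(words[index]);
  if (!parts || parts[2] === '') return text; // nothing but punctuation
  words[index] = parts[1] +
    '<span class="' + className + '">' + parts[2] + '</span>' +
    parts[3];
  return words.join(' ');
}
```

With the text "This is some text with xyz, and other text" and index 5, the comma ends up after the closing span tag instead of inside it.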

Splitting a long phrase into an array

I need to take the phrase
It’s that time of year when you clean out your closets, dust off shelves, and spruce up your floors. Once you’ve taken care of the dust and dirt, what about some digital cleaning? Going through all your files and computers may seem like a daunting task, but we found ways to make the process fairly painless.
and upon pressing a button
split it into an array
iterate over that array at each step
Build SPAN elements as you go, along with the attributes
Add the SPAN elements to the original DIV
Add a click handler to the SPAN elements, or to the DIV, which causes the style on the SPAN to change on mouseover.
So far I had
function splitString(stringToSplit, separator) {
var arrayOfStrings = stringToSplit.split(separator);
print('The original string is: "' + stringToSplit + '"');
print('The separator is: "' + separator + '"');
print("The array has " + arrayOfStrings.length + " elements: ");
for (var i=0; i < arrayOfStrings.length; i++)
print(arrayOfStrings[i] + " / ");
}
var space = " ";
var comma = ",";
splitString(tempestString, space);
splitString(tempestString);
splitString(monthString, comma);
for (var i=0; i < myArray.length; i++)
{
}
var yourSpan = document.createElement('span');
yourSpan.innerHTML = "Hello";
var yourDiv = document.getElementById('divId');
yourDiv.appendChild(yourSpan);
yourSpan.onmouseover = function () {
alert("On MouseOver");
}
and for html I have
<p>The DIV that will serve as your input (and output) is here, with
id="transcriptText":</p>
<div id="transcriptText"> It’s that time of year when you clean out your
closets, dust off shelves, and spruce up your floors. Once you’ve taken
care of the dust and dirt, what about some digital cleaning? Going
through all your files and computers may seem like a daunting task, but
we found ways to make the process fairly painless.</div>
<br>
<div id="divideTranscript" class="button"> Transform the
Transcript! </div>
Any help on how to move on? I have been stuck for quite some time.

Well, first off this looks like homework.
That said, I'll try to help without giving you the actual code, since we're not supposed to give actual working solutions to homework. You're splitting the string too many times (once is all that's needed based on the instructions you gave) and you have to actually store the result of the split call somewhere that your other code can use it.
Your instructions say to add attributes to the span, but not which attributes nor what their contents should be.
Your function should follow the instructions:
1) Split the string. Since it doesn't specify on what, I'd assume words. So split it on spaces only and leave the punctuation where it is.
2) With the array of words returned from the split() function, iterate over it as you attempt to, but inside the braces that scope the loop is where you want to wrap the <span> starting and ending tags around the current word.
3) Use document.createElement() to make that current span into a DOM element. Attach the mouseover and click handlers to it, then appendChild() it to the div.
4) Add a handler to your button that calls the above function.
Note that it's possibly more efficient to set the innerHTML property to insert all the spans at once, but then you have to loop again to add the hover/click handlers.

Detect Once a Certain Word Has Just Been Entered in a Textarea

Considering features like EditArea's and CodeMirror's autocomplete, I was wondering if, like Dreamweaver, there is a way to detect if the last word you entered is in a certain list then provide the same kind of suggestion box but with the function's arguments. I imagine you would use a regular expression on the entire field or possibly split() the whole thing (or the current line) then use the length attribute of the array to find the last bit of text, then do something involving an indexOf-like operation; however, this seems like it would get a bit resource-intensive. It almost looks like I've answered my own question, but it always helps to fully explain one's quandary, especially publicly. There's gotta be a better solution than mine. I appreciate any input. Thank you.

Put the list of words to match in an object, have the text or options to display as the value. Then on keyup or keypress you can get the last word of the text area using a function like:
function showLastWord(id){
var el = document.getElementById(id);
var lastWord = el.value.match(/\w+$/);
return lastWord? lastWord[0] : '';
}
Then check if the word is in the list and do stuff appropriately.
Edit
A small example is:
<textarea onkeyup="showHelp(this);"></textarea>
<script>
var getLastWord = (function() {
// keep the regular expression private to the closure
var re = /\w+$/;
return function (s){
var lastWord = s.match(re);
return lastWord? lastWord[0] : '';
}
}());
var keyWords = {foo:'foo was typed',bar:'bar was typed'};
function showHelp(el) {
var lastWord = getLastWord(el.value);
// Check for matching own property of keyWords
if (keyWords.hasOwnProperty(lastWord)) {
// Do stuff
console.log(keyWords[lastWord]);
}
}
</script>

Regex to search html return, but not actual html jQuery

I'm making a highlighting plugin for a client to find things in a page, and I decided to test it with a help viewer I'm still building, but I'm having an issue that'll (probably) require some regex.
I do not want to parse HTML, and I'm totally open to doing this differently; this just seems like the best/right way.
http://oscargodson.com/labs/help-viewer
http://oscargodson.com/labs/help-viewer/js/jquery.jhighlight.js
Type something in the search... ok, refresh the page, now type, like, class or class=" or type <a you'll notice it'll search the actual HTML (as expected). How can I only search the text?
If i do .text() it'll vaporize all the HTML and what i get back will just be a big blob of text, but i still want the HTML so I dont lose formatting, links, images, etc. I want this to work like CMD/CTRL+F.
You'd use this plugin like:
$('article').jhighlight({find:'class'});
To remove them:
.jhighlight('remove')
==UPDATE==
While Mike Samuel's idea below does in fact work, it's a tad heavy for this plugin. It's mainly for a client looking to erase bad words and/or MS Word characters during a "publishing" process of a form. I'm looking for a more lightweight fix, any ideas?

You really don't want to use eval, mess with innerHTML or parse the markup "manually". The best way, in my opinion, is to deal with text nodes directly and keep a cache of the original html to erase the highlights. Quick rewrite, with comments:
(function($){
$.fn.jhighlight = function(opt) {
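var isRemove = opt === 'remove'
// copy the defaults instead of mutating them; 'remove' carries no options
, options = $.extend({}, $.fn.jhighlight.defaults, isRemove ? {} : opt)
, txtProp = 'textContent' in document.body ? 'textContent' : 'innerText';
// nothing to search for (trim the string itself, not its length)
if (!isRemove && $.trim(options.find).length < 1) return this;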
return this.each(function(){
var self = $(this);
// use a cache to clear the highlights
if (!self.data('htmlCache'))
self.data('htmlCache', self.html());
if(opt === 'remove'){
return self.html( self.data('htmlCache') );
}
// create Tree Walker
// https://developer.mozilla.org/en/DOM/treeWalker
var walker = document.createTreeWalker(
this, // walk only on target element
NodeFilter.SHOW_TEXT,
null,
false
);
var node
, matches
, flags = 'g' + (!options.caseSensitive ? 'i' : '')
, exp = new RegExp('('+options.find+')', flags) // capturing
, expSplit = new RegExp(options.find, flags) // no capturing
, highlights = [];
// walk this wayy
// and save matched nodes for later
while(node = walker.nextNode()){
if (matches = node.nodeValue.match(exp)){
highlights.push([node, matches]);
}
}
// must replace stuff after the walker is finished
// otherwise replacing a node will halt the walker
for(var nn=0,hln=highlights.length; nn<hln; nn++){
var node = highlights[nn][0]
, matches = highlights[nn][1]
, parts = node.nodeValue.split(expSplit) // split on matches
, frag = document.createDocumentFragment(); // temporary holder
// add text + highlighted parts in between
// like a .join() but with elements :)
for(var i=0,ln=parts.length; i<ln; i++){
// non-highlighted text
if (parts[i].length)
frag.appendChild(document.createTextNode(parts[i]));
// highlighted text
// skip last iteration
if (i < ln-1){
var h = document.createElement('span');
h.className = options.className;
h[txtProp] = matches[i];
frag.appendChild(h);
}
}
// replace the original text node
node.parentNode.replaceChild(frag, node);
};
});
};
$.fn.jhighlight.defaults = {
find:'',
className:'jhighlight',
color:'#FFF77B',
caseSensitive:false,
wrappingTag:'span'
};
})(jQuery);
If you're doing any manipulation on the page, you might want to replace the caching with another clean-up mechanism, not trivial though.
You can see the code working here: http://jsbin.com/anace5/2/
You also need to add display:block to your new html elements, the layout is broken on a few browsers.

In the javascript code prettifier, I had this problem. I wanted to search the text but preserve tags.
What I did was start with HTML, and decompose that into two bits.
The text content
Pairs of (index into text content where a tag occurs, the tag content)
So given
Lorem <b>ipsum</b>
I end up with
text = 'Lorem ipsum'
tags = [6, '<b>', 11, '</b>']
which allows me to search on the text, and then based on the result start and end indices, produce HTML including only the tags (and only balanced tags) in that range.
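A small sketch of the decomposition step, not the prettifier's actual code. It assumes that "<" and ">" never appear inside attribute values, leaves entities such as &amp; untouched, and records for each tag the length of the text collected so far. The name decompose is hypothetical.

```javascript
// Hypothetical helper: splits an HTML string into its text content and a flat
// list of (index into the text, tag) pairs.
function decompose(html) {
  var text = '';
  var tags = [];
  // A token is either a whole tag or a run of characters up to the next "<".
  var tokens = html.match(/<[^>]*>|[^<]+/g) || [];
  for (var i = 0; i < tokens.length; i++) {
    if (tokens[i].charAt(0) === '<') {
      tags.push(text.length, tokens[i]);
    } else {
      text += tokens[i];
    }
  }
  return { text: text, tags: tags };
}
```

Searching then happens on the text property alone, and the tags array tells you which tags fall inside a given match range when the HTML is rebuilt.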

Have a look here: getElementsByTagName() equivalent for textNodes.
You can probably adapt one of the proposed solutions to your needs (i.e. iterate over all text nodes, replacing the words as you go - this won't work in cases such as <tag>wo</tag>rd but it's better than nothing, I guess).

I believe you could just do:
$('#article :not(:has(*))').jhighlight({find : 'class'});
Since it grabs all leaf nodes in the article it would require valid xhtml, that is, it would only match link in the following example:
<p>This is some paragraph content with a link</p>
DOM traversal / selector application could slow things down a bit so it might be good to do:
var article_nodes = article_nodes || $('#article :not(:has(*))');
article_nodes.jhighlight({find : 'class'});

Maybe something like this could be helpful:
>+[^<]*?(s(<[\s\S]*?>)?e(<[\s\S]*?>)?e)[^>]*?<+
The first part >+[^<]*? finds > of the last preceding tag
The third part [^>]*?<+ finds < of the first subsequent tag
In the middle we have (<[\s\S]*?>)? between characters of our search phrase (in this case - "see").
After regular expression searching you could use the result of the middle part to highlight search phrase for user.
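To avoid writing such a pattern by hand for every phrase, it could be generated. The sketch below follows the same three-part structure. The name buildTagTolerantPattern is an assumption, and the optional tag between characters is made non-capturing so that group 1 holds the whole middle part.

```javascript
// Hypothetical helper: builds a pattern that finds `phrase` between two tags
// while allowing one tag between any two consecutive characters of the phrase.
function buildTagTolerantPattern(phrase) {
  var optionalTag = '(?:<[\\s\\S]*?>)?';
  var chars = phrase.split('').map(function (c) {
    // Escape regex metacharacters so each character is matched literally.
    return c.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  });
  return new RegExp('>+[^<]*?(' + chars.join(optionalTag) + ')[^>]*?<+');
}

// Usage: group 1 is the matched phrase, including any tags inside it.
var match = buildTagTolerantPattern('see').exec('<p>I <b>s</b>ee you</p>');
```

Like the original pattern, this only finds text that sits between a closing ">" and the next "<", so the searched markup needs an enclosing element.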
