Replacing content of html document - javascript

I am trying to replace a word in an html document with selected word using javascript.
JavaScript
var node=document.body;
var childs=node.childNodes;
var n=childs.length,i=0;
while (i < n) {
node=childs[i];
if (node.nodeType == 3) {
if (node.textContent) {
node.nodeValue=node.nodeValue.replace("injected","hai");
}
}
i++;
}
but string is not getting replaced...pls help

Add document.body=node; at the end. When you set node to equal body you are copying the value, not editing it by reference.

I'm not sure why you're trying to work with the text node directly. console.log on nodeValue shows that the textContent of displayed tags is neither retrieved nor set in your code.
This works great. Live demo here (click).
<p>something to be replaced.</p>
and the js:
var childs = document.body.childNodes;
var len = childs.length;
for (var i=0; i<len; ++i) {
var node=childs[i];
if (node.nodeName === 'P') {
node.textContent = node.textContent.replace("to be replaced","was replaced");
}
}

There is a much simpler method using the String replace method. For example, you can convert the body of the page into a string and use regular expressions to replace the word. This means that you can avoid having to traverse the entire DOM and node lists, which is unnecessarily slow for your task.
document.getElementByTagName("body")[0].innerHTML.replace("injected","hai")

Related

Remove Text from Element Without Removing Reference

I would like to replace all the text in some element (including text in children) with some other text. For example, the html
<div id="myText">
This is some text.
This is some other text.
<p id="toHide">
This is even more text.
Click this text to hide it.
</p>
</div>
should become
<div id="myText">
That is some text.
That is some other text.
<p id="toHide">
That is even more text.
Click That text to hide it.
</p>
</div>
Essentially, I've replaced all of /this/gi with "That". However, I cannot use the following:
$("#myText").innerHTML = $("#myText").innerHTML.replace(/this/gi, "");
This is because I keep a lot of references to the children of myText. This references will be erased. I realize that in simple cases, I can just update these references, but I have a fairly large file, and many references (and it would be troublesome and error prone to have to update every reference every time this function is called).
I also store some data not visible to innerHTML. For example, I use
$("#toHide").test = "test";
This is lost when writing to innerHTML.
How can I replace text in a div without innerHTML (preferably without jquery)?
Jsfiddle http://jsfiddle.net/prankol57/ZEfM7/
Here's a solution:
var n, walker = document.createTreeWalker(document.getElementById("myText"), NodeFilter.SHOW_TEXT);
while (n = walker.nextNode()) {
n.nodeValue = n.nodeValue.replace(/this/ig, "that");
}
Basically, walk all the text nodes, and substitute their values.
For better compatibility, here's some reusable code:
function visitTextNodes(el, callback) {
if (el.nodeType === 3) {
callback(el);
}
for (var i=0; i < el.childNodes.length; ++i) {
visitTextNodes(el.childNodes[i], callback);
}
}
Then you can do:
visitTextNodes(document.getElementById("myText"), function(el) {
el.nodeValue = el.nodeValue.replace(/this/ig, "that");
});
You can use DOM methods (a.k.a. the old and safe way)
function replaceText(el, pattern, txt) {
for(var i=0; i<el.childNodes.length; ++i) {
var node = el.childNodes[i];
switch(node.nodeType){
case 1: // Element
replaceText(node, pattern, txt); continue;
case 3: // Text node
node.nodeValue = node.nodeValue.replace(/this/gi, "that"); continue;
}
}
}
Demo
Here my version of replaceText:
function replaceText(elem) {
if(elem.nodeType === Node.TEXT_NODE) {
elem.nodeValue = elem.nodeValue.replace(/this/gi, 'that')
return
}
var children = elem.childNodes
for(var i = 0, len = children.length; i < len; ++i)
replaceText(children[i]);
}
NB this take an element as the first parameter and traverse all children, hence it works even with complex elements.
Here the updated fiddle: http://jsfiddle.net/ZEfM7/6/

TEXT_NODE: returns ONLY text?

I'm using JavaScript in order to extract all text from a DOM object. My algorithm goes over the DOM object itself and it's descendants, if the node is a TEXT_NODE type than accumulates it's nodeValue.
For some weird reason I also get things like:
#hdr-editions a { text-decoration:none; }
#cnn_hdr-editionS { text-align:left;clear:both; }
#cnn_hdr-editionS a { text-decoration:none;font-size:10px;top:7px;line-height:12px;font-weight:bold; }
#hdr-prompt-text b { display:inline-block;margin:0 0 0 20px; }
#hdr-editions li { padding:0 10px; }
How do I filter this? Do I need to use something else? I want ONLY text.
From the looks of things, you're also collecting the text from <style> elements. You might want to run a check for those:
var ignore = { "STYLE":0, "SCRIPT":0, "NOSCRIPT":0, "IFRAME":0, "OBJECT":0 }
if (element.tagName in ignore)
continue;
You can add any other elements to the object map to ignore them.
You want to skip over style elements.
In your loop, you could do this...
if (element.tagName == 'STYLE') {
continue;
}
You also probably want to skip over script, textarea, etc.
This is text as far as the DOM is concerned. You'll have to filter out (skip) <script> and <style> tags.
[Answer added after reading OP's comments to Andy's excellent answer]
The problem is that you see the text nodes inside elements whose content is normally not rendered by browsers - such as STYLE and SCRIPT tags.
When scan the DOM tree, using depth-first search I assume, your scan should skip over the content of such tags.
For example - a recursive depth-first DOM tree walker might look like this:
function walker(domObject, extractorCallback) {
if (domObject == null) return; // fail fast
extractorCallback(domObject);
if (domObject.nodeType != Node.ELEMENT_NODE) return;
var childs = domObject.childNodes;
for (var i = 0; i < childs.length; i++)
walker(childs[i]);
}
var textvalue = "":
walker(document, function(node) {
if (node.nodeType == Node.TEXT_NODE)
textvalue += node.nodeValue;
});
In such a case, if your walker encounters tags that you know you won't like to see their content, you should just skip going into that part of the tree. So walker() will have to be adapted as thus:
var ignore = { "STYLE":0, "SCRIPT":0, "NOSCRIPT":0, "IFRAME":0, "OBJECT":0 }
function walker(domObject, extractorCallback) {
if (domObject == null) return; // fail fast
extractorCallback(domObject);
if (domObject.nodeType != Node.ELEMENT_NODE) return;
if (domObject.tagName in ignore) return; // <--- HERE
var childs = domObject.childNodes;
for (var i = 0; i < childs.length; i++)
walker(childs[i]);
}
That way, if we see a tag that you don't like, we simply skip it and all its children, and your extractor will never be exposed to the text nodes inside such tags.

ID Ends With in pure Javascript

I am working in a Javascript library that brings in jQuery for one thing: an "ends with" selector. It looks like this:
$('[id$=foo]')
It will find the elements in which the id ends with "foo".
I am looking to do this without jQuery (straight JavaScript). How might you go about this? I'd also like it to be as efficient as reasonably possible.
Use querySelectorAll, not available in all browsers (like IE 5/6/7/8) though. It basically works like jQuery:
http://jsfiddle.net/BBaFa/2/
console.log(document.querySelectorAll("[id$=foo]"));
You will need to iterate over all elements on the page and then use string functions to test it. The only optimizations I can think of is changing the starting point - i.e. not document.body but some other element where you know your element will be a child of - or you could use document.getElementsByTagName() to get an element list if you know the tag name of the elements.
However, your task would be much easier if you could use some 3rd-party-javascript, e.g. Sizzle (4k minified, the same selector engine jQuery uses).
So, using everything that was said, I put together this code. Assuming my elements are all inputs, then the following code is probably the best I am going to get?
String.prototype.endsWith = function(suffix) {
return this.indexOf(suffix, this.length - suffix.length) !== -1;
};
function getInputsThatEndWith(text) {
var result = new Array();
var inputs = document.getElementsByTagName("input");
for(var i=0; i < inputs.length; i++) {
if(inputs[i].id.endsWith(text))
result.push(inputs[i]);
}
return result;
}
I put it on JsFiddle: http://jsfiddle.net/MF29n/1/
#ThiefMaster touched on how you can do the check, but here's the actual code:
function idEndsWith(str)
{
if (document.querySelectorAll)
{
return document.querySelectorAll('[id$="'+str+'"]');
}
else
{
var all,
elements = [],
i,
len,
regex;
all = document.getElementsByTagName('*');
len = all.length;
regex = new RegExp(str+'$');
for (i = 0; i < len; i++)
{
if (regex.test(all[i].id))
{
elements.push(all[i]);
}
}
return elements;
}
}
This can be enhanced in a number of ways. It currently iterates through the entire dom, but would be more efficient if it had a context:
function idEndsWith(str, context)
{
if (!context)
{
context = document;
}
...CODE... //replace all occurrences of "document" with "context"
}
There is no validation/escaping on the str variable in this function, the assumption is that it'll only receive a string of chars.
Suggested changes to your answer:
RegExp.quote = function(str) {
return str.replace(/([.?*+^$[\]\\(){}-])/g, "\\$1");
}; // from https://stackoverflow.com/questions/494035/#494122
String.prototype.endsWith = function(suffix) {
return !!this.match(new RegExp(RegExp.quote(suffix) + '$'));
};
function getInputsThatEndWith(text) {
var results = [],
inputs = document.getElementsByTagName("input"),
numInputs = inputs.length,
input;
for(var i=0; i < numInputs; i++) {
var input = inputs[i];
if(input.id.endsWith(text)) results.push(input);
}
return results;
}
http://jsfiddle.net/mattball/yJjDV/
Implementing String.endsWith using a regex instead of indexOf() is mostly a matter of preference, but I figured it was worth including for variety. If you aren't concerned about escaping special characters in the suffix, you can remove the RegExp.quote() bit, and just use
new RegExp(suffix + '$').
If you know the type of DOM elements you are targeting,
then get a list of references to them using getElementsByTagName , and then iterate over them.
You can use this optimization to fasten the iterations:
ignore the elements not having id.
target the nearest known parent of elements you want to seek, lets say your element is inside a div with id='myContainer', then you can get a restricted subset using
document.getElementById('myContainer').getElementsByTagName('*') , and then iterate over them.

How can I loop through ALL DOM elements on a page?

I'm trying to loop over ALL elements on a page, so I want to check every element that exists on this page for a special class.
So, how do I say that I want to check EVERY element?
You can pass a * to getElementsByTagName() so that it will return all elements in a page:
var all = document.getElementsByTagName("*");
for (var i=0, max=all.length; i < max; i++) {
// Do something with the element here
}
Note that you could use querySelectorAll(), if it's available (IE9+, CSS in IE8), to just find elements with a particular class.
if (document.querySelectorAll)
var clsElements = document.querySelectorAll(".mySpeshalClass");
else
// loop through all elements instead
This would certainly speed up matters for modern browsers.
Browsers now support foreach on NodeList. This means you can directly loop the elements instead of writing your own for loop.
document.querySelectorAll('*').forEach(function(node) {
// Do whatever you want with the node object.
});
Performance note - Do your best to scope what you're looking for by using a specific selector. A universal selector can return lots of nodes depending on the complexity of the page. Also, consider using document.body.querySelectorAll instead of document.querySelectorAll when you don’t care about <head> children.
Was looking for same. Well, not exactly. I only wanted to list all DOM Nodes.
var currentNode,
ni = document.createNodeIterator(document.documentElement, NodeFilter.SHOW_ELEMENT);
while(currentNode = ni.nextNode()) {
console.log(currentNode.nodeName);
}
To get elements with a specific class, we can use filter function.
var currentNode,
ni = document.createNodeIterator(
document.documentElement,
NodeFilter.SHOW_ELEMENT,
function(node){
return node.classList.contains('toggleable') ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_REJECT;
}
);
while(currentNode = ni.nextNode()) {
console.log(currentNode.nodeName);
}
Found solution on
MDN
As always the best solution is to use recursion:
loop(document);
function loop(node){
// do some thing with the node here
var nodes = node.childNodes;
for (var i = 0; i <nodes.length; i++){
if(!nodes[i]){
continue;
}
if(nodes[i].childNodes.length > 0){
loop(nodes[i]);
}
}
}
Unlike other suggestions, this solution does not require you to create an array for all the nodes, so its more light on the memory. More importantly, it finds more results. I am not sure what those results are, but when testing on chrome it finds about 50% more nodes compared to document.getElementsByTagName("*");
Here is another example on how you can loop through a document or an element:
function getNodeList(elem){
var l=new Array(elem),c=1,ret=new Array();
//This first loop will loop until the count var is stable//
for(var r=0;r<c;r++){
//This loop will loop thru the child element list//
for(var z=0;z<l[r].childNodes.length;z++){
//Push the element to the return array.
ret.push(l[r].childNodes[z]);
if(l[r].childNodes[z].childNodes[0]){
l.push(l[r].childNodes[z]);c++;
}//IF
}//FOR
}//FOR
return ret;
}
For those who are using Jquery
$("*").each(function(i,e){console.log(i+' '+e)});
Andy E. gave a good answer.
I would add, if you feel to select all the childs in some special selector (this need happened to me recently), you can apply the method "getElementsByTagName()" on any DOM object you want.
For an example, I needed to just parse "visual" part of the web page, so I just made this
var visualDomElts = document.body.getElementsByTagName('*');
This will never take in consideration the head part.
from this link
javascript reference
<html>
<head>
<title>A Simple Page</title>
<script language="JavaScript">
<!--
function findhead1()
{
var tag, tags;
// or you can use var allElem=document.all; and loop on it
tags = "The tags in the page are:"
for(i = 0; i < document.all.length; i++)
{
tag = document.all(i).tagName;
tags = tags + "\r" + tag;
}
document.write(tags);
}
// -->
</script>
</head>
<body onload="findhead1()">
<h1>Heading One</h1>
</body>
</html>
UPDATE:EDIT
since my last answer i found better simpler solution
function search(tableEvent)
{
clearResults()
document.getElementById('loading').style.display = 'block';
var params = 'formAction=SearchStocks';
var elemArray = document.mainForm.elements;
for (var i = 0; i < elemArray.length;i++)
{
var element = elemArray[i];
var elementName= element.name;
if(elementName=='formAction')
continue;
params += '&' + elementName+'='+ encodeURIComponent(element.value);
}
params += '&tableEvent=' + tableEvent;
createXmlHttpObject();
sendRequestPost(http_request,'Controller',false,params);
prepareUpdateTableContents();//function js to handle the response out of scope for this question
}
Getting all elements using var all = document.getElementsByTagName("*"); for (var i=0, max=all.length; i < max; i++); is ok if you need to check every element but will result in checking or looping repeating elements or text.
Below is a recursion implementation that checks or loop each element of all DOM elements only once and append:
(Credits to #George Reith for his recursion answer here: Map HTML to JSON)
function mapDOMCheck(html_string, json) {
treeObject = {}
dom = new jsdom.JSDOM(html_string) // use jsdom because DOMParser does not provide client-side Window for element access
document = dom.window.document
element = document.querySelector('html')
// Recurse and loop through DOM elements only once
function treeHTML(element, object) {
var nodeList = element.childNodes;
if (nodeList != null) {
if (nodeList.length) {
object[element.nodeName] = []; // IMPT: empty [] array for parent node to push non-text recursivable elements (see below)
for (var i = 0; i < nodeList.length; i++) {
console.log("nodeName", nodeList[i].nodeName);
if (nodeList[i].nodeType == 3) { // if child node is **final base-case** text node
console.log("nodeValue", nodeList[i].nodeValue);
} else { // else
object[element.nodeName].push({}); // push {} into empty [] array where {} for recursivable elements
treeHTML(nodeList[i], object[element.nodeName][object[element.nodeName].length - 1]);
}
}
}
}
}
treeHTML(element, treeObject);
}
Use *
var allElem = document.getElementsByTagName("*");
for (var i = 0; i < allElem.length; i++) {
// Do something with all element here
}
i think this is really quick
document.querySelectorAll('body,body *').forEach(function(e) {
You can try with
document.getElementsByClassName('special_class');

Getting the contents of an element WITHOUT its children [duplicate]

This question already has answers here:
How to get the text node of an element?
(11 answers)
Closed 6 months ago.
I have a mild preference in solving this in pure JS, but if the jQuery version is simpler, then jQuery is fine too. Effectively the situation is like this
<span id="thisone">
The info I want
<span id="notthisone">
I don't want any of this nonsense
</span>
</span>
I effectively want to get
The info I want
but not
The info I want I don't want any of this nonsense
and I especially don't want
The info I want <span id="notthisone"> I don't want any of this nonsense </span>
which is unfortunately what I am getting right now...
How would I do this?
With js only:
Try it out: http://jsfiddle.net/g4tRn/
var result = document.getElementById('thisone').firstChild.nodeValue;
​alert(result);​
With jQuery:
Try it out: http://jsfiddle.net/g4tRn/1
var result = $('#thisone').contents().first().text();
alert(result);​
Bonus:
If there are other text nodes in the outer <span> that you want to get, you could do something like this:
Try it out: http://jsfiddle.net/g4tRn/4
var nodes = document.getElementById('thisone').childNodes;
var result = '';
for(var i = 0; i < nodes.length; i++) {
if(nodes[i].nodeType == 3) { // If it is a text node,
result += nodes[i].nodeValue; // add its text to the result
}
}
alert(result);
​
If you just want the first child then it's rather simple. If you are looking for the first text-only element then this code will need some modification.
var text = document.getElementById('thisone').firstChild.nodeValue;
alert(text);
Have you tried something like this?
var thisone = $("#thisone").clone();
thisone.children().remove();
var mytext = thisone.html();
FROM: http://viralpatel.net/blogs/2011/02/jquery-get-text-element-without-child-element.html
$("#foo")
.clone() //clone the element
.children() //select all the children
.remove() //remove all the children
.end() //again go back to selected element
.text(); //get the text of element
Pure JavaScript
In this pure JavaScript example, I account for the possibility of multiple text nodes that could be interleaved with other kinds of nodes. Pass a containing NodeList in from calling / client code.
function getText (nodeList, target)
{
var trueTarget = target - 1;
var length = nodeList.length; // Because you may have many child nodes.
for (var i = 0; i < length; i++) {
if ((nodeList[i].nodeType === Node.TEXT_NODE) && (i === trueTarget)) {
return nodeList.childNodes[i].nodeValue;
}
}
return null;
}
You might use this function to create a wrapper function that uses this one to accumulate multiple text values.
To get a string of the child text nodes and not the element or other child nodes from a given element:
function getTextNodesText(el) {
return Array.from(el.childNodes)
.filter((child) => child.nodeType === Node.TEXT_NODE)
.map((child) => child.textContent)
.join("");
}

Categories