Get a text from HTML using javascript

I am a beginner in JavaScript.
I'm trying to understand how to work with the DOM in JS.
I would like to get the text from a website, from every DIV, no matter how complex the structure is.
If I run my code below, it gives me the text, but:
the parent div gives me its own text plus the text of its child divs, then each child div gives me its text again...
So I get a lot of repeated text.
var items = document.body.getElementsByTagName('*');
for(var i=0; i<items.length; i++)
{
document.write(items[i].textContent);
}
It's important to me to know which node each piece of text belongs to, so I can't use the following. It gives me the text, but not the nodes the text came from:
var body = document.body, textContent = 'textContent' in body ? body.textContent : body.innerText;
document.write(textContent);
I know this can be solved with jQuery, but I'm trying to understand how to do it in plain JS.

You can try getting all the matching elements using the jQuery .get() function.
Example:
var elements = [];
elements = $('div').get();
Once you have all the elements you can then grab the text, if any from each element and store it in another array like so:
var textStrings = [];
var len = elements.length;
for(var a = 0; a < len; a++)
textStrings[a] = $(elements[a]).text(); // .get() returns plain DOM elements, so re-wrap each one to use .text()
The second code block runs through the first array of elements and pulls the text from each one, saving it in a second array called textStrings. The 'len' variable caches the array length so that 'elements.length' is not re-evaluated on every pass through the loop.
Hope this helps.

The usual textContent or innerHTML approach fails in this situation, because textContent includes the text of all descendants and innerHTML contains the HTML of the child nodes.
But there is another property you can use: childNodes. This list does not only contain the child elements, but all child nodes, including text nodes:
var items = document.body.getElementsByTagName('*');
for(var i=0; i<items.length; i++)
{
var currItem = items[i];
for(var j = 0; j < currItem.childNodes.length; ++j)
{
if(currItem.childNodes[j].nodeName === "#text")
{
// the current child node is a text node, so write only its own text
document.write(currItem.childNodes[j].nodeValue);
}
}
}
Since every text node has exactly one parent, this ensures that every text node is written exactly once.
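The same idea can be sketched as a reusable function that returns each text node together with its parent, rather than writing to the page. This is only a sketch: collectTextNodes and the shape of the returned entries are names made up here, and the node type numbers 3 and 1 are the standard values of Node.TEXT_NODE and Node.ELEMENT_NODE.

```javascript
// Hypothetical helper: returns one entry per non-empty text node under `root`.
// It relies only on childNodes, nodeType and nodeValue.
function collectTextNodes(root, out) {
  out = out || [];
  var children = root.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType === 3) {            // 3 === Node.TEXT_NODE
      var text = child.nodeValue.trim();
      if (text) {
        out.push({ node: child, parent: root, text: text });
      }
    } else if (child.nodeType === 1) {     // 1 === Node.ELEMENT_NODE
      collectTextNodes(child, out);        // descend into nested elements
    }
  }
  return out;
}

// Browser usage (skipped when no document is available).
if (typeof document !== 'undefined') {
  collectTextNodes(document.body).forEach(function (entry) {
    console.log(entry.parent.nodeName + ': ' + entry.text);
  });
}
```

Note that on a real page this also picks up the contents of script and style elements, since those are text nodes too; filter on entry.parent.nodeName if that matters.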

Related

Get Elements By Class Name Nodelist does not apply new styles on every element [duplicate]

if I use
var temp = document.querySelectorAll(".class");
for (var i=0, max=temp.length; i<max; i++) {
temp[i].className = "new_class";
}
everything works fine. All nodes change their classes.
But, with gEBCN:
var temp = document.getElementsByClassName("class");
for (var i=0, max=temp.length; i<max; i++) {
temp[i].className = "new_class";
}
I get an error. The code jumps out of the loop at some point without finishing the job, with the message "can't set className of null".
I understand that this is a static vs live nodelist problem (I think), but since gEBCN is much faster and I need to traverse a huge list of nodes (tree), I would really like to use getElementsByClassName.
Is there anything I can do to stick with gEBCN without being forced to use querySelectorAll?

That's because the HTMLCollection returned by getElementsByClassName is live.
That means that if you add "class" to some element's classList, it will magically appear in temp.
The opposite is also true: if you remove the "class" class of an element inside temp, it will no longer be there.
Therefore, changing the classes reindexes the collection and changes its length. So the problem is that you iterate over it with its length cached beforehand, without taking into account the changes of the indices.
To avoid this problem, you can:
Use a non live collection. For example,
var temp = document.querySelectorAll(".class");
Convert the live HTMLCollection to an array. For example, with one of these
temp = [].slice.call(temp);
temp = Array.from(temp); // EcmaScript 6
Iterate backwards. For example, see @Quentin's answer.
Take into account the changes of the indices. For example, with either of these two loops:
for (var i=0; i<temp.length; ++i) {
temp[i].className = "new_class";
--i; // Subtract 1 each time you remove an element from the collection
}
while(temp.length) {
temp[0].className = "new_class";
}
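The "convert to an array" option above can be sketched as a small helper. This is only an illustration: setClassOnAll is a made-up name, and it assumes the argument is any array-like collection of elements, live or not.

```javascript
// Hypothetical helper: copies the collection first, so that elements dropping
// out of a live HTMLCollection cannot shift the indices being iterated.
function setClassOnAll(collection, newClass) {
  var snapshot = Array.prototype.slice.call(collection); // static copy
  for (var i = 0; i < snapshot.length; i++) {
    snapshot[i].className = newClass;
  }
  return snapshot.length; // number of elements that were updated
}

// Browser usage (skipped when no document is available).
if (typeof document !== 'undefined') {
  setClassOnAll(document.getElementsByClassName('class'), 'new_class');
}
```

The lookup still uses getElementsByClassName; only the iteration runs over the copy.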

Loop over the list backwards, then elements will vanish from the end (where you aren't looking any more).
for (var i = temp.length - 1; i >= 0; i--) {
temp[i].className = "new_class";
}
Note, however, that IE 8 supports querySelectorAll but not getElementsByClassName, so you might want to prefer querySelectorAll for better browser support.
Alternatively, don't remove the existing class:
for (var i=0, max=temp.length; i<max; i++) {
temp[i].className += " new_class";
}

Why can't the same node be inserted at two different places in a document?

I am trying to insert a button node in a document, but for some reason it's not getting inserted at both places.
The code is below:
var elements = document.querySelectorAll('.Tabelle-Titel-nur-oben');
var buttonElement = document.createElement("Button");
var t0 = document.createTextNode("CLICK ME");
buttonElement.appendChild(t0);
for(var i = 0; i< elements.length; i++)
{
document.body.insertBefore(buttonElement, elements[i]);
}
In my code, two elements match the querySelectorAll selector, but my button only gets inserted before the second one. If I use two different button instances, it works. I would like to know why a single button instance does not get inserted in two places.

Since your buttonElement is a reference to the same object, you need to clone it before adding it:
var elements = document.querySelectorAll('.Tabelle-Titel-nur-oben');
var buttonElement = document.createElement("Button");
var t0 = document.createTextNode("CLICK ME");
buttonElement.appendChild(t0);
for(var i = 0; i< elements.length; i++)
{
var btnClone = buttonElement.cloneNode(true); // the DOM method is cloneNode; true copies child nodes too
document.body.insertBefore(btnClone, elements[i]);
}
Or create the button within the loop, as @Roberrrt just pointed out as I was about to hit submit.

It comes down to the structure of the page. An HTML page is represented by the Document Object Model. This is a Tree Structure.
In a tree structure a node can have children. Allowing a node to be in two places at once would change the DOM from a Tree into a Directed Acyclic Graph. If one node was an ancestor of another node that appeared as a child of itself, that would make it a graph with cycles (i.e. loops).
This doesn't match the structure of HTML.
If you want something to appear twice in the document, it has to appear twice in the Document Object Model. Even if two objects appear to be the same object twice, they're really two different but identical objects.

The buttonElement instance exists only once, so you'll have to create it again (or clone the initial one, as @Brian suggested) for it to be placed multiple times. Fortunately, you already loop through your nodelist, so you can use that loop to create a new button per element:
var elements = document.querySelectorAll('.Tabelle-Titel-nur-oben');
for(var i = 0; i< elements.length; i++) {
var buttonElement = document.createElement("Button");
var t0 = document.createTextNode("CLICK ME");
buttonElement.appendChild(t0);
document.body.insertBefore(buttonElement, elements[i]);
}
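Both suggestions (clone, or create per iteration) can be combined into a small helper. This is a sketch under assumptions: insertCopyBeforeEach is a made-up name, and it inserts via each target's own parentNode rather than document.body, because insertBefore throws when the reference node is not a child of the node it is called on.

```javascript
// Hypothetical helper: inserts a deep copy of `template` before each target,
// since one node can only sit in one place in the tree.
function insertCopyBeforeEach(template, targets) {
  var copies = [];
  for (var i = 0; i < targets.length; i++) {
    var copy = template.cloneNode(true); // true = copy descendants as well
    targets[i].parentNode.insertBefore(copy, targets[i]);
    copies.push(copy);
  }
  return copies;
}

// Browser usage (skipped when no document is available).
if (typeof document !== 'undefined') {
  var button = document.createElement('button');
  button.textContent = 'CLICK ME';
  insertCopyBeforeEach(button, document.querySelectorAll('.Tabelle-Titel-nur-oben'));
}
```

Keep in mind that cloneNode does not copy listeners registered with addEventListener, so click handlers would need to be attached to each copy.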

Why do these `element` objects have the `search` property?

Howdy guys.
I am wondering why the following code snippet works (tested in the latest Firefox Nightly):
var links = document.querySelectorAll('a[href]');
for (var i = 0; i < links.length; ++i) {
console.log(links[i].search); // Where does `search` come from?
}
As “usual,” I get the query string of the href in each a element (something I can also do with a simple substr or something, but that's not the point); whereas, if I do something like this:
var divs = document.querySelectorAll('div');
for (var i = 0; i < divs.length; ++i) {
console.log(divs[i].search);
}
All I get is undefined.
According to MDN, there is no such thing as search property available for element objects (document.querySelectorAll(selector) returns a non-live NodeList of element objects). So, where does all this come from?
Any help would be greatly appreciated.

Different sorts of HTML element nodes in the DOM have different APIs. The nodes corresponding to <a> tags implement an API for examining URLs. The "search" property is one of those special type-specific things. Basically an <a> node has the same properties as window.location, more or less.
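To illustrate, here is a small sketch that reads the URL-related properties. It is run against a standard URL object, which exposes the same property names that anchor elements do; describeLink and the example.com address are made up for the illustration.

```javascript
// Hypothetical helper: reads URL parts from anything that exposes them,
// such as an <a> element, a URL object, or window.location.
function describeLink(link) {
  return {
    hostname: link.hostname,
    pathname: link.pathname,
    search: link.search,
    hash: link.hash
  };
}

// A URL object has the same property names as an anchor element.
var parts = describeLink(new URL('https://example.com/docs/page?q=dom&lang=js#intro'));
console.log(parts.search); // "?q=dom&lang=js"

// Browser usage (skipped when no document is available).
if (typeof document !== 'undefined') {
  var links = document.querySelectorAll('a[href]');
  for (var i = 0; i < links.length; ++i) {
    console.log(describeLink(links[i]));
  }
}
```

A div has none of these properties, which is why divs[i].search is undefined.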

surroundContents() make changes `live`?

Links to live examples on jsfiddle & jsbin.
So this function:
function symbolize(e){
var elements = e.childNodes; // text nodes are necessary!
console.log(elements);
for(var i=0; i < elements.length; i++){
t = elements[i];
var range = document.createRange(), offset = 0, length = t.nodeValue.length;
while(offset < length){
range.setStart(t, offset); range.setEnd(t, offset + 1);
range.surroundContents(document.createElement('symbol'));
offset++;
}
}
}
..should iterate over every letter and wrap it in a <symbol/> element. But it doesn't seem to be working.
So I added the console.log(); right after the *.childNodes have been fetched, but as you'll see in the example site above, the log contains 2 unexpected elements at the front(!) of the array. Because of this, I have a feeling that surroundContents(); makes the changes live(!). I couldn't find any reference on this, though.
One of the elements is an empty Text node, the other is my <symbol/>. But yeah, this is totally unexpected result and messes up the rest of the function.
What could be wrong with it?
Thanks in advance!
Update
Oh, looks like the elements are added on Chrome, Firefox doesn't add the elements, but still halts the function.

Element.childNodes is indeed a live list; it could not be otherwise (a static list would become an incorrect list of nodes as soon as the children changed). The easiest solution is to freeze it (make a copy of it) before you modify it by surrounding existing ranges.
var elements = Array.prototype.slice.call(e.childNodes, 0);
https://developer.mozilla.org/en/childNodes it's of type NodeList
https://developer.mozilla.org/En/DOM/NodeList those are live lists
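A minimal sketch of a per-letter wrapper that is not affected by the live list. It is not the question's method: instead of surroundContents it uses plain insertBefore and removeChild, it snapshots childNodes first, and it takes the document as a parameter (called doc here, an assumption made so the function does not depend on a global).

```javascript
// Hypothetical rewrite: wraps each character of the element's direct text
// nodes in its own <symbol> element.
function symbolize(element, doc) {
  // Snapshot first: the loop below changes element.childNodes.
  var nodes = Array.from(element.childNodes);
  nodes.forEach(function (node) {
    if (node.nodeType !== 3) {           // 3 === Node.TEXT_NODE
      return;                            // leave non-text children alone
    }
    Array.from(node.nodeValue).forEach(function (ch) {
      var wrapper = doc.createElement('symbol');
      wrapper.textContent = ch;
      element.insertBefore(wrapper, node); // keeps the original order
    });
    element.removeChild(node);           // the original text node is now redundant
  });
}

// In a browser this would be called as: symbolize(someElement, document);
```

Because the loop runs over the copy, the wrappers it inserts are never visited themselves.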

Trying to append to an element using js?

I am trying to add a snippet of html to all <object>'s on a page. I understand I can access elements by tag name and that I can change the element, but can I simply append to it instead?
In addition, I want to add it to the contents of each tag, not the end of the document. Which of these methods will work?

Assuming no library...
var elementToAppend = document.createElement(tagName); // Your tag name here
// Set attributes and properties on elementToAppend here with
// elementToAppend.attribute = value (eg elementToAppend.id = "some_id")
// You can then append child text and elements with DOM methods
// (createElement or createTextNode with appendChild)
// or with innerHTML (elementToAppend.innerHTML = "some html string")
var objects = document.getElementsByTagName('object');
for(var i = 0; i < objects.length; i++) {
elementToAppend = elementToAppend.cloneNode(true);
objects[i].appendChild(elementToAppend);
}
Using innerHTML or outerHTML as other answers have suggested will likely cause problems for whatever you've embedded with <object>.

appendChild is what you're looking for.
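If the thing to append is an HTML string rather than a node, insertAdjacentHTML with the 'beforeend' position adds the parsed markup as the last children of the element without re-parsing what is already inside it. The sketch below assumes that is acceptable for your embedded content, which has not been checked here; appendHtmlToAll and the sample markup are made up.

```javascript
// Hypothetical helper: appends the same HTML snippet inside every element
// of an array-like collection.
function appendHtmlToAll(elements, html) {
  for (var i = 0; i < elements.length; i++) {
    // 'beforeend' = inside the element, after its last existing child
    elements[i].insertAdjacentHTML('beforeend', html);
  }
}

// Browser usage (skipped when no document is available).
if (typeof document !== 'undefined') {
  appendHtmlToAll(document.getElementsByTagName('object'), '<p>Fallback text</p>');
}
```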
