Is it normal that JavaScript can create otherwise invalid DOM?

Somewhat by accident, I found out that a span inserted directly inside a tbody stays in place when inserted with JavaScript (insertBefore), whereas the same invalid structure written as literal HTML leads to the span being placed before the entire table.
I expected either the same behaviour as with literal HTML or some DOM Exception being thrown.
E.g. this HTML
<table>
<thead><tr><th>Table Header</th></tr></thead>
<tbody>
<span>from HTML → goes up</span>
<tr><td>Table Contents</td></tr>
</tbody>
</table>
with this JavaScript:
var span = document.createElement('span'),
tbody = document.querySelector('tbody');
span.innerHTML = 'Created with JS → stays in place';
tbody.insertBefore(span, tbody.querySelector('tr'));
renders "Created with JS → stays in place" between the header and the first row; the original, literal, span moves outside of the table.
Is this normal, and can/should I count on this? (It behaves the same in FF, Chrome, Opera, IE >= 9 (not tested below)).
Also, is there a way to query the DOM whether content of a certain type would (under normal circumstances) be valid at a certain point in the DOM? This is actually what I wanted to do when I found out about this quirk (which it is, imho).
The fiddle is here: http://jsfiddle.net/xr37g9kw/2/

As for "is this normal, and can/should I count on this?": sadly, yes. But mostly you should be aware of the node types you are working with. NB: for tables, there are a handful of not-so-well-known DOM properties and methods (HTMLTableElement.rows, insertRow() and so on).
As for "is there a way to query the DOM whether content of a certain type would (under normal circumstances) be valid at a certain point in the DOM?": there is nothing built in for this exact purpose, but you could exploit one native feature of the JavaScript DOM API: you can let the browser re-parse an HTML chunk in the "literal way". Yes, I am speaking about innerHTML.
In your fiddle, adding tbody.outerHTML = tbody.outerHTML "fixes" the structure, so you could hypothetically take some DOM node, look at its DOM tree, clone it, "re-eval" it and compare the result with the original.
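The re-parse-and-compare idea could be sketched roughly as below. This is only a sketch under assumptions: the names reparseInBrowser and survivesReparse are made up for illustration, and the markup passed in must be valid directly inside a div (for the fiddle, pass the whole table rather than the tbody, because a bare tbody would itself be dropped when parsed inside a div). The parser is a parameter so the comparison can also be tried outside a browser.

```javascript
// Hypothetical helper: re-parse markup the "literal way" inside a detached div
// and return what the browser's HTML parser made of it. Browser only.
function reparseInBrowser(html) {
  const container = document.createElement('div');
  container.innerHTML = html;
  return container.innerHTML;
}

// Hypothetical helper: true if the markup comes back unchanged after a re-parse,
// i.e. the parser did not have to move or drop anything.
function survivesReparse(html, reparse = reparseInBrowser) {
  return reparse(html) === html;
}

// In a browser (untested sketch):
//   survivesReparse(document.querySelector('table').outerHTML)
// For the table in the question this should give false once the span
// has been inserted into the tbody with insertBefore.
```

A plain string comparison is crude: some perfectly valid markup may also serialize slightly differently after a re-parse, so treat a false result as a hint rather than proof.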

Yes, that is the default behavior, as you saw. It's not valid HTML, of course, as you can check by pasting the markup into a validator such as http://www.freeformatter.com/html-validator.html.

Related

Why "document.title" and NOT "document.head.title"? RE: Traversing the DOM

I am just beginning to learn client-side JavaScript and using an online tutorial, so please bear with me.
This question is based on my understanding of the following:
To access the properties of the document's body, the syntax is "document.body", which returns all the elements in the body.
Similarly when you access the head, you use "document.head". Makes sense and most importantly, it works.
However, when I attempt to access elements WITHIN the body or head following the same logic, I get a return value of "undefined". For example, document.body.h1, returns "undefined", in spite of there being an h1 element inside the body element.
Further, when I enter document.head.title -- "undefined".
Strangely, however, when I enter "document.title", it returns the string value associated with the title tag.
I thought in order to access the title, you would have to access it through the head, since it is an element nested inside the head. But ok, that's fine. Using the same logic, I should then be able to enter document.h1 and get its value. Nope, instead, I get undefined.
Would someone be kind enough to explain to me why this behavior is so inconsistent. Thanks in advance.
You've really asked two questions:
Why document.title rather than document.head.title?
and
Why doesn't document.body.h1 return an element if there's an h1 in the body?
document.title
document.title is historical. Various parts of the browser environment were developed somewhat ad hoc by multiple different people/organizations in the 1990s. :-) That said, it's the title of the document, so this isn't an unreasonable place to put it, even if you use the title tag in head.
document.body.h1
One answer is: Because no one decided to design it that way. There were some early things like document.all (a list of all elements in the document) and even tag-specific ones (I forget exactly what they were, but they weren't a million miles off your document.body.h1 — I think document.tags.h1 or something, where again it was a list.)
But another answer is: Because the DOM is a tree. body can have multiple h1 elements, both as direct children and as children of children (or deeper); collectively, descendants. Creating automatic lists with all of these proved not to be scalable to large documents.
Instead, you can query the DOM (either the entire document, or just the contents of a specific element) via a variety of methods:
getElementById - (Just on document) Get an element using its id attribute value.
querySelector - Find the first element matching a CSS selector (can use it on document or on an element). Returns null if there were no matches.
querySelectorAll - Get a list of all elements matching a CSS selector (can use it on document or on an element). You can rely on getting back a list; its length may be 0, of course.
getElementsByTagName - Get a list of all elements with a given tag name (such as "h1").
getElementsByClassName - (No support in IE8 and earlier) Get a list of all elements with a given class.
There are many more. See MDN's web documentation and/or the WHAT-WG DOM Standard for more.
Some of the automatic lists persist (they got so much use that they had to be maintained/kept), such as document.forms, document.links, the rows property on HTMLTableElement and HTMLTableSectionElement instances, the cells property on HTMLTableRowElement instances, and various others.
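To make the "DOM is a tree" point concrete, here is a minimal sketch of what finding all h1 descendants involves. The function name collectByTagName is made up for illustration; in real code you would normally just call getElementsByTagName or querySelectorAll, which do this walk for you. It assumes every node exposes tagName (upper case, as for HTML elements in an HTML document) and an iterable children collection.

```javascript
// Hypothetical helper: collect every descendant of `root` with the given tag
// name, at any depth, by walking the `children` of each element in document order.
function collectByTagName(root, tagName) {
  const wanted = tagName.toUpperCase();
  const found = [];
  for (const child of root.children) {
    if (child.tagName === wanted) {
      found.push(child);
    }
    // Descend: a matching element may sit several levels below `root`.
    found.push(...collectByTagName(child, tagName));
  }
  return found;
}

// In a browser: collectByTagName(document.body, 'h1')
// should find the same elements as document.body.querySelectorAll('h1').
```

Because the result is a list of zero or more elements, a single property such as document.body.h1 would have no obvious value to return.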
document.head.title is a thing... but not what you might think.
title is an attribute that is applicable to all HTML elements; that is, it is a global attribute. Its meaning is 'advisory information'; one use is to display a tooltip:
<span title="hover over me and you'll see this">information</span>
So, all elements have a title attribute - including head. The title element - which is completely different - should be a child of the head though. So you might be tempted to set its value via document.head.title = "my title" , but document.head.title is not the head's title element, it's a property of the head element.
What you're actually doing is setting the title property on the head element:
<head title="my title">.... </head>
... which isn't what you want at all.
The correct way to set the title is document.title, which is a shortcut way of doing
document.querySelector("title").innerText = "my title"

How do I get reference to the ::before or ::after node in JS?

As far as I know, standard JavaScript has no way to get at the ::before or ::after pseudo-elements. Element.children doesn't let you get to it.
I know there has to be a way, at least in Chrome-privileged Firefox add-on code, since it lists every ::before element in the page (and apparently getComputedStyle() works on it too, as you can list all styles of it in inspector, which is written in JavaScript).
Where is this API documented, and is it something that's different and privileged-only in say Firefox and Chrome browser, or something that is on track to be standard soon?
The CSS generated content is not part of the DOM, and you wouldn't be able to do much with the ::before/::after pseudo-elements, even if you get at them. The only use-cases I can think of are:
Access the CSS computed values on the pseudo-elements. window.getComputedStyle() supports this via an optional 2nd parameter.
Enumerate the generated content. You can accomplish this:
by using a browser-specific API. In Firefox, the DevTools inspector uses a special interface - inIDeepTreeWalker.
or by walking the DOM and checking (for each element) if it has content in its computed style for :before / :after. For example:
window.getComputedStyle(elt, ':before').content
Get the "live" value of a counter defined in CSS, like in How to access CSS generated content with JavaScript - see that question for details.
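The "walk the DOM and check the computed style" option could look roughly like this. It is a sketch, not a tested tool: the name listGeneratedContent and the injectable getStyle parameter are made up for illustration, and it assumes that a pseudo-element without generated content reports a computed content of "none" or "normal".

```javascript
// Hypothetical helper: list every ::before / ::after under `root` that has
// generated content, by asking for the computed style of each pseudo-element.
function listGeneratedContent(
  root,
  getStyle = (el, pseudo) => window.getComputedStyle(el, pseudo)
) {
  const found = [];
  for (const el of root.querySelectorAll('*')) {
    for (const pseudo of ['::before', '::after']) {
      const content = getStyle(el, pseudo).content;
      // Assumption: 'none' / 'normal' mean "no generated content here".
      if (content && content !== 'none' && content !== 'normal') {
        found.push({ element: el, pseudo: pseudo, content: content });
      }
    }
  }
  return found;
}

// In a browser: listGeneratedContent(document.body)
```

Note that this only reports the owning element and the computed content string; it still does not give you a node for the pseudo-element itself.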
At least to me, your question is unclear as to exactly what you are attempting to do, or get.
The most direct equivalent to ::before and ::after:
If you are wanting to actually insert content, which is what the ::before and ::after CSS selectors do, then the most direct equivalent is Element.insertAdjacentHTML(position, text). In that case:
The equivalent of ::before would be:
Element.insertAdjacentHTML("beforebegin", "<p>Additional HTML content before element.</p>");
The equivalent of ::after would be:
Element.insertAdjacentHTML("afterend", "<p>Additional HTML content after element.</p>");
Element.insertAdjacentHTML() also has options of afterbegin and beforeend which insert the HTML text just after the beginning, or just before the end, of the referenced Element.
Alternately:
You could insert nodes using Node.insertBefore(newNode, referenceNode).
For ::before it would be (insert newNode before myNode):
myNode.parentNode.insertBefore(newNode, myNode);
For ::after it would be (insert newNode after myNode):
myNode.parentNode.insertBefore(newNode, myNode.nextSibling);
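The "insert after" line above also works when myNode is the last child: nextSibling is then null, and insertBefore with a null reference node appends at the end. Wrapped as a small helper (the name insertAfter is made up for illustration), that might look like this:

```javascript
// Hypothetical helper: insert `newNode` directly after `referenceNode`.
function insertAfter(newNode, referenceNode) {
  // If referenceNode is the last child, nextSibling is null and
  // insertBefore(newNode, null) appends newNode at the end of the parent.
  return referenceNode.parentNode.insertBefore(newNode, referenceNode.nextSibling);
}

// In a browser: insertAfter(document.createElement('p'), myNode);
```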
Obtaining references:
If you are attempting to get a reference to the element that is earlier in the DOM, then it sounds like you are looking for Node.previousSibling. If you are looking for a reference to the element that is later in the DOM, then you are looking for Node.nextSibling.
In DOM walk order:
It is also possible that you are looking for the elements that are just before and just after the reference Node in DOM walk order. That is not really what the CSS selectors ::before and ::after do, but from your mention of Page Inspector, it kind of sounds like this is what you want. If so, then you can use a TreeWalker to walk the DOM tree.
The following should do what you want (Note: Currently untested, so might be missing something.):
//referenceNode is the node for which we want to find the elements
// before and after in DOM walk order.
//Create the TreeWalker
let treeWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_ELEMENT,
{acceptNode: function(node) {
return NodeFilter.FILTER_ACCEPT;
}
},
false );
//Point the TreeWalker at the referenceNode.
treeWalker.currentNode = referenceNode;
//Get the node immediately prior to the referenceNode in DOM walk order
let thePreviousNode = treeWalker.previousNode();
//Point the TreeWalker back at the referenceNode.
treeWalker.currentNode = referenceNode;
//Get the node immediately after the referenceNode in DOM walk order
let theNextNode = treeWalker.nextNode();
As mentioned by Nickolay, if you want the full detail that Page Inspector, or the DOM Inspector, provides, then you will need to use an inIDeepTreeWalker. However, it is unlikely that you want, or need, the detail which that Firefox-specific, non-standard interface provides. You only need it if you want to walk through how something like a XUL <toolbarbutton> is constructed (not the attributes/properties, but the XBL which makes up XUL elements like a <toolbarbutton>). For the vast majority of what you are potentially thinking about, a standard TreeWalker should be just fine.
With the exception of inIDeepTreeWalker, all of the above are standard parts of JavaScript and do not require elevated privileges (i.e. they do not need to be in an add-on).
You can use the selectorMatchesElement() function of inIDOMUtils.
You can read more about it here - https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Reference/Interface/inIDOMUtils#selectorMatchesElement%28%29

innerText vs innerHTML vs label vs text vs textContent vs outerText

I have a dropdown list which is populated by Javascript.
Whilst deciding what should be the default value to show on load, I realised that the following properties showed exactly the same values:
innerText
innerHTML
label
text
textContent
outerText
My own research shows bench marking tests or comparisons between a few of them, but not all.
I can use my own common sense and choose 1 or the other as they provide the same result, but, I'm concerned this is not going to be a good idea if the data were to change.
My findings are:
innerText will show the value as is and ignores any HTML formatting which may be included
innerHTML will show the value and apply any HTML formatting
label appears to be the same as innerText, so I can't see the difference
text appears to be the same as innerText but the jQuery shorthand version
textContent appears to be the same as innerText but keeps formatting (such as \n)
outerText appears to be the same as innerText
My research can only take me so far as I can only test what I can think of or read what is published, can any one confirm though if my research is correct and if there is anything special about label and outerText?
From MDN:
Internet Explorer introduced element.innerText. The intention is pretty much the same [as textContent] with a couple of differences:
Note that while textContent gets the content of all elements, including <script> and <style> elements, the mostly equivalent IE-specific property, innerText, does not.
innerText is also aware of style and will not return the text of hidden elements, whereas textContent will.
As innerText is aware of CSS styling, it will trigger a reflow, whereas textContent will not.
So innerText will not include text that is hidden by CSS, but textContent will.
innerHTML returns the HTML as its name indicates. Quite often, in order to retrieve or write text within an element, people use innerHTML. textContent should be used instead. Because the text is not parsed as HTML, it's likely to have better performance. Moreover, this avoids an XSS attack vector.
In case you missed that, let me repeat it more clearly: Do not use .innerHTML unless you specifically intend to insert HTML within an element and have taken the necessary precautions to ensure that the HTML you are inserting cannot contain malicious content. If you only want to insert text, use .textContent or if you need to support IE8 and earlier, use feature detection to switch off between .textContent and .innerText.
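The feature detection mentioned above might be sketched like this. The helper name setText is made up for illustration; the check relies on the assumption that old IE elements simply lack a textContent property.

```javascript
// Hypothetical helper: set plain text on an element without going through innerHTML.
function setText(el, text) {
  if ('textContent' in el) {
    // Standard property; the string is never parsed as HTML.
    el.textContent = text;
  } else {
    // Fallback for IE8 and earlier, which only know innerText.
    el.innerText = text;
  }
}

// In a browser: setText(document.getElementById('greeting'), '<b>not parsed</b>');
// ('greeting' is a hypothetical id.)
```

Either branch inserts the string as literal text, so markup in the input is displayed rather than interpreted.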
A main reason that there are so many different properties is that different browsers originally had different names for these properties, and there still isn't complete cross-browser support for all of them. If you are using jQuery, you should stick to .text() since that is designed to smooth out cross-browser differences.
For some of the others: outerHTML is basically the same as innerHTML, except that it includes the start and end tags of the element it belongs to. I can't seem to find much description of outerText at all. I think that is probably an obscure legacy property and should be avoided.
Addendum to JLRishe's otherwise excellent answer:
The reason innerText and outerText both exist is for symmetry with innerHTML and outerHTML. This becomes important when you assign to the property.
Suppose you've got an element e with HTML code <b>Lorem Ipsum</b>:
e.innerHTML = "<i>Hello</i> World!"; => <b><i>Hello</i> World!</b>
e.outerHTML = "<i>Hello</i> World!"; => <i>Hello</i> World!
e.innerText = "Hello World!"; => <b>Hello World!</b>
e.outerText = "Hello World!"; => Hello World!
A dropdown list comprises a collection of Option objects, so you should use the .text property to inspect the textual representation of the element, i.e.
<option value="123">text goes here</option>
^^^^^^^^^^^^^^
Btw,
.text appears to be the same as .innerText but the JQuery shorthand version
That's not correct; $(element).text() is the jQuery version whereas element.text is the property access version.
text and label remove extra spaces. I got these results when querying options in a dropdown:
e.textContent = "A B C D "
e.text = "A B C D"
e.label = "A B C D"
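These results are consistent with how the HTML specification describes option.text: the text content with ASCII whitespace stripped and collapsed. The plain function below only imitates that normalization so the difference from textContent can be seen without a page; the name collapseAsciiWhitespace is made up for illustration and it is not what browsers literally run.

```javascript
// Hypothetical helper imitating the whitespace handling of option.text:
// runs of ASCII whitespace become one space, leading/trailing space is dropped.
function collapseAsciiWhitespace(value) {
  return value
    .replace(/[\t\n\f\r ]+/g, ' ') // collapse each run to a single space
    .replace(/^ | $/g, '');        // strip what is left at either end
}

// textContent would keep the raw string; text (and label, when there is no
// label attribute) would correspond to the normalized one.
const raw = 'A  B   C D  ';
const normalized = collapseAsciiWhitespace(raw);
```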
textContent will not format (\n)
See the browser compatibility tables at http://www.quirksmode.org/dom/html/ if you are targeting specific browsers, because it seems like they all have their own way of doing things. That is why it is better to use jQuery .text() (http://api.jquery.com/text/) if you do not want to fiddle around.

how to use js to do very basic syntax coloring?

here's a very simple js but i don't know where to begin.
in a html page, if some text is enclosed by angle brackets, like this:
〈some text〉
i want the text to be colored (but not the brackets).
in normal html, i'd code it like this
〈<span class="booktitle">some text</span>〉
So, my question is, how do i start to write such a js script that search the text and replace it with span tags?
some basic guide on how to would be sufficient. Thanks.
(i know i need to read the whole html, find the match perhaps using regex, then replace the page with the new one. But have no idea how that can be done with js/DOM. Do i need to traverse every element, get their inner text, do possible replacement? A short example would be greatly appreciated.)
It depends partially on how cautious you need to be not to disturb event handlers on the elements you're traversing. If it's your page and you're in control of the handlers, you may not need to worry; if you're doing a library or bookmarklet or similar, you need to be very careful.
For example, consider this markup:
<p>And <a href='foo.html'>the 〈foo〉 is 〈bar〉</a>.</p>
If you did this:
var p = /* ...get a reference to the `p` element... */;
p.innerHTML = p.innerHTML.replace(/〈([^〉]*)〉/g, function(whole, c0) {
return "〈<span class='booktitle'>" + c0 + "</span>〉";
});
(live example) (the example uses unicode escapes and HTML numeric entities for 〈 and 〉 rather than the literals above, because JSBin doesn't like them raw, presumably an encoding issue)
...that would work and be really easy (as you see), but if there were an event handler on the a, it would get blown away (because we're destroying the a and recreating it). But if your text is uncomplicated and you're in control of the event handlers on it, that kind of simple solution might be all you need.
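The string transformation itself can be tried without a page by pulling it into a function. This is the same regex as above; only the function name wrapBookTitles is made up for illustration.

```javascript
// Hypothetical helper: wrap the text between 〈 and 〉 in a booktitle span,
// leaving the brackets themselves outside the span.
function wrapBookTitles(html) {
  return html.replace(/〈([^〉]*)〉/g, function (whole, inner) {
    return "〈<span class='booktitle'>" + inner + "</span>〉";
  });
}

// In a browser: p.innerHTML = wrapBookTitles(p.innerHTML);
```

The caveat from above still applies: assigning the result back to innerHTML recreates every element inside p, so handlers attached to those elements are lost.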
To be (almost) completely minimal-impact will require walking the DOM tree and only processing text nodes. For that, you'd be using Node#childNodes (for walking through the DOM), Node#nodeType (to know what kind of node you're dealing with), Node#nodeValue (to get the text of a text node), Text#splitText (on the text nodes, to split them in two so you can move one of them into your span), and Node#appendChild (to rehome the text node that you need to put in your span; don't worry about removing them from their parent, appendChild handles that for you). The above are covered by the DOM specification (Level 2 and Level 3; most browsers are somewhere between the two).
You'll want to be careful about this sort of case:
<p>The 〈foo <em>and</em> bar〉.</p>
...where the 〈 and the 〉 are in different text nodes (both children of the p, on either side of an em element); there you'll have to move part of each text node and the whole of the em into your span, most likely.
Hopefully that's enough to get you started.
If the text could be anywhere in the page, you have to traverse each DOM element and split the text wherever a regex finds a match.
I have put my code up there on jsfiddle: http://jsfiddle.net/thai/RjHqe/
What it does: It looks at the node you put it in,
If it's an element, then it looks into every child node of it.
If it's a text node, it finds the text enclosed in 〈angle brackets〉. If there is a match (look at the first match only), then it splits the text node into 3 parts:
left (the opening bracket and also text before that)
middle (the text inside the angle bracket)
right (the closing bracket and text after it)
the middle part is wrapped inside the <span> and the right part is then searched for more angle brackets.
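The three-way split described above can be sketched on a plain string. This is not the code from the fiddle; the function name splitAtFirstTitle is made up for illustration, and it works on strings rather than text nodes so it can be tried anywhere.

```javascript
// Hypothetical helper: split `text` at the first 〈...〉 pair into
// left (up to and including 〈), middle (the enclosed text) and
// right (from 〉 onwards). Returns null if there is no match.
function splitAtFirstTitle(text) {
  const match = /〈([^〉]*)〉/.exec(text);
  if (match === null) {
    return null;
  }
  const middleStart = match.index + 1;              // just after the opening bracket
  const middleEnd = middleStart + match[1].length;  // just before the closing bracket
  return {
    left: text.slice(0, middleStart),
    middle: text.slice(middleStart, middleEnd),
    right: text.slice(middleEnd),
  };
}
```

On a real text node the same two offsets would be used with splitText: splitting at middleStart returns a node starting with the middle part, and splitting that node at the length of the middle part leaves the middle part on its own, ready to be moved into a span.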

JavaScript & copy style

I am copying a table cell with javascript.
It works fine, just that it doesn't copy the style.
I wanted to copy like below, but that didn't work.
newCell.style=oldCell.style;
So I figured that for my text-align, I have to copy it like this:
newCell.style.textAlign=oldCell.style.textAlign;
That worked, but whenever I add a new style item, I have to remember to register it here.
So, my problem now is how can I loop over the style and copy every item in there?
With chrome, I managed to do it like this:
var strAttribute = GetDomNameFromAttributeName(oRow.cells[1].style[0]);
var styletocopy = eval('oRow.cells[1].style.'+strAttribute);
eval("newCell.style."+strAttribute+"='"+styletocopy+"'"); // //newCell.style.textAlign='center';
But that doesn't work with IE. Haven't tested it with FF, but I assume Chrome compatibility.
Is there any way to loop over the style elements in IE?
Or is there any better way to copy all style elements?
eval('oRow.cells[1].style.'+strAttribute)
Never use eval like this(*). In JavaScript you can access a property whose name is stored in a string using square brackets. object.plop is the same as object['plop']:
to.style[name]= from.style[name];
(*: never use eval at all if you can help it. There are only a few very specific and rare occasions you need it.)
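As a small illustration of bracket access replacing eval, here is a sketch that copies a list of named properties from one object to another. The function name copyNamedProperties and the property names in the usage comment are just examples.

```javascript
// Hypothetical helper: copy the listed properties using bracket notation,
// so the property name can come from a string without any eval.
function copyNamedProperties(from, to, names) {
  for (const name of names) {
    to[name] = from[name];
  }
  return to;
}

// With table cells in a browser:
//   copyNamedProperties(oldCell.style, newCell.style, ['textAlign', 'color']);
```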
Is there any way to loop over the style elements
The style object is supposed to support the DOM Level 2 CSS CSSStyleDeclaration interface. You could loop over the rules and apply them to another element like this:
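// Walk the inline style declarations of `from` and copy each one to `to`,
// preserving the !important priority.
for (var i= from.style.length; i-->0;) {
    var name= from.style[i];
    to.style.setProperty(name,
        from.style.getPropertyValue(name),
        from.style.getPropertyPriority(name)
    );
}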
in IE?
No, IE does not support the whole CSSStyleDeclaration interface and the above won't work. However there is a simpler way not involving looping that will work on IE and the other browsers too:
to.style.cssText= from.style.cssText;
As simple as that! IE doesn't quite preserve the CSS text the way it should, but the difference doesn't matter for simple inline style copying.
However, as Pikrass said (+1), if you are copying a whole element and not just the styles, cloneNode is by far the most elegant way to do that.
You can copy a DOM Element with all its content (including attributes) with .cloneNode(true) :
var clonedTr = document.getElementById('id').cloneNode(true);
Then clonedTr is an exact copy of the tr #id.
The "true" means you want to copy the content of the element.
To copy all style elements from one node to another you can use
newCell.setAttribute('style', oRow.cells[1].getAttribute('style'))
