Elements outside document - javascript

I was just reading this article by MDN and saw that, according to the specifications for "document.getElementById", elements not in the document are not searched.
I'm confused about why/how elements can be outside the document. How does this differ from the definition of an absolutely positioned element (namely, that absolutely positioned elements are removed from the document flow)? I'm not entirely sure the absolutely positioned element case applies to this, but a clarification on what it means to be "outside the document" and why something like that would be used would be greatly appreciated.

A document is a tree, but you can have nodes (leaves/branches) that aren't on the tree (either because they never were, or because they've come off it).
Examples will probably make this clearer.
Example 1: Never in the tree:
Here's an element that's not in any document:
var elm = document.createElement('div');
elm.id = "foo";
That's an element, with an id, but it isn't part of any document.
Example 2: Removed from the tree:
HTML:
<body>
<div id="foo"></div>
</body>
JavaScript:
// The div is in the document, so this works:
var elm = document.getElementById("foo");
// Now we remove it:
elm.parentNode.removeChild(elm);
// 'elm' is no longer in any document
console.log(document.getElementById("foo")); // null

"not in the document" means "not stored in the DOM tree of the current document", i.e. the nodes exist in memory, but they're not "attached" to any node on the page.
A corollary of that is that the elements must therefore be invisible, but for an entirely different reason than with absolute positioning. In the latter case, the nodes do exist in the DOM; they just may not be positioned "on screen".
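To make "attached" concrete, here is a minimal sketch of the check the answers describe: follow the parentNode links upward and see whether they ever reach the document. The helper name isInDocument and the stand-in objects are hypothetical; they are only there so the snippet also runs outside a browser.

```javascript
// Hypothetical helper: true when `node` is attached to `doc`.
// A detached node runs out of parents before reaching the document.
function isInDocument(node, doc) {
  for (let current = node; current; current = current.parentNode) {
    if (current === doc) return true;
  }
  return false;
}

// Stand-in objects (assumption: only the parentNode links matter here).
const fakeDocument = { parentNode: null };
const fakeHtml = { parentNode: fakeDocument };
const attached = { parentNode: fakeHtml };
const detached = { parentNode: null }; // like a fresh createElement('div')

console.log(isInDocument(attached, fakeDocument)); // true
console.log(isInDocument(detached, fakeDocument)); // false
```

In a page you would call isInDocument(elm, document). Current browsers also expose elm.isConnected and document.contains(elm), which answer much the same question without a hand-written loop.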

Related

Why is there no HTMLSectionElement and no HTMLArticleElement in Javascript?

You can test whether an element is a div or a span like this:
const div = document.createElement('div');
console.log(div instanceof HTMLDivElement);
const span = document.createElement('span');
console.log(span instanceof HTMLSpanElement);
This way of testing so far has worked for most HTML elements I'm aware of.
Unfortunately, the same approach to checking an element's type is not available for section and article elements, which means I'd probably have to resort to el.tagName === 'SECTION' or el.tagName === 'ARTICLE'.
Edit: Just tested, the following globals all don't exist either:
HTMLNavElement
HTMLHeaderElement
HTMLMainElement
HTMLAsideElement
HTMLFooterElement
Does anyone know, and have a reference for, why there are no HTMLSectionElement and HTMLArticleElement globals?
Is this because all of them are technically div elements with a different tag name to provide better semantics?
Thanks to @pointy for his comment pointing this out. The following section of the current HTML specification explains it:
The basic interface, from which all the HTML elements' interfaces inherit, and which must be used by elements that have no additional requirements, is the HTMLElement interface.
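Since these elements only implement the base HTMLElement interface, a tag-name check is the practical substitute for instanceof. The sketch below is one way to wrap it; the helper name isPlainSemanticElement and the tag list (taken from the globals tested in the question) are assumptions, and the plain objects stand in for real elements so the snippet runs anywhere.

```javascript
// Tags from the question that have no dedicated HTML*Element interface.
const PLAIN_SEMANTIC_TAGS = new Set([
  'section', 'article', 'nav', 'header', 'main', 'aside', 'footer',
]);

// Hypothetical helper: true when `el` is one of those elements.
// tagName is upper-case in HTML documents, so normalise before comparing.
function isPlainSemanticElement(el) {
  return PLAIN_SEMANTIC_TAGS.has(el.tagName.toLowerCase());
}

// Stand-in objects (assumption: only tagName is needed for the check).
console.log(isPlainSemanticElement({ tagName: 'SECTION' })); // true
console.log(isPlainSemanticElement({ tagName: 'DIV' }));     // false
```

In a browser, pass a real element, for example isPlainSemanticElement(document.createElement('article')).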

Why "document.title" and NOT "document.head.title"? RE: Traversing the DOM

I am just beginning to learn client-side JavaScript and using an online tutorial, so please bear with me.
This question is based on my understanding of the following:
To access the properties of the document's body, the syntax is "document.body", which returns all the elements in the body.
Similarly when you access the head, you use "document.head". Makes sense and most importantly, it works.
However, when I attempt to access elements WITHIN the body or head following the same logic, I get a return value of "undefined". For example, document.body.h1, returns "undefined", in spite of there being an h1 element inside the body element.
Further, when I enter document.head.title -- "undefined".
Strangely, however, when I enter "document.title", it returns the string value associated with the title tag.
I thought in order to access the title, you would have to access it through the head, since it is an element nested inside the head. But ok, that's fine. Using the same logic, I should then be able to enter document.h1 and get its value. Nope, instead, I get undefined.
Would someone be kind enough to explain to me why this behavior is so inconsistent. Thanks in advance.
You've really asked two questions:
Why document.title rather than document.head.title?
and
Why doesn't document.body.h1 return an element if there's an h1 in the body?
document.title
document.title is historical. Various parts of the browser environment were developed somewhat ad hoc by multiple different people/organizations in the 1990s. :-) That said, it's the title of the document, so this isn't an unreasonable place to put it, even if you use the title tag in head.
document.body.h1
One answer is: Because no one decided to design it that way. There were some early things like document.all (a list of all elements in the document) and even tag-specific ones (I forget exactly what they were, but they weren't a million miles off your document.body.h1 — I think document.tags.h1 or something, where again it was a list.)
But another answer is: Because the DOM is a tree. body can have multiple h1 elements, both as direct children and as children of children (or deeper); collectively, descendants. Creating automatic lists with all of these proved not to be scalable to large documents.
Instead, you can query the DOM (either the entire document, or just the contents of a specific element) via a variety of methods:
getElementById - (Just on document) Get an element using its id attribute value.
querySelector - Find the first element matching a CSS selector (can use it on document or on an element). Returns null if there were no matches.
querySelectorAll - Get a list of all elements matching a CSS selector (can use it on document or on an element). You can rely on getting back a list; its length may be 0, of course.
getElementsByTagName - Get a list of all elements with a given tag name (such as "h1").
getElementsByClassName - (No support in IE8 and earlier) Get a list of all elements with a given class.
There are many more. See MDN's web documentation and/or the WHAT-WG DOM Standard for more.
Some of the automatic lists persist (they got so much use that they had to be maintained/kept), such as document.forms, document.links, the rows property on HTMLTableElement and HTMLTableSectionElement instances, the cells property on HTMLTableRowElement instances, and various others.
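To see why a single document.body.h1 property would not fit a tree, here is a rough sketch of what a tag-name lookup has to do: visit every descendant, at any depth, and collect the matches. The function name collectByTagName and the stand-in tree are hypothetical; unlike the built-in getElementsByTagName, this returns a plain array that is not live.

```javascript
// Hypothetical helper: collect every descendant of `root` with the given tag,
// in document order, however deeply it is nested.
function collectByTagName(root, tagName) {
  const wanted = tagName.toUpperCase();
  const found = [];
  (function visit(el) {
    for (const child of el.children) {
      if (child.tagName === wanted) found.push(child);
      visit(child); // matches can sit below any child
    }
  })(root);
  return found;
}

// Stand-in tree (assumption: only tagName and children are needed).
const fakeBody = {
  tagName: 'BODY',
  children: [
    { tagName: 'H1', children: [] },
    { tagName: 'DIV', children: [
      { tagName: 'SECTION', children: [{ tagName: 'H1', children: [] }] },
    ] },
  ],
};

console.log(collectByTagName(fakeBody, 'h1').length); // 2
```

In a page, collectByTagName(document.body, 'h1') should find the same elements as document.body.querySelectorAll('h1').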
document.head.title is a thing... but not what you might think.
title is an attribute that is applicable to all HTML elements; that is, it is a global attribute. Its meaning is 'advisory information'; one use is to display a tooltip:
<span title="hover over me and you'll see this">information</span>
So, all elements have a title attribute, including head. The title element, which is completely different, should be a child of the head though. So you might be tempted to set its value via document.head.title = "my title", but document.head.title is not the head's title element; it's a property of the head element itself.
What you're actually doing is setting the title property on the head element:
<head title="my title">.... </head>
... which isn't what you want at all.
The correct way to set the title is document.title, which is roughly a shortcut for
document.querySelector("title").textContent = "my title"

puzzled about document.documentElement

I'm puzzled about this.
1. Is document equal to document.documentElement? I think they are both the root node.
2. Why can I use document.documentElement.getElementsByTagName() but not document.documentElement.getElementById()?
There is a difference between the document object and the document element.
When an HTML document is loaded into a web browser, it becomes a document object.
The document object is the root node of the HTML document and the common ancestor of all other nodes, such as element nodes (including the document element), text nodes and attribute nodes.
One of the differences is that an element has getElementsByTagName() but not getElementById(), which is part of the document itself.
To successfully use an element to get another one based on ID, you need to go through its document:
var elem2 = elem1.ownerDocument.getElementById(whatever)
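The distinction can be checked through nodeType: the document object reports 9 (DOCUMENT_NODE), while document.documentElement is an ordinary element and reports 1. The sketch below uses that to find the owning document from either kind of node; the helper name owningDocument and the stand-in objects are hypothetical.

```javascript
const DOCUMENT_NODE = 9; // same value as Node.DOCUMENT_NODE in browsers

// Hypothetical helper: return the document a node belongs to.
// A document's own ownerDocument is null, so it has to be handled separately.
function owningDocument(node) {
  return node.nodeType === DOCUMENT_NODE ? node : node.ownerDocument;
}

// Stand-in objects (assumption: only these three properties are needed).
const fakeDoc = { nodeType: 9, ownerDocument: null, documentElement: null };
const fakeRoot = { nodeType: 1, ownerDocument: fakeDoc };
fakeDoc.documentElement = fakeRoot;

console.log(fakeDoc === fakeDoc.documentElement);  // false
console.log(owningDocument(fakeRoot) === fakeDoc); // true
console.log(owningDocument(fakeDoc) === fakeDoc);  // true
```

In a page, owningDocument(elem1).getElementById(whatever) then works whether elem1 is an element or the document itself.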

First element of NodeList in document.body.childNodes

There is an HTML page with contents like this.
Documentation at MDN says that childNodes returns a collection of child nodes of the given element which is a NodeList.
So, according to the doc, the first child for the NodeList should be <h1>PyCon Video Archive</h1>.
But, in Developer Tools (Chromium), it says the other way.
So, why exactly the first node is not <h1>PyCon Video Archive</h1>?
Why a text object as first element?
I would appreciate some help here.
EDIT
So, I just figured out that in Firebug (FF), the same function behaves differently.
My new question: Is using .childNodes an unreliable way of accessing DOM elements?
To get the first element child, you can use...
document.body.firstElementChild;
...but older browsers don't support it.
A method that has greater support is the children collection...
document.body.children[0];
...which has pretty good support but still has some holes in terms of older browsers.
(Just double checked, and as long as you don't support Firefox 3, and as long as you don't include HTML code comments in the markup, using .children will be safe.)
To ensure that you have the widest browser support, create a function...
function firstElementChild( parent ) {
    var el = parent.firstChild;
    // Skip text and comment nodes until an element (nodeType 1) is found
    while( el && el.nodeType !== 1 )
        el = el.nextSibling;
    return el;
}
and use it like this...
var h1 = firstElementChild( document.body );
Because there's a white-space text-node before the h1 element. Presumably, in the source (if you view source), the h1 opening tag's been either indented, or moved to a new line within the body (or both) in order for readability. At a guess, I'd imagine that it's something like the following:
<body>
<h1>PyCon Video Archive</h1>
<!-- ...other html... -->
If you revise that to:
<body><h1>PyCon Video Archive</h1><!-- ...other html... -->
Then the first childNode will, indeed, be the h1 element.
It's worth noting that text, even outside of an element tag, is still a child node of the parent element, albeit one that can't easily be targeted with a selector.
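To see which kinds of nodes childNodes actually contains, a small sketch like the following can label each child by its nodeType (1 for elements, 3 for text, 8 for comments). The helper name describeChildNodes and the stand-in body are hypothetical; the stand-in mirrors the indented markup guessed at above.

```javascript
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const COMMENT_NODE = 8;

// Hypothetical helper: label every child node of `parent` by kind.
function describeChildNodes(parent) {
  return Array.from(parent.childNodes, function (node) {
    switch (node.nodeType) {
      case ELEMENT_NODE:
        return 'element ' + node.nodeName;
      case TEXT_NODE:
        // Indentation and line breaks show up as whitespace-only text nodes
        return /\S/.test(node.nodeValue) ? 'text' : 'whitespace text';
      case COMMENT_NODE:
        return 'comment';
      default:
        return 'other (' + node.nodeType + ')';
    }
  });
}

// Stand-in for a <body> whose markup is indented (assumption).
const fakeBody = {
  childNodes: [
    { nodeType: 3, nodeName: '#text', nodeValue: '\n  ' },
    { nodeType: 1, nodeName: 'H1', nodeValue: null },
    { nodeType: 3, nodeName: '#text', nodeValue: '\n  ' },
    { nodeType: 8, nodeName: '#comment', nodeValue: ' ...other html... ' },
  ],
};

console.log(describeChildNodes(fakeBody).join(', '));
// whitespace text, element H1, whitespace text, comment
```

In a page, describeChildNodes(document.body) should show the same pattern: a whitespace text node first whenever the first tag inside body is on its own line.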

What information about a DOM element would allow JavaScript to identify it (somewhat) uniquely? (e.g. when it doesn't have `id`)

Here's what I'm trying to do: I have a bookmarklet that is looking for elements in the current page (which can be any site) and dispatch a click event on the ones that match. I have that part working.
In some cases though, nothing matches automatically and I want to be able to show (by hovering it) what element should be activated and then save some info about it in localStorage. The next time I'm using the bookmarklet on that page, I want to retrieve that info to identify the element in the DOM and then dispatch a click event.
The question is: what information should I save to be able to identify it? (in most cases, since it will always be possible to create a case where it doesn't work)
In the best case, said-element will have an id value and I'm good to go. In some other cases, it won't and I'd like to see your suggestions as to what info and what method I should use to get it back.
So far my idea is to save some of the element's properties and traverse the DOM to find elements that match everything. Not all properties will work (e.g. clientWidth will depend on the size of the browser) and not all types of elements will have all properties (e.g. a div node won't have a src value), which means that on one hand, I can't blindly save all properties, but on the other, I need to either choose a limited list of properties that will work for any kinds of element (at the risk of losing some useful info) or have different cases for different elements (which doesn't sound super great).
Things I was thinking I could use:
id of course
className, tagName would help, though className is likely to not be a clear match in some cases
innerHTML should work in a lot of cases if the content is text
src should work in most cases if the content is an image
the hierarchy of ancestors (but that can get messy)
...?
So, my question is a bit "how would you go about this?", not necessarily code.
Thanks!
You could do what @brendan said. You can also make up a selector string for each element in the DOM by figuring out the element's "index" in terms of its place in its parent's list of child elements (counting from 1, as :nth-child does), and then building that up by walking up the DOM to the body tag.
What you'd end up with is something that looks like
body > :nth-child(3) > :nth-child(1) > :nth-child(4)
Of course, if the DOM changes, that won't work so well. You could add class names etc., but as you said yourself, things like this are inherently fragile if you don't have a good "id" to start with, one that's put there at page creation time by whatever logic knows what's supposed to be in the page in the first place.
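A minimal sketch of building such a path, assuming the element sits somewhere inside body: walk up through parentElement and record each element's 1-based position among its parent's children. The names nthChildPath and fakeElement are hypothetical, and the stand-in tree exists only so the snippet runs outside a browser.

```javascript
// Hypothetical helper: build a "body > :nth-child(n) > ..." selector for `el`.
// Returns null when `el` is not inside a body element.
function nthChildPath(el) {
  const steps = [];
  let current = el;
  while (current && current.tagName !== 'BODY') {
    const parent = current.parentElement;
    if (!parent) return null; // detached, or above body
    // children holds elements only, which is what :nth-child counts
    const index = Array.prototype.indexOf.call(parent.children, current);
    steps.unshift(':nth-child(' + (index + 1) + ')'); // :nth-child is 1-based
    current = parent;
  }
  return current ? ['body'].concat(steps).join(' > ') : null;
}

// Stand-in element factory (assumption: only these three properties matter).
function fakeElement(tagName, children) {
  const el = { tagName: tagName, children: children || [], parentElement: null };
  el.children.forEach(function (child) { child.parentElement = el; });
  return el;
}

const target = fakeElement('A');
const body = fakeElement('BODY', [
  fakeElement('H1'),
  fakeElement('P'),
  fakeElement('DIV', [
    fakeElement('UL', [fakeElement('LI'), fakeElement('LI'), target]),
  ]),
]);

console.log(nthChildPath(target));
// body > :nth-child(3) > :nth-child(1) > :nth-child(3)
```

In a page, the string could be saved in localStorage and later passed to document.querySelector; as noted above, it only finds the same element as long as the structure has not changed.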
One approach would be to use a combination of name, tagName and className. innerHTML may be too big.
Another approach would be to look for child elements of your chosen element which have an id.
check for id => check for children with an id => check for the name, tagName and className combination (if that fails => tell the user to choose a different item :-)
What about finding all elements without an ID and assigning them a unique id? Then you could always use id.
What about using the index (integer) of the element within the DOM? You could loop through every element on page load and set a custom attribute to the index...
var els = document.getElementsByTagName("*");
for(var i = 0, l = els.length; i < l; i++) {
    els[i].customIndex = i;
}
