Strange issues loading html fragments with innerHTML


I have a UserJS script for Opera which displays its own interface, currently using DOM methods to create elements. This works well, but it is hard to maintain because the interface is tied to the code, so I'm looking for a way to separate layout from code. I also want to keep things simple and really don't want to rely on a framework (jQuery...) for that. I don't care about cross-browser functionality; this thing can only work in Opera anyway.
I got all the style stuff into CSS, which helped. Now I'm looking for a way to abstract the layout. A good part of the UI is quite dynamic, so I can't just use one big static HTML file. The idea that came up was to have a piece of HTML containing the layout for the different UI parts, extract fragments from that and put everything together as needed.
This works pretty well to some degree:
create a div, never parent it.
use .innerHTML to load the html into it
use getElementsByClassName() to find widgets in it
clone them with widget.cloneNode(true)
parent it etc...
I know of some issues with cloneNode() (risk of duplicate ids, and event handlers missing in the clone) but I can work around them.
The problem is that with .innerHTML loading I get different results from the current DOM code, even though I use a layout captured from the DOM code version! I'm seeing this with tables, for example. For a simple
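A minimal sketch of the steps listed above, assuming the layout HTML is available as a string. The names createTemplateStore and instantiate are made up for illustration, and the document object is passed in as doc instead of being read from a global:

```javascript
// Hypothetical helper: parse the layout once into a detached <div>,
// then hand out deep clones of the widgets found in it.
function createTemplateStore(doc, layoutHtml) {
  const holder = doc.createElement("div"); // never attached to the page
  holder.innerHTML = layoutHtml;

  return {
    // Returns a deep copy of the first element carrying the given class.
    instantiate(className) {
      const source = holder.getElementsByClassName(className)[0];
      if (!source) {
        throw new Error("No template with class " + className);
      }
      // cloneNode(true) copies the subtree, but not listeners added
      // with addEventListener, and it duplicates any ids in the subtree.
      return source.cloneNode(true);
    },
  };
}
```

In a page this might be used as `createTemplateStore(document, layoutHtml).instantiate("toolbar")`, where "toolbar" is an assumed class name; the returned clone still has to be parented by the caller.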
<table><tr><td></td></tr></table>
the innerHTML version shows up with <tbody> tags in it in Dragonfly, and CSS rules like this one don't apply anymore because of it:
table > tr > td { ... }
I have a baaaad feeling about all this ...
Are there other big differences between DOM and HTML layouts?
Maybe I should really be using <tbody> in the DOM stuff?
How would you do this?
Bonus question:
What is the reason behind createDocumentFragment()'s existence? What can you do with it that can't be done otherwise?

You're right: when a table's markup doesn't define <tbody>, the HTML parser inserts one implicitly when the markup is loaded through innerHTML, so the resulting DOM (and the innerHTML you read back from it) contains a <tbody> element.
But this shouldn't cause too much trouble for you. As for the CSS issue, drop the > from your selectors (it restricts matches to direct children).
One possible benefit of a DocumentFragment is performance: when you need to do a significant amount of DOM manipulation, you can do all of it on a fragment that is not part of the document and attach the result to the DOM once all the transformations are done.
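As a rough illustration of that last point, the sketch below builds several elements inside a fragment so that the live document is only touched once, when the fragment is appended. The function name buildRows and the "row" class are assumptions, and the document is passed in as doc:

```javascript
// Hypothetical helper: create one element per label inside a
// DocumentFragment, which is not part of the rendered page.
function buildRows(doc, labels) {
  const fragment = doc.createDocumentFragment();
  for (const label of labels) {
    const row = doc.createElement("div");
    row.className = "row";
    row.textContent = label; // plain text, not parsed as HTML
    fragment.appendChild(row);
  }
  return fragment;
}

// Assumed browser usage: a single insertion into the live DOM.
// container.appendChild(buildRows(document, ["one", "two", "three"]));
```

Appending a fragment moves its children into the target node, so the fragment itself never shows up in the document.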

Related

JavaScript HTML injection efficiency/best practice

I'm looking to inject HTML via JavaScript into a page at work.
What I'd like to know is if injecting a re-write of the page is more or less efficient than injecting snippets throughout the page with methods like getElementById().
For example:
document.getElementById("Example").innerHTML = '<h2 id="Example" name="Example">Text</H2>'
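// getElementsByClassName() returns a collection, so innerHTML has to be set on each element in it:
for (const el of document.getElementsByClassName("Example")) el.innerHTML = '<h1>Test</h1>';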
...etc. Is this more efficient/effective than simply injecting my own version of the entire page's HTML start to finish?
Edit: Per Lix's comment, I should clarify that I likely will be injecting a large amount of content into the page, but it will affect no more than a dozen elements at any time.

If your project can manage it, it could be better to create DOM Elements and append them to the tree.
The big problem with efficiency would be that setting the .innerHTML property throws away all of the element's existing child nodes and has the browser parse the HTML and build new nodes in their place.
It's obvious that you should avoid removing and then re-appending identical elements, so if you're sure the "Example" elements will always remain on the page, your way of setting them seems to be a nice optimization.
If you want to optimize it even further, you could parse the HTML you want to append into nodes and have a function that checks which ones should be appended and which ones shouldn't. But be aware that accessing the DOM is costly. Read more about the ECMA-DOM bridge.
Edit: In some cases it might be better to let the browser do the HTML parsing and injecting through innerHTML. It depends on the amount of HTML you're inserting and the amount you're deleting. See @Nelson Menezes's comments about innerHTML vs. append.
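One way to picture the "function that checks which ones should be appended" idea is to compare what is already on the page with what you want, and only touch the differences. The sketch below works on plain objects that map an element id to its HTML; planUpdates and the shape of its result are hypothetical, and applying the plan to the DOM is left to the caller:

```javascript
// Hypothetical helper: decide which ids need work without touching the DOM.
function planUpdates(currentHtmlById, desiredHtmlById) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const plan = { add: [], replace: [], remove: [], keep: [] };

  for (const id of Object.keys(desiredHtmlById)) {
    if (!has(currentHtmlById, id)) {
      plan.add.push(id); // not on the page yet
    } else if (currentHtmlById[id] !== desiredHtmlById[id]) {
      plan.replace.push(id); // on the page, but the content differs
    } else {
      plan.keep.push(id); // identical, leave it alone
    }
  }
  for (const id of Object.keys(currentHtmlById)) {
    if (!has(desiredHtmlById, id)) {
      plan.remove.push(id); // on the page, no longer wanted
    }
  }
  return plan;
}
```

Only the ids listed under add, replace and remove would then need an innerHTML assignment or a node removal; everything under keep is skipped.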

Depends on the context. If it was only decoration of existing content, then your proposal would suffice. I'd use jQuery anyway, but that's only my preference.
But when injecting the actual content you have two concerns:
maintainability - Make the structure of your code readable and subject to easy change when you need (and you will need).
accessibility - When JavaScript is disabled, no content will be visible at all. You should provide a link to the desired content in a <noscript> tag, or ensure accessibility for everyone in any other way you prefer. Users without JavaScript are a minority of internet users at the moment, but professional webmasters make them count.
To address both of above concerns I prefer to use ajax to load a whole page, some part or even plaintext into existing element. It makes it readable, 'cause the content is sitting in another file completely separated from the script. And since it's a file, you may redirect to it directly when javascript is disabled. It makes the content accessible to anyone.
For plain javascript you'd have to use XMLHttpRequest object, like here.
With jQuery it's even simpler. Depending on what you need you may use .load, .get or .ajax.

Best practice today is using the jQuery manipulation functions.
Most of the time you'd use one of these 3 functions:
Replace the content of an existing HTML node:
$("div").html("New content");
Append content as the last child of a node:
$("div").append("New content");
Remove a node:
$("div").remove();

Designing a Filter

The question is no longer relevant.
I just wanted to add that if anybody encounters a need for filtering HTML, I would propose using a MutationObserver (that was my final choice).

Read a bit more on DOM manipulation; it will help. Yes, using content scripts here is the right way to go. You'll want to have your script run at "document_start", which runs before Chrome begins parsing the DOM (that way you'll get a head start, essentially). You could do this filtering a few ways, actually:
Just use the DOM as one big string, and remove the words you want. However, this is a messy way of doing things: if something important was named after your word (like a class, tag name, or whatnot), you'd break the DOM.
Loop over the text nodes (strings) in a DOM as soon as the DOM is ready, and replace and edit those text nodes.
There are tutorials detailing this kind of thing that you can find online easily, along the lines of DOM manipulation and text node replacement. This article details it much more intimately and better than I have here: Replacing text in the DOM...solved?
And here is a JS library (from the same guy who wrote that article) which practically does all the hard work for you: https://github.com/padolsey/findAndReplaceDOMText
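For a sense of what the text-node approach involves, here is a minimal sketch that walks a subtree and rewrites only text nodes, leaving tag names, classes and attributes alone. The function name replaceInTextNodes and the list of skipped tags are assumptions; unlike the library above, it does not handle a word that is split across several nodes:

```javascript
const TEXT_NODE = 3; // numeric value of Node.TEXT_NODE in the DOM
const SKIPPED_TAGS = new Set(["SCRIPT", "STYLE", "TEXTAREA"]); // assumed list

// Hypothetical helper: apply a string replacement to every text node
// under `node`, recursing through child nodes.
function replaceInTextNodes(node, pattern, replacement) {
  if (node.nodeType === TEXT_NODE) {
    node.nodeValue = node.nodeValue.replace(pattern, replacement);
    return;
  }
  if (SKIPPED_TAGS.has(node.nodeName)) {
    return; // do not rewrite code or form contents
  }
  // Copy the child list first so the loop is not affected by live updates.
  for (const child of Array.from(node.childNodes || [])) {
    replaceInTextNodes(child, pattern, replacement);
  }
}

// Assumed browser usage:
// replaceInTextNodes(document.body, /\bbadword\b/gi, "****");
```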

The way I would approach this is through a jQuery/other framework plugin. I would suggest you take a look at some highlight plugins that are available in pretty much every framework around. See here for some examples of doing highlighting.
Essentially what you'll need to do is select heading tags, p tags and spans, and loop over each one, replacing bad words.

performance issue : storing a reference to DOM element vs using selectors

So in my app, the user can create some content inside certain div tags, and each content, or as I call them "elements" has its own object. Currently I use a function to calculate the original div tag that the element has been placed inside using jquery selectors, but I was wondering in terms of performance, wouldn't it be better to just store a reference to the div tag once the element has been created, instead of calculating it later ?
so right now I use something like this :
$('.div[value='+divID+']')
but instead I can just store the reference inside the element when I'm creating the element. Would that be better for performance?

If you have lots of these bindings it would be a good idea to store references to them. As mentioned in the comments, variable lookups are much much faster than looking things up in the DOM - especially with your current approach. jQuery selectors are slower than the pure DOM alternatives, and that particular selector will be very slow.
Here is a test based on the one by epascarello showing the difference between jQuery, DOM2 methods, and references: http://jsperf.com/test-reference-vs-lookup/2. The variable assignment is super fast as expected. Also, the DOM methods beat jQuery by an equally large margin. Note, that this is with Yahoo's home page as an example.
Another consideration is the size and complexity of the DOM. As this increases, the reference caching method becomes more favourable still.
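A small sketch of the reference-caching idea, assuming each lookup is identified by a key such as the divID from the question. createCachedLookup is a made-up name, and the actual query is injected as a function so the caching logic stays independent of jQuery or the DOM:

```javascript
// Hypothetical helper: wrap an expensive lookup so each key is
// resolved once and served from a Map afterwards.
function createCachedLookup(find) {
  const cache = new Map();
  return function lookup(key) {
    if (!cache.has(key)) {
      cache.set(key, find(key)); // first request: do the real query
    }
    return cache.get(key); // later requests: plain variable access
  };
}

// Assumed browser usage, with the selector from the question:
// const findDiv = createCachedLookup((divID) => $('.div[value=' + divID + ']'));
// findDiv(42); // queries the DOM
// findDiv(42); // returns the stored jQuery object
```

The trade-off is staleness: if a cached element is later removed from or replaced in the document, the stored reference no longer matches what a fresh query would return, so the cache entry has to be refreshed in that case.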

A local variable will be super fast compared to looking it up each time. Test to prove it.

jQuery is a function that builds and returns an object. That part isn't super expensive, but actual DOM lookups do involve a fair bit of work. Overhead isn't that high for a simple query that matches an existing DOM method like getElementById or getElementsByClassName (which doesn't exist in IE8, so it's really slow there), but yes, the difference is between work (building an object that wraps a DOM access method) and almost no work (referencing an existing object). Always cache your selector results if you plan on reusing them.
Also, the xpath stuff that you're using can be really expensive in some browsers so yes, I would definitely cache that.
Stuff to watch out for:
Long series of JQ params without IDs
Selector with only a class in IE8 or less (add the tag name e.g. 'div.someClass') for a drastic improvement - IE8 and below has to hit every piece of HTML at the interpreter level rather than using a speedy native method when you only use the class
xpath-style queries (a lot of newer browsers probably handle these okay)
When writing selectors consider how much markup has to be looked at to get to it. If you know you only want divs of a certain class inside a certain ID, do one of these $('#theID div.someClass') rather than just $('div.someClass');
But regardless, just on the principle of work avoidance, cache the value if you're going to use it twice or more. And avoid haranguing the DOM with repeated requests as much as you can.

Looking up an element by ID is super fast. I am not 100% sure I understand your other approach, but I doubt it would be any better than a simple lookup of an element by its ID; browsers know how to do this task best. From what you've explained, I can't see how your approach would be any faster.

When do you use DOM-based Generation vs. using strings/innerHTML/JQuery to generate DOM content?

I was wondering when to use DOM-based generation versus .innerHTML or appending strings using jQuery's .append method. I read a related post here, Should you add HTML to the DOM using innerHTML or by creating new elements one by one?, but I'm still unsure of the use case for each method. Is it just a matter of performance, where I would always choose one over the other?
Let's say that form is an arbitrary variable:
DOM generation
var div = document.createElement("div"),
label = document.createElement("label"),
input = document.createElement("input");
div.appendChild(label);
div.appendChild(input);
form.appendChild(div);
JQuery
$(form).append("<div><label></label><input></input></div>")

The second one is more readable, although that comes from jQuery which does the innerHTML work for you. In vanilla JS, it would be like this:
form.insertAdjacentHTML("beforeend", "<div><label></label><input></input></div>");
...which I think beats even jQuery. That said, you should not worry about performance. It always depends on the number of nodes to insert: for single ones, the HTML parser would be slower than creating them directly; for large HTML strings, the native parser is faster than the script. If you really do worry about performance, you will need to test, test, test (and I'd say there is something wrong with your app).
Yet, there is a great difference between the two methods: With #1, you have three variables with references to the DOM elements. If you would for example like to add an event listener to the input, you can immediately do it and don't need to call a querySelector on form, which would be much slower. Of course, when inserting really many elements - with innerHTML -, you wouldn't need to do that at all because you would use delegated events for a real performance boost then.
Note that you can also shorten method #1 with jQuery to a oneliner:
var div, label, input;
$(form).append(div=$("<div/>").append(label=$("<label/>"),input=$("<input/>")));
My conclusion:
For creating only few elements the DOM approach is cleaner.
Mostly, html strings are more readable.
Neither of the two is faster in standard situations; benchmark results vary widely.
Personally, I don't like (direct) innerHTML for a few reasons, which are outlined well in these two answers and here as well. Also, IE has a bug on tables (see Can't set innerHTML on tbody in IE)

Generally speaking, hitting the DOM repeatedly is much slower than, say, swapping out a big block of HTML with innerHTML. I believe there are two reasons for this. One is reflow: the browser has to recalculate the potential layout impact across a potentially wide variety of elements. The other, I believe, and somebody correct me if I'm wrong, is that there's a bit of overhead involved in translating what goes on in the browser's post-compiled execution environment, where rendering and layout state is handled, into an object you can use in JavaScript. Since the DOM is often under constantly changing conditions, you have to run through that process every time with few opportunities to cache results of any kind, possibly to a degree even if you're just creating new elements without appending them (since you're likely going to want to pre-process CSS rules and things like what 'mode' the browser is in due to doctype, etc., that can be applied in a general context beforehand).
DOM methods allow you to construct document fragments and create and append HTML elements to those without affecting the actual document layout, which helps you avoid unnecessary reflow.
But here's where it gets weird.
Inserting new HTML into a node with nothing in it - either close to a tie, or something innerHTML is typically much faster at in a lot of (mostly older) browsers.
Replacing a ton of HTML contents - this is actually something where DOM methods tend to win out when performance isn't too close to call.
Basically, innerHTML, if it stinks, tends to stink at the teardown process where large swaps are happening. DOM methods are better at teardown but tend to be slower at creating new HTML and injecting directly without replacing anything when there's any significant difference at all.
There are actually hybrid methods out there that can do pretty marvelous things for performance when you have the need. I used one over a year ago and was pretty impressed by response time improvement for swapping large swathes of HTML content for a lazy-loading grid vs. just using innerHTML alone. I wish I could find a link to the guy who deserves credit for figuring this out and spelling it out on the web (author, has written a lot of RegEx stuff too - couldn't google for the life of me).
As a matter of style vs perf, I think you should avoid tweaking the actual DOM node structure repeatedly but constructing HTML in a document fragment beforehand vs. using innerHTML is pretty much a matter of judgement. I personally like innerHTML for the most part because JS has a lot of powerful string methods that can rapidly convert data to HTML-ready strings. For instance:
var htmlStr = '<ul><li>' + arrayOfNames.join('</li><li>') + '</li></ul>';
That one-liner is a UL I can assign directly to innerHTML. It's almost as easy to build complete tables with the right data structure and a simple while loop. Now go build the same UL with as many LIs as the length of the arrayOfNames with the DOM API. I really can't think of a lot of good reasons to do that to yourself. innerHTML became de facto standard for a reason before it was finally adopted into the HTML 5 spec. It might not fit the node-based htmlElement object tweaking approach of the DOM API but it's powerful and helps you keep code concise and legible. What I would not likely do, however, is use innerHTML to edit and replace existing content. It's much safer to work from data, build, and swap in new HTML than it is to refer to old HTML and then start parsing innerHTML strings for attributes, etc when you have DOM manipulation methods convenient and ready for that.
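One caveat with the one-liner above: if a name contains characters such as < or &, it is parsed as markup once the string is assigned to innerHTML. A hedged variant that escapes each entry first might look like the following; escapeHtml and buildListHtml are illustrative names, not functions from any library:

```javascript
// Hypothetical helper: make a value safe to place inside HTML text
// or a quoted attribute.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first so later entities survive
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Same join-based construction as the one-liner, with escaping and
// without the stray empty <li> for an empty array.
function buildListHtml(names) {
  if (names.length === 0) {
    return "<ul></ul>";
  }
  return "<ul><li>" + names.map(escapeHtml).join("</li><li>") + "</li></ul>";
}

// Assumed browser usage:
// container.innerHTML = buildListHtml(arrayOfNames);
```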
Your chief performance concern should probably be to avoid hammering away at the 'live' portions of the DOM, but the rest I would leave up to people as a matter of style and testing where HTML generation is concerned. innerHTML is and has for years now been in the HTML5 working draft, however, and it is pretty consistent across modern browsers. It's also been de facto spec for years before that and was perfectly viable as an option since before Chrome was new, IMO but that's a debate that's mostly done at this point.

It is just a matter of performance. Choose the one that fits you best.
jsPerf is full of those performance tests, like this one: test

Display DOM node in multiple places w/o cloning/copying

Disclaimer:
I've blathered on kind-of excessively here in an attempt to provide enough context to pre-empt all questions you folks might have of me. Don't be scared by the length of this question: much of what I've written is very skim-able (especially the potential solutions I've come up with).
Goal:
The effect I'm hoping to achieve is displaying the same element (and all descendants) in multiple places on the same page. My current solution (see below for more detail) involves having to clone/copy and then append in all the other places I want it to appear in the DOM. What I'm asking for here is a better (more efficient) solution. I have a few ideas for potentially more efficient solutions (see below). Please judge/criticize/dismiss/augment those, or add your own more-brilliant-er solution!
"Why?" you ask?
Well, the element (and its descendants) that I'm wanting to display more than once potentially has lots of attributes and contents - so cloning it, and appending it someplace else (sometimes more than one other place) can get to be quite a resource-hogging DOM manipulation operation.
Some context:
I can't describe the situation exactly (damn NDA's!) but essentially what I've got is a WYSIWYG html document editor. When a person is editing the DOM, I'm actually saving the "original" node and the "changed" node by wrapping them both in a div, hiding the "original" and letting the user modify the new ("changed") node to their heart's content. This way, the user can easily review the changes they've made before saving them.
Before, I'd just been letting the user navigate through the "diff divs" and temporarily unhiding the "original" node, to show the changes "inline". What I'm trying to do now is let the user see the whole "original" document, and their edited ("changed") document in a side-by-side view. And, potentially, I'd like to save the changes through multiple edit sessions, and show 'N' number of versions side-by-side simultaneously.
Current Solution:
My current solution to achieve this effect is the following:
Wrap the whole dang dom (well, except the "toolbars" and stuff that they aren't actually editing) in a div (that I'll call "pane1"), and create a new div (that I'll call "pane2"). Then deep-clone pane1's contents into pane2, and in pane1 only show the "original" nodes, and in pane2 only show the "changed" nodes (in the diff regions - everything outside of that would be displayed/hidden by a toggle switch in a toolbar). Then, repeat this for panes 3-through-N.
Problem with Current Solution:
If the document the user is editing gets super long, or contains pictures/videos (with different src attributes) or contains lots of fancy styling things (columns, tables and the like) then the DOM can potentially get very large/complex, and trying to clone and manipulate it can make the browser slow to a crawl or die (depending on the DOM's size/complexity and how many clones need to be made as well as the efficiency of the browser/the machine it's running on). If size is the issue I can certainly do things like actually remove the hidden nodes from the DOM, but that's yet more DOM manipulation operations hogging resources.
Potential Solutions:
1. Find a way to make the DOM more simple/lightweight
so that the cloning/manipulating that I'm currently doing is more efficient. (of course, I'm trying to do this as much as I can anyway, but perhaps it's all I can really do).
2. Create static representations of the versions with Canvas elements or something.
I've heard there's a trick where you can wrap HTML in an SVG element, then use that as an image source and draw it onto a canvas. I'd think that those static canvasses (canvi?) would have a much smaller memory footprint than cloned DOM nodes. And manipulating the DOM (hiding/showing the appropriate nodes), then drawing an image (rinse & repeat) should be quicker & more efficient than cloning a node and manipulating the clones. (maybe I'm wrong about that? Please tell me!)
I've tried this in a limited capacity, but wrapping my HTML in SVG messes with the way it's rendered in a couple of weird cases - perhaps I just need to massage the elements a bit to get them to display properly.
3. Find some magic element
that just refers to another node and looks/acts like it without being a real clone (and therefore being somehow magically much more lightweight). Even if this meant that I couldn't manipulate this magic element separately from the node it's "referencing" (or its fake children) - in that case I could still use this for the unchanged parts, and hopefully shave off some memory usage/DOM Manipulation operations.
4. Perform some of the steps on the server side.
I do have the ability to execute server side code, so maybe it's a lot more efficient (some of my users might be on mobile or old devices) to get all ajax-y and send the relevant part of the DOM (could be the "root" of the document the user is editing, or just particularly heavy "diff divs") to the server to be cloned/manipulated, then request the server-manipulated "clones" and stick 'em in their appropriate panes/places.
5. Fake it to make it "feel" more efficient
Rather than doing these operations all in one go and making the browser wait till the operations are done to re-draw the UI, I could do the operations in "chunks" and let the browser re-render and catch a breather before doing the next chunk. This probably actually would result in more time spent, but to the casual user it might "feel" quicker (haha, silly fools...). In the end, I suppose, it is user experience that is what's most important.
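The chunking idea in option 5 could be sketched roughly as follows. runInChunks and its parameters are hypothetical; the scheduler defaults to setTimeout with a zero delay so that the browser gets a chance to repaint between chunks, and it can be swapped for something like requestAnimationFrame:

```javascript
// Hypothetical helper: process `items` a few at a time, yielding to
// the scheduler between chunks instead of blocking until all are done.
function runInChunks(items, chunkSize, handleItem, schedule) {
  if (!(chunkSize > 0)) {
    throw new Error("chunkSize must be a positive number");
  }
  const defer = schedule || ((fn) => setTimeout(fn, 0));
  let index = 0;

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index], index);
    }
    if (index < items.length) {
      defer(runChunk); // more work left: continue after a pause
    }
  }
  runChunk();
}

// Assumed browser usage: clone 20 nodes per chunk.
// runInChunks(nodesToClone, 20, (node) => pane2.appendChild(node.cloneNode(true)));
```

As the question already notes, the total time is likely to go up slightly; what improves is that the page stays responsive while the work is in progress.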
Footnote:
Again, I'm NDA'd which prevents me from posting the actual code here, as much as I'd like to. I think I've thoroughly explained the situation (perhaps too thoroughly - if such a thing exists) so it shouldn't be necessary for you to see code to give me a general answer. If need be, I suppose I could write up some example code that differs enough from my company's IP and post it here. Let me know if you'd like me to do that, and I'll be happy to oblige (well, not really, but I'll do it anyway).

Take a look at CSS background elements. They allow you to display DOM nodes elsewhere/repeatedly. They are of course read-only, but should update live.
You may still have to come up with a lot of magic around them, but it is a similar solution to:
Create static representations of the versions with Canvas elements or something.
CSS background elements are also very experimental, so you may not get very far with them if you have to support a range of browsers.

To be honest, after reading the question I almost left thinking it belongs in the "too-hard-basket", but after some thought perhaps I have some ideas.
This is a really difficult problem and the more I think about it the more I realise that there is no real way to escape needing to clone. You're right in that you can create an SVG or Canvas but it won't look the same, though with a fair amount of effort I'm sure you can get quite close but not sure how efficient it will be. You could render the HTML server-side, take a snapshot and send the image to the client but that's definitely not scalable.
The only suggestions I can think of are as follows, sorry if they are long-winded:
How are you doing this clone? If you're going through each element and as you go through each you are creating a clone and copying the attributes one by one then this is heaavvvyy. I would strongly suggest using jQuery clone as my guess is that it's more efficient than your solution. Also, when you are making structural changes it might be useful to take advantage of jQuery's detach/remove (native JS: removeChild()) methods as this will take the element out of the DOM so you can alter it before reinserting.
I'm not sure how you've got your WYSIWYG set up, but avoid using inputs as they are heavy. If you must, then I'm assuming they don't look like inputs, so just swap them out with another element and style it (CSS) to match. Make sure you do these swaps before you reinsert the clone into the DOM.
Don't literally put video in at the time of showing the user comparisons. The last thing we want to do is inject 3rd party objects into the page. Use an image; you only have to do it while comparing. Once again, do the swap before inserting the clone into the DOM.
I'm assuming the cloned elements won't have JavaScript attached to them (if there is, then remove it; fewer moving parts means more efficiency). However, the "changed" elements will probably have some JS events attached, so perhaps remove them for the period of comparison.
Use Chrome/FF repaint/reflow tools to see how your page is working when you restructure the DOM. This is important because you could be doing some "awesome" animations that are costing you intense resources. See http://paulirish.com/2011/viewing-chromes-paint-cycle/
Use CSS over inline styling where possible as modern browsers are optimised to handle CSS documents
Can you make it so your users use a fast modern browser like Chrome? If it's internal then might be worth it.
Can you do these things in Silverlight or Adobe Air? These objects get special resource privileges, so this will most likely solve your problem (according to what I'm imagining the depth of the problem is)
This one is a bit left-field but could you open in another window? Modern browsers like Chrome will run the other window in its own process which may help.
No doubt you've probably looked in to these things more than I but good luck with it. Would be curious how you solved it.

You may also try: http://html2canvas.hertzen.com/
If it works for you, canvas has way better support.

In order to get your "side-by-side" original doc and modified doc.. rather than cloning all pane1 into pane2.. could you just load the original document in an iframe next to the content you are editing? A lot less bulky?
You could tweak how the document is displayed when it's in an iframe (e.g. hide stuff outside editable content).
And maybe when you 'save' a change, write changes to the file (or a temp) and open it up in a new iframe? That might accomplish your "multiple edit sessions"... having multiple iframes displaying the document in various states.
Just thinking out loud...
(Sorry in advance if I'm missing/misunderstanding any of your goals/requirements)

I don't know if it's already the case for you but you should consider using jQuery library as it allows to perform different kinds of DOM elements manipulation such as create content and insert it into several elements at once or select an element on the page and insert it into another.
Have a look at .appendTo(), .html(), .text(), .addClass(), .css(), .attr(), .clone()
http://api.jquery.com/category/manipulation/
Sorry if I'm just pointing out something you already know or even work with but your NDA is in the way of a more accurate answer.
