I'm retrieving an entire HTML document via AJAX - and that works fine. But I need to extract certain parts of that document and do things with them.
Using a framework (jquery, mootools, etc) is not an option.
The only solution I can think of is to grab the body of the HTML document with a regex (yes, I know, terrible), i.e. <body>(.*)</body>, put that into the current page's DOM in a hidden element, and work with it from there.
Is there an easier/better way?
Update
I've done some testing, and inserting an entire HTML document into a created element behaves a bit differently across browsers I've tested. For example:
FF3.5: keeps the contents of the HEAD and BODY tags
IE7 / Safari4: Only includes what's between ...
Opera 10.10: Keeps HEAD and everything inside it, Keeps contents of BODY
The behavior of IE7 and Safari is ideal, but different browsers handle this differently. Since I'm loading a predetermined HTML document I think I'm going to use the regex to grab what I want and insert it into a DOM element - unless someone has other suggestions.
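For reference, a minimal sketch of the regex approach, assuming the AJAX response is available as a string. The names extractBodyHtml and responseText are made up for illustration. It uses [\s\S]* rather than .* so the match can span newlines, and the match is greedy so it runs to the last </body> in the string. It will still trip over a ">" inside an attribute of the opening body tag.

```javascript
// Hypothetical helper: returns the markup between <body ...> and the LAST
// </body> in the string, or null when no body element is found.
function extractBodyHtml(html) {
  var match = /<body\b[^>]*>([\s\S]*)<\/body\s*>/i.exec(html);
  return match ? match[1] : null;
}

// Stand-in for the text of the AJAX response.
var responseText = '<html><head><title>t</title></head>\n' +
  '<body class="x"><p id="foo">hi</p>\n</body></html>';
var bodyHtml = extractBodyHtml(responseText);
// bodyHtml now holds '<p id="foo">hi</p>' followed by a newline
```

The extracted string can then be assigned to the innerHTML of a detached or hidden element.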
Elements can exist without being in the page itself. Just dump the HTML into a dummy div.
var wrapper = document.createElement('div');
wrapper.innerHTML = "<ul><li>foo</li><li>bar</li></ul>";
wrapper.getElementsByTagName('li').length; // 2
Given your edits, we run into a sticky situation, since you want getElementById. The matter would probably be easy if you could just create a new virtual document via document.implementation.createDocument, but IE doesn't support that at all.
Using a regex is a messy business, since what if we see something like <body><input value="</body>" /></body>? You could probably just make your regex greedy so that it moves on to the last instance of </body>, but if you do end up running into troubles, a more thorough parsing may be necessary. Even if a full framework isn't an option, you might end up wanting to use something like Sizzle, the core of libraries like jQuery, to look for the element you want. Or, if you're really feeling in a purist sort of mood, you could write the recursive search function yourself - but why take that hit if someone else has already taken it?
var response_el = document.createElement('html'), foo;
response_el.innerHTML = the_html_elements_content;
foo = Sizzle('#foo', response_el);
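If pulling in Sizzle is not possible either, the "write the recursive search yourself" option could look roughly like the sketch below. The function name findById is an assumption, not part of any library. It walks childNodes depth-first, so it also works on elements that are not attached to the page.

```javascript
// Depth-first search for an element with the given id, starting at `root`.
// Works on detached nodes, which document.getElementById cannot see.
function findById(root, id) {
  if (root.nodeType === 1 && root.id === id) {
    return root;
  }
  var children = root.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var found = findById(children[i], id);
    if (found) {
      return found;
    }
  }
  return null;
}

// Browser usage with a detached wrapper element:
if (typeof document !== 'undefined') {
  var holder = document.createElement('div');
  holder.innerHTML = '<ul><li id="foo">foo</li><li>bar</li></ul>';
  var hit = findById(holder, 'foo'); // the first <li>
}
```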
Related
I'm looking to inject HTML via JavaScript into a page at work.
What I'd like to know is if injecting a re-write of the page is more or less efficient than injecting snippets throughout the page with methods like getElementById().
For example:
document.getElementById("Example").innerHTML = '<h2 id="Example" name="Example">Text</H2>'
document.getElementsByClassName("Example")[0].innerHTML = '<H1>Test</H1>'
...etc. Is this more efficient/effective than simply injecting my own version of the entire page's HTML start to finish?
Edit: Per Lix's comment, I should clarify that I likely will be injecting a large amount of content into the page, but it will affect no more than a dozen elements at any time.
If your project can manage it, it could be better to create DOM Elements and append them to the tree.
The big efficiency problem is that setting the .innerHTML property first removes all the existing child nodes, and only then parses the HTML and appends the result to the DOM.
It's obvious that you should avoid removing and then re-appending identical elements, so if you're sure the "Example" elements will always remain on the page, your way of setting them seems to be a nice optimization.
If you want to optimize it even further, you could parse the HTML you want to append into nodes and have a function that checks which ones should be appended and which ones shouldn't. But be aware that accessing the DOM is costly. Read more about the ECMA-DOM bridge.
Edit: In some cases it might be better to let the browser do the HTML parsing and injecting through innerHTML. It depends on the amount of HTML you're inserting and the amount you're deleting. See @Nelson Menezes's comments about innerHTML vs. append.
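A rough sketch of the "create DOM elements and append them" approach, under a few assumptions: buildItems and the id "ExampleList" are hypothetical, and the document is passed in as a parameter only so the function can be tried outside a browser. The nodes are assembled in a DocumentFragment, so the live tree is touched once.

```javascript
// Builds <li> elements off-document and returns them in a DocumentFragment,
// so the page itself is modified only when the fragment is appended.
function buildItems(doc, labels) {
  var fragment = doc.createDocumentFragment();
  for (var i = 0; i < labels.length; i++) {
    var li = doc.createElement('li');
    li.appendChild(doc.createTextNode(labels[i]));
    fragment.appendChild(li);
  }
  return fragment;
}

// Browser usage, assuming a <ul id="ExampleList"> exists on the page:
if (typeof document !== 'undefined') {
  var list = document.getElementById('ExampleList');
  if (list) {
    list.appendChild(buildItems(document, ['one', 'two', 'three']));
  }
}
```

Whether this beats a single innerHTML assignment depends on how much markup is involved, as the edit above says, so it is worth measuring both.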
Depends on the context. If it was only decoration of existing content, then your proposal would suffice. I'd use jQuery anyway, but that's only my preference.
But when injecting the actual content you have two concerns:
maintainability - Make the structure of your code readable and subject to easy change when you need (and you will need).
accessibility - When JavaScript is disabled, no content will be visible at all. You should provide a link to the desired content in a <noscript> tag, or ensure accessibility some other way you prefer. Users without JavaScript are a minority at the moment, but professional webmasters make them count.
To address both of the above concerns I prefer to use ajax to load a whole page, a part of one, or even plain text into an existing element. This keeps things readable, because the content sits in another file, completely separate from the script. And since it's a file, you can redirect to it directly when JavaScript is disabled, which makes the content accessible to anyone.
For plain javascript you'd have to use XMLHttpRequest object, like here.
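A minimal sketch of the plain XMLHttpRequest route, with some assumptions: loadHtml, the URL content.html and the element id "content" are made up for illustration, and the optional XhrCtor parameter exists only so the function can be exercised without a browser. Old IE versions that lack a native XMLHttpRequest are not handled here.

```javascript
// Loads `url` and passes the response text to `onDone`; `onError` (optional)
// receives the HTTP status. `XhrCtor` defaults to the browser's XMLHttpRequest.
function loadHtml(url, onDone, onError, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) {
      return; // not finished yet
    }
    if (xhr.status >= 200 && xhr.status < 300) {
      onDone(xhr.responseText);
    } else if (onError) {
      onError(xhr.status);
    }
  };
  xhr.open('GET', url, true);
  xhr.send(null);
}

// Browser usage: put the loaded markup into an existing element.
if (typeof document !== 'undefined') {
  loadHtml('content.html', function (html) {
    var target = document.getElementById('content');
    if (target) {
      target.innerHTML = html;
    }
  });
}
```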
With jQuery it's even simpler. Depending on what you need you may use .load, .get or .ajax.
Best practice today is to use jQuery's manipulation functions.
Most of the time you'd use one of these 3 functions:
Replace the contents of an existing node:
$("div").html("New content");
Append content as the last child of a node:
$("div").append("New content");
Remove a node:
$("div").remove();
The following javascript (run in the chrome console) does not do what I'd expect:
> var elem = document.createElement("foo");
undefined
> elem.innerHTML = "<tr></tr>"
"<tr></tr>"
> elem.outerHTML
"<foo></foo>"
The <tr> tag has disappeared!
This seems specific to table-related elements. Using <div> or <span> works as expected.
I expect what I'm doing is invalid, as "foo" is not a known element, and presumably table-related elements can only appear within a <table>. Interestingly, the following code works just fine:
> var elem = document.createElement("foo"), tr = document.createElement("tr");
> elem.appendChild(tr);
> elem.outerHTML
"<foo><tr></tr></foo>"
So it seems like the construction itself (a <tr> not within a <table>) is allowable, but using innerHTML to place it there does not work - perhaps this goes through some HTML cleanup, which removes things that are not strictly valid, while creating DOM nodes directly is not subject to the same validation.
My question: is there any way to populate an arbitrary DOM node from a string without running into such cleanup / validation issues? My use case will end up with perfectly valid structure (I plan to place this as the child of a <table> sometime later), but the browser is stopping me while I'm trying to build the individual parts.
It sounds a little like DocumentFragment should be what I'm looking for, but as far as I can tell those are only constructable programmatically - they don't support innerHTML.
some background on why I want to do this:
My use case is javascript-based live templating (i.e not outputting html strings, but actual DOM nodes). So the requirements are:
template input must be allowed to be arbitrary HTML (this is why I'm using innerHTML and not constructing nodes programmatically)
it must be possible to create sub-templates that are then attached into a larger document (that's why I can't just create the whole <table> at once).
The second point is how I encountered this bug. My template contains a sub-template.
var row = Html("<tr></tr>");
var table = Html(["<table><thead>", row, "</thead></table>"]);
I will later add code like:
row.append(Html(["<td>", column.header, "</td>"]));
to actually populate the columns. So when it's fully constructed, the html will be valid. But in the intermediate stages, each template / snippet is constructed under a single element. That means that templates like:
Html(["Hello <span>", name, "</span>"]);
still come out as a single node (so that they can be manipulated as a single entity):
<foo>Hello <span>bob</span></foo>
When the template results in only a single child inside the <foo>, the outer node is removed. But during construction, the row template above should look like <foo><tr></tr></foo>. Due to the validation behaviour I'm seeing when using innerHTML it just ends up as <foo></foo>.
I've checked all code works the same in both firefox & chrome, so I don't expect I'm just hitting a browser bug.
Unfortunately the answer to your general question is no, there is no way to use innerHTML to add arbitrarily incomplete HTML fragments. I know this is not the answer you want to hear but that's the way it is.
One of the most misunderstood things about innerHTML stems from the way the API is designed. It overloads the + and = operators to perform DOM insertion. This tricks programmers into thinking that it is merely doing string operations, when in fact innerHTML behaves more like a function than a variable. It would be less confusing to people if innerHTML was designed like this:
element.innerHTML('some <b>html</b> here');
Unfortunately it's too late to change the API, so we must instead understand that it is really an API rather than merely an attribute/variable.
Now, to understand the so-called "validation" behavior of innerHTML: when you modify innerHTML it triggers a call to the browser's HTML parser. It's the same parser that parses your HTML file/document. There's nothing special about the HTML parser that innerHTML calls. Therefore, whatever you can do in an HTML file you can pass to innerHTML (the one exception being that embedded JavaScript doesn't get executed - probably for security reasons).
This makes sense from the point of view of a browser developer. Why include two separate HTML parsers in the browser? Especially considering the fact that HTML parsers are huge, complex beasts.
The downside to this is that incomplete HTML will be handled the same way it is handled for HTML documents. In the case of <td> elements not inside a table, most browsers will simply strip them away (as you've observed for yourself). That is essentially what you're trying to do - create invalid/incomplete HTML.
There are two workarounds for this:
Extract the table from the page, then use string processing (regex et al.) to insert the <td> into the table string, then innerHTML the whole table back into the page.
Parse the inserted HTML string and if you find any <td> or <tr> (or <option>) extract out the html element and insert it using DOM methods.
Unfortunately both are quite painful.
Mihai Stancu's comment about jquery made me think: surely jquery manages this if you call $("<tr></tr>"). I know jquery has a shortcut for strings that look like single tags, but it must work for complex HTML as well.
So I took a dive into the jquery source code, and found just the ticket:
https://github.com/jquery/jquery/blob/6a0ee2d9ed34b81d4ad0662423bf815a3110990f/src/manipulation.js#L450
It's using a regex to detect just the name of the first tag in the string, then using this info to figure out what "context" it needs to wrap it in for the innerHTML process to treat it as valid. I think this technique should work for all well-formed inputs.
I've adopted this code into a standalone function which will turn an arbitrary string into a DOM node:
https://gist.github.com/gfxmonk/5299096
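The idea can be sketched as follows. This is a simplified illustration of the technique, not jQuery's code and not the gist above: the wrapper table is deliberately short, and the names WRAP_MAP, wrapperFor and htmlToNodes are my own. The first tag name picks a wrapper, the wrapped string goes through innerHTML, and the parsed nodes are then dug back out.

```javascript
// Simplified wrapper table: [depth, prefix, suffix]. Depth is how many
// wrapper levels must be descended to reach the parsed nodes.
var WRAP_MAP = {
  thead: [1, '<table>', '</table>'],
  tbody: [1, '<table>', '</table>'],
  tr: [2, '<table><tbody>', '</tbody></table>'],
  td: [3, '<table><tbody><tr>', '</tr></tbody></table>'],
  th: [3, '<table><tbody><tr>', '</tr></tbody></table>'],
  option: [1, '<select multiple="multiple">', '</select>']
};
var NO_WRAP = [0, '', ''];

// Returns the wrapper entry for the first tag found in `html`.
function wrapperFor(html) {
  var m = /<([a-z][\w-]*)/i.exec(html);
  var tag = m ? m[1].toLowerCase() : '';
  return WRAP_MAP.hasOwnProperty(tag) ? WRAP_MAP[tag] : NO_WRAP;
}

// Parses `html` inside a suitable context and returns the resulting
// top-level nodes as an array. `doc` is normally `document`.
function htmlToNodes(html, doc) {
  var wrap = wrapperFor(html);
  var container = doc.createElement('div');
  container.innerHTML = wrap[1] + html + wrap[2];
  for (var depth = wrap[0]; depth > 0; depth--) {
    container = container.lastChild; // step into the wrapper elements
  }
  var nodes = [];
  while (container.firstChild) {
    nodes.push(container.removeChild(container.firstChild));
  }
  return nodes;
}
```

In a browser, htmlToNodes('<tr></tr>', document) should hand back the <tr> element that a plain innerHTML assignment on <foo> drops.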
I've noticed that jQuery can create, and access non-existent/non-standard HTML tags. For example,
$('<fake></fake>').appendTo('body').html('blah');
var foo = $('fake').html(); // foo === 'blah'
Will this break in some kind of validation? Is it a bad idea, or are there times this is useful? The main question is, although it can be done, should it be done?
Thanks in advance!
You can use non-standard HTML tags and most of the browsers should work fine, that's why you can use HTML5 tags in browsers that don't recognize them and all you need to do is tell them how to style them (particularly which tags are display: block). But I wouldn't recommend doing it for two reasons: first it breaks validation, and second you may use some tag that will later get added to HTML and suddenly your page stops working in newer browsers.
The biggest issue I see with this is that if you create a tag that's useful to you, who's to say it won't someday become standard? If that happens it may end up playing a role or get styles that you don't anticipate, breaking your code.
The rules of HTML do say that if manipulated through script the result should be valid both before and after the manipulation.
Validation is a means to an end, so if it works for you in some way, then I wouldn't worry too much about it. That said, I wouldn't do it to "sneak" past validation while using something like facebook's <fb:fan /> element - I'd just suck it up and admit the code wasn't valid.
HTML as such allows you to use any markup you like. Browsers may react differently to unknown tags (and don't they to known ones, too?), but the general bottom line is that they ignore unknown tags and try to render their contents instead.
So technically, nothing is stopping you from using <fake> elements (compare what IE7 would do with an HTML5 page and the new tags defined there). HTML standardization has always been an after-the-fact process. Browser vendors invented tags and at some point the line was drawn and it was called HTMLx.
The real question is, if you positively must do it. And if you care whether the W3C validator likes your document or not. Or if you care whether your fellow programmers like your document or not.
If you can do the same and stay within the standard, it's not worth the hassle.
There's really no reason to do something like this. The better way is to use classes like
<p class = "my_class">
And then do something like
$('p.my_class').html('bah');
Edit:
The main reason that it's bad to use fake tags is because it makes your HTML invalid and could screw up the rendering of your page on certain browsers since they don't know how to treat the tag you've created (though most would treat it as some kind of DIV).
That's the main reason this isn't good, it just breaks standards and leads to confusing code that is difficult to maintain because you have to explain what your custom tags are for.
If you were really determined to use custom tags, you could make your web page a valid XML file and then use XSLT to transform the XML into valid HTML. But in this case, I'd just stick with classes.
I am studying somebody else's jQuery script, and I noticed he opens a tag without closing it, but browsers do not seem to care (not yet tested with IE).
it is written :
$('#mydiv').append('<ul>')
But there is nowhere a
.append('</ul>')
The script does not close the list, but browsers do it automatically (I just did an 'inspect element' in the browser).
Is that a 'legal' behavior, or one should always close a tag in a javascript autogenerated content ?
To do it properly, you should be appending:
$('#mydiv').append('<ul></ul>')
Yes browsers will handle it (specifically the .innerHTML implementation handles it, not jQuery), at least the major ones, but why not be safe in all cases and use valid markup?
$('#mydiv').append('<ul>')
...still calls .innerHTML, not createElement, only in $('<ul>') is document.createElement() called. As I said originally, the browser handles it with .append(), not jQuery and not document.createElement (which doesn't take syntax like this anyway).
You can test/play with what I mean here
Short answer: you should.
Long answer that led to the short answer:
When you say .append('<ul>'),
or even .append('<ul></ul>'), behind the scenes jQuery calls document.createElement and the browser knows what to do.
It's not like jQuery actually puts that string of HTML anywhere, but rather parses it and creates the necessary DOM elements
UPDATE-
As Nick pointed out, this might not always be the case. Relevant source: init
If you pass it just ul, it just calls createElement. If the html string is more complicated, it will go into buildFragment which is more complicated than that.
Based on this, I would say the best/fastest way to create a single element thru jQuery, is to do something like
$('<ul>').appendTo($target);
UPDATE 2-
So apparently jQuery only calls createElement in some methods, but append ends up calling clean which has a regex that closes tags. So either way, you're safe, jQuery saves you as usual.
Relevant source:
...
} else if ( typeof elem === "string" ) {
// Fix "XHTML"-style tags in all browsers
elem = elem.replace(rxhtmlTag, "<$1></$2>");
...
UPDATE 3- So it turns out jQuery doesn't fix anything for you when you call append; it just injects the string into a temporary div element. Most browsers seem to know how to deal with the HTML even if it isn't closed properly, but to be safe it's probably best to close it yourself! Or if you're feeling lazy, do something like .append($('<ul>')), which doesn't use innerHTML.
(Excuse me if this is not the right forum to post in - I couldn't find anything related to non-native programming and related to this topic.)
I am trying to set dynamic HTML into an iFrame on the webpage. I have tried a couple of things but none of them seem to work. I'm able to read the innerHTML but can't seem to update it.
// Able to read using
document.getElementById('iFrameIdentifier').innerHTML;
// On Desktop IE, this code works
document.getElementById('iFrameId').contentWindow.document.open();
document.getElementById('iFrameId').contentWindow.document.write(dynamicHTML);
document.getElementById('iFrameId').contentWindow.document.close();
Ideally the same function should work the way it works for divs, but it says "Object doesn't support this method or property".
I have also tried document.getElementById('iFrameId').document.body.innerHTML.
This apparently replaces the whole HTML of the page and not just the innerHTML.
I have tried out a couple of things and they didn't work
document.getElementById('iFrameId').body.innerHTML
document.frames[0].document.body.innerHTML
My purpose is to have a container element which can contain dynamic HTML that's set to it.
This worked well until I observed that setting innerHTML on a div takes an increasing amount of time, because of the onClicks or other JS methods that are attached to the anchors and images in the dynamic HTML. Appears the JS methods or the HTML is some how not getting cleaned up properly (memory leak?)
Also being discussed - http://www.experts-exchange.com/Programming/Languages/Scripting/JavaScript/Q_26185526.html#a32779090
I have tried a couple of things but none of them seem to work.
Welcome to IEMobile! Nothing you know about DOM scripting applies here.
Unfortunately, cross-iframe scripting does not appear to be possible in IEMobile6-7.
frameelement.contentDocument (the standard DOM method) isn't available
frameelement.contentWindow.document (the IE6-7 workaround version) isn't available
the old-school Netscape window.frames array only works for frames, not iframes
having the child document pass up its document object to the window.parent only works for frames, not iframes. In an iframe, window.parent===window.
So the only ways forward I can see are:
use frames instead of iframes. Nasty. Or,
use document.cookie to communicate between parent and child: the child document is just a script, that checks for a particular cookie in document.cookie on a poller, and when it's found that's a message from the parent, and it can write some HTML or whatever. Slow and nasty. Or,
using the server-side to inject content into the frames, passing it in as an argument to a script. Slow, nasty, and potentially insecure. Or,
avoid frames completely (best, if you can). Or,
drop support from IEMobile6-7 (best for preserving your sanity, if you can get away with it!)
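For what it's worth, the cookie-polling option could be sketched like this. Everything named here is hypothetical (readCookie, the cookie name frameMessage, the element id "message-target", the 500 ms interval), and I have not verified it on IEMobile itself. Cookies are also limited to a few kilobytes, so this only suits small messages.

```javascript
// Extracts the value of cookie `name` from a document.cookie-style string,
// or returns null when it is absent.
function readCookie(cookieString, name) {
  var parts = cookieString ? cookieString.split(';') : [];
  for (var i = 0; i < parts.length; i++) {
    var pair = parts[i].replace(/^\s+/, '');
    var eq = pair.indexOf('=');
    if (eq !== -1 && pair.substring(0, eq) === name) {
      return decodeURIComponent(pair.substring(eq + 1));
    }
  }
  return null;
}

// Child-document side: poll for a message the parent wrote with
//   document.cookie = 'frameMessage=' + encodeURIComponent(html) + '; path=/';
if (typeof document !== 'undefined') {
  var lastSeen = null;
  setInterval(function () {
    var message = readCookie(document.cookie, 'frameMessage');
    var target = document.getElementById('message-target');
    if (target && message !== null && message !== lastSeen) {
      lastSeen = message;
      target.innerHTML = message;
    }
  }, 500);
}
```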
Appears the JS methods or the HTML is some how not getting cleaned up properly (memory leak?)
Yes, probably. IEMobile6-7(*) is close to unusable at dynamic HTML. It gives you a lovely flavour of what scripting used to be like for us poor gits back in the Netscape 4 days.
Try to avoid creating and destroying lots of nodes and event handlers. Keep the page as static as possible, re-using element nodes where possible and setting text node data properties in preference to tearing everything down and making anew with createElement or innerHTML. Use an event stub (onclick="return this._onclick()") in the HTML together with writing to _onclick if you need to set event handlers from JavaScript, in preference to recreating the HTML with a new event handler (or just trying to set the property, which of course doesn't work in IEMobile). Avoid long-running single pages when you can.
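A small sketch of the event stub and text-node reuse described above. The markup, the id "next" and the helper names setStubHandler and setLabel are assumptions for illustration only.

```javascript
// Assumed markup, written once and never regenerated:
//   <a id="next" href="#" onclick="return this._onclick()">Next</a>

// Swaps the behaviour behind the stub without touching the HTML.
function setStubHandler(element, handler) {
  element._onclick = handler;
}

// Reuses the element's existing text node instead of rebuilding its content.
// Returns false when the first child is not a text node.
function setLabel(element, text) {
  var node = element.firstChild;
  if (node && node.nodeType === 3) {
    node.data = text;
    return true;
  }
  return false;
}

if (typeof document !== 'undefined') {
  var link = document.getElementById('next');
  if (link) {
    setStubHandler(link, function () {
      setLabel(this, 'Loading...');
      return false; // cancel the default navigation
    });
  }
}
```

The inline stub calls this._onclick(), so inside the handler "this" is the element itself.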
It'll still crash, but hopefully it'll take longer.
*: that is, the versions of IE present on WinMo before version 6.1.4, where it became the infinitely better IEMobile8, marketed as “Internet Explorer Mobile 6” (thank you Microsoft).
Okay, I kind of resolved the issues I was facing earlier, including the bigger issue of setting HTML to an iFrame on IEMobile. But I still have one more pain point, related to double scrollbars, which I am currently looking into. There seem to be more poor souls facing a similar problem; if I fix that too, I will post an update here.
How did i finally write to iFrame on IEMobile?
Have 2 divs one to wrap the iFrame and the other to write inside an iFrame.
document.getElementById('OuterDiv').innerHTML = '';
document.getElementById('OuterDiv').innerHTML = '<iframe id="iFrameId" src="somefile.html"></iframe>';
This creates an iFrame each time and in the somefile.html on load there is a InnerDiv.innerHTML which doesn't seem to leak the memory.
In the somefile.html there will be an onLoad method which will fetch the HTML (explained below on how i managed to get it) and do a
document.getElementById('InnerDiv').innerHTML = dynamicHTML;
How did I manage to pass the HTML between parent and child iFrame
As @bobince explained well earlier, one has to rely on a third party such as a cookie or a server to pass data between the parent and the child iFrame.
I in fact used an ActiveX control to set and get data from the parent's and the child iFrame's JavaScript respectively. I wouldn't recommend doing this if you would have to introduce an ActiveX control just for this. I happen to already have one, which I use to get the dynamic HTML in the first place.
If you need any help you can DM me - Twitter @Swaroop
Thanks @bobince for your help. I am marking this one as the answer because it says what I did to fix the issue.