For example if JavaScript performs a bunch of manipulations on a table, the new HTML will not be visible via View -> Source. Is there some way to capture JavaScript manipulations and save everything as a plain HTML document?
The easiest way is to call
document.documentElement.outerHTML
This will get the same output as view source, except that it will have all the DOM manipulations visible. It will be missing the DOCTYPE, however. I realized that the WebKit console was printing the doctype fine, but there is no public API that returns the DOCTYPE as a string, so you'll have to build it yourself from document.doctype.
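One way to do the DOCTYPE part yourself is to read document.doctype (a DocumentType node with name, publicId and systemId properties) and rebuild the declaration by hand. The following is only a sketch: the function names doctypeToString and serializeDocument are made up for illustration, and the functions accept any object shaped like a document so they can be tried outside a browser.

```javascript
// Rebuild the "<!DOCTYPE ...>" line from a DocumentType-like object.
// Returns an empty string when the document has no doctype.
function doctypeToString(doctype) {
  if (!doctype) return "";
  let text = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) {
    text += ' PUBLIC "' + doctype.publicId + '"';
  } else if (doctype.systemId) {
    text += " SYSTEM";
  }
  if (doctype.systemId) {
    text += ' "' + doctype.systemId + '"';
  }
  return text + ">";
}

// Doctype (if any) followed by the current state of the <html> element.
// serializeDocument is a hypothetical helper name, not a browser API.
function serializeDocument(doc) {
  const prefix = doctypeToString(doc.doctype);
  return (prefix ? prefix + "\n" : "") + doc.documentElement.outerHTML;
}

// In a browser console you would call: serializeDocument(document)
```

The result is the live DOM as the browser currently holds it, so it may differ from the original source wherever the parser repaired the markup.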
A little bookmarklet that you can add to your browser to view the dom:
javascript:(function(){win=open(%22about:blank%22,%20%22View%20DOM%20Source%22,%20%22menubar=no,resizable=yes,status=no,toolbar=no%22);win.document.write(%22<pre>%22%20+%20document.documentElement.outerHTML.split(%22&%22).join(%22&amp;%22).split(%20%22<%22).join(%22&lt;%22).split(%22>%22).join(%22&gt;%22)%20+%20%22</pre>%22);win.focus();})()
(Sorry, can't post a Javascript Link).
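The bookmarklet is meant to escape the markup before writing it into the <pre>, so that the new window shows the source as text instead of rendering it. A readable version of that escaping step might look like this (escapeHtml is just an illustrative name):

```javascript
// Escape the three characters that would otherwise be parsed as markup.
// "&" has to be replaced first, or the "&" in "&lt;" and "&gt;" would be
// escaped a second time.
function escapeHtml(markup) {
  return markup
    .split("&").join("&amp;")
    .split("<").join("&lt;")
    .split(">").join("&gt;");
}

// Browser usage, mirroring the bookmarklet:
//   win.document.write("<pre>" + escapeHtml(document.documentElement.outerHTML) + "</pre>");
```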
You can view it in a DOM inspector like Firebug or IE Developer tools
You could use prompt("test",document.body.innerHTML); and copy & paste the content.
You can access the serialized current state of the table with innerHTML.
var table = document.getElementById("mytable");
table.innerHTML; // "<tbody><tr><td>..."
table.parentNode.innerHTML; // serializes everything inside the parent, including the <table> tag (and any siblings of the table)
I know this question is quite old, but I came across a method to capture and save DOM manipulations using shelljs and thought I would list it here in case anyone is interested.
Once all DOM manipulations are complete:
var shell = require('shelljs');
var data = window.document.getElementsByTagName('html')[0].innerHTML;
shell.echo(data).to("your/original/file.html");
That simple.
I found it useful especially in node.js related DOM manipulations with jsdom (which apparently doesn't save DOM manipulations on its own).
Note: This will overwrite original file.
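If overwriting the original is a concern, a variant that writes a copy next to it could look like the sketch below. It uses Node's built-in fs and path modules instead of shelljs; the saveSnapshot name and the ".snapshot" suffix are made up for this example.

```javascript
const fs = require("fs");
const path = require("path");

// Write the serialized HTML next to the source file instead of over it,
// e.g. "pages/index.html" -> "pages/index.snapshot.html".
// Returns the path that was written.
function saveSnapshot(html, sourcePath) {
  const parsed = path.parse(sourcePath);
  const outPath = path.join(parsed.dir, parsed.name + ".snapshot" + parsed.ext);
  fs.writeFileSync(outPath, html, "utf8");
  return outPath;
}

// Usage with the data variable from the snippet above:
//   saveSnapshot(data, "your/original/file.html");
```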
Related
I'm working on a code editor using Blockly, and my page currently has tabs for switching between Block View and Code View, kinda like some WYSIWYG editors. Now, Blockly already has plenty of stuff for going from blocks to code, and I've gotten 99% of the parts done so that I can go from code to blocks (it involves building up a bunch of block XML). My call to go from code view to block view looks like this:
var xml = Blockly.Xml.textToDom(self.xmlGenerated());
Blockly.mainWorkspace.clear();
Blockly.Xml.domToWorkspace(Blockly.mainWorkspace, xml);
The problem is, no matter what id attributes I set in my XML nodes, Blockly overrides them when I try to read the block XML later. It seems they constantly increment, even though I'm clearing the workspace. This causes a problem for my auto-save feature, since it means that each time I go from code to blocks my XML changes, and therefore my code changes (the generated code is a graph structure that also uses the id fields to identify each node in the graph).
So, my question is, does anyone know how to prevent Blockly from overriding the node id I send, or is there a way to "reset" the node ids?
I asked a very similar question in the Blockly Google group and Neil added a new data XML tag for storing persistent data. Maybe you can put your ID there? From reading the code, it seems like the id attribute was meant for internal use, so it may be unreliable to reuse it.
I'm working on a page that needs to fetch info from some other pages and then display parts of that information/data on the current page.
I have the HTML source code that I need to parse in a string. I'm looking for a library that can help me do this easily. (I just need to extract specific tags and the text they contain)
The HTML is well formed (All closing/ending tags present).
I've looked at some options but they are all being extremely difficult to work with for various reasons.
I've tried the following solutions:
jkl-parsexml library (The library js file itself throws up HTTPError 101)
jQuery.parseXML Utility (Didn't find much documentation/many examples to figure out what to do)
XPATH (The Execute statement is not working but the JS Error Console shows no errors)
And so I'm looking for a more user friendly library or anything(tutorials/books/references/documentation) that can let me use the aforementioned tools better, more easily and efficiently.
An Ideal solution would be something like BeautifulSoup available in Python.
Using jQuery, it would be as simple as $(HTMLstring); to create a jQuery object with the HTML data from the string inside it (this DOM would be disconnected from your document). From there it's very easy to do whatever you want with it--and traversing the loaded data is, of course, a cinch with jQuery.
You can do something like this:
$("string with html here").find("jquery selector")
$("string with html here") will create a document fragment and put the HTML into it (basically, it will parse your HTML). find will then search for elements in that document fragment (and only inside it); note that it matches descendants of the top-level elements, not the top-level elements themselves. None of this is added to the page DOM.
Ok first off let me state that I know I should never do this under any circumstances for a real site. Ok. That's out of the way.
One of my coworkers was going off that Javascript is not a "real" programming language (his definition of "real" seems to be "it compiles"), because it depends on other languages to do its thing.
I told him I could write a website using nothing but javascript.
I am sure that this can be done, using document.write('') to get the doctype, and some script to create a dom and styles... but the problem is since the page is validated without JS, it can't show him that what the browser is looking at does in fact validate.
Anyone know of a way I can validate the actual source the browser is using instead of the javascript that initially loaded?
If you really want to demonstrate that JS is a "real" language, then you would probably be better off not using a browser as the foundation. A node.js server would allow you to generate an HTML document (using document.write if you like, but DOM is an option, and people have used client-side libraries to manipulate a document in node).
Since the JS runs on the server, you can get the actual source from the browser via view-source or point the validator directly at the URI (so long as it is either public or you install a local copy of the validator)
Load the site in Firefox with Firebug installed. Fire up the "HTML" view and rightclick on the <html> node and select "copy HTML".
The closest you get using JavaScript:
var generatedHTML = document.documentElement.innerHTML;
//Retrieves everything within the (missing)HTML tags.
//The only missing parts are DOCTYPE and the <html> itself
var txt = document.createElement("textarea");
txt.style.cssText = "width:99%;height:99%;position:fixed;z-index:999;top:0;left:0";
txt.value = generatedHTML;
txt.ondblclick = function(){this.parentNode.removeChild(this)};
//Adding a simple function to easily remove the textarea once finished
document.body.appendChild(txt);
Bookmarklet (I have slightly adjusted the code to be compact):
javascript:void(function(){var t=document.createElement("textarea");t.style.cssText = "width:99%;height:99%;position:fixed;z-index:999;top:0;left:0";t.value=document.documentElement.innerHTML;t.ondblclick=function(){t.parentNode.removeChild(t)};document.body.appendChild(t)})()
Focus the generated textarea
Manually add the DOCTYPE + <html> tags
Copy the contents of the textarea to the validator at: http://validator.w3.org/#validate-by-input
I have a javascript routine that dynamically creates an HTML page, complete with its own head and script tags.
If I take the contents of the string and save it to a file, and view the file in a browser, all is well, but if I try document.write(newHTML), it doesn't behave the same. The javascript in the header of the dynamic newHTML is quite complicated, and I cannot include it here... But please believe me that it works great if I save it to a file, but not if I try to replace the current page with it using document.write. What possible pitfalls could be contributing to this that I'm not considering? Do I possibly need to delete the existing script tags in the existing header first? Do I need to manually re-call onLoad??
Again, it works great when the string is saved to, for example, 'sample.html' and browsed to, but if I set var Samp="[REAL HTML HERE]"; and then say document.write(Samp); document.close(); the javascript routines are not executing correctly.
Any hints as to what I could be missing?
Is there another/better way to dynamically replace the content of the page, other than document.write?
Could I somehow redirect to the new page despite the fact that it doesn't exist on disk or on a server, but is only in a string in memory? I would hate to have to upload the entire file to my server simply to re-download it again to view it.
How can I, using javascript, replace the current content of the current page with entirely new content including complex client-side javascripting, dynamically, and always get exactly the same result as if I saved the string to the server as an html file and redirected to it?
How can I 'redirect' to an HTML file that only exists as a client-side string?
You can do this:
var win=window.open("") //open new window and write to it
var html = generate_html();
win.document.write(html)
win.document.close();
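Another option for "redirecting" to a page that only exists as a string is to wrap the string in a Blob and navigate to an object URL for it. The browser should then load it as a fresh document, much as it would a saved file. This is a sketch rather than a tested drop-in: htmlToObjectUrl is a made-up name, and generate_html() is the placeholder from the snippet above.

```javascript
// Wrap an HTML string in a Blob and return a temporary "blob:" URL for it.
// The URL is only valid in the context that created it and should be
// released with URL.revokeObjectURL once it is no longer needed.
function htmlToObjectUrl(html) {
  const blob = new Blob([html], { type: "text/html" });
  return URL.createObjectURL(blob);
}

// Browser usage (replace the current page, or open it in a new window):
//   window.location.href = htmlToObjectUrl(generate_html());
//   window.open(htmlToObjectUrl(generate_html()));
```

Because the page is loaded from a URL rather than written into the existing document, scripts in its head should run in a clean document instead of on top of the current page's state.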
Maybe the eval() function would help here? It's hard to give an answer without seeing the code.
Never tried this, but I think it should be possible. Some thoughts on what might make it work:
Make sure the document containing your js is sent with the correct headers / mimetype / doctype
Serve the javascript in a valid way, for example by sending a w3c valid page containing the script tag.
Maybe then it works. If not, try to erase the current html before writing the new one.
Also, it might be helpful to look at how others managed to accomplish this task. If I remember correctly, the Google page is also essentially a short HTML page with a bunch of JS.
I'm writing a web app that inserts and modifies HTML elements via AJAX using JQuery. It works very nicely, but I want to be sure everything is ok under the bonnet. When I inspect the source of the page in IE or Chrome it shows me the original document markup, not what has changed since my AJAX calls.
I love using the W3C validator to check my markup, as it occasionally reminds me that I've forgotten to close a tag etc. How can I use this to check the markup of my page after the original source served from the server has been changed via Javascript?
Thank you.
Use the developer tools in Chrome to explore the DOM: they will show you all the HTML you've added with JavaScript.
You can now copy it and paste it in any validator you want.
Or, instead of inserting the markup with jQuery, log the raw string to the console; that way the browser has no chance to close tags for you.
console.log(myHTML)
Both previous answers make good points about the fact the browser will 'fix' some of the html you insert into the DOM.
Back to your question: you could add the following to a bookmark in your browser. It will write out the contents of the DOM to a new window, from which you can copy and paste it into a validator.
javascript:window.open("").document.open("text/plain", "").write(document.documentElement.outerHTML);
If you're just concerned about well-formedness (missing closing tags and such), you probably just want to check the structure of the chunks AJAX is inserting. (Once it's part of the DOM, it's going to be well-formed... just not necessarily the structure you intended.) The simplest way to do that would probably be to attempt to parse it using an XML library. (one with an HTML mode that can be made strict, if you're not using XHTML)
Actual validation (Testing the "You can't put tag X inside tag Y" rules which browsers generally don't care too much about) is a lot trickier and, depending on how much effort you're willing to put into it, may not be worth the trouble. (Because, if you validate them in isolation, you'll get a lot of "This is just a fragment" false positives)
Whichever you decide to use, you need to grab the AJAX responses before the browser parses them if you want a reliable test result. (While they're still just a string of text rather than a DOM tree)
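As a rough illustration of checking the AJAX response while it is still a string, here is a naive stack-based tag matcher. It is not an XML parser and not a substitute for the W3C validator: it is deliberately strict (it treats optional end tags such as </li> as required), and it does not understand comments, script contents, or ">" inside attribute values. The function name findUnbalancedTags is made up for this sketch.

```javascript
// Elements that never take a closing tag in HTML.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr"
]);

// Return a list of human-readable problems; an empty list means every
// opening tag in the fragment was closed in the expected order.
function findUnbalancedTags(fragment) {
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*?(\/?)>/g;
  const open = [];      // stack of currently open tag names
  const problems = [];
  let match;
  while ((match = tagPattern.exec(fragment)) !== null) {
    const isClosing = match[1] === "/";
    const name = match[2].toLowerCase();
    const isSelfClosing = match[3] === "/";
    if (VOID_ELEMENTS.has(name) || isSelfClosing) continue;
    if (!isClosing) {
      open.push(name);
    } else if (open.length > 0 && open[open.length - 1] === name) {
      open.pop();
    } else {
      problems.push("unexpected </" + name + ">");
    }
  }
  for (const name of open) {
    problems.push("unclosed <" + name + ">");
  }
  return problems;
}

// Usage: run it on the response text before handing it to jQuery, e.g.
//   var problems = findUnbalancedTags(responseText);
//   if (problems.length) console.warn(problems);
```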