Get the html of the javascript-rendered page (after interacting with it)

I would like to be able to save the state of the html page after I've interacted with it.
Say I click a checkbox, or the javascript set the values of various elements.
How can I save the "javascript-rendered" page?
Thanks.

In the Chrome console (and apparently in Firefox's too), there is a special copy() function that copies the rendered content to the clipboard. You can then paste it into your preferred text editor and do whatever you want with it.
https://developers.google.com/chrome-developer-tools/docs/commandline-api#copyobject
Console Example:
copy(document.body.innerHTML);
Note: Chrome reports undefined after the call. That is just the console printing the return value of copy(); the call still executes correctly and the right content ends up in the clipboard.

This should do it; it grabs the whole page, not just the body:
console.log(document.getElementsByTagName('html')[0].innerHTML);

document.body.innerHTML will get you the HTML representation of the current document body.
That will not necessarily include all internal state of DOM objects because the HTML contains the initial default state of objects, not necessarily the state that they may have been changed to. The only way to guarantee you get all that state is to make a list of what state you want to save and actually programmatically get that state.
To answer the part of your question about saving it, you'll have to describe more about what problem you're really trying to solve.
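A minimal sketch of what "programmatically get that state" could look like, assuming the state of interest is the live value of form controls. The function name collectFormState and the choice to key entries by id (falling back to name) are illustrative assumptions, not part of the answer above.

```javascript
// Collect the live state of form controls into a plain object.
// `controls` is any iterable of element-like objects, e.g. the NodeList
// returned by document.querySelectorAll('input, select, textarea').
function collectFormState(controls) {
  const state = {};
  for (const el of controls) {
    const key = el.id || el.name;
    if (!key) continue; // skip controls that cannot be identified
    if (el.type === 'checkbox' || el.type === 'radio') {
      state[key] = el.checked; // live checked property, not the attribute
    } else {
      state[key] = el.value; // live value property, not the attribute
    }
  }
  return state;
}

// Assumed usage in a browser console:
// copy(JSON.stringify(collectFormState(
//   document.querySelectorAll('input, select, textarea'))));
```

Radio buttons that share a name and have no id would overwrite each other in this sketch, so give them ids or extend the keying if that matters.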

To get the equivalent of view source with the JavaScript-rendered content, including the doctype and html tags, paste this command into the Chrome console:
console.log(new XMLSerializer().serializeToString(document.doctype) + document.getElementsByTagName('html')[0].outerHTML);
In the Chrome console, hover at the end of the output and click the copy link to copy it to the clipboard.
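The one-liner above assumes the page has a doctype; document.doctype is null when it does not, and serializing null fails. A hedged variant that guards against this is sketched below. The function name serializePage and passing the serializer in as a parameter are assumptions made for illustration.

```javascript
// Serialize a document to a string, including the doctype when present.
// `doc` is a Document; `serializer` is an XMLSerializer instance.
function serializePage(doc, serializer) {
  // doc.doctype is null for pages without a <!DOCTYPE ...> declaration.
  const doctype = doc.doctype
    ? serializer.serializeToString(doc.doctype) + '\n'
    : '';
  return doctype + doc.documentElement.outerHTML;
}

// Assumed usage in a browser console:
// copy(serializePage(document, new XMLSerializer()));
```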

Related

Where are input and dom values stored?

This is a very basic question but I am not sure how to research it. Let's say I have an HTML file with an input field and a JavaScript file that contains a function to grab the value entered in the input:
HTML:
<input type='text' id='value' onclick='getValue()'>
JS File:
var val;
function getValue(){
val = document.getElementById('value').value;
console.log(val)
}
When a user inputs a value, that value is stored in the DOM. I then grab the value from the DOM and store it in my script file loaded in the browser.
Since this is now stored within my script file, if I were to reload the page, all the stored values would be reset to their original values.
Is it accurate to say the value taken from the DOM is stored in my loaded javascript file? Or is that stored someplace else?

The input value gets stored in the DOM tree (if you want to know where exactly, see the source code of e.g. Chrome or Firefox). The JavaScript code you posted makes a copy of that value. The copy is independent of the value stored in the DOM tree: you can delete the input element and will still have the copy in JavaScript. So the answer is threefold:
the value is in the DOM-tree first
a copy of the input value is in the JavaScript Stack when you copy it and in the DOM-tree; at least I know of no one-step way to move it.
if you delete the DOM element the copy you made in JavaScript stays in JavaScript
That makes it possible for example to run JavaScript and DOM-parsing with two distinct programs. Chrome does it with their V8-machine which you can put into some thin wrap and run it separately. You may call the result "node" if you want.
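The independence of the copy can be sketched without a browser. Here a plain object stands in for the DOM tree (an assumption for illustration only); the point is that a string read from .value is copied by value, so later changes to the source, or its removal, do not affect the variable.

```javascript
// A plain object standing in for the DOM tree; `input` plays the role of
// document.getElementById('value'). This is a stand-in, not a real DOM.
function copyThenDelete() {
  const fakeDom = { input: { value: 'first' } };

  const val = fakeDom.input.value; // copy the string into a JS variable
  fakeDom.input.value = 'second';  // the "DOM" value changes afterwards
  const live = fakeDom.input.value;
  delete fakeDom.input;            // the "element" is removed entirely

  return { val, live, elementStillExists: 'input' in fakeDom };
}

console.log(copyThenDelete());
```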

Saying that it's stored "in [your] script file" is a little misleading, but technically accurate.
Any data for DOM elements is stored within the browser's memory for that tab, which it manages itself. Since your JavaScript is just a part of that memory, it is in some way stored there. The actual fiddly bits of how that all works are hidden within the browser, and probably differ between browsers.

How does this JS copy trick work?

On this page, if you copy text from almost anywhere, you'll get the string Read more at http:// added to the end of what you copied. I was wondering how. After looking at the source (post-copypaste.js) and setting a breakpoint I still didn't understand. That area seems to fire when I select text.
I tried looking at the DOM (via view selection source in Firefox) and I didn't see the text in the DOM. So it must be a JavaScript trick. I can imagine catching a Ctrl+C event (I don't know if that is what is happening), but I can't imagine how you can add to or affect the text being copied, since it belongs to the DOM. I don't see flickering or anything.
How does that JS trick work, or how do I debug it to figure it out?

But the awkward thing is the selection on the regular window/dom doesn't seem to be affected.
It is, but just not visible. What usually happens is there is a container somewhere else on the page (not necessarily visible). The content you have selected is being pasted in there, then extended, then copied and deleted from the container. It all needs a fraction of a second and by the time you paste it in somewhere, your clipboard is already storing the extended content.
If you look closely on the page you have linked in as an example, there is an empty div tag in the body with a class of pw-root. <div class='pw-root'></div> When you copy the text, for a second (visible in Firebug for instance) it changes as explained above then gets emptied again.
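For comparison, a different technique from the hidden-container one described above: browsers that support the copy event let a handler write to the clipboard directly through event.clipboardData. The sketch below is not what the linked page does; the function names and the url parameter are hypothetical.

```javascript
// Build the text that should end up in the clipboard.
function withAttribution(text, url) {
  return text + '\n\nRead more at ' + url;
}

// Copy-event handler body: replace the clipboard text with the extended text.
// `event` is a ClipboardEvent, `selectedText` the current selection as a string.
function handleCopy(event, selectedText, url) {
  event.clipboardData.setData('text/plain', withAttribution(selectedText, url));
  event.preventDefault(); // stop the browser from copying the plain selection
}

// Assumed wiring in a browser:
// document.addEventListener('copy', (e) =>
//   handleCopy(e, String(window.getSelection()), location.href));
```

With this approach the visible selection and the DOM are never touched, which is another reason you would see no flickering.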

How can we get a tab element when the document is known?

Before explaining the question, I want to explain my main goal (If there is a better way than my approach):
I have the document element available with me and ideally I wanted to get a browser element such that it identifies a tab uniquely. In my previous approach I used
gBrowser.getBrowserForDocument(doc);
This returned the browser which was indeed unique to the tab (in the sense that attributes stored in it persisted across pages).
If instead, I don't store the browser element, and after moving to another page in the same tab I try the above command again, then the browser is no longer the same one as before (in the sense that it has lost all the stored attributes).
Therein lies my main problem. I want to get hold of the tab browser which I am able to refer to using different documents loaded in the same tab.
I read about a similar function:
gBrowser.getBrowserForTab(tab);
I have a feeling this might work. But again, I am not able to understand where I can get the parameter "tab" from (given a document).
Note: I am using GWT for the development of the extension
EDIT: To clarify the intent of the question, here's the use case as well as my approach:
In my extension, I am interested in monitoring user behaviour on particular websites. In a way it can be thought of as a session which remains active as long as the user stays on the same website. During the session, I am often required to store various attributes specific to user behaviour. One of the attributes relevant to the question is "isSessionActive": "Y" or "" (a blank string stands for no).
To make the code more efficient, I do not instantiate a browser for all the tabs at the beginning. Instead, I wait for a cue from an onLoad function: a relevant website being visited.
Once that happens, I make a call to get the browser using the current document element, see if it has a non empty value for the attribute isSessionActive. If it does not, I set the attributes value to "Y" and instantiate my class which handles the profiling after that.
If it has value "Y", I know that the session is still active and that I don't need to initialize.
The problem which I'm facing is that after the first instantiation, when I move to another page within the same tab, I expected that the call to
gBrowser.getBrowserForDocument(doc);
would get me the browser instantiated previously since it is basically the same tab.
This is not happening. Each time I get a new Browser instance which does not have the attribute isSessionActive as "Y" (probably because the new page has a new document element). Thus, at present all my code instantiates over and over again which is what I do not want.

If you're only working with the current tab (and not any background tabs), then you could just use gBrowser.selectedTab https://developer.mozilla.org/en/XUL/tabbrowser#p-selectedTab

Jquery (input/textarea).val(): how is it adding content without changing the DOM?

take a look at the JsFiddle here:
http://jsfiddle.net/ru2Fg/2/
Essentially, it starts with two textareas: one empty, one with stuff inside, and an input type=text. I was under the impression that to put stuff in an input you change its value, and to put stuff in a textarea you add the text as a child of the node.
I perform a $(...).val(...) to change their contents. And their contents do change.
However, the DOM looks exactly the same! I'm printing out the 3 elements with console.log(); they seem unchanged. I look at them with chrome's inspect element: they seem unchanged.
I've looked at "jQuery's val() method change doesn't seem to change the DOM", but that question concludes it's something funny with Firebug not refreshing the HTML it displays. In this case, I'm quite sure inspect element displays the current HTML that exists on the page: I've seen the left attribute changing furiously when things are scrolling, for example. I'm also checking it using the console, which tells me the same thing: nothing changed.
My eyes, though, tell me something has changed, as I'm seeing "10, omg, moo" instead of "blank, hello world, 2000". What's going on?
EDIT: I posted the wrong jsFiddle. This should be the correct one now

There is a difference between the value attribute and the value property. When you type in the input box, you are changing the property, not the attribute. The attribute stays the same as when the document was loaded. Among other things, this means you can reset an input box to its default value with elem.value = elem.getAttribute('value');.
Similarly, if you have a drop-down <select> with one of the options having the selected attribute set, even if you choose a different option that attribute will still be there even though the selected property is now false.
The same applies to checkboxes and the checked attribute. The same also applies for the disabled attribute, and several other things too.
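A small helper can make the attribute/property split visible. This is a sketch; the name compareValue is made up for illustration, and it only needs an object with getAttribute() and a value property, which is what an input element provides.

```javascript
// Report the value attribute (from the markup) next to the value property
// (the live state) for an input-like element.
function compareValue(el) {
  const attribute = el.getAttribute('value'); // null if the attribute is absent
  const property = el.value;                  // what the user currently sees
  return {
    attribute,
    property,
    differs: (attribute ?? '') !== property
  };
}

// Assumed usage in a browser console, after typing into the field or
// calling $(...).val(...):
// compareValue(document.querySelector('input[type=text]'));
```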

It is in fact changing the DOM; otherwise the 10 wouldn't have shown up in the textarea at all. The problem is in Firebug itself (at least the old version); I am not sure whether it still exists in newer ones.
To verify, you can use the web console of firefox or console of chrome.

The DOM is completely loaded before anything jQuery happens, so technically the data inserted in the DOM isn't seen by debuggers. The debugging tools see only what is rendered so you won't be able to manipulate the "after the fact" data that arrives via jQuery. You could consider it "out of band" or fudging the DOM in a way. The same happens with AJAX. If you add in data or page content with AJAX methods like .load() you won't see it in the DOM.
An input box in jQuery has the val() method -- which is the value attribute, in the textarea, it is usually the html() method, what the textarea contains.

How to set a text value for document.activeElement?

Knowing that my document.activeElement is an input field (I don't know exactly the name of the component, but may be the Google's search input field, for example), how can I set a text on it programmatically?
--update
I'm trying it from a XUL application, via JavaScript after the page is loaded. A paste command works fine, so I know the field has the focus. (And I didn't add the XUL tag because it's just about the JavaScript.)

See the mozilla reference. This is the same type as document.getElementById()
document.activeElement.value = 'new value';

If you are sure it is a input text field, just set the value:
document.activeElement.value = 'value'
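If you are not sure, a guarded version avoids silently setting a value on an element that has no such property (for example the body, which is the active element when nothing is focused). The function name setActiveElementValue is an assumption for illustration.

```javascript
// Set the text of the focused element, but only if it has a value property.
// `doc` is a Document; returns true when the value was set.
function setActiveElementValue(doc, text) {
  const el = doc.activeElement;
  if (!el || !('value' in el)) {
    return false; // nothing focused, or the focused element is not a field
  }
  el.value = text;
  return true;
}

// Assumed usage in a page script:
// setActiveElementValue(document, 'new value');
```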

Without seeing your code and the context it is running in, I can only speculate. However, my guess is that you are calling document.activeElement from your XUL app, which means document is the chrome document, not the content page. In this case, the active element is likely to be the browser or iframe element you are using to display the content.

I think there's a little more trouble because I'm in a Xul app. Javascript was supposed to work like in the browsers, but it didn't.
What I did to make it work was (after putting the content in the clipboard):
controller.doCommand('cmd_selectAll');
controller.doCommand('cmd_paste');

If you want the focused element wherever it may be relative to the given application window, e.g. it may be inside a <browser> element, use document.commandDispatcher.focusedElement.value which is the same as document.commandDispatcher.focusedWindow.document.activeElement.value. This gives you the element that cmd_paste operates on.
