document.documentElement.outerHTML returns a different HTML - javascript

I'm trying to make a js app that reads and checks HTML/JS code inserted by the user. I'm using document.[some div].outerHTML to retrieve the full html contained in that element.
But I noticed that IE10 (and other versions of IE) modifies the HTML returned by the .outerHTML property. In particular, it changes the order of the attributes of any element. For example:
Original code:
<a id="test" class="myclass" name="test" href="mylink">link</a>
Becomes something like this in the HTML returned by .outerHTML:
<a name="test" href="mylink" id="test" class="myclass">link</a>
As you can see, the name and href attributes are reordered.
The odd thing is that, if I inspect the code via "view source" on IE, the code is correct and not-reordered.
How come this happens only with .outerHTML? I tried with other browsers, and only IE seems to have this behavior.
Is there a way to retrieve the same HTML as "view source" from JS?
Thanks in advance,
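Since attribute order carries no meaning in HTML, one possible workaround for the checking part is to compare attributes as a set instead of comparing serialized strings. This is only a minimal sketch, not a way to recover the "view source" text: parseAttributes and sameAttributes are hypothetical helper names, and the regular expression assumes every attribute is written as name="value" with double quotes.

```javascript
// Hypothetical helper: extracts attributes from a single opening tag string.
// Assumption: every attribute has the form name="value" (double quotes only).
function parseAttributes(openingTag) {
  const attrs = new Map();
  const attrPattern = /([^\s"'<>\/=]+)\s*=\s*"([^"]*)"/g;
  let match;
  while ((match = attrPattern.exec(openingTag)) !== null) {
    // Attribute names are case-insensitive in HTML, so normalize them.
    attrs.set(match[1].toLowerCase(), match[2]);
  }
  return attrs;
}

// Returns true when both tags carry the same attributes, in any order.
function sameAttributes(tagA, tagB) {
  const a = parseAttributes(tagA);
  const b = parseAttributes(tagB);
  if (a.size !== b.size) return false;
  for (const [name, value] of a) {
    if (b.get(name) !== value) return false;
  }
  return true;
}
```

With the two tags from the question, sameAttributes reports them as equivalent even though IE serialized the attributes in a different order.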

Related

How does a browser render this inline JavaScript within an encoded tag?

I was trying to perform a reflected XSS attack on a tutorial website. The webpage basically consists of a form with an input field and a submit button. On submitting the form, the content of the input field is displayed on the same webpage.
I figured out that the website is blacklisting the script tag and some of the JavaScript methods in order to prevent an XSS attack. So, I decided to encode my input and then tried submitting the form. I tried 2 different inputs; one of them worked and the other one didn't.
When I tried:
<body onload="&#97lert('Hi')"></body>
It worked and an alert box was displayed. However, when I encoded some characters in the HTML tag, something like:
&#60body onload="&#97lert('Hi')"&#62&#60/body&#62
It didn't work! It simply printed <body onload="alert('Hi')"></body> as it is on the webpage!
I know that browsers execute inline JavaScript as they parse an HTML document (please correct me if I'm wrong). But I'm not able to understand why the browser behaved differently for the two inputs that I've mentioned.
-------------------------------------------------------------Edit---------------------------------------------------------
I tried the same with a more basic XSS tutorial with no XSS protection. Again:
<script>alert("Hi")</script> -> Worked!
&#60s&#99ript&#62&#97lert("Hi")&#60/s&#99ript&#62 -> Didn't work! (Got printed as string on the Web Page)
So basically, if I encode anything in JavaScript, it works. But if I'm encoding anything that is HTML, it's not executing the JavaScript within that HTML!
I can't come up with words to describe this properly, so I'll just give you an example. Let's say we have this string:
<div>Hello World! &lt;span id="foo"&gt;Foobar&lt;/span&gt;</div>
When this gets parsed, you end up with a div element that contains the text:
Hello World! <span id="foo">Foobar</span>
Note, while there is something that looks like html inside the text, it is still just text, not html. For that text to become html, it would have to be parsed again.
Attributes work a little bit differently: HTML entities in attribute values do get decoded the first time.
tl;dr:
If the service you are using is stripping out tags, there's nothing you can do about it unless the script is poorly written in a way that results in the string getting parsed twice.
Demo: http://jsfiddle.net/W6UhU/ Note how, after setting the div's innerHTML equal to its inner text, the span becomes an HTML element rather than a string.
When an HTML page says &#60body, it treats it the same as if it said &lt;body
That is, it just displays the characters that the references stand for and doesn't parse them as HTML. So you're not creating a new tag with an onload attribute http://jsfiddle.net/SSfNw/1/
alert(document.body.innerHTML);
// When an HTML page says &lt;body It treats it the same as if it said &lt;body
So in your case, you're never creating a body tag, just content that ends up getting moved into the body tag http://jsfiddle.net/SSfNw/2/
alert(document.body.innerHTML)
// &lt;body onload="alert('Hi')"&gt;&lt;/body&gt;
In the case <body onload="&#97lert('Hi')"></body>, the parser is able to create the body tag, once within the body tag, it's also able to create the onload attribute. Once within the attribute, everything gets parsed as a string.
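The order of operations described in these answers can be illustrated with a toy model: markup is split into tags and text first, and character references are decoded only afterwards, so a decoded "<" never gets a second chance to start a tag. This is a rough sketch under heavy assumptions, not how a real HTML parser is implemented; tokenize and decodeNumericRefs are hypothetical names, and only decimal numeric references are handled.

```javascript
// Toy model only: real HTML parsers follow far more detailed tokenization rules.

// Decode decimal numeric character references such as &#97 or &#97;
// (the semicolon is treated as optional, as in the inputs above).
function decodeNumericRefs(text) {
  return text.replace(/&#(\d+);?/g, (_, code) => String.fromCodePoint(Number(code)));
}

// Split the raw input into tag tokens and text tokens BEFORE decoding anything.
function tokenize(raw) {
  const tokens = [];
  const tagPattern = /<\/?[a-zA-Z][^>]*>/g; // only a literal "<" can start a tag
  let last = 0;
  let match;
  while ((match = tagPattern.exec(raw)) !== null) {
    if (match.index > last) {
      tokens.push({ type: "text", value: decodeNumericRefs(raw.slice(last, match.index)) });
    }
    // Simplification: decodes the whole tag string; a real parser decodes
    // references only inside attribute values.
    tokens.push({ type: "tag", value: decodeNumericRefs(match[0]) });
    last = tagPattern.lastIndex;
  }
  if (last < raw.length) {
    tokens.push({ type: "text", value: decodeNumericRefs(raw.slice(last)) });
  }
  return tokens;
}
```

Passing the first input from the question yields two tag tokens, with the onload value decoded to alert('Hi'). Passing the fully encoded input yields a single text token that merely looks like markup, which matches what was printed on the page.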

prevent auto tag generate in javascript innerHTML

Why does
theMenuSection.innerHTML='<ul>';
alert(theMenuSection.innerHTML);
produce <ul> </ul> when I only set the opening tag? Can I get just <ul> back from innerHTML?
thanks
Typically (this is browser dependent), when you interact with the DOM or use innerHTML (which interacts with the DOM), you will see the browser's "corrected" interpretation of what the HTML should look like. It's the same reason why Firefox auto-injects "tbody" elements into tables when your source code doesn't have any. The browser is trying to infer 'correct' HTML from what it sees.
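Because innerHTML always reflects the parsed DOM, the original string cannot be read back from it. If the exact string is needed later, one option is to remember it yourself at the moment you assign it. A minimal sketch, assuming hypothetical helpers named setMarkup and getRawMarkup:

```javascript
// Remembers the exact string that was assigned, keyed by element.
// A WeakMap lets entries be garbage collected together with their elements.
const rawMarkup = new WeakMap();

// Assigns markup to an element and keeps the untouched original string.
function setMarkup(element, html) {
  rawMarkup.set(element, html); // the string exactly as written
  element.innerHTML = html;     // the browser parses and may normalize this
}

// Returns the string last passed to setMarkup for this element, if any.
function getRawMarkup(element) {
  return rawMarkup.get(element);
}
```

After setMarkup(theMenuSection, '<ul>'), reading theMenuSection.innerHTML still gives the browser's corrected markup, while getRawMarkup(theMenuSection) returns the '<ul>' that was originally passed in.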

Source code doesn't show up

Basically, if I right-click in any browser and choose to view source, the code won't show up, even though I can clearly see the content on the page (tried on IE, Firefox, Chrome)
If I use the "inspect element" feature of Chrome/Firefox, I can however view the code
This is the respective code of my index.html:
<!-- [TABLE] -->
<div id="centercol" align="center">
<table id="table">
</table>
</div>
I'm using appendChild() to add the tr/td's in my javascript
InspectElement : http://i.imgur.com/pZBb5.png
View Source : http://i.imgur.com/W7pXm.png
Why does this happen?
Viewing source code sees the hard code / static code, inspecting DOM shows dynamic code as it's generated. You can get the generated source code using innerHTML.
The source code is the original document, unmodified by JavaScript
Inspect element shows you the serialization of the DOM, which is basically the markup that is visually represented on screen.
The "source code" is the original response body sent from server. When you inspect element, it represents the live state of the page in serialized form.
For instance, literally just sending this from server:
<script>
Might become this in inspector as the above is parsed and serialized:
<html><head><script></script></head><body></body></html>
It's happening because "view source" doesn't run JavaScript.
If your entire page is generated by JS, then view source will only show the static markup that was in the original document.
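If the goal is to get the "view source" text from script, one approach is to request the document again and read the response body as plain text, so nothing is parsed or re-serialized. A minimal sketch, assuming an environment with the standard fetch API; fetchOriginalSource is a hypothetical name, and the server may of course answer a second request differently from the first.

```javascript
// Re-requests a URL and resolves with the response body as text.
// fetchImpl is injectable so the function can be exercised outside a browser.
async function fetchOriginalSource(url, fetchImpl = fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  // text() returns the body as received, without building a DOM from it.
  return response.text();
}
```

In a browser this could be called as fetchOriginalSource(location.href).then(function (source) { console.log(source); }); the logged string is the markup as sent by the server, not the serialized DOM.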

Get raw HTML from a div using js?

I'm working on a website where users can create and save their own HTML forms. Instead of inserting form elements and ids one by one in the database I was thinking to use js (preferably jquery) to just get the form's HTML (in code source format) and insert it in a text row via mysql.
For example I have a form in a div
<div class="new_form">
<form>
Your Name:
<input type="text" name="something" />
About You:
<textarea name=about_you></textarea>
</form>
</div>
With js is it possible to get the raw HTML within the "new_form" div?
To get all HTML inside the div
$(".new_form").html()
To get only the text it would be
$(".new_form").text()
You might need to validate the HTML, this question might help you (it's in C# but you can get the idea)
Yes, it is. You use the innerHTML property of the div. Like this:
var myHTML = document.querySelector('.new_form').innerHTML; // new_form is a class in the example above, not an id
Note when you use innerHTML or html() as above you won't get the exact raw HTML you put in. You'll get the web browser's idea of what the current document objects should look like serialised into HTML.
There will be browser differences in the exact format that comes out, in areas like name case, spacing, attribute order, which characters are &-escaped, and attribute quoting. IE, in particular, can give you invalid HTML where attributes that should be quoted aren't. IE will also, incorrectly, output the current values of form fields in their value attributes.
You should also be aware of the cross-site-scripting risks involved in letting users submit arbitrary HTML. If you are to make this safe you will need some heavy duty HTML ‘purification’.
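One small piece of that picture: when saved form markup only needs to be shown back as source code (for example in an editor preview), it can be escaped so the browser treats it as text. This is only a sketch with a hypothetical escapeHtml helper. It is not a purifier, and it does not make it safe to render user-submitted HTML as live markup.

```javascript
// Escapes the characters that are significant in HTML text and attribute values,
// so the result is displayed as literal text instead of being parsed as markup.
function escapeHtml(text) {
  const replacements = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;"
  };
  return String(text).replace(/[&<>"']/g, (ch) => replacements[ch]);
}
```

For example, escapeHtml('<input type="text" name="something" />') produces a string that shows up on the page as the original tag text rather than as an actual input field.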

How is it possible to dynamically change the flash file using swfobject through javascript

I'm looking to dynamically change the Flash file based on a hyperlink on the page, without the page having to reload. Is this possible through JavaScript?
Yes, it's possible. See this tutorial:
http://learnswfobject.com/advanced-topics/load-a-swf-using-javascript-onclick-event/
Do you mean you want to change the destination of a hyperlink on the page through javascript? Something like this should do that:
<body>
<a id="test" href="http://www.google.com">Go to Google</a>
<input type="button" onclick="document.getElementById('test').href = 'http://www.yahoo.com'" value="Go to Yahoo"/>
</body>
Or are you trying to do something different? Like changing what Flash is displayed by clicking a hyperlink? Haven't tried it, but the above approach might work there, too, with a little tweaking. Remember that the attributes of an element (like the href above) are available in javascript as properties once you have a reference to the element, so you should be able to change whatever attribute of the element you need to.
