I recently was reading a JavaScript book and discovered using innerHTML to pass plain text poses a security risk, so I was wondering does using the html() jQuery method pose these same risks? I tried to research it but I could not find anything.
For Example:
$("#saveContact").html("Save"); //change text to Save
var saveContact = document.getElementById("saveContact");
saveContact.innerHTML = "Save"; //change text to Save
These do the same thing from what I know, but do they both pose the same security risk of someone being able to inject some JavaScript and execute it?
I am not very knowledgeable in security, so I apologize in advance if anything is incorrect or explained incorrectly.
From the jQuery documentation:
Additional Notes:
By design, any jQuery constructor or method that
accepts an HTML string — jQuery(), .append(), .after(), etc. — can
potentially execute code. This can occur by injection of script tags
or use of HTML attributes that execute code (for example, <img onload="">). Do not use these methods to insert strings obtained from
untrusted sources such as URL query parameters, cookies, or form
inputs. Doing so can introduce cross-site-scripting (XSS)
vulnerabilities. Remove or escape any user input before adding content
to the document.
So, for example, if the user were to pass an HTML string that contains a <script> element, then that script would be executed:
$("#input").focus();
$("#input").on("blur", function(){
$("#output").html($("#input").val());
});
textarea { width:300px; height: 100px; }
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="input"><script>alert("The HTML in this element contains a script element that was processed! What if the script contained malicious content?!")</script></textarea>
<div id="output">Press TAB</div>
But, if we escape the string's contents before we pass it, we're safer:
$("#input").focus();
$("#input").on("blur", function(){
$("#output").html($("#input").val().replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;"));
});
textarea { width:300px; height: 100px; }
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="input"><script>alert("This time the < and > characters (which signify an HTML tag) are escaped into their HTML entity codes, so they won't be processed as HTML.")</script></textarea>
<div id="output">Press TAB</div>
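A slightly fuller sketch of the same escaping idea, in case it helps. The helper below (escapeHtml is a made-up name, not a jQuery function) replaces every occurrence of the five characters that matter in HTML text and attribute values, rather than only angle brackets. It is an illustration under those assumptions; a proven library function is still the safer choice.

```javascript
// Hypothetical helper, not part of jQuery: escapes the characters that
// have special meaning in HTML text and in quoted attribute values.
function escapeHtml(value) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;"
  };
  // The g flag makes sure every occurrence is replaced, not only the first.
  return String(value).replace(/[&<>"']/g, (ch) => entities[ch]);
}
```

Passing escapeHtml($("#input").val()) to .html() should then show the markup as literal text instead of running it.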
Finally, the best way to avoid processing a string as HTML is not to pass it to .innerHTML or .html() in the first place. That's why we have .textContent and .text() - they do the escaping for us:
$("#input").focus();
$("#input").on("blur", function(){
// Using .text() escapes the HTML automatically
$("#output").text($("#input").val());
});
textarea { width:300px; height: 100px; }
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="input"><script>alert("This time nothing will be processed as HTML.")</script></textarea>
<div id="output">Press TAB</div>
From the .html() docs:
By design, any jQuery constructor or method that accepts an HTML string — jQuery(), .append(), .after(), etc. — can potentially execute code. This can occur by injection of script tags or use of HTML attributes that execute code (for example, <img onload="">). Do not use these methods to insert strings obtained from untrusted sources such as URL query parameters, cookies, or form inputs. Doing so can introduce cross-site-scripting (XSS) vulnerabilities. Remove or escape any user input before adding content to the document.
This is why .innerHTML is risky, and why .html() should likewise not be used on strings from untrusted sources, for example data fetched with an Ajax request from an untrusted third party. You should use one of the numerous methods here or, better still, a proven library function.
Related
I am trying to get the EXACT html content of a div.
When using the html() function from jQuery, the result does not match the actual content.
Please check this fiddle and click on the black square:
http://jsfiddle.net/qRska/6/
The code:
<div id="mydiv" style="width:100px; height: 100px; background-color:#000000; cursor:pointer;">
<div id="INSIDE" style="background-color:#ffffff; border-style:none;"></div>
</div>
$('#mydiv').click(function() {
alert($(this).html());
});
jQuery changes the color to RGB format and removes the border-style attribute.
How can I solve this problem?
The browser consumes the HTML, generates a DOM, then discards the HTML. innerHTML (which is what .html() eventually hits) gives a serialisation of the DOM back to HTML.
If you want to get the raw HTML, then you'll need to use XMLHttpRequest to fetch the source code of the current URL and then process it yourself.
What you want to do is unfortunately not possible. The original HTML is not available after it is parsed by the browser, so you have to jump through some hoops to prevent the browser from processing it.
One possible solution that I've used before is to wrap the HTML in comment tags, which would remain unchanged by the browser. You can then extract the comment using jQuery's .text() method; strip out the comment tags with string replacement; make the necessary changes to the markup; and then inject it back into the document.
The other alternative is to use AJAX to load the HTML. Make sure you set the dataType to 'text' so it doesn't get processed by the browser.
I have some script written using the jQuery framework.
var site = {
link: $('#site-link').html()
}
This gets the html in the div site-link and assigns it to link. I later save link to the DB.
My issue is that I don't want the HTML, as I see this as being too dangerous, maybe?
I have tried:
link: $('#site-link').val()
... but this just gives me a blank value.
How can I get the value inside the div without any markup?
Try doing this:
$('#site-link').text()
From the jQuery API Documentation:
Get the combined text contents of each element in the set of matched
elements, including their descendants, or set the text contents of the
matched elements.
Use the .text() jQuery method like this:
var site = {
link: $('#site-link').text()
}
Here is an example of what .val(), .html() and .text() do: jsfiddle example
Use the text() method.
Get the combined text contents of each element in the set of matched elements, including their descendants, or set the text contents of the matched elements.
Use the .text() function of jQuery to get only the text.
var site = {
link: $('#site-link').text()
}
To avoid HTML, use jQuery's text() method.
var site = {
link: $('#site-link').text()
}
http://api.jquery.com/text/
If you are planning to store the result in the database and you are concerned about HTML, then using something like .text() rather than .html() is just an illusion of security.
NEVER EVER trust anything that comes from the client side!
Everything on the client side is replaceable and can be hijacked by the client rather easily. With the Tamper Data Firefox plugin, for example, even my mother could change the data sent to the server. She could send in anything in place of the link: malicious scripts, whole websites, etc.
It is important that you validate the "link" on the server side before saving it to the database. You can write a regex to check whether a string is a valid URL, or just strip out everything that is HTML.
It's also a good idea to HTML-encode it before outputting. This way, even if HTML gets into your database, after encoding it will be just a harmless string (well, there is other stuff to be aware of, like UTF-7, but the web is a dangerous place).
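As a rough sketch of that server-side check, assuming the server happens to run JavaScript (Node.js, which the question does not say): the function below, isHttpUrl (a hypothetical name), accepts a value only if it parses as an absolute http or https URL. It is a minimal illustration, not a complete validation policy, and the value should still be HTML-encoded on output.

```javascript
// Hypothetical server-side check (assumes a Node.js backend):
// accept only strings that parse as absolute http(s) URLs.
function isHttpUrl(value) {
  if (typeof value !== "string") {
    return false;
  }
  let parsed;
  try {
    // The built-in URL constructor throws a TypeError on unparseable input.
    parsed = new URL(value);
  } catch (e) {
    return false;
  }
  // Reject schemes such as javascript: or data: that can carry script.
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}
```

A request handler could reject the submission when isHttpUrl(link) is false, before anything reaches the database.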
I'm using Varnish+ESI to return external JSON content from a RESTful API.
This technique allows me to manage request and refresh data without using webserver resources for each request.
e.g:
<head>
....
<script>
var data = <esi:include src='apiurl/data'>;
</script>
...
After processing the ESI include, Varnish will return:
var data = {attr:1, attr2:'martin'};
This works fine, but if the API returns an error, this technique will generate a parse error.
var data = <html><head><script>...api js here...</script></head><body><h1 ... api html ....
I solved this problem using a hidden div to parse and catch the error:
...
<b id=esi-data style=display:none;><esi:include src='apiurl/data'></b>
<script>
try{
var data = $.parseJSON($('#esi-data').html());
}catch(e){ alert('manage the error here');}
....
I've also tried using a script type text/esi, but the browser renders the html inside the script tag (wtf), e.g:
<script id=esi-data type='text/esi'><esi:include src='apiurl/data'></script>
Question:
Is there any way to wrap the tag so that the browser doesn't parse it?
Let me expand upon the iframe suggestion I made in my comment—it's not quite what you think!
The approach is almost exactly the same as what you're doing already, but instead of using a normal HTML element like a div, you use an iframe.
<iframe id="esi-data" src="about:blank"><esi:include src="apiurl/data"></iframe>
var $iframe = $('#esi-data');
try {
var data = $.parseJSON($iframe.html());
} catch (e) { ... }
$iframe.remove();
#esi-data { display: none; }
How is this any different from your solution? In a few ways:
The data/error page are truly hidden from your visitors. An iframe has an embedded content model, meaning that any content within the <iframe>…</iframe> tags gets completely replaced in the DOM—but you can still retrieve the original content using innerHTML.
It's valid HTML5… sort-of. In HTML5, markup inside iframe elements is treated as text. Sure, you're meant to be able to parse it as a fragment, and it's meant to contain only phrasing content (and no script elements!), but it's essentially just treated as text by the validator—and by browsers.
Scripts from the error page won't run. The content gets parsed as text and replaced in the DOM with another document—no chance for any script elements to be processed.
Take a look at it in action. If you comment out the line where I remove the iframe element and inspect the DOM, you can confirm that the HTML content is being replaced with an empty document. Also note that the embedded script tag never runs.
Important: this approach could still break if the third party added an iframe element into their error page for some reason. Unlikely as this may be, you can bulletproof the approach a little more by combining your technique with this one: surround the iframe with a hidden div that you remove when you're finished parsing.
Here I go with another attempt.
Although I believe you already have possibly the best solution for this, the only workaround I can imagine is a fairly low-performance one: call esi:include in a separate HTML page, then retrieve its contents on the server as if you were using AJAX. Perhaps similar to this? Then check the contents you retrieved, maybe by using json_decode, and generate an error JSON string if decoding fails.
The greatest downside I see is that this would be very resource-consuming and would most likely delay your requests, since the separate page is requested as if your server itself were a client, parsed, and then sent back.
I'd honestly stick to your current solution.
This is a rather tricky problem with no really elegant solution, if it has a solution at all.
I asked you whether it was an HTML(5) or XHTML(5) document because in the latter case a CDATA section can be used to wrap the content, changing your solution slightly to something like this:
...
<b id='esi-data' style='display:none;'>
<![CDATA[ <esi:include src='apiurl/data'> ]]>
</b>
<script>
try{
var data = $.parseJSON($('#esi-data').html());
}catch(e){ alert('manage the error here');}
....
Of course, this solution works only if:
you're using XHTML5 and
the error contains no CDATA section (because CDATA section nesting is impossible).
I don't know if switching from one serialization to the other is an option, but I wanted to clarify the intent of my question. It will hopefully help you out :).
Can't you simply change your API to return JSON { "error":"error_code_or_text" } on error? You can even do something meaningful in your interface to alert user about error if you do it that way.
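A minimal sketch of how the client side might consume such a response, assuming the API really does return an object with an "error" key on failure. readApiPayload and the "invalid_json" code are made-up names for illustration. The function also covers the case where the body is not JSON at all, such as an HTML error page.

```javascript
// Hypothetical helper: turns the raw response text into either
// { ok: true, data } or { ok: false, error } without ever throwing.
function readApiPayload(rawText) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch (e) {
    // Not JSON at all, e.g. an HTML error page from the API.
    return { ok: false, error: "invalid_json" };
  }
  // Assumed error shape: { "error": "error_code_or_text" }.
  if (parsed !== null && typeof parsed === "object" && "error" in parsed) {
    return { ok: false, error: String(parsed.error) };
  }
  return { ok: true, data: parsed };
}
```

The interface can then branch on the ok flag and show something meaningful to the user instead of hitting a parse error.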
<script>var data = 999;</script>
<script>
data = <esi:include src='apiurl/data'>;
</script>
<script>
if(data == 999) alert("there was an error");
</script>
If there is an error and "data" is not JSON, then a javascript error will be thrown. The next script block will pick that up.
When someone posts a link to another page on my website, I'd like to shorten the a href text from something like: http://mywebsite.com/posts/8 to /posts/8 or http://mywebsite.com/tags/8 to /tags/8. Since I'm learning JavaScript, I don't want to depend on a library like Prototype or jQuery. Is it recommended to use JavaScript's replace method?
I found w3schools' page here but my code was replacing all instances of the string, not just the href text.
Here's what I have so far:
<script type="text/javascript" charset="utf-8">
var str="http://www.mywebsite.com";
document.write(str.replace("http://www.", ""));
</script>
str = str.replace(/^http:\/\/www\.mywebsite\.com/, "");
someElement.appendChild(document.createTextNode(str));
Note that you're introducing a Cross-Site Scripting vulnerability by directly calling document.write with user input (you could also say you're not treating the URL http://<script>alert('XSS');</script> correctly).
Instead of using document.write, replace someElement in the above code with an element in your page that should contain the user content. Notice that this code cannot be at the JavaScript top level, but should instead be called when the load event fires.
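For comparison, here is a library-free sketch that uses the built-in URL class instead of a regular expression. toRelativePath and its siteHost parameter are hypothetical names, and treating www.mywebsite.com and mywebsite.com as the same site is an assumption. Links to other sites, and strings that do not parse as URLs, are returned unchanged.

```javascript
// Hypothetical helper: shortens links that point at our own site
// to their path, and leaves everything else untouched.
function toRelativePath(href, siteHost) {
  let url;
  try {
    url = new URL(href);
  } catch (e) {
    // Not an absolute URL; return the input as-is.
    return href;
  }
  // Assumption: "www." is an alias of the bare host name.
  const host = url.hostname.replace(/^www\./, "");
  if (host !== siteHost) {
    return href;
  }
  return url.pathname + url.search + url.hash;
}
```

The result is still user-supplied text, so it should be added to the page with document.createTextNode as above, not with document.write.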
Prototype's Template class allows you to easily substitute values into a string template. Instead of declaring the Template source-string in my code, I want to extract the source-string from the DOM.
For example, in my markup I have an element:
<div id="template1">
<img src="#{src}" title="#{title}" />
</div>
I want to create the template with the inner contents of the div element, so I've tried something like this:
var template = new Template($('template1').innerHTML);
The issue is that Internet Explorer's representation of the innerHTML omits the quotes around the attribute value when the value has no spaces. I've also attempted to use Element#inspect, but in Internet Explorer I get back a non-recursive representation of the element / sub-tree.
Is there another way to get a Template-friendly representation of the sub-tree's contents?
Looks like you can embed the template source inside a textarea tag instead of a div and retrieve it using Element#value.
Certainly makes the markup a little weird, but it still seems reasonably-friendly to designers.
Additionally, as Jason pointed out in a comment to the original question, including the img tag in the textarea prevents a spurious request for an invalid image.
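To make the substitution step concrete without depending on Prototype, here is a rough stand-in for what a #{name} template does once you have the source string. fillTemplate is a made-up function, not Prototype's Template class, and it only handles simple word-like keys. Values are inserted as-is, so anything untrusted should be escaped first.

```javascript
// Hypothetical stand-in for a #{key} template: replaces each
// placeholder with the matching property of `values`.
function fillTemplate(source, values) {
  return source.replace(/#\{(\w+)\}/g, (match, key) =>
    // Leave the placeholder untouched when no value was supplied.
    Object.prototype.hasOwnProperty.call(values, key)
      ? String(values[key])
      : match
  );
}
```

With the textarea approach, the source argument would be the textarea's value.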
Resig to the rescue:
You can also inline script:
<script type="text/html" id="user_tmpl">
<% for ( var i = 0; i < users.length; i++ ) { %>
<li><%=users[i].name%></li>
<% } %>
</script>
Quick tip: Embedding scripts in your
page that have a unknown content-type
(such is the case here - the browser
doesn't know how to execute a
text/html script) are simply ignored
by the browser - and by search engines
and screenreaders. It's a perfect
cloaking device for sneaking templates
into your page. I like to use this
technique for quick-and-dirty cases
where I just need a little template or
two on the page and want something
light and fast.
and you would use it from script like
so:
var results = document.getElementById("results");
results.innerHTML = tmpl("item_tmpl", dataObject);