HTML Encoded strings recognized by the javascript engine, how's it possible?

HTML Encoded strings recognized by the javascript engine, how's it possible? - javascript

Well. This night was a very strange night to me. I am sorry to create a new question after creating two other questions previously, but this is another argument at all. If I get an answer here, I'll get an answer to those questions too so please somebody listen to me and try to understand.
It all began with a simple script JS to be generated through an aspx codebehind file.
On a control, I had to put a JavaScript in this way:
this.MyTxtBox.Attributes["onfocus"] = "windows.alert('Hello World!');";
OK. You might think, where's the problem? The problem is that ASP.NET 4.0 encodes everything, and I say everything in order to avoid XSS to be performed on a site. Well this might not seem a problem but if you look at the rendered page you'll make a jump on the chair like I did:
<textarea id="..." onfocus="windows.alert('Hello World!');"></textarea>
As you can see the html, the final html is a bit odd... JavaScript engine should not accept this situation.
So I started this questions:
ASP.NET quote character encoding causes problems when setting a control's property
Asp.Net encoding configuration
Well I still haven't got any answer YES we could not understand what the hell it is necessary to modify in the .net configuration in order not to let this situation happen.
But now I consider one thing, one important thing: JavaScript engine works!
Even with that odd code that should not be interpreted...
I hope everything was clear until now... The question now comes:
Is this a normal situation for the JavaScript engine?
Does every browser will correctly interpret a JavaScript having quotes replaced with their encoded strings?
If this is true I have to suppose that the .net does not provide a mechanism to avoid encoding just for this reason!

Re:
<textarea id="..." onfocus="windows.alert('Hello World!');"></textarea>
There's nothing odd about that (other than your using windows.alert instead of window.alert). It should work fine (and does; example). The HTML parser parses HTML attribute values, and handles processing entities like '. The JavaScript source code it eventually hands to the JavaScript interpreter will have quotes in it. The browser doesn't hand the literal characters & # 3 9 ; to the JavaScript interpreter.
It's just the same as:
<input type='text' value="This is a 'funny' value too">
The HTML parser processes the entities, and the actual value assigned to the input is This is a "funny" value too.
Incidentally, this is also why this seemingly-innocent HTML is actually wrong and will fail validation (although most browsers will allow it):
<a href='http://www.google.com/search?q=foo&hl=en'>Search for foo</a>
More correctly, that should be:
<a href='http://www.google.com/search?q=foo&hl=en'>Search for foo</a>
<!-- ^^^^^--- difference here -->
...because the HTML parser parses the value, then assigns the parsed result to the href attribute. And of course, an & introduces a character entity and so to literally get an & you must use & everywhere in HTML. (Again, most browsers will let you get away with it if what follows the & doesn't look like an entity. But that can and will bite you.)

Related

Exploit an XSS when injected Javascript code is returned capitalized

I have found an XSS vulnerability in a piece of code, as I'm able to inject Javascript code in it.
I want to generate the simple alert PoC, but I'm not able to do so as the JS code returned by the server is always capitalized. For example, when I inject the following code:
Text sample <script>alert(document.cookie)</script>
The server respond with the page containing the following:
Text sample <script>ALERT(DOCUMENT.COOKIE)</script>
Which obviously does not print the cookie as JS is case sensitive.
Is there a way to transform the code injected in lowercase before it gets rendered or a similar solution?
Note: Javascript is enabled and if I modify manually the code in the browser console transforming it in lowercase, I'm getting the cookie printed.

No, you do not have control over the transformation and you cannot somehow change it back to lowercase before execution.
However, you can inject JavaScript code which is not affected by the capitalisation of the characters. See jsfuck, which doesn't need alphanumeric source characters at all, and use a similar approach (you can actually use digits and some characters).

Bookmarklet that uses current url in post request

Sorry if my question displays a lack of fundamental Javascripting knowledge as I have next to none.
I Frankensteined together the following bookmarklet during hours of obsessive Googling for a way to do what I want to do:
javascript:'<body onload="document.forms[0].submit()"><form method="post" action="https://generic.web.proxy/request.php?do=go"><input type="text" name="get" value=???></form>'
(Mostly based on this.)
Basically, it mostly does what it should: performing a post request on that web proxy's page with whatever string is used for value. The problem is what I want to use for value, namely the URL of the current Firefox tab. In other bookmarklet examples I've seen this seems to be accomplished with location.href, but if I do it like this
value=location.href
it simply assumes that it's the string "location.href". I assume that's because I'm stupidly trying to directly use a Javascript thingie in the html part of the script, but what is the alternative?

Oh boy, I think I just figured it out. Since the javascript treats the html like any other string, I can simply use normal string manipulation on it:
'<beginningofhtml'+location.href+'endofhtml>'
Applied to my bookmarklet:
javascript:'<body onload="document.forms[0].submit()"><form method="post" action="https://generic.web.proxy/request.php?do=go"><input type="text" name="get" value='+location.href+'></form>'
And it works!
(the fake example web proxy url still needs to be replaced with the correct one of corse; it works with a site called Proxfree)

Secure database entry against XSS

I'm creating an app that retrieves the text within a tweet, store it in the database and then display it on the browser.
The problem is that I'm thinking if the text has PHP tags or HTML tags it might be a security breach there.
I looked into strip_tags() but saw some bad reviews. I also saw suggestions to HTML Purifier but it was last updated years ago.
So my question is how can I be 100% secure that if the tweet text is "<script> something_bad() </script>" it won't matter?
To state the obvious the tweets are sent to the database from users so I don't want to check all individually before displaying them.

You are NEVER 100% secure, however you should take a look at this. If you use ENT_QUOTES parameter too, currently there are no ways to inject ANY XSS on your website if you're using valid charset (and your users don't use outdated browsers). However, if you want to allow people to only post SOME html tags into their "Tweet" (for example <b> for bold text), you will need to take a deep look at EACH whitelisted tag.

You've passed the first stage which is to recognise that there is a potential issue and skipped straight to trying to find a solution, without stopping to think about how you want to deal the scenario of the content. This is a critical pre-cusrsor to solving the problem.
The general rule is that you validate input and escape output
validate input
- decide whether to accept or reject it it in its entirety)
if (htmlentities($input) != $input) {
die "yuck! that tastes bad";
}
escape output
- transform the data appropriately according to where its going.
If you simply....
print "<script> something_bad() </script>";
That would be bad, but....
print JSONencode(htmlentities("<script> something_bad() </script>"));
...then you'd would have done something very strange at the front end to make the client susceptivble to a stored XSS attack.

If you're outputting to HTML (and I recommend you always do), simply HTML encode on output to the page.
As client script code is only dangerous when interpreted by the browser, it only needs to be encoded on output. After all, to the database <script> is just text. To the browser <script> tells the browser to interpret the following text as executable code, which is why you should encode it to <script>.
The OWASP XSS Prevention Cheat Sheet shows how you should do this properly depending on output context. Things get complicated when outputting to JavaScript (you may need to hex encode and HTML encode in the right order), so it is often much easier to always output to a HTML tag and then read that tag using JavaScript in the DOM rather than inserting dynamic data in scripts directly.
At the very minimum you should be encoding the < & characters and specifying the charset in metatag/HTTP header to avoid UTF7 XSS.

You need to convert the HTML characters <, > (mainly) into their HTML equivalents <, >.
This will make a < and > be displayed in the browser, but not executed - ie: if you look at the source an example may be <script>alert('xss')</script>.
Before you input your data into your database - or on output - use htmlentities().
Further reading: https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet

Output script tags without jQuery, avoiding execution

I have JS calling remote server through AJAX. The response contains something similar to this
<script>alert(document.getElementById('some_generated_id').innerHTML; ... </script>
User copies the response and uses for own purposes. Now I need to make sure that not a single browser runs the code when I do this:
var response = '<scrip.....';
document.getElementById('output_box').innerHTML = response;
Same should apply to any HTML tags. I know that .text() from jQuery will do exactly what I need:
var response = '<scrip.....';
$('#output_box').text(response);
I am looking for any solutions, including, but not limited to: escaping special characters, however displaying them correctly; adding zero-width space to tags (has to be efficient); outputting in parts. Has to be pure JS.

If you're using a server-side language there is probably a method to escape special characters.
In PHP you could use htmlspecialchars(), it will convert certain characters that have significance in HTML to HTML entities (i.e. & to &).
They will still display correctly and you'll be able to copy and paste the text, but the javascript shouldn't run.
If you need a pure javascript solution for this, someone has answered that here https://stackoverflow.com/a/4835406/15000

Is it a bad idea to auto generate javascript code from the server?

I'm developing a facebook app right now all by my lonesome. I'm attempting to make a javascript call on an onclick event. In this onclick event, I'm populating some arguments (from the server side in php) based on that item that is being linked. I'm inserting a little bit of JSON and some other stuff with funky characters.
Facebook expects all the attribute fields of an anchor to be strictly alphanumeric. No quotes, exclamation marks, anything other than 0-9a-Z_. So it barfs on the arguments I want to pass to my javascript function (such as JSON) when the user clicks that link.
So I thought, why don't I use my templating system to just autogenerate the javascript? For each link I want to generate, I generate a unique javascript function (DoItX where X is a unique integer for this page). Then instead of trying to pass arguments to my javascript function via onclick, I will insert my arguments as local variables for DoX. On link "X" I just say onclick="DoX()".
So I did this and viola it works! (it also helps me avoid the quote escaping hell I was in earlier). But I feel icky.
My question is, am I nuts? Is there an easier way to do this? I understand the implications that somehow somebody was able to change my templated local variable, ie:
var local = {TEMPLATED FIELD};
into something with a semicolon, inserting arbitrary javascript to the client. (and I'm trying to write code to be paranoid of this).
When is it ok (is it ever ok) to generate javascript from the server? Anything I should look out for/best practices?

Depending on your application generating JavaScript in your templating language can save a lot of time but there are pitfalls to watch out for. The most serious one being that it gets really hard to test your JavaScript when you don't have your full templating stack available.
One other major pitfall is that it becomes tempting to try and 'abstract' JavaScript logic to some higher level classes. Usually this is a sign that you will be shaving yaks in your project. Keep JavaScript login in JavaScript.
Judging from the little bit of information you have given it your solution seems sensible.

If you must generate javascript, I would suggest only generating JSON and having all functions be static.
It more cleanly separates the data, and it also makes it easier to validate to prevent XSS and the like.

JS generated from server is used in lots of areas. The following is the sample from a ASP.NET page where the JS script is generated by the framework:
<script src="/WebResource.axd?d=9h5pvXGekfRWNS1g8hPVOQ2&t=633794516691875000" type="text/javascript"></script>
Try to have reusable script functions that don't require regeneration; and 'squeeze' out the really dynamic ones for server-side generation.

If you want to feel better about it, make sure that most of your JavaScript is in separate library files that don't get generated, and then, when you generate code, generate calls to those libraries rather than generating extensive amounts of JavaScript code.

it's fine to generate JS from the server. just bear in mind not to fine too big a page from the server.

Generally speaking I avoid ever automatically generating JavaScript from a server-side language, though I do however; create JavaScript variables that are initialized from server-side variables that my JavaScript will use. This makes testing and debugging much simpler.
In your case I may create local variables like the following which is easy to test:
<script type='text/javascript' language='javascript'>
<!--
var FUNC_ARG_X = <%= keyX %>;
var FUNC_ARG_Y = <%= keyY %>;
var FUNC_ARG_Z = <%= keyZ %>;
//-->
</script>
<script type='text/javascript' language='javascript'>
<!--
function DoCleanCall(arg) {
// Whatever logic here.
}
//-->
</script>
now in your markup use:
<a href='#' onclick='DoCleanCall(FUNC_ARG_X);'>Test</a>
Now of course you could have embedded the server-side variable on the <a/> tag, however it is sometimes required that you refer to these values from other parts of your JavaScript.
Notice also how the generated content is in it's own <script> tag, this is deliberate as it prevents parsers from failing and telling you that you have invalid code for every reference you use it in (as does ASP.NET), it will still fail on that section only however.

We Keep Coding

JavaScript is the programming language of the Web.