Accurately unescape HTML entities in javascript - javascript

In javascript, I need to take a string and HTML un-escape it.
This question over here asks the same question, and the most popular answer involves populating a temporary div.
I've used this as well, but I think I've found a bug.
Simple example, correct behavior
If you have this string: Cats>Dogs
Unescaped, it should be: Cats>Dogs
Malformed example, wrong behavior
If you remove the semicolon and use this instead:Cats&gtDogs
You will get this as a result: Cats>Dogs
Isn't that wrong?
This struck me as odd. From what I understand, an escaped string requires the presence of a terminating semicolon, otherwise it's not escaped. After all, what if I had a store called guitars&amps? For all we know, this company exists but gets no business because it causes null reference exceptions everywhere it has records.
Any ideas on how I could perform escaping while knowingly avoiding escaping when the semicolon is missing? Currently, all I can think to do is perform the unescaping myself.
(The WYSIWYG preview in StackOverflow exhibits a similar unusual behavior, by the way. Try entering &ampgt;, this renders as >!)

Isn't that wrong?
Successful HTML parsers are tolerant. This is one of the things distinguishing them from, say, XML parsers. They don't necessarily stick to strict rules about markup, for the simple reason that there's a lot of incorrect markup out there. So they try to figure out what the markup is meant to represent. &gtDogs is more likely to mean >Dogs than &gtDogs, so that's what the parser goes with.

Related

encodeForHTMLAttribute vs encodeForJavaScript

I'm trying to identify the difference between encodeForHTMLAttribute and encodeForJavaScript. Still, I couldn't find a scenario where untrusted data is used as javascript data values, which broke the code when escaped with encodeForHTMLAttribute, but works securely after escaped using encodeForJavaScript.
I know that for all javascript, its recommended to use encodeForJavaScript. But I like to see the difference.
The answer to your question is boring: The fact is that passing an HTML Entity encoded string to JavaScript is largely an unspecified operation.* JavaScript expects data passed to it to be escaped for JavaScript, with the exception of some API methods, it has no idea what you're sending it.
HTML and JavaScript are different languages. If you'll note, they both have different reserved characters--some the same, meaning they each have reserved characters that make up the language that have to be treated specially when using them in that language.
The correct way to ensure that javascript will always treat incoming code as data, is to escape for JavaScript. Passing it an HTMLEntity encoded String, might work, but we have to say might because that behavior is unspecified. One reason that the question I linked at the beginning partially answers your question is that it is common for JavaScript frameworks to DO that kind of processing on your input... so you better be sure that if that happens, the data is appropriately escaped for JavaScript. Otherwise it will unwrap your HTMLEntity encoding, render and execute script code.
I saved the damning part for last: You can write legal JavaScript with only six characters. And none of these characters are commonly escaped by HTML encoders. Here's an entire guide on using JavaScript escaping like this to evade XSS filters. And for proof that it works, look here.
Or run it yourself in an HTML file on your computer:
<script>[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]((![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]+(![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]]+[+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]((!![]+[])[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+([][[]]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+!+[]]+(+[![]]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+!+[]]]+([][[]]+[])[+[]]+([][[]]+[])[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(+(!+[]+!+[]+[+!+[]]+[+!+[]]))[(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(+![]+([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([![]]+[][[]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(+![]+[![]]+([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]]](!+[]+!+[]+!+[]+[+!+[]])[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]])()([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]]((!![]+[])[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+([][[]]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+!+[]]+(+[![]]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+!+[]]]+(!![]+[])[!+[]+!+[]+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(+(!+[]+!+[]+[+!+[]]+[+!+[]]))[(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(+![]+([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([![]]+[][[]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(+![]+[![]]+([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]]](!+[]+!+[]+!+[]+[+!+[]])[+!+[]]+(!![]+[])[!+[]+!+[]+!+[]])()(([]+[])[([![]]+[][[]])[+!+[]+[+[]]]+(!![]+[])[+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+([![]]+[][[]])[+!+[]+[+[]]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[!+[]+!+[]+!+[]]]()[+[]])[+[]]+[!+[]+!+[]+!+[]]+(+(+!+[]+[+!+[]]))[(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(+![]+([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([![]]+[][[]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(+![]+[![]]+([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+([][[]]+[])[+!+[]]+(![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[+!+[]]+([][[]]+[])[+[]]+([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]])[!+[]+!+[]+[+[]]]](!+[]+!+[]+[+[]])))()</script>
However, if you escape what's between the <script> tags for Javascript... it will be treated purely as data and will not be executed.
*Except for those JavaScript functions that are designed to take HTML-encoded strings as input.

Should I worry that using GET in a form element doesn't automatically URL-encode angle brackets?

So I decided to use GET in my form element, point it to my cshtml page, and found (as expected) that it automatically URL encodes any passed form values.
I then, however, decided to test if it encodes angle brackets and surprisingly found that it did not when the WebMatrix validator threw a server error warning me about a potentially dangerous value being passed.
I said to myself, "Okay, then I guess I'll use Request.Unvalidated["searchText"] instead of Request.QueryString["searchText"]. Then, as any smart developer who uses Request.Unvalidated does, I tried to make sure that I was being extra careful, but I honestly don't know much about inserting JavaScript into URLs so I am not sure if I should worry about this or not. I have noticed that it encodes apostrophes, quotations, parenthesis, and many other JavaScript special characters (actually, I'm not even sure if an angle bracket even has special meaning in JavaScript OR URLs, but it probably does in one, if not both. I know it helps denote a List in C#, but in any event you can write script tags with it if you could find a way to get it on the HTML page, so I guess that's why WebMatrix's validator screams at me when it sees them).
Should I find another way to submit this form, whereas I can intercept and encode the user data myself, or is it okay to use Request.Unvalidated in this instance without any sense of worry?
Please note, as you have probably already noticed, my question comes from a WebMatrix C#.net environment.
Bonus question (if you feel like saving me some time and you already know the answer off the top of your head): If I use Request.Unvalidated will I have to URL-decode the value, or does it do that automatically like Request.QueryString does?
---------------------------UPDATE----------------------------
Since I know I want neither a YSOD nor a custom error page to appear simply because a user included angle brackets in their "searchText", I know I have to use Request.Unvalidated either way, and I know I can encode whatever I want once the value reaches the cshtml page.
So I guess the question really becomes: Should I worry about possible XSS attacks (or any other threat for that matter) inside the URL based on angle brackets alone?
Also, in case this is relevant:
Actually, the value I am using (i.e. "searchText") goes straight to a cshtml page where the value is ran through a (rather complex) SQL query that queries many tables in a database (using both JOINS and UNIONS, as well as Aliases and function-based calculations) to determine the number of matches found against "searchText" in each applicable field. Then I remember the page locations of all of these matches, determine a search results order based on relevance (determined by type and number of matches found) and finally use C# to write the search results (as links, of course) to a page.
And I guess it is important to note that the database values could easily contain angle brackets. I know it's safe so far (thanks to HTML encoding), but I suppose it may not be necessary to actually "search" against them. I am confused as to how to proceed to maximum security and functional expecations, but if I choose one way or the other, I may not know I chose the wrong decision until it is much too late...
URL and special caracters
The url http://test.com/?param="><script>alert('xss')</script> is "benign" until it is read and ..
print in a template : Hello #param. (Potential reflected/persisted XSS)
or use in Javascript : divContent.innerHTML = '<a href="' + window.location.href + ... (Potential DOM XSS)
Otherwise, the browser doesn't evaluate the query string as html/script.
Request.Unvalidated/Request.QueryString
You should use Request.Unvalidated["searchText"] if you are expecting to receive special caracters.
For example : <b>User content</b><p>Some text...</p>
If your application is working as expected with QueryString["searchText"], you should keep it since it validate for potential XSS.
Ref: http://msdn.microsoft.com/en-us/library/system.web.httprequest.unvalidated.aspx

Preventing DOM XSS

We recently on-boarded someone else's code which has since been tested, and failed, for DOM XSS attacks.
Basically the url fragments are being passed directly into jQuery selectors and enabling JavaScript to be injected, like so:
"http://website.com/#%3Cimg%20src=x%20onerror=alert%28/XSSed/%29%3E)"
$(".selector [thing="+window.location.hash.substr(1)+"]");
The problem is that this is occurring throughout their scripts and would need a lot of regression testing to fix e.g. if we escape the data if statements won't return true any more as the data won't match.
The JavaScript file in question is concatenated at build time from many smaller files so this becomes even more difficult to fix.
Is there a way to prevent these DOM XSS attacks with some global code without having to go through and debug each instance.
I proposed that we add a little regular expression at the top of the script to detect common chars used in XSS attacks and to simply kill the script if it returns true.
var xss = window.location.href.match(/(javascript|src|onerror|%|<|>)/g);
if(xss != null) return;
This appears to work but I'm not 100% happy with the solution. Does anyone have a better solution or any useful insight they can offer?
If you stick to the regular expression solution, which is far from ideal but may be the best choice given your constraints:
Rather than defining a regular expression matching malicious hashes (/(javascript|src|onerror|%|<|>)/g), I would define a regular expression matching sound hashes (e.g. /^[\w_-]*$/).
It will avoid false-positive errors (e.g. src_records), make it clear what is authorized and what isn't, and block more complex injection mechanisms.
Your issue is caused by that jQuery's input string may be treated as HTML, not only as selector.
Use native document.querySelector() instead of jQuery.
If support for IE7- is important for you, you can try Sizzle selector engine which likely, unlike jQuery and similar to native querySelector(), does not interpret input string as something different from a selector.

Javascript odd StackOverflow error

I was wondering about the working of parentheses in Javascript, so I wrote this code to test:
((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((
4+4
))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
Which consists in:
( x1174
4+4
) x1174
I tested the code above on Google Chrome 20 (Win64), and it gives me the right answer (8).
But if I try the same code, but with 1175 parentheses (on both sides), I get a stackoverflow error.
You can check this code in JSFiddle (Note: in JSFiddle it stops working with 1178 parentheses)
So, my questions are:
Why does it happen?
Why does it stop working with 1178 parentheses on JSFiddle but with only 1175 on my blank page?
Does this error depend of the page/browser/os?
Often languages are parsed by code designed along a pattern called recursive descent. I don't know for sure that that's the case here, but certainly the "stack overflow" error is a big piece of evidence.
The idea is that to parse an expression, you approach the syntax by looking at what an expression can be. A parenthesized expression is like an "expression within an expression". Thus, for a parser to systematically parse some expression in code it's seeing for the first time (which, for a parser, is its eternal fate), a left parenthesis means "ok - hold on to what you're doing (on the stack), and go start from the beginning of what an expression can possibly be and parse a fresh, complete expression, and come back when you see the matching close paren".
Thus, a string of a thousand or more parenthesis triggers an equivalent cascade of that same activity: put what we've got on a shelf; dive in and get a sub-expression, and then resume when we know what it looks like.
Now this is not the only way to parse something, it should be noted. There are many ways. I personally am a huge fan of recursive descent parsing, but there's nothing particularly special about it (except that I think it'll someday result in me seeing a real unicorn).
The behavior is different in different browsers because they have different implementations of Javascript. The language doesn't specify how something like this should fail, so each implementation fails in a different way.
The difference between JSFiddle and your blank page is because JSFiddle itself uses a few stack frames to establish the environment in which to run your code.

How Do I Sanitize JS eval Input?

a="79 * 2245 + (79 * 2 - 7)";
b="";
c=["1","2","3","4","5","6","7","8","9","0","+","-","/","*"];
for (i=1;i<a.length;i++){
for (ii=1;i<c.length;i++){
b=(a.substring(0,i))+(c[ii])+(a.substring(i+1,a.length));
alert(eval(b.replace(" ","")));
}
}
I need to find out how to make it so that when I use eval, I know that the input will not stop the script, and if it would normally crash the script to just ignore it. I understand that eval is not a good function to use, but I want a quick and simple method by which I can solve this. The above code tries to output all of the answers with all of the possible replacements for any digit, sign or space in the above. i represents the distance through which it has gone in the string and ii represents the symbol that it is currently checking. a is the original problem and b is the modified problem.
Try catching the exception eval might throw, like this:
try{
alert(eval(b.replace(" ","")));
} catch (e){
//alert(e);
}
You can check for a few special cases and avoid some behaviors with regex or the like, but there is definitely no way to 'if it would normally crash just ignore it'
That is akin to the halting problem, as mellamokb refers to. And theres no way to know ipositively f a script runs to completion besides running it.
One should be very careful to vet any strings that go to eval, and keep user input out of them as much as possibl except for real simple and verifiable things like an integer value. If you can find a way around eval altogether than all the better.
For the calculation example you show its probably best to parse it properly into tokens and go from there rather than evaluate in string form.
PS - if you really want to check out these one-off's to the expression in a, it is a somewhat interesting use of eval eespite its faults. cam you explain why you are trimming the whitespace imediately before evaluation? i dont believe i can think of a situation where it effects the results. for (at least most) valid expressions it makes no difference, and while it might alter some of the invalid cases i cant think of a case where it does so meaningfully

Categories