What is the correct way to escape JavaScript in C# / ASP.NET?
For example:
<script>
var abc = '<%= def%>';
</script>
<div onclick="myfunction('<%= Xyz%>')" />
There are surely existing questions about this, but they list different options. The ones I have found are:
System.Web.HttpUtility.JavaScriptStringEncode
System.Web.Util.HttpEncoder.JavaScriptStringEncode
Microsoft.Security.Application.Encoder.JavaScriptEncode
Microsoft.JScript.GlobalObject.escape
System.Text.Encodings.Web.JavaScriptEncoder (Core)
any other?
Results from these methods are not always the same and their documentation does not seem to clearly describe the use case.
For the latter example we should probably apply both HTML and JS encoding: I was able to exploit System.Web.HttpUtility.JavaScriptStringEncode when it was used without HTML encoding. Microsoft.Security.Application.Encoder.JavaScriptEncode, however, is so thorough that, while I would still add HTML encoding to be proper, I can't see how it could be exploited.
JSFiddle: https://jsfiddle.net/1afn5dky/
Does each method have a preferred use case?
The best answer for your specific use case can be derived from OWASP Cheatsheet Rule #3
In general, this boils down to 1) the correct encoding actions and their order and 2) implementation.
The encoding actions and their order are fairly simple here: you need to do JavaScript encoding. Beware the onclick use case, though. If the function you mention is designed to execute code, you will probably need an additional layer of sanitization. More information in this SO thread
As for the implementation, the OWASP cheatsheet mentions the now-obsolete AntiXSS library, which has been superseded by the AntiXssEncoder class now present in .NET Framework and .NET Core. The HttpUtility class can use it behind the scenes if you configure it in web.config, like so:
<httpRuntime ...
encoderType="System.Web.Security.AntiXss.AntiXssEncoder,System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
Without this setting, the HttpUtility class uses HttpEncoder behind the scenes. (In your specific use case, though, the JavaScriptStringEncode method is actually the same for both classes.)
In your first scenario:
<script>
var abc = '<%= def%>';
</script>
If def evaluates to data, rule 3.1 specifically recommends putting it into a separate element, serializing it as a JSON string, and then using a JSON parser to read the data:
<div id="init_data" style="display: none">
<%= html_encode(def.to_json) %>
</div>
// external js file
var dataElement = document.getElementById('init_data');
// decode and parse the content of the div
var abc = JSON.parse(dataElement.textContent);
If def instead evaluates to a JavaScript function call, this approach is not possible. In that scenario I think you need to change your design, as code generation is usually a bad idea; you could do the full code generation either on the backend or on the frontend (using an appropriate frontend framework). If for some strange reason you cannot do either, then you need to encode for the nested context (JS in HTML), and on top of that you also need to sanitize the resulting JavaScript calls.
The second example, <div onclick="myfunction('<%= Xyz%>')" />, should be handled according to rule 3, so HttpUtility.JavaScriptStringEncode should suffice. However, if your function executes the parameter, you should also sanitize it.
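To make the "nested context" ordering concrete, here is a minimal sketch in JavaScript. It only illustrates the order of operations (inner JS string first, outer HTML attribute second). jsStringEncode, htmlAttrEncode and buildOnclick are hypothetical helpers written for this illustration, not the .NET methods discussed above, and real code should rely on a vetted encoder.

```javascript
// Hypothetical helper: escape everything except ASCII letters, digits and
// spaces as \uXXXX so the value cannot terminate a JS string literal.
function jsStringEncode(value) {
  return String(value).replace(/[^A-Za-z0-9 ]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
  });
}

// Hypothetical helper: escape characters that are significant inside an
// HTML attribute value.
function htmlAttrEncode(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Inner context first (JS string), outer context second (HTML attribute).
function buildOnclick(untrusted) {
  var jsSafe = jsStringEncode(untrusted);
  return '<div onclick="myfunction(\'' + htmlAttrEncode(jsSafe) + '\')"></div>';
}

// An input that tries to break out of both the string and the attribute.
console.log(buildOnclick('");alert(1);//'));
```

With an encoder this aggressive the HTML step has nothing left to escape, but keeping both steps means the output stays safe if the JS encoder is later swapped for one that emits quotes or ampersands.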
I'm building a forum-like web app. Users are allowed to submit rich HTML text to the server, such as p tags, div tags, etc. To keep the formatting, the server writes these tags back to the users' browsers directly (without HTML encoding). So I must check for potentially dangerous scripts to avoid XSS. Any JavaScript code is considered dangerous and is not allowed. How can I detect it, or is there a better solution?
dangerous example 1:
<script>alert('1')</script>
dangerous example 2:
<script src="..."></script>
dangerous example 3:
click me
Use an HTML Parser
Your requirements are straightforward:
You must disallow all <script> tags, but keep certain rich HTML tags.
You must be able to escape inline JavaScript in links, i.e. stringify it or strip the unsafe attributes altogether.
The correct way to handle all of these is to employ a modern standards-compliant HTML parser that is able to syntactically analyse the structure of the rich HTML sent over, identifying the tags sent over and discovering the raw values in attributes. This is, in fact, how sanitisation, as one of the comments mentions, is done.
There are a number of pre-existing HTML parsers that are designed to target XSS-unsafe input. The npm library js-xss, for example, appears to be able to do exactly what you want:
Whitelisting only specific tags
Modify unsafe attributes to return a default value
You can even run this server-side as a command line utility.
Similar libraries already exist for most languages, and you should do a thorough search of your preferred language's package repository. Alternatively, you can launch a subprocess and collect your results directly from js-xss from the command line.
Avoid using regular expressions to parse HTML naively - while it is true most HTML parsers end up using regular expressions under the hood, they do so in a fairly limited fashion for strictly well-defined grammars after correctly lexing them.
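As a rough illustration of why the naive regex route is fragile, here is a small sketch. naiveStripScripts is a hypothetical filter written only for this example; it is not taken from js-xss or any other library.

```javascript
// Hypothetical naive filter: strip anything that looks like a <script> element.
function naiveStripScripts(html) {
  return html.replace(/<script[\s\S]*?<\/script>/gi, '');
}

// The obvious case is handled.
console.log(naiveStripScripts('<p>hi</p><script>alert(1)</script>'));

// Removing the inner pair splices a brand new <script> element together.
console.log(naiveStripScripts('<scr<script></script>ipt>alert(1)</script>'));

// Inline event handlers contain no <script> tag at all, so they pass through.
console.log(naiveStripScripts('<img src="x" onerror="alert(1)">'));
```

A parser-based sanitiser avoids both problems because it works on the parsed tags and attributes rather than on the raw text.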
Use this regex
<script([^'"]|"(\\.|[^"\\])*"|'(\\.|[^'\\])*')*?<\/script>
for detecting all types of <script> tag
but I suggest using an iframe in sandbox mode to show ALL the HTML code; by doing that you prevent JavaScript code from being able to do anything bad.
http://www.w3schools.com/tags/att_iframe_sandbox.asp
I hope this helps!
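A minimal sketch of the sandboxed iframe idea, assuming the user-submitted markup is delivered through the iframe's srcdoc attribute (that delivery choice, and the helper names escapeAttr and sandboxedFrame, are assumptions made for this example):

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wrap untrusted markup in an iframe whose bare `sandbox` attribute applies
// every restriction, which includes disabling script execution.
function sandboxedFrame(userHtml) {
  return '<iframe sandbox srcdoc="' + escapeAttr(userHtml) + '"></iframe>';
}

console.log(sandboxedFrame('<p>Hello</p><script>alert("1")</script>'));
```

The sandbox attribute should be left without allow-scripts here; adding that token back would let the embedded scripts run again.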
I stumbled upon https://codereview.stackexchange.com/questions/10610/refactoring-javascript-into-pure-functions-to-make-code-more-readable-and-mainta and I don't understand the answer since the user uses an @ symbol in a way I've never seen before. What does it do when attached to the if keyword? Can you attach it to other keywords?
It's not a JavaScript thing. It's a symbol used by whatever templating system that answer was referring to. Note that the <script> element has type text/html, which will prevent browsers from paying any attention to its contents.
Some other JavaScript code will find that script and fetch its innerHTML in order to merge the template with some data to create ... well whatever that template makes.
@: syntax in Razor
Said by @StriplingWarrior at Using Razor within JavaScript
It is razor code, not javascript, if you are interested in razor check:
http://weblogs.asp.net/scottgu/archive/2010/07/02/introducing-razor.aspx
Well, this was a very strange night for me. I am sorry to create a new question after already creating two others, but this is a different matter altogether. If I get an answer here, I'll get an answer to those questions too, so please bear with me and try to understand.
It all began with a simple JS script to be generated through an aspx code-behind file.
On a control, I had to put a JavaScript in this way:
this.MyTxtBox.Attributes["onfocus"] = "windows.alert('Hello World!');";
OK. You might think, where's the problem? The problem is that ASP.NET 4.0 encodes everything, and I mean everything, in order to prevent XSS on a site. This might not seem like a problem, but if you look at the rendered page you'll jump out of your chair like I did:
<textarea id="..." onfocus="windows.alert(&#39;Hello World!&#39;);"></textarea>
As you can see, the final HTML is a bit odd... The JavaScript engine should not accept this.
So I started these questions:
ASP.NET quote character encoding causes problems when setting a control's property
Asp.Net encoding configuration
Well, I still haven't got any answer, and we could not work out what needs to be changed in the .NET configuration to stop this from happening.
But now I consider one thing, one important thing: JavaScript engine works!
Even with that odd code that should not be interpreted...
I hope everything was clear until now... The question now comes:
Is this a normal situation for the JavaScript engine?
Will every browser correctly interpret JavaScript whose quotes have been replaced with their encoded strings?
If this is true, I have to suppose that .NET does not provide a mechanism to avoid the encoding for just this reason!
Re:
<textarea id="..." onfocus="windows.alert(&#39;Hello World!&#39;);"></textarea>
There's nothing odd about that (other than your using windows.alert instead of window.alert). It should work fine (and does; example). The HTML parser parses HTML attribute values and handles processing entities like &#39;. The JavaScript source code it eventually hands to the JavaScript interpreter will have quotes in it. The browser doesn't hand the literal characters & # 3 9 ; to the JavaScript interpreter.
It's just the same as:
<input type='text' value="This is a &quot;funny&quot; value too">
The HTML parser processes the entities, and the actual value assigned to the input is This is a "funny" value too.
Incidentally, this is also why this seemingly-innocent HTML is actually wrong and will fail validation (although most browsers will allow it):
<a href='http://www.google.com/search?q=foo&hl=en'>Search for foo</a>
More correctly, that should be:
<a href='http://www.google.com/search?q=foo&amp;hl=en'>Search for foo</a>
<!-- ^^^^^--- difference here -->
...because the HTML parser parses the value, then assigns the parsed result to the href attribute. And of course, an & introduces a character entity, so to literally get an & you must use &amp; everywhere in HTML. (Again, most browsers will let you get away with it if what follows the & doesn't look like an entity. But that can and will bite you.)
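To see the decoding step in isolation, here is a simplified stand-in for what the HTML parser does to an attribute value before anything else gets to use it. decodeEntities is a hypothetical helper that handles only numeric references and four named entities; a real parser follows the full HTML specification.

```javascript
// Simplified stand-in for the HTML parser's attribute decoding step.
// Only numeric references and four named entities are handled here.
function decodeEntities(text) {
  var named = { amp: '&', quot: '"', lt: '<', gt: '>' };
  return text.replace(/&(#(\d+)|#x([0-9a-fA-F]+)|(\w+));/g,
    function (match, body, dec, hex, name) {
      if (dec) return String.fromCodePoint(parseInt(dec, 10));
      if (hex) return String.fromCodePoint(parseInt(hex, 16));
      // Unknown named entities are left untouched.
      return Object.prototype.hasOwnProperty.call(named, name) ? named[name] : match;
    });
}

// What the JavaScript interpreter receives for the onfocus attribute.
console.log(decodeEntities('windows.alert(&#39;Hello World!&#39;);'));

// What ends up in the href attribute.
console.log(decodeEntities('http://www.google.com/search?q=foo&amp;hl=en'));
```

The point is only that decoding happens once, in the HTML layer, so the script source and the URL never contain the entity text.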
Has anyone implemented John Resig's micro-templating in production code? In particular I'd like to know if embedding the templates with <script type="text/html"> has caused any problems.
A bit more information: here's the sort of thing I'm worried about:
My users are mostly in corporate environments - could an unusual proxy server mangle or strip out the templates
The embedded templates seem to fail W3C validation. What if, for instance, the next version of IE decides it doesn't like them?
I abandoned inlining templates through script tags as my view layer would create redundant duplicate script templates. Instead I placed the templates back into the JavaScript as strings:
var tpl = ''.concat(
'<div class="person">',
'<span class="name">${name}</span>',
'<span class="lastname">${lastName}</span>',
'</div>'
);
I used the string concat trick so I could make the template string readable. There are variations on the trick, like an array join or simple additive concatenation. In any case, inline script templates work and work well, but in a php/jsp/asp view layer, chances are you will create redundant duplicate script templates unless you do even more work to avoid it.
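For what it's worth, merging such a string template with data can be sketched roughly like this. The render and escapeHtml functions are hypothetical helpers for illustration only; they are not Resig's micro-templating or any particular library.

```javascript
// Escape a value before it is placed into HTML text or attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Replace each ${key} placeholder with the escaped value from `data`.
// Placeholders without a matching key become empty strings.
function render(template, data) {
  return template.replace(/\$\{(\w+)\}/g, function (match, key) {
    return Object.prototype.hasOwnProperty.call(data, key) ? escapeHtml(data[key]) : '';
  });
}

var tpl = ''.concat(
  '<div class="person">',
  '<span class="name">${name}</span>',
  '<span class="lastname">${lastName}</span>',
  '</div>'
);

console.log(render(tpl, { name: 'Ada', lastName: '<Lovelace>' }));
```

Because the template lives in a plain string, it is only defined once per script file, however many times the view layer renders the page fragment.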
Furthermore, these templates become rather complex the more logic you have to add to them, so I looked further and found mustache.js which imo. is far superior and keeps the logic (conditions and dynamic variable definitions) in the JavaScript scope.
Another option would be to retrieve template strings through Ajax, in which case you can put each template inside its own file and simply give it a .tpl extension. The only thing you have to worry about is the HTTP request round trip, which should not take too long for small .tpl files and is, imo, insignificant enough.
While I haven't used @jeresig's micro-templating, I did roll my own which uses <script type="text/html">. I've used it in a couple of (albeit basic) production sites without problems. The HTML5 spec itself refers to something similar (near the end of the section I've linked to). AFAIK the block only executes when the MIME type is a recognized script type, otherwise the block is just part of the document.
I have actually, and it works great. Though only for webkit powered browsers so can't vouch for others. But what problems are you expecting? The approach is simple and I can't think of how it might break.
If you're using jQuery I can recommend jQuery templates instead:
http://github.com/nje/jquery-tmpl
I think it's supposed to be introduced officially in jQuery 1.5
BTW, I think both Resig's script and jQuery templates rely on innerHTML, so as long as that works in your browser you should be OK.
I'm developing a facebook app right now all by my lonesome. I'm attempting to make a javascript call on an onclick event. In this onclick event, I'm populating some arguments (from the server side in php) based on that item that is being linked. I'm inserting a little bit of JSON and some other stuff with funky characters.
Facebook expects all the attribute fields of an anchor to be strictly alphanumeric. No quotes, exclamation marks, anything other than 0-9a-Z_. So it barfs on the arguments I want to pass to my javascript function (such as JSON) when the user clicks that link.
So I thought, why don't I use my templating system to just autogenerate the JavaScript? For each link I want to generate, I generate a unique JavaScript function (DoX, where X is a unique integer for this page). Then instead of trying to pass arguments to my JavaScript function via onclick, I insert my arguments as local variables for DoX. On link "X" I just say onclick="DoX()".
So I did this and voilà, it works! (It also helps me avoid the quote-escaping hell I was in earlier.) But I feel icky.
My question is, am I nuts? Is there an easier way to do this? I understand the implications if somebody were somehow able to change my templated local variable, i.e.:
var local = {TEMPLATED FIELD};
into something with a semicolon, inserting arbitrary javascript to the client. (and I'm trying to write code to be paranoid of this).
When is it ok (is it ever ok) to generate javascript from the server? Anything I should look out for/best practices?
Depending on your application, generating JavaScript in your templating language can save a lot of time, but there are pitfalls to watch out for. The most serious one is that it gets really hard to test your JavaScript when you don't have your full templating stack available.
One other major pitfall is that it becomes tempting to try to 'abstract' JavaScript logic into some higher-level classes. Usually this is a sign that you will be shaving yaks in your project. Keep JavaScript logic in JavaScript.
Judging from the little bit of information you have given, your solution seems sensible.
If you must generate javascript, I would suggest only generating JSON and having all functions be static.
It more cleanly separates the data, and it also makes it easier to validate to prevent XSS and the like.
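A rough sketch of that split, written in JavaScript for both halves purely for illustration (the question's server side is PHP). toScriptSafeJson and doIt are hypothetical names. The extra escaping is an assumption about what is needed to stop a value from closing the surrounding script block.

```javascript
// Serialise data as JSON and escape the characters that could end the
// surrounding <script> block or confuse older JavaScript parsers.
function toScriptSafeJson(data) {
  return JSON.stringify(data)
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// Static function: written once in a library file, never generated.
function doIt(args) {
  return 'item ' + args.id + ': ' + args.label;
}

// Only the data is generated per link.
var payload = toScriptSafeJson({ id: 7, label: '</script><script>alert(1)//' });
console.log('<script>var ARGS = ' + payload + ';</script>');

// The escaped payload is still valid JSON and decodes to the original data.
console.log(doIt(JSON.parse(payload)));
```

Since the generated part is pure data, validating it comes down to checking that it parses as JSON, which is much easier than reviewing generated function bodies.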
JS generated from the server is used in lots of areas. The following is a sample from an ASP.NET page where the JS script is generated by the framework:
<script src="/WebResource.axd?d=9h5pvXGekfRWNS1g8hPVOQ2&t=633794516691875000" type="text/javascript"></script>
Try to have reusable script functions that don't require regeneration; and 'squeeze' out the really dynamic ones for server-side generation.
If you want to feel better about it, make sure that most of your JavaScript is in separate library files that don't get generated, and then, when you generate code, generate calls to those libraries rather than generating extensive amounts of JavaScript code.
It's fine to generate JS from the server. Just bear in mind not to send too big a page from the server.
Generally speaking, I avoid ever automatically generating JavaScript from a server-side language. I do, however, create JavaScript variables that are initialized from server-side variables and that my JavaScript will use. This makes testing and debugging much simpler.
In your case I may create local variables like the following which is easy to test:
<script type='text/javascript' language='javascript'>
<!--
var FUNC_ARG_X = <%= keyX %>;
var FUNC_ARG_Y = <%= keyY %>;
var FUNC_ARG_Z = <%= keyZ %>;
//-->
</script>
<script type='text/javascript' language='javascript'>
<!--
function DoCleanCall(arg) {
// Whatever logic here.
}
//-->
</script>
now in your markup use:
<a href='#' onclick='DoCleanCall(FUNC_ARG_X);'>Test</a>
Now of course you could have embedded the server-side variable on the <a/> tag, however it is sometimes required that you refer to these values from other parts of your JavaScript.
Notice also how the generated content is in its own <script> tag. This is deliberate: it prevents parsers from failing and reporting invalid code at every place you reference it (as ASP.NET does), though the parser will still fail on that one section.