How can I sandbox untrusted user-submitted JavaScript content? - javascript

I need to serve user-submitted scripts on my site (sort of like jsfiddle). I want the scripts to run in visitors' browsers in a safe manner, isolated from the page they are served on. Since the code is submitted by users, there is no guarantee it is trustworthy.
Right now I can think of three options:
Serve the user-submitted content in an iframe from a different domain, and rely on the same-origin policy. This would require setting up an additional domain which I'd like to avoid if possible. I believe this is how jsfiddle does it. The script can still do some damage, changing top.location.href for example, which is less than ideal. http://jsfiddle.net/PzkUw/
Use the sandbox attribute. I suspect this is not well supported across browsers.
Sanitize the scripts before serving them. I would rather not go there.
Are there any other solutions, or recommendations on the above?
Update
If, as I suspect, the first option is the best solution, what can a malicious script do other than change the top window location, and how can I prevent this? I can manipulate or reject certain scripts based on static code analysis, but this is hard given the number of ways objects can be accessed and the difficulty of analysing JavaScript statically in general. At the very least, it would require a full-blown parser and a number of complex rules (some, but I suspect not all, of which are present in JSLint).

Create a well-defined message interface and use a JavaScript Web Worker for the code you want to sandbox. See HTML5 Web Workers.
Web Workers do not have access to the following DOM objects.
The window object
The document object
The parent object
So they can't redirect your page or alter data on it.
You can create a template and a well defined messaging interface so that users can create web worker scripts, but your script would have the final say on what gets manipulated.
EDIT Comment by Jordan Gray plugging a JavaScript library that seems to do what I described above. https://github.com/eligrey/jsandbox
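The message-interface idea described above might look roughly like the following sketch. The command names (`setTitle`, `log`) and the helpers `createHandlers`, `dispatchWorkerMessage` and `startSandboxedWorker` are hypothetical names made up for illustration; the only browser APIs assumed are `Worker`, `Blob`, `URL.createObjectURL` and the worker's `onmessage` event.

```javascript
// Host-side whitelist: the only things a worker script is allowed to request.
// The command names are invented for this sketch.
function createHandlers(state) {
  return {
    setTitle(value) { state.title = String(value).slice(0, 80); },
    log(value) { state.log.push(String(value)); },
  };
}

// Validate a message coming from the worker and run the matching handler.
// Returns true if the command was accepted, false if it was rejected.
function dispatchWorkerMessage(message, handlers) {
  if (message === null || typeof message !== 'object') return false;
  const { command, value } = message;
  if (typeof command !== 'string') return false;
  // Only own properties count, so names like "constructor" are rejected.
  if (!Object.prototype.hasOwnProperty.call(handlers, command)) return false;
  handlers[command](value);
  return true;
}

// Browser-only wiring (defined but not called here): run the untrusted
// source in a worker and route everything it posts through the dispatcher.
function startSandboxedWorker(untrustedSource, handlers) {
  const blob = new Blob([untrustedSource], { type: 'text/javascript' });
  const worker = new Worker(URL.createObjectURL(blob));
  worker.onmessage = (event) => dispatchWorkerMessage(event.data, handlers);
  return worker;
}
```

A user script would then call something like `postMessage({ command: 'setTitle', value: 'Hello' })`, and the page decides whether to honour it. Note that a worker can still make network requests (`fetch`, `XMLHttpRequest`), so this isolates the DOM rather than the network.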

Some ideas of tools that could be helpful in your application - they attack the problem from two different directions: Caja compiles the untrusted JavaScript code to something that is safe while AdSafe defines a subset of JavaScript that is safe to use.
Caja
The Caja Compiler is a tool for making third party HTML, CSS and JavaScript safe to embed in your website. It enables rich interaction between the embedding page and the embedded applications. Caja uses an object-capability security model to allow for a wide range of flexible security policies, so that your website can effectively control what embedded third party code can do with user data.
AdSafe
ADsafe makes it safe to put guest code (such as third party scripted advertising or widgets) on a web page. ADsafe defines a subset of JavaScript that is powerful enough to allow guest code to perform valuable interactions, while at the same time preventing malicious or accidental damage or intrusion. The ADsafe subset can be verified mechanically by tools like JSLint so that no human inspection is necessary to review guest code for safety. The ADsafe subset also enforces good coding practices, increasing the likelihood that guest code will run correctly.

As mentioned, the sandbox attribute of the iframe is already supported by major browsers, but I would additionally suggest a mixed solution: start a web worker inside the sandboxed iframe. That would give a separate thread, and protect even the sandboxed iframe's DOM from the untrusted code. That is how my Jailed library works. Additionally, you may work around any restrictions by exporting any set of functions into the sandbox.
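As a rough illustration of the sandboxed-iframe half of this suggestion (not the Jailed library's actual API), the sketch below builds the markup for an iframe that carries the untrusted script in `srcdoc` and is restricted with `sandbox="allow-scripts"`. The helper names `escapeAttribute` and `buildSandboxedFrame` are hypothetical.

```javascript
// Escape text so it can sit inside a double-quoted HTML attribute value.
function escapeAttribute(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build iframe markup whose inner document runs the untrusted source.
// "allow-scripts" without "allow-same-origin" gives the frame a unique
// origin, so it cannot reach the embedding page's DOM or cookies.
function buildSandboxedFrame(untrustedSource) {
  const innerPage = '<!DOCTYPE html><script>' + untrustedSource + '</script>';
  return '<iframe sandbox="allow-scripts" srcdoc="' +
    escapeAttribute(innerPage) + '"></iframe>';
}

console.log(buildSandboxedFrame('document.title = "hi";'));
```

In a browser the same effect can be had without string building, by creating the element and setting `iframe.setAttribute('sandbox', 'allow-scripts')` and `iframe.srcdoc` before appending it to the page.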

If you want to sandbox some piece of code by removing its access to, say, the window, document and parent objects, you could achieve it by wrapping it in a closure where these are local empty variables:
(function (window, document, parent /* Whatever you want to remove */) {
    console.log(this); // Empty object
    console.log(window); // undefined
    console.log(document); // undefined
    console.log(parent); // undefined
}).call({});
Calling it with an empty object is important because otherwise this will point to the window object.
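One caveat is worth checking before relying on this: shadowing hides the names, not the global object itself. The sketch below (the helper `runShadowed` is hypothetical) runs a source string with the same three names shadowed, then shows that code built with the `Function` constructor still reaches the real global object.

```javascript
// Run source with window, document and parent shadowed by undefined
// parameters, mirroring the closure above.
function runShadowed(source) {
  const wrapped = new Function('window', 'document', 'parent', source);
  return wrapped.call({});
}

// The shadowed names really are undefined inside the wrapper...
const seenType = runShadowed('return typeof window;');

// ...but a function made by the Function constructor is created in the
// global scope, and in sloppy mode its `this` is the real global object.
const leaked = runShadowed('return Function("return this")();');

console.log(seenType);              // "undefined"
console.log(leaked === globalThis); // true
```

So the closure is useful against accidental access, but it should not be treated as a security boundary for hostile code.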

Related

Closing access to global variables javascript

I'm coding a js API that is going to be used by external customers and executed by my customers in their web browsers.
As this is potentially harmful for my web users (security holes etc.), I'd like to allow or disallow access to, at least, the document global variable and others like the XMLHttpRequest API.
How can I do this?
EDIT:
I was thinking of doing things like document = null in a wrapper around the functions the API users write, but it doesn't work. Also, with this kind of approach it is difficult to be thorough, as there are too many workarounds and too many possibilities to take them all into account.
The answer is simple: You can't.
This may not be the answer you want, but those global variables can't be modified.
Try for yourself:
window = 1;
console.log(window) // Window {top: Window, window: Window, ...
document = false;
console.log(document) // #document (as in the document object)
document = null;
console.log(document) // #document (same)
window.document = false;
console.log(window.document) // #document
However, this does seem to be possible for XMLHttpRequest:
XMLHttpRequest = null
console.log(XMLHttpRequest) // null
console.log(window.XMLHttpRequest) // null
So, you might be able to disable individual functions.
However, messing with native functionality like this is a bad idea, since it can have unintended side effects. For example, jQuery uses XMLHttpRequest for its ajax functions.
Caja
The Caja Compiler is a tool for making third party HTML, CSS and JavaScript safe to embed in your website. It enables rich interaction between the embedding page and the embedded applications. Caja uses an object-capability security model to allow for a wide range of flexible security policies, so that your website can effectively control what embedded third party code can do with user data.
ADSafe
JavaScript, the programming language of the web browser, is not a secure language. Any script in a page has intimate access to all of the information and relationships of the page. This makes use of mashups and scripted advertising unacceptably risky.
ADsafe makes it safe to put guest code (such as third party scripted advertising or widgets) on a web page. ADsafe defines a subset of JavaScript that is powerful enough to allow guest code to perform valuable interactions, while at the same time preventing malicious or accidental damage or intrusion. The ADsafe subset can be verified mechanically by tools like JSLint so that no human inspection is necessary to review guest code for safety. The ADsafe subset also enforces good coding practices, increasing the likelihood that guest code will run correctly.

Does javascript "fake privacy" pose a security risk?

Javascript doesn't let you give private data or methods to objects, like you can in C++. Oh, well actually, yes it does, via some workarounds involving closure. But coming from a Python background, I am inclined to believe that "pretend privacy" (via naming conventions and documentation) is good enough, or maybe even preferable to "enforced privacy" (enforced by Javascript itself). Sure, I can think of situations where this is not true -- e.g. people interface with my code without RTFM but I get blamed -- but I'm not in that situation.
But, something gives me pause. Javascript guru Douglas Crockford, in "Javascript: The Good Parts" and elsewhere, repeatedly refers to fake-privacy as a "security" issue. For example, "an attacker can easily access the fields directly and replace the methods with his own".
I'm confused by this. It seems to me that if I follow minimal security practices (validate, don't blindly trust, data sent from a browser to my server; don't include third-party scripts on my site without inspecting them) then there is no situation where pretend-privacy is less "secure" than enforced privacy. Is that right? If not, what's a situation where pretend-privacy versus enforced-privacy has security implications?
Not in itself. However, it does mean you cannot safely load untrusted JavaScript code into your HTML documents, as Crockford points out. If you really need to run such untrusted JavaScript code in the browser (e.g. for user-submitted widgets in social networking sites), consider iframe sandboxing.
As a Web developer, your security problem is often that major Internet advertising brokers do not support (or even prohibit) framing their ad code. Unfortunately, you have to trust Google to not deliver malicious JavaScript, whether intentionally or unintentionally (e.g. they get hacked).
Here is a short description of iframe sandboxing I had posted as an answer to another question:
Set up a completely separate domain name (e.g. "exampleusercontent.com") exclusively for user-submitted HTML, CSS, and JavaScript. Do not allow this content to be loaded through your main domain name. Then embed the user content in your pages using iframes.
If you need tighter integration than simple framing, window.postMessage() may help, allowing scripts in different frames to communicate with each other in a controlled manner.
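A controlled `window.postMessage()` channel could be sketched as below. The message shape (`{ type: 'resize', height }`), the height cap and the helper name `createFrameMessageHandler` are assumptions for illustration; `event.origin` and `event.data` are the standard fields of a `message` event.

```javascript
// Build a "message" event handler that accepts messages only from one
// origin and only in one expected shape. Returns true when a message is
// accepted and false when it is ignored.
function createFrameMessageHandler(trustedOrigin, onResize) {
  return function handleMessage(event) {
    if (event.origin !== trustedOrigin) return false;
    const data = event.data;
    if (data === null || typeof data !== 'object') return false;
    if (data.type !== 'resize' || !Number.isFinite(data.height)) return false;
    // Clamp the value so the framed content cannot take over the page.
    onResize(Math.min(Math.max(data.height, 0), 2000));
    return true;
  };
}

// Browser wiring, shown as a comment so the sketch also runs elsewhere:
// window.addEventListener('message',
//   createFrameMessageHandler('https://exampleusercontent.com', setFrameHeight));
```

The framed content would send `parent.postMessage({ type: 'resize', height: 300 }, '*')`, and anything that does not match the expected origin and shape is simply dropped.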
It seems the answer is "No, fake privacy is fine". Here are some elaborations:
In javascript as it exists today, you cannot include an unknown and untrusted third-party script on your webpage. It can wreak havoc: It can rewrite all the HTML on the page, it can prompt the user for his password and then send it to an evil server, etc. etc. Javascript coding style makes no difference to this basic fact. See PleaseStand's answer for a discussion of methods to deal with this.
An incompetent but not evil script might unintentionally mess things up through name conflicts. This is a good argument against creating lots of global variables with common names, but has nothing to do with whether to avoid fake-private variables. For example, my banana-selling website might use the fake-private variable window.BANANA_STORE_MODULE.cart.__cart_item_array. It is not completely impossible that this variable would be accidentally overwritten by a third-party script, but it's extraordinarily unlikely.
There are ideas floating around for a future modification of javascript that would provide a controlled environment where untrusted code can act in prescribed ways. I could let the untrusted third-party javascript interact with my javascript through specific exposed methods, and block the third-party script from accessing the HTML, etc. If this ever exists, it could be a scenario where private variables are essential for security. But it doesn't exist yet.
Writing clear and bug-free code is always, obviously, helpful for security. Insofar as truly-private variables and methods make it easier or harder to write clear and bug-free code, there's a security implication. Whether they are helpful or not will always be a matter of debate and taste, and whether your background is, say, C++ (where private variables are central) versus Python (where private variables are nonexistent). There are arguments in both directions, including the famous blog post Javascript Private Variables are Evil.
For my part, I will keep using fake privacy: A leading underscore (or whatever) indicates to myself and my collaborators that some property or method is not part of the publicly-supported interface of a module. My fake-privacy code is more readable (IMO), and I have more freedom in structuring it (e.g. a closure cannot span two files), and I can access those fake-private variables while I debug and experiment. I'm not going to worry that these programs are somehow more insecure than any other javascript program.
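For readers who have not seen the two styles side by side, here is a small sketch contrasting them. The cart objects and the names `createConventionCart` and `createClosureCart` are made up for illustration.

```javascript
// "Fake" privacy: the leading underscore is only a naming convention.
function createConventionCart() {
  return {
    _items: [],
    add(item) { this._items.push(item); },
    count() { return this._items.length; },
  };
}

// Enforced privacy: `items` lives in the closure and is not a property.
function createClosureCart() {
  const items = [];
  return {
    add(item) { items.push(item); },
    count() { return items.length; },
  };
}

const conventionCart = createConventionCart();
conventionCart.add('banana');
conventionCart._items.length = 0;       // nothing prevents this
console.log(conventionCart.count());    // 0

const closureCart = createClosureCart();
closureCart.add('banana');
console.log(Object.keys(closureCart));  // [ 'add', 'count' ]
console.log(closureCart.count());       // 1
```

Neither version stops a hostile script on the same page from replacing `add` or `count` outright, which is why the distinction only starts to matter for security once untrusted code is confined in some other way.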

what is better? using iframe or something like jquery to load an html file in external website

I want my customers to create their own HTML in my web application and then copy and paste my code into their website, to show the result at the position, and with the size and other options, that they want. The output HTML of my web application contains HTML tags and JavaScript code (for example, a web chart created with JavaScript).
I found two ways to do this: one using an iframe, and the other using jQuery .load().
What is better and safer? Is there any other way?
iframe is better - if you are running Javascript then that script shouldn't execute in the same context as your user's sites: you are asking for a level of trust here that the user shouldn't need to accede to, and your code is all nicely sandboxed so you don't have to worry about the parent document's styles and scripts.
As a front-end web developer and webmaster I've often taken the decision myself to sandbox third-party code in iframes. Below are some of the reasons I've done so:
Script would play with the DOM of the document. Once a third-party widget took it upon itself to introduce buggy and performance-intensive PNG fix hacks for IE across every PNG used in img tags and CSS across our site.
Many scripts overwrite the global onload event, robbing other scripts of their initialisation trigger.
Reading local session info and sending it back to their own repositories.
Loading any number of resources and performing CPU-intensive processes, interrupting and weighing down my site's core experience.
The above are all examples of short-sightedness or malice on the part of third parties, and you may see yourself as above all that, but the point is that as one of your service's users I shouldn't need to take a gamble. If I put your code in an iframe, I know it can happily do its own thing and not screw with my site or its users. I can also choose to delay load and execution to a moment of my choosing (by dynamically loading the iframe at a moment of choice).
To argue the point in terms of your convenience rather than the users':
You don't have to worry about any of the trust issues associated with XSS. You can honestly tell your users they're not exposing themselves to any unnecessary worry by running your tool.
You don't have to make the extra effort to circumvent the effects of CSS and JS on your users' sites.
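The copy-and-paste snippet handed to customers could be generated along these lines. This is only a sketch: the function name `buildEmbedSnippet`, the https-only rule and the example URL are assumptions, not something from the question.

```javascript
// Build the iframe snippet a customer pastes into their own page.
// Throws if the URL is malformed, is not https, or the size is invalid.
function buildEmbedSnippet(chartUrl, width, height) {
  const url = new URL(chartUrl); // throws on malformed input
  if (url.protocol !== 'https:') {
    throw new Error('embed URL must use https');
  }
  const w = Math.trunc(Number(width));
  const h = Math.trunc(Number(height));
  if (!(w > 0) || !(h > 0)) {
    throw new Error('width and height must be positive numbers');
  }
  // Escape the characters that matter inside a double-quoted attribute.
  const src = url.href.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
  return '<iframe src="' + src + '" width="' + w + '" height="' + h + '"></iframe>';
}

console.log(buildEmbedSnippet('https://charts.example.com/c/42?a=1&b=2', 400, 300));
```

Because the chart then runs inside its own document, none of its scripts or styles touch the customer's page, which is the property argued for above.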

In Browser Javascript Editor and Execution

I am developing an Enyo web application and would like to allow users to write their Javascript code in the browser and execute it.
I can do this by using window.eval. However, I have read about the evils of eval.
Is there anyone that could shed some light on how examples like http://learn.knockoutjs.com/, http://jsfiddle.net, etc. do in-browser execution safely and what the best practices are?
Eval is considered evil for all but one specific case, which is your case: generating programs at runtime (metaprogramming). The only alternative would be to write your own parser/interpreter (which can be done relatively easily in JavaScript, but rather for a simpler language than JavaScript itself; I did it and it was fun). Thus using the eval() function here is legitimate (to build a browser-side compiler that produces reasonably fast code, you need to use eval on the generated JavaScript anyway).
However, the problem with eval is security, because the evaluated code has the same privileges and access to its environment as the script that runs it. This has been a hot topic recently, and ECMAScript 5 was designed to partially address the issue by introducing strict mode, because strict-mode code can be statically analyzed for dangerous operations.
This is usually not enough (or is problematic for backward-compatibility reasons), so there are approaches like Caja that address security by analyzing the code on a server and allowing only a strict, safe subset of JavaScript to be used.
Another frequently used approach, which offers the user some protection but does not stop malicious attacks, is to run the user-generated JavaScript in an <iframe> element embedded in the parent page (usually used by sites like jsfiddle). But this is not secure on its own, because an iframe served from the same origin can access its parent page and get at its content.
Even with this iframe approach there has been some progress recently, e.g. in Chrome, to make it less vulnerable by using the sandbox attribute
<iframe src="sandboxedpage.html" sandbox="allow-scripts"></iframe>
where you can even specify different privileges.
Hopefully, we will have an easy way to use safe and easy metaprogramming soon, but we are not there yet.
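To illustrate "different privileges", the sketch below composes a sandbox attribute from a list of tokens. The token names are real values of the HTML sandbox attribute, but the helper `buildSandboxAttribute` and its policy of refusing `allow-scripts` together with `allow-same-origin` are assumptions of this example; that combination is refused because same-origin content granted both can remove its own sandbox.

```javascript
// Sandbox tokens this sketch knows about (a subset of the HTML standard).
const KNOWN_SANDBOX_TOKENS = new Set([
  'allow-scripts',
  'allow-same-origin',
  'allow-forms',
  'allow-popups',
  'allow-modals',
  'allow-top-navigation',
]);

// Turn a list of requested tokens into a sandbox="..." attribute string.
// An empty list yields sandbox="", the most restrictive setting.
function buildSandboxAttribute(requested) {
  const tokens = [...new Set(requested)]; // drop duplicates, keep order
  for (const token of tokens) {
    if (!KNOWN_SANDBOX_TOKENS.has(token)) {
      throw new Error('unknown sandbox token: ' + token);
    }
  }
  if (tokens.includes('allow-scripts') && tokens.includes('allow-same-origin')) {
    throw new Error('refusing allow-scripts together with allow-same-origin');
  }
  return 'sandbox="' + tokens.join(' ') + '"';
}

console.log(buildSandboxAttribute(['allow-scripts', 'allow-forms']));
// sandbox="allow-scripts allow-forms"
```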

Make programming language for your web app in JS that compiles to JS w/ PHP to ensure thorough filtering of user-uploaded html5 canvas animations?

A persistent follow-up of an admittedly similar question I had asked: What security restrictions should be implemented in allowing a user to upload a Javascript file that directs canvas animation?
I like to think I know JS decently enough, and I see common characters in all the XSS examples I've come across, which I am somewhat familiar with. I am lacking good XSS examples that could bypass a sound, rationally programmed system. I want people to upload html5 canvas creations onto my site. Are there any sites like this yet? People get scared about this all the time it seems, but what if you just wanted to do it for fun for yourself, and if something happens to the server then oh well, it's just an animation site and information is spread around like wildfire anyway, so if anyone cares then I'll tell them not to sign up.
If I allow a single textarea form field to act as an IDE using JS for my programming language written in JS, and do string replacing, filtering, and validation of the user's syntax before finally compiling it into JS to be echoed by PHP, how bad could it get for me to host that content? Please show me how you could bypass all of my combined considerations, with also taking into account the server-side as well:
If JavaScript is disabled, preventing any POST from getting through, keeping constant track of user session.
Namespacing the Class, so they can only prefix their functions and methods with EXAMPLE.
Making instance
Storing my JS Framework in an external (immutable in the browser?) JS file, which needs to be at the top of the page for the single textarea field in the form to be accepted, as well as a server-generated key which must follow it. On the page that hosts the compiled user-uploaded canvas game/animation (1 per page ONLY), the server will verify the correct JS filename string before echoing the rest out.
No external script calls! String replacing on client and server.
Allowing ONLY alphanumeric characters, dashes and asterisks.
Removing alert, eval, window, XMLHttpRequest, prototyping, cookie, obvious stuff. No native JS reserved words or syntax.
Obfuscating and minifying another external JS file that helps to serve the IDE and recognize the programming language's uniquely named Canvas API methods.
When Window unloads, store the external JS code into two dynamically generated form fields to be checked by the server in POST. All the original code will be cataloged in the DB thoroughly for filtering purposes.
Strict variable naming conventions ('example-square1-lengthPROPERTY', 'example-circle-spinMETHOD')
Copy/Paste Disabled, setInterval to constantly check if enabled by the user. If so, then trigger a block to the database, change window.location immediately and check the session ID through POST to confirm in case JS becomes disabled between that timeframe.
I mean, can I do it then? How can one do harm if they can't use HEX or ASCII and stuff like that?
I think there are a few other options.
Good places to go for real-life XSS tests, by the way, are the XSS Cheat Sheet and HTML5 Security Cheatsheet (newer). The problem with that, however, is that you want to allow Javascript but disallow bad Javascript. This is a different, and more complex, goal than the usual way of preventing XSS, by preventing all scripts.
Hosting on a separate domain
I've seen this referred to as an "iframe jail".
The goal with XSS attacks is to be able to run code in the same context as your site, that is, on the same domain. This is because the code will be able to read and set cookies for that domain, initiate user actions or redress your design, redirect, and so forth.
If, however, you have two separate domains - one for your site, and another which only hosts the untrusted, user-uploaded content, then that content will be isolated from your main site. You could include it in an iframe, and yet it would have no access to the cookies from your site, no access to redress or alter the design or links outside its iframe, and no access to the scripting variables of your main window (since it is on a different domain).
It could, of course, set cookies as much as it likes, and even read back the ones that it set. But these would still be isolated from the cookies for your site. It would not be able to affect or read your main site's cookies. It could also include other code which could annoy/harass the user, such as pop-up windows, or could attempt to phish (you'd need to make it visually clear in your out-of-iframe UI that the content served is not part of your site). However, this is still sandboxed from your main site, where your own personal payload (your session cookies and the integrity of your overarching page design and scripts) is preserved. It would carry no less but no more risk than any site on the internet that you could embed in an iframe.
Using a subset of Javascript
Subsets of Javascript have been proposed, which provide compartmentalisation for scripts - the ability to load untrusted code and have it not able to alter or access other code if you don't give it the scope to do so.
Look into things like Google CAJA - whose aim is to enable exactly the type of service that you've described:
Caja allows websites to safely embed DHTML web applications from third parties, and enables rich interaction between the embedding page and the embedded applications. It uses an object-capability security model to allow for a wide range of flexible security policies, so that the containing page can effectively control the embedded applications' use of user data and to allow gadgets to prevent interference between gadgets' UI elements.
One issue here is that people submitting code would have to program it using the CAJA API. It's still valid Javascript, but it won't have access to the browser DOM, as CAJA's API mediates access. This would make it difficult for your users to port some existing code. There is also a compilation phase. Since Javascript is not a secure language, there is no way to ensure code cannot access your DOM or other global variables without running it through a parser, so that's what CAJA does - it compiles it from Javascript input to Javascript output, enforcing its security model.
htmlpurifier consists of thousands of regular expressions that attempt to "purify" HTML into a safe subset that is immune to XSS. This project is bypassed every few months, because it isn't nearly complex enough to address the problem of XSS.
Do you understand the complexity of XSS?
Do you know that javascript can exist without letters or numbers?
Okay, the very first thing I would try is inserting a meta tag that changes the encoding to, I don't know, let's say UTF-7, which is rendered by IE. This UTF-7 encoded HTML will contain JavaScript. Did you think of that? Well guess what, there are somewhere between a hundred thousand and a few million other vectors I didn't think of.
The XSS cheat sheet is so old my grandparents are immune to it. Here is a more up to date version.
(Oh, and by the way, you will be hacked, because what you are trying to do is fundamentally insecure.)
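To make the point about filtering concrete, here is a small sketch of how a word blacklist of the kind proposed in the question can be defeated. The filter `stripForbiddenWords` and its word list are hypothetical stand-ins, not the asker's actual code.

```javascript
// A naive single-pass blacklist, similar in spirit to "removing alert,
// eval, window". The word list is illustrative only.
function stripForbiddenWords(source) {
  return source.replace(/alert|eval|window|XMLHttpRequest/g, '');
}

// Removing the inner "eval" glues the two outer halves back together.
const filtered = stripForbiddenWords("evevalal('1 + 1')");
console.log(filtered);                              // eval('1 + 1')
console.log(new Function('return ' + filtered)());  // 2

// Text can also be produced without typing any letters or digits:
// ![] is false, false + [] is "false", and +[] is 0.
const letterF = (![] + [])[+[]];
console.log(letterF);                               // f
```

Repeating the replacement until nothing changes closes the first hole but not the second, which is the kind of arms race the answer above is warning about.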
