I understand the term sandbox, but my limited JS skills don't help me understand what sandboxing means in JavaScript. So, what actually is sandboxing? And apart from security, why do we need to sandbox JS?
The JavaScript sandbox does exactly what you've said: it limits the scope of what a script can do. There are also benefits in terms of virtualising the resources the script can call on, which allows the sandbox host to marshal those resources for better performance and, say, stop an endlessly looping script from bringing the whole browser crashing down.
Sandboxing is the act of creating a scope in which no other part of the application can operate (unless given an opportunity to). More specifically, this is usually a function scope that exposes a limited subset of what's actually going on within it.
One library that's founded on the idea of sandboxes is YUI3. The basic unit of the application is a YUI instance sandbox:
var Y = YUI(); // creates a configurable YUI instance
// Creates a sandbox for one part of your application,
// including the 'node' module.
Y.use('node', function(Z) {
// Z is a YUI instance that's specific to this sandbox.
// Operations inside it are protected from outside code
// unless exposed explicitly. Any modules you request in
// the use() statement will be separately instanced just
// for this sandbox (in this case, the 'node' module).
//
// That way, if another part of your application decides
// to delete Z.Node (or worse, replace it with a
// malicious proxy of Z.Node) the code you've written
// here won't be affected.
});
The advantages of sandboxes are primarily to reduce application complexity: since sandboxes are immutable, they're much easier to reason about and verify. They also improve runtime security, since a well-designed sandbox should be able to operate as a black box to other scripts running on the page. It does not protect against all possible attacks, but it does protect against many of the simple ones.
Sandboxing creates a limited scope for the script to use. Assuming you're coding for a website, it's worth sandboxing to avoid making edits to a live site when you're uncertain whether they will work exactly as you expect - and it's impossible to be truly certain without testing. Even if everything works properly, if there's a chance you'll make a series of alterations to the JS until you've tweaked it the way you like, you could easily disrupt anyone trying to use the site while you're updating it.
It's also much easier to tell what's broken when you break things, because of the limited nature of the sandbox.
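The function-scope sandbox described earlier can be sketched with a plain IIFE; this hypothetical counter module is not from any particular library, just an illustration of hiding state behind an exposed subset:

```javascript
// A sandbox via function scope: the IIFE exposes only what it returns;
// everything else stays private and untouchable from outside.
const counter = (function () {
  let count = 0; // private - invisible outside the sandbox

  return {
    increment() { return ++count; },
    value() { return count; }
  };
})();

counter.increment();
// `counter.count` is undefined: outside code cannot read or replace the
// internal state, only use the methods the sandbox chose to expose.
```

Even if another script overwrites `counter.increment`, the private `count` variable itself remains reachable only through the closures created inside the IIFE.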
I have a tough issue to solve, and I've done a lot of research and I don't think I've found a perfect answer yet.
I am making a game, and I plan on writing a pretty big level editor. In my head, this level editor might be powerful enough for me to write the entire game in.
That means, I'd only need to add new features to that game editor to add features to the game.
So far, so good. However, my game is written in Javascript, and I plan on making the level editor inside the same game engine, so I can actually ship it with the game, and let players make their own levels.
Inside that level editor, I plan on adding scripting features to extend the capabilities of the game.
This means, I'd like the level editor to be able to run Javascript. So I want users to be able to write and run JS in the game, and also be able to share their levels and have other people play these levels.
Now, this sounds really bad. This is me basically allowing users to do XSS through my game. I've thought of many ways, and basically, I haven't found any solution to allow users to do this safely.
I want them to be able to access a handful of JS objects I give them and NOTHING ELSE. I don't want them to be able to access ANY other variable, like window, navigator, console etc. I only want them to be able to access my game's runtime object and whatever utils functions I give them.
So, how can I accomplish that safely?
If the custom script runs on the client, the first step would be at least to run it inside a sandbox rather than on the parent page. You could create a web worker from the user-provided script and then pass the worker a message containing only the information you want it to be able to use. Workers run in a separate environment and have no access to the original page, nor to many of the built-in browser objects. Some built-in globals will still be accessible, though, like console and fetch. If that's acceptable, you could run the user's script in the worker, have it post a message back to the original page, and then discard the worker.
You might be able to do shenanigans like using with and new Function in an attempt to prevent the user from accessing certain global properties, but that'd be tedious and I wouldn't be confident in doing it successfully.
A better approach would be to run the script in a truly isolated world - send the code to your server, and use Node to run the script with vm.runInNewContext, and have the server send the results back to the client. This provides the isolation you need, because:
This function creates a very limited context. Anything you want the script to see must be passed in via the sandbox object, whose properties become the script's globals. For example, you will need to include console in the sandbox object if you want your untrusted code to write to the console.
That said, in most cases I'd first consider whether running the user's JS is absolutely necessary at all. I'd much prefer letting users customize behavior through a defined text format that they fill in. But if the game requires unlimited flexibility, creating such a format (essentially a small programming language) could be quite time-consuming if the game is complicated.
It looks like I'm asking about a tricky problem that's been explored a lot over the past decades without a clear solution. I've seen Is It Possible to Sandbox JavaScript Running In the Browser? along with a few smaller questions, but all of them seem to be mislabeled - they all focus on sandboxing cookies and DOM access, not JavaScript itself, which is what I'm trying to do; iframes and web workers don't sound like exactly what I'm looking for.
Architecturally, I'm exploring the pathological extreme: not only do I want full control of what functions get executed, so I can disallow access to arbitrary functions, DOM elements, the network, and so forth, I also really want to have control over execution scheduling so I can prevent evil or poorly-written scripts from consuming 100% CPU.
Here are two approaches I've come up with as I've thought about this. I realize I'm only going to perfectly nail two out of fast, introspected and safe, but I want to get as close to all three as I can.
Idea 1: Put everything inside a VM
While it wouldn't present a JS "front", perhaps the simplest and most architecturally elegant solution to my problem could be a tiny, lightweight virtual machine. Actual performance wouldn't be great, but I'd have full introspection into what's being executed, and I'd be able to run eval inside the VM and not at the JS level, preventing potentially malicious code from ever encountering the browser.
Idea 2: Transpilation
First of all, I've had a look at Google Caja, but I'm looking for a solution itself written in JS so that the compilation/processing stage can happen in the browser without me needing to download/run anything else.
I'm very curious about the various transpilers (TypeScript, CoffeeScript, this gigantic list, etc.) - if these languages perform a full tokenization -> AST -> code generation pass, that would make them excellent "code firewalls" that could filter function/DOM/etc. accesses at compile time, meaning I get my performance back!
My main concern with transpilation is whether there are any attacks that could be used to generate the kind of code I'm trying to block. These languages' parsers weren't written with security in mind, after all, which was my motivation for using a VM.
This technique would also mean I lose execution introspection. Perhaps I could run the code inside one or more web workers and "ping" the workers every second or so, killing off any worker that doesn't respond (presumably because it has gotten stuck in an infinite loop). That could work.
We have an app that sits behind a firewall and behind a CAS authentication layer. It has a feature that allows users with a special role to customize the way the app works by writing JavaScript functions that get inserted into the application at runtime, and which can be fired by events such as button clicks and page load and the like. (The JS is not "eval"'d - it is written into the page server-side.)
Needless to say, this feature raises security concerns!
Are there any recommendations for securing this beyond what's being done already - that is, beyond a) the firewall, b) robust authentication, and c) authorization?
EDIT:
In response to questions in comments:
1. Does the injected code become part of the application, or is it executed as an independent application (separate context)?
Yes, it becomes a part of the application. It currently gets inserted, server-side, into a script tag.
2. Does the inserted JavaScript run on clients' browsers other than the original writer's?
Yes. It gets persisted, and then gets inserted into all future requests.
(The application can be thought of as an "engine" for building custom applications against a generic backend data store, accessed via RESTful calls. Each custom application can have its own set of these custom JavaScripts.)
You really shouldn't just accept arbitrary JavaScript. Ideally, you would tokenize whatever JavaScript is submitted and ensure, first and foremost, that every token is valid JavaScript (this should apply in all the scenarios below).
After that, you should verify that whatever JavaScript is sent does not access sensitive information.
That last part may be extremely difficult, or even impossible, to verify in obfuscated code, and you may need to accept that no matter how much verification you do, this is an inherently unsafe practice. As long as you understand that, here are some suggestions for making this process a little safer than it normally is:
As #FDavidov has mentioned, you could also prevent the JavaScript from running as part of the application and sandbox it in a separate context instead, much like Stack Snippets do.
Another option is to restrict the JavaScript to a predefined whitelist of functions (some of which you may have implemented yourself) and globals. Do not allow it to interact directly with the DOM or with globals, except of course for primitives, control flow, and user-defined function definitions. This method has had some success, depending on how robustly the whitelist is enforced. Here is an example that uses this method in combination with the method below.
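As an illustration of the whitelist idea, one might shadow dangerous globals as function parameters so the user code sees only an approved API object. This is emphatically not a real security boundary - constructor chains and other escapes defeat it - so treat it only as a complement to server-side validation; `runWithWhitelist` and the `api` shape are made-up names:

```javascript
// Hedged sketch: shadow a few dangerous globals with undefined
// parameters, and hand the user code a single whitelisted `api` object.
// NOT robust against determined attackers (e.g. via .constructor).
function runWithWhitelist(userCode, api) {
  const blocked = ['window', 'document', 'globalThis', 'fetch', 'XMLHttpRequest'];
  const fn = new Function(...blocked, 'api', `'use strict'; ${userCode}`);
  return fn(...blocked.map(() => undefined), api);
}
```

Inside the generated function, `globalThis` and friends resolve to the undefined parameters rather than the real globals, while `api` exposes exactly the functions you chose to provide.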
Alternatively, if it fits what you had in mind, do not allow the code to run on anyone's machine other than the original author's. This would basically be moving Userscript-like functionality into the application proper (which I honestly don't see the point of), but it would definitely be safer than allowing it to run in any client's browser.
Today, crypto libraries for JavaScript exist (e.g. sjcl), and hence there may be situations where a password/key/secret/sensitive piece of data is stored somewhere in a JavaScript variable.
I do not want to risk this sensitive data being leaked or disclosed, and hence I would very much like to know whether there is a way to reliably wipe a variable in JavaScript, so that the memory used by the JavaScript engine retains no trace of the data. I would, for instance, not want to rely on some GC wiping the data lazily.
An answer might feature example code that kills/wipes a variable, along with an explanation of when it makes sense to trust that the data has been deleted (and whether this differs between JavaScript implementations - browser engines vs. Node.js).
Otherwise, if the task is impossible, I would appreciate an explanation of why that is so, and would accept that as an answer as well.
The goal is not to protect the script's variables from the webpage's user (this cannot be done, I guess). The goal is rather to guarantee that the JavaScript engine's memory does not keep shadow/cached copies of the data beyond the point where they are needed. I want the data to be gone, so that no attacker software can obtain the secret by inspecting the memory associated with the JavaScript variables.
JavaScript is garbage collected. In addition, there is no mechanism for deterministic resource management built in. You can make one, but the resource would have to be external.
Even if you build such a mechanism (with a C++ addon in Node, for example), engines don't give you strong guarantees about when their copies of the memory are cleared. You would have to manually overwrite the variable's storage with junk yourself. That would likely work, but there is still no guarantee at the engine level.
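A best-effort sketch of that manual-overwrite approach, using a typed array so the bytes can actually be zeroed in place (strings are immutable in JavaScript and cannot be wiped at all); the key bytes here are made up:

```javascript
// Keep the secret in a typed array you control, and overwrite it in
// place once it's no longer needed. The engine may still hold hidden
// copies (GC moves, intermediate buffers), so this is hygiene, not a
// guarantee - but it avoids leaving an obvious plaintext copy around.
const secret = new Uint8Array([0x13, 0x37, 0x42, 0x99]); // made-up key bytes

// ... use the secret for crypto operations ...

secret.fill(0); // zero our copy in place
```

The corollary is to avoid ever materializing the secret as a string, since every string concatenation or slice creates another copy you cannot erase.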
This is simply not a problem JavaScript implementations are built to solve well at this point. There is no SecureString. That said, smart people are working on variants of ECMAScript (the JS standard) that give you much stronger guarantees. That's a good first step towards addressing the problem (but there is no such guarantee yet).
I won't even get started on browsers, where browser extensions can easily get better hooks than you, overwrite Function.prototype.call, and hook every function call. JavaScript has quite powerful AOP capabilities built in - for the worse, in this instance.
One possible solution would be to run the whole program within a VM that uses encrypted RAM, but I'm against rolling your own crypto like that. Generally, an attacker should not have access to your program's RAM in the first place; if they do, they can just install a browser extension :)
I have legacy code to maintain, and while trying to understand the logic behind it, I have run into lots of annoying issues.
The application is written mainly in JavaScript, with extensive use of jQuery plus assorted plugins, especially Accordion. It creates a wizard-like flow, where the client code for the next step is downloaded in the background by injecting the result of a remote AJAX request. It also makes heavy use of callbacks and a fairly complicated "by convention" programming style (lots of event handlers are created on the fly based on certain object names - e.g. the current page name or the current step name).
Adding to that, the code is very messy and has no obvious inner structure - functions are scattered through the code, file names do not reflect the business role of the code, and lots of functions and code snippets are most likely not used at all.
PROBLEM: How should I approach this code base so that the inner flow of the code can be sort-of "reverse engineered" using a suite of smart debugging tools?
Ideally, I would like to be able to attach to the running application and step through the code, breaking on each new function call.
Also, it would be nice to be able to create a "diagram of calls" in the application (i.e. in order to run a particular page logic, this particular flow of function calls was executed in a particular order).
Not to mention being able to run a coverage analysis to identify potentially orphaned code fragments.
I would like to stress once more that it is impossible to understand the inner logic of the application just by looking at the code, unless you have LOTS of spare time and beer crates, which I unfortunately do not have :/ (shame...)
An IDE of some sort that would aid in extending the code would also be great; I am currently looking into using Visual Studio 2010 for the job, as the site itself is a mix of Classic ASP and ASP.NET (I'd say 70% JavaScript with jQuery, 30% ASP).
I have obviously tried Firebug, but I was unable to define a breakpoint in, or step into, code that is "injected" into the client JS via AJAX calls (i.e. the application retrieves the code by invoking a URL and injects it into the local client code). The Venkman debugger had similar issues.
Any hints would be welcome. Feel free to ask additional questions.
You can try "dynaTrace Ajax" to create a diagram of calls, but it is not a debugger. If you have access to the application's code, you can use the debugger keyword to define a breakpoint explicitly in the JavaScript files. Visual Studio, in theory, shows all evaluated and loaded JavaScript code in Solution Explorer once you've attached to IE.
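For reference, the debugger keyword works even in AJAX-injected code that never appears in the debugger's file list; `onWizardStepLoaded` here is a made-up example of such an injected handler:

```javascript
// A literal `debugger;` statement breaks into whatever debugger is
// attached (Firebug, Visual Studio, Chrome dev tools) at that exact
// point. Without a debugger attached it is a harmless no-op.
function onWizardStepLoaded(step) { // hypothetical injected handler
  debugger; // execution pauses here when dev tools are open
  return 'loaded step ' + step;
}
```

Dropping one of these into each dynamically loaded snippet gives you a reliable foothold to start stepping through the conventions-based event wiring.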
I'd start with Chrome's developer tools, profiling single activities to find functions to set breakpoints on. Each function can be expanded to see its call stack.
That may or may not be helpful, as it's entirely possible for code to be too - uh, unique, shall we say - to make much sense of.
Good luck?