Safely run user written Javascript in my game

Safely run user written Javascript in my game - javascript

I have a tough issue to solve, and I've done a lot of research and I don't think I've found a perfect answer yet.
I am making a game, and I plan on writing a pretty big level editor. In my head, this level editor might be powerful enough for me to write the entire game in.
That means, I'd only need to add new features to that game editor to add features to the game.
So far, so good. However, my game is written in Javascript, and I plan on making the level editor inside the same game engine, so I can actually ship it with the game, and let players make their own levels.
Inside that level editor, I plan on adding scripting features to extend the capabilities of the game.
This means, I'd like the level editor to be able to run Javascript. So I want users to be able to write and run JS in the game, and also be able to share their levels and have other people play these levels.
Now, this sounds really bad. This is me basically allowing users to do XSS through my game. I've thought of many ways, and basically, I haven't found any solution to allow users to do this safely.
I want them to be able to access a handful of JS objects I give them and NOTHING ELSE. I don't want them to be able to access ANY other variable, like window, navigator, console etc. I only want them to be able to access my game's runtime object and whatever utils functions I give them.
So, how can I accomplish that safely?

If the custom script runs on the client, the first step would be to at least run it inside a sandbox rather than on the parent page. You could create a web worker from the user-provided script and then pass a message to the worker including only the information you want it to be able to use. Workers run in a separate environment and do not provide access to the original page, nor to many of the built-in browser objects. But some built-in objects will still be accessible, like window.console and window.fetch. If that's acceptable, you could run the user's script in the worker, and have it post a message back to the original page, and then discard the worker.
You might be able to do shenanegans like using with and new Function in an attempt to prevent the user from accessing certain global properties, but that'd be tedious and I wouldn't be confident in being able to do is successfully.
A better approach would be to run the script in a truly isolated world - send the code to your server, and use Node to run the script with vm.runInNewContext, and have the server send the results back to the client. This provides the isolation you need, because:
When you use this function it creates a VERY limited context. You'll need to pass anything you want into the sandbox object which become global objects. For example: You will need to include console in the sandbox object if you want your untrusted code to write to the console.
That said, in most cases, I'd first consider if running the user's JS is absolutely necessary. I'd much prefer allowing the user to customize behavior by defining a text format that they input text into. But if you need to provide unlimited flexibility given the game, creating such an (essentially) programming language could be quite time-consuming if the game is complicated.

Related

Could webassembly be a way to enforce drm?

The Idea of using compiled languages on might be a great way to increase performance significantly.
But could it used to set a drm?
For example: Some website offers browser games and doesn't want the source code to be used by others. Would webassembly-script tied deep into the game mechanics be used to detect if it is used on another site and lock it down with no way to decompile and bypass it?
I don't want to sound like a pirate with this, but it might concern adblock-users, that also block trackers.
When for example a Audio Context Fingerprinting-script is run behind without being detected, how can it be blocked?

Other than performance, Wasm does not provide any new capability that the Web doesn't already have. You will be able to see what Wasm modules are loaded, you will be able to see what's being called, and you can inspect and step through Wasm code in text form -- and arguably, Wasm code is no more obscure than e.g. minified asm.js code. Moreover, Wasm is only able to interact with the browser and the web page by calling into JavaScript.
So, no, web sites won't be able to use it to do anything behind your back other than in ways they can use already.

How to secure an app that allows users to insert JavaScript?

We have an app that sits behind a firewall and behind a CAS authentication layer. It has a feature that allows users with a special role to customize the way the app works by writing JavaScript functions that get inserted into the application at runtime, and which can be fired by events such as button clicks and page load and the like. (The JS is not "eval"'d - it is written into the page server-side.)
Needless to say, this feature raises security concerns!
Are there recommendations beyond what's being done already to secure this, that is beyond a) firewall, b) robust authentication and c) authorization.
EDIT:
In response to questions in comments:
1. Does the injected code become part of the application, or it is executed as an independent application (separated context)?
Yes, it becomes a part of the application. It currently gets inserted, server-side, into a script tag.
Does inserted JavaScript run on clients' browsers other than the original writer?
Yes. It gets persisted, and then gets inserted into all future requests.
(The application can be thought of as an "engine" for building custom applications against a generic backend data store which is accessed by RESTful calls. Each custom application can have its own set of custom these JavaScripts)

You really shouldn't just accept arbitrary JavaScript. Ideally, what should happen is that you tokenize whatever JavaScript is sent and ensure that every token is valid JavaScript, first and foremost (this should apply in all below scenarios).
After that, you should verify that whatever JavaScript is sent does not access sensitive information.
That last part may be extremely difficult or even impossible to verify in obfuscated code, and you may need to consider that no matter how much verification you do, this is an inherently unsafe practice. As long as you understand that, below are some suggestions for making this process a little safer than it normally is:
As #FDavidov has mentioned, you could also restrict the JavaScript from running as part of the application and sandbox it in a separate context much like Stack Snippets do.
Another option is to restrict the JavaScript to a predefined whitelist of functions (some of which you may have implemented) and globals. Do not allow it to interact directly with DOM or globals except of course primitives, control flow, and user-defined function definitions. This method does have some success depending on how robustly enforced the whitelist is. Here is an example that uses this method in combination with the method below.
Alternatively, if this is possible with what you had in mind, do not allow the code to run on anyone's machine other than the original author of the code. This would basically be moving a Userscript-like functionality into the application proper (which I honestly don't see the point), but it would definitely be safer than allowing it to run on any client's browser.

How do you extract tightly-coupled code into an external library?

This is a generic question about paradigms, and I apologize if this is an inappropriate place to ask. Polite recommendations on the correct place to ask this will be appreciated :)
I'm working for a company that has a separate codebase for each of its websites. I've been asked to take a sizable piece of functionality out of one codebase and put it into an external library, so that multiple codebases can make use of it.
The problem is that the code is tightly coupled to the codebase it was built in, and I'm having a difficult time extracting it. I've approached this problem from multiple angles, and restarted from scratch each time. Each time, I start running into complexities, and it feels like I'm approaching the problem the wrong way. I was wondering if anyone else has had experience doing this, or if there is a recommeneded way to proceed?
Here's what I've tried:
I copied the relevant files into a new project, carefully replacing each reference to the old codebase with vanilla javascript. This has been a laborious process, and I keep running into issues I can't solve
I placed a very basic HTML file in the old codebase, as well as a blank javascript file. I've been cut and pasting functions one at a time into that javascript file, and calling them in the old codebase as well as the basic HTML file.
I created another new project, and copy and pasted functions one at a time into the new project.
Each approach has presented me with its own challenges, but I can't get around the fact that the original code is so tightly coupled to the original codebase that progress is very slow, and I'm beginning to question whether any of the code is salvageable.

The old code may not be salvageable, and it's more than reasonable to reach a point where you go back and say so.
The typical goal I have in cases such as these, cases where nearly all of the old code is unsalvageable but something new needs to not only take over for it, but quickly be used by old and new codebases alike, is to refactor the code into models, services, and components (less MVC and more 'data, how you get and change data, and how you view and interact with data').
In cases where you are building something to replicate the old, but get to write it from scratch, I treat it like it's brand new, and start from the interfaces, first. By knowing what the outer-edges should look like, and by keeping the internal code clean, and leaning on DI (the principle, not any wrapper in particular), I build the system I think I should be able to have, such that new projects/products can happily integrate with the right thing.
...for projects which need to have a product revamped, inside of a busted old system, I take nearly the same tack; I design the interface that I want, I make sure that everything is DI friendly (this becomes more important, here), and then I build a facade that looks exactly like how the old bustedness is called and used, and inside of that facade, I instantiate the sane system, I transform whatever the old, awful data points were, into our new models, I do whatever it is my system needs to do, and on the way out of the system, I transform our awesome new models into the terrifying results that the old system was responsible for making.
The latest such thing is a new platform which hosts new APIs.
The APIs, however, talk to awful, old, stateful, session-based web-services, which make horizontal-scaling absolutely impossible (not what you want to hear, when your goal is to distribute the new platform on Node, on AWS).
The solution was to build the APIs exactly as we expect them; get the interface to look as nice and be as useful as possible, while serving the actual needs of the API clients.
Then, we made sure that the modules which provided the APIs used DI for the service that acts as a connector to the old back-end.
That's so that we can simply swap out that service, when it comes time to connect to a better implementation of the system.
That service, however, needs transformers.
It needs one transformer to convert our awesome new request objects, into the scary old ball of mud that just kept growing.
Then it needs another transformer to turn the output from the ugly old data, into our new models, that our whole system uses.
Those transformers don't necessarily need to be injected into the service, because their implementation details are tied pretty tightly to the place that they're calling, and any update to the service, or any new service called will require transformer work for that specific service's implementation details.
Then there are problems on the front-end side, where communication used to take too much for granted, when talking to a server.
We now have transformers on the client side, which are used at the last possible second (actually, we wrote client-side services) to convert the old way of doing things to talk to the new form.
Any magic global data, which was randomly called in the middle of a process was factored into the service, the transform, and the API in general, if it serves a specific / reusable enough purpose.
Any of those magically grabbed pieces of information are now explicitly passed in. Some are client-only, and thus are either config data for instantiation, or are parameters for particular methods on services.
Session data is now explicitly passed back from the client, in the form of tokens/ids on each request that requires them (for now).
So the new platform stays 100% stateless (and thus scales wonderfully, from that aspect).
So long as all of that magical data gets pulled out of the internals, and passed through, the system can keep being refactored, without too much worry.
As soon as state and state-management exist on the inside of your system, it starts getting harder to work with, and harder to refactor (but you already know that).
Doing a refactor of a product which never leaves the page (ie: involves no APIs/services, or at least none that are tightly coupled to your front-end), isn't really much different.
Remove global state, by explicitly forcing it to be passed into your system (build time, call-time, whenever fits the data's purpose).
If there are async race conditions with moving parts that touch too many things, see if you can resolve them with promises, to get you out of nested callback hell.
My team is now largely using set-based programming (.map, .filter, .reduce, over arrays) and functional programming, in general, to simplify much of the code we look at and write, as each new function may only be 3-5 lines long (some being one-liners).
So our services will tend to be structured in an OOP sort of way, but as much as possible, will remain pure (no outer state modified by/for function calls), and the internals of those calls will typically look much more like chained or composed functional programming.
This has less to do with the overall refactor, and more to do with the micro refactors, as we build our systems.
For the macro-level, it's really your interface, and the facade you wrap the old stuff in, and the removal of all global state (which functional helps with) which make the difference.
The other alternative, of course, is to copy and paste the whole file/page, and start erasing things that you know aren't going to break, until you get to the things that might break, and continue from there. It's not pretty, it's not reusable, but I've been forced to do it a few times in my life, and regretted it every time.

Is there a way to reliably delete/wipe a variable (i.e. key/password) in JavaScript?

Today crypto libraries for JavaScript exists sjcl and hence there may be the situation that
a password/key/secret/sensitivedata is stored somewhere in a variable in JavaScript.
I do not want to risk that this sensitve data is leaked/disclosed and hence I would very much like to know if there is a way to reliably wipe a variable in Javascript so that the memory used by the JavaScript Engine will not have any remaining info about he data? I would for instance not want to rely on some GC to wipe the data lazily etc.
An answer might feature an example code that kills/wipes a variable and also an explanation when (and if there are differences on what JavaScript implementation Browsers type /Nodejs) it makes sence to trust that the data has been deleted?
Else if the task is impossible I would appreciate a explanation why this is so as well and also accept this as an answer
The goal is not to protect the webpage user from accessing the script variable (this cannot be done I guess). The goal is more to guarantee that the memory of the javascript engine does not keep shadow/cached copies of the data, after the point necessary. I do want to have the data be gone so that no-one (attacker software) can get the secret data via looking at the memory been associated with the Javascript Variables.

JavaScript is garbage collected. In addition, there is no mechanism for deterministic resource management built in. You can make one, but the resource would have to be external.
Even if you build such a mechanism (with a C++ external module in Node for example), engines don't give you strong guarantees on when their copy of the memory is cleared. You would have to manually assign to the same variable parts of the resource data and replace it with junk yourself. That would likely work but there is still no guarantee at the engine level.
This is simply not a problem JavaScript implementations are built to do well at this point. There is no SecureString. That said - smart people are working on variants of ECMAScript (the JS standard) that give you much stronger guarantees. That's a good first step towards addressing the problem (but no such guarantee yet).
I don't even want to get started on browsers, where browser extensions can easily get better hooks than you and write over Function.prototype.call and hook on every function call, JavaScript has quite powerful AOP capabilities built in, for worse in this instance.
One possible solution would be to run the whole program within a VM that uses encrypted RAM, but I'm against rolling your own crypto like that. Generally, an attacker should not have access to your program's RAM in the first place, if they do, they can install a browser extension :)

what actually is sandboxing in JavaScript?

I understand the term sandbox. But my limited skills in JS is unable to help me understand what is sandboxing in JS. So, what actually is sandboxing? Apart from security, why do we need to sandbox JS?

the javascript sandbox does exactly what you've said. It limits the scope of what a script can do. There are also benefits in terms of virtualising the resources the script can call on. This allows the sandbox host to marshal those resources for better performance and say, stop an endlessly looping script bringing the whole browser crashing down.

Sandboxing is the act of creating a scope in which no other part of the application can operate (unless given an opportunity to). More specifically, this is usually a function scope that exposes a limited subset of what's actually going on within it.
One library that's founded on the idea of sandboxes is YUI3. The basic unit of the application is a YUI instance sandbox:
var Y = YUI(); // creates a configurable YUI instance
// Creates a sandbox for one part of your application,
// including the 'node' module.
Y.use('node', function(Z) {
// Z is a YUI instance that's specific to this sandbox.
// Operations inside it are protected from outside code
// unless exposed explicitly. Any modules you request in
// use statement will be separately instanced just for
// this sandbox (in this case, the 'node' module)
//
// That way, if another part of your application decides
// to delete Z.Node (or worse, replace it with a
// malicious proxy of Z.Node) the code you've written
// here won't be affected.
});
The advantages of sandboxes are primarily to reduce application complexity: since sandboxes are immutable, they're much easier to reason about and verify. They also improve runtime security, since a well-designed sandbox should be able to operate as a black box to other scripts running on the page. It does not prevent against all possible attacks, but it protects against many of the simple ones.

Sandboxing creates a limited scope for the script to use. Assuming you're coding for a website, t's worth sandboxing to avoid making edits to a live site when you are uncertain about whether they will work exactly as you expect - and it's impossible to be truly certain without testing. Even if it works properly, if there's a chance of you making a series of alterations to the JS until you've got it tweaked the way you like, you could easily disrupt anyone attempting to use the site while you're updating it.
It's also much easier to tell what's broken when you break things, because of the limited nature of the sandbox.

We Keep Coding

JavaScript is the programming language of the Web.