I am currently considering issues of running user-supplied code in node. I have two issues:
The user script must not read or write global state. For that, I assume I can simply spawn of a new process. Are there any other considerations? Do I have to hide the parent process from the child somehow, or is there no way a child can read, write or otherwise toy with the parent process?
The user script must not do anything funky with the system. So, I am thinking of disallowing any system calls. How do I achieve this? (Note that if I can disallow the process module, point 1 should be fixed as well, no?)
You are looking for the runInNewContext function from the vm module (vm documentation).
When you use this function it creates a VERY limited context. You'll need to pass anything you want into the sandbox object which become global objects. For example: You will need to include console in the sandbox object if you want your untrusted code to write to the console.
Another thing to consider: Creating a new context is a VERY expensive operation - takes extra time and memory to do. Seriously consider if you absolutely need this. Also seriously consider how often this is going to happen.
Example:
var vm = require('vm');
var sandbox = {
console: console,
msg: "this is a test",
};
vm.runInNewContext('console.log(msg);', sandbox, 'myfile.vm');
// this is a test
More to consider: You will want to create a new process to run this in. Even though it's in a new context it's still in the same process that it's being called from. So a malicious user could simply set a never ending for loop so that it never exits. You'll need to figure out logic to know when something like this happens so that you can kill the process and create a new one.
Last thought: A new context does not have setTimeout or setInterval. You may or may not want to add these. However, if you create a setInterval in the untrusted code and the untrusted code never stops it then it will continue on forever. You'll need to figure a way to end the script, it's probably possible I just haven't looked into it.
Related
I use mocha/chai to test my graphql endpoints for a nodejs/express server - which works fine. However, in the first test, I check if any .env variables have been set correctly.
If not, all further tests will be affected anyways. So I would like to terminate all further tests when any test in this block fails (preferably finishing the block to capture all 'missing' variables).
So can I somehow terminate the entire test, when a certain describe-block fails?
Note: I found the bail flag/config option, but this determines the whole test on any error, which is not what I'm looking for.
You can try add after hook and add a condition there
after('Fail for some describe', function () {
if(this.test.parent.title.startsWith('Describe title that should fail') && this.test.parent.isFailed())
this.test.parent._bail = true
});
This could be tricky since this context may contain several layers of parents so if not sure you can always log the keys for current this.test object. Also, bear in mind that the following describe might have different this context with overriden _bail value. This solution is good when it comes to tracking specific critical it failures.
the answer to this question will probably be "it cannot be done with out writing a custom javaScript parser" but here i go:
while developing some lightweight scripting tool i ran into a need to do some automatic cleanup behind the scenes when some block variables goes out of scope.
(this tool has no serious parsing done under the hood..., i just take normal javascript text source code, augment it a bit, add some dependencies on context and create a function out of it and execute it)
i would like it to be done automatically with out the end user that wrote the script even realizing it is happening behind the scenes.
for example
i got this block
{
const activatingElement=await $getElement('//input[#class="button" and #type="submit"]');
await $wait(defaultWaitTime);
let transaction = $startTransaction({customTransactionName:"LoginAndLoad_page"});
await activatingElement.click();
await $waitFor({query:'(//tbody[#class="ui-datatable-data ui-widget-content"]//tr[#role="row"])[1]'});
$endTransaction(transaction);
console.log(this);
await activatingElement.dispose();
}
i would like to cause the last line await activatingElement.dispose(); to be executed automatically just before the const activatingElement defined in the block goes out of scope. i control the method that defines the value for this variable ($getElement) so i can change it as i want. but i don't want to just edit the source code and write the dispose before the block ends, since it will be brittle in case of nested blocks. so i would just like this method to be automatically called just before this variable goes out of scope. is there any trick i can use to do it?
this is needed to clear some browser elements for GC, they are not elements inside my own nodejs code execution, they are handles for a code that executes on a browser and unless i call dispose() on them then necessary resources might be unavailable for cleaning.
is there any way to do it? i thought of using traps with proxy, but don't think it can be done on out of scope to be variables (only on property deletion)
I'm wondering if there's a way to cause JavaScript to wait for some variable-length code execution to finish before continuing using events and loops. Before answering with using timeouts, callbacks or referencing this as a duplicate, hear me out.
I want to expose a large API to a web worker. I want this API to feel 'native' in the sense that you can access each member using a getter which gets the information from the other thread. My initial idea was to compile the API and rebuild the entire object on the worker. While this works (and was a really fun project), it's slow at startup and cannot show changes made to the API without it being sent to the worker again after modification. Observers would solve part of this, and web workers transferrable objects would solve all, but they aren't adopted widely yet.
Since worker round-trip calls happen in a matter of milliseconds, I think stalling the thread for a few milliseconds may be an alright solution. Of course I would think about terminating in cases where calls take too long, but I'm trying to create a proof of concept first.
Let's say I want to expose the api object to the worker. I would define a getter for self.api which would fetch the first layer of properties. Each property would then be another getter and the process would continue until the final object is found.
worker.js
self.addEventListener('message', function(event) {
self.dataRecieved = true;
self.data = event.data; // would actually build new getters here
});
Object.defineProperty(self, 'api', {
get: function() {
self.dataRecieved = false;
self.postMessage('request api first-layer properties');
while(!self.dataRecieved);
return self.data; // whatever properties were received from host
}
});
For experimentation, we'll do a simple round-trip with no data processing:
index.html (only JS part)
var worker = new Worker("worker.js");
worker.onmessage = function() {
worker.postMessage();
};
If onmessage would interrupt the loop, the script should theoretically work. Then the worker could access objects like window.document.body.style on the fly.
My question really boils down to: is there a way to guarantee that an event will interrupt an executing code block?
From my understanding of events in JavaScript, I thought they did interrupt the current thread. Does it not because it's executing a blank statement over and over? What if I generated code to be executed and kept doing that until the data returned?
is there a way to guarantee that an event will interrupt an executing code block
As #slebetman suggests in comments, no, not in Javascript running in a browser's web-worker (with one possible exception that I can think of, see suggestion 3. below).
My suggestions, in decreasing order of preference:
Give up the desire to feel "native" (or maybe "local" might be a better term). Something like the infinite while loop that you suggest also seems to be very much fighting agains the cooperative multitasking environment offered by Javascript, including when thinking about a single web worker.
Communication between workers in Javascript is asynchronous. Perhaps it can fail, take longer than just a few milliseconds. I'm not sure what your use case is, but my feeling is that when the project grows, you might want to use those milliseconds for something else.
You could change your defined property to return a promise, and then the caller would do a .then on the response to retrieve the value, just like any other asynchronous API.
Angular Protractor/Webdriver has an API that uses a control flow to simulate a synchronous environment using promises, by always passing promises about. Taking the code from https://stackoverflow.com/a/22697369/1319998
browser.get(url);
var title = browser.getTitle();
expect(title).toEqual('My Title');
By my understanding, each line above adds a promise to the control flow to execute asynchronously. title isn't actually the title, but a promise that resolves to the title for example. While it looks like synchronous code, the getting and testing all happens asynchronously later.
You could implement something similar in the web worker. However, I do wonder whether it will be worth the effort. There would be a lot of code to do this, and I can't help feeling that the main consequence would be that it would end up harder to write code using this, and not easier, as there would be a lot of hidden behaviour.
The only thing that I know of that can be made synchronous in Javascript, is XMLHttpRequest when setting the async parameter to false https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest#Parameters. I wonder if you could come up with some sort of way to request to the server that maintains a connection with the main thread and pass data along that way. I have to say, my instinct is that this is quite an awful idea, and would be much slower than just requesting data from the main thread.
For what I know, there is not something native in JS to do this but it is relatively easy to do something similar. I made one some time ago for myself: https://github.com/xpy/whener/blob/master/whener.js .
You use it like when( condition, callback ) where condition is a function that should return true when your condition is met, and callback is the function that you want to execute at that time.
It seems there's no way to completely hide source/encrypt something to prevent users from inspecting the logic behind a script.
Aside from viewing the source, then, is it possible to manipulate every variables, objects while a script is running?
It seems it is possible to some degree: by using Chrome's developer tools or Firebug, you can easily edit variables or even invoke functions on the global scope.
Then what about variables, functions inside of an instantiated objects or self invoked anonymous functions? Here is an example:
var varInGlobal = 'On the global scope: easily editable';
function CustomConstructor()
{
this.exposedProperty = 'Once instantiated, can be easily manipulated too.';
this.func1 = function(){return func1InConstructor();}
var var1InConstructor = 'Can be retrived by invoking func1 from an instantiated object';
// Can it be assigned a new value after this is instantiated?
function func1InConstructor()
{
return var1InConstructor;
}
}
var customObject = new CustomConstructor();
After this is ran on a browser:
// CONSOLE WINDOW
varInGlobal = 'A piece of cake!';
customObject.exposedProperty = 'Has new value now!';
customObject.var1InConstructor; // undefined: the variable can't be access this way
customObject.func1(); // This is the correct way
At this stage, is it possible for a user to edit the variable "var1InConstructor" in customObject?
Here's another example:
There is a RPG game built on Javascript. The hero in the game has two stats: strength and agility. the character's final damage is calculated by combining these two stats. It is clear that players can find out this logic by inspecting the source.
Let's assume the entire script is self invoked and stats/calculate functions are inside of objects' constructors so they can't be reached by normally after instantiated. My question is, can the players edit the character's str and agi while the game is running(by using Firebug or whatever) so they can steamroll everything and ruin the game?
The variable var1InConstructor cannot be re-bound under normal ECMAScript rules as it is visible only within the lexical scope. However, as alex (and others) rightly say, the client should not be trusted.
Here are some ways the user can exploit the assumption that the variable is read-only:
Use a JavaScript debugger (e.g. FireBug) and re-assign the variable while stopped at a breakpoint within the applicable scope.
Copy and paste the original source code, but add a setter with access to the variable. The user could even copy the entire program invalidating almost every assumption about execution.
Modify or inject a value at a usage site: an exploitation might be possible without ever actually updating the original variable (e.g. player.power = function () { return "godlike" }).
In the end, with a client-side program, there is no way to absolutely prevent a user from cheating without a centralized authority (read: server) auditing every action - and even then it still might be possible to cheat by reading additional game state, such as enemy positions.
JavaScript, being easy to read, edit, and execute dynamically is even easier to hack/fiddle with than a compiled application. Obfuscation is possible but, if someone wants to cheat, they will.
I don't think this constitutes an answer, it could be seen as anecdotal, but it's a bit long for a comment.
Everything you do when it comes to the integrity of your coding on this issue has to revolve around needing to verify that the data hasn't changed outside of the logic of your game.
My experience with game development (via flash, primarily...but could be compared to javascript) is that you need to think about everything being a handshake where possible. When you are expecting data to come to the server from the client you want to make sure that you have some form of passage of communication that lessens the chance of someone simply sending false data. Store data on the server side as much as possible and use the client side code to call for it when it's needed, and refresh this data store often.
You'll find that HTML games tend to do a lot of abstraction of the logic to the server side, even for menial tasks. Attacking an enemy, picking up an item, these are calls to functions within server-side code, and is why the game animation could carry on in some of these games while the connection times out in the background, causing error messages to pop up and refresh the interface to the server's last known valid state.
Flash was easier in this regard as you didn't have any access to alter any data or corrupt it unless it left the flash environment
Yes, anything ran on the client should be untrusted if you're using the data from it to update a server side state.
As you suggested, you can't hide the logic/client-side code. You can make it "harder" for people to read the source by obfuscating it, but it's very trivial to undo.
Assuming you're making a game from your example, the first rule of networked games is "never trust the client". You need to either run all the game logic on a server, or you need to validate all the input on a server. Never update the game state based on input from a client without validating it first.
You can't hide any variable.
Also, if the user is so good in javascript, he can easily edit your script, without editing the variables value through the console.
JS code that is injected into an HTML using Ajax is pretty darn difficult to get your hands on, but it also has it's limitations. Most notably, you can't use JS includes in injected HTML . . . only inline JS.
I've been working with some of that recently actually and it's a real pain to debug. You can't see it, step into it, or add breakpoints to it in any way that I can figure out . . . in Firebug or Chrome's built-in tool.
But, as others have said . . . I still wouldn't consider it trusted.
I have seen this link: Implementing Mutual Exclusion in JavaScript.
On the other hand, I have read that there are no threads in javascript, but what exactly does that mean?
When events occur, where in the code can they interrupt?
And if there are no threads in JS, do I need to use mutexes in JS or not?
Specifically, I am wondering about the effects of using functions called by setTimeout() and XmlHttpRequest's onreadystatechange on globally accessible variables.
Javascript is defined as a reentrant language which means there is no threading exposed to the user, there may be threads in the implementation. Functions like setTimeout() and asynchronous callbacks need to wait for the script engine to sleep before they're able to run.
That means that everything that happens in an event must be finished before the next event will be processed.
That being said, you may need a mutex if your code does something where it expects a value not to change between when the asynchronous event was fired and when the callback was called.
For example if you have a data structure where you click one button and it sends an XmlHttpRequest which calls a callback the changes the data structure in a destructive way, and you have another button that changes the same data structure directly, between when the event was fired and when the call back was executed the user could have clicked and updated the data structure before the callback which could then lose the value.
While you could create a race condition like that it's very easy to prevent that in your code since each function will be atomic. It would be a lot of work and take some odd coding patterns to create the race condition in fact.
The answers to this question are a bit outdated though correct at the time they were given. And still correct if looking at a client-side javascript application that does NOT use webworkers.
Articles on web-workers:
multithreading in javascript using webworkers
Mozilla on webworkers
This clearly shows that javascript via web-workers has multithreading capabilities. As concerning to the question are mutexes needed in javascript? I am unsure of this. But this stackoverflow post seems relevant:
Mutual Exclusion for N Asynchronous Threads
Yes, mutexes can be required in Javascript when accessing resources that are shared between tabs/windows, like localStorage.
For example, if a user has two tabs open, simple code like the following is unsafe:
function appendToList(item) {
var list = localStorage["myKey"];
if (list) {
list += "," + item;
}
else {
list = item;
}
localStorage["myKey"] = list;
}
Between the time that the localStorage item is 'got' and 'set', another tab could have modified the value. It's generally unlikely, but possible - you'd need to judge for yourself the likelihood and risk associated with any contention in your particular circumstances.
See the following articles for a more detail:
Wait, Don't Touch That: Mutual Exclusion Locks & JavaScript - Medium Engineering
JavaScript concurrency and locking the HTML5 localStorage - Benjamin Dumke-von der Eh, Stackoverflow
As #william points out,
you may need a mutex if your code does something where it expects a
value not to change between when the asynchronous event was fired and
when the callback was called.
This can be generalised further - if your code does something where it expects exclusive control of a resource until an asynchronous request resolves, you may need a mutex.
A simple example is where you have a button that fires an ajax call to create a record in the back end. You might need a bit of code to protect you from trigger happy users clicking away and thereby creating multiple records. there are a number of approaches to this problem (e.g. disable the button, enable on ajax success). You could also use a simple lock:
var save_lock = false;
$('#save_button').click(function(){
if(!save_lock){
//lock
save_lock=true;
$.ajax({
success:function()
//unlock
save_lock = false;
}
});
}
}
I'm not sure if that's the best approach and I would be interested to see how others handle mutual exclusion in javascript, but as far as i'm aware that's a simple mutex and it is handy.
JavaScript is single threaded... though Chrome may be a new beast (I think it is also single threaded, but each tab has it's own JavaScript thread... I haven't looked into it in detail, so don't quote me there).
However, one thing you DO need to worry about is how your JavaScript will handle multiple ajax requests coming back in not the same order you send them. So, all you really need to worry about is make sure your ajax calls are handled in a way that they won't step on eachother's feet if the results come back in a different order than you sent them.
This goes for timeouts too...
When JavaScript grows multithreading, then maybe worry about mutexes and the like....
JavaScript, the language, can be as multithreaded as you want, but browser embeddings of the javascript engine only runs one callback (onload, onfocus, <script>, etc...) at a time (per tab, presumably). William's suggestion of using a Mutex for changes between registering and receiving a callback should not be taken too literally because of this, as you wouldn't want to block in the intervening callback since the callback that will unlock it will be blocked behind the current callback! (Wow, English sucks for talking about threading.) In this case, you probably want to do something along the lines of redispatching the current event if a flag is set, either literally or with the likes of setTimeout().
If you are using a different embedding of JS, and that executes multiple threads at once, it can get a bit more dicey, but due to the way JS can use callbacks so easily and locks objects on property access explicit locking is not nearly as necessary. However, I would be surprised if an embedding designed for general code (eg, game scripting) that used multi threading didn't also give some explicit locking primitives as well.
Sorry for the wall of text!
Events are signaled, but JavaScript execution is still single-threaded.
My understanding is that when event is signaled the engine stops what it is executing at the moment to run event handler. After the handler is finished, script execution is resumed. If event handler changed some shared variables then resumed code will see these changes appearing "out of the blue".
If you want to "protect" shared data, simple boolean flag should be sufficient.