Data-tainting in JavaScript

While reading about the navigator object in JavaScript, I ran into a description of the taintEnabled() function, as well as the similar taint() and untaint() functions, all referring to something called "data-tainting".
Searching the web and StackOverflow turns up some references to the Perl language, but none to JavaScript. What is data-tainting, and how are these functions used?

Data Tainting (or Taint Checking) is a language feature wherein user-input data is flagged as tainted, and that flag propagates to all data derived from this input. As a result, code can implement runtime assertions to ensure that security-critical code is not called with tainted data (e.g. to prevent SQL injection and XSS attacks).
Whilst Netscape implemented it in the browser in v3 and v4, support for it sadly never materialized elsewhere, so @trejder is absolutely right that it should be avoided in JavaScript.
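To make the idea concrete, here is a minimal sketch of taint propagation in plain JavaScript. It is not Netscape's taint()/untaint() API; the Tainted class and the concat, escapeHtml and render helpers are hypothetical names invented purely for illustration.

```javascript
// Hypothetical wrapper marking a value as derived from user input.
class Tainted {
  constructor(value) {
    this.value = value;
  }
}

const isTainted = (x) => x instanceof Tainted;
const unwrap = (x) => (isTainted(x) ? x.value : x);

// Propagation: the result is tainted if any input was tainted.
function concat(...parts) {
  const joined = parts.map(unwrap).join("");
  return parts.some(isTainted) ? new Tainted(joined) : joined;
}

// Sanitizing is the only way to get a plain string back out.
function escapeHtml(x) {
  return String(unwrap(x))
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Security-critical sink: asserts at run time that its input is clean.
function render(html) {
  if (isTainted(html)) {
    throw new Error("refusing to render tainted data");
  }
  return html;
}

const userInput = new Tainted("<script>alert(1)</script>");
const greeting = concat("Hello, ", userInput); // still tainted
```

Calling render(greeting) throws, while render(escapeHtml(greeting)) succeeds, which is the kind of runtime assertion described above.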

As mentioned, there aren't many sources on the Internet about data-tainting, as it seems to be a long-forgotten, deprecated technique and topic. But I found an interesting write-up on findmeat.org. For the Navigator.taintEnabled() method it says (various parts cited, some text shortened):
The data-tainting support was a short-lived means of sending data back to a server. The security implications became unworkable and the whole data tainting idea was deprecated. The functionality was removed in JavaScript version 1.2. This method is only supported in order to prevent scripts from crashing. This functionality is highly deprecated and you can expect it to cause run-time exceptions in future. You should seek to try and remove it to prevent run-time errors in the future.
It seems that few browsers nowadays support this function (or the similar ones mentioned) and that it should not be used under any circumstances. Even if a browser implements it at all, the method should always return false.

The lack of information on this concept makes things harder for new learners, so here are my findings to help them.
As for when tainting should be enabled and when not, these two points are worth considering:
When data tainting is enabled, JavaScript in one window can see properties of another window, no matter what server the other window's document was loaded from. However, the author of the other window taints (marks) property values or other data that should be secure or private, and JavaScript cannot pass these tainted values on to any server without the user's permission.
When data tainting is disabled, a script cannot access any properties of a window on another server.
A useful resource is http://www.aisystech.com/resources/advtopic.htm#1009533

Related

security considerations when using (end-user-defined) JavaScript code inside Java

I am working on a Java project. In it, we want to enable an end-user to define variables which are calculated based on a set of given variables of primitive types or strings. At some point, all given variables are set to specific values, and then the calculations should be carried out. All resulting calculated variables must then be sent to Java.
I am in the process of evaluating ways for the end-user to define his calculations. The (current) idea is to let him write JavaScript and have that code interpreted/executed inside the Java program. I know of two ways to do this: either the javax.script API or GraalVM/Truffle. In both, we would do it like this:
The given variables are passed into the script: in javax.script via ScriptEngine.put, in Graal/Truffle via Value.putMember.
The end-user can define variables in the global context (whose names must not collide with the ones coming from Java). How he sets their values is up to him - he can set them directly (to a constant, to one of the given variables, to the sum of some of them ...) or define objects and functions and set the values by calling those.
When the time comes that the given variables have fixed values, the script is executed.
All variables that were defined in the global context by the script are sent to Java: in javax.script via ScriptEngine.get, in Graal/Truffle via Value.getMember.
NOTE: We would not grant the script access to any Java classes or methods: in javax.script by checking whether the script contains the string Java.type (and disallowing such a script), in Graal/Truffle by using the default Context (which has allowAllAccess=false).
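As a hedged illustration of how fragile a substring check like the one in the NOTE can be, the sketch below applies such a filter to three script texts. The looksSafe helper is a hypothetical name, and the scripts are only inspected as strings, never executed; whether a Java global exists at all depends on the engine.

```javascript
// Naive filter: reject any script containing the literal text "Java.type".
function looksSafe(script) {
  return !script.includes("Java.type");
}

// Three ways of writing the same property access in JavaScript source.
const direct = 'var R = Java.type("java.lang.Runtime");';
const bracket = 'var R = Java["type"]("java.lang.Runtime");';
const split = 'var k = "ty" + "pe"; var R = Java[k]("java.lang.Runtime");';

console.log(looksSafe(direct));  // false (caught by the filter)
console.log(looksSafe(bracket)); // true  (slips through)
console.log(looksSafe(split));   // true  (slips through)
```

Because property names can be computed at run time, the literal text never has to appear in the script, so restricting what the engine exposes is likely a sturdier approach than scanning the source.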
The internet is full of hints and tips regarding JavaScript security issues and how to avoid them. On the one hand, I have the feeling that none of them apply here (explanation below). On the other hand, I don't know JavaScript well - I have never used it for anything other than pure, side-effect-free calculations.
So I am looking for some guidance here: What kind of security issues could be present in this scenario?
Why I cannot see any security issues in this scenario:
This is pure JavaScript. It does not even allow creating Blobs (which are part of WebAPI, not JavaScript) which could be used to e.g. create a file on disk. I understand that JavaScript does not contain any functionality to escape its sandbox (like file access, threads, streams...), it is merely able to manipulate the data that is given into its sandbox. See this part of https://262.ecma-international.org/11.0/#sec-overview:
ECMAScript is an object-oriented programming language for performing
computations and manipulating computational objects within a host
environment. ECMAScript as defined here is not intended to be
computationally self-sufficient; indeed, there are no provisions in
this specification for input of external data or output of computed
results. Instead, it is expected that the computational environment of
an ECMAScript program will provide not only the objects and other
facilities described in this specification but also certain
environment-specific objects, whose description and behaviour are
beyond the scope of this specification except to indicate that they
may provide certain properties that can be accessed and certain
functions that can be called from an ECMAScript program.
The sandbox in our scenario only gets some harmless toys (i.e. given variables of primitive types or strings) put into it, and after the child has played with them (the script has run), the resulting buildings (user-defined variables) are taken out of it to preserve them (used inside Java program).

(1) Code running in a virtual machine might be able to escape. Even for well-known JS implementations such as V8 this happens regularly. If you run untrusted code on your server, you are vulnerable whenever such a vulnerability becomes known. You should definitely prepare for that: do a risk assessment (e.g. which other data is accessible on the (virtual) machine the engine runs on: other customers' data? secrets?) and additionally harden your infrastructure against it.
(2) Does it halt? What happens if a customer runs while(true); ? Does that crash your server? One can defend against that by killing the execution after a certain timeout (don't try to validate the code, this will never work reliably).
(3) Are the resources used limited (memory)? With a = ""; while(true) a += "memory"; one can easily allocate a lot of memory, with negative impact on other programs. One should make sure that also the memory usage is limited in such a way that the program is killed before resources are exhausted.
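The timeout idea in (2) can be sketched as follows. This assumes Node.js and its built-in vm module rather than the Java hosts from the question, so it only illustrates the principle; runWithTimeout is a hypothetical helper name. Note that the vm module is not a security boundary by itself, and this sketch does nothing about the memory concern in (3).

```javascript
const vm = require("vm");

// Run `code` in a fresh context and abort it after `timeoutMs` milliseconds.
function runWithTimeout(code, timeoutMs) {
  try {
    const result = vm.runInNewContext(code, {}, { timeout: timeoutMs });
    return { ok: true, result };
  } catch (err) {
    // A timeout (or any error thrown by the script) ends up here.
    return { ok: false, error: err.code || err.name };
  }
}

const finite = runWithTimeout("1 + 2", 100);
const endless = runWithTimeout("while (true);", 100);

console.log(finite.ok, finite.result); // true 3
console.log(endless.ok);               // false
```

The endless loop is interrupted by the engine instead of being detected in advance, which matches the advice not to try validating the code.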

Just some thoughts. You're essentially asking whether you can trust your sandbox/virtual machine. For that you either have to assume that you're using a good one or, to be really sure, read through all of its source code yourself. If you choose a trusted and well-known sandbox, I'd guess you can just trust it (JavaScript shouldn't be able to affect file system stuff outside of it).
On the other hand, why aren't you just doing all these calculations client side and then sending the result to your backend? It seems like a lot of setup just to be able to run JavaScript server side. If the argument for this is "not cheating" or something similar, then you can't avoid that even if the code is sent to the server (you have no idea who's sending you that JavaScript). In my opinion, doing this setup just to run it server side doesn't make sense; just run it client side.
If you do need to use it server side then you need to consider if your java is running with root permissions (in which case it will likely also invoke the sandbox with root permissions). On my setup my nodejs is executing under ~/home so even if a worst case happens and someone manages to delete everything the worst they can do is wipe out the home directory. If you're running javascript server side then I'd strongly suggest at the very least never do so under root. It shouldn't be able to do anything outside that sandbox but at least then even in the worst case it can't wipe out your server.
Something else I'd consider (since I have no idea what your sandbox allows or limits) is whether you can request and make API calls with javascript in that sandbox (or anything similar), because if it's running under root and allows that it would give someone root access to your infrastructure (your infrastructure thinking it's your server making requests when it's actually malicious JS code).
You could also make a mistake or start up your VM with an incorrect argument or missing config option and it suddenly allows a vulnerability without you being aware of it, so you'll have to make sure you're setting it up correctly.
Something else is that if you ever store that JS in some database, instead of just executing it, then you have to make sure that it's not made directly available to any other users without checking it otherwise you'd have XSS happening. For example you build an app for "coding tests" and store the result of their test in a database, then you want to show that result to a potential employer, if you just directly display that result to them you'll execute malicious code in their browser.
But I don't really see a reason why you should care about any of this, just run it client side.

privilege dropping possible in Javascript?

I wonder if it would be possible to drop privileges in Javascript?
function takeAwaySetTimeout()
{
    var oldSetTimeout = window.setTimeout;
    window.setTimeout = function()
    {
        console.log("not working anymore!");
    };
}
setTimeout("console.log('this works');",0); // "this works"
takeAwaySetTimeout();
setTimeout("console.log('this works');",0); // "not working anymore!"
Unfortunately, this seems complicated to me, as a simple delete window.setTimeout will bring back the privilege! To me this indicates that JavaScript unfortunately does not provide a way to take privileges away.
I am aware that the term privilege is somewhat borrowed.
The background to the question is that I am looking for any way to reliably make a [native code] function (that is, one that .toSource() shows to be provided by the JavaScript engine) inaccessible to some part of the code, so that the code subject to this limitation (whose privileges are dropped) becomes less of a safety concern.
clarification
I welcome your request for more clarity, and hope you will respond to it responsibly and take the question off hold!
Please also consider that the two helpful answers already received show that there are people who were able to understand the question. Still, where possible (some things simply need background...), I will also strive to make it easier to understand.
"It's a bit unclear how restricting access to native functions would give more security. Is
this server-side or client-side JavaScript?"
1) It does not matter very much whether it is client-side or server-side. Server-side is perhaps a little more important, because there is likely more functionality available (e.g. writing to and accessing files) than the more limited JavaScript inside a browser offers (but consider the power, and risks, of new APIs!).
2) Maybe the choice of window.setTimeout() [native function] is not perfect (for clarity), as no direct security relationship is obvious. It was used because it is well known and serves as a placeholder; see (3).
3) My reasoning is that each piece of functionality provided to JavaScript code is ambivalent. On the pro side, it enriches what the code can do, and well-meaning code will use it responsibly. On the con side, functionality can mean access to things that cause security problems when abused. An example would be an external JavaScript that does an XHR and posts information to a server, potentially including private data (e.g. a customer's health state). If it were possible to take away the XHR object (or rather window.XMLHttpRequest), the chances of such abuse would be limited. Plainly: "you cannot shoot somebody without a gun!" XHR (maybe more clearly than setTimeout) is such a gun. If the "untrusted code" does not really need XHR, it is just good sense to take the risk away by dropping this privilege/functionality.
4) I think (also in the context of the replies) this question has evolved. I think it is clear, yet please post comment if not so. While initially Juhana said:
It's a bit unclear[...]
I understand that it was not totally unclear, and it may by now have reached enough clarity (please consider the helpful answers) to be taken off hold. Also, if you found the question interesting enough to put on hold, then now would be the time to find it interesting enough to upvote ;)

There are so many ways to get the original function and blacklisting can't work because you'll miss something e.g.
Window.prototype.setTimeout.call(window,'alert(1)');
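The same effect can be reproduced with plain objects, which may make it easier to see why overriding a method on an instance is not enough. In this sketch, host and proto are hypothetical stand-ins for window and its prototype; how a particular browser actually lays out setTimeout is not assumed here.

```javascript
// Stand-in for a host object whose methods live on its prototype.
const proto = {
  schedule(msg) {
    return "scheduled: " + msg;
  },
};
const host = Object.create(proto);

// "Drop the privilege" by shadowing the method on the instance.
host.schedule = function () {
  return "not working anymore!";
};
const afterOverride = host.schedule("a");

// Bypass 1: reach the original through the prototype.
const viaPrototype = Object.getPrototypeOf(host).schedule.call(host, "b");

// Bypass 2: delete the shadowing property; lookup falls back to the prototype.
delete host.schedule;
const afterDelete = host.schedule("c");

console.log(afterOverride); // "not working anymore!"
console.log(viaPrototype);  // "scheduled: b"
console.log(afterDelete);   // "scheduled: c"
```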
One solution to this problem is to create a whitelisted sandbox. I've created such a sandbox, called MentalJS. It rewrites all your code with a $ suffix (for example, alert(1) becomes alert$(1)), which lets you choose which functions/objects are allowed within the sandbox.
The code is available here:
http://code.google.com/p/mentaljs/
and a demo:
http://businessinfo.co.uk/labs/MentalJS/MentalJS.html

Google uses the Caja compiler in these situations. The unprivileged code is then run either in a server-side sandbox or (in browsers that support ES5) a client-side strict mode sandbox.

Easily detect when user alters DOM

Is it possible to easily detect DOM manipulation by the user?
When a user uses the console in any modern browser, he/she can manipulate the DOM in ways the developer did not intend.
I have a web app that is very much tied to the DOM being in certain states and should the user do anything to the DOM via a console, I'd like to be notified.
The answer:
Doesn't need to be browser agnostic
Doesn't need to be perfect. I fully understand that most, if not all, methods could be circumvented, but I'd like a good general solution.
Can't be too convoluted. I'm not interested in registering an event handler with all DOM events that checks some flag set when my code performs a DOM manipulation
Edit:
There appears to be some confusion in the answers I've received thus far. As pointed out in #2 above, I understand that most, if not all, methods can be circumvented.
In addition, this is an internal tool and thus is protected by a VPN. Furthermore, there is server-side checking. However, there are reasons, which I cannot elaborate upon, for me wanting to know when a user (of whom there are few) has manipulated the DOM.
To be clear, this isn't for security reasons. I'm not trying to stop malicious users here. Think of this more as out of curiosity.

Don't do that. Code your web site to not trust user input and then don't care what the user does. If invalid input is submitted then reject it. Everyone is happy.
It's easy to think that you own the user's browser. You don't. It's serving you but only at the whim of the user.
If you really must know when the DOM is modified--and this seems a really fragile design--then just do what amounts to calculating checksums. After each legitimate step of the site's approved function, traverse the DOM elements you care about and record their positions, values, or whatever you are concerned with. At intervals, at validation time, or on the next UI interaction, compare. This is the only comprehensive, cross-browser (including old browsers) way to detect DOM changes. Modern browsers offer DOM mutation events (see Tim Down's answer for more detail), but these have limited support and will apparently be replaced with yet another new thing anyway.
Ultimately, nothing you do can stop someone determined to defeat your scheme. If anything, the user can copy the browser's POST request using Firebug, tweak it, and write a tiny program to submit his own malicious POST request. It is more important to protect your server from malicious input than it is to make your web page supposedly bullet-proof (because it won't be).
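The checksum approach described above could look roughly like this. It is a minimal sketch that uses each element's outerHTML as its fingerprint; takeSnapshot, findChanges and the [data-watch] selector are hypothetical names. outerHTML reflects attributes and content but not live properties such as the current value of an input, so those would need to be recorded separately.

```javascript
// Record a fingerprint of each element we care about.
function takeSnapshot(elements) {
  return elements.map((el) => el.outerHTML);
}

// Return the indexes of elements whose markup differs from the snapshot.
function findChanges(elements, snapshot) {
  const changed = [];
  elements.forEach((el, i) => {
    if (el.outerHTML !== snapshot[i]) changed.push(i);
  });
  return changed;
}

// Browser usage (assumes the watched elements carry a data-watch attribute):
// const watched = Array.from(document.querySelectorAll("[data-watch]"));
// let snapshot = takeSnapshot(watched);   // after each legitimate step
// setInterval(() => {
//   if (findChanges(watched, snapshot).length > 0) console.log("DOM was altered");
// }, 1000);
```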

DOM mutation events work in current versions of all major browsers and do what you want. The following will cover common DOM modifications within the whole document:
function handleDomChange(evt) {
    console.log("DOM changed via event of type " + evt.type);
}
document.addEventListener("DOMNodeInserted", handleDomChange, false);
document.addEventListener("DOMNodeRemoved", handleDomChange, false);
document.addEventListener("DOMCharacterDataModified", handleDomChange, false);
DOM mutation events will eventually be replaced by mutation observers, which are implemented in recent Mozilla and WebKit browsers.
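For comparison, a mutation-observer version might look like the sketch below. watchDom is a hypothetical helper name; MutationObserver and its observe() options are the standard browser API. Like mutation events, it reports every change, whether made by your own scripts or by a user in the console, so telling the two apart is still up to you.

```javascript
// Report DOM changes under `root` to `onChange` using a MutationObserver.
function watchDom(root, onChange) {
  const observer = new MutationObserver((records) => {
    for (const record of records) {
      onChange(record.type, record.target);
    }
  });
  observer.observe(root, {
    childList: true,      // nodes added or removed
    attributes: true,     // attribute changes
    characterData: true,  // text changes
    subtree: true,        // watch all descendants of root
  });
  return observer; // call observer.disconnect() to stop watching
}

// Browser usage:
// watchDom(document.documentElement, (type) => {
//   console.log("DOM changed via mutation of type " + type);
// });
```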

Relying on a script to prevent or counteract malicious edits to the DOM is not the right approach. What exactly are you doing that depends on the DOM not being touched? Seems like that's a huge red flag in and of itself.

This is a pretty interesting question, and I think DOM mutation events may be the best solution. One thing I was initially thinking I might do is run a timed function that checks the DOM for specific modules, based on data- attributes or IDs. If I was building my page entirely client-side through JS, I would have a build configuration object for each module, where a module is a DOM element like:
<div id='weather-widget' data-module-type='widget'>
<h1 data-module-name='weather'>Weather</h1>
<!-- etc etc -->
</div>
Anyhow, my config object would contain all of these things like module type, module name, etc, etc:
//Widget configuration object
var weatherWidgetConfig = {
    type: 'widget',
    name: 'weather'
}
and I would inspect the DOM element and all of its children to make sure the data- attributes still matched the configuration object, that they existed, and that they have not been changed. If they have, I would call a module.destroy() and module.build() again with the correct configuration.
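A rough sketch of that check, assuming the markup and configuration object shown above: matchesConfig is a hypothetical helper, and the module.destroy()/module.build() calls are the ones described in the text, not an existing library.

```javascript
// Check that a module's data- attributes still match its configuration.
function matchesConfig(el, config) {
  const heading = el.querySelector("[data-module-name]");
  return (
    el.getAttribute("data-module-type") === config.type &&
    heading !== null &&
    heading.getAttribute("data-module-name") === config.name
  );
}

// Browser usage, run on a timer:
// var el = document.getElementById("weather-widget");
// if (!el || !matchesConfig(el, weatherWidgetConfig)) {
//   module.destroy();
//   module.build();
// }
```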

I've received a lot of answers in which the respondent delivers advice about how to build a web app. While that may be useful to some readers, it doesn't answer the question. Some, however, have attempted to answer. The closest I have seen to a complete answer was given by @Keith. The only problem is that it fails the 'easy' test.
It appears that the correct answer, as some have said, is NO - it isn't possible to easily detect DOM manipulation by a user.

I recently discovered "Selector Listener", a technique that relies on CSS to detect DOM changes. It doesn't work in IE 9 and below. Applying it to the whole DOM doesn't sound like a good idea; the intent is rather to work with specific selectors.
More details can be found in this blog post.

How do I safely "eval" user code in a webpage?

I'm working on a webapp to teach programming concepts. Webpages have some text about a programming concept, then let the user type in javascript code into a text editor window to try to answer a programming problem. When the user clicks "submit", I analyse the text they've typed to see if they have solved the problem. For example, I ask them to "write a function named f that adds three to its argument".
Here's what I'm doing to analyse the user's text:
Run JSLint on the text with strict settings, in particular without assuming browser or console functions.
If there are any errors, show the errors and stop.
eval(usertext);
Loop through conditions for passing the assignment, eval(condition). An example condition is "f(1)===4". Conditions come from trusted source.
Show passing/failing conditions.
My questions: is this good enough to prevent security problems? What else can I do to be paranoid? Is there a better way to do what I want?
In case it is relevant my application is on Google App Engine with Python backend, uses JQuery, has individual user accounts.

So from what I can tell, if you are eval'ing a user's input only for them, this isn't a security problem. Only if their input is eval'd for other users do you have a problem.
Eval'ing a user's input is no worse than them viewing source, looking at HTTP headers, using Firebug to inspect JavaScript objects, etc. They already have access to everything.
That being said if you do need to secure their code, check out Google Caja http://code.google.com/p/google-caja/

This is a trick question. There is no secure way to eval() user's code on your website.

Not clear if the eval() occurs on client or server side. For client side:
I think it's possible to eval safely in a well-configured iframe (https://www.html5rocks.com/en/tutorials/security/sandboxed-iframes/)
This should be 100% safe, but needs a couple of libraries and has some limitations (no ES6 support): https://github.com/NeilFraser/JS-Interpreter
There are lighter alternatives that are not 100% safe, like https://github.com/commenthol/safer-eval.
Alternatively, I think something similar can be implemented manually by wrapping the code in a with statement and overriding this, globals and arguments. Although it will never be 100% safe, it may be viable in your case.
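One possible shape of the manual with-statement approach is sketched below. runShadowed is a hypothetical helper; it evaluates a single expression, uses a Proxy so that every identifier lookup is answered from an allow-list instead of the real globals, and, as noted above, is not 100% safe. The last line shows one well-known way out.

```javascript
// Evaluate the expression `code` so that bare identifiers resolve only
// against `allowed`, never against the real global object.
function runShadowed(code, allowed) {
  const scope = new Proxy(allowed, {
    has: () => true,                 // claim every name, so lookups stop here
    get: (target, key) => target[key],
  });
  // Function bodies are non-strict by default, so `with` is permitted.
  const fn = new Function("scope", "with (scope) { return (" + code + "); }");
  return fn.call({}, scope);         // pass a dummy object as `this`
}

const sum = runShadowed("x + 1", { x: 2 });
const hidden = runShadowed("typeof globalThis", {});
// Escape hatch: the Function constructor is still reachable via any function.
const leaked = runShadowed('(function(){}).constructor("return typeof globalThis")()', {});

console.log(sum);    // 3
console.log(hidden); // "undefined"
console.log(leaked); // "object"
```

Closing that hole (and others like it) is what the interpreter and iframe options above are for.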

It can't be done. Browsers offer no API to web pages to restrict what sort of code can be executed within a given context.
However, that might not matter. If you don't use any cookies whatsoever on your website, then executing arbitrary Javascript may not be a problem. After all, if there is no concept of authentication, then there's no problem with forging requests. Additionally, if you can confirm that the user meant to execute the script he/she sent, then you should also be protected from attackers, e.g., if you will only run script typed onto the page and never script submitted via GET or POST data, or if you include some kind of unique token with those requests to confirm that the request originated with your website.
Still, the answer to the core question is pretty much that it can't be done, and that user input can never be trusted. Sorry :/

Your biggest issue will always be preventing infinite loops from occurring in user-provided code. You may be able to hide "private" references by running eval in the right context, e.g.:
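let userInput = getUserInput();
setTimeout(() => {
    // Shadow global references so the eval'd code sees null instead.
    let window = null;
    let global = null;
    // Note: `this` is a keyword and cannot be redeclared with `let`.
    // ... set any additional references to `null`
    eval(userInput);
}, 0);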
And you could wrap the above code in a try/catch to prevent syntax and logic errors from crashing outside of the controlled eval scope, but you will (provably) never be able to detect whether incoming user input defines an infinite loop that will tie up javascript's single thread, rendering its runtime context completely stalled. The only solution to a problem like this is to define your own javascript interpreter, use it to process the user's input, and provide a mechanism to limit the number of steps your javascript interpreter is willing to take. That would be a lot of trouble!

How Can I Modify/"Spoof" Standard Browser JS DOM Objects (Window.location) at Runtime?

I'd like to dynamically change some of the standard JS DOM objects from within a web browser.
For instance, when I execute:
var site = location;
I want to specify a new value for my browser's "window.location" object other than the "correct" one (the URL used to access the requested page) at run time, either through a debugger-like interface or even programmatically if need be.
Although Firebug advertises the capability to do something similar via its "DOM Inspector," whenever I try to modify any of the DOM values while I've paused the Javascript via its debugger, it simply ignores the new value I enter. After doing some research, it seems that this is a known issue according to this bug report: http://code.google.com/p/fbug/issues/detail?id=1707 .
Theoretically, I could write a program to simply open up an HTTP socket and emulate a browser "user agent," but this seems like a lot of trouble for my purposes. While I'm asking, does anyone know a good Java/C# library with functions/objects that emulate HTTP headers and parse the received HTML/JS? I've long dreamt about the existence of such a library but most of the ones I've tried (Java's Apache HttpClient, C#'s System.Net.HttpWebRequest) are far too low-level to make anything worthwhile with minimal planning and a short period of time.
Thanks in advance for recommendations and advice you can provide!

Not sure if I understand you correctly, but if you want to change the loaded URL you can do that by setting window.location.href.
If your intent is to replace DOM built-ins, then you will be sad to hear that most built-in objects (host objects) aren't regular JavaScript objects and their behaviour is not clearly defined. Some browsers may allow you to replace and/or extend some objects, while in other browsers they won't be replaceable/extendable at all.
If you want to "script a browser" using JavaScript, you should definitely have a look at node.js and its http module. There's also a third-party module called html5 that simulates the DOM in node.js and even allows the usage of jQuery.
