How Do I Sanitize JS eval Input?

How Do I Sanitize JS eval Input? - javascript

a="79 * 2245 + (79 * 2 - 7)";
b="";
c=["1","2","3","4","5","6","7","8","9","0","+","-","/","*"];
for (i=1;i<a.length;i++){
for (ii=1;i<c.length;i++){
b=(a.substring(0,i))+(c[ii])+(a.substring(i+1,a.length));
alert(eval(b.replace(" ","")));
}
}
I need to find out how to make it so that when I use eval, I know that the input will not stop the script, and if it would normally crash the script to just ignore it. I understand that eval is not a good function to use, but I want a quick and simple method by which I can solve this. The above code tries to output all of the answers with all of the possible replacements for any digit, sign or space in the above. i represents the distance through which it has gone in the string and ii represents the symbol that it is currently checking. a is the original problem and b is the modified problem.

Try catching the exception eval might throw, like this:
try{
alert(eval(b.replace(" ","")));
} catch (e){
//alert(e);
}

You can check for a few special cases and avoid some behaviors with regex or the like, but there is definitely no way to 'if it would normally crash just ignore it'
That is akin to the halting problem, as mellamokb refers to. And theres no way to know ipositively f a script runs to completion besides running it.
One should be very careful to vet any strings that go to eval, and keep user input out of them as much as possibl except for real simple and verifiable things like an integer value. If you can find a way around eval altogether than all the better.
For the calculation example you show its probably best to parse it properly into tokens and go from there rather than evaluate in string form.
PS - if you really want to check out these one-off's to the expression in a, it is a somewhat interesting use of eval eespite its faults. cam you explain why you are trimming the whitespace imediately before evaluation? i dont believe i can think of a situation where it effects the results. for (at least most) valid expressions it makes no difference, and while it might alter some of the invalid cases i cant think of a case where it does so meaningfully

Related

Replacement of "-' with "_" using a loop in javascript

I have applied the same method to replace "-" with "_" in c++ and it is working properly but in javascript, it is not replacing at all.
function replace(str)
{
for(var i=0;i<str.length;i++)
{
if(str.charAt(i)=="-")
{
str.charAt(i) = "_";
}
}
return str;
}

It's simple enough in javascript, it's not really worth making a new function:
function replace(str){
return str.replace(/-/g, '_') // uses regular expression with 'g' flag for 'global'
}
console.log(replace("hello-this-is-a-string"))
This does not alter the original string, however, because strings in javascript are immutable.
If you are dead set on avoiding the builtin(maybe you want to do more complex processing), reduce() can be useful:
function replace(str){
return [...str].reduce((s, letter) => s += letter == '-' ? '_' : letter , '')
}
console.log(replace("hello-this-is-a-string"))

This is yet another case of "make sure you read the error message". In the case of
str.charAt(i) = "_";
the correct description of what happens is not "it is not replacing at all", as you would have it; it is "it generates a run-time JavaScript error", which in Chrome is worded as
Uncaught ReferenceError: Invalid left-hand side in assignment
In Firefox, it is
ReferenceError: invalid assignment left-hand side
That should have given you the clue you needed to track down your problem.
I repeat: read error messages closely. Then read them again, and again. In the great majority of cases, they tell you exactly what the problem is (if you only take the time to try to understand them), or at least point you in right direction.
Of course, reading the error message assumes you know how to view the error message. Do you? In most browsers, a development tools window is available--often via F12--which contains a "console", displaying error messages. If you don't know about devtools, or the console, then make sure you understand them before you write a single line of code.
Or, you could have simply searched on Stack Overflow, since (almost) all questions have already been asked there, especially those from newcomers. For example, I searched in Google for
can i use charat in javascript to replace a character in a string?
and the first result was How do I replace a character at a particular index in JavaScript?, which has over 400 upvotes, as does the first and accepted answer, which reads:
In JavaScript, strings are immutable, which means the best you can do is create a new string with the changed content, and assign the variable to point to it.
As you learn to program, or learn a new languages, you will inevitably run into things you don't know, or things that confuse you. What to do? Posting to Stack Overflow is almost always the worst alternative. After all, as you know, it's not a tutorial site, and it's not a help site, and it's not a forum. It's a Q&A site for interesting programming questions.
In the best case, you'll get snarky comments and answers which will ruin your day; in the worst case, you'll get down-voted, and close-voted, which is not just embarrassing, but may actually prevent you from asking questions in the future. Since you want to make sure you are able to ask questions when you really need to, you are best off taking much more time doing research, including Google and SO searches, on simple beginner questions before posting. Or find a forum which is designed to help new folks. Or ask the person next to you if there is one. Or run through one or more tutorials.
But why write it yourself at all?
However, unless you are working on this problem as a way of teaching yourself JavaScript, as a kind of training exercise, there is no reason to write it at all. It has already been written hundreds, or probably thousands, of times in the history of computing. And the overwhelming majority of those implementations are going to be better, faster, cleaner, less buggy, and more featureful than whatever you will write. So your job as a "programmer" is not to write something that converts dashes to underscores; it's to find and use something that does.
As the wise man said, today we don't write algorithms any more; we string together API calls. Our job is to find, and understand, the APIs to call.
Finding the API is not at all hard with Google. In this case, it could be helpful if you knew that strings with underscores are sometimes called "snake-cased", but even without knowing that you can find something on the first page of Google results with a query such as "javascript convert string to use underscores library".
The very first result, and the one you should take a look at, is underscore.string, a collection of string utilities written in the spirit of the versatile "underscore" library, and designed to be used with it. It provides an API called underscored. In addition to dealing with "dasherized" input (your case), it also handles other string/identifier formats such as "camelCase".
Even if you don't want to import this particular library and use it (which you probably should), you would be much better off stealing or cloning its code, which in simplified form is:
str
.replace(/([a-z\d])([A-Z]+)/g, '$1_$2')
.replace(/[-\s]+/g, '_')
This is not as complicated as it looks. The first part is to deal with camelCase input, so that "catsAndDogs" becomes "cats-and-dogs". the second line is what handles the dashes and spaces). Look closely--it replaces runs of multiple dashes with a single underscore. That could easily be something that you want to do too, depending on who is doing what with the transformed string downstream. That's a perfect example of something that someone else writing a professional-level library for this task has thought of that you might not when writing your home-grown solution.
Note that this well-regarded, finely-turned library does not use the split/join trick to do global replaces in strings, as another answer suggests. That approach went out of use almost a decade ago.
Besides saving yourself the trouble of writing it yourself, and ending up with better code, if you take time time to understand what it's doing, you will also end up knowing more about regular expressions.

You can easily replace complete string using .split() and .join().
function replace(str){
return str.split("-").join("_");
}
console.log(replace("hello-this-is-a-string"))

Javascript safely use user regex

I want to safely use regexp that the user inputs in order to parse text. I don't want to use a sandbox or anything. Can I just plug-in the regexp into String.match or would that cause problems? If not how might this be avoided?
The usecase is that of a writer who has a lot of text. The writer will transform that text with various regexes. Other users will run that author's regexes in order to get the intended output.

This should not cause any security errors. Worse case scenario you are using an invalid regEx string. Unless I am misunderstanding the question.
EDIT Worse case scenario is locking up your own browser. Thanks Anirudh for bringing this point up.

User Expression -> Same User
If you have the user input in a string, you can pass that directly to the constructor.
demo
var s = "This is my text";
$('input#expression').keyup(function(){
var e = new RegExp($(this).val());
$('#match').text(
JSON.stringify(s.match(e))
);
}).trigger("keyup");
It's much better than eval. There aren't any security problems that I know of. If there are any, it's probably a bug in the browser.
User Expression -> Other User(s)
As pointed out in the comments, it would be very complicated to differentiate between safe, and computationally intense/infeasible expressions in JavaScript.
If you don't trust the users to play nice, don't let them run expressions on each other's computers. At the very least, make sure user data is saved, in the case of the page needing to be refreshed, and don't run them without them being explicitly being invoked by the user.

If you are using Javascript it is running on their machine - so the sandbox is not involved.
They are free to mess up their machine as much as they want! Besides the worst that can happen is that the script breaks on their browser.

Javascript odd StackOverflow error

I was wondering about the working of parentheses in Javascript, so I wrote this code to test:
((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((
4+4
))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
Which consists in:
( x1174
4+4
) x1174
I tested the code above on Google Chrome 20 (Win64), and it gives me the right answer (8).
But if I try the same code, but with 1175 parentheses (on both sides), I get a stackoverflow error.
You can check this code in JSFiddle (Note: in JSFiddle it stops working with 1178 parentheses)
So, my questions are:
Why does it happen?
Why does it stop working with 1178 parentheses on JSFiddle but with only 1175 on my blank page?
Does this error depend of the page/browser/os?

Often languages are parsed by code designed along a pattern called recursive descent. I don't know for sure that that's the case here, but certainly the "stack overflow" error is a big piece of evidence.
The idea is that to parse an expression, you approach the syntax by looking at what an expression can be. A parenthesized expression is like an "expression within an expression". Thus, for a parser to systematically parse some expression in code it's seeing for the first time (which, for a parser, is its eternal fate), a left parenthesis means "ok - hold on to what you're doing (on the stack), and go start from the beginning of what an expression can possibly be and parse a fresh, complete expression, and come back when you see the matching close paren".
Thus, a string of a thousand or more parenthesis triggers an equivalent cascade of that same activity: put what we've got on a shelf; dive in and get a sub-expression, and then resume when we know what it looks like.
Now this is not the only way to parse something, it should be noted. There are many ways. I personally am a huge fan of recursive descent parsing, but there's nothing particularly special about it (except that I think it'll someday result in me seeing a real unicorn).

The behavior is different in different browsers because they have different implementations of Javascript. The language doesn't specify how something like this should fail, so each implementation fails in a different way.
The difference between JSFiddle and your blank page is because JSFiddle itself uses a few stack frames to establish the environment in which to run your code.

How to let user enter a JavaScript function securely?

What I'm doing is creating a truth table generator. Using a function provided by the user (e.g. a && b || c), I'm trying to make JavaScript display all combinations of a, b and c with the result of the function.
The point is that I'm not entirely sure how to parse the function provided by the user. The user could namely basically put everything he wants in a function, which could have the effect of my website being changed etc.
eval() is not secure at all; neither is new Function(), as both can make the user put everything to his liking in the function. Usually JSON.parse() is a great alternative to eval(), however functions don't exist in JSON.
So I was wondering how I can parse a custom boolean operator string like a && b || c to a function, whilst any malicious code strings are ignored. Only boolean operators (&&, ||, !) should be allowed inside the function.

Even if you check for boolean expressions, I could do this:
(a && b && (function() { ruinYourShit(); return true; })())
This is to illustrate that your problem is not solvable in the general case.
To make it work for your situation, you'd have to impose severe restrictions on variable naming, like requiring that all variables are a single alphabet letter, and then use a regular expression to kick it back if anything else is found. Also, to "match" a boolean expression, you'd actually have to develop a grammar for that expression. You are attempting to parse a non-regular language (javascript) and therefore you cannot write a regular expression that could match every possible boolean expression of arbitrary complexity. Basically, the problem you are trying to solve is really, really hard.
Assuming you aren't expecting the world to come crashing down if someone ruins your shit, you can develop a good enough solution by simply checking for the function keyword, and disallowing any logical code blocks contained in { and }.

I don't see the problem with using eval() and letting the user do whatever (s)he wants, as long as you have proper validation/filters on any of your server-side scripts that expect input. Sure, the user could do something that ruins your page, but (s)he can do that already. The user can easily enough run any javascript (s)he wants on your page already, using any number built-in browser features or add-ons.

What I would do would be to first split the user input into tokens (variable names, operators and possibly parentheses for grouping), then build an expression tree from that and then generate the relevant output from the expression tree (possibly after running some simplification on it).
So, say, break the string "a & !b" into the token sequence "a" "&" "!" "b", then go through and (eventually) build something like: booland(boolvar("a"), boolnot(boolvar("b"))) and you then have a suitable data structure to run your (custom, hopefully injection-free) evaluator over.

I ended up with a regular expression:
if(!/^[a-zA-Z\|\&\!\(\)\ ]+$/.test(str)) {
throw "The function is not a combination of Boolean operators.";
return;
}

Do you recommend using semicolons after every statement in JavaScript?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
In many situations, JavaScript parsers will insert semicolons for you if you leave them out. My question is, do you leave them out?
If you're unfamiliar with the rules, there's a description of semicolon insertion on the Mozilla site. Here's the key point:
If the first through the nth tokens of a JavaScript program form are grammatically valid but the first through the n+1st tokens are not and there is a line break between the nth tokens and the n+1st tokens, then the parser tries to parse the program again after inserting a virtual semicolon token between the nth and the n+1st tokens.
That description may be incomplete, because it doesn't explain #Dreas's example. Anybody have a link to the complete rules, or see why the example gets a semicolon? (I tried it in JScript.NET.)
This stackoverflow question is related, but only talks about a specific scenario.

Yes, you should use semicolons after every statement in JavaScript.

An ambiguous case that breaks in the absence of a semicolon:
// define a function
var fn = function () {
//...
} // semicolon missing at this line
// then execute some code inside a closure
(function () {
//...
})();
This will be interpreted as:
var fn = function () {
//...
}(function () {
//...
})();
We end up passing the second function as an argument to the first function and then trying to call the result of the first function call as a function. The second function will fail with a "... is not a function" error at runtime.

Yes, you should always use semicolons. Why? Because if you end up using a JavaScript compressor, all your code will be on one line, which will break your code.
Try http://www.jslint.com/; it will hurt your feelings, but show you many ways to write better JavaScript (and one of the ways is to always use semicolons).

What everyone seems to miss is that the semi-colons in JavaScript are not statement terminators but statement separators. It's a subtle difference, but it is important to the way the parser is programmed. Treat them like what they are and you will find leaving them out will feel much more natural.
I've programmed in other languages where the semi-colon is a statement separator and also optional as the parser does 'semi-colon insertion' on newlines where it does not break the grammar. So I was not unfamiliar with it when I found it in JavaScript.
I don't like noise in a language (which is one reason I'm bad at Perl) and semi-colons are noise in JavaScript. So I omit them.

I'd say consistency is more important than saving a few bytes. I always include semicolons.
On the other hand, I'd like to point out there are many places where the semicolon is not syntactically required, even if a compressor is nuking all available whitespace. e.g. at then end of a block.
if (a) { b() }

JavaScript automatically inserts semicolons whilst interpreting your code, so if you put the value of the return statement below the line, it won't be returned:
Your Code:
return
5
JavaScript Interpretation:
return;
5;
Thus, nothing is returned, because of JavaScript's auto semicolon insertion

I think this is similar to what the last podcast discussed. The "Be liberal in what you accept" means that extra work had to be put into the Javascript parser to fix cases where semicolons were left out. Now we have a boatload of pages out there floating around with bad syntax, that might break one day in the future when some browser decides to be a little more stringent on what it accepts. This type of rule should also apply to HTML and CSS. You can write broken HTML and CSS, but don't be surprise when you get weird and hard to debug behaviors when some browser doesn't properly interpret your incorrect code.

The article Semicolons in JavaScript are optional makes some really good points about not using semi colons in Javascript. It deals with all the points have been brought up by the answers to this question.

This is the very best explanation of automatic semicolon insertion that I've found anywhere. It will clear away all your uncertainty and doubt.

I use semicolon, since it is my habit.
Now I understand why I can't have string split into two lines... it puts semicolon at the end of each line.

No, only use semicolons when they're required.

We Keep Coding

JavaScript is the programming language of the Web.