Math.pow alternative "**" ES7 polyfill for IE11 - javascript

I'm trying to evaluate an expression that contains the power operator, written as ** in a string, e.g. eval("(22**3)/12*6+3/2"). The problem is that Internet Explorer 11 does not recognize this and throws a syntax error. Which polyfill should I use to overcome this? Right now I'm using Modernizr 2.6.2.
Example equations would be:
((1*2)*((3*(4*5)*(1+3)**(4*5))/((1+3)**(4*5)-1)-1)/6)/7
((1*2)*((3*(4*5)*(1+3)**(4*5))/((1+3)**(4*5)-1)-1)/6)/7*58+2*5
(4*5+4-5.5*5.21+14*36**2+69/0.258+2)/(12+65)
If it is not possible to do this, what are the possible alternatives?

You cannot polyfill operators - only library members (prototypes, constructors, properties).
As your operation is confined to an eval call, you could attempt to write your own expression parser, but that would be a lot of work.
(As an aside, you shouldn't be using eval anyway, for very good reasons that I won't get into in this posting).
Another (hack-ish) option is to use a regular expression to identify trivial cases of x**y and convert them to Math.pow:
function detectAndFixTrivialPow( expressionString ) {
    // The g flag replaces every trivial x**y, not just the first one.
    var pattern = /(\w+)\*\*(\w+)/g;
    var fixed = expressionString.replace( pattern, 'Math.pow($1,$2)' );
    return fixed;
}
eval( detectAndFixTrivialPow( "foo**bar" ) );

You can use a regular expression to replace the occurrences of ** with Math.pow() invocations:
let expression = "(22**3)/12*6+3/2"
let processed = expression.replace(/(\w+)\*\*(\w+)/g, 'Math.pow($1,$2)');
console.log(processed);
console.log(eval(processed));
Things might get complicated if you start using nested or chained power expressions though.

I think you need to do some preprocessing of the input. Here is how I would approach this:
Find "**" in string.
Check what is on the left and right.
Extract "full expressions" from left and right - if there is just a number - take it as is, and if there is a bracket - find the matching one and take whatever is inside as an expression.
Replace the 2 expressions with Math.pow(left, right)
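The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names powToMathPow, operandStart and operandEnd are made up for this example, the input is assumed to be a plain arithmetic expression with balanced parentheses, and whitespace is stripped first. It works from the rightmost ** so that chained powers keep their right-to-left grouping, and it uses var so that it can run in IE11.

```javascript
// Index where the operand that ends just before position `end` starts.
function operandStart(s, end) {
    var i = end - 1;
    if (s[i] === ")") {
        // Walk back to the matching "(".
        var depth = 0;
        for (; i >= 0; i--) {
            if (s[i] === ")") depth++;
            else if (s[i] === "(" && --depth === 0) break;
        }
        i--; // step to the character before "("
    }
    // Consume a number or a name such as Math.pow in front of "(".
    while (i >= 0 && /[\w.]/.test(s[i])) i--;
    return i + 1;
}

// Index just past the operand that starts at position `start`.
function operandEnd(s, start) {
    var i = start;
    if (s[i] === "-" || s[i] === "+") i++; // unary sign on the exponent
    while (i < s.length && /[\w.]/.test(s[i])) i++;
    if (s[i] === "(") {
        // Walk forward to the matching ")".
        var depth = 0;
        for (; i < s.length; i++) {
            if (s[i] === "(") depth++;
            else if (s[i] === ")" && --depth === 0) break;
        }
        i++; // step past ")"
    }
    return i;
}

// Rewrite every left**right as Math.pow(left,right), rightmost first.
function powToMathPow(expression) {
    var s = expression.replace(/\s+/g, "");
    var at;
    while ((at = s.lastIndexOf("**")) !== -1) {
        var from = operandStart(s, at);
        var to = operandEnd(s, at + 2);
        s = s.slice(0, from) +
            "Math.pow(" + s.slice(from, at) + "," + s.slice(at + 2, to) + ")" +
            s.slice(to);
    }
    return s;
}

console.log(powToMathPow("(22**3)/12*6+3/2")); // (Math.pow(22,3))/12*6+3/2
console.log(powToMathPow("2**3**2"));          // Math.pow(2,Math.pow(3,2))
```

The result is still a string, so it would still have to go through eval (or a real expression parser) afterwards.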

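You can use the online Babel REPL to transpile JavaScript for IE 11; it rewrites the ** operator into Math.pow calls. Note that this only helps for source code that is transpiled ahead of time, not for expression strings that are built at runtime and passed to eval.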

Related

Can we trim all the variables we are using in javascript implicitly?

I have to explicitly use trim() for too many variables. Is there any way I can apply trim to all the string values in my code without calling trim explicitly?
Note:
I asked this question out of curiosity, to find out whether there is a way to do this. I don't even want to apply trim in all scenarios, but I do have to use trim for all the literals and variables I use in one app (I have such a requirement). So I wanted to know if there is a common place where I can make the change, instead of risking missing a few places.
No, there's nothing in JavaScript's strings that will globally enable automatic whitespace trimming for you. You'll have to do it when/where required, which should typically be only a few places (e.g., reading from inputs).
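One way to keep the trimming in a single place is to run incoming values through a small helper at the point where they are read. The sketch below is only an illustration: trimStrings is a hypothetical helper, and it assumes the values arrive as a flat object (for example, fields collected from a form).

```javascript
// Return a copy of `values` in which every string property is trimmed.
// Non-string properties are copied unchanged.
function trimStrings(values) {
    var result = {};
    Object.keys(values).forEach(function (key) {
        var value = values[key];
        result[key] = typeof value === "string" ? value.trim() : value;
    });
    return result;
}

var cleaned = trimStrings({ name: "  Ann ", city: "Oslo\n", age: 42 });
console.log(cleaned); // { name: 'Ann', city: 'Oslo', age: 42 }
```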
Here's a cool hack. would not recommend:
String.prototype.valueOf = function() {
return this.trim();
};
'query' + (new String(' asdf ')); // "queryasdf"

Why charAt() and charCodeAt() are called safe?

I was learning about javascript string methods here.
Under section Extracting String Characters, it said:
There are 2 safe methods for extracting string characters:
charAt(position)
charCodeAt(position)
The questions here are:
Why are these methods called safe?
What are these methods protecting from?
There are two ways to access a character from a string.
// Bracket Notation
"Test String1"[6]
// Real Implementation
"Test String1".charAt(6)
It is a bad idea to use brackets, for these reasons (Source):
This notation does not work in IE7. The first code snippet will return undefined in IE7. If you happen to use the bracket notation for strings all over your code and you want to migrate to .charAt(pos), this is a real pain: brackets are used all over your code and there's no easy way to detect if that's for a string or an array/object.
You can't set the character using this notation. As there is no warning of any kind, this is really confusing and frustrating. If you were using the .charAt(pos) function, you would not have been tempted to do it.
Also, it can produce unexpected results in edge cases
console.log('hello' [NaN]) // undefined
console.log('hello'.charAt(NaN)) // 'h'
console.log('hello' [true]) //undefined
console.log('hello'.charAt(true)) // 'e'
Basically, bracket notation is a shortcut that is not fully implemented across all browsers.
Note that you cannot write characters using either method. However, that restriction is easier to understand with the .charAt() function which, in most languages, is a read-only function.
So, for compatibility reasons, .charAt() is considered safe.
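As a small illustration on a modern engine (the variable names are only for this example), the two notations agree for valid positions but differ for positions outside the string: charAt() always returns a string, while brackets return undefined.

```javascript
var text = "Test String1";

// Valid position: both notations return the same character.
var inRangeBracket = text[6];        // "t"
var inRangeCharAt = text.charAt(6);  // "t"

// Position outside the string: the results differ.
var outOfRangeBracket = text[99];          // undefined
var outOfRangeCharAt = text.charAt(99);    // "" (still a string)
var outOfRangeCode = text.charCodeAt(99);  // NaN

console.log(inRangeBracket, inRangeCharAt);
console.log(outOfRangeBracket, JSON.stringify(outOfRangeCharAt), outOfRangeCode);
```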
Source
Speed test (http://jsperf.com/string-charat-vs-bracket-notation), run in Chrome 47.0.2526.80 on Mac OS X 10.10.4:
Test                     | Code                           | Ops/sec
String charAt            | testCharAt("cat", 1);          | 117,553,733 ±1.25% (fastest)
String bracket notation  | testBracketNotation("cat", 1); | 118,251,955 ±1.56% (fastest)

Negative lookahead Regular Expression

I want to match all strings ending in ".htm" unless it ends in "foo.htm". I'm generally decent with regular expressions, but negative lookaheads have me stumped. Why doesn't this work?
/(?!foo)\.htm$/i.test("/foo.htm"); // returns true. I want false.
What should I be using instead? I think I need a "negative lookbehind" expression (if JavaScript supported such a thing, which I know it doesn't).
The problem is pretty simple really. This will do it:
/^(?!.*foo\.htm$).*\.htm$/i.test("/foo.htm"); // returns false
What you are describing (your intention) is a negative look-behind, and Javascript has no support for look-behinds.
Look-aheads look forward from the position at which they are placed, and you've placed yours just before the dot. So what you've got is actually saying "anything ending in .htm as long as the first three characters starting at that position (.ht) are not foo", which is always true.
Usually, the substitute for negative look-behinds is to match more than you need, and extract only the part you actually do need. This is hacky, and depending on your precise situation you can probably come up with something else, but something like this:
// Checks that the 3 characters before the dot are not foo (this also requires at least 3 characters before the dot):
/(?!foo).{3}\.htm$/i.test("/foo.htm"); // returns false
As mentioned, JavaScript does not support negative look-behind assertions.
But you could use a workaround:
/(foo)?\.htm$/i.test("/foo.htm") && RegExp.$1 != "foo";
This will match everything that ends with .htm but it will store "foo" into RegExp.$1 if it matches foo.htm, so you can handle it separately.
Like Renesis mentioned, "lookbehind" is not supported in JavaScript, so maybe just use two regexps in combination:
!/foo\.htm$/i.test(teststring) && /\.htm$/i.test(teststring)
This answer has probably arrived a little later than necessary, but I'll leave it here in case someone runs into the same issue now (7 years, 6 months after this question was asked).
Lookbehinds are now included in the ECMAScript 2018 standard and supported at least in the latest version of Chrome. However, you can solve the puzzle with or without them.
A solution with negative lookahead:
let testString = `html.htm app.htm foo.tm foo.htm bar.js 1to3.htm _.js _.htm`;
testString.match(/\b(?!foo)[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
A solution with negative lookbehind:
testString.match(/\b[\w-.]+(?<!foo)\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
A solution with (technically) positive lookahead:
testString.match(/\b(?=[^f])[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
etc.
All these RegExps tell JS engine the same thing in different ways, the message that they pass to JS engine is something like the following.
Please, find in this string all sequences of characters that are:
Separated from other text (like words);
Consist of one or more letter(s) of the English alphabet, underscore(s), hyphen(s), dot(s) or digit(s);
End with ".htm";
Apart from that, the part of sequence before ".htm" could be anything but "foo".
String.prototype.endsWith (ES6)
console.log( /* !(not)endsWith */
!"foo.html".endsWith("foo.htm"), // true
!"barfoo.htm".endsWith("foo.htm"), // false (here you go)
!"foo.htm".endsWith("foo.htm"), // false (here you go)
!"test.html".endsWith("foo.htm"), // true
!"test.htm".endsWith("foo.htm") // true
);
You could emulate the negative lookbehind with something like
/(.|..|.*[^f]..|.*f[^o].|.*fo[^o])\.htm$/, but a programmatic approach would be better.
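A programmatic approach could look like the sketch below. The helper names endsWithIgnoreCase and isWantedHtm are made up for this example, and it avoids String.prototype.endsWith so that it also runs in browsers without ES6 support.

```javascript
// True if `text` ends with `suffix`, ignoring case.
function endsWithIgnoreCase(text, suffix) {
    return text.slice(-suffix.length).toLowerCase() === suffix.toLowerCase();
}

// True for names ending in ".htm", except those ending in "foo.htm".
function isWantedHtm(path) {
    return endsWithIgnoreCase(path, ".htm") && !endsWithIgnoreCase(path, "foo.htm");
}

console.log(isWantedHtm("/foo.htm"));  // false
console.log(isWantedHtm("/bar.htm"));  // true
console.log(isWantedHtm("/foo.html")); // false
```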

When parsing Javascript, what determines the meaning of a slash?

Javascript has a tricky grammar to parse. Forward-slashes can mean a number of different things: division operator, regular expression literal, comment introducer, or line-comment introducer. The last two are easy to distinguish: if the slash is followed by a star, it starts a multiline comment. If the slash is followed by another slash, it is a line-comment.
But the rules for disambiguating division and regex literal are escaping me. I can't find it in the ECMAScript standard. There the lexical grammar is explicitly divided into two parts, InputElementDiv and InputElementRegExp, depending on what a slash will mean. But there's nothing explaining when to use which.
And of course the dreaded semicolon insertion rules complicate everything.
Does anyone have an example of clear code for lexing Javascript that has the answer?
It's actually fairly easy, but it requires making your lexer a little smarter than usual.
The division operator must follow an expression, and a regular expression literal can't follow an expression, so in all other cases you can safely assume you're looking at a regular expression literal.
You already have to identify Punctuators as multiple-character strings, if you're doing it right. So look at the previous token, and see if it's any of these:
. ( , { } [ ; , < > <= >= == != === !== + - * % ++ --
<< >> >>> & | ^ ! ~ && || ? : = += -= *= %= <<= >>= >>>=
&= |= ^= / /=
For most of these, you now know you're in a context where you can find a regular expression literal. Now, in the case of ++ --, you'll need to do some extra work. If the ++ or -- is a pre-increment/decrement, then the / following it starts a regular expression literal; if it is a post-increment/decrement, then the / following it starts a DivPunctuator.
Fortunately, you can determine whether it is a "pre-" operator by checking its previous token. First, post-increment/decrement is a restricted production, so if ++ or -- is preceded by a linebreak, then you know it is "pre-". Otherwise, if the previous token is any of the things that can precede a regular expression literal (yay recursion!), then you know it is "pre-". In all other cases, it is "post-".
Of course, the ) punctuator doesn't always indicate the end of an expression - for example if (something) /regex/.exec(x). This is tricky because it does require some semantic understanding to disentangle.
Sadly, that's not quite all. There are some operators that are not Punctuators, and other notable keywords to boot. Regular expression literals can also follow these. They are:
new delete void typeof instanceof in do return case throw else
If the IdentifierName you just consumed is one of these, then you're looking at a regular expression literal; otherwise, it's a DivPunctuator.
The above is based on the ECMAScript 5.1 specification (as found here) and does not include any browser-specific extensions to the language. But if you need to support those, then this should provide easy guidelines for determining which sort of context you're in.
Of course, most of the above represent very silly cases for including a regular expression literal. For example, you can't actually pre-increment a regular expression, even though it is syntactically allowed. So most tools can get away with simplifying the regular expression context checking for real-world applications. JSLint's method of checking the preceding character for (,=:[!&|?{}; is probably sufficient. But if you take such a shortcut when developing what's supposed to be a tool for lexing JS, then you should make sure to note that.
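The previous-token check described above can be sketched as follows. This is a simplified illustration, not a complete lexer: regexAllowedAfter is a hypothetical helper, tokens is assumed to be an array of the token strings lexed so far, the line-break rule for ++ and -- is left out, and ) is always treated as ending an expression.

```javascript
// Punctuators and keywords after which a "/" is taken to start a
// regular expression literal (taken from the lists in this answer).
var PUNCTUATORS = (". ( , { } [ ; < > <= >= == != === !== + - * % " +
    "<< >> >>> & | ^ ! ~ && || ? : = += -= *= %= <<= >>= >>>= " +
    "&= |= ^= / /=").split(" ");
var KEYWORDS = "new delete void typeof instanceof in do return case throw else".split(" ");

// Decide whether a "/" that follows tokens[i] starts a regex literal.
function regexAllowedAfter(tokens, i) {
    if (i < 0) return true; // start of input: no expression to divide
    var token = tokens[i];
    if (token === "++" || token === "--") {
        // A pre-increment sits where a regex could also appear, so
        // recurse on the token before it; otherwise it is a post-increment.
        return regexAllowedAfter(tokens, i - 1);
    }
    return PUNCTUATORS.indexOf(token) !== -1 || KEYWORDS.indexOf(token) !== -1;
}

console.log(regexAllowedAfter(["a", "="], 1));  // true:  a = /re/
console.log(regexAllowedAfter(["a"], 0));       // false: a / b
console.log(regexAllowedAfter(["x", "++"], 1)); // false: x++ / 2
```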
I am currently developing a JavaScript/ECMAScript 5.1 parser with JavaCC. RegularExpressionLiteral and Automatic Semicolon Insertion are two things that drive me crazy in the ECMAScript grammar. This question and its answers were invaluable for the regex question. In this answer I'd like to put my own findings together.
TL;DR In JavaCC, use lexical states and switch them from the parser.
Very important is what Thom Blake wrote:
The division operator must follow an expression, and a regular
expression literal can't follow an expression, so in all other cases
you can safely assume you're looking at a regular expression literal.
So you actually need to understand if it was an expression or not before. This is trivial in the parser but very hard in the lexer.
As Thom pointed out, in many (but, unfortunately, not all) cases you can understand if it was an expression by "looking" at the last token. You have to consider punctuators as well as keywords.
Let's start with keywords. The following keywords cannot precede a DivPunctuator (for example, you cannot have case /5), so if you see a / after these, you have a RegularExpressionLiteral:
case
delete
do
else
in
instanceof
new
return
throw
typeof
void
Next, punctuators. The following punctuators cannot precede a DivPunctuator (ex. in { /a... the symbol / can never start a division):
{ ( [
. ; , < > <=
>= == != === !==
+ - * %
<< >> >>> & | ^
! ~ && || ? :
= += -= *= %= <<=
>>= >>>= &= |= ^=
/=
So if you have one of these and see /... after this, then this can never be a DivPunctuator and therefore must be a RegularExpressionLiteral.
Next, if you have:
/
And /... after that, it also must be a RegularExpressionLiteral. If there were no space between these slashes (i.e. // ...), it would have to be handled as a SingleLineComment ("maximal munch").
Next, the following punctuator may only end an expression:
]
So the following / must start a DivPunctuator.
Now we have the following remaining cases which are, unfortunately, ambiguous:
}
)
++
--
For } and ) you have to know whether they end an expression or not; for ++ and --, whether they end a PostfixExpression or start a UnaryExpression.
And I have come to the conclusion that it is very hard (if not impossible) to find out in the lexer. To give you a sense of that, a couple of examples.
In this example:
{}/a/g
/a/g is a RegularExpressionLiteral, but in this one:
+{}/a/g
/a/g is a division.
In case of ) you can have a division:
('a')/a/g
as well as a RegularExpressionLiteral:
if ('a')/a/g
So, unfortunately, it looks like you can't solve it with the lexer alone. Otherwise you'd have to bring so much grammar into the lexer that it's no longer a lexer.
This is a problem.
Now, a possible solution, which is, in my case JavaCC-based.
I am not sure if you have similar features in other parser generators, but JavaCC has a lexical states feature which can be used to switch between "we expect a DivPunctuator" and "we expect a RegularExpressionLiteral" states. For instance, in this grammar the NOREGEXP state means "we don't expect a RegularExpressionLiteral here".
This solves part of the problem, but not the ambiguous ), }, ++ and --.
For this, you'll need to be able to switch lexical states from the parser. This is possible, see the following question in JavaCC FAQ:
Can the parser force a switch to a new lexical state?
Yes, but it is very easy to create bugs by doing so.
A lookahead parser may have already gone too far in the token stream (i.e. already read / as a DIV or vice versa).
Fortunately there seems to be a way to make switching lexical states a bit safer:
Is there a way to make SwitchTo safer?
The idea is to make a "backup" token stream and push tokens read during lookahead back again.
I think that this should work for }, ), ++, -- as they are normally found in LOOKAHEAD(1) situations, but I am not 100% sure of that. In the worst case the lexer may have already tried to parse /-starting token as a RegularExpressionLiteral and failed as it was not terminated by another /.
In any case, I see no better way of doing that. The next best thing would probably be to drop these cases altogether (as JSLint and many others did), document the limitation, and simply not parse these kinds of expressions. {}/a/g does not make much sense anyway.
JSLint appears to expect a regular expression if the preceding token is one of
(,=:[!&|?{};
Rhino always returns a DIV (slash) token from the lexer.
You can only know how to interpret the / by also implementing a syntax parser. Whichever lex path arrives at a valid parse determines how to interpret the character. Apparently, this is something they had considered fixing, but didn't.
More reading here:
http://www-archive.mozilla.org/js/language/js20-2002-04/rationale/syntax.html#regular-expressions
See section 7:
There are two goal symbols for the lexical grammar. The InputElementDiv symbol is used in those syntactic grammar contexts where a leading division (/) or division-assignment (/=) operator is permitted. The InputElementRegExp symbol is used in other syntactic grammar contexts.
NOTE There are no syntactic grammar contexts where both a leading division or division-assignment, and a leading RegularExpressionLiteral are permitted. This is not affected by semicolon insertion (see 7.9); in examples such as the following:
a = b
/hi/g.exec(c).map(d);
where the first non-whitespace, non-comment character after a LineTerminator is slash (/) and the syntactic context allows division or division-assignment, no semicolon is inserted at the LineTerminator. That is, the above example is interpreted in the same way as:
a = b / hi / g.exec(c).map(d);
I agree, it's confusing and there should be one top-level grammar expression rather than two.
edit:
But there's nothing explaining when to use which.
Maybe the simple answer is staring us in the face: try one and then try the other. Since they are not both permitted, at most one will yield an error-free match.

Substring arguments best practice

The JavaScript String object has two substring functions substring and substr.
substring takes two parameters beginIndex and endIndex.
substr also takes two parameters beginIndex and length.
It's trivial to convert any range between the two variants, but I wonder if there's any significance to how the two would normally be used (in day-to-day programming). I tend to favor the index/length variant but I have no good explanation as to why.
I guess it depends on what kind of programming you do, but if you have strong opinion on the matter, I'd like to hear it.
When is a (absolute, relative) range more suited than an (absolute, absolute) and vice versa?
Update:
This is not a JavaScript question per se (JavaScript just happens to implement both variants [which I think is stupid]), but what practical implications does a relative vs. an absolute range have? I'm looking for a solid argument for why we would prefer one over the other. To broaden the debate a bit, how would you prefer to design your data structures for use with either approach?
I prefer the startIndex, endIndex variant (substring) because String.substring() operates the same way in Java and I feel it makes me more efficient to stick to the same concepts in whatever language I use most often (when possible).
If I were doing more C# work, I might use the other variant more because that is how String.Substring() works in C#.
To answer your comment about JavaScript having both, it looks like substr() was added to browsers after substring() (reference - it seems that although substr() was part of JavaScript 1.0, most browser vendors didn't implement it until later). This suggests to me that even the implementers of the early language recognized the duplication of functionality. I'd suggest substring() came first in an attempt to leverage the JavaScript trademark. Regardless, it seems that they recognized this duplication in ECMA-262 and took some small steps toward removing it:
substring(): ECMA Version: ECMA-262
substr(): ECMA Version: None, although ECMA-262 ed. 3 has a non-normative section suggesting uniform semantics for substr
Personally I wouldn't mind a substring() where the second parameter can be negative, which would return the characters between the first parameter and the length of the string minus the second parameter. Of course you can already achieve that more explicitly and I imagine the design would be confusing to many developers:
String s1 = "The quick brown fox jumps over the lazy dog";
String s2 = s1.substring(20, -13); // "jumps over"
When is a (absolute, relative) range more suited than an (absolute, absolute) and vice versa?
The former, when you know how much, the latter when you know where.
I presume substring is implemented in terms of substr:
substring( b, e ) {
return substr( b, e - b );
}
or substr in terms of substring:
substr( b, l) {
return substring( b, b + l );
}
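In runnable JavaScript, the two conversions above could be sketched like this. The wrapper names are made up for this example, and the equivalence is only assumed to hold for ordinary arguments, that is 0 <= begin <= end <= text.length.

```javascript
// (begin, end) expressed through the (begin, length) variant.
function substringViaSubstr(text, begin, end) {
    return text.substr(begin, end - begin);
}

// (begin, length) expressed through the (begin, end) variant.
function substrViaSubstring(text, begin, length) {
    return text.substring(begin, begin + length);
}

var sentence = "The quick brown fox";
console.log(substringViaSubstr(sentence, 4, 9)); // quick
console.log(substrViaSubstring(sentence, 4, 5)); // quick
```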
I slightly prefer the startIndex, endIndex variant, since then to get the last bit of a string I can do:
string foo = bar.substring(5, bar.length());
instead of:
string foo = bar.substr(5, bar.length() - 5);
It depends on the case, but I more often find I know exactly how many characters I want to take out, and prefer the start-with-length parameterization. But I could easily see a case where I've searched a long string for two tokens and now have their indexes. While it's trivial math to use either variant, in that case I might prefer the start and end indexes.
Also, from a document writer's perspective, having two parameters of the same basic meaning is probably easier to write about and an easier mnemonic.
Each of these functions does neat saves when given strange values, such as an end smaller than a start, a negative length, a negative start, or a length or end beyond the string's end.
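A few of those saves, shown on a short sample string (the variable names are only for this example):

```javascript
var letters = "abcdef";

// substring(begin, end)
var swapped = letters.substring(4, 1);        // "bcd": end before start, arguments are swapped
var clampedStart = letters.substring(-2, 3);  // "abc": negative start is treated as 0
var clampedEnd = letters.substring(2, 100);   // "cdef": end is capped at the string length

// substr(begin, length)
var fromEnd = letters.substr(-3, 2);          // "de": negative start counts from the end
var negativeLength = letters.substr(2, -1);   // "": negative length yields an empty string
var longLength = letters.substr(2, 100);      // "cdef": length is capped at the string end

console.log(swapped, clampedStart, clampedEnd, fromEnd, longLength);
```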
For JavaScript the best practice is to use substring over substr because it's supported in more (albeit usually older) browsers. If they'd gone with BasicScript instead would there have been a MID() and a MIDDLE() function? Who doesn't love BASIC syntax?
