Why does antlr4 choke on LT!* - javascript

I'm trying to use a JavaScript grammar with antlr4 (copyright 2008 by Chris Lambrou, retrieved from http://www.antlr3.org/grammar/1206736738015/JavaScript.g). The grammar contains many instances of "LT!*", which I understand to mean "match zero or more line terminators, and don't include those tokens in the generated AST" (from an answer to the Stack Overflow question "ANTLR 3, what does LT!* mean?").
antlr4 throws a syntax error for each instance of "LT!*", so I assume the most recent version doesn't handle that construct. What can I use in place of "LT!*" that will work in antlr4?
[edit] Note that the syntax error is on the "!"

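ANTLR 4 does not build ASTs; it generates parse trees instead. The ANTLR 3 tree-construction operators, the ! suffix and the -> rewrite operator, are therefore not allowed inside parser rules. Dropping the ! and writing LT* matches the same input.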
See: How can I build an AST using ANTLR4?
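As a rough sketch of the mechanical fix, the ANTLR 3 grammar text can be rewritten so that every LT! becomes LT before it is handed to antlr4. The helper name stripLtBang and the sample rule are hypothetical, and the regex only targets a ! directly after the token name LT. Any other ! suffixes or -> rewrites left in the grammar are assumed to be handled separately.

```javascript
// Hypothetical helper: drop the ANTLR 3 "!" suffix that follows the LT token,
// so "LT!*" becomes "LT*". Other uses of "!" are left untouched.
function stripLtBang(grammarText) {
  return grammarText.replace(/\bLT!/g, 'LT');
}

// Illustrative rule written in the style of the ANTLR 3 grammar.
const antlr3Rule =
  "functionDeclaration : 'function' LT!* Identifier LT!* formalParameterList LT!* functionBody ;";

console.log(stripLtBang(antlr3Rule));
```

Because the matching behavior of LT* is unchanged, the line terminators simply show up as ordinary nodes in the ANTLR 4 parse tree.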

Related

Why are unicode property escapes throwing "Unknown property" errors?

The MDN website gives examples of matching patterns with unicode support, e.g.
const sentence = 'A ticket to 大阪 costs ¥2000 👌.';
const regexpCurrencyOrPunctuation = /\p{Sc}|\p{P}/gu;
console.log(sentence.match(regexpCurrencyOrPunctuation));
It works fine on stackoverflow as a snippet.
However, in a javascript codesandbox, the code throws an error:
/src/index.js: Unknown property: Sc
In a Next.js codesandbox it also throws the same error.
On the other hand, on regex101 website the pattern is correctly matched to the sentence, with ECMAScript flavor and with "gu" flag.
Additionally, in my real-world Next.js TypeScript project, the pattern /\P{L}/gu worked fine until yesterday, when I upgraded all dependencies to their latest versions. Now it throws a similar error when strict is set to true in tsconfig.json. With strict set to false it still works fine.
Why is this error occurring, and how can I use the /\p{Sc}|\p{P}/gu or /\P{L}/gu pattern in my code?
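Based on the documentation, \p{Sc} is valid as written: a lone name inside \p{...} may be a General_Category value or a binary property, and Sc is the General_Category value Currency_Symbol. The name=value form is only required for non-binary properties such as Script (whose alias is the lowercase sc), for example \p{Script=Greek}. So the pattern itself is not the problem.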
Unfortunately, it's a bug in next.js: https://github.com/vercel/next.js/issues/19303
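If the failure comes from a build step parsing the regex literal, as the /src/index.js error message suggests, one possible workaround is to build the pattern with the RegExp constructor so that only the runtime engine ever sees it. This is an assumption about the toolchain, not something confirmed by the linked issue. The variable names are illustrative, and the runtime itself must support Unicode property escapes.

```javascript
// Patterns are passed as strings, so every backslash is doubled.
const currencyOrPunctuation = new RegExp('\\p{Sc}|\\p{P}', 'gu');
const nonLetter = new RegExp('\\P{L}', 'gu');
// Script is a non-binary property, so it needs the name=value form.
const greekLetter = new RegExp('\\p{Script=Greek}', 'u');

const sentence = 'A ticket to 大阪 costs ¥2000 👌.';
console.log(sentence.match(currencyOrPunctuation)); // [ '¥', '.' ]
console.log('a1 b2'.replace(nonLetter, ''));        // ab
console.log(greekLetter.test('α'), greekLetter.test('a')); // true false
```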

Unicode regex \p{L} not working in NodeJS

I am trying to make the following Unicode regular expression work in nodejs, but all I get is an invalid escape error. I can't figure out what to escape here, or whether this simply doesn't work in node for some reason. This is my original regex:
/([\p{L}|\-]+)/ug
If I escape the \p as \\p, the regex no longer works (it matches only p, L and -).
This works in chrome, so it should work in node somehow too, right? Thanks for your help.
var str = "thÛs Ís spå-rtÅ!";
console.log(str.match(/([\p{L}|\-]+)/ug))
A quick look through the nodejs changelog revealed this PR:
https://github.com/nodejs/node/pull/19052
which most notably states:
RegExp Unicode Property Escapes are at stage 4 and will be included in ES2018. They are available since V8 6.4 without a flag so they will be unflagged in Node.js v10. They are also available under the --harmony_regexp_property flag in Node.js v6-v9 and under the --harmony flag in Node.js v8-v9.
So by the look of it, if you are on node v6-v9, you can enable this feature by running node with a flag. For example, this works for me on node v8.11.3:
node --harmony regex-test.js
(where regex-test.js contains your sample code). Running this without the flag gives your Invalid escape error.
If you can update your node version to v10+, no flag is needed.
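To make the version dependence explicit, a script could probe for support before using the pattern. The sketch below is only one way to do that: supportsUnicodePropertyEscapes and matchLetterRuns are hypothetical helper names, and the pattern is built from a string so that an older engine reports the problem at run time instead of failing to parse the whole file. The | is dropped from the character class because inside [...] it matches a literal pipe instead of acting as alternation.

```javascript
// Returns true when the engine accepts \p{...} with the u flag.
function supportsUnicodePropertyEscapes() {
  try {
    new RegExp('\\p{L}', 'u');
    return true;
  } catch (err) {
    return false;
  }
}

// Returns runs of letters and hyphens, or null when \p{L} is unsupported.
function matchLetterRuns(text) {
  if (!supportsUnicodePropertyEscapes()) {
    return null;
  }
  // NFC keeps accented letters as single code points; decomposed
  // combining marks would not be matched by \p{L}.
  return text.normalize('NFC').match(new RegExp('[\\p{L}\\-]+', 'gu'));
}

console.log(matchLetterRuns('thÛs Ís spå-rtÅ!'));
```

On an engine with support this logs the three words without the trailing exclamation mark. On an older engine it logs null, which is the cue to upgrade or to start node with the flag.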
If you are going to use the --harmony flag, please consider the following.
As mentioned in the Node.js documentation, the --harmony flag enables staged features, that is, features that are complete but not yet considered stable:
The current behaviour of the --harmony flag on Node.js is to enable staged features only. After all, it is now a synonym of --es_staging. As mentioned above, these are completed features that have not been considered stable yet. If you want to play safe, especially on production environments, consider removing this runtime flag until it ships by default on V8 and, consequently, on Node.js. If you keep this enabled, you should be prepared for further Node.js upgrades to break your code if V8 changes their semantics to more closely follow the standard.
Here is the link:
https://nodejs.org/en/docs/es6/#:~:text=The%20current%20behaviour%20of%20the,to%20enable%20staged%20features%20only.&text=If%20you%20want%20to%20play,js.

Syntax error on regular expression in ExtendScript (Javascript ECMA-262, Version 3) [duplicate]

I have a regular expression that tests for digits (0-9) and/or forward slashes (/). It looks like this:
/^[0-9/]+$/i.test(value)
Now I believe this to be correct, but the eclipse javascript validator disagrees:
Syntax error on token "]", delete this token
I suppose this is because the separator/delimiter is / and eclipse 'thinks' the regex is finished (and therefore a ] would be unexpected).
We can satisfy eclipse by escaping the / like so:
/^[0-9\/]+$/i.test(value)
Note that both versions work for me.
My problem with this is:
As far as I know, I do not need to escape the forward slash inside a character class. It might be situation-specific, though (in JavaScript, / is the regex literal delimiter).
Although they both appear to be working, I'd rather use the 'correct' version because of behaviour in different environments, and, well.. because correct and all :)
Does anyone know what I'm supposed to do? Escape or not? I did not find any reputable site that told me to escape the / in a range, but the Eclipse-validator is probably not completely stupid...
The standard clearly says you can put anything unescaped in a character class except \, ] and newline:
RegularExpressionClassChar ::
    RegularExpressionNonTerminator but not ] or \
    RegularExpressionBackslashSequence

RegularExpressionNonTerminator ::
    SourceCharacter but not LineTerminator
( http://es5.github.com/#x7.8.5 ). No need to escape /.
On the other hand, I personally would escape everything when in doubt, just to keep less capable parsers happy.
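A quick way to convince yourself that the two spellings are equivalent is to run both against the same inputs. This is only a sanity check in whatever engine runs it, not a statement about the Eclipse validator, and the sample inputs are made up for illustration. The i flag is left out because the pattern contains no letters.

```javascript
const unescaped = /^[0-9/]+$/;  // slash left as-is inside the character class
const escaped = /^[0-9\/]+$/;   // slash escaped to satisfy stricter tools

const samples = ['12/34', '2024/01/31', '/', '', '12a', '1 2'];
for (const value of samples) {
  // Both columns should agree for every input.
  console.log(JSON.stringify(value), unescaped.test(value), escaped.test(value));
}
```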

Using String.raw() with Node JS

I'm working on a Node.js app and I would like to use String.raw(), which is part of the ES6 standard.
However, when using it as in the documentation:
text = String.raw`Hi\n${2+3}!` + text.slice(2);
It returns SyntaxError: Unexpected token ILLEGAL for the character after String.raw.
I think the problem is that String.raw() is a new feature that so far is only available in Chrome and Firefox. Can I use it in Node.js, and if so, how?
The backtick (grave accent) after raw starts a template string, which is an ES6 (Harmony) feature. You can invoke node with the --harmony flag, but this feature is not implemented yet, which is the reason for the syntax error. Raw strings are unsupported too.
If you want to experiment with this feature on the server side, check out io.js, a fork of node with many ES6 features implemented and enabled by default.
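For reference, on a runtime that does implement template strings (an assumption, since the answer above describes a node version that did not), String.raw keeps the backslash sequence exactly as typed while still evaluating the substitution. The second line is an equivalent plain string literal that also works on runtimes without template strings.

```javascript
// Raw template: \n stays as the two characters "\" and "n"; ${2 + 3} becomes 5.
const raw = String.raw`Hi\n${2 + 3}!`;

// Same result without template strings: escape the backslash by hand.
const plain = 'Hi\\n' + (2 + 3) + '!';

console.log(raw);           // Hi\n5!
console.log(raw === plain); // true
console.log(raw.length);    // 6
```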

Checking if code is valid JavaScript without actually evaluating it

Is there a function to test if a snippet is valid JavaScript without actually evaluating it? That is, the equivalent of
function validate(code) {
  try { eval(code); }
  catch (err) { return false; }
  return true;
}
without side effects.
Yes, there is.
new Function(code);
throws a SyntaxError if code isn't valid JavaScript (ECMA-262, edition 5.1, §15.3.2.1 guarantees that it will throw an exception if code isn't parsable).
Note: this only checks syntactic validity. The code can still throw exceptions at run time, for example because of undefined references. Checking for that is much harder: you either have to evaluate the code (and accept all its side effects) or parse it and emulate its execution (that is, write a JS virtual machine in JS).
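Wrapped up as a function, the check might look like the sketch below. isSyntacticallyValid is a hypothetical name. Two assumptions are worth noting: the code is parsed as a function body, so a top-level return is accepted even though it would be illegal in a script, and any error other than SyntaxError is rethrown instead of being reported as invalid syntax.

```javascript
// Hypothetical helper: true when `code` parses as a function body.
function isSyntacticallyValid(code) {
  try {
    new Function(code); // parses the code; the resulting function is never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err; // not a syntax problem, so let the caller see it
  }
}

console.log(isSyntacticallyValid('var x = 1;'));           // true
console.log(isSyntacticallyValid('var x = ;'));            // false
console.log(isSyntacticallyValid('notDefinedAnywhere()')); // true (never executed)
console.log(isSyntacticallyValid('return 42;'));           // true (function body)
```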
You could use esprima.
Esprima (esprima.org) is a high performance, standard-compliant ECMAScript parser written in ECMAScript (also popularly known as JavaScript).
Features
Full support for ECMAScript 5.1 (ECMA-262)
Sensible syntax tree format, compatible with Mozilla Parser AST
Heavily tested (> 550 unit tests with solid 100% statement coverage)
Optional tracking of syntax node location (index-based and line-column)
Experimental support for ES6/Harmony (module, class, destructuring, ...)
You can use the online syntax validator or install it as an npm package and run it locally from the command line. There are two commands: esparse and esvalidate. For the example from the online syntax validator mentioned above, esvalidate yields:
$ esvalidate foo.js
foo.js:1: Illegal return statement
foo.js:7: Octal literals are not allowed in strict mode.
foo.js:10: Duplicate data property in object literal not allowed in strict mode
foo.js:10: Strict mode code may not include a with statement
For the sake of completeness esparse produces an AST.
