Non-terminating RegExp.exec in Rhino

I have the following JavaScript program saved in a file pre.js:
var pre = readFile("method-help.html");
RegExp.multiline = true;
print(/<pre>((?:.|\s)+)<\/pre>/.exec(pre)[1]);
The contents of method-help.html are simply the page at http://api.stackoverflow.com/1.0/help/method?method=answers/%7bid%7d. What I'm trying to do is get the JSON code between the pre tags. However, when I run the program in Rhino, nothing is printed and the program does not terminate. The command I use is:
java -jar js.jar pre.js
My Rhino version is 1_7R2.

The reason it doesn't seem to terminate is probably catastrophic backtracking, because . and \s overlap (it would end eventually, but it could take a long time). Here's a correct, fast version:
var pre = readFile("method-help.html");
print(/<pre>([\s\S]*?)<\/pre>/.exec(pre)[1])
You don't need multiline. That only affects the meaning of ^ and $, which you're not using. However, we do use \s\S to mean all characters (including newline, etc.). We also use *? to mean zero or more characters, non-greedy. The question mark (non-greedy) doesn't matter here but it would if there were multiple pre blocks.
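To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The sample string is a made-up stand-in for the contents of method-help.html, and console.log is used for output (in the Rhino shell you would use print instead).

```javascript
// Hypothetical sample input with two pre blocks (not the real page).
var html = "<p>intro</p>\n<pre>{\n  \"a\": 1\n}</pre>\n<p>text</p>\n<pre>{\"b\": 2}</pre>";

// Non-greedy: the capture stops at the first closing tag.
var lazy = /<pre>([\s\S]*?)<\/pre>/.exec(html)[1];

// Greedy: the capture runs on to the last closing tag.
var greedy = /<pre>([\s\S]*)<\/pre>/.exec(html)[1];

console.log(lazy);
console.log(greedy);
```

With a single pre block both patterns capture the same text; with two blocks only the non-greedy one returns just the first block.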

Related

Weird character in a commit.template of git [duplicate]

I keep getting the ^M character in my .vimrc and it breaks my configuration.

Unix uses 0xA for a newline character. Windows uses a combination of two characters: 0xD 0xA. 0xD is the carriage return character. ^M happens to be the way vim displays 0xD (0x0D = 13, M is the 13th letter in the English alphabet).
You can remove all the ^M characters by running the following:
:%s/^M//g
Where ^M is entered by holding down Ctrl and typing v followed by m, and then releasing Ctrl. This is sometimes abbreviated as ^V^M, but note that you must enter it as described in the previous sentence, rather than typing it out literally.
This expression will replace all occurrences of ^M with the empty string (i.e. nothing). I use this to get rid of ^M in files copied from Windows to Unix (Solaris, Linux, OSX).
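The same cleanup can be done outside of vim. As a rough sketch in JavaScript (normalizeLineEndings is a hypothetical helper name, and the sample string is invented), the carriage returns described above can be folded into plain Unix newlines like this:

```javascript
// Hypothetical helper: convert Windows (CR LF) and lone CR line endings to LF.
function normalizeLineEndings(text) {
  // \r\n? matches CR LF as a single unit, or a lone CR.
  return text.replace(/\r\n?/g, "\n");
}

// Invented sample with mixed line endings.
var mixed = "one\r\ntwo\rthree\nfour";
var fixed = normalizeLineEndings(mixed);

console.log(JSON.stringify(fixed)); // "one\ntwo\nthree\nfour"
```

Unlike the substitution above, which deletes every carriage return, this sketch turns a lone carriage return into a newline so that lines are not joined together.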

:%s/\r//g
worked for me today. But my situation may have been slightly different.

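To turn each carriage return into a real line break instead of removing it (in a Vim substitution, \r in the pattern matches a carriage return, while \r in the replacement inserts a line break):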
:%s/\r/\r/g

It probably means you've got carriage returns (different operating systems use different ways of signaling the end of line).
Use dos2unix to fix the files or set the fileformats in vim:
set ffs=unix,dos

Say your text file is file.txt; then run this command:
dos2unix file.txt
It converts the text file from dos to unix format.

I removed them all with sed:
sed -i -e 's/\r//g' <filename>
Could also replace with a different string or character. If there aren't line breaks already for example you can turn \r into \n:
sed -i -e 's/\r/\n/g' <filename>
Those sed commands work on the GNU/Linux version of sed but may need tweaking on BSDs (including macOS).

I got a text file originally generated on a Windows Machine by way of a Mac user and needed to import it into a Linux MySQL DB using the load data command.
Although Vim displayed the '^M' character, none of the above worked for my particular problem: the data would import but was always corrupted in some way. The solution was pretty easy in the end (after much frustration).
Solution:
Executing dos2unix TWICE on the same file did the trick! Using the file command shows what is happening along the way.
$ file 'file.txt'
file.txt: ASCII text, with CRLF, CR line terminators
$ dos2unix 'file.txt'
dos2unix: converting file file.txt to UNIX format ...
$ file 'file.txt'
file.txt: ASCII text, with CRLF line terminators
$ dos2unix 'file.txt'
dos2unix: converting file file.txt to UNIX format ...
$ file 'file.txt'
file.txt: ASCII text
And the final version of the file imported perfectly into the database.

On Unix it is probably easier to use the tr command.
cat file1.txt | tr "\r" "\n" > file2.txt

This is the only thing that worked in my case:
:e ++ff=dos
:wq

You can fix this in vim using
:1,$s/^V^M//g
where ^ stands for the Ctrl key (that is, press Ctrl+V and then Ctrl+M).

If you didn't specify a different fileformat intentionally (say, :e ++ff=unix for a Windows file), it's likely that the target file has mixed EOLs.
For example, if a file has some lines with <CR><NL> endings and others with <NL> endings, and fileformat is set to unix automatically by Vim when reading it, ^M (<CR>) will appear.
In such cases, fileformats (note: there's an extra s) comes into play. See :help ffs for the details.

If it breaks your configuration, and the ^M characters are required in mappings, you can simply replace the ^M characters by <Enter> or even <C-m> (both typed as simple character sequences, so 7 and 5 characters, respectively).
This is the single recommended, portable way of storing special keycodes in mappings.

In FreeBSD, you can clear the ^M manually by typing the following:
:%s/ Ctrl+V, then Ctrl+M, then Ctrl+M again.

I've discovered that I've been polluting files for weeks because my Homebrew MacVim (mvim) instance was set to use fileformat=dos. I made the required change in .vimrc.

Try :%s/\^M// (at least this worked for me).

Syntax error on regular expression in ExtendScript (Javascript ECMA-262 — Version 3) [duplicate]

I have a regular expression testing for digits (0-9) and/or forward slashes (/). It looks like this:
/^[0-9/]+$/i.test(value)
Now I believe this to be correct, but the Eclipse JavaScript validator disagrees:
Syntax error on token "]", delete this token
I suppose this is because the delimiter is / and Eclipse 'thinks' the regex has ended (so a ] would be unexpected).
We can satisfy Eclipse by escaping the / like so:
/^[0-9\/]+$/i.test(value)
Note that both versions work for me.
My problem with this is:
As far as I know I do not need to escape the forward slash specifically in that range. It might be situation specific (as in, for javascript it is the used delimiter).
Although they both appear to be working, I'd rather use the 'correct' version because of behaviour in different environments, and, well.. because correct and all :)
Does anyone know what I'm supposed to do? Escape or not? I did not find any reputable site that told me to escape the / in a range, but the Eclipse-validator is probably not completely stupid...

The standard clearly says you can put anything unescaped in a character class except \, ] and newline:
RegularExpressionClassChar ::
    RegularExpressionNonTerminator but not ] or \
    RegularExpressionBackslashSequence

RegularExpressionNonTerminator ::
    SourceCharacter but not LineTerminator
( http://es5.github.com/#x7.8.5 ). No need to escape /.
On the other side, I personally would escape everything when in doubt, just to make less smart parsers happy.
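As a quick sanity check, the sketch below runs both spellings from the question against a few invented sample values. It assumes an engine that follows the ES5 grammar quoted above; a parser that predates that grammar may reject the unescaped form, which is one more reason the escaped version is the safer choice.

```javascript
// The two spellings from the question.
var unescaped = /^[0-9/]+$/i;
var escaped = /^[0-9\/]+$/i;

// Hypothetical sample inputs.
var samples = ["12/34", "2024/01/31", "12a", "", "/"];

var allAgree = true;
for (var i = 0; i < samples.length; i++) {
  var a = unescaped.test(samples[i]);
  var b = escaped.test(samples[i]);
  if (a !== b) {
    allAgree = false;
  }
  console.log(JSON.stringify(samples[i]) + " -> " + a + " / " + b);
}

console.log("same results: " + allAgree);
```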

what does '</' mean in JavaScript?

I use Aptana Studio to code JavaScript.
When I write a string containing </, I get a warning saying:
'<' + '/' + letter not allowed here
But it does not trigger error in browsers.
What does </ mean in JavaScript?

For inline scripts (e.g, using <script>), some HTML parsers may interpret anything that looks like </this (especially </script>) as an HTML tag, rather than part of your source code. Your IDE is trying to keep you from typing this by mistake.
This means that, if you're using an inline script, you can't have a </tag> as a constant string in JavaScript:
var endTag = "</tag>"; // don't do this!
You'll need to break it up somehow to keep it from being interpreted as a tag:
var endTag = "<" + "/tag>";
Note that this only applies to inline scripts. Standalone scripts (e.g, a .js file) can have anything they want in them.
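Another common way to break up the sequence, added here as an alternative that the answer itself does not mention, is to escape the slash inside the string literal. In a JavaScript string, \/ is simply /, so the value is unchanged while the source text no longer contains the problematic sequence. A small sketch:

```javascript
// Concatenation, as suggested above.
var split = "<" + "/tag>";

// Escaping the slash: the source no longer contains "<" followed directly by "/".
var escapedSlash = "<\/tag>";

console.log(split === escapedSlash); // true
console.log(escapedSlash.length);    // 6
```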

It doesn't mean anything in a string; outside of a string it would be a syntax error.
EDIT: Before someone nitpicks there are some exceptions, eg var i = 1 </* comment */ 2; is legal and there may be some other cases (like performing less-than operation on a regex) but generally speaking it signifies nothing by itself.

It sounds like it's your IDE that is rejecting it. Aptana Studio may be assuming some sort of injection attack, and thus reports an error.
You would probably get a more direct answer by asking them directly, though; a general programming help site like StackOverflow is less likely to know the reasoning for specific cases such as this.

CKEDITOR.instance[x].setData not working in IE

Ok, I'm using CKEditor in a web application. One thing I need to do is set the text in the text area. I've been using the line:
CKEDITOR.instances.setData(html);
...where html is a variable containing HTML.
This works fine in Chrome & Firefox, but not at all in Internet Explorer or Safari.
Can anyone provide an insight as to why, or suggest a work-around?
Many thanks in advance! :-)

Make sure to strip all newlines from the string you pass into setData(). An exception is thrown if you don't, with a message about an unterminated string. The newline characters used by CKEditor are the UNIX-style of \n (in other words, not the DOS version: \r\n).
The newline apparently throws off the parser, making it think that it's the end of the statement.
Also note that if you call getData() to get that value you just set again, CKEditor puts the line breaks and tabs back into it. You'll need to strip them out again if you need to set that value back using setData(). I use a regexp pattern like this to strip out the newlines (and tabs just for completeness):
[\n\t]+
If you use a regular expression to strip them, make sure the pattern will actually match the \n character (this is called "single-line" mode in .NET, but I don't know what you're using).
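In JavaScript, stripping with that pattern might look like the sketch below. stripLineBreaks is a hypothetical helper name, the sample markup is invented, and \r is added to the character class as an assumption so that DOS-style \r\n pairs are removed completely.

```javascript
// Hypothetical helper: remove line breaks and tabs before handing HTML to setData().
function stripLineBreaks(html) {
  // The g flag removes every run, not just the first one.
  // \r is an addition to the [\n\t]+ pattern above, to cover \r\n input.
  return html.replace(/[\r\n\t]+/g, "");
}

// Invented sample markup.
var rawHtml = "<p>first</p>\r\n\t<p>second</p>\n";
var cleaned = stripLineBreaks(rawHtml);

console.log(cleaned); // <p>first</p><p>second</p>
```

The cleaned string is what you would then pass to setData() on your editor instance.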

stop javascript execution

When I run a JavaScript file in the Windows command-line environment and there is free text after my code, how can I stop the JavaScript interpreter from running into it?
For example:
var fso = new ActiveXObject("Scripting.FileSystemObject");
delete fso;
exit(); // some kind of WORKING exit command
Hungry lazy frog ate a big brown fox.

There is nothing you can do to stop the interpreter before the compiler sees a particular line, because the whole JavaScript source file is first compiled to bytecode, and it is the bytecode that is interpreted, not your source code.
What you could do (though it would still be messy) is put the free text in a comment at the end of the file. Then you could open the source file and read the text from the rest of the code. It still can't be completely free text, though, as it has to be a valid comment:
/*
whatever text you want provided it doesn't contain * followed immediately by /
*/
Much better is simply to admit defeat and store any data you need in a separate file.
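For illustration only, the comment-based approach could be sketched as follows. extractTrailingComment is a hypothetical helper name, and the source text is supplied as an invented sample string; actually reading the running script's own file (for example through Scripting.FileSystemObject) is left out.

```javascript
// Hypothetical helper: return the text of the last block comment in a source string,
// or null if there is none.
function extractTrailingComment(source) {
  var start = source.lastIndexOf("/*");
  var end = source.lastIndexOf("*/");
  // Require a complete comment: the closer must come after the whole opener.
  if (start === -1 || end === -1 || end < start + 2) {
    return null;
  }
  // Trim surrounding whitespace without relying on String.prototype.trim.
  return source.slice(start + 2, end).replace(/^\s+|\s+$/g, "");
}

// Invented sample standing in for the script's own source text.
var sampleSource = "var x = 1;\n/*\nHungry lazy frog ate a big brown fox.\n*/\n";

console.log(extractTrailingComment(sampleSource));
```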

If you surround the text with quotes, it will be interpreted/compiled but will have no effect, e.g.:
var fso = new ActiveXObject("Scripting.FileSystemObject");
delete fso;
exit(); // some kind of WORKING exit command
"Hungry lazy frog ate a big brown fox.";

Even if you did exit, you'd still have a problem because the whole script block needs to be parsed before the first instruction is executed. With any bogus content following an exit instruction, the script would fail to parse at all, even though that code would never be reached.
If the free text was a single line, you could get away with a trailing //. Doesn't work for multi-line comments though, as you would need a closing */. Same for conditional-comments requiring /*#end #*/.
