Regex created via new RegExp(myString) not working (backslashes) - javascript

So, I'm trying to write a regex that matches all numbers. Here is that regex:
/\b[\d \.]+\b/g
And I try to use it on the string:
100 two 100
And everything works fine; it matches both of the numbers.
But I want to rewrite the regex in the form:
new RegExp(pattern,modifiers)
Because I think it looks clearer.
So I write it like this:
new RegExp('\b[\d \.]+\b','g')
But now it won't match the former test string. I have tried everything, but I just can't get it to work. What am I doing wrong?

Your problem is that the backslash in a string has a special meaning; if you want a backslash in your regexp, you first need to get literal backslashes in the string passed to the regex:
new RegExp('\\b[\\d \\.]+\\b','g');
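As a rough sketch of what happens to the string before RegExp ever sees it (the variable names here are illustrative only, and String.raw is an alternative not mentioned in the answer itself):

```javascript
// In a string literal, \b is a backspace (U+0008), and the unknown
// escapes \d and \. simply lose their backslash.
const broken = '\b[\d \.]+\b';
console.log(broken === '\u0008[d .]+\u0008'); // true

// Doubling each backslash leaves one real backslash in the string.
const fixed = '\\b[\\d \\.]+\\b';

// String.raw keeps backslashes as written, so no doubling is needed.
const raw = String.raw`\b[\d \.]+\b`;
console.log(raw === fixed); // true

const text = '100 two 100';
console.log(text.match(new RegExp(broken, 'g'))); // null
console.log(text.match(new RegExp(fixed, 'g')).length); // 2
```

The broken pattern looks for literal backspace characters around the letters "d", space and ".", which is why it finds nothing in the test string.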
Note that this is a pretty bad (permissive) regex, as it will match ". . . " as a 'number', or "1 1...3 42". Better might be:
/-?\d+(?:\.\d+)?\b/
Note that this still matches odd things like 0000.3, and it does not match:
Leading +
Scientific notation, e.g. 1.3e7
Missing leading digit, e.g. .4
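The cases above can be checked with a short sketch. The firstMatch helper and the anchored wholeNumberRe variant are assumptions added for illustration; they are not part of the answer. Because the suggested regex is not anchored, an unsupported input is usually not rejected outright: the regex just picks out the part it understands.

```javascript
const numberRe = /-?\d+(?:\.\d+)?\b/;
// Hypothetical anchored variant, used only to test whole strings.
const wholeNumberRe = /^-?\d+(?:\.\d+)?$/;

// Return the first substring the regex matches, or null.
function firstMatch(text) {
  const m = text.match(numberRe);
  return m ? m[0] : null;
}

console.log(firstMatch('0000.3')); // "0000.3" (leading zeros are accepted)
console.log(firstMatch('-12.5'));  // "-12.5"
console.log(firstMatch('+5'));     // "5" (the leading + is left out)
console.log(firstMatch('1.3e7'));  // "1" (the exponent is not understood)
console.log(firstMatch('.4'));     // "4" (the leading dot is left out)
console.log(wholeNumberRe.test('1.3e7')); // false
```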
Also, note that using the RegExp constructor is (marginally) slower and certainly less idiomatic than using a RegExp literal. Using it is only a good idea when you need to construct your RegExp from supplied strings. Almost anyone with more than a passing familiarity with JavaScript will find the /.../ notation perfectly clear.

Related

Does regex syntax provide a way of escaping a whole string, rather than escaping characters one by one?

If I want to find a reference to precisely the following string:
http ://www.mydomain.com/home
within a more complex regex expression.
Is it possible to escape the whole sequence instead of escaping each / and . character individually? To get something more readable than
/http:\/\/www\.mydomain\.com\/home/
In the regex parsing site https://regexr.com/ , if I type the url in and set a regex to
/(http ://www.mydomain.com/home)/
, it appears to recognize the string, yet declares an error:
Unescaped forward slash. This may cause issues if copying/pasting this expression into code.
So I'm confused about this issue.
It appears that regex does not offer such a syntax, at least in JavaScript. It is possible, however, to proceed as follows:
use a string and automatically escape all the special characters in it,
as indicated in Escape string for use in Javascript regex
concatenate that string with strings representing the rest of the expression you want to create
transform the string into a regular expression as indicated here: Javascript regular expression - string to RegEx object.
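The steps listed above could be sketched roughly as follows. The escapeRegExp helper is a hypothetical name, and the trailing /\d+ part of the pattern is invented purely to stand in for "the rest of the expression":

```javascript
// Hypothetical helper: backslash-escape every character that has a
// special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const url = 'http://www.mydomain.com/home';

// Concatenate the escaped literal with the rest of the pattern, then
// hand the whole string to the RegExp constructor.
const pattern = new RegExp('(' + escapeRegExp(url) + ')/\\d+');

console.log(escapeRegExp(url)); // http://www\.mydomain\.com/home
console.log(pattern.test('see http://www.mydomain.com/home/42')); // true
console.log(pattern.test('see http://wwwXmydomain.com/home/42')); // false
```

Forward slashes need no escaping here: they only delimit regex literals, and the constructor takes a plain string.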

Alternation operator inside square brackets does not work

I'm creating a javascript regex to match queries in a search engine string. I am having a problem with alternation. I have the following regex:
.*baidu.com.*[/?].*wd{1}=
I want to be able to match strings that have the string 'word' or 'qw' in addition to 'wd', but everything I try is unsuccessful. I thought I would be able to do something like the following:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
but it does not seem to work.
Replace [wd|word|qw] with (wd|word|qw) or (?:wd|word|qw).
[] denotes character sets, () denotes logical groupings.
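A small sketch of the difference, using anchored test patterns made up for this illustration: inside square brackets the | is just another character in the set, and the whole bracket matches exactly one character.

```javascript
// Character set: one character out of w, d, |, o, r, q.
const charSet = /^[wd|word|qw]=/;
// Group: one of the whole alternatives wd, word or qw.
const group = /^(?:wd|word|qw)=/;

console.log(charSet.test('w='));   // true  (a single "w" is enough)
console.log(charSet.test('|='));   // true  (the pipe is a set member)
console.log(charSet.test('wd='));  // false (two characters before "=")

console.log(group.test('wd='));    // true
console.log(group.test('word='));  // true
console.log(group.test('w='));     // false
```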
Your expression:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
does need a few changes, including [wd|word|qw] to (wd|word|qw) and getting rid of the redundant {1}, like so:
.*baidu.com.*[/?].*(wd|word|qw)=
But you also need to understand that the first part of your expression (.*baidu.com.*[/?].*) will match baidu.com hello what spelling/handle????????? or hbaidu-com/ or even something like lkas----jhdf lkja$##!3hdsfbaidugcomlaksjhdf.[($?lakshf, because the dot (.) matches any character except newlines. To match a literal dot, you have to escape it with a backslash (like \.).
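To illustrate the point about the dot, here is a sketch comparing the original prefix with an escaped version. The variable names and the sample URLs are assumptions for demonstration, and the final pattern is only one possible tightening, not a complete URL matcher:

```javascript
const loose = /.*baidu.com.*[\/?].*/;    // "." matches any character
const strict = /.*baidu\.com.*[\/?].*/;  // "\." matches only a literal dot

console.log(loose.test('hbaidu-com/'));   // true
console.log(strict.test('hbaidu-com/'));  // false
console.log(strict.test('http://www.baidu.com/s?wd=regex')); // true

// Escaped dot combined with the grouped alternation.
const query = /baidu\.com.*[\/?].*(?:wd|word|qw)=/;
console.log(query.test('http://www.baidu.com/s?word=regex')); // true
console.log(query.test('http://www.baidu.com/s?q=regex'));    // false
```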
There are several approaches you could take to match things in a URL, but we could help you more if you tell us what you are trying to do or accomplish - perhaps regex is not the best solution or (EDIT) only part of the best solution?

Trouble with word-boundary (\b)

I have an array of keywords, and I want to know whether at least one of the keywords is found within some string that has been submitted. I further want to be absolutely sure that it is the keyword that has been matched, and not something that is very similar to the word.
Say, for example, that our keywords are [English, Eng, En] because we are looking for some variation of English.
Now, say that the input from a user is i h8 eng class, or something equally provocative and illiterate - then the eng should be matched. It should also fail to match a word like england or some odd thing chen, even though it's got the en bit.
So, in my infinite lack of wisdom I believed I could do something along the lines of this in order to match one of my array items with the input:
.match(RegExp('\b('+array.join('|')+')\b','i'))
With the thinking that the regular expression would look for matches from the array, now presented like (English|Eng|En) and then look to see whether there were zero-width word bounds on either side.
You need to double the backslashes.
When you create a regex with the RegExp() constructor, you're passing in a string. JavaScript string constant syntax also treats the backslash as a meta-character, for quoting quotes etc. Thus, the backslashes will be effectively stripped out before the RegExp() code even runs!
By doubling them, the step of parsing the string will leave one backslash behind. Then the RegExp() parser will see the single backslash before the "b" and do the right thing.
You need to double the backslashes in a JavaScript string or you'll encode a Backspace character:
.match(RegExp('\\b('+array.join('|')+')\\b','i'))
You need to double-escape the \b, because it has a special meaning in strings:
.match(RegExp('\\b('+array.join('|')+')\\b','i'))
\b is an escape sequence inside string literals (see table 2.1 on this page). You should escape it by adding one extra backslash:
.match(RegExp('\\b('+array.join('|')+')\\b','i'))
You do not need to escape \b when used inside a regular expression literal:
/\b(english|eng|en)\b/i
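Putting the answers together, a minimal sketch might look like this. The findKeyword helper is a hypothetical name, and the code assumes the keywords contain no regex metacharacters (otherwise they would need escaping before being joined):

```javascript
// '\b' in a string literal is a single backspace character.
console.log('\b'.charCodeAt(0)); // 8
console.log('\\b'.length);       // 2 (a backslash followed by "b")

const keywords = ['English', 'Eng', 'En'];
const keywordRe = new RegExp('\\b(' + keywords.join('|') + ')\\b', 'i');

// Return the keyword found in the input, or null if none is present.
function findKeyword(input) {
  const m = input.match(keywordRe);
  return m ? m[1] : null;
}

console.log(findKeyword('i h8 eng class')); // "eng"
console.log(findKeyword('england'));        // null
console.log(findKeyword('chen'));           // null
```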

RegularExpression RegExp.test

I want to match a string with following regular expression -
^\d{4}-\d{5}$|^\d{4}-\d{6}$
which is regex for a zip code with 4 digits-then 5 OR 6 digits after dash.
I believe my regex is correct, as I have tested it in an online RegEx tester.
To match my string against the above regex in jQuery, I am using:
var regExpTest = new RegExp("^\d{4}-\d{5}$|^\d{4}-\d{6}$");
alert(regExpTest.test("1234-123456"));
But I am always getting false, can anyone please guide what is going wrong here?
Thank you!
Because the regular expression constructor takes a string as its argument, you need to escape the backslash \ wherever you use it. In your example, anywhere you have a \d needs to be \\d. You can see what happens if you don't by testing your code in Firebug or Chrome's developer tools:
new RegExp("^\d{4}-\d{5}$|^\d{4}-\d{6}$");
//-> /^d{4}-d{5}$|^d{4}-d{6}$/
Notice the backslashes are gone? Now watch what happens when we escape each backslash:
new RegExp("^\\d{4}-\\d{5}$|^\\d{4}-\\d{6}$");
//-> /^\d{4}-\d{5}$|^\d{4}-\d{6}$/
So that should fix your problem. However, it's much easier to use the literal grammar for regular expressions when you're not using a variable to create them:
var regExpTest = /^\d{4}-\d{5}$|^\d{4}-\d{6}$/;
alert(regExpTest.test("1234-123456"));
//-> "true"
This way, you can write the expression without having to worry about double-escaping.
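As a sketch of why the original code returned false, the unescaped string can be inspected directly. The variable names are illustrative, and the shorter {5,6} form is an equivalent rewrite suggested here, not something from the answer:

```javascript
// "\d" is not a recognised string escape, so the backslash is dropped.
const unescaped = "^\d{4}-\d{5}$|^\d{4}-\d{6}$";
console.log(unescaped); // ^d{4}-d{5}$|^d{4}-d{6}$

// The resulting regex looks for literal "d" characters, not digits.
const brokenRe = new RegExp(unescaped);
console.log(brokenRe.test("1234-123456")); // false
console.log(brokenRe.test("dddd-ddddd"));  // true

// Literal form; {5,6} covers both alternatives in one branch.
const zipRe = /^\d{4}-\d{5,6}$/;
console.log(zipRe.test("1234-12345"));   // true
console.log(zipRe.test("1234-123456"));  // true
console.log(zipRe.test("1234-1234567")); // false
```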
