Javascript regular expression is failing - javascript

I'm using the following javascript regex:
var pattern = new RegExp("^" + term + ".*")
console.log(pattern.toSource());
console.log(first_name + " : " + pattern.test(first_name) );
All I want it to do is check whether the person's first name begins with the given search term. E.g. if the search term is 'a', then all first names starting with a (andy, alice, etc.) should match. If it's 'al', then only alice should match, and so on. However, the output is:
/^a.*/
Alyssa : false
What am I doing wrong?

Regexes are case-sensitive, so Alyssa won't match: A and a are different symbols.
You may want to use case-insensitive regex match:
var pattern = new RegExp("^" + term + ".*", "i")
console.log(pattern.toSource());
console.log(first_name + " : " + pattern.test(first_name) );

There is nothing wrong as such; you just need to make the RegExp case insensitive using
var pattern = new RegExp("^" + term + ".*","i")
if you want the test to be case insensitive, or use a pattern like /^[aA].*/

You could specify your RegEx to be case insensitive, so that a will match both a and A:
var pattern = new RegExp("^" + term + ".*", "i");

Make your RegExp case insensitive:
var pattern = new RegExp("^" + term + ".*", "i");
See the documentation.

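Your regex is case sensitive, so it's not matching Alyssa. The ignoreCase property is read-only, so it cannot be switched on after the pattern has been created; pass the "i" flag when constructing it instead:
var pattern = new RegExp("^" + term + ".*", "i");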

The reason it isn't working is that regular expressions are case sensitive: A is not a.
Other things you are doing wrong:
Bothering to say "followed by zero or more characters"; that is just redundant.
Using a regular expression when a simple substring check will do the job.
I'd do it like this:
console.log(
first_name + " : " +
(first_name.toLowerCase().indexOf(term.toLowerCase()) === 0)
);
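If the regular-expression route is kept, the search term should probably be escaped before it goes into the pattern, since a term such as "." or "(" would otherwise be read as regex syntax. A minimal sketch; the helper names escapeRegExp and startsWithTerm are made up for illustration and are not part of the original question:

```javascript
// Hypothetical helper: backslash-escape characters that are special in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: case-insensitive "begins with" test.
function startsWithTerm(name, term) {
  var pattern = new RegExp("^" + escapeRegExp(term), "i");
  return pattern.test(name);
}

console.log(startsWithTerm("Alyssa", "a"));  // true
console.log(startsWithTerm("Alyssa", "al")); // true
console.log(startsWithTerm("Andy", "al"));   // false
console.log(startsWithTerm("Alyssa", "."));  // false, "." is treated literally
```

Without the escaping step, the last call would build /^./i and match any non-empty name.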

Related

JS Regexp - how to find text in a string

There is some text, e.g. "The string class is an instantiation of the basic_string class template that uses char".
I need to find the text "basic_string", but only if the word "the" does not come right before it.
With a negative lookbehind, it would be:
(?<!\sthe)\s+basic_string
But JavaScript does not understand negative lookbehind. What can I do?
If the only allowed character between "the" and "basic_string" is the white-space:
([^e\s]|[^h]e|[^t]he)\s+basic_string
You can use the XRegExp library to get advanced regex features like lookbehind in JavaScript.
Alternatively you can use alternation and capture group as a workaround:
var s = 'The string class is an instantiation of the basic_string class template that uses char';
var kw = s.match(/\bthe basic_string\b|(\bbasic_string\b)/)[1];
// undefined
s = 'instantiation of basic_string class template'
kw = s.match(/\bthe basic_string\b|(\bbasic_string\b)/)[1]
//=> "basic_string"
In this regex, capture group #1 is populated only if basic_string isn't preceded by the word "the".
You can use the RegExp /(the)(?=\sbasic_string)/, or new RegExp("(" + before + ")(?=" + match + ")"), to match "the" when it is followed by " basic_string"; then use .match() to get the .index of the match and .slice() to extract "basic_string":
var str = "The string class is an instantiation of the basic_string class template that uses char";
var before = "the";
var match = " basic_string";
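// Index of "the", plus its length, plus one for the space: the start of "basic_string".
var index = str.match(new RegExp("(" + before + ")(?=" + match + ")")).index
+ before.length + 1;
// match includes the leading space, so subtract one to get the keyword's own length.
console.log(str.slice(index, index + match.length - 1));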
The easiest way to emulate the negative lookbehind is via an optional capturing group, and check if the group participated in the match:
/(\bthe)?\s+basic_string/g
^^^^^^^^
See this JS demo:
var s = 'The string class is an instantiation of the basic_string class template that uses char, not basic_string.';
var re = /(\bthe)?(\s+basic_string)/gi;
var res = s.replace(re, function(match, group1, group2) {
return group1 ? match : "<b>" + group2 + "</b>";
});
document.body.innerHTML = res;
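Engines that implement the ECMAScript 2018 lookbehind assertions can express the original idea directly, so the workarounds above are only needed where older browsers matter. A sketch under that assumption; the function name findUnprecededKeyword and the sample sentence are invented for illustration:

```javascript
// Requires lookbehind support (ECMAScript 2018+); older engines throw a
// SyntaxError when they parse this regex literal.
function findUnprecededKeyword(text) {
  // Match "basic_string" only when it is not preceded by the word "the" plus whitespace.
  var re = /(?<!\bthe\s+)\bbasic_string\b/gi;
  var positions = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    positions.push(m.index);
  }
  return positions;
}

var sample = "an instantiation of the basic_string class, not basic_string itself";
// Only the second occurrence qualifies, because the first follows "the".
console.log(findUnprecededKeyword(sample));
```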

Regex to get word started with # in javascript

I have a problem replacing certain words that start with #. I have the following code:
var x="#google";
eval("var pattern = /" + '\\b' + x + '\\b');
txt.replace(pattern,"MyNewWord");
When I use the following code, it works fine:
var x="google";
eval("var pattern = /" + '\\b' + x + '\\b');
txt.replace(pattern,"MyNewWord");
Any suggestions on how to make the first piece of code work?
P.S. I use eval because x will be user input.
The problem is that \b represents a boundary between a "word" character (letter, digit, or underscore) and a "non-word" character (anything else). # is a non-word character, so \b# means "a # that is preceded by a word character" — which is not at all what you want. If anything, you want something more like \B#; \B is a non-boundary, so \B# means "a # that is not preceded by a word character".
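The difference between \b# and \B# is easy to see on a small input. This is only an illustration; the sample text is made up:

```javascript
var text = "search #google or a#google";

// \b# needs a word character right before "#", so only "a#google" qualifies.
var withBoundary = text.replace(/\b#google/, "X");
console.log(withBoundary);    // "search #google or aX"

// \B# needs a non-word character (or the start of the string) before "#".
var withNonBoundary = text.replace(/\B#google/, "X");
console.log(withNonBoundary); // "search X or a#google"
```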
I'm guessing that you want your words to be separated by whitespace, instead of by a programming-language concept of what makes something a "word" character or a "non-word" character; for that, you could write:
var x = '#google'; // or 'google'
var pattern = new RegExp('(^|\\s)' + x);
var result = txt.replace(pattern, '$1' + 'MyNewWord');
Edited to add: If x is really supposed to be a literal string, not a regex at all, then you should "quote" all of the special characters in it, with a backslash. You can do that by writing this:
var x = '#google'; // or 'google' or '$google' or whatever
var quotedX = x.replace(/[^\w\s]/g, '\\$&');
var pattern = new RegExp('(^|\\s)' + quotedX);
var result = txt.replace(pattern, '$1' + 'MyNewWord');
Make your pattern something like this:
/(#)?\w*/
If you want to make a Regular Expression, try this instead of eval:
var pattern = new RegExp(x);
Btw the line:
eval("var pattern = /" + '\\b' + x + '\\b');
will throw an error because the pattern is not closed; it should be:
eval("var pattern = /" + '\\b' + x + '\\b/');
How about
var x = "#google";
x.match(/^\#/);

javascript find and replace a dynamic pattern in a string

I have a dynamic pattern that I have been using the code below to find
var matcher = new RegExp("%" + dynamicnumber + ":", "g");
var found = matcher.test(textinput);
I have a new requirement for the pattern: it should also include five trailing characters, each of which is either y or n. The whole match should then be deleted, i.e. replaced with '' (nothing).
I tried this syntax for the pattern, but obviously it does not work.
var matcher = new RegExp("%" + dynamicnumber + ":" + /([yn]{5})/, "g");
Any tip is appreciated
TIA.
You should pass only a regex string into the RegExp constructor:
var re = new RegExp("%" + number + ":" + "([yn]{5})", "g");
var matcher = new RegExp("(%" + number + ":)([yn]{5})", "g");
Then replace it with the contents of the first capture group.
Use quotes instead of slashes:
var matcher = new RegExp("%" + number + ":([yn]{5})", "g");
Also, make sure that dynamicnumber (or number) forms a valid regular expression. Inside the string, special characters have to be prefixed with a double backslash, \\, and a literal backslash has to be written as four backslashes: \\\\.
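Putting these points together, here is a sketch that escapes the dynamic value and removes the whole marker, including the five trailing y/n characters. The helper names escapeRegExp and stripMarker, and the sample text, are assumptions made for illustration:

```javascript
// Hypothetical helper: backslash-escape characters that are special in a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: delete every "%<number>:" marker followed by exactly five y/n characters.
function stripMarker(textinput, dynamicnumber) {
  var matcher = new RegExp("%" + escapeRegExp(String(dynamicnumber)) + ":[yn]{5}", "g");
  return textinput.replace(matcher, "");
}

// The second marker is left alone because "abc" is not five y/n characters.
console.log(stripMarker("before %42:ynnyy after %42:abc", 42));
```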

Simple regex: how do I say: "it's okay to have an " s " or " es " or " 's " at the end of the word, match those too."?

I'm using jQuery (don't know if that's relevant). Here's the regex:
var re = new RegExp('\\b' + a_filter + '\\b');
So it matches whole words in the variable a_filter, which has a bunch of words in it. Right now it will match 'wrench' but not 'wrenches', 'chair' but not 'chairs', and "john" but not "john's". I've been trying, but I can't figure it out.
Can someone please help me adjust my regex above to allow for these at the end of the word?
The endings s, es and 's are what I want to allow at the end of a word match, so I don't have to include every possible variation of each word. I think those are all the word endings someone would really type; if you know more, it would be great to get help. Thanks!
EDIT: here's my jsfiddle. Maybe I had a_filter mixed up with filter_tags; I think I'm doing it backwards.
http://jsfiddle.net/nicktheandroid/mMTsc/18/
I added a group containing your endings after the filter concatenation, followed by ? so that it matches 0 or 1 times.
var a_filter = "wrench";
var re = new RegExp('\\b' + a_filter + '(s|es|\'s)?\\b');
alert( re.test("wrench's") );
Example: http://jsfiddle.net/qctAG/ (alert() warning. you'll get 4 of them)
You want something that looks like this:
var re = new RegExp('\\b' + a_filter + '(s|es|\'s)?\\b');
Of course, that will not match all plurals (e.g. oxen, geese) and it will match words that don't exist (e.g. sheeps).
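Since the question mentions a whole bunch of filter words, the same idea can be stretched to a list by joining the words with |. A rough sketch, assuming every word consists of letters only (so no escaping is needed); buildFilter and the word list are hypothetical:

```javascript
// Assumes every filter word consists of letters only, so no escaping is needed.
function buildFilter(words) {
  // Longer endings come first in the alternation; the trailing \b keeps "wrenching" out.
  return new RegExp("\\b(?:" + words.join("|") + ")(?:es|'s|s)?\\b", "i");
}

var filter = buildFilter(["wrench", "chair", "john"]);
console.log(filter.test("two wrenches"));  // true
console.log(filter.test("John's chair"));  // true
console.log(filter.test("wrenching"));     // false
console.log(filter.test("armchairs"));     // false
```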
This works for me...assuming you have an array!
var a_filter = ["wrench","wrenches","wrench's"];
for(var i=0; i < a_filter.length; i++){
var re = new RegExp('\^' + a_filter[i] + '\$');
document.write(re.test("wrench's") + " " + a_filter[i] + "<br />");
}
Here is the fiddle: http://jsfiddle.net/XCARd/2/
Play with the re.test() to see it match.

How can I concatenate regex literals in JavaScript?

Is it possible to do something like this?
var pattern = /some regex segment/ + /* comment here */
/another segment/;
Or do I have to use new RegExp() syntax and concatenate a string? I'd prefer to use the literal as the code is both more self-evident and concise.
Here is how to create a regular expression without using the regular expression literal syntax. This lets you do arbitrary string manipulation before it becomes a regular expression object:
var segment_part = "some bit of the regexp";
var pattern = new RegExp("some regex segment" + /*comment here */
segment_part + /* that was defined just now */
"another segment");
If you have two regular expression literals, you can in fact concatenate them using this technique:
var regex1 = /foo/g;
var regex2 = /bar/y;
var flags = (regex1.flags + regex2.flags).split("").sort().join("").replace(/(.)(?=.*\1)/g, "");
var regex3 = new RegExp(regex1.source + regex2.source, flags);
// regex3 is now /foobar/gy
It's just more wordy than having the two expressions be literal strings instead of literal regular expressions.
Just randomly concatenating regular expression objects can have some adverse side effects. Use the source property of each RegExp instead:
var r1 = /abc/g;
var r2 = /def/;
var r3 = new RegExp(r1.source + r2.source,
(r1.global ? 'g' : '')
+ (r1.ignoreCase ? 'i' : '') +
(r1.multiline ? 'm' : ''));
console.log(r3);
var m = 'test that abcdef and abcdef has a match?'.match(r3);
console.log(m);
// m should contain 2 matches
This will also give you the ability to retain the regular expression flags from a previous RegExp using the standard RegExp flags.
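The same approach can be generalized to any number of pieces while keeping the union of their flags. A sketch only; concatRegExps is a made-up name, and it relies on the flags property, which exists in ES2015 and later engines:

```javascript
// Hypothetical helper: join the sources of several RegExp objects and merge their flags.
function concatRegExps(parts) {
  var source = "";
  var flags = "";
  var seen = {};
  parts.forEach(function (re) {
    source += re.source;
    // Collect each flag once, whichever piece it came from.
    re.flags.split("").forEach(function (flag) {
      if (!seen[flag]) {
        seen[flag] = true;
        flags += flag;
      }
    });
  });
  return new RegExp(source, flags);
}

var combined = concatRegExps([/abc/g, /def/i]);
console.log(combined.source); // "abcdef"
console.log(combined.flags);  // "gi"
```

Whether a flag from one piece should really apply to the whole pattern is a design decision; here every flag is simply carried over.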
I don't quite agree with the "eval" option.
var xxx = /abcd/;
var yyy = /efgh/;
var zzz = new RegExp(eval(xxx)+eval(yyy));
will give a pattern built from the text "/abcd//efgh/" (the delimiters of both literals become part of the pattern), which is not the intended result.
Using source like
var zzz = new RegExp(xxx.source+yyy.source);
will give "/abcdefgh/" and that is correct.
Logically there is no need to EVALUATE: you know your EXPRESSION. You just need its SOURCE, i.e. how it is written, not necessarily its value. As for the flags, you just need to use the optional argument of RegExp.
In my situation, I do run into the issue of ^ and $ being used in several of the expressions I am trying to concatenate! Those expressions are grammar filters used across the program. Now I want to use some of them together to handle the case of PREPOSITIONS.
I may have to "slice" the sources to remove the starting and ending ^( and/or )$ :)
Cheers, Alex.
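One possible way to do that "slicing" is to drop a leading ^ and a trailing unescaped $ from each source before joining them. This is a simplified sketch with a made-up helper name, stripAnchors; it does not remove wrapping parentheses and does not handle an escaped backslash directly before the final $:

```javascript
// Hypothetical helper: remove one leading "^" and one trailing unescaped "$".
// Simplification: a source ending in an escaped backslash plus "$" is not handled.
function stripAnchors(re) {
  var src = re.source;
  if (src.charAt(0) === "^") {
    src = src.slice(1);
  }
  // Only treat the final "$" as an anchor when no backslash precedes it.
  if (/(^|[^\\])\$$/.test(src)) {
    src = src.slice(0, -1);
  }
  return src;
}

var letters = /^[a-z]+$/;
var digits = /^\d+$/;
var both = new RegExp("^" + stripAnchors(letters) + stripAnchors(digits) + "$");
console.log(both.source);         // "^[a-z]+\d+$"
console.log(both.test("abc123")); // true
console.log(both.test("123abc")); // false
```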
There is a problem if the regexp contains back-references like \1.
var r = /(a|b)\1/ // Matches aa, bb but nothing else.
var p = /(c|d)\1/ // Matches cc, dd but nothing else.
Then just concatenating the sources will not work. Indeed, the combination of the two is:
var rp = /(a|b)\1(c|d)\1/
rp.test("aadd") // Returns false
The solution:
First we count the number of capturing groups in the first regex. Then, for each back-reference in the second, we increment its number by that count.
function concatenate(r1, r2) {
  var count = function(r, str) {
    // match() returns null when nothing matches, so fall back to an empty array.
    return (str.match(r) || []).length;
  };
  var numberGroups = /([^\\]|^)(?=\((?!\?:))/g; // Home-made regexp to count groups.
  var offset = count(numberGroups, r1.source);
  var escapedMatch = /[\\](?:(\d+)|.)/g; // Home-made regexp for escaped literals, greedy on numbers.
  var r2newSource = r2.source.replace(escapedMatch, function(match, number) {
    // Shift numeric back-references by the number of groups in r1; leave other escapes alone.
    return number ? "\\" + (number - 0 + offset) : match;
  });
  return new RegExp(r1.source + r2newSource,
    (r1.global ? 'g' : '')
    + (r1.ignoreCase ? 'i' : '')
    + (r1.multiline ? 'm' : ''));
}
Test:
var rp = concatenate(r, p) // returns /(a|b)\1(c|d)\2/
rp.test("aadd") // Returns true
Provided that:
you know what you are doing in your regexp;
you have many regex pieces to form a pattern and they will use same flag;
you find it more readable to separate your small pattern chunks into an array;
you also want to be able to comment each part for next dev or yourself later;
you prefer to visually simplify your regex like /this/g rather than new RegExp('this', 'g');
it's ok for you to assemble the regex in an extra step rather than having it in one piece from the start;
Then you may like to write this way:
var regexParts =
[
/\b(\d+|null)\b/,// Some comments.
/\b(true|false)\b/,
/\b(new|getElementsBy(?:Tag|Class|)Name|arguments|getElementById|if|else|do|null|return|case|default|function|typeof|undefined|instanceof|this|document|window|while|for|switch|in|break|continue|length|var|(?:clear|set)(?:Timeout|Interval))(?=\W)/,
/(\$|jQuery)/,
/many more patterns/
],
regexString = regexParts.map(function(x){return x.source}).join('|'),
regexPattern = new RegExp(regexString, 'g');
You can then do something like:
string.replace(regexPattern, function()
{
    var m = arguments,
        Class = '';
    switch(true)
    {
        // Numbers and 'null'.
        case (Boolean)(m[1]):
            m = m[1];
            Class = 'number';
            break;
        // True or False.
        case (Boolean)(m[2]):
            m = m[2];
            Class = 'bool';
            break;
        // Keywords.
        case (Boolean)(m[3]):
            m = m[3];
            Class = 'keyword';
            break;
        // $ or 'jQuery'.
        case (Boolean)(m[4]):
            m = m[4];
            Class = 'dollar';
            break;
        // More cases...
    }
    return '<span class="' + Class + '">' + m + '</span>';
})
In my particular case (a code-mirror-like editor), it is much easier to run one big regex than a series of replaces like the following. Each time I wrap an expression in an HTML tag, the next pattern becomes harder to target without affecting the HTML tag itself (and the lookbehind that would help here is unfortunately not supported in JavaScript):
.replace(/(\b\d+|null\b)/g, '<span class="number">$1</span>')
.replace(/(\btrue|false\b)/g, '<span class="bool">$1</span>')
.replace(/\b(new|getElementsBy(?:Tag|Class|)Name|arguments|getElementById|if|else|do|null|return|case|default|function|typeof|undefined|instanceof|this|document|window|while|for|switch|in|break|continue|var|(?:clear|set)(?:Timeout|Interval))(?=\W)/g, '<span class="keyword">$1</span>')
.replace(/\$/g, '<span class="dollar">$</span>')
.replace(/([\[\](){}.:;,+\-?=])/g, '<span class="ponctuation">$1</span>')
It would be preferable to use the literal syntax as often as possible. It's shorter, more legible, and you do not need to escape quotes or double-escape backslashes. From "JavaScript Patterns", Stoyan Stefanov, 2010.
But using new RegExp may be the only way to concatenate.
I would avoid eval. It's not safe.
You could do something like:
function concatRegex(...segments) {
return new RegExp(segments.join(''));
}
The segments would be strings (rather than regex literals) passed in as separate arguments.
You can concat regex source from both the literal and RegExp class:
var xxx = new RegExp(/abcd/);
var zzz = new RegExp(xxx.source + /efgh/.source);
Use the constructor with 2 params and avoid the problem with trailing '/':
var re_final = new RegExp("\\" + ".", "g"); // constructor can have 2 params!
console.log("...finally".replace(re_final, "!") + "\n" + re_final +
" works as expected..."); // !!!finally works as expected
// meanwhile
re_final = new RegExp("\\" + "." + "g"); // "g" becomes part of the pattern: /\.g/
console.log("... finally".replace(re_final, "!")); // "... finally" (nothing is replaced)
console.log(re_final, "does not work!"); // /\.g/ does not work!
No, the literal way is not supported. You'll have to use RegExp.
The easiest way for me would be to concatenate the sources, e.g.:
a = /\d+/
b = /\w+/
c = new RegExp(a.source + b.source)
The value of c will be:
/\d+\w+/
I prefer to use eval('your expression') because it does not add the / on each end that new RegExp does.
