How to use 'or' operator in regex properly? - javascript

I am experimenting with JavaScript's regular expressions.
I understand that '|' is used to OR two regular expressions together.
I created the regex /^a*|b*$/, and I want it to match any string that contains only the characters 'a' or 'b'.
But when I try /^a*|b*$/.test('c'), it returns true.
What am I misunderstanding about the '|' operator?
Here's my code:
let reg = /^a*|b*$/;
< undefined
reg.test('c');
< true

| has very low precedence. ^a*|b*$ matches
either ^a*
or b*$
i.e. either a string beginning with 0 or more 'a's or a string ending with 0 or more 'b's. (Because matching 0 'a's is allowed by the regex, any string will match (because every string has a beginning).)
To properly anchor the match on both sides, you need
/^(?:a*|b*)$/
(the (?: ) construct is a non-capturing group).
You could also use
/^a*$|^b*$/
instead.
Note that both of these regexes will only match strings like aa, bbbbbb, etc., but not aba. If you want to allow the use of mixed a/b characters in a string, you need something like
/^(?:a|b)*$/
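A small sketch of the difference between the three patterns discussed above. The variable names and sample strings are chosen here for illustration only.

```javascript
// The three patterns from the answer above, side by side.
const unanchored = /^a*|b*$/;     // parsed as (^a*) OR (b*$)
const grouped = /^(?:a*|b*)$/;    // whole string is only a's, or only b's
const mixed = /^(?:a|b)*$/;       // whole string is any mix of a's and b's

const samples = ["", "aa", "bbb", "aba", "c"];

// One row per sample string, one column per pattern.
const results = samples.map(s => ({
  input: s,
  unanchored: unanchored.test(s),
  grouped: grouped.test(s),
  mixed: mixed.test(s),
}));

console.table(results);
```

The unanchored pattern accepts every sample, including 'c', because ^a* can match zero characters at the start of any string.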

The OR in your example will split the expression in these alternatives:
^a* and b*$.
You can use a group to delimit the alternatives, for example:
/^(a*|b*)$/
This will match empty strings, strings that contain only a characters, and strings that contain only b characters.
If you're looking to match strings that contain both a and b characters, you can use something like:
/^[ab]*$/
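As a rough illustration, the only practical difference between the capturing group (...) and the non-capturing group (?:...) is what ends up in the match result. The variable names below are made up for this sketch.

```javascript
// A capturing group adds an extra entry to the match array.
const capturing = "aa".match(/^(a*|b*)$/);
// A non-capturing group only groups; nothing extra is recorded.
const nonCapturing = "aa".match(/^(?:a*|b*)$/);
// The character class accepts mixed strings.
const charClass = /^[ab]*$/.test("aba");

console.log(capturing.length, capturing[1]);
console.log(nonCapturing.length);
console.log(charClass);
```

If the captured text is never read, either form gives the same true/false answer from test().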

Related

Parsing array syntax using regex in javascript

I found the answer for this here but it's in php.
I would like to match an array like [123, "hehe", "lala"] but only if the array syntax is correct.
I made this regex /\["?.+"?(?:,"?.+"?)*\]/.
The problem is that if the input is [123, "hehe, "lala"], the regex matches even though the syntax is incorrect.
How can I make it match only if the array syntax is correct?
My problem is making the second " required when the first " is matched.
Edit: I'm only trying to handle strings and numbers inside the array.
You can try this regex: /\[((\d+|"([^"]|\\")*?")\s*,?\s*)*(?<!,)\]/
Each item should be either:
"([^"]|\\")*?": a string that starts and ends with " and contains anything but ". If a " is contained, it should be escaped (\").
\d+: a number.
Each item may be followed by:
\s*,?\s*: an optional comma with any number of spaces before or after.
The closing bracket must not be directly preceded by a comma: (?<!,)
Demo: https://regex101.com/r/jRAQUc/1
You must have two (or more) separate expressions (using the | operator) in order to do that.
So it would be something like this:
/\[\s*("[^"]*"|[0-9]+)(\s*,\s*("[^"]*"|[0-9]+))*\s*\]/
(You may also want to use ^ at the start and $ at the end to make sure nothing else appears before/after the array: /^...snip...$/ to match the string from start to finish.)
If you need floating point numbers with exponents, add a period and the 'e' character: [0-9.eE]+ (which is why I did not use \d+, which allows only digits). Making sure a number is actually valid is much more complicated (sign, exponent with or without sign, digits only before or after the decimal point...).
You could also support single quoted strings. That too is a separate expression: '[^']*'.
You may want to allow spaces before and after the brackets too (start: /^\s*\[... and end: ...\]\s*$/).
Finally, if you want to really support JavaScript strings you would need to add support for the backslash. Something like this: ("([^"]|\\.)*").
Note
Your .+ expression would match " and , too, and without the ^ and $ anchors a string such as the following matches your expression just fine:
This Array ["test", 123, true, "this"] Here
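A minimal sketch of the anchored version of this answer's expression, limited to unsigned integers and double-quoted strings without escapes. The names item, arrayPattern and isSimpleArray are hypothetical helpers introduced for this example.

```javascript
// One array element: a double-quoted string without embedded quotes, or digits.
const item = String.raw`(?:"[^"]*"|[0-9]+)`;

// Whole input must be: [ item (, item)* ] with optional whitespace around tokens.
const arrayPattern = new RegExp(
  String.raw`^\s*\[\s*${item}(?:\s*,\s*${item})*\s*\]\s*$`
);

// Hypothetical helper: true only when the entire text is a well-formed array.
function isSimpleArray(text) {
  return arrayPattern.test(text);
}

console.log(isSimpleArray('[123, "hehe", "lala"]'));
console.log(isSimpleArray('[123, "hehe, "lala"]'));
console.log(isSimpleArray('This Array ["test", 123, true, "this"] Here'));
```

Building the pattern from a shared item fragment keeps the two occurrences of the alternation in sync if more element types are added later.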

Write a regex for usernames

I want a Regex for my mongoose schema to test if a username contains only letters, numbers and underscore, dash or dot. What I got so far is
/[a-zA-Z0-9-_.]/
but somehow it lets pass everything.
Your regex is set to match a string if it contains ANY of the contained characters, but it doesn't make sure that the string is composed entirely of those characters.
For example, /[a-zA-Z0-9-_.]/.test("a&") returns true, because the string contains the letter a, regardless of the fact that it also includes &.
To make sure all characters are one of your desired characters, use a regex that matches the beginning of the string ^, then your desired characters followed by a quantifier + (a plus means one or more of the previous set, a * would mean zero or more), then end of string $. So:
const reg = /^[a-zA-Z0-9-_.]+$/
console.log(reg.test("")) // false
console.log(reg.test("I-am_valid.")) // true
console.log(reg.test("I-am_not&")) // false
Try it like this, with start (^) and end ($) anchors:
^[a-zA-Z0-9-_.]+$
See demo : https://regex101.com/r/6v0nNT/3
/^([a-zA-Z0-9]|[-_\.])*$/
This regex should work.
^ matches at the beginning of the string. $ matches at the end of the string. This means it checks for the entire string.
The * allows it to match any number of characters or sequences of characters. This is required to match the entire username.
The parentheses are required because a | (or) is used here. The first bracket expression is the one you already had, covering uppercase and lowercase letters and digits. The second bracket expression covers the other characters. Outside a character class, . is a special character that matches any character and must be escaped with a backslash to match a literal dot; inside brackets, as here, the escape is optional but harmless.
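One detail worth checking is the quantifier: with * the empty string is accepted, while + requires at least one character. The sketch below compares the pattern from this answer with a single-class variant; the variable names are illustrative.

```javascript
// Pattern from the answer above: two classes joined by |, repeated zero or more times.
const starVersion = /^([a-zA-Z0-9]|[-_\.])*$/;
// Single character class with the hyphen last, repeated one or more times.
const plusVersion = /^[a-zA-Z0-9_.-]+$/;

// Each row: [username, result with *, result with +].
const checks = ["", "I-am_valid.", "I-am_not&"].map(name => [
  name,
  starVersion.test(name),
  plusVersion.test(name),
]);

console.log(checks);
```

The two patterns agree on every non-empty input; they differ only on the empty string.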

javascript regexp does not evaluate only one character

I'm using this regexp:
/[^+][a-z]/.test(str)
I'm trying to ensure that if there are any letters ([a-z]) in a string (str) not preceded by a plus ([^+]), a match is found and it therefore returns true.
It mostly works, except when there is only one character in the string. For example, a returns false, even though there is no plus sign preceding it.
How can I ensure it works for all strings, including one-character strings? Thanks!
Add a ^ as an alternative to [^+]:
/(?:^|[^+])[a-z]/.test(str)
^^^^^^^^^^
The (?:^|[^+]) is a non-capturing alternation group matching either the start of the string (with ^) or (|) any char other than + (with [^+]).
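A short comparison of the original and the fixed pattern on a few inputs. The third pattern uses a negative lookbehind, which is an alternative not mentioned in the answer and assumes an engine with ES2018 lookbehind support.

```javascript
const original = /[^+][a-z]/;          // needs two characters, so "a" fails
const fixed = /(?:^|[^+])[a-z]/;       // start of string OR a non-plus character
const lookbehind = /(?<!\+)[a-z]/;     // alternative sketch: letter not preceded by +

const inputs = ["a", "+a", "1+a", "+ab", "1a"];

const table = inputs.map(s => ({
  s,
  original: original.test(s),
  fixed: fixed.test(s),
  lookbehind: lookbehind.test(s),
}));

console.table(table);
```

Note that "+ab" matches with every pattern, because the b is preceded by a, not by a plus sign.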

unable to parse - in Regular expression in Javascript

I am a bit new to the regular expressions in Javascript.
I am trying to write a function called parseRegExpression()
which parses the attributes passed to it and generates key/value pairs.
It works fine with the input:
"iconType:plus;iconPosition:bottom;"
But it is not able to parse the input:
"type:'date';locale:'en-US';"
Basically the - sign is being ignored. The code is at:
http://jsfiddle.net/visibleinvisibly/ZSS5G/
The regular expression for a key/value pair is as follows:
/[a-z|A-Z|-]*\s*:\s*[a-z|A-Z|'|"|:|-|_|\/|\.|0-9]*\s*;|[a-z|A-Z|-]*\s*:\s*[a-z|A-Z|'|"|:|-|_|\/|\.|0-9]*\s*$/gi;
There are a few problems:
A | inside a character class means a literal | character, not an alternation.
A . inside a character class means a literal . character, so there's no need to escape it.
A - as the first or last character inside a character class means a literal - character, otherwise it means a character range.
There's no need to use [a-zA-Z] when you use the case-insensitive modifier (i); [a-z] is enough.
The only difference between your alternations is the last bit; this can be simplified significantly by limiting the alternation to the part that differs.
This should be equivalent to your original pattern:
/[a-z-]*\s*:\s*[a-z0-9'":_\/.-]*\s*(?:;|$)/gi
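As a quick check, the simplified pattern can be run against both inputs from the question. The variable names pair, first and second are chosen for this sketch.

```javascript
// Simplified pattern from the answer above; hyphen placed last in each class.
const pair = /[a-z-]*\s*:\s*[a-z0-9'":_\/.-]*\s*(?:;|$)/gi;

// With the g flag, match() returns every key/value chunk as a string.
const first = "iconType:plus;iconPosition:bottom;".match(pair);
const second = "type:'date';locale:'en-US';".match(pair);

console.log(first);
console.log(second);
```

The hyphen in 'en-US' is now kept as part of the value.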
You can avoid the regex:
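var test1 = "iconType:plus;iconPosition:bottom;";
var test2 = "type:'date';locale:'en-US';";
// Split on ';' to get the pairs, then on the first ':' to get key and value.
function toto(str) {
var result = [];
var temp = str.split(';');
// The last element is the empty string after the trailing ';', so skip it.
for (var i = 0; i < temp.length - 1; i++) {
var sep = temp[i].indexOf(':');
result[i] = [temp[i].slice(0, sep), temp[i].slice(sep + 1)];
}
return result;
}
console.log(toto(test1));
console.log(toto(test2));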
Inside a character set atom [...] the pipe char | is just a regular char and doesn't mean "or".
A character set atom lists characters or ranges you want to accept (or exclude if the character set starts with ^) and "or" is implicit.
You can use a backslash in a character set if you need to include/exclude a close bracket ], the ^ sign, the dash - that is used for ranges, the backslash \ itself, an unprintable character or if you want to use a non-ASCII unicode char specifying the code instead of literally.
Regular expression syntax however also lets you avoid backslash-escaping in a character set atom by placing the character in a position where it cannot have the special meaning... for example a dash - as first or last in the set (it cannot mean a range there).
Note also that if the values you need to match are quoted strings, including backslash escaping, the regular expression is more complex, for example
'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"
matches a single-quoted or double-quoted string including backslash escaping, the meaning being:
A single quote '
Zero or more of either:
Any char except the single quote ' or the backslash \
A pair composed of a backslash \ followed by any char
A single quote '
or the same with double quotes " instead.
Note that the groups have been delimited with (?:...) instead of plain (...) to avoid capture
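A brief sketch of the quoted-string pattern in use. The sample text is invented for this example, and String.raw is used so the backslashes in it are kept literally.

```javascript
// Single-quoted or double-quoted string, allowing backslash escapes inside.
const quoted = /'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"/g;

// String.raw keeps every backslash in the sample exactly as typed.
const sample = String.raw`say 'it\'s' then "a \"b\" c" done`;

// With the g flag, match() returns all quoted strings found in the sample.
const found = sample.match(quoted);

console.log(found);
```

The escaped quotes do not end the strings early, because a backslash can only be consumed together with the character that follows it.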
It doesn't match hyphens because it is interpreting |-| as a range that starts at | and ends at |. (I would have expected that to be treated as a syntax error, but there you have it. It works the same in every regex flavor I've tried, too.)
Have a look at this regex:
/(?:^|;)([a-z-]*)\s*:\s*([a-z'":_\/.0-9-]*)\s*(?=;|$)/ig
As suggested by the other responders, I collapsed it to one alternative, removed the unneeded pipes, and escaped the hyphen by moving it to the end. I also anchored it at the beginning as well as the end. Or anchored it as well as I can, anyway. I used a lookahead to match the trailing semicolon so it will still be there when the next match starts. It's far from foolproof, but it should work okay as long as the input is well formed.
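The range behaviour and the capture groups can both be checked directly. This is only a sketch: the variable names are illustrative, and String.prototype.matchAll assumes an ES2020 environment.

```javascript
// |-| inside a class is a range from | to |, so it matches only the pipe.
const pipeRange = /[|-|]/;
const hyphenIsMatched = pipeRange.test("-");
const pipeIsMatched = pipeRange.test("|");

// Pattern from the answer above: group 1 is the key, group 2 is the value.
const pairs = /(?:^|;)([a-z-]*)\s*:\s*([a-z'":_\/.0-9-]*)\s*(?=;|$)/ig;

// matchAll yields one match object per pair; keep only the two captures.
const parsed = Array.from(
  "type:'date';locale:'en-US';".matchAll(pairs),
  m => [m[1], m[2]]
);

console.log(hyphenIsMatched, pipeIsMatched);
console.log(parsed);
```

Because the trailing semicolon is only looked at, not consumed, it is still available as the leading separator of the next match.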
Replace the regular expressions in your code as follows:
regExpKeyValuePair = /[-a-z]*\s*:\s*[-a-z'":_\/.0-9]*\s*;|[-a-z]*\s*:\s*[-a-z'":_\/.0-9]*\s*$/gi;
regExpKey = /[-a-z]*/gi;
regExpValue = /[-a-z:_\/.0-9]*/gi;
You don't need to escape . inside [].
There is no need to put | between elements inside [].
Because you are using the /i flag, [A-Z] is not needed.
- should be at the beginning or at the end of the character class.

How to find occurrence of multiple strings in a given string using javascript RegExp()

I want to check whether multiple strings all occur in a given string (without using a loop),
like this:
my_string = "How to find occurence of multiple sting in a given string using javascript RegExp";
// search operated on this string
// having a-z (lower case) , '_' , 0-9 , and space
// following are the strings wanted to search .( having a-z , 0-9 , and '_')
search_str[0]="How";
search_str[1]="javascript";
search_str[2]="sting";
search_str[3]="multiple";
I don't need their positions.
I just need to know whether all of the search_str entries are in my_string.
The order of search_str never affects the result.
Is there a regular expression for this?
UPDATE: WHAT AM I MISSING?
Among the answers, I found that this one works for the problem above:
if (/^(?=.*\bHow\b)(?=.*\bjavascript\b)(?=.*\bsting\b)(?=.*\bmultiple\b)/.test(subject)) {
// Successful match
}
But in this case it does not work:
m_str="_3_5_1_13_10_11_";
search_str[0]='3';
search_str[1]='1';
tst=new RegExp("^(?=.*\\b_"+search_str[0]+"_\\b)(?=.*\\b_"+search_str[1]+"_\\b)");
if(tst.test(m_str)) alert('fooooo'); else alert('wrong');
if (/^(?=.*\bHow\b)(?=.*\bjavascript\b)(?=.*\bsting\b)(?=.*\bmultiple\b)/.test(subject)) {
// Successful match
}
This assumes that your string doesn't contain newlines. If it does, you need to change all the .s to [\s\S].
I have used word boundary anchors to make sure that Howard or resting don't accidentally provide a match. If you do want to allow that, remove the \bs.
Explanation:
(?=...) is a lookahead assertion: It looks ahead in the string to check whether the enclosed regex could match at the current position without actually consuming characters for the match. Therefore, a succession of lookaheads works like a sequence of regexes (anchored to the start of the string by ^) that are combined with a logical && operator.
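A sketch of building the lookahead chain from an array instead of writing it by hand. The helpers escapeRegExp and containsAll, and the wrap parameter, are hypothetical names introduced here. The sketch also shows why the underscore-delimited case in the question's update fails: _ counts as a word character, so there is no \b between _ and an adjacent digit.

```javascript
// Hypothetical helper: escape regex metacharacters in a literal search term.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds ^(?=.*A)(?=.*B)... and tests the haystack.
// `wrap` turns each escaped term into the sub-pattern that must be found.
function containsAll(haystack, terms, wrap) {
  const source =
    "^" + terms.map(t => "(?=.*" + wrap(escapeRegExp(t)) + ")").join("");
  return new RegExp(source).test(haystack);
}

const sentence =
  "How to find occurence of multiple sting in a given string using javascript RegExp";
const words = ["How", "javascript", "sting", "multiple"];
const allWords = containsAll(sentence, words, t => "\\b" + t + "\\b");

const ids = "_3_5_1_13_10_11_";
// The underscores already delimit each id, so no \b is needed.
const hasBoth = containsAll(ids, ["3", "1"], t => "_" + t + "_");
// As in the question's update: \b cannot match between "_" and a digit.
const withBoundary = containsAll(ids, ["3", "1"], t => "\\b_" + t + "_\\b");

console.log(allWords, hasBoth, withBoundary);
```

Dropping the \b anchors and relying on the underscores themselves is one way to make the second case work; the delimiters also stop "1" from matching inside "_10_" or "_11_".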
