What is the meaning of the returned array of exec() - javascript

For instance one has the simple regular expression:
var regex = /([^\\^\{^\}]+)|(\\[A-Za-z]+(\{[^}^{]*})*)|($[.]+$)|($$[.]+$$)/g;
and a string to check:
var text = '\\a{b}{c}{d}';
on which the function var matched = regex.exec(text) is run.
The returned array matched now looks like:
matched =['\\a{b}{c}{d}', undefined, '\\a{b}{c}{d}', '{d}', undefined, undefined];
What do the individual entries mean, and how can I control them?
Thanks in advance!

var regex = /([^\\^\{^\}]+)|(\\[A-Za-z]+(\{[^}^{]*})*)|($[.]+$)|($$[.]+$$)/;
regex.exec('\\a{b}{c}{d}');
//=> ["\a{b}{c}{d}", undefined, "\a{b}{c}{d}", "{d}", undefined, undefined]
The resulting array contains the matched groups; its first element is the whole substring of the input that matched your regex.
undefined means that the corresponding group did not take part in the match and so captured nothing.
You can use non-capturing groups to avoid the undefined entries:
var regex = /(?:[^\\^\{^}]+)|(\\[A-Za-z]+(\{[^}^{]*})*)|(?:$[.]+$)|(?:$$[.]+$$)/;
regex.exec('\\a{b}{c}{d}');
//=> ["\a{b}{c}{d}", "\a{b}{c}{d}", "{d}"]

This is just how RegExp.prototype.exec works.
The groups in your regex (the parts enclosed in parentheses) create elements in the returned array.

The zero-th element is the full match substring, the following elements are the substrings that were matched by capture groups (....). For a simpler example:
/(c)|(b(.))./.exec('abcdef') should return ['bcd', undefined, 'bc', 'c']. The pattern is an alternation between (c) and (b(.)).; "bcd" came before "c", so the second alternative matched while the first didn't. Thus, first capture group (c) is undefined, since it matched nothing. The second capture group (b(.)) matched "bc". The third, (.), matched "c".
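To make the positions concrete, here is a small sketch that labels each slot of the array from the example above. The variable names (result, fullMatch, group1, and so on) are only illustrative and are not part of any API.

```javascript
// Label each slot of the array returned by exec().
const result = /(c)|(b(.))./.exec('abcdef');

// Index 0 is the whole match; indexes 1..n follow the order of the
// opening parentheses in the pattern.
const [fullMatch, group1, group2, group3] = result;

console.log(fullMatch);    // "bcd"
console.log(group1);       // undefined, because (c) did not take part in the match
console.log(group2);       // "bc"
console.log(group3);       // "c"
console.log(result.index); // 1, the position in 'abcdef' where the match starts
```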

Let's start with a simpler example:
var text = 'azrt12345';
var regex = /([a-z]+)|([0-9]+)/;
var matched = regex.exec(text);
/* matched = ["azrt", "azrt", undefined] */
As you can see, the regexp matches either an alphabetic string or a numeric one. As text begins with alphabetic characters, the first capturing group matches and the second does not. So matched[0] contains the whole matched string, matched[1] contains what the first capturing group captured, and matched[2] is undefined because the second group captured nothing.
See this excellent doc to understand the way it works.
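One way to "control" the entries is to check which group is defined and branch on that. The following is a minimal sketch for the two-alternative pattern above; the helper name whichAlternative and the labels it returns are made up for illustration.

```javascript
const regex = /([a-z]+)|([0-9]+)/;

// Hypothetical helper: reports which alternative produced the first match.
function whichAlternative(text) {
  const matched = regex.exec(text);
  if (matched === null) return null; // nothing matched at all
  // For this pattern exactly one of the two groups is defined.
  return matched[1] !== undefined ? 'letters' : 'digits';
}

console.log(whichAlternative('azrt12345')); // "letters"
console.log(whichAlternative('12345azrt')); // "digits"
console.log(whichAlternative('---'));       // null
```

Comparing against undefined rather than testing truthiness matters when a group can legitimately capture an empty string.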

Related

JavaScript regex returns two matches without /g flag

Here is my code. I tried many different ways to achieve this but failed. The expression is /((\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+)/, which should return only one match, i.e. 123,24,23233, because I am not using the /g flag. But it returns weird output, i.e. 123,24,23233,123,24,23233,,,,,, (the same thing twice).
But when I use the /g flag, i.e. /((\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+)/g, it works properly and returns 123,24,23233,123,24,23233325676. Why?
My problem is that I want the regex to return only the first match, which is 123,24,23233, but I get the weird result 123,24,23233,123,24,23233,,,,,,
Reference 1/2:
var str = "(3,(123,24,23233),(3,3,(123,24,(123,24,23233325676))))";
var exp = /((\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+)/;
var res = str.match(exp);
console.log(res);
Reference 2/2:
JSfiddle (Without /g flag)
Jsfiddle (With /g flag)
Regex
Since you did not include the global flag, "only the first complete match and its related capturing groups [will be] returned."
You will want to destructure the first group from your match object. The full match is the first item in the match result.
const
str = "(3,(123,24,23233),(3,3,(123,24,(123,24,23233325676))))",
exp = /((\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+,(\s?)+\d{1,}(\s?)+)/,
[fullMatch, res] = str.match(exp);
console.log(res); // 123,24,23233
Please review String.prototype.match() and the RegExp object over at MDN.
Reference documentation
Return value
An Array whose contents depend on the presence or absence of the global (g) flag, or null if no matches are found.
If the g flag is used, all results matching the complete regular expression will be returned, but capturing groups will not.
If the g flag is not used, only the first complete match and its related capturing groups are returned. In this case, the returned item will have additional properties as described below.
Additional properties
As explained above, some results contain additional properties as described below.
groups
An object of named capturing groups whose keys are the names and values are the capturing groups or undefined if no named capturing groups were defined. See Groups and Ranges for more information.
index
The index of the search at which the result was found.
input
A copy of the search string.
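The difference described in the documentation excerpt above can be seen in a short sketch. The sample string and pattern here are invented for the demonstration and are not taken from the question.

```javascript
const sample = 'x=1, y=2';

// Without g: the first match plus its captures, with extra properties.
const first = sample.match(/(\w)=(\d)/);
console.log(first[0], first[1], first[2]); // "x=1" "x" "1"
console.log(first.index);  // 0
console.log(first.input);  // "x=1, y=2"
console.log(first.groups); // undefined, the pattern has no named groups

// With g: every full match, no captures, no extra properties.
const all = sample.match(/(\w)=(\d)/g);
console.log(all);       // ["x=1", "y=2"]
console.log(all.index); // undefined
```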

javascript getting a faulty result using a regular expression

In my web page, I have:
var res = number.match(/[0-9\+\-\(\)\s]+/g);
alert(res);
As you can see, I want to get only digits, the characters +, -, (, ), and whitespace (\s).
When I tried number = '98+66-97fffg9', the expected result is: 98+66-979
but I get 98+66-97,9
The comma is the odd character here! How can I eliminate it?
It's probably because two separate substrings satisfied your expression.
In other words, the match mechanism stops extending a match when it finds the first unwanted character, f. It then skips ahead to the next matching run, which in this case contains only one digit, 9. The two matches are displayed separated by a comma.
Try this:
var number = '98+66-97fffg9';
var res = number.match(/[0-9\+\-\(\)\s]+/g);
// res is an array! You have to join elements!
var joined = res.join('');
alert(joined);
You're getting this because your regex matched two results in the number string, not one. Try printing res and you'll see that you've matched both 98+66-97 and 9.
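The comma itself is not in the matches; it appears when the array is converted to a string for display. A quick sketch of that conversion, using the same input as the question:

```javascript
const number = '98+66-97fffg9';
const res = number.match(/[0-9+\-()\s]+/g);

console.log(Array.isArray(res)); // true
console.log(res.length);         // 2
// alert(res) converts the array to a string, which joins the items with commas.
console.log(String(res));        // "98+66-97,9"
// Joining with an empty separator gives the text the asker expected.
console.log(res.join(''));       // "98+66-979"
```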
String.match returns an array of matched items. In your case you received two items, ['98+66-97','9'], but the alert function displays them as one string, '98+66-97,9'. Instead of match, use String.replace to remove (filter out) all disallowed characters from the input number:
var number = '98+66-97fffg9',
res = number.replace(/[^0-9\+\-\(\)\s]+/g, "");
console.log(res); // 98+66-979
stringvariable.match(/[0-9\+\-\(\)\s]+/g); returns the substrings of stringvariable that match, leaving out the characters that do not.
In your case the string is 98+66-97fffg9, so the regular expression skips "fffg" and gives you the array ["98+66-97","9"].
This is the default behavior of the match function.
You can simply call res.join('') to get the required output.
Hope it helps.
According to the documentation, the return value is:
An Array containing the entire match result and any parentheses-captured matched results, or null if there were no matches.
So, your return value contains
["98+66-97", "9"]
These are two separate matches rather than parentheses-captured results, because with the g flag match returns every match.
If you only want the first match, just remove the g flag from the regular expression.
Your expression should then look like this:
number.match(/[0-9\+\-\(\)\s]+/); which gives the result ["98+66-97"] (note that the trailing 9 is then no longer included).

What's wrong with this regex logic

I am trying to fetch the value after the equals sign. It works, but I am getting duplicated values. Any idea what's wrong here?
// Regex for finding a word after "=" sign
var myregexpNew = /=(\S*)/g;
// Regex for finding a word before "=" sign
var mytype = /(\S*)=/g;
//Setting data from Grid Column
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
var newtype = mytype.exec(strNew);
alert(matchNew);
https://jsfiddle.net/6vjjv0hv/
exec returns an array: the first element is the full match and the following ones are the submatches. That's why you get ["=20", "20"] (using console.log here instead of alert would make it clearer what you get).
When looking for submatches and using exec, you're usually interested in the elements starting at index 1.
Regarding the whole parsing, there are clearly better solutions, like using only one regex with two submatches, but it depends on the real goal.
You can try without using Regex like this:
var val = 'QCById=20';
var myString = val.substring(val.indexOf("=") + 1); // substring avoids the deprecated substr
alert(myString);
Presently exec is returning you the matched value.
REGEXP.exec(SOMETHING) returns an array (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec).
The first item in the array is the full match and the rest matches the parenthesized substrings.
You do not get duplicated values, you just get an array of a matched value and the captured text #1.
See RegExp#exec() help:
If the match succeeds, the exec() method returns an array and updates properties of the regular expression object. The returned array has the matched text as the first item, and then one item for each capturing parenthesis that matched containing the text that was captured.
Just use the [1] index to get the captured text only.
var myregexpNew = /=(\S*)/g;
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
if (matchNew) {
console.log(matchNew[1]);
}
To get values on both sides of =, you can use /(\S*)=(\S*)/g regex:
var myregexpNew = /(\S*)=(\S*)/g;
var strNew = "QCById=20";
var matchNew = myregexpNew.exec(strNew);
if (matchNew) {
console.log(matchNew[1]);
console.log(matchNew[2]);
}
Also, you may want to add a check to see if the captured values are not undefined/empty since \S* may capture an empty string. OR use /(\S+)=(\S+)/g regex that requires at least one non-whitespace character to appear before and after the = sign.
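The difference between the two quantifiers shows up when one side of the = sign is missing. Here is a minimal sketch; the input '=20' is a made-up edge case, and the g flag is left out because only a single match is inspected.

```javascript
const loose = /(\S*)=(\S*)/;
const strict = /(\S+)=(\S+)/;

// \S* accepts an empty key, so the first capture is an empty string.
const m1 = loose.exec('=20');
console.log(m1[1] === ''); // true
console.log(m1[2]);        // "20"

// \S+ requires at least one character on each side, so there is no match.
const m2 = strict.exec('=20');
console.log(m2); // null
```

With the loose pattern the caller has to check for empty captures; with the strict one a null check is enough.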

`match` and `exec` with non-global regex appear to return the first match twice

I do not quite understand the behaviour of the JavaScript regex methods.
The problem is that I can’t get regexes of type /(something|something)/ to work with the match or exec methods without the global identifier, e.g. /(somereg1|somereg2)/g.
When the global identifier is there, the methods correctly return every instance it finds. But when it is not there, both methods correctly return only the first match they find. The problem is that they appear to return it twice. For instance:
const str = "Here is somereg1 and somereg2";
str.match(/(somereg1|somereg2)/)
I would expect this match call to return "somereg1". Instead it appears to return "somereg1,somereg1".
Check this JSFiddle. The code should be fairly self explanatory. The first example is taken from W3Schools.
The first element is the full match of the regex. If you tried this:
const str = "Here is somereg1 and somereg2";
str.match(/.*(somereg1|somereg2)/)
Your result would be [ "Here is somereg1 and somereg2", "somereg2" ].
This same behaviour occurs with an .exec(str) method call.
You might want to read about .match and .exec.
About the “sub parentheses matches”: in regexes, parentheses delimit capture groups. So, if you had this regex:
/.*(somereg1).*?(somereg2)/
Your .match result would be [ "Here is somereg1 and somereg2", "somereg1", "somereg2" ]. So, as you can see, the result array consists of the full match followed by all capture groups matches.
And to force a group not to be captured, just delimit with (?: and ):
"Here is somereg1 and somereg2".match(/.*(?:somereg1).*?(somereg2)/);
// Will result in [ "Here is somereg1 and somereg2", "somereg2" ].
Note that the g (global) flag changes the return semantics of match: it returns an array of full matches, and capture groups are ignored. exec, on the other hand, always returns the full match and the capture groups of the next match found at or after the current lastIndex of the RegExp instance (lastIndex is only used when the g or y flag is set). For convenience, matchAll can be used instead; it returns an iterator of all matches, including all capture groups.
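The three behaviours can be compared side by side. This is a sketch using the string from the question; the variable names are illustrative only.

```javascript
const str = 'Here is somereg1 and somereg2';
const re = /(somereg1|somereg2)/g;

// match with g: full matches only.
console.log(str.match(re)); // ["somereg1", "somereg2"]

// exec with g: one match per call, resuming from re.lastIndex.
const a = re.exec(str);
console.log(a[0], a[1], re.lastIndex); // "somereg1" "somereg1" 16
const b = re.exec(str);
console.log(b[0], b[1], re.lastIndex); // "somereg2" "somereg2" 29
const c = re.exec(str);
console.log(c, re.lastIndex); // null 0 (lastIndex is reset after a failed search)

// matchAll: an iterator of full result arrays (it requires the g flag).
const captures = [...str.matchAll(re)].map((m) => m[1]);
console.log(captures); // ["somereg1", "somereg2"]
```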
You can use the following to get the captured value:
var str = "Here is somereg1 and somereg2";
str.match(/(?=(somereg1|somereg2))/); // [ "", "somereg1" ]: the lookahead consumes nothing, so the full match is empty
As for match versus exec, I would go for match with a regex literal, since a literal spares you the double escaping that is needed when the pattern is written as a string.
Modify your second line as below:
str.match(/somereg1|somereg2/)

Trying to capture group in javascript regex (port from c#)

I have this regex
var mregexp = /(\$m[\w|\.]+)/g;
var mstring = "$m.x = $m.y";
So basically capture each instance of $m.[+ any number of alphanumeric or . until another character or the end]
I have this working in C# but I'm trying to port it over to javascript, so dropped the name capture.
var match = mregexp.exec(mstring);
match has
0: "$m.x"
1: "$m.x" // not $m.y as I would have expected.
What am I doing wrong?
thanks
Your regular expression just matches once per call to .exec(). The [0] element of the returned array is the entire matched substring. The [1] element is the first group, which in your case is the same. You'd have to call .exec() again to get it to find the second instance.
You can pass a function to .replace(), which I personally like:
mstring.replace(mregexp, function(_, group) {
console.log( group );
});
That'd show you both matched groups. (The function is passed arguments that are of the same nature as the elements of the returned array from .exec().)
You will have to repeat mregexp.exec() until it returns null.
var match = []; // collects the captured group of every match
var m;
while ((m = mregexp.exec(mstring)) !== null) {
  match.push(m[1]); // m[1] is the first capture group of the current match
}
For Javascript's flavor of regexen see http://www.regular-expressions.info/javascript.html
You can call mstring.match(mregexp) to return all of the matches, but you only see the matched substrings (in which case you could simplify mregexp to /\$m[\w.]+/g).
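A short sketch of the two variants mentioned above, assuming the same input string as in the question. The names firstOnly and allMatches are illustrative.

```javascript
const mstring = '$m.x = $m.y';

// Without g: only the first match, followed by its capture group.
const firstOnly = mstring.match(/(\$m[\w.]+)/);
console.log(firstOnly[0], firstOnly[1]); // "$m.x" "$m.x"

// With g and no capture group: every matched substring.
const allMatches = mstring.match(/\$m[\w.]+/g);
console.log(allMatches); // ["$m.x", "$m.y"]
```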
