How can I use a regexp to match all repeated substrings in JavaScript?
For example:
Get [ "cd","cd","cdcd","cdcdcd", "cdcdcdcd" ] from "abccdddcdcdcdcd123".
Using + does not work:
"abccdddcdcdcdcd123".match(/(cd)+/g)
Array [ "cd", "cdcdcdcd" ]
This can be done with a positive lookahead, (?=...). A lookahead does not move the cursor forward, so the same content can be matched multiple times.
var re = /cd(?=((cd)*))/g;
var str = "abccdddcdcdcdcd123";
var m;
while (m = re.exec(str)) {
console.log(m[0]+m[1]);
}
m[0] holds the first cd, and the capture group inside the positive lookahead (m[1]) holds all the cd repetitions that follow it. Concatenate the two to get the desired result.
See https://www.regular-expressions.info/refadv.html
Matches at a position where the pattern inside the lookahead can be matched. Matches only the position. It does not consume any characters or expand the match. In a pattern like one(?=two)three, both two and three have to match at the position where the match of one ends.
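As a sketch of the same idea with String.prototype.matchAll (ES2020), which avoids the manual exec loop, the runs can be collected straight into an array. The inner group is made non-capturing here so that m[1] is the only capture; the variable names are illustrative.

```javascript
const str = "abccdddcdcdcdcd123";
// m[0] is the first "cd"; m[1] is whatever run of "cd" follows it (possibly empty).
const re = /cd(?=((?:cd)*))/g;
const runs = Array.from(str.matchAll(re), (m) => m[0] + m[1]);
console.log(runs);
// [ "cd", "cdcdcdcd", "cdcdcd", "cdcd", "cd" ]
```

The results come out in the order the matches are found, not sorted by length.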
I guess you could also do it like this.
Put the capture group inside a lookahead assertion.
Most engines advance the current position automatically when a match is empty (the position did not change since the last match). JavaScript's exec does not, so you have to do it manually by incrementing lastIndex.
Readable regex:
(?=
  (               # (1 start)
    (?: cd )+
  )               # (1 end)
)
var re = /(?=((?:cd)+))/g;
var str = "abccdddcdcdcdcd123";
var m;
while (m = re.exec(str)) {
console.log( m[1] );
++re.lastIndex;
}
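Unconditionally incrementing lastIndex is safe here because every match of this pattern is empty. A slightly more general sketch bumps lastIndex only when a match consumed nothing, which is the usual guard for exec loops. Note that execAll is a hypothetical helper name, not a built-in.

```javascript
// Hypothetical helper: collect every exec result for a global regex.
function execAll(re, str) {
  if (!re.global) {
    throw new TypeError("execAll expects a regex with the g flag");
  }
  const results = [];
  let m;
  re.lastIndex = 0;
  while ((m = re.exec(str)) !== null) {
    // An empty match leaves lastIndex where it was, so advance by hand
    // to avoid looping forever on the same position.
    if (m[0] === "") re.lastIndex++;
    results.push(m);
  }
  return results;
}

const overlapping = execAll(/(?=((?:cd)+))/g, "abccdddcdcdcdcd123").map((m) => m[1]);
console.log(overlapping);
// [ "cd", "cdcdcdcd", "cdcdcd", "cdcd", "cd" ]
```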
I think the common solution to an overlapping-match problem like this is the following:
/(?=((cd)+))cd/g
The lookahead captures one or more repetitions of cd in group 1, while the trailing cd consumes two characters, so the caret advances by two on every match. (We could also advance with two dots, ..)
Code sample:
var re = /(?=((cd)+))cd/g;
var str = "abccdddcdcdcdcd123";
var m; //var arr = new Array();
while (m = re.exec(str)) {
//arr.push(m[1]);
console.log(m[1]);
}
We get the result from group 1 via m[1].
Use .push(m[1]); to add it to an array.
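To see the caret movement described above, a small sketch can record where each match starts alongside the captured run. It uses the same pattern; only the bookkeeping differs, and the variable names are illustrative.

```javascript
const caretRe = /(?=((cd)+))cd/g;
const steps = [];
let hit;
while ((hit = caretRe.exec("abccdddcdcdcdcd123")) !== null) {
  // hit.index is where the consumed "cd" starts;
  // hit[1] is the run of "cd" seen by the lookahead at that position.
  steps.push([hit.index, hit[1]]);
}
console.log(steps);
// [ [3, "cd"], [7, "cdcdcdcd"], [9, "cdcdcd"], [11, "cdcd"], [13, "cd"] ]
```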
Related
I want a JavaScript regex, or any other possible solution, for the following:
For a given string, find all the substrings that start with a particular string and end with a particular character. The returned set of substrings can be an array.
The string can also contain calls nested within parentheses.
var str = "myfunc(1,2) and myfunc(3,4) or (myfunc(5,6) and func(7,8))";
starting string = "myfunc", ending char = ")". Here the ending character should be the first matching closing parenthesis.
output: function with arguments.
[myfunc(1,2),
myfunc(3,4),
myfunc(5,6),
func(7,8)]
I have tried this, but it always returns null.
var str = "myfunc(1,2) and myfunc(3,4) or (myfunc(5,6) and func(7,8))";
var re = /\myfunc.*?\)/ig
var match;
while ((match = re.exec(str)) != null){
console.log(match);
}
Can you help here?
I tested your regex and it seems to work fine:
let input = "myfunc(1,2) and myfunc(3,4) or (myfunc(5,6) and func(7,8))"
let pattern = /myfunc.*?\)/ig
// There is no need for \m; the backslash does nothing here, and you don't need it even when the pattern starts with 'm'.
console.log(input.match(pattern))
//[ "myfunc(1,2)", "myfunc(3,4)", "myfunc(5,6)" ]
If you use (?:my|)func\(.+?\) you will also catch 'func(7,8)'.
(?:my|)
( start of group
?: non capturing group
my| matches either 'my' or the empty string, so the pattern matches both myfunc and func
) end of group
Test the regex here: https://regex101.com/r/3ujbdA/1
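A minimal sketch of that suggestion, written with the optional group (?:my)? (equivalent to (?:my|)) and [^)]* so that each match stops at the first closing parenthesis. It assumes the arguments themselves contain no parentheses.

```javascript
const source = "myfunc(1,2) and myfunc(3,4) or (myfunc(5,6) and func(7,8))";
const callPattern = /(?:my)?func\([^)]*\)/gi;
// match returns null when nothing is found, so fall back to an empty array.
const calls = source.match(callPattern) || [];
console.log(calls);
// [ "myfunc(1,2)", "myfunc(3,4)", "myfunc(5,6)", "func(7,8)" ]
```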
const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
let newClass = tweet.text.replace(/[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}/g, '');
console.log(`Found match: ${newClass}`);
};
When tweet.text = "123.qwe.456 test" I still get the same output, but I want to remove anything that doesn't fit the pattern
/[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/
any ideas?
You can use capture groups to extract exactly what gets matched in your string and then replace your original variable with this value. Something like
const regex = /([1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3})/
const match = tweet.text.match(regex)
if (match) tweet.text = match[1] // match is null when nothing fits the pattern
Instead of using replace, you can get the match directly:
\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b
Explanation
\b A word boundary
[1-9a-zA-Z]{3} Match 3 times any of the listed (Note that 1-9 does not match a 0)
([-.]) Capture in group 1 either an - or .
[1-9a-zA-Z]{3} Match 3 times any of the listed
\1 Back reference to group 1, match the same as captured in group 1
[1-9a-zA-Z]{3} Match 3 times any of the listed
\b A word boundary
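As a hedged sketch of how this pattern could be used, the hypothetical helper below (extractCode is an illustrative name) returns the matched code or null. The backreference \1 is what forces both separators to be the same character.

```javascript
const codePattern = /\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b/;

// Hypothetical helper: return the first code found in the text, or null.
function extractCode(text) {
  const found = text.match(codePattern);
  return found ? found[0] : null;
}

console.log(extractCode("123.qwe.456 test")); // "123.qwe.456"
console.log(extractCode("see abc-def-ghi"));  // "abc-def-ghi"
console.log(extractCode("abc-def.ghi"));      // null (mixed separators)
```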
const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
console.log(`Found match: ${m[0]}`);
}
I figured out the solution.
This may be a simple expression to write, but I am having the hardest time with this one. I need to match sets of groups where each group has two parts, which we can call the operation and the value. The value should match anything after the operation EXCEPT another operation.
Valid operations to match (standard math operators): [>,<,=,!,....]
For example: '>=25!30<50' Would result in three matching groups:
1. (>=, 25)
2. (!, 30)
3. (<, 50)
I can currently solve the above using: /(>=|<=|>|<|!|=)(\d*)/g however this only works if the characters in the second match set are numbers.
The wall I am running into is how to match EVERYTHING after EXCEPT for the specified operators.
For example I don't know how to solve: '<=2017-01-01' without writing a regex to specify each and every character I would allow (which is anything except the operators) and that just doesn't seem like the correct solution.
There has got to be a way to do this! Thanks guys.
You can match the operation (>=|<=|>|<|!|=) as the first of the two parts. For the second part, use a capturing group that repeats a negative lookahead followed by any character, so it keeps matching as long as an operation does not start directly to the right.
(?:>=|<=|>|<|!|=)((?:(?!(?:>=|<=|>|<|!|=)).)+)
(?:>=|<=|>|<|!|=) Match one of the operations using an alternation
( Start capturing group (This will contain your value)
(?: Start non capturing group
(?!(?:>=|<=|>|<|!|=)). Negative lookahead which asserts what is on the right side is not an operation and matches any character .
)+ Close non capturing group and repeat one or more times
) Close capturing group
const regex = /(?:>=|<=|>|<|!|=)((?:(?!(?:>=|<=|>|<|!|=)).)+)/gm;
const strings = [
">=25!30<50",
">=test!30<$##%",
"34"
];
let m;
strings.forEach((s) => {
while ((m = regex.exec(s)) !== null) {
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
console.log(m[1]);
}
});
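Since the question asks for (operation, value) pairs, a sketch that captures both parts might look like the following. It builds the pattern from a single operator list so the alternation is not repeated by hand; OPERATORS and parseConditions are illustrative names, not part of the answer above.

```javascript
// Longer operators come first so ">=" is not split into ">" and "=".
const OPERATORS = ">=|<=|>|<|!|=";
const pairPattern = new RegExp(`(${OPERATORS})((?:(?!${OPERATORS}).)+)`, "g");

// Hypothetical helper: return an array of [operation, value] pairs.
function parseConditions(input) {
  return Array.from(input.matchAll(pairPattern), (m) => [m[1], m[2]]);
}

console.log(parseConditions(">=25!30<50"));
// [ [">=", "25"], ["!", "30"], ["<", "50"] ]
console.log(parseConditions("<=2017-01-01"));
// [ ["<=", "2017-01-01"] ]
```

matchAll works on a copy of the regex, so calling the helper repeatedly does not leave a stale lastIndex behind.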
You can use this code
var str = ">=25!30<50";
var pattern = RegExp(/(?:([\<\>\=\!]{1,2})(\d+))/, "g");
var output = [];
let matchs = null;
while((matchs = pattern.exec(str)) != null) {
output.push([matchs[1], matchs[2]]);
}
console.log(output);
Output array :
0: Array [ ">=", "25" ]
1: Array [ "!", "30" ]
2: Array [ "<", "50" ]
I think this is what you need:
/((?:>=|<=|>|<|!|=)[^>=<!]+)/g
Inside a character class, a leading ^ negates the class, so [^>=<!] matches any character that is not one of the operator characters; + means one or more.
Could anyone help me with this regular expression issue?
expr = /\(\(([^)]+)\)\)/;
input = ((111111111111))
The input I need it to work with is ((111111111111),(222222222),(333333333333333))
That expression works fine to get 111111 from the input, but not when the groups 2222... and 3333... are also present. The input varies: it could be ((111111111111)), the longer one above, or something different (always following the same parenthesis pattern, though).
Is there a regular expression that extracts the values into an array in both cases?
The result I would like to come to is:
[0] = "111111"
[1] = "222222"
[2] = "333333"
Thanks
If you are trying to validate the format while extracting the desired parts, you could use the sticky flag y. With this flag a match must start exactly at lastIndex: the first match starts at the beginning of the input, and each subsequent match starts where the previous one ended. This approach handles one input string at a time.
Regex:
/^\(\(([^)]+)\)|(?!^)(?:,\(([^)]+)\)|\)$)/yg
Breakdown:
^\(\( Match beginning of input, immediately followed by ((
( Start of capturing group #1
[^)]+ Match anything but )
)\) End of CG #1, match ) immediately
| Or
(?!^) Next patterns shouldn't start at beginning
(?: Start of non-capturing group
,\(([^)]+)\) Match a separated group (capture value in CG #2, same pattern as above)
| Or
\)$ Match ) and end of input
) End of group
JS code:
var str = '((111111111111),(222222222),(333333333333333))';
console.log(
str.replace(/^\(\(([^)]+)\)|(?!^)(?:,\(([^)]+)\)|\)$)/yg, '$1$2\n')
.split(/\n/).filter(Boolean)
);
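To illustrate what the sticky flag does on its own, here is a small standalone sketch (the pattern and text are made up for the demonstration). A sticky regex only matches at lastIndex and never scans ahead.

```javascript
const digits = /\d+/y;
const text = "ab12cd";

digits.lastIndex = 0;
const atStart = digits.exec(text); // null: there is no digit at index 0

digits.lastIndex = 2;
const atTwo = digits.exec(text);   // matches "12", which starts exactly at index 2
const nextIndex = digits.lastIndex; // 4, where the next sticky match would have to start

console.log(atStart, atTwo && atTwo[0], nextIndex);
// null "12" 4
```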
You can remove the parentheses, split the result on , and then use substring to take the required number of characters from each piece.
input.replace(/\(/g, '').replace(/\)/g, '')
This removes every ( and ) and returns a string like
111111111111,222222222,333333333333333
Splitting this string on , then gives the array you want:
var input = "((111111111111),(222222222),(333333333333333))";
var numbers = input.replace(/\(/g, '').replace(/\)/g, '');
numbers.split(",").forEach(o => console.log(o.substring(0, 6)));
If the level of nesting is fixed, you can just leave the outer () out of the pattern and add the left parenthesis to the [^)] character class:
var expr = /\(([^()]+)\)/g;
var input = '((111111111111),(222222222),(333333333333333))';
var match = null;
while(match = expr.exec(input)) {
console.log(match[1]);
}
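The question asks for an array that works for both input shapes. A sketch using the same pattern with matchAll could look like this; extractGroups is an illustrative name.

```javascript
// Hypothetical helper: return the contents of every innermost (...) group.
function extractGroups(input) {
  return Array.from(input.matchAll(/\(([^()]+)\)/g), (m) => m[1]);
}

console.log(extractGroups("((111111111111))"));
// [ "111111111111" ]
console.log(extractGroups("((111111111111),(222222222),(333333333333333))"));
// [ "111111111111", "222222222", "333333333333333" ]
```

If only the first six characters of each value are wanted, as in the question's sample output, each entry can be shortened afterwards with substring(0, 6).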
I want to capture the "1" and "2" in "http://test.com/1/2". Here is my regexp /(?:\/([0-9]+))/g.
The problem is that I only get ["/1", "/2"]. According to http://regex101.com/r/uC2bW5 I should get "1" and "2".
I'm running my RegExp in JS.
You have a couple of options:
Use a while loop over RegExp.prototype.exec:
var regex = /(?:\/([0-9]+))/g,
string = "http://test.com/1/2",
matches = [],
match;
while (match = regex.exec(string)) {
matches.push(match[1]);
}
Use replace as suggested by elclanrs:
var regex = /(?:\/([0-9]+))/g,
string = "http://test.com/1/2",
matches = [];
string.replace(regex, function() {
matches.push(arguments[1]);
});
In JavaScript, the match array always has an element at index 0 that contains the WHOLE pattern match. In your case, index 0 is /1 for the first match and /2 for the second.
If you want your DEFINED first capture group (the one that does not include the /), you'll find it in the match array at index 1.
Index 0 cannot be removed, and it has nothing to do with the outer group you defined as non-capturing by using ?:
Imagine that JavaScript wraps your whole regex in an additional set of parentheses.
For example, the string Hello World and the regex /Hell(o) World/ will result in:
[0 => Hello World, 1 => o]
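A short sketch that makes this concrete: the first part inspects the match array for the Hello World example, and the second collects only group 1 for the URL from the question. It uses matchAll (ES2020) and drops the redundant non-capturing wrapper; the variable names are illustrative.

```javascript
const hello = "Hello World".match(/Hell(o) World/);
console.log(hello[0]); // "Hello World" (the whole match)
console.log(hello[1]); // "o" (capture group 1)

// Collect only capture group 1 from every match.
const ids = Array.from("http://test.com/1/2".matchAll(/\/([0-9]+)/g), (m) => m[1]);
console.log(ids);
// [ "1", "2" ]
```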