Returning matched lookahead groups - javascript

Good day,
I am trying to return groups of 3 digits of a given string where digits are "consumed" twice in JavaScript:
From a string "123 456" I would like exec to return ["1", "12"] for the first match, ["2", "23"] and so on.
I tried using a lookahead like this:
let exp = /(\d(?=\d\d))/g;
let match;
while(match = exp.exec("1234 454")) {
console.log(match);
}
This, however, still only matches each digit that precedes two digits.
Does someone have a solution? I have searched but am not exactly sure what to search for so I might have missed something.
Thank you in advance!

You need to capture inside a positive lookahead here:
let exp = /(?=((\d)\d))/g;
let match;
while(match = exp.exec("1234 454")) {
if (match.index === exp.lastIndex) { // \
exp.lastIndex++; // - Prevent infinite loop
} // /
console.log([match[1], match[2]]); // Print the output
}
The (?=((\d)\d)) pattern matches a location followed by two digits, which are captured into Group 1; the first of the two is also captured into Group 2.
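The same pattern can also be run through String.prototype.matchAll, which advances past zero-length matches on its own, so no manual lastIndex handling is needed. This is only a minimal sketch: the helper name overlappingPairs is made up for illustration, and the groups are reordered to the [digit, pair] order asked for in the question.

```javascript
// Hypothetical helper (not from the answer above): collects
// [firstDigit, twoDigits] for every overlapping pair of digits.
function overlappingPairs(text) {
  // Zero-length lookahead: group 1 = two digits, group 2 = the first digit.
  const pattern = /(?=((\d)\d))/g;
  // matchAll steps over empty matches by itself, so the loop cannot stall.
  return Array.from(text.matchAll(pattern), (m) => [m[2], m[1]]);
}

console.log(overlappingPairs("1234 454"));
// [["1","12"], ["2","23"], ["3","34"], ["4","45"], ["5","54"]]
```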

Related

js Remove a part from a parameter which doesnt fit a pattern

const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
let newClass = tweet.text.replace(/[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}/g, '');
console.log(`Found match: ${newClass}`);
};
When tweet.text = "123.qwe.456 test" I still get the same output, but I want to remove anything that doesn't fit the pattern
/[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/
Any ideas?
You can use capture groups to extract exactly what gets matched in your string and then replace your original variable with this value. Something like
const regex = /([1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3})/
let match = tweet.text.match(regex)
if (match) tweet.text = match[1] // match is null when nothing fits the pattern
Instead of using replace, you can get the match:
\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b
Explanation
\b A word boundary
[1-9a-zA-Z]{3} Match 3 times any of the listed (Note that 1-9 does not match a 0)
([-.]) Capture in group 1 either an - or .
[1-9a-zA-Z]{3} Match 3 times any of the listed
\1 Back reference to group 1, match the same as captured in group 1
[1-9a-zA-Z]{3} Match 3 times any of the listed
\b A word boundary
Regex demo
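As a rough sketch of how the backreference pattern above could be applied: the helper name extractCode is hypothetical, and returning null when nothing matches is an assumption about the desired behaviour, not something stated in the question.

```javascript
// Hypothetical helper: returns the first substring that fits the pattern,
// or null when nothing fits.
function extractCode(text) {
  // ([-.]) captures the separator; \1 requires the same separator again.
  const pattern = /\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b/;
  const match = text.match(pattern);
  return match ? match[0] : null;
}

console.log(extractCode("123.qwe.456 test")); // "123.qwe.456"
console.log(extractCode("123-qwe.456 test")); // null (mixed separators)
```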
const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
console.log(`Found match: ${m[0]}`);
}
Figured out the solution.

Is it possible to have one regex that solves this task?

string = '1,23'
When a comma is present in the string, I want the regex to match the first digit (\d) after the comma, e.g. 2.
Sometimes the comma will not be there. When it's not present, I want the regex to match the first digit of the string e.g. 1.
Also, we can't reverse the order of the string to solve this task.
I am genuinely stuck. The only idea I had was prepending this: [,|nothing]. I tried '' to mean nothing but that didn't work.
You can match an optional sequence of chars other than a comma and then a comma at the start of a string, and then match and capture the digit with
/^(?:[^,]*,)?(\d)/
See the regex demo.
Details
^ - start of string
(?:[^,]*,)? - an optional sequence of
[^,]* - zero or more chars other than a comma
, - a comma
(\d) - Capturing group 1: any digit
See the JavaScript demo:
const strs = ['123', '1,23'];
const rx = /^(?:[^,]*,)?(\d)/;
for (const s of strs) {
const result = (s.match(rx) || ['',''])[1];
// Or, const result = s.match(rx)?.[1] || "";
console.log(s, '=>', result);
}

I need help getting the first n characters of a string up to when a number character starts

I'm working with a string where I need to extract the leading characters up to the point where a run of digits begins. What would be the best way to do this? Sometimes the string starts with a digit: from 7EUSA8889er898 I would need to extract 7EUSA. From other strings, such as SWFX74849948, I would need to extract SWFX.
I'm not sure how to do this with a regex; my limited knowledge is blocking me at this point:
^(\w{4}) that just gets me the first four characters but I don't really have a stopping point as sometimes the string could be somelongstring292894830982 which would require me to get somelongstring
Using \w will match a word character, which includes letters, digits and the underscore.
You could match an optional digit [0-9]? from the start of the string ^ and then match A-Za-z one or more times:
^[0-9]?[A-Za-z]+
Regex demo
const regex = /^[0-9]?[A-Za-z]+/;
[
"7EUSA8889er898",
"somelongstring292894830982",
"SWFX74849948"
].forEach(s => console.log(s.match(regex)[0]));
You can use this regex:
(^\d+?[a-zA-Z]+)|(^\d+|[a-zA-Z]+)
I tried it with these examples and it worked well:
1- somelongstring292894830982 -> somelongstring
2- 7sdfsdf5456 -> 7sdfsdf
3- 875werwer54556 -> 875werwer
If you want a function where the RegExp is parameterized by n, it could look like this:
function getStr(str,n) {
var pattern = "\\d?\\w{0,"+n+"}";
var reg = new RegExp(pattern);
var result = reg.exec(str);
if(result[0]) return result[0].substr(0,n);
}
There are answers to this but here is another way to do it.
var string1 = '7EUSA8889er898';
var string2 = 'SWFX74849948';
var Extract = function (args) {
var C = args.split(''); // Split string in array
var NI = []; // Store indexes of all numbers
// Loop through list -> if char is a number add its index
C.map(function (I) { return /^\d+$/.test(I) === true ? NI.push(C.indexOf(I)) : ''; });
// Get the items between the first and second occurrence of a number
return C.slice(NI[0] === 0 ? NI[0] + 1 : 0, NI[1]).join('');
};
console.log(Extract(string1));
console.log(Extract(string2));
Output
EUSA
SWFX7
Since it's hard to tell what you are trying to match, I'd go with a general regex
^\d?\D+(?=\d)
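A minimal sketch of that general regex applied to the strings from the question; the helper name leadingPrefix and the choice to return null on no match are assumptions made for illustration only.

```javascript
// Hypothetical helper: optional leading digit, then non-digits,
// stopping right before the next digit.
function leadingPrefix(text) {
  const match = text.match(/^\d?\D+(?=\d)/);
  return match ? match[0] : null; // null when no digit follows the prefix
}

console.log(leadingPrefix("7EUSA8889er898"));             // "7EUSA"
console.log(leadingPrefix("SWFX74849948"));               // "SWFX"
console.log(leadingPrefix("somelongstring292894830982")); // "somelongstring"
```

Because of the (?=\d) lookahead, a string with no digit after the prefix (for example "abc") yields no match at all.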

Regex match everything after match set except for match set

This may be a simple expression to write but I am having the hardest time with this one. I need to match group sets where each group has 2 parts, what we can call the operation and the value. I need the value to match to anything after the operation EXCEPT another operation.
Valid operations to match (standard math operators): [>,<,=,!,....]
For example: '>=25!30<50' Would result in three matching groups:
1. (>=, 25)
2. (!, 30)
3. (<, 50)
I can currently solve the above using: /(>=|<=|>|<|!|=)(\d*)/g however this only works if the characters in the second match set are numbers.
The wall I am running into is how to match EVERYTHING after EXCEPT for the specified operators.
For example I don't know how to solve: '<=2017-01-01' without writing a regex to specify each and every character I would allow (which is anything except the operators) and that just doesn't seem like the correct solution.
There has got to be a way to do this! Thanks guys.
You could match the operation (>=|<=|>|<|!|=), which is the first of the two parts, and then capture the second part in a group: a negative lookahead lets it match one character at a time for as long as no operation starts at the current position.
(?:>=|<=|>|<|!|=)((?:(?!(?:>=|<=|>|<|!|=)).)+)
(?:>=|<=|>|<|!|=) Match one of the operations using an alternation
( Start capturing group (This will contain your value)
(?: Start non capturing group
(?!(?:>=|<=|>|<|!|=)). Negative lookahead asserting that what is directly to the right is not an operation, followed by . to match any character
)+ Close non capturing group and repeat one or more times
) Close capturing group
const regex = /(?:>=|<=|>|<|!|=)((?:(?!(?:>=|<=|>|<|!|=)).)+)/gm;
const strings = [
">=25!30<50",
">=test!30<$##%",
"34"
];
let m;
strings.forEach((s) => {
while ((m = regex.exec(s)) !== null) {
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
console.log(m[1]);
}
});
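If the operation is needed alongside the value, as in the (>=, 25) pairs from the question, the first group can be made capturing as well. The following is only a sketch under that assumption; parseConditions is a made-up name, and matchAll is used instead of the exec loop.

```javascript
// Hypothetical helper: returns [operation, value] pairs.
function parseConditions(text) {
  // Group 1: the operation. Group 2: every character up to the next operation.
  const pattern = /(>=|<=|>|<|!|=)((?:(?!>=|<=|>|<|!|=).)+)/g;
  return Array.from(text.matchAll(pattern), (m) => [m[1], m[2]]);
}

console.log(parseConditions(">=25!30<50"));
// [[">=","25"], ["!","30"], ["<","50"]]
console.log(parseConditions("<=2017-01-01"));
// [["<=","2017-01-01"]]
```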
You can use this code
var str = ">=25!30<50";
var pattern = RegExp(/(?:([\<\>\=\!]{1,2})(\d+))/, "g");
var output = [];
let matchs = null;
while((matchs = pattern.exec(str)) != null) {
output.push([matchs[1], matchs[2]]);
}
console.log(output);
Output array:
0: Array [ ">=", "25" ]
1: Array [ "!", "30" ]
2: Array [ "<", "50" ]
I think this is what you need:
/((?:>=|<=|>|<|!|=)[^>=<!]+)/g
The ^ at the start of the character class excludes the characters you don't want; + means one or more.

How to use a regexp to match all repeated substrings in JavaScript?

For example:
Get [ "cd","cd","cdcd","cdcdcd", "cdcdcdcd" ] from "abccdddcdcdcdcd123".
+ does not work:
"abccdddcdcdcdcd123".match(/(cd)+/g)
Array [ "cd", "cdcdcdcd" ]
This can be done with a positive lookahead, (?=...). This type of matching doesn't move the cursor forward, so you can match the same content multiple times.
var re = /cd(?=((cd)*))/g;
var str = "abccdddcdcdcdcd123";
var m;
while (m = re.exec(str)) {
console.log(m[0]+m[1]);
}
m[0] (the overall match) gets the first cd, then the positive lookahead captures all directly following cd repetitions into group 1. You can combine the two to get the desired result.
See https://www.regular-expressions.info/refadv.html
Matches at a position where the pattern inside the lookahead can be matched. Matches only the position. It does not consume any characters or expand the match. In a pattern like one(?=two)three, both two and three have to match at the position where the match of one ends.
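The behaviour described in that quote can be checked with a few one-liners. This is just an illustrative sketch using the one/two/three strings from the quote; the variable name lookaheadMatch is made up.

```javascript
// The lookahead checks for "two" but does not add it to the match.
const lookaheadMatch = /one(?=two)/.exec("onetwo");
console.log(lookaheadMatch[0]); // "one"

// After the lookahead, matching resumes right after "one", so "three"
// would have to start where "two" starts: this can never succeed.
console.log(/one(?=two)three/.test("onetwothree")); // false

// Consuming "two" after the lookahead works, since both see the same position.
console.log(/one(?=two)two/.test("onetwo")); // true
```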
I guess you could also do it like this.
Put the capture group inside a lookahead assertion.
Most engines bump the current regex position if it didn't change since the last match. Not JS though: you have to do it manually by incrementing lastIndex.
Readable regex
(?=
( # (1 start)
(?: cd )+
) # (1 end)
)
var re = /(?=((?:cd)+))/g;
var str = "abccdddcdcdcdcd123";
var m;
while (m = re.exec(str)) {
console.log( m[1] );
++re.lastIndex;
}
I think the common solution to an overlapping match problem like this should be as follows:
/(?=((cd)+))cd/
Match the inner pattern one or more times in a capturing group inside a lookahead, then consume cd to move the caret ahead two characters at a time. (We could also move ahead with two dots, ..)
Code sample:
var re = /(?=((cd)+))cd/g;
var str = "abccdddcdcdcdcd123";
var m; //var arr = new Array();
while (m = re.exec(str)) {
//arr.push(m[1]);
console.log(m[1]);
}
We get the result from group 1 via m[1].
Use .push(m[1]); to add it to an array.
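For collecting the results into an array in one step, the same lookahead-then-consume pattern can be combined with String.prototype.matchAll. This is a sketch rather than part of the answer above; the helper name repeatedRuns is hypothetical.

```javascript
// Hypothetical helper: returns every run of "cd" repetitions, starting
// at each "cd" in turn (overlapping matches).
function repeatedRuns(text) {
  // The lookahead captures the whole remaining run in group 1;
  // the trailing "cd" consumes two characters so the search moves on.
  const pattern = /(?=((?:cd)+))cd/g;
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

console.log(repeatedRuns("abccdddcdcdcdcd123"));
// ["cd", "cdcdcdcd", "cdcdcd", "cdcd", "cd"]
```

The result contains the same items as the array requested in the question, in order of starting position.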
