I'm trying to create a regex that selects the numbers (or numbers with commas, if that's easier; I can trim the commas later) that are not followed by a parenthesis. The numbers inside the parentheses should not be selected either.
It is used with JavaScript's String.match method.
Example strings
9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
What I have so far:
/((^\d+[^\(])|(,\d+,)|(,*\d+$))/gm
I tried this in regex101, underlining the numbers I would like to match and putting an x on the ones that should not match.
You could start with a substitution to remove all the unwanted parts:
/\d*\(.*?\),?//gm
Demo
This leaves you with
5,10
10,2,5,
10,7,2,4
which makes the matching pretty straightforward:
/(\d+)/gm
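A minimal sketch of that two-step idea in JavaScript, applied one line at a time. The helper name topLevelNumbers is made up for illustration; the two regexes are the ones shown above.

```javascript
// Hypothetical helper: strip every "number(...)" group, then collect the digits left over.
function topLevelNumbers(line) {
  // Step 1: remove each number followed by a parenthesised list (and a trailing comma, if any).
  const stripped = line.replace(/\d*\(.*?\),?/g, "");
  // Step 2: whatever digit runs remain are the wanted numbers; match returns null when none are left.
  return stripped.match(/\d+/g) ?? [];
}

const lines = [
  "9(296,178),5,3(123),10",
  "10,9(296,178),2,5,3(123),3(124,125)",
  "10,7,5(296,293,444,1255),3(218),2,4",
];
for (const line of lines) {
  console.log(topLevelNumbers(line));
}
```

Doing it per line keeps the lazy .*? from running across a line break, at the cost of splitting the input first.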
If you want it as a single match expression you could use a negative lookbehind:
/(?<!\([\d,]*)(\d+)(?:,|$)/gm
Demo - and here's the same matching expression as runnable JavaScript (skeleton code borrowed from Wiktor's answer):
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/(?<!\([\d,]*)(\d+)(?:,|$)/gm), x=>x[1])
console.log(matches);
Here, I'd recommend the so-called "best regex trick ever": just match what you do not need (negative contexts) and then match and capture what you need, and grab the captured items only.
If you want to match integers that are not part of a \d+\([^()]*\) match (a number followed by a parenthetical substring), you can either match that pattern, or match and capture \d+ (one or more digits), and then simply grab the Group 1 values from the matches:
const text = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4`;
const matches = Array.from(text.matchAll(/\d+\([^()]*\)|(\d+)/g), x=> x[1] ?? "").filter(Boolean)
console.log(matches);
Details:
text.matchAll(/\d+\([^()]*\)|(\d+)/g) - matches one or more digits (\d+) + ( (with \() + any zero or more chars other than ( and ) (with [^()]*) + ) (with \)), or (|) one or more digits captured into Group 1 ((\d+))
Array.from(..., x=> x[1] ?? "") - gets Group 1 value, or, if not assigned, just adds an empty string
.filter(Boolean) - removes empty strings.
Using several replacement regexes
var textA = `9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
`
console.log('A', textA)
var textB = textA.replace(/\(.*?\),?/g, ';')
console.log('B', textB)
var textC = textB.replace(/^\d+|\d+$|\d*;\d*/gm, '')
console.log('C', textC)
var textD = textC.replace(/,+/g, ' ').trim()
console.log('D', textD)
With a loop
Here is a solution which splits the lines on comma and loops over the pieces:
var inside = false;
var result = [];
`9(296,178),5,3(123),10
10,9(296,178),2,5,3(123),3(124,125)
10,7,5(296,293,444,1255),3(218),2,4
`.split("\n").forEach(line => {
  let pieceArray = line.split(",")
  pieceArray.forEach((piece, k) => {
    if (piece.includes('(')) {
      inside = true
    } else if (piece.includes(')')) {
      inside = false
    } else if (!inside && k > 0 && k < pieceArray.length-1 && !pieceArray[k-1].includes(')')) {
      result.push(piece)
    }
  })
})
console.log(result)
It does print the expected result: ["5", "7"]
const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
let newClass = tweet.text.replace(/[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}/g, '');
console.log(`Found match: ${newClass}`);
};
When tweet.text = "123.qwe.456 test" I still get the same output, but I want to remove anything which doesn't fit the pattern
/[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/
Any ideas?
You can use capture groups to extract exactly what gets matched in your string and then replace your original variable with this value. Something like
const regex = /([1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3})/
let match = tweet.text.match(regex)
// match is null when nothing fits the pattern, so guard before indexing
if (match) tweet.text = match[1]
Rather than using replace, you can get the match itself:
\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b
Explanation
\b A word boundary
[1-9a-zA-Z]{3} Match 3 times any of the listed (Note that 1-9 does not match a 0)
([-.]) Capture in group 1 either an - or .
[1-9a-zA-Z]{3} Match 3 times any of the listed
\1 Back reference to group 1, match the same as captured in group 1
[1-9a-zA-Z]{3} Match 3 times any of the listed
\b A word boundary
Regex demo
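A rough sketch of how that pattern could be used to pull the code out of the text. The function name extractCode and the sample inputs are only for illustration; the pattern is the one explained above.

```javascript
// The pattern from above: three groups of three characters, joined by the same separator twice.
const codePattern = /\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b/;

// Hypothetical helper: returns the matched code, or null when the text contains none.
function extractCode(text) {
  const match = text.match(codePattern);
  return match ? match[0] : null;
}

console.log(extractCode("123.qwe.456 test"));
console.log(extractCode("abc-def-ghi!"));
// Mixed separators fail because \1 must repeat whatever group 1 captured.
console.log(extractCode("123.qwe-456"));
```

Returning null instead of throwing leaves it to the caller to decide what to do with text that has no code in it.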
I figured out the solution: log the match itself (m[0]) instead of calling replace.
const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
  console.log(`Found match: ${m[0]}`);
}
I get a string like:
str = `Test/hello/filename/12345678/first
Hddhkhd
Hdhal
filename/1212abcd/second`
I want to get an array of all the strings that come after "filename/". I know that after the "/" there is an 8-character word, and that is what I want to get.
In this case, I want to get an array that will be:
strArr = ["12345678", "1212abcd"]
How do I solve this problem?
A regex that captures the 8 characters that immediately follow a literal "filename/" (the sample has a single slash there; the g flag finds every occurrence):
/filename\/(.{8})/g
Try using this regex first:
filename\/\w{8}
and then extract the part you need from each result with this regex:
\w{8}$
The first regex gives you:
filename/12345678
filename/1212abcd
The second gives you:
12345678
1212abcd
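The two steps could be wired together roughly like this. The function name extractIds is just a placeholder; the two regexes are the ones above, with the g flag added to the first so that every occurrence is found.

```javascript
// Hypothetical helper combining the two regexes above.
function extractIds(text) {
  // Step 1: every "filename/" followed by 8 word characters (empty array when there is none).
  const hits = text.match(/filename\/\w{8}/g) ?? [];
  // Step 2: keep only the trailing 8 word characters of each hit.
  return hits.map(hit => hit.match(/\w{8}$/)[0]);
}

const str = `Test/hello/filename/12345678/first
Hddhkhd
Hdhal
filename/1212abcd/second`;
console.log(extractIds(str));
```

The second match cannot fail here, because each hit from step 1 already ends in 8 word characters.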
You might also use a capture group that matches 8 characters other than a forward slash or a newline, after matching filename/:
\bfilename\/([^\/\n]{8})
Regex demo
If you want to match 8 or more times you could use {8,} instead or if you want to match 1 or more times you could use a +.
If you don't want to match whitespace characters you could change the \n to \s
const regex = /filename\/([^\/\n]{8})/g;
const str = `Test/hello/filename/12345678/first
Hddhkhd
Hdhal
filename/1212abcd/second`;
let m;
while ((m = regex.exec(str)) !== null) {
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
console.log(m[1]);
}
You can use the following code. It will match all characters after the filename/ until it encounters another /. After you get the matches in an array you can map it out and replace all the filename/ with '':
let a = /filename\/[^\/]+/g;
let b = 'Test/hello/filename/12345678/first Hddhkhd Hdhal filename/1212abcd/second';
let c = b.match(a).map(x=>x.replace('filename/',''));
console.log(c);
For explanation check this REGEX
var arr = "Test/hello/filename/12345678/first Hddhkhd Hdhal filename/1212abcd/second".match(/(?<=filename\/)(.*?)(?=\/)/g);
console.log(arr)
OR
For browsers that do not support lookbehind, use Array#map after the regex match:
var arr = "Test/hello/filename/12345678/first Hddhkhd Hdhal filename/1212abcd/second".match(/filename\/(.*?)\//g).map(i=> i.split('/')[1]);
console.log(arr)
I have an input string
var input = 'J1,J2, J3';
I'm using the following pattern to extract the group value
var regex = /(,? ?(?<JOUR>J[0-9]+)+)/
while extracting the groups as below
var match = regex.exec(input);
match.groups contains only one group. How can I get all the groups J1, J2 and J3 from the input string?
You can use the string's .match method with the g flag to get all the matches:
input.match(/J[0-9]+/g)
var input = 'J1,J2, J3';
console.log(input.match(/J[0-9]+/gi))
Match a capital J, then one or more digits:
var input = 'J1,J2, J3';
var regex = /J[0-9]+/g;
console.log(input.match(regex));
You could take the start of the string and the comma with an optional space into account and remove the outer group to use only 1 capturing group. To prevent the digits being part of a larger word you might add a word boundary \b
Note that you can omit the + quantifier after the group's closing ), because repeating a capture group only keeps the value of its last iteration.
(?:^|[,-] ?)(?<JOUR>J[0-9]+)\b
(?:^|[,-] ?) Match either the start of the string or comma or hyphen with an optional space
(?<JOUR>J[0-9]+) Named capture group JOUR, match J and then 1+ digits
\b Word boundary to prevent the digits being part of a larger word
Regex demo
Use exec to get the value from the first capturing group
const regex = /(?:^|[,-] ?)(?<JOUR>J[0-9]+)\b/g;
let m;
[
"J1, J2, J3 - J5, J7",
"J1,J2, J3"
].forEach(str => {
while ((m = regex.exec(str)) !== null) {
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
console.log(m[1]);
}
});
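Since the question asks about match.groups, here is a sketch that reads the named group directly through String.prototype.matchAll. The function name extractJours is made up; the pattern is the one from the explanation above.

```javascript
// Hypothetical helper: collect the JOUR named group from every match.
function extractJours(input) {
  // matchAll requires the g flag and yields one match object per occurrence.
  const regex = /(?:^|[,-] ?)(?<JOUR>J[0-9]+)\b/g;
  return Array.from(input.matchAll(regex), m => m.groups.JOUR);
}

console.log(extractJours("J1,J2, J3"));
console.log(extractJours("J1, J2, J3 - J5, J7"));
```

Each match object carries its own groups property, which is why all the values show up here while a single exec call only ever reports one.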
const input = 'J1,J2, J3,J10';
const regexJfollowOneDigit = /(J\d{1}(?!\d))/g
const regexJfollowOneOrMoreDigit = /(J\d+)/g
console.log(input.match(regexJfollowOneDigit))
console.log(input.match(regexJfollowOneOrMoreDigit))
I always have a hard time with regex.
I'm trying to select the text between two delimiters (taking into account what comes before and after):
'window.API=' and ';' //for window.API= '--API--';
and other cases like:
'window.img_cdn=' and ';' //for window.img_cdn= '--imgCDN--';
Any tips on which regex concepts I should use would be a great help!
If you want to capture the content between 'xx' you can use a regex like this:
'(.*?)'
working demo
For the sample text:
window.API= '--API--';
window.img_cdn= '--imgCDN--';
You will capture:
MATCH 1
1. [13-20] `--API--`
MATCH 2
1. [40-50] `--imgCDN--`
The javascript code you can use is:
var re = /'(.*?)'/g;
var str = 'window.API= \'--API--\';\nwindow.img_cdn= \'--imgCDN--\';';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
On the other hand, if you specifically want to capture the content for only those entries, then you can use this regex:
window\.(?:API|img_cdn).*?'(.*?)'
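A short sketch of that narrower regex in use, collecting group 1 from every match. The function name extractQuotedValues is only illustrative.

```javascript
// Hypothetical helper: return the quoted value of each window.API / window.img_cdn assignment.
function extractQuotedValues(source) {
  const re = /window\.(?:API|img_cdn).*?'(.*?)'/g;
  // Group 1 holds the text between the first pair of apostrophes after the property name.
  return Array.from(source.matchAll(re), m => m[1]);
}

const sample = "window.API= '--API--';\nwindow.img_cdn= '--imgCDN--';";
console.log(extractQuotedValues(sample));
```

Because . does not match a line break by default, each match stays on the line where its window. property starts.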
If you want to match any text between a <some string>= sign and a semicolon, here you go:
(?:[\w\.]+\s*=\s')(.+)(?:';)$
This regex pattern will match a full string if an escaped apostrophe is present in the string: //for window.img_cdn = '--imgCDN and \'semicolon\'--';
JavaScript code:
var re = /(?:[\w\.]+\s*=\s')(.+)(?:';)$/gm;
var str = '//for window.img_cdn= \'--imgCDN--\';\n//for window.img_cdn = \'--imgCDN and semicolon = ;;;--\';';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// view results
}
The required text is in the 1st captured group. In case there is a semicolon in the text you are looking for, you will correctly match it due to the $ anchor.
See demo here