Getting Multiple Matches with RegExp in JavaScript - javascript

I have a string like this:
`DateTime.now().setZone("America Blorp");`
This is my RegEx:
string.match(/DateTime\.(.*)[^)][(;]/)
How can I modify my RegEx so that I can get matches like this:
DateTime.now and DateTime.now.setZone.
I have tried to group matches like this
string.match(/DateTime\.(.*)([^)]*)([(;]*)/)
But I don't get the expected output. Can anyone please help me with this?
PS. I can only use match function, cannot use matchAll.
const string = `DateTime.now().setZone("America Blorp");`
console.log(
string.match(/DateTime\.(.*)[^)][(;]/)
)

You could match the format using two capture groups and concatenate the groups.
\b(DateTime\.now)\(\)(\.[^.()]+)\([^()]*\);
The pattern matches:
\b A word boundary to prevent a partial match
(DateTime\.now) Capture group 1, match DateTime.now
\(\) Match ()
(\.[^.()]+) Capture group 2, match . and 1+ times any char except . or ( and )
\([^()]*\); Match from ( till ) and ;
const regex = /\b(DateTime\.now)\(\)(\.[^.()]+)\([^()]*\);/;
const str = `DateTime.now().setZone("America Blorp");`;
const match = str.match(regex);
if (match) {
console.log(match[1] + match[2]);
}
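If the chain can be longer than two calls, one option that still relies only on match is to collect every method name with the g flag and build the cumulative chains in a loop. This is a minimal sketch rather than part of the answer above: the helper name callChains and its root argument are hypothetical, and it assumes that no string literal in the source contains something like .foo( that would be mistaken for a call.

```javascript
// Hypothetical helper: returns cumulative call chains such as
// "DateTime.now" and "DateTime.now.setZone" for a given root name.
function callChains(source, root) {
  // With the g flag, String.prototype.match returns all full matches
  // (capture groups are not included), or null when nothing matches.
  const names = source.match(/\.[A-Za-z_$][\w$]*(?=\()/g) || [];
  const chains = [];
  let current = root;
  for (const name of names) {
    current += name; // name already starts with "."
    chains.push(current);
  }
  return chains;
}

const chains = callChains('DateTime.now().setZone("America Blorp");', 'DateTime');
console.log(chains); // [ 'DateTime.now', 'DateTime.now.setZone' ]
```

The lookahead (?=\() keeps the opening parenthesis out of the match, so each entry is just the dotted name.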

Related

Regex match hashtag with exceptions

I have the current expression:
/(?<![http://|https://|#])#([\d\w]+[^\d\s<]+[^\s<>]+)/g
However, it does not run on Safari. I'm trying to handle the following cases:
#tag => match
#123 => no match
#32bit => match
##tag => no match
http://google.com/#/test => no match
tag##tag => no match
tag#tag => no match
<p>#tag</p> => match only #tag
#tag. => match only #tag
tag## => no match
tag# => no match
this is a match #tag => only #tag
I wonder how I can make a character before the match result in a negative match. E.g. # and /.
Is there any alternative to negative look behind that is compatible with Safari?
Thanks in advance.
You might use a negated character class and a capture group, and make sure that there are not only digits.
Note that \w also matches \d
(?:^|[^\w#/])(#(?!\d+\b)\w+)\b
Explanation
(?: Non capture group
^ Assert the start of the string
| Or
[^\w#/] Match a single non word char other than # or /
) Close non capture group
( Capture group 1
# Match literally
(?!\d+\b) Negative lookahead, assert not only digits to the right followed by a word boundary
\w+ Match 1+ word characters
) Close group 1
\b A word boundary to prevent a partial word match
let regex = /(?:^|[^\w#/])(#(?!\d+\b)\w+)\b/;
[
"#tag",
"#123",
"#32bit",
"##tag",
"http://google.com/#/test",
"tag##tag",
"tag#tag",
"<p>#tag</p>",
"#tag.",
"tag##",
"tag#"
].forEach(s => {
const m = s.match(regex)
if (m) {
console.log(`${s} --> ${m[1]}`)
}
})
Using the matches in a replacement:
let regex = /((?:^|[^\w#/]))(#(?!\d+\b)\w+)\b/;
[
"#tag",
"#123",
"#32bit",
"##tag",
"http://google.com/#/test",
"tag##tag",
"tag#tag",
"<p>#tag</p>",
"#tag.",
"tag##",
"tag#",
"this is a match #tag"
].forEach(s => {
const m = s.match(regex)
if (m) {
console.log(s.replace(regex, "$1<span>$2</span>"))
}
})
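The loops above test one string at a time. To collect every hashtag from a longer text with the same lookbehind-free pattern, a sketch along these lines might work. The function name extractHashtags and the sample sentence are made up for illustration; the pattern is the one from the answer with the g flag added.

```javascript
// Hypothetical helper: collects all hashtags using the capture-group
// pattern from the answer (no lookbehind, so it should parse in Safari).
function extractHashtags(text) {
  const regex = /(?:^|[^\w#/])(#(?!\d+\b)\w+)\b/g;
  const tags = [];
  let m;
  // exec with the g flag resumes from regex.lastIndex on every call
  while ((m = regex.exec(text)) !== null) {
    tags.push(m[1]); // group 1 holds the tag without the preceding char
  }
  return tags;
}

const tags = extractHashtags("<p>#tag</p> and #32bit but not #123 or tag#tag");
console.log(tags); // [ '#tag', '#32bit' ]
```

Because the character before the # is consumed by the non-capture group, the tag itself is read from group 1 rather than from the full match.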
If you use the following pattern the second matching group contains what you want.
^(<\w*>)?(#\w+[a-zA-Z])
This satisfies your test cases. Not sure though whether you want this or not.
It doesn't work on #123, but I forgot that case and am too lazy to add it as a screenshot.
/^(?:[^#]*[^#\w])?(#[\w]*[a-zA-Z][\w]*).*$/g
If you only want to allow <tags> before the "#", you can instead use @kendle's solution for the first non-capture group (before the actual group starting with #).
(?:<\w*>)?
You can also achieve this without a capture group, using the regex below. Note that it relies on a negative lookbehind, which the question says does not run on Safari:
/(?<![#\w])#{1}(?!\d+\b)\w+/g
const regex = /(?<![#\w])#{1}(?!\d+\b)\w+/g;
const stringToTest = [
"#tag",
"#123",
"#32bit",
"##tag",
"http://google.com/#/test",
"tag##tag",
"tag#tag",
"<p>#tag</p>",
"#tag.",
"tag##",
"tag#",
"this is a match #tag",
];
stringToTest.forEach(str => {
const match = str.match(regex);
if (match) {
console.log(`${str} -> ${match[0]}`);
} else {
console.log(`${str} -> ${match}`);
}
});
Good luck!

JS: remove a part from a parameter which doesn't fit a pattern

const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
let newClass = tweet.text.replace(/[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}-[^1-9a-zA-Z]{3}/g, '');
console.log(`Found match: ${newClass}`);
};
When tweet.text = "123.qwe.456 test" I still get the same output, but I want to remove anything which doesn't fit the pattern
/[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/
Any ideas?
You can use a capture group to extract exactly what gets matched in your string and then replace your original variable with this value. Something like
const regex = /([1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3})/
const match = tweet.text.match(regex)
// match is null when nothing fits the pattern, so check before indexing
if (match) tweet.text = match[1]
Instead of using replace, you can get the match:
\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b
Explanation
\b A word boundary
[1-9a-zA-Z]{3} Match 3 times any of the listed (Note that 1-9 does not match a 0)
([-.]) Capture in group 1 either an - or .
[1-9a-zA-Z]{3} Match 3 times any of the listed
\1 Back reference to group 1, match the same as captured in group 1
[1-9a-zA-Z]{3} Match 3 times any of the listed
\b A word boundary
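The pattern explained above can be tried against the string from the question. This is a small sketch under the assumption that only the first code in the text is wanted; the function name extractCode is hypothetical.

```javascript
// Hypothetical helper: returns the first code of the form xxx-xxx-xxx or
// xxx.xxx.xxx, where both separators must be the same character.
function extractCode(text) {
  const regex = /\b[1-9a-zA-Z]{3}([-.])[1-9a-zA-Z]{3}\1[1-9a-zA-Z]{3}\b/;
  const m = text.match(regex);
  // m[0] is the whole match; m[1] would be the separator that was used
  return m ? m[0] : null;
}

console.log(extractCode("123.qwe.456 test")); // 123.qwe.456
console.log(extractCode("123-qwe.456 test")); // null (mixed separators)
console.log(extractCode("no code here"));     // null
```

The backreference \1 is what rejects the mixed-separator input: the second separator has to repeat whatever group 1 captured.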
const regex = /[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}-[1-9a-zA-Z]{3}/gm;
let m;
while ((m = regex.exec(tweet.text)) !== null) {
console.log(`Found match: ${m[0]}`);
}
Figured out the solution.

regular expression replacement in JavaScript with some part remaining intact

I need to parse a string that comes like this:
-38419-indices-foo-7119-attributes-10073-bar
Where there are numbers followed by one or more words all joined by dashes. I need to get this:
[
0 => '38419-indices-foo',
1 => '7119-attributes',
2 => '10073-bar',
]
I had thought of attempting to replace only the dash before a number with a : and then using .split(':') - how would I do this? I don't want to replace the other dashes.
IMO, the pattern is straightforward:
\d+\D+
To also get rid of the trailing -, you could go for
(\d+\D+)(?:-|$)
Or
\d+(?:(?!-\d|$).)+
You can see it here:
var myString = "-38419-indices-foo-7119-attributes-10073-bar";
var myRegexp = /(\d+\D+)(?:-|$)/g;
var result = [];
var match = myRegexp.exec(myString);
while (match != null) {
// matched text: match[0]
// match start: match.index
// capturing group n: match[n]
result.push(match[1]);
match = myRegexp.exec(myString);
}
console.log(result);
// alternative 2
let alternative_results = myString.match(/\d+(?:(?!-\d|$).)+/g);
console.log(alternative_results);
Or a demo on regex101.com.
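The approach sketched in the question itself (replace only the dash that precedes a number with a : and then split) can also be written directly with a lookahead. This is a minimal sketch, assuming the input never contains a : of its own and that no word starts with a digit; the function name splitBeforeNumbers is hypothetical.

```javascript
// Hypothetical helper: turns only the dashes that precede a digit into ":"
// and then splits on ":".
function splitBeforeNumbers(input) {
  return input
    .replace(/-(?=\d)/g, ':') // the lookahead leaves the digit in place
    .split(':')
    .filter(Boolean);         // drop the empty string caused by the leading dash
}

const parts = splitBeforeNumbers('-38419-indices-foo-7119-attributes-10073-bar');
console.log(parts); // [ '38419-indices-foo', '7119-attributes', '10073-bar' ]
```

Since split also accepts a regex, input.split(/-(?=\d)/) followed by the same filter would skip the intermediate replace step.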
Logic
lazy matching using quantifier .*?
Regex
.*?((\d+)\D*)(?!-)
https://regex101.com/r/WeTzF0/1
Test string
-38419-indices-foo-7119-attributes-10073-bar-333333-dfdfdfdf-dfdfdfdf-dfdfdfdfdfdf-123232323-dfsdfsfsdfdf
Further steps
You need to split from the matches and insert into your desired array.

Extract a part of a regex name

Examples of filenames
FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt
FDIP_fr-fr-nn_Text_v1_YYYYMMDD_SequenceNumber.txt
FDIP_de-de-nn_Text_v1_YYYYMMDD_SequenceNumber.txt
The regex is FDIP_([a-z]{2}-[A-Z]{2}-[a-z]{2})_Text_v1_[0-9]{8}_[0-9]{14}.txt
The only part I need is the translation code, which is 'en-gb', 'fr-fr', or 'de-de'.
How do I extract just that part of the filename?
I modified the regex a little to match the numbers and text.
Explanation
To capture a group, wrap that part of the regex in (); it is then captured as a group.
For named capturing, use (?<name_of_group>...) and then access the group by name.
Here is how the matching works:
[a-z]{2} matches 2 characters from a-z
[a-zA-Z0-9] matches any character in a-z, A-Z, or 0-9
g is the global flag, i.e. match all.
i means ignore case.
var r = /FDIP_([a-z]{2}-[A-Z]{2})-[a-z]{2}_Text_v1_[0-9A-Z]{8}_[A-Z0-9]{14}.txt/gi;
let t = 'FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt';
let dd = r.exec(t);
console.log(dd[1]);
This is an example of group capturing.
In the named version below, the group name in the regex matches the name used in the object destructuring.
const { groups: { language } } = /FDIP_(?<language>[a-z]{2}-[A-Z]{2})-[a-z]{2}_Text_v1_[0-9A-Z]{8}_[A-Z0-9]{14}.txt/gi.exec('FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt');
console.log(language);
To solve your problem, you should:
Fix your regex:
FDIP_([a-z]{2}-[A-Z]{2}-[a-z]{2})_Text_v1_[0-9]{8}_[0-9]{14}.txt
// to
FDIP_([a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_[0-9]{8}_[0-9]{14}.txt
Get the value of the first group by using the regex.exec function
const fileNames = [
'FDIP_en-gb-nn_Text_v1_20190101_12345678901234.txt',
'FDIP_fr-fr-nn_Text_v1_20200202_12345678901234.txt',
'FDIP_de-de-nn_Text_v1_20180808_12345678901234.txt']
const cultureNames = fileNames.map(name => {
const matched = /FDIP_([a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_[0-9]{8}_[0-9]{14}.txt/.exec(name)
return matched && matched[1]
})
console.log(cultureNames)
Change FDIP_([a-z]{2}-[A-Z]{2}-[a-z]{2})_Text_v1_[0-9]{8}_[0-9]{14}.txt
to
let pattern = /FDIP_([a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_[\w]{8}_[\w]{14}.txt/;
var str = 'FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt';
console.log(str.match(pattern)[1]);
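Putting the answers together, a stricter variant could anchor the pattern, escape the dot before txt, and name the groups. This is only a sketch: the function name parseFileName and the group names date and sequence are hypothetical, and it assumes real file names carry an 8-digit date and a 14-digit sequence number, as the regex in the question suggests.

```javascript
// Assumed layout: FDIP_<ll-cc>-<xx>_Text_v1_<8 digits>_<14 digits>.txt
const FILE_PATTERN =
  /^FDIP_(?<language>[a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_(?<date>\d{8})_(?<sequence>\d{14})\.txt$/;

// Hypothetical helper: returns the named parts, or null if the name does not fit.
function parseFileName(name) {
  const m = FILE_PATTERN.exec(name);
  return m ? { ...m.groups } : null;
}

const parsed = parseFileName('FDIP_en-gb-nn_Text_v1_20190101_12345678901234.txt');
console.log(parsed.language); // en-gb
console.log(parseFileName('FDIP_en-gb-nn_Text_v1_20190101_12345678901234Xtxt')); // null
```

Without the backslash, the . before txt matches any character, which is why the second call would succeed with the unescaped patterns shown earlier.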

Get the string between the last 2 / in regex in javascript

How can I get the string between the last 2 slashes using a regex in JavaScript?
for example:
stackoverflow.com/questions/ask/index.html => "ask"
http://regexr.com/foo.html?q=bar => "regexr.com"
https://www.w3schools.com/icons/default.asp => "icons"
You can use /\/([^/]+)\/[^/]*$/; [^/]*$ matches everything after the last slash, \/([^/]+)\/ matches the last two slashes, then you can capture what is in between and extract it:
var samples = ["stackoverflow.com/questions/ask/index.html",
"http://regexr.com/foo.html?q=bar",
"https://www.w3schools.com/icons/default.asp"]
console.log(
samples.map(s => s.match(/\/([^/]+)\/[^/]*$/)[1])
)
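Note that match returns null when the input has fewer than two slashes, so indexing the result with [1] directly would throw. A guarded version might look like the sketch below; the function name betweenLastTwoSlashes is hypothetical, and the regex is the one from this answer.

```javascript
// Hypothetical helper: returns the text between the last two slashes,
// or null when the input does not contain such a segment.
function betweenLastTwoSlashes(url) {
  const m = url.match(/\/([^/]+)\/[^/]*$/);
  return m ? m[1] : null;
}

console.log(betweenLastTwoSlashes("stackoverflow.com/questions/ask/index.html")); // ask
console.log(betweenLastTwoSlashes("http://regexr.com/foo.html?q=bar"));           // regexr.com
console.log(betweenLastTwoSlashes("no-slashes-here"));                            // null
```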
You can solve this by using split().
let a = 'stackoverflow.com/questions/ask/index.html';
let b = 'http://regexr.com/foo.html?q=bar';
let c = 'https://www.w3schools.com/icons/default.asp';
a = a.split('/')
b = b.split('/')
c = c.split('/')
Then index into the result of split():
console.log(a[a.length-2])
console.log(b[b.length-2])
console.log(c[c.length-2])
I personally do not recommend using a regex here, because it is hard to maintain.
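The split() steps above can be wrapped in one small function so the three inputs do not need separate variables. This is a minimal sketch; the name secondToLastSegment is hypothetical, and it returns null for inputs without any slash.

```javascript
// Hypothetical helper: splits on "/" and returns the second-to-last piece.
function secondToLastSegment(url) {
  const parts = url.split('/');
  // A string without "/" splits into a single piece, so there is nothing to return
  return parts.length >= 2 ? parts[parts.length - 2] : null;
}

const samples = [
  'stackoverflow.com/questions/ask/index.html',
  'http://regexr.com/foo.html?q=bar',
  'https://www.w3schools.com/icons/default.asp',
];
console.log(samples.map(secondToLastSegment)); // [ 'ask', 'regexr.com', 'icons' ]
```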
I believe that will do:
[^\/]+(?=\/[^\/]*$)
[^\/]+ matches one or more characters other than /. Following it with the lookahead (?=\/[^\/]*$) requires the match to be immediately followed by the last / in the string.
var urls = [
"stackoverflow.com/questions/ask/index.html",
"http://regexr.com/foo.html?q=bar",
"https://www.w3schools.com/icons/default.asp"
];
urls.forEach(url => console.log(url.match(/[^\/]+(?=\/[^\/]*$)/)[0]));
You can use (?=[^/]*\/[^/]*$)(.*?)(?=\/[^/]*$). You can test it here: https://www.regexpal.com/
The format of the regex is: (positive lookahead for second last slash)(.*?)(positive lookahead for last slash).
The (.*?) is a lazy match for what's between the slashes.
references:
Replace second to last "/" character in URL with a '#'
RegEx that will match the last occurrence of dot in a string
