JavaScript regex to find two characters between two delimiters - javascript

EDITED
I need to find two characters between '[' and ']' or between '/' and '/' using JavaScript.
I am using this regex:
([^.][/[string]]|\/string\/)|(\[(string))|(\/(string))| ((string)\])|((string)\/)
It matches the two characters, but it also matches a single character.
How can I match only the two characters?
The two characters must appear together, in order, somewhere inside the delimited string; the string does not have to consist of just those two characters.
Eg.
User input: dz
It must find only strings that contain "dz" as a contiguous pair, e.g. "dzone" but not "dazone". Currently I get matches for both "dzone" and "dazone".
Demo: https://regex101.com/r/FEs6ib/1

Between the delimiters, you could optionally repeat any character except the delimiters themselves, and capture what you want to keep in a group.
If you want multiple matches for /dzone/dzone/, you could assert the closing delimiter with a lookahead instead of consuming it, so that it can also serve as the opening delimiter of the next match.
The match is in group 1 or group 2, so check which of the two exists.
\/[^\/]*(dz)[^\/]*(?=\/)|\[[^\][]*(dz)[^\][]*(?=])
The pattern matches:
\/ Match /
[^\/]*(dz)[^\/]* Capture dz in group 1 between optional chars other than /
(?=\/) Positive lookahead, assert / to the right
| Or
\[ Match [
[^\][]*(dz)[^\][]* Capture dz in group 2 between optional chars other than [ and ]
(?=]) Positive lookahead, assert ] to the right
This will match 1 occurrence of dz in the word. If you want to match the whole word, the capture group can be broadened to before and after the negated character class like:
\/([^\/]*dz[^\/]*)(?=\/)|\[([^\][]*dz[^\][]*)(?=])
const regex = /\/[^\/]*(dz)[^\/]*(?=\/)|\[[^\][]*(dz)[^\][]*(?=])/g;
[
"[dzone]",
"/dzone/",
"/dzone/dzone/",
"/testdztest/",
"[dazone]",
"/dazone/",
"dzone",
"dazone"
].forEach(s =>
console.log(
`${s} --> ${Array.from(s.matchAll(regex), m => m[2] ? m[2] : m[1])}`
)
);
If supported, you might also match all occurrences of dz between the delimiters using lookarounds with an infinite quantifier:
(?<=\/[^\/]*)dz(?=[^\/]*\/)|(?<=\[[^\][]*)dz(?=[^\][]*])
const regex = /(?<=\/[^\/]*)dz(?=[^\/]*\/)|(?<=\[[^\][]*)dz(?=[^\][]*])/g;
[
"[adzadzone]",
"[dzone]",
"/dzone/",
"/dzone/dzone/",
"/testdztest/",
"[dazone]",
"/dazone/",
"dzone",
"dazone"
].forEach(s => {
const m = s.match(regex);
if (m) {
console.log(`${s} --> ${m}`);
}
});
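Since the two characters come from user input, the pattern can also be built at runtime. The following is a minimal sketch, assuming the input should be matched as literal text; escapeRegExp, buildDelimitedRegex, and findDelimited are hypothetical helper names that do not appear in the answer above.

```javascript
// Hypothetical helper: escape regex metacharacters so user input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build the same alternation as the capture-group pattern above,
// with the user's term in place of the hard-coded "dz".
function buildDelimitedRegex(term) {
  const t = escapeRegExp(term);
  return new RegExp(
    `\\/[^\\/]*(${t})[^\\/]*(?=\\/)|\\[[^\\][]*(${t})[^\\][]*(?=])`,
    "g"
  );
}

// Return the captured term once for every delimited segment that contains it.
function findDelimited(term, text) {
  return Array.from(text.matchAll(buildDelimitedRegex(term)), m => m[1] || m[2]);
}

console.log(findDelimited("dz", "[dzone]"));       // one match
console.log(findDelimited("dz", "/dazone/"));      // no match
console.log(findDelimited("dz", "/dzone/dzone/")); // two matches
```

Escaping matters here: without it, an input such as "a." would be read as "a followed by any character" rather than the literal two characters.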

Related

Regex match hashtag with exceptions

I have the current expression:
/(?<![http://|https://|#])#([\d\w]+[^\d\s<]+[^\s<>]+)/g
However it's not compatible to run on Safari. I'm trying to handle the following cases:
#tag => match
#123 => no match
#32bit => match
##tag => no match
http://google.com/#/test => no match
tag##tag => no match
tag#tag => no match
<p>#tag</p> => match only #tag
#tag. => match only #tag
tag## => no match
tag# => no match
this is a match #tag => only #tag
I wonder how I can make a character before the match result in a negative match. E.g. # and /.
Is there any alternative to negative look behind that is compatible with Safari?
Thanks in advance.
You might use a negated character class and a capture group, and make sure that there are not only digits.
Note that \w also matches \d
(?:^|[^\w#/])(#(?!\d+\b)\w+)\b
Explanation
(?: Non capture group
^ Assert the start of the string
| Or
[^\w#/] Match a single non word char other than # or /
) Close non capture group
( Capture group 1
# Match literally
(?!\d+\b) Negative lookahead, assert not only digits to the right followed by a word boundary
\w+ Match 1+ word characters
) Close group 1
\b A word boundary to prevent a partial word match
let regex = /(?:^|[^\w#/])(#(?!\d+\b)\w+)\b/;
[
"#tag",
"#123",
"#32bit",
"##tag",
"http://google.com/#/test",
"tag##tag",
"tag#tag",
"<p>#tag</p>",
"#tag.",
"tag##",
"tag#"
].forEach(s => {
const m = s.match(regex)
if (m) {
console.log(`${s} --> ${m[1]}`)
}
})
Using the matches in a replacement:
let regex = /((?:^|[^\w#/]))(#(?!\d+\b)\w+)\b/;
[
"#tag",
"#123",
"#32bit",
"##tag",
"http://google.com/#/test",
"tag##tag",
"tag#tag",
"<p>#tag</p>",
"#tag.",
"tag##",
"tag#",
"this is a match #tag"
].forEach(s => {
const m = s.match(regex)
if (m) {
console.log(s.replace(regex, "$1<span>$2</span>"))
}
})
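The snippets above test one hashtag per string. For text that may contain several hashtags, a sketch along the following lines adds the g flag to the same pattern; wrapHashtags and extractHashtags are hypothetical names introduced only for illustration.

```javascript
// Same pattern as in the replacement example, plus the g flag so that
// every hashtag in the text is handled, not just the first one.
const hashtagRegex = /(^|[^\w#/])(#(?!\d+\b)\w+)\b/g;

// Hypothetical helper: wrap each hashtag in a <span>,
// putting back the character that was consumed in front of it ($1).
function wrapHashtags(text) {
  return text.replace(hashtagRegex, "$1<span>$2</span>");
}

// Hypothetical helper: list the hashtags without modifying the text.
function extractHashtags(text) {
  return Array.from(text.matchAll(hashtagRegex), m => m[2]);
}

console.log(wrapHashtags("a #tag b #123 c ##no d #32bit"));
console.log(extractHashtags("<p>#tag</p> and #x1"));
```

Because the character before the hashtag is part of the match, the replacement has to restore it with $1; that is the price of avoiding lookbehind.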
If you use the following pattern the second matching group contains what you want.
^(<\w*>)?(#\w+[a-zA-Z])
This satisfies your test cases. Not sure though whether you want this or not.
It doesn't work on #123, but I forgot about that case and am now too lazy to add it as a screenshot.
/^(?:[^#]*[^#\w])?(#[\w]*[a-zA-Z][\w]*).*$/g
If you only want to allow <tags> before the "#", you can instead use #kendle's solution for the first non-capture group (before the actual group starting with #).
(?:<\w*>)?
You can also achieve this, without a capture group, with this regex:
/(?<![#\w])#{1}(?!\d+\b)\w+/g
const regex = /(?<![#\w])#{1}(?!\d+\b)\w+/g;
const stringToTest = [
"#tag",
"#123",
"#32bit",
"##tag",
"http://google.com/#/test",
"tag##tag",
"tag#tag",
"<p>#tag</p>",
"#tag.",
"tag##",
"tag#",
"this is a match #tag",
];
stringToTest.forEach(str => {
const match = str.match(regex);
if (match) {
console.log(`${str} -> ${match[0]}`);
} else {
console.log(`${str} -> ${match}`);
}
});
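Note that this pattern relies on lookbehind, which the question says is not available in the asker's Safari. One possible way to combine both approaches is to detect support at runtime and fall back to a capture group. This is only a sketch: supportsLookbehind and findHashtags are hypothetical names, and the fallback pattern is an adaptation of the lookbehind pattern, not something taken from the answers.

```javascript
// Detect lookbehind support at runtime. The pattern is built from a string
// because a regex literal containing lookbehind is a syntax error at parse
// time in engines that lack the feature.
function supportsLookbehind() {
  try {
    new RegExp("(?<!a)b");
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper. The second parameter exists only so that
// both branches can be exercised in any engine.
function findHashtags(text, useLookbehind = supportsLookbehind()) {
  if (useLookbehind) {
    const re = new RegExp("(?<![#\\w])#(?!\\d+\\b)\\w+", "g");
    return text.match(re) || [];
  }
  // Fallback without lookbehind: consume the preceding character
  // and keep only the captured hashtag.
  const re = /(?:^|[^\w#])(#(?!\d+\b)\w+)/g;
  const tags = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    tags.push(m[1]);
  }
  return tags;
}

console.log(findHashtags("this is #tag and #123 plus #32bit"));
```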
Good luck !

Getting Multiple Matches with RegExp in JavaScript

I have a string like this:
`DateTime.now().setZone("America Blorp");`
This is my RegEx:
string.match(/DateTime\.(.*)[^)][(;]/)
How can I modify my RegEx so that I can get matches like this:
DateTime.now and DateTime.now.setZone.
I have tried to group matches like this
string.match(/DateTime\.(.*)([^)]*)([(;]*)/)
But I don't get the expected output. Can anyone please help me with this?
PS. I can only use match function, cannot use matchAll.
const string = `DateTime.now().setZone("America Blorp");`
console.log(
string.match(/DateTime\.(.*)[^)][(;]/)
)
You could match the format using 2 capture groups and concat the groups.
\b(DateTime\.now)\(\)(\.[^.()]+)\([^()]*\);
The pattern matches:
\b A word boundary to prevent a partial match
(DateTime\.now) Capture group 1, match DateTime.now
\(\) Match ()
(\.[^.()]+) Capture group 2, match . and 1+ times any char except . or ( and )
\([^()]*\); Match from ( till ) and ;
See a regex demo.
const regex = /\b(DateTime\.now)\(\)(\.[^.()]+)\([^()]*\);/;
const str = `DateTime.now().setZone("America Blorp");`;
const match = str.match(regex);
if (match) {
console.log(match[1] + match[2]);
}
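If both results from the question are wanted (DateTime.now and DateTime.now.setZone), the same two groups can be combined into a list. A minimal sketch, using only match as the question requires; methodChains is a hypothetical helper name.

```javascript
// Hypothetical helper: return the base call and the chained call
// as two dotted names, using the two capture groups from the pattern above.
function methodChains(source) {
  const regex = /\b(DateTime\.now)\(\)(\.[^.()]+)\([^()]*\);/;
  const match = source.match(regex);
  // Group 1 is "DateTime.now", group 2 is ".setZone".
  return match ? [match[1], match[1] + match[2]] : [];
}

console.log(methodChains(`DateTime.now().setZone("America Blorp");`));
// [ 'DateTime.now', 'DateTime.now.setZone' ]
```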

Regex match a sequence of numbers and a character . between them

So I need to get all ${{1.33.98}} strings from a string.
new RegExp('\\$\\{\\{(.*?)\\}\\}', 'g'); doesn't work well in a case like:
${{1.33.98}${{2.44.1}} - should match only ${{2.44.1}} because ${{1.33.98} is missing a } in this example.
So it shouldn't match if the string is missing either of the two {, either of the two }, or the $.
Between {{ and }} can be only a sequence of numbers separated by a dot - ex. 4.23.4545
Thanks
You match unwelcome values because . can match any char, and thus it matches any chars from the leftmost {{ to the first }} to the right of {{.
You may use
/\${{(\d[\d.]*)}}/g
Or, if the dot-separated number format is important
/\${{(\d+(?:\.\d+)*)}}/g
See this regex demo and this regex demo.
Note that if the strings are prevalidated, and you are sure there are no { and } inside ${{ and }}, you may even use [^{}]* instead of \d[\d.]*:
/\${{([^{}]*)}}/g
So, you either capture
\d[\d.]* - a digit and then 0 or more digits and dots
or
\d+(?:\.\d+)* - 1+ digits and then 0+ repetitions of . and 1+ digits.
JS demo:
const s = '${{1.33.98}${{2.44.1}} ${{1.24.52.44.1}}';
let m = [...s.matchAll(/\${{(\d[\d.]*)}}/g)];
console.log(Array.from(m, x => x[1]));
For legacy ES versions:
var s = '${{1.33.98}${{2.44.1}} ${{1.24.52.44.1}}';
var rx = /\${{(\d[\d.]*)}}/g, results = [], m;
while (m=rx.exec(s)) {
results.push(m[1]);
}
console.log(results);
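The difference between the two digit patterns shows up on malformed input. The sketch below compares them on a made-up sample string; the braces are escaped here for explicitness, and captures is a hypothetical helper name.

```javascript
// Loose: a digit followed by any mix of digits and dots.
const loose = /\$\{\{(\d[\d.]*)\}\}/g;
// Strict: digit runs separated by single dots.
const strict = /\$\{\{(\d+(?:\.\d+)*)\}\}/g;

// Hypothetical helper: collect group 1 of every match.
function captures(regex, text) {
  return Array.from(text.matchAll(regex), m => m[1]);
}

// Made-up sample: doubled dot, trailing dot, valid value, missing brace.
const sample = '${{1..2}} ${{3.}} ${{4.5}} ${{6}';

console.log(captures(loose, sample));  // [ '1..2', '3.', '4.5' ]
console.log(captures(strict, sample)); // [ '4.5' ]
```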

Filter version number from string in javascript?

I found some threads about extracting version number from a string on here but none that does exactly what I want.
How can I filter out the following version numbers from a string with javascript/regex?
Title_v1_1.00.mov filters 1
v.1.0.1-Title.mp3 filters 1.0.1
Title V.3.4A. filters 3.4A
V3.0.4b mix v2 filters 3.0.4b
So look for the first occurrence of: "v" or "v." followed by a digit, followed by digits, letters or dots until either the end of the string or until a whitespace occurs or until a dot (.) occurs with no digit after it.
As per the comments, to match the first version number in the string you could use a capturing group:
^.*?v\.?(\d+(?:\.\d+[a-z]?)*)
That will match:
^ Assert the start of the string
.*? Match 0+ any character non greedy
v\.? Match v followed by an optional dot
( Capturing group
\d+ Match 1+ digits
(?: Non capturing group
\.\d+[a-z]? Match a dot, 1+ digits followed by an optional character a-z
)* Close non capturing group and repeat 0+ times
) Close capturing group
If the character like A in V.3.4A can only be in the last part, you could use:
^.*?v\.?(\d+(?:\.\d+)*[a-z]?)
const strings = [
"Title_v1_1.00.mov filters 1",
"v.1.0.1-Title.mp3 filters 1.0.1",
"Title V.3.4A. filters 3.4A",
"V3.0.4b mix v2 filters 3.0.4b"
];
let pattern = /^.*?v\.?(\d+(?:\.\d+[a-z]?)*)/i;
strings.forEach((s) => {
console.log(s.match(pattern)[1]);
});
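The snippet above indexes the match result directly, which throws if a string contains no version number. A small sketch of a null-safe wrapper around the same pattern; extractVersion is a hypothetical name.

```javascript
// Same pattern as above, case-insensitive.
const versionPattern = /^.*?v\.?(\d+(?:\.\d+[a-z]?)*)/i;

// Hypothetical helper: return the first version number, or null if there is none.
function extractVersion(name) {
  const match = name.match(versionPattern);
  return match ? match[1] : null;
}

console.log(extractVersion("Title V.3.4A."));   // 3.4A
console.log(extractVersion("V3.0.4b mix v2"));  // 3.0.4b
console.log(extractVersion("Title.mov"));       // null
```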
Details:
v - character "v"
(?:\.)? - matches 1 or 0 repetition of "."
Version capturing group
[0-9a-z\.]* - Matches alphanumeric and "." character
[0-9a-z] - ensures that the version number doesn't end with "."
You can use RegExp.exec() method to extract matches from string one by one.
// No g flag: a global regex remembers lastIndex between exec() calls,
// so the search in the next string would start at a stale offset.
const regex = /v(?:\.?)([0-9a-z\.]*[0-9a-z]).*/i;
const str = [
  "Title_v1_1.00.mov filters 1",
  "v.1.0.1-Title.mp3 filters 1.0.1",
  "Title V.3.4A. filters 3.4A",
  "V3.0.4b mix v2 filters 3.0.4b"
];
const versions = [];
for (const s of str) {
  // exec() returns the first match; index 1 is the first capturing group, i.e. the version number
  const v = regex.exec(s);
  if (v !== null) {
    versions.push(v[1]); // appends the version number to the array
  }
}
console.log(versions);
var test = [
"Title_v1_1.00.mov filters 1",
"v.1.0.1-Title.mp3 filters 1.0.1",
"Title V.3.4A. filters 3.4A",
"V3.0.4b mix v2 filters 3.0.4b",
];
console.log(test.map(function (a) {
return a.match(/v\.?([0-9a-z]+(?:\.[0-9a-z]+)*)/i)[1];
}));
Explanation:
/ # regex delimiter
v # letter v
\.? # optional dot
( # start group 1, it will contain the version number
[0-9a-z]+ # 1 or more alphanumeric
(?: # start non capture group
\. # a dot
[0-9a-z]+ # 1 or more alphanumeric
)* # end group, may appear 0 or more times
) # end group 1
/i # regex delimiter and flag case insensitive

What will be the regular expression for below requirement in javascript

Criteria:
Any word that starts with a and ends with b, with a digit as the middle character. The word must not be on a line that starts with the character '#'.
Given string:
a1b a2b a3b
#a4b a5b a6b
a7b a8b a9b
Expected output:
a1b
a2b
a3b
a7b
a8b
a9b
Regex: ? I need it for JavaScript.
So far I have tried this:
var text_content = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b"; // the string given above
var reg_exp = /^[^#]?a[0-9]b/gmi;
var matched_text = text_content.match(reg_exp);
console.log(matched_text);
I get this output:
[ 'a1b', ' a7b' ]
Your /^[^#]?a[0-9]b/gmi matches, at the start of each line, an optional character other than #, then a, a digit, and b. It neither checks for whole words nor matches words anywhere other than at the beginning of a line.
You may use a regex that will match lines starting with # and match and capture the words you need in other contexts:
var s = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b";
var res = [];
s.replace(/^[^\S\r\n]*#.*|\b(a\db)\b/gm, function($0,$1) {
if ($1) res.push($1);
});
console.log(res);
Pattern details:
^ - start of a line (as m multiline modifier makes ^ match the line start)
[^\S\r\n]* - 0+ horizontal whitespaces
#.* - a # and any 0+ chars up to the end of a line
| - or
\b - a leading word boundary
(a\db) - Group 1 capturing a, a digit, a b
\b - a trailing word boundary.
Inside the replace() method, a callback is used where the res array is populated with the contents of Group 1 only.
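Where String.prototype.matchAll is available, the same alternation can be used without a replace() callback. A sketch under that assumption; wordsOutsideHashLines is a hypothetical name.

```javascript
// Hypothetical helper: same pattern as above. Lines starting with # are
// matched as a whole (group 1 stays undefined) and then filtered out.
function wordsOutsideHashLines(text) {
  const re = /^[^\S\r\n]*#.*|\b(a\db)\b/gm;
  return Array.from(text.matchAll(re))
    .filter(m => m[1] !== undefined)
    .map(m => m[1]);
}

const input = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b";
console.log(wordsOutsideHashLines(input));
// [ 'a1b', 'a2b', 'a3b', 'a7b', 'a8b', 'a9b' ]
```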
I would suggest using two regexes.
The first regex fetches the lines that do not start with #:
^[^#][a\db\s]+
Then another regex fetches the individual words from each line:
^a\db\s
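A two-pass approach along these lines could be sketched as follows. Note that a word pattern anchored with ^ would only find the first word on each line, so this sketch uses word boundaries instead, and it filters lines with startsWith rather than a regex. wordsOnUnhashedLines is a hypothetical name.

```javascript
// Hypothetical helper: pass 1 drops lines that start with #,
// pass 2 collects the a<digit>b words from the remaining lines.
function wordsOnUnhashedLines(text) {
  return text
    .split(/\r?\n/)
    .filter(line => !line.startsWith("#"))
    .flatMap(line => line.match(/\ba\db\b/g) || []);
}

console.log(wordsOnUnhashedLines("a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b"));
// [ 'a1b', 'a2b', 'a3b', 'a7b', 'a8b', 'a9b' ]
```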
