javascript multiple regex matches - javascript

Given the string below
[NeMo (PROD)] 10.10.100.100 (EFA-B-3) [Brocade FC-Switch ] Sensor: Power Supply #1 (SNMP Custom Table) Down (No Such Name (SNMP error # 2))
I try to get multiple matches to extract the following values:
var system = "PROD";
var ip = "10.10.100.100";
var location = "EFA-B-3";
var device = "Brocade FC-Switch";
var sensor = "Sensor: Power Supply #1";
var sensorArt = "SNMP Custom Table";
var sensorState = "Down";
var errorMsg = "No Such Name (SNMP error # 2)";
Since I am a beginner with regex I tried to define some "rules":
1. Extract the first value within the first round brackets, e.g. PROD
2. Extract the value between the first closing square bracket and the second opening round bracket, e.g. 10.10.100.100
3. Extract the value within the second round brackets, e.g. EFA-B-3
4. Extract the value within the second square brackets, e.g. Brocade FC-Switch
5. Extract the value between the second closing square bracket and the third opening round bracket, e.g. Sensor: Power Supply #1
6. Extract the value within the third round brackets, e.g. SNMP Custom Table
7. Extract the value between the third closing round bracket and the fourth opening round bracket, e.g. Down
8. Extract the value within the fourth round brackets, e.g. No Such Name (SNMP error # 2)
Using the webpage https://scriptular.com/ I tried to achieve my goal.
So far I managed to build the regex
(?=(([^)]+)))
which gives me my first match (rule 1). Somehow I fail to declare the regex to look between the brackets. What am I missing?

Since there is no way to define separators, the only way is to match the parts and capture them separately.
/\(([^()]+)\)]\s*(.*?)\s*\(([^()]*)\)\s*\[([^\][]*)]\s*(.*?)\s*\(([^()]+)\)\s*(.*?)\s*\((.*)\)/
See the regex demo.
Details
\( - a ( char
([^()]+) - Group 1: 1 or more chars other than ( and )
\)]\s* - )] and 0+ whitespaces
(.*?) - Group 2: any 0+ chars other than line break chars, as few as possible
\s*\( - 0+ whitespaces, (
([^()]*) - Group 3: 0 or more chars other than ( and )
\)\s*\[ - ), 0+ whitespaces, [
([^\][]*) - Group 4: 0 or more chars other than [ and ]
]\s* - ] and 0+ whitespaces
(.*?) - Group 5: any 0+ chars other than line break chars, as few as possible
\s*\( - 0+ whitespaces, (
([^()]+) - Group 6: 1 or more chars other than ( and )
\)\s* - ) and 0+ whitespaces
(.*?) - Group 7: any 0+ chars other than line break chars, as few as possible
\s*\( - 0+ whitespaces and (
(.*) - Group 8: any 0+ chars other than line break chars, as many as possible
\) - ) char.
ES6+ code snippet:
var s = "[NeMo (PROD)] 10.10.100.100 (EFA-B-3) [Brocade FC-Switch ] Sensor: Power Supply #1 (SNMP Custom Table) Down (No Such Name (SNMP error # 2))";
let [_, system, ip, location1, device, sensor, sensorArt, sensorState, errorMsg] = s.match(/\(([^()]+)\)]\s*(.*?)\s*\(([^()]*)\)\s*\[([^\][]*)]\s*(.*?)\s*\(([^()]+)\)\s*(.*?)\s*\((.*)\)/);
console.log(`System=${system}\nIP=${ip}\nLocation=${location1}\nDevice=${device}\nSensor=${sensor}\nSensorArt=${sensorArt}\nSensorState=${sensorState}\nErrorMsg=${errorMsg}`);
ES5:
var s = "[NeMo (PROD)] 10.10.100.100 (EFA-B-3) [Brocade FC-Switch ] Sensor: Power Supply #1 (SNMP Custom Table) Down (No Such Name (SNMP error # 2))";
var system, ip, location1, device, sensor, sensorArt, sensorState, errorMsg;
var rx = /\(([^()]+)\)]\s*(.*?)\s*\(([^()]*)\)\s*\[([^\][]*)]\s*(.*?)\s*\(([^()]+)\)\s*(.*?)\s*\((.*)\)/;
var m = s.match(rx); // declare m instead of creating an implicit global
if (m) {
system = m[1];
ip = m[2];
location1=m[3];
device=m[4];
sensor=m[5];
sensorArt=m[6];
sensorState=m[7];
errorMsg=m[8];
}
console.log("System="+system+"\nIP="+ip+"\nLocation="+location1+"\nDevice="+device+"\nSensor="+sensor+"\nSensorArt="+sensorArt+"\nSensorState="+sensorState+"\nErrorMsg="+errorMsg);
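Both snippets assume the line always matches. When `match` returns `null`, the ES6 destructuring throws a TypeError and the ES5 variables stay undefined. A minimal sketch of a guarded variant follows; the helper name `parseSensorLine` and the shape of the returned object are illustrative assumptions, not part of the original answer.

```javascript
// Same pattern as in the answer above.
const SENSOR_RX = /\(([^()]+)\)]\s*(.*?)\s*\(([^()]*)\)\s*\[([^\][]*)]\s*(.*?)\s*\(([^()]+)\)\s*(.*?)\s*\((.*)\)/;

// Hypothetical helper: returns an object with the eight fields,
// or null when the line does not have the expected layout.
function parseSensorLine(line) {
  const m = line.match(SENSOR_RX);
  if (!m) return null;
  const [, system, ip, location, device, sensor, sensorArt, sensorState, errorMsg] = m;
  // Group 4 can carry trailing whitespace ("Brocade FC-Switch "), so trim it.
  return { system, ip, location, device: device.trim(), sensor, sensorArt, sensorState, errorMsg };
}

const line = "[NeMo (PROD)] 10.10.100.100 (EFA-B-3) [Brocade FC-Switch ] Sensor: Power Supply #1 (SNMP Custom Table) Down (No Such Name (SNMP error # 2))";
console.log(parseSensorLine(line));
console.log(parseSensorLine("not a sensor line")); // null
```

Returning `null` leaves the decision about malformed lines to the caller instead of failing inside the parser.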

Related

Javascript regex to find two characters between two delimitators

EDITED
I need to find two characters between '[' ']' and '/' '/' using Javascript.
I am using this regex:
([^.][/[string]]|\/string\/)|(\[(string))|(\/(string))| ((string)\])|((string)\/)
It matches the two characters, but it also matches when only one of them is present.
How can I match only the two characters together?
I also want to find the two characters anywhere inside the delimited string, not only when the whole string equals them.
E.g.
User input: dz
It must match only strings that contain "dz" as written, e.g. "dzone" but not "dazone". Currently both "dzone" and "dazone" match.
Demo: https://regex101.com/r/FEs6ib/1
You could optionally repeat any character except the delimiters between the delimiters themselves, and capture what you want to keep in a group.
If you want multiple matches for /dzone/dzone/, you could assert the closing delimiter to the right with a lookahead instead of matching it, so it stays available as the opening delimiter of the next match.
The matches are in group 1 or group 2; check which of the two is defined.
\/[^\/]*(dz)[^\/]*(?=\/)|\[[^\][]*(dz)[^\][]*(?=])
The pattern matches:
\/ Match /
[^\/]*(dz)[^\/]* Capture dz in group 1 between optional chars other than /
(?=\/) Positive lookahead, assert / to the right
| Or
\[ Match [
[^\][]*(dz)[^\][]* Capture dz in group 2 between optional chars other than [ and ]
(?=]) Positive lookahead, assert ] to the right
Regex demo
This will match 1 occurrence of dz in the word. If you want to match the whole word, the capture group can be broadened to before and after the negated character class like:
\/([^\/]*dz[^\/]*)(?=\/)|\[([^\][]*dz[^\][]*)(?=])
Regex demo
const regex = /\/[^\/]*(dz)[^\/]*(?=\/)|\[[^\][]*(dz)[^\][]*(?=])/g;
[
"[dzone]",
"/dzone/",
"/dzone/dzone/",
"/testdztest/",
"[dazone]",
"/dazone/",
"dzone",
"dazone"
].forEach(s =>
console.log(
`${s} --> ${Array.from(s.matchAll(regex), m => m[2] ? m[2] : m[1])}`
)
);
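The broadened pattern, which captures the whole delimited word instead of just dz, can be exercised the same way. This is a small sketch; the names `wholeWordRegex` and `wordsContainingDz` are made up for the illustration.

```javascript
// Broadened pattern: group 1 holds the text between slashes,
// group 2 the text between square brackets.
const wholeWordRegex = /\/([^\/]*dz[^\/]*)(?=\/)|\[([^\][]*dz[^\][]*)(?=])/g;

// Hypothetical helper: collects every delimited word that contains "dz".
function wordsContainingDz(s) {
  return Array.from(s.matchAll(wholeWordRegex), m => (m[1] !== undefined ? m[1] : m[2]));
}

["[dzone]", "/dzone/dzone/", "/testdztest/", "[dazone]", "dzone"].forEach(s =>
  console.log(`${s} --> ${JSON.stringify(wordsContainingDz(s))}`)
);
```

Because the closing delimiter is only asserted, not consumed, /dzone/dzone/ yields two words.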
If supported, you might also match all occurrences of dz between the delimiters using lookarounds with an infinite quantifier:
(?<=\/[^\/]*)dz(?=[^\/]*\/)|(?<=\[[^\][]*)dz(?=[^\][]*])
Regex demo
const regex = /(?<=\/[^\/]*)dz(?=[^\/]*\/)|(?<=\[[^\][]*)dz(?=[^\][]*])/g;
[
"[adzadzone]",
"[dzone]",
"/dzone/",
"/dzone/dzone/",
"/testdztest/",
"[dazone]",
"/dazone/",
"dzone",
"dazone"
].forEach(s => {
const m = s.match(regex);
if (m) {
console.log(`${s} --> ${m}`); // reuse the stored result instead of matching twice
}
});
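Whether lookbehind is supported can be probed at runtime, because engines without it throw a SyntaxError when such a pattern is compiled. The sketch below is one possible way to do that; `supportsLookbehind` and `dzRegex` are hypothetical names. Note that the two branches are not equivalent: the fallback reports one match per delimited segment and its match text includes the opening delimiter, so the captured groups have to be read instead.

```javascript
// Returns true when the engine can compile a lookbehind assertion.
function supportsLookbehind() {
  try {
    new RegExp("(?<=a)b");
    return true;
  } catch (e) {
    return false; // engines without lookbehind throw a SyntaxError here
  }
}

// The lookbehind pattern is built from a string so that an engine without
// lookbehind support does not fail while parsing the script itself.
const dzRegex = supportsLookbehind()
  ? new RegExp("(?<=\\/[^\\/]*)dz(?=[^\\/]*\\/)|(?<=\\[[^\\][]*)dz(?=[^\\][]*])", "g")
  : /\/[^\/]*(dz)[^\/]*(?=\/)|\[[^\][]*(dz)[^\][]*(?=])/g;

console.log(supportsLookbehind());
console.log("[adzadzone]".match(dzRegex));
```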

Regex Conditional Lookahead JavaScript

I just used regex101, to create the following regex.
([^,]*?)=(.*?)(?(?=, )(?:, )|(?:$))(?(?=[^,]*?=)(?:(?=[^,]*?=))|(?:$))
It seems to work perfectly for my use case of getting keys and values that are comma separated while still preserving commas in the values.
Problem is, I want to use this Regex in Node.js (JavaScript), but while writing this entire Regex in regex101, I had it set to PCRE (PHP).
It looks like JavaScript doesn't support Conditional Lookaheads ((?(?=...)()|()).
Is there a way to get this working in JavaScript?
Examples:
2 matches
group 1: id, group 2: 1
group 1: name, group 2: bob
id=1, name=bob
3 matches
group 1: id, group 2: 2
group 1: type, group 2: store
group 1: description, group 2: Hardwood Store
id=2, type=store, description=Hardwood Store
4 matches
group 1: id, group 2: 4
group 1: type, group 2: road
group 1: name, group 2: The longest road name, in the entire world, and universe, forever
group 1: built, group 2: 20190714
id=4, type=road, name=The longest road name, in the entire world, and universe, forever, built=20190714
3 matches
group 1: id, group 2: 3
group 1: type, group 2: building
group 1: builder, group 2: Random Name, and Other Person, with help from Final Person
id=3, type=building, builder=Random Name, and Other Person, with help from Final Person
You may use
/([^,=\s][^,=]*)=(.*?)(?=(?:,\s*)?[^,=]*=|$)/g
See the regex demo.
Details
([^,=\s][^,=]*) - Group 1:
[^,=\s] - a char other than ,, = and whitespace
[^,=]* - zero or more chars other than , and =
= - a = char
(.*?) - Group 2: any zero or more chars other than line break chars, as few as possible
(?=(?:,\s*)?[^,=]*=|$) - a positive lookahead that requires an optional sequence of , and 0+ whitespaces and then 0+ chars other than , and = and then a = or end of string immediately to the right of the current location
JS demo:
var strs = ['id=1, name=bob','id=2, type=store, description=Hardwood Store', 'id=4, type=road, name=The longest road name, in the entire world, and universe, forever, built=20190714','id=3, type=building, builder=Random Name, and Other Person, with help from Final Person']
var rx = /([^,=\s][^,=]*)=(.*?)(?=(?:,\s*)?[^,=]*=|$)/g;
for (var s of strs) {
console.log("STRING:", s);
var m;
while (m=rx.exec(s)) {
console.log(m[1], m[2])
}
}
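If the pairs are needed as an object rather than as log output, the same pattern can be combined with `matchAll`. A minimal sketch, assuming keys are unique within a line; `pairRx` and `parsePairs` are illustrative names.

```javascript
// Same pattern as in the answer above.
const pairRx = /([^,=\s][^,=]*)=(.*?)(?=(?:,\s*)?[^,=]*=|$)/g;

// Hypothetical helper: turns "id=1, name=bob" into { id: "1", name: "bob" }.
// Values stay strings; commas inside a value are preserved.
function parsePairs(s) {
  const result = {};
  for (const m of s.matchAll(pairRx)) {
    result[m[1]] = m[2];
  }
  return result;
}

console.log(parsePairs("id=1, name=bob"));
console.log(parsePairs("id=3, type=building, builder=Random Name, and Other Person, with help from Final Person"));
```

`matchAll` works on a copy of the regex, so the `lastIndex` of `pairRx` is not carried over from one input string to the next.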
Maybe, these expressions would be somewhat close to what you might want to design:
([^=\n\r]*)=\s*([^=\n\r]*)\s*(?:,|$)
or
\s*([^=\n\r]*)=\s*([^=\n\r]*)\s*(?:,|$)
not sure though.
DEMO
The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.
const regex = /\s*([^=\n\r]*)=\s*([^=\n\r]*)\s*(?:,|$)/gm;
const str = `id=3, type=building, builder=Random Name, and Other Person, with help from Final Person
id=4, type=road, name=The longest road name, in the entire world, and universe, forever, built=20190714
id=2, type=store, description=Hardwood Store
id=1, name=bob
`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
Yet another way to do it
\s*([^,=]*?)\s*=\s*((?:(?![^,=]*=)[\S\s])*)(?=[=,]|$)
https://regex101.com/r/J6SSGr/1
Readable version
\s*
( [^,=]*? ) # (1), Key
\s* = \s* # =
( # (2 start), Value
(?:
(?! [^,=]* = )
[\S\s]
)*
) # (2 end)
(?= [=,] | $ )
Ultimate PCRE version
\s*([^,=]*?)\s*=\s*((?:(?!\s*[^,=]*=)[\S\s])*(?<![,\s]))\s*(?=[=,\s]|$)
https://regex101.com/r/slfMR1/1
\s* # Wsp trim
( [^,=]*? ) # (1), Key
\s* = \s* # Wsp trim = Wsp trim
( # (2 start), Value
(?:
(?! \s* [^,=]* = )
[\S\s]
)*
(?<! [,\s] ) # Wsp trim
) # (2 end)
\s* # Wsp trim
(?= [=,\s] | $ ) # Field separator

Filter version number from string in javascript?

I found some threads about extracting version number from a string on here but none that does exactly what I want.
How can I filter out the following version numbers from a string with javascript/regex?
Title_v1_1.00.mov filters 1
v.1.0.1-Title.mp3 filters 1.0.1
Title V.3.4A. filters 3.4A
V3.0.4b mix v2 filters 3.0.4b
So look for the first occurrence of: "v" or "v." followed by a digit, followed by digits, letters or dots until either the end of the string or until a whitespace occurs or until a dot (.) occurs with no digit after it.
As per the comments, to match the first version number in the string you could use a capturing group:
^.*?v\.?(\d+(?:\.\d+[a-z]?)*)
Regex demo
That will match:
^ Assert the start of the string
.*? Match 0+ any character non greedy
v\.? Match v followed by an optional dot
( Capturing group
\d+ Match 1+ digits
(?: Non capturing group
\.\d+[a-z]? Match a dot, 1+ digits followed by an optional character a-z
)* Close non capturing group and repeat 0+ times
) Close capturing group
If the character like A in V.3.4A can only be in the last part, you could use:
^.*?v\.?(\d+(?:\.\d+)*[a-z]?)
const strings = [
"Title_v1_1.00.mov filters 1",
"v.1.0.1-Title.mp3 filters 1.0.1",
"Title V.3.4A. filters 3.4A",
"V3.0.4b mix v2 filters 3.0.4b"
];
let pattern = /^.*?v\.?(\d+(?:\.\d+[a-z]?)*)/i;
strings.forEach((s) => {
console.log(s.match(pattern)[1]);
});
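The snippet above reads `[1]` from the match result directly, which throws when a string contains no version number. A small guarded sketch using the same pattern; `versionPattern` and `extractVersion` are illustrative names.

```javascript
// Same pattern as above, case-insensitive.
const versionPattern = /^.*?v\.?(\d+(?:\.\d+[a-z]?)*)/i;

// Hypothetical helper: returns the first version number,
// or null when the string contains none.
function extractVersion(s) {
  const m = s.match(versionPattern);
  return m ? m[1] : null;
}

console.log(extractVersion("Title V.3.4A."));   // 3.4A
console.log(extractVersion("no version here")); // null
```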
Details:
v - character "v"
(?:\.)? - matches 1 or 0 repetition of "."
Version capturing group
[0-9a-z\.]* - matches alphanumeric characters and "."
[0-9a-z] - ensures that the version number doesn't end with "."
You can use RegExp.exec() method to extract matches from string one by one.
// No g flag: a global regex keeps lastIndex between exec() calls,
// which would make it skip matches when reused on different strings.
const regex = /v(?:\.?)([0-9a-z\.]*[0-9a-z]).*/i;
let str = [
    "Title_v1_1.00.mov filters 1",
    "v.1.0.1-Title.mp3 filters 1.0.1",
    "Title V.3.4A. filters 3.4A",
    "V3.0.4b mix v2 filters 3.0.4b"
];
let versions = [];
let v; // variable to store match
for (let i = 0; i < str.length; i++) {
    // Executes a check on str[i] to get the result of first capturing group i.e., our version number
    if ((v = regex.exec(str[i])) !== null)
        versions.push(v[1]); // appends the version number to the array
}
console.log(versions);
var test = [
"Title_v1_1.00.mov filters 1",
"v.1.0.1-Title.mp3 filters 1.0.1",
"Title V.3.4A. filters 3.4A",
"V3.0.4b mix v2 filters 3.0.4b",
];
console.log(test.map(function (a) {
return a.match(/v\.?([0-9a-z]+(?:\.[0-9a-z]+)*)/i)[1];
}));
Explanation:
/ # regex delimiter
v # letter v
\.? # optional dot
( # start group 1, it will contain the version number
[0-9a-z]+ # 1 or more alphanumeric
(?: # start non capture group
\. # a dot
[0-9a-z]+ # 1 or more alphanumeric
)* # end group, may appear 0 or more times
) # end group 1
/i # regex delimiter and flag case insensitive

What will be the regular expression for below requirement in javascript

Criteria:
Any word that starts with a and ends with b, with a single digit in between. The word must not be on a line that starts with the char '#'.
Given string:
a1b a2b a3b
#a4b a5b a6b
a7b a8b a9b
Expected output:
a1b
a2b
a3b
a7b
a8b
a9b
Regex: ? I need it for JavaScript.
So far I have tried the following:
var text_content =above_mention_content
var reg_exp = /^[^#]?a[0-9]b/gmi;
var matched_text = text_content.match(reg_exp);
console.log(matched_text);
I get the following output:
[ 'a1b', ' a7b' ]
Your /^[^#]?a[0-9]b/gmi matches only at the start of a line: it matches the line start, then 1 or 0 chars other than #, then a, a digit and b. It neither checks for a whole word nor matches words anywhere past the beginning of a line.
You may use a regex that will match lines starting with # and match and capture the words you need in other contexts:
var s = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b";
var res = [];
s.replace(/^[^\S\r\n]*#.*|\b(a\db)\b/gm, function($0,$1) {
if ($1) res.push($1);
});
console.log(res);
Pattern details:
^ - start of a line (as m multiline modifier makes ^ match the line start)
[^\S\r\n]* - 0+ horizontal whitespaces
#.* - a # and any 0+ chars up to the end of a line
| - or
\b - a leading word boundary
(a\db) - Group 1 capturing a, a digit, a b
\b - a trailing word boundary.
Inside the replace() method, a callback is used where the res array is populated with the contents of Group 1 only.
I would suggest using two regexes.
The first regex fetches the non-hashed lines:
^[^#][a\db\s]+
and then a second regex, applied globally to each of those lines, fetches the individual words:
\ba\db\b
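The two-step idea can also be sketched without the first regex, by splitting the text into lines and dropping the ones that start with #. This is an illustrative variant, not the answerer's code; the variable names are made up, and lines with leading whitespace before # are treated as hashed, as in the accepted answer.

```javascript
const text = "a1b a2b a3b\n#a4b a5b a6b\n a7b a8b a9b";

const words = text
  .split(/\r?\n/)                                    // step 1: split into lines
  .filter(line => !/^\s*#/.test(line))               // drop lines that start with "#"
  .flatMap(line => line.match(/\ba\db\b/gi) || []);  // step 2: collect the words on each kept line

console.log(words); // [ 'a1b', 'a2b', 'a3b', 'a7b', 'a8b', 'a9b' ]
```

`match` with the g flag returns `null` for a line without any hit, hence the `|| []` fallback before flattening.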

Retrieve BSR and category from string with RegExp

When I parse Amazon products I get strings like this:
"#19 in Home Improvements (See top 100)"
I figured out how to retrieve the BSR number, with /#\d*/
But I have no idea how to retrieve the category, which comes after "in" and ends before the bracketed part (See top 100).
I suggest
#(\d+)\s+in\s+([^(]+?)\s*\(
See the regex demo
var re = /#(\d+)\s+in\s+([^(]+?)\s*\(/;
var str = '#19 in Home Improvements (See top 100)';
var m = re.exec(str);
if (m) {
console.log(m[1]);
console.log(m[2]);
}
Pattern details:
# - a hash
(\d+) - Group 1 capturing 1 or more digits
\s+in\s+ - in enclosed with 1 or more whitespaces
([^(]+?) - Group 2 capturing 1 or more chars other than ( as few as possible before the first...
\s*\( - 0+ whitespaces and a literal (.
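Wrapped in a function, the same pattern can return both values at once. A minimal sketch; `parseBsr` and the property names `rank` and `category` are assumptions made for the example.

```javascript
// Hypothetical helper: returns { rank, category },
// or null when the text does not have the "#N in Category (" shape.
function parseBsr(text) {
  const m = /#(\d+)\s+in\s+([^(]+?)\s*\(/.exec(text);
  return m ? { rank: Number(m[1]), category: m[2] } : null;
}

console.log(parseBsr("#19 in Home Improvements (See top 100)")); // { rank: 19, category: 'Home Improvements' }
console.log(parseBsr("no rank here"));                           // null
```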
