Regexp, capture between parentheses, javascript


I have a regexp that extracts values between parentheses.
It works most of the time, but not when the value itself ends with a parenthesis:
var val = 'STR("ABC(t)")';
var regExp = /\(([^)]+)\)/;
var matches = regExp.exec(val);

console.log(matches[1]); //"ABC(t"
What I want is "ABC(t)".
Any ideas how I can modify my regexp to achieve this?
Update
The value is always inside the parentheses.
Some examples:
'ASD("123")'; => '123'
'ASD(123)'; => '123'
'ASD(aa(10)asda(459))'; => 'aa(10)asda(459)'
So first there is some text (always text). Then there is a (, and it always ends with a ). I want the value between.

You may use greedy dot matching inside Group 1 pattern: /\((.+)\)/. It will match the first (, then any 1+ chars other than linebreak symbols and then the last ) in the line.
var vals = ['STR("ABC(t)")', 'ASD("123")', 'ASD(123)', 'ASD(aa(10)asda(459))'];
var regExp = /\((.+)\)/;
for (var val of vals) {
var matches = regExp.exec(val);
console.log(val, "=>", matches[1]);
}
Answering the comment: If the text to extract must sit inside nested, balanced parentheses, either a small hand-written parser or XRegExp#matchRecursive can help. Since there are plenty of parser examples on SO already, here is an XRegExp example:
var str = 'some text (num(10a ) ss) STR("ABC(t)")';
var res = XRegExp.matchRecursive(str, '\\(', '\\)', 'g');
console.log(res);
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/2.0.0/xregexp-all-min.js"></script>
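For comparison, the "small parsing code" alternative mentioned above could look roughly like the sketch below. The function name extractTopLevelGroups is hypothetical (it is not part of XRegExp or of the original answer), and the sketch assumes that quotes need no special handling and that an unmatched ) can simply be ignored.

```javascript
// Hypothetical helper (an assumption, not from the original answer): returns the
// contents of every top-level balanced (...) group found in `text`.
function extractTopLevelGroups(text) {
  const groups = [];
  let depth = 0;   // current nesting depth
  let start = -1;  // index just after the outermost "("
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === "(") {
      if (depth === 0) start = i + 1; // remember where the outermost group begins
      depth++;
    } else if (ch === ")" && depth > 0) {
      depth--;
      if (depth === 0) groups.push(text.slice(start, i)); // outermost group closed
    }
  }
  return groups;
}

console.log(extractTopLevelGroups('some text (num(10a ) ss) STR("ABC(t)")'));
// → [ 'num(10a ) ss', '"ABC(t)"' ]
```

Unlike the greedy /\((.+)\)/ pattern, this keeps several separate top-level groups apart instead of merging everything between the first ( and the last ).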

Related

Getting the content between two characters

So I have this (example) string: 1234VAR239582358X
And I want to get what's in between VAR and X. I can easily replace it using .replace(/VAR.*X/, "replacement");
But how would I get what /VAR.*X/ matches into a variable?

I think what you are looking for might be
string.match(/VAR(.*)X/)[1]
The brackets around the .* mark a group. Those groups are returned inside the Array that match creates :)
If you want to only replace what's in between "VAR" and "X" it would be
string.replace(/VAR(.*)X/, "VAR" + "replacement" + "X");
Or more generic:
string.replace(/(VAR).*(X)/, "$1replacement$2");

You can try using the RegExp constructor: new RegExp(`${VAR}.*X`)

You can store it as variable like this,
const pattern = "VAR.*X";
const reg = new RegExp(pattern);
Then use,
.replace(reg, "replacement");

If you "want to get what's in between VAR and X", then using .* would do the job for the given example string.
But note that .* first matches to the end of the string and then backtracks until an X can be matched. That X is the last X in the string, so the pattern may match too much.
If you want to match only the digits, you can match 1+ digits in a capture group using VAR(\d+)X
const regex = /VAR(\d+)X/;
const str = "1234VAR239582358X";
const m = str.match(regex);
if (m) {
let myVariable = m[1];
console.log(myVariable);
}
Or you can match until the first occurrence of an X char using a negated character class VAR([^\r\nX]+)X
const regex = /VAR([^\r\nX]+)X/;
const str = "1234VAR239582358X";
const m = str.match(regex);
if (m) {
let myVariable = m[1];
console.log(myVariable);
}
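To make the difference between the greedy and the restricted patterns concrete, here is a small sketch. The input string with two X characters is a made-up example (the original string contains only one X), and the lazy variant VAR(.*?)X is an extra alternative not mentioned above.

```javascript
// Hypothetical input with two X characters, to show where the patterns differ.
const sample = "1234VAR2395X8235X";

const greedy = sample.match(/VAR(.*)X/);            // backtracks from the end: stops at the LAST X
const lazy = sample.match(/VAR(.*?)X/);             // expands step by step: stops at the FIRST X
const negated = sample.match(/VAR([^\r\nX]+)X/);    // cannot cross an X at all

console.log(greedy[1]);  // "2395X8235"
console.log(lazy[1]);    // "2395"
console.log(negated[1]); // "2395"
```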

JavaScript Regular Expression with special characters [duplicate]

I have this code to highlight words that exist in an array. Everything works fine, except that it doesn't highlight words that contain '.'
spansR[i].innerHTML = t[i].replace(new RegExp(wordsArray.join("|"),'gi'), function(c) {
return '<span style="color:red">'+c+'</span>';
});
I also tried to escape dot in each word
for(var r=0;r<wordsArray.length;r++){
if(wordsArray[r].includes('.')){
wordsArray[r] = wordsArray[r].replace(".", "\\.");
wordsArray[r] = '\\b'+wordsArray[r]+'\\b';
}
}
I also tried changing the replace call to each of these, and none of them worked: "replace(".", "\.")" , "replace(".", "\.")" , "replace(".", "/.")" , "replace('.','/.')" , "replace('.','/.')" .
This is a simplified test case (I want to match 'free.' )
<!DOCTYPE html>
<html>
<body>
<button onclick="myFunction()">Try it</button>
<p id="demo"></p>
<script>
function myFunction() {
var re = "\\bfree\\.\\b";
var str = "The best things in life are free.";
var patt = new RegExp(re);
var res = patt.test(str);
document.getElementById("demo").innerHTML = res;
}
</script>
</body>
</html>

Implement an unambiguous word boundary in JavaScript.
Here is a version for JS that does not support ECMAScript 2018 and newer:
var t = "Some text... firas and firas. but not firass ... Also, some shop and not shopping";
var wordsArray = ['firas', 'firas.', 'shop'];
wordsArray.sort(function(a, b){
return b.length - a.length;
});
var regex = new RegExp("(^|\\W)(" + wordsArray.map(function(x) {
return x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')
}).join("|") + ")(?!\\w)",'gi');
console.log( t.replace(regex, '$1<span style="color:red">$2</span>') );
Here, the regex will look like /(^|\W)(firas\.|firas|shop)(?!\w)/gi. The (^|\W) captures into Group 1 ($1) the start of string or a non-word char, then a second capturing group captures the term in question, and the (?!\w) negative lookahead matches a position that is not immediately followed by a word char.
The wordsArray.sort is important, as without it, the shorter words with the same beginning might "win" before the longer ones appear.
The .replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') is a must to escape special chars in the search terms.
A variation for JS environments that support lookbehinds:
let t = "Some text... firas and firas. but not firass ... Also, some shop and not shopping";
let wordsArray = ['firas', 'firas.', 'shop'];
wordsArray.sort((a, b) => b.length - a.length );
let regex = new RegExp(String.raw`(?<!\w)(?:${wordsArray.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join("|")})(?!\w)`,'gi');
console.log( t.replace(regex, '<span style="color:red">$&</span>') );
The regex will look like /(?<!\w)(?:firas\.|firas|shop)(?!\w)/gi. Here, the (?<!\w) negative lookbehind matches a location that is not immediately preceded by a word char. This also makes the capturing group redundant, so I replaced it with a non-capturing one, (?:...), and the replacement pattern now contains just one placeholder, $&, that inserts the whole match.
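As a side note on why the test case from the question ("\\bfree\\.\\b") returns false: \b only matches between a word char and a non-word char, and after the final "." there is a non-word char on the left and the end of the string on the right, so there is no boundary. A minimal sketch of the difference, reusing the test string from the question:

```javascript
const sentence = "The best things in life are free.";

// \b after "." needs a word char on the right, which the end of the string is not.
const withWordBoundary = /\bfree\.\b/.test(sentence);

// (?!\w) only requires that NO word char follows, which also holds at the end.
const withLookahead = /\bfree\.(?!\w)/.test(sentence);

console.log(withWordBoundary); // false
console.log(withLookahead);    // true
```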

Here is your solution:
Replace this:
new RegExp(wordsArray.join("|"),'gi')
With this:
new RegExp(wordsArray.join("|").replace(/\./g,'\\.'),'gi')
Example :
['javascript', 'firas.', 'regexp'].join("|").replace(/\./g,'\\.')
Will print
javascript|firas\.|regexp
Which is the regular expression you are looking for, with the escaped dot. It will match firas. but it will not match firas, as you specifically asked in your last comment

Extract a part of a regex name

Examples of filenames
FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt
FDIP_fr-fr-nn_Text_v1_YYYYMMDD_SequenceNumber.txt
FDIP_de-de-nn_Text_v1_YYYYMMDD_SequenceNumber.txt
REGEX is FDIP_([a-z]{2}-[A-Z]{2}-[a-z]{2})_Text_v1_[0-9]{8}_[0-9]{14}.txt
The only part I need is the translation code, which is 'en-gb', 'fr-fr', 'de-de'.
How do I extract just that part of the filename?

I modified the regex a little bit so that it matches both numbers and text.
Explanation
To capture a group, wrap that part of the regex in (); whatever it matches is returned as a group.
To use a named capturing group, write (?<name_of_group>...); you can then access the group by name.
Here is how the matching works:
[a-z]{2} matches 2 chars from a-z
[a-zA-Z0-9] matches any char from a-z, A-Z or 0-9
g is the global flag, i.e. match all occurrences.
i means ignore case.
var r = /FDIP_([a-z]{2}-[A-Z]{2})-[a-z]{2}_Text_v1_[0-9A-Z]{8}_[A-Z0-9]{14}.txt/gi;
let t = 'FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt';
let dd = r.exec(t);
console.log(dd[1]);
This is an example of group capturing.
Note that the group name in the regex matches the name used in the object destructuring:
const { groups: { language } } = /FDIP_(?<language>[a-z]{2}-[A-Z]{2})-[a-z]{2}_Text_v1_[0-9A-Z]{8}_[A-Z0-9]{14}.txt/gi.exec('FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt');
console.log(language);

To solve your problem, you should:
Fix your regex:
FDIP_([a-z]{2}-[A-Z]{2}-[a-z]{2})_Text_v1_[0-9]{8}_[0-9]{14}.txt
// to
FDIP_([a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_[0-9]{8}_[0-9]{14}.txt
Get the value of the first group by using the regex.exec function:
const fileNames = [
'FDIP_en-gb-nn_Text_v1_20190101_12345678901234.txt',
'FDIP_fr-fr-nn_Text_v1_20200202_12345678901234.txt',
'FDIP_de-de-nn_Text_v1_20180808_12345678901234.txt']
const cultureNames = fileNames.map(name => {
const matched = /FDIP_([a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_[0-9]{8}_[0-9]{14}.txt/.exec(name)
return matched && matched[1]
})
console.log(cultureNames)

Change FDIP_([a-z]{2}-[A-Z]{2}-[a-z]{2})_Text_v1_[0-9]{8}_[0-9]{14}.txt
to
let pattern = /FDIP_([a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_[\w]{8}_[\w]{14}.txt/;
var str = 'FDIP_en-gb-nn_Text_v1_YYYYMMDD_SequenceNumber.txt';
console.log(str.match(pattern)[1]);
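The patterns above leave the dot before txt unescaped, so it matches any character there. A slightly stricter sketch is shown below; the function name extractLanguageCode is hypothetical, and the sketch assumes real file names carry an 8-digit date and a 14-digit sequence number as in the asker's original regex.

```javascript
// Hypothetical helper (an assumption, not from the answers): returns the
// translation code such as "en-gb", or null if the file name does not match.
function extractLanguageCode(fileName) {
  // ^ and $ anchor the whole name; \. matches a literal dot before "txt".
  const pattern = /^FDIP_(?<language>[a-z]{2}-[a-z]{2})-[a-z]{2}_Text_v1_\d{8}_\d{14}\.txt$/;
  const match = pattern.exec(fileName);
  return match ? match.groups.language : null;
}

console.log(extractLanguageCode("FDIP_en-gb-nn_Text_v1_20190101_12345678901234.txt")); // "en-gb"
console.log(extractLanguageCode("FDIP_en-gb-nn_Text_v1_20190101_12345678901234Xtxt")); // null
```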

Extraction of string using regex in javascript

I have a string of the following format:
"hello(%npm%)hi"
My goal is to split the string into three parts
a) hello
b) (%npm%)
c) hi
I am using regex as follows:
var myString = "hello(%npm%)hi".match(/[a-z]*/);
var backdtring = "hello(%npm%)hi".match(/\)[a-z]*/);
var midstring = "hello(%npm%)hi".match(/\(\%[a-z]*\%\)/);
var res = backdtring.replace(")", "");
https://jsfiddle.net/1988/ff6aupmL/
I am trying this in jsfiddle, where there's an error on the line:
var res = backdtring.replace(")", "");
"backdtring.replace is not a function".
What's wrong with the replace call above?
Update:
Also, have I followed best practices for regular expressions?

As it has been mentioned in the comments, you are trying to use a String#replace method on an array, see the description of the return value of String#match:
An Array containing the entire match result and any parentheses-captured matched results; null if there were no matches.
To streamline tokenization, I'd rather use .split(/(\([^()]*\))/) to get all substrings in parentheses and the substrings that remain:
var s = "hello(%npm%)hi";
var res = s.split(/(\([^()]*\))/);
console.log(res);
Details:
(\([^()]*\)) - the pattern is enclosed in a capturing group so that split returns both the substrings that match the pattern and those that do not
\( - a literal (
[^()]* - 0+ chars other than ( and )
\) - a literal ).
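One detail worth knowing about this approach: when the parenthesized part sits at the very start or end of the string, split also returns empty strings. A minimal sketch that filters them out is below; the function name tokenize is hypothetical, and dropping empty tokens is an assumption about what the caller wants.

```javascript
// Hypothetical wrapper (an assumption, not from the answer) around the split call above.
function tokenize(s) {
  // The capturing group keeps the "(...)" parts in the result;
  // filter(Boolean) drops the empty strings produced at the edges.
  return s.split(/(\([^()]*\))/).filter(Boolean);
}

console.log(tokenize("hello(%npm%)hi")); // [ 'hello', '(%npm%)', 'hi' ]
console.log(tokenize("(%npm%)hi"));      // [ '(%npm%)', 'hi' ]
```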

Regex to grab strings between square brackets

I have the following string: pass[1][2011-08-21][total_passes]
How would I extract the items between the square brackets into an array? I tried
match(/\[(.*?)\]/);
var s = 'pass[1][2011-08-21][total_passes]';
var result = s.match(/\[(.*?)\]/);
console.log(result);
but this only returns [1].
Not sure how to do this.. Thanks in advance.

You are almost there, you just need a global match (note the /g flag):
match(/\[(.*?)\]/g);
Example: http://jsfiddle.net/kobi/Rbdj4/
If you want something that only captures the group (from MDN):
var s = "pass[1][2011-08-21][total_passes]";
var matches = [];
var pattern = /\[(.*?)\]/g;
var match;
while ((match = pattern.exec(s)) != null)
{
matches.push(match[1]);
}
Example: http://jsfiddle.net/kobi/6a7XN/
Another option (which I usually prefer), is abusing the replace callback:
var matches = [];
s.replace(/\[(.*?)\]/g, function(g0,g1){matches.push(g1);})
Example: http://jsfiddle.net/kobi/6CEzP/
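In environments that support ES2020, String.prototype.matchAll offers a shorter way to collect only the captured groups. This is a sketch of that alternative, not part of the original answer; the variable names input and items are illustrative.

```javascript
const input = "pass[1][2011-08-21][total_passes]";

// matchAll requires the g flag and yields one match object per occurrence;
// m[1] is the text captured between the brackets.
const items = Array.from(input.matchAll(/\[(.*?)\]/g), (m) => m[1]);

console.log(items); // [ '1', '2011-08-21', 'total_passes' ]
```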

var s = 'pass[1][2011-08-21][total_passes]';
r = s.match(/\[([^\]]*)\]/g);
r ; //# => [ '[1]', '[2011-08-21]', '[total_passes]' ]
An example demonstrating the edge case of unbalanced []:
var s = 'pass[1]]][2011-08-21][total_passes]';
r = s.match(/\[([^\]]*)\]/g);
r; //# => [ '[1]', '[2011-08-21]', '[total_passes]' ]

Add the global flag to your regex and iterate over the returned array.
match(/\[(.*?)\]/g)

I'm not sure if you can get this directly into an array. But the following code should work to find all occurrences and then process them:
var string = "pass[1][2011-08-21][total_passes]";
var regex = /\[([^\]]*)\]/g;
while (match = regex.exec(string)) {
alert(match[1]);
}
Please note: I really think you need the character class [^\]] here. Otherwise, in my test, the expression would match the whole string because ] is also matched by .*.

'pass[1][2011-08-21][total_passes]'.match(/\[.+?\]/g); // ["[1]","[2011-08-21]","[total_passes]"]
Explanation
\[ # match the opening [
Note: the \ before [ means it is treated as a literal character, NOT as the start of a character class.
.+? # accept one or more characters, but NOT greedily
\] # match the closing ], again treated as a literal character
/g # do NOT stop after the first match; search the whole input string.
You can play with other combinations of the regular expression
https://regex101.com/r/IYDkNi/1

[C#]
string str1 = " pass[1][2011-08-21][total_passes]";
string matching = @"\[(.*?)\]";
Regex reg = new Regex(matching);
MatchCollection matches = reg.Matches(str1);
You can then iterate over the matched strings with foreach.
