How do I make a nested match in regex? - javascript

var matches = pattern.match(/\((.+?)\)/g);
matched against:
[e[1]]
returns "[e[1]". I assume the problem is with nesting. How do I fix this?

If you are only interested in how to match substrings inside fixed delimiters, you may use XRegExp.matchRecursive from the XRegExp library:
Returns an array of match strings between outermost left and right delimiters, or an array of objects with detailed match parts and position data. An error is thrown if delimiters are unbalanced within the data.
The delimiters are lost, but since you know what they are, you can restore them later in all the matches.
var str = '[e[1]] [ [e[[2]34]]]';
document.body.innerHTML = XRegExp.matchRecursive(str, '\\[', ']', 'g').map(x => '[' + x + ']');
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/2.0.0/xregexp-all-min.js"></script>
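If pulling in a library is not an option, the same idea (collect what sits between the outermost delimiters) can be sketched with a plain depth counter. This is only an illustration under the assumption that the delimiters are the single characters [ and ]; the helper name outermostBracketGroups is hypothetical and not part of XRegExp.

```javascript
// Hypothetical helper (not part of XRegExp): returns the contents of every
// outermost [...] group in `str`, tracking nesting depth with a counter.
function outermostBracketGroups(str) {
  const groups = [];
  let depth = 0;  // current nesting depth
  let start = -1; // index just after the outermost '[' that is currently open
  for (let i = 0; i < str.length; i++) {
    const ch = str[i];
    if (ch === '[') {
      if (depth === 0) start = i + 1;
      depth++;
    } else if (ch === ']') {
      if (depth === 0) throw new Error('Unbalanced "]" at index ' + i);
      depth--;
      if (depth === 0) groups.push(str.slice(start, i));
    }
  }
  if (depth !== 0) throw new Error('Unbalanced "[" in input');
  return groups;
}

console.log(outermostBracketGroups('[e[1]] [ [e[[2]34]]]'));
// logs the two outermost groups: 'e[1]' and ' [e[[2]34]]'
```

As with matchRecursive, the outer delimiters are dropped from each result and can be added back with map if needed.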

Related

Regex to get the text between two characters?

I want to replace the text after a forward slash and before a closing parenthesis, excluding those two characters.
My text:
<h3>notThisText/IWantToReplaceThis)</h3>
$('h3').text($('h3').text().replace(regEx, 'textReplaced'));
Wanted result after replace:
notThisText/textReplaced)
I have tried
regex = /([^\/]+$)+/ // replaces the parenthesis as well
regex = /\/([^\)]+)/ // replaces the slash as well
but as you can see in my comments, neither of these excludes both the slash and the closing parenthesis. Can someone help?
A pattern like /(?<=\/)[^)]+(?=\))/ won't work in older JS engines, because lookbehind was only added to the language in ES2018. To stay compatible, you can use one of the following solutions:
s.replace(/(\/)[^)]+(\))/, '$1textReplaced$2')
s.replace(/(\/)[^)]+(?=\))/, '$1textReplaced')
s.replace(/(\/)[^)]+/, '$1textReplaced')
s.replace(/\/[^)]+\)/, '/textReplaced)')
The (...) forms a capturing group whose value can be referred to from the replacement pattern with $ + number (a replacement backreference). The first solution consumes / and ) and puts them into capturing groups. If you need to handle consecutive, overlapping matches, use the second solution (s.replace(/(\/)[^)]+(?=\))/, '$1textReplaced')). If the ) is not required at the end, the third solution (replace(/(\/)[^)]+/, '$1textReplaced')) will do. The last solution (s.replace(/\/[^)]+\)/, '/textReplaced)')) will work if the / and ) are static values known beforehand.
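To compare the four variants side by side, here is a small sketch that runs each of them against the text from the question. The sample string s is taken from the question; everything else is just scaffolding for the demonstration.

```javascript
// Sample input from the question (the text content of the <h3> element).
const s = 'notThisText/IWantToReplaceThis)';

const results = [
  // 1. Capture both delimiters and put them back via $1 and $2.
  s.replace(/(\/)[^)]+(\))/, '$1textReplaced$2'),
  // 2. Capture the slash, only look ahead for the ")" without consuming it.
  s.replace(/(\/)[^)]+(?=\))/, '$1textReplaced'),
  // 3. Capture the slash, do not require a ")" at all.
  s.replace(/(\/)[^)]+/, '$1textReplaced'),
  // 4. Consume both delimiters and hardcode them in the replacement.
  s.replace(/\/[^)]+\)/, '/textReplaced)'),
];

results.forEach(function (r) {
  console.log(r); // each line prints: notThisText/textReplaced)
});
```

For this particular input all four produce the same string; they differ only in how they treat input where the closing parenthesis is missing or where matches sit directly next to each other.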
You can use str.split('/')
var text = 'notThisText/IWantToReplaceThis';
var splited = text.split('/');
splited[1] = 'yourDesireText';
var output = splited.join('/');
console.log(output);
Try the following. In your case startChar = '/', endChar = ')', origString = $('h3').text()
function customReplace(startChar, endChar, origString, replaceWith){
var strArray = origString.split(startChar);
return strArray[0] + startChar + replaceWith + endChar;
}
First of all, you didn't clearly define the format of the text you want to replace and of the part that must stay. For example:
Does notThisText contain any slash /?
Does IWantToReplaceThis contain any closing parenthesis )?
Since there are too many uncertainties, the answer here only shows a pattern that exactly matches your example:
yourText.replace(/(\/).*?(\))/g, '$1textReplaced$2')
var text = "notThisText/IWantToReplaceThis";
text = text.replace(/\/.*/, "/whatever");
output: "notThisText/whatever"

Javascript nested square brackets in string

I am looking for an easier (and less hacky) way to get the substring of what is inside matching square brackets in a string. For example, let's say this is the string:
[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ
I want the substring:
ABC[D][E[FG]]HIJK[LMN]
Right now, I am looping through the string and counting the open and closed brackets, and when those numbers are the same, I take the substring between the first open bracket and the last closed bracket.
Is there an easier way to do this (i.e. with regex), so that I do not need to loop through every character?
Here's another approach, an ugly hack which turns the input into a JS array representation and then parses it using JSON.parse:
function parse(str) {
return JSON.parse('[' +
str.split('') . join(',') . // insert commas
replace(/\[,/g, '[') . // clean up leading commas
replace(/,]/g, ']') . // clean up trailing commas
replace(/\w/g, '"$&"') // quote strings
+ ']');
}
>> parse('A[B]C')
<< ["A", ["B"], "C"]
Now a stringifier to turn arrays back into the bracketed form:
function stringify(array) {
return Array.isArray(array) ? '[' + array.map(stringify).join('') + ']' : array;
}
Now your problem can be solved by:
stringify(parse("[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ")[0])
Not sure if I get the question right (sorry about that).
So you mean that, given a string X, you would like to check whether the string Y is contained within X,
where Y is ABC[D][E[FG]]HIJK[LMN]?
If so then you could simply do:
var str = "[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ";
var res = str.match(/ABC\[D]\[E\[FG]]HIJK\[LMN]/);
The above would then return the string literal Y as it matches what is inside str.
It is important to note that the [ characters are escaped with a \. This is because an unescaped pair of square brackets with letters in between (e.g. [asd]) is a character class in regex, which matches any single character from the specified set.
You can test the regex here:
https://regex101.com/r/zK3vZ3/1
I think the problem is to get all characters from an opening square bracket up to the corresponding closing square bracket. Balancing groups are not implemented in JavaScript, but there is a workaround: we can use several optional groups between these square brackets.
The following regex will match up to 3 nested [...] groups, and you can add more non-capturing groups to support deeper nesting:
\[[^\]\[]*(?:
\[[^\]\[]*(?:
\[[^\]\[]*(?:\[[^\]\[]*\])*\]
)*[^\]\[]*
\][^\]\[]*
)*[^\]\[]*
\]
However, performance may not be that high with such heavy backtracking.
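Since JavaScript has no free-spacing mode, a multi-line pattern like this has to be assembled as a string. The sketch below builds a bounded-depth pattern programmatically. Note that it is a slightly different formulation of the same idea (an alternation of "non-bracket character" or "one level shallower group"), not the exact regex shown above, and the function name buildNestedBracketRegex is hypothetical.

```javascript
// Hypothetical helper: builds a regex that matches [...] groups nested up to
// `maxDepth` levels (maxDepth = 1 means brackets with no brackets inside).
function buildNestedBracketRegex(maxDepth) {
  // Innermost level: a bracket pair containing no other brackets.
  let pattern = '\\[[^\\[\\]]*\\]';
  // Each extra level allows non-bracket characters or a shallower group inside.
  for (let level = 1; level < maxDepth; level++) {
    pattern = '\\[(?:[^\\[\\]]|' + pattern + ')*\\]';
  }
  return new RegExp(pattern, 'g');
}

const str = '[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ';
console.log(str.match(buildNestedBracketRegex(3)));
// logs two matches: '[ABC[D][E[FG]]HIJK[LMN]]' and '[STUVW]'
```

If the input nests deeper than maxDepth, the outer group simply fails to match and only shallower inner groups are returned, so the depth has to be chosen generously.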
UPDATE
Use XRegExp:
var str = '[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ';
// First match:
var res = XRegExp.matchRecursive(str, '\\[', ']');
document.body.innerHTML = "Getting the first match:<br/><pre>" + JSON.stringify(res, 0, 4) + "</pre><br/>And now, multiple matches (add \"g\" modifier when defining the XRegExp)";
// Multiple matches:
res = XRegExp.matchRecursive(str, '\\[', ']', 'g');
document.body.innerHTML += "<pre>" + JSON.stringify(res, 0, 4) + "</pre>";
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/2.0.0/xregexp-all-min.js"></script>

How can I get a substring located between 2 quotes?

I have a string that looks like this: "the word you need is 'hello' ".
What's the best way to put 'hello' (but without the quotes) into a javascript variable? I imagine that the way to do this is with regex (which I know very little about) ?
Any help appreciated!
Use match():
> var s = "the word you need is 'hello' ";
> s.match(/'([^']+)'/)[1];
"hello"
This will match a starting ', followed by anything except ', and then the closing ', storing everything in between in the first captured group.
http://jsfiddle.net/Bbh6P/
var mystring = "the word you need is 'hello'";
var matches = mystring.match(/\'(.*?)\'/); // returns an array
alert(matches[1]);
If you want to avoid regular expressions, you can use .split("'") to split the string at single quotes, then use jQuery's $.map() to return just the odd-indexed substrings, i.e. an array of all single-quoted substrings.
var str = "the word you need is 'hello'";
var singleQuoted = $.map(str.split("'"), function(substr, i) {
return (i % 2) ? substr : null;
});
CAUTION
This and other methods will get it wrong if one or more apostrophes (same as single quote) appear in the original string.
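The caution is easy to reproduce. Below is a small sketch of the same split-based idea without jQuery (the function name singleQuoted is an assumption made for this example), run once on well-formed input and once on input containing an apostrophe.

```javascript
// Split at single quotes and keep the odd-indexed pieces, which are the
// pieces that sat between a pair of quotes.
function singleQuoted(str) {
  return str.split("'").filter(function (piece, i) {
    return i % 2 === 1;
  });
}

// Well-formed input: every quote is part of a pair.
console.log(singleQuoted("the word you need is 'hello' or 'world'"));
// logs: hello, world

// An apostrophe shifts the pairing, so the wrong pieces are returned.
console.log(singleQuoted("it's the word 'hello'"));
// logs: 's the word ' and an empty string, not 'hello'
```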

Javascript Regex- replace sequence of characters with same number of another character

I'm trying to replace part of a string with the same number of dummy characters in JavaScript, for example: '==Hello==' with '==~~~~~=='.
This question has been answered using Perl and PHP, but I can't get it to work in JavaScript. I've been trying this:
txt=txt.replace(/(==)([^=]+)(==)/g, "$1"+Array("$2".length + 1).join('~')+"$3");
The pattern match works fine, but the replacement does not - the second part adds '~~' instead of the length of the pattern match. Putting the "$2" inside the parentheses doesn't work. What can I do to make it insert the right number of characters?
Use a function for replacement instead:
var txt = "==Hello==";
txt = txt.replace(/(==)([^=]+)(==)/g, function ($0, $1, $2, $3) {
return $1 + (new Array($2.length + 1).join("~")) + $3;
});
alert(txt);
//-> "==~~~~~=="
The issue with the expression
txt.replace(/(==)([^=]+)(==)/g, "$1"+Array("$2".length + 1).join('~')+"$3")
is that "$2".length forces $2 to be taken as a string literal, namely the string "$2", that has length 2.
From the MDN docs:
Because we want to further transform the result of the match before the final substitution is made, we must use a function.
This forces evaluation of the match before the transformation.
With an inline function as parameter (and repeat) -- here $1, $2, $3 are local variables:
txt.replace(/(==)([^=]+)(==)/g, (_,$1,$2,$3) => $1+'~'.repeat($2.length)+$3);
txt = '==Hello==';
//inline function
console.log(
txt.replace(/(==)([^=]+)(==)/g, (_, g1, g2, g3) => g1 + '~'.repeat(g2.length) + g3)
);
The length property is evaluated before the $2 substitution happens, so that replace() call won't work. The function call suggested by Augustus should work; another approach is to use match() instead of replace().
Using match() without the /g flag returns an array of match results, which can be joined as you expect.
txt="==Hello==";
mat=txt.match(/(==)([^=]+)(==)/); // mat is now ["==Hello==","==","Hello","=="]
txt=mat[1]+Array(mat[2].length+1).join("~")+mat[3]; // txt is now "==~~~~~=="
You excluded the leading/trailing character from the middle expression, but if you want more flexibility you could use this and handle anything bracketed by the leading/trailing literals.
mat=txt.match(/(^==)(.+)(==$)/)
A working sample uses the following fragment:
var processed = original.replace(/(==)([^=]+)(==)/g, function(all, before, gone, after){
return before+Array(gone.length+1).join('~')+after;
});
The problem in your code was that you always measured the length of "$2" (always a string with two characters). By having the function you can measure the length of the matched part. See the documentation on replace for further examples.

Javascript RegEx non-capturing prefix

I am trying to do some string replacement with RegEx in Javascript. The scenario is a single line string containing long comma-delimited list of numbers, in which duplicates are possible.
An example string is: 272,2725,2726,272,2727,297,272 (The end may or may not end in a comma)
In this example, I am trying to match each occurrence of the whole number 272. (3 matches expected)
The example regex I'm trying to use is: (?:^|,)272(?=$|,)
The problem I am having is that the second and third matches are including the leading comma, which I do not want. I am confused because I thought (?:^|,) would match, but not capture. Can someone shed light on this for me? An interesting bit is that the trailing comma is excluded from the result, which is what I want.
For what it is worth, if I were using C# there is syntax for prefix matching that does what I want: (?<=^|,)
However, it appears to be unsupported in JavaScript.
Lastly, I know I could work around it using string splitting, array manipulation and rejoining, but I want to learn.
Use word boundaries instead:
\b272\b
ensures that only 272 matches, but not 2725.
(?:...) matches and doesn't capture - but whatever it matches will be part of the overall match.
A lookaround assertion like (?=...) is different: It only checks if it is possible (or impossible) to match the enclosed regex at the current point, but it doesn't add to the overall match.
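The difference is easy to see by running the patterns on the sample list from the question. In this sketch the replacement text 'X' is just a placeholder chosen for the example; capturing the prefix and re-emitting it with $1 is one common way to get the effect of a lookbehind where it is not available.

```javascript
const list = '272,2725,2726,272,2727,297,272';

// (?:...) does not capture, but the comma it matches is still part of the match.
const withPrefix = list.match(/(?:^|,)272(?=$|,)/g);
console.log(withPrefix); // [ '272', ',272', ',272' ]

// The lookahead (?=$|,) only checks, so the trailing comma is never included.
// To keep the leading comma when replacing, capture it and put it back with $1.
const replaced = list.replace(/(^|,)272(?=$|,)/g, '$1X');
console.log(replaced); // X,2725,2726,X,2727,297,X

// Word boundaries are zero-width, so the matches contain only the digits.
const bounded = list.match(/\b272\b/g);
console.log(bounded); // [ '272', '272', '272' ]
```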
Here is a way to emulate a lookbehind in JavaScript that has worked in all the cases I needed. This is an example; one can do many more complex and flexible things. The main point is that, in some cases, it is possible to create a non-capturing prefix (lookbehind-like) construct in a JavaScript RegExp.
This example is designed to extract all fields that are surrounded by braces '{...}'. The braces are not returned with the field. It is only meant to show the idea at work, not necessarily as a prelude to an application.
function testGetSingleRepeatedCharacterInBraces()
{
var leadingHtmlSpaces = ' ' ;
// The '(?:\b|\B(?={))' acts as a prefix non-capturing group.
// That is, this works (?:\b|\B(?=WhateverYouLike))
var regex = /(?:\b|\B(?={))(([0-9a-zA-Z_])\2{4})(?=})/g ;
var string = '' ;
string = 'Message has no fields' ;
document.write( 'String => "' + string
+ '"<br>' + leadingHtmlSpaces + 'fields => '
+ getMatchingFields( string, regex )
+ '<br>' ) ;
string = '{LLLLL}Message {11111}{22222} {ffffff}abc def{EEEEE} {_____} {4444} {666666} {55555}' ;
document.write( 'String => "' + string
+ '"<br>' + leadingHtmlSpaces + 'fields => '
+ getMatchingFields( string, regex )
+ '<br>' ) ;
} ;
function getMatchingFields( stringToSearch, regex )
{
var matches = stringToSearch.match( regex ) ;
return matches ? matches : [] ;
} ;
Output:
String => "Message has no fields"
fields =>
String => "{LLLLL}Message {11111}{22222} {ffffff}abc def{EEEEE} {_____} {4444} {666666} {55555}"
fields => LLLLL,11111,22222,EEEEE,_____,55555
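For comparison, the same fields can also be extracted by consuming the braces and reading a capture group instead of relying on the prefix trick. This is an alternative sketch, not the method used above; the function name fieldsInBraces is hypothetical.

```javascript
// Alternative sketch: match the braces as ordinary characters and collect
// capture group 1, which holds the field without its braces.
function fieldsInBraces(text) {
  // Group 2 is one word character; \2{4} requires four more copies of it.
  const regex = /\{(([0-9a-zA-Z_])\2{4})\}/g;
  const fields = [];
  let m;
  while ((m = regex.exec(text)) !== null) {
    fields.push(m[1]);
  }
  return fields;
}

const sample = '{LLLLL}Message {11111}{22222} {ffffff}abc def{EEEEE} {_____} {4444} {666666} {55555}';
console.log(fieldsInBraces(sample).join(','));
// prints: LLLLL,11111,22222,EEEEE,_____,55555
```

Unlike the prefix version, this one requires the opening brace to be present, so a run such as "aaaaa}" without a leading "{" is not reported as a field.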
