JavaScript nested square brackets in string

I am looking for an easier (and less hacky) way to get the substring of what is inside matching square brackets in a string. For example, let's say this is the string:
[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ
I want the substring:
ABC[D][E[FG]]HIJK[LMN]
Right now, I am looping through the string and counting the open and closed brackets, and when those numbers are the same, I take the substring between the first open bracket and the last closed bracket.
Is there an easier way to do this (i.e. with regex), so that I do not need to loop through every character?
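For reference, the counting approach described above might look roughly like the sketch below. The function name firstBalanced is hypothetical and not taken from any library; it tracks a single depth counter instead of two separate counts, which is equivalent.

```javascript
// Hypothetical helper (name is an assumption): returns the text inside the
// first balanced [...] group, or null if no balanced group is found.
function firstBalanced(str) {
  const start = str.indexOf('[');
  if (start === -1) return null;
  let depth = 0;
  for (let i = start; i < str.length; i++) {
    if (str[i] === '[') depth++;
    else if (str[i] === ']') depth--;
    // Depth returns to zero at the bracket that closes the first one.
    if (depth === 0) return str.slice(start + 1, i);
  }
  return null; // the first bracket was never closed
}
```

Called on the example string, this should return ABC[D][E[FG]]HIJK[LMN].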

Here's another approach, an ugly hack which turns the input into a JS array representation and then parses it using JSON.parse:
function parse(str) {
return JSON.parse('[' +
str.split('') . join(',') . // insert commas
replace(/\[,/g, '[') . // clean up leading commas
replace(/,]/g, ']') . // clean up trailing commas
replace(/\w/g, '"$&"') // quote strings
+ ']');
}
>> parse('A[B]C')
<< ["A", ["B"], "C"]
Now a stringifier to turn arrays back into the bracketed form:
function stringify(array) {
return Array.isArray(array) ? '[' + array.map(stringify).join('') + ']' : array;
}
Now your problem can be solved by:
stringify(parse("[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ")[0])

Not sure if I get the question right (sorry about that).
So you mean that, given a string X, you would like to check whether the string Y is contained within X, where Y is ABC[D][E[FG]]HIJK[LMN]?
If so then you could simply do:
var str = "[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ";
var res = str.match(/ABC\[D]\[E\[FG]]HIJK\[LMN]/);
The above would then return a match array whose first element is the string Y, since Y occurs inside str.
It is important to note that each [ is escaped with a \. This is because, in regex, an unescaped pair of square brackets with letters in between (e.g. [asd]) defines a character set, which matches any single character from that set.
You can test the regex here:
https://regex101.com/r/zK3vZ3/1

I think the problem is to get all characters from an opening square bracket up to the corresponding closing square bracket. Balancing groups are not implemented in JavaScript, but there is a workaround: we can use several optional groups between these square brackets.
The following regex will match up to 3 levels of nested [...] groups, and you can add more nested non-capturing groups to support deeper nesting:
\[[^\]\[]*(?:
\[[^\]\[]*(?:
\[[^\]\[]*(?:\[[^\]\[]*\])*\]
)*[^\]\[]*
\][^\]\[]*
)*[^\]\[]*
\]
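The line breaks above are only for readability and must be removed before use, since JavaScript regexes have no free-spacing mode. However, performance may suffer on long inputs because of the heavy backtracking.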
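Rather than writing the nested pattern by hand, one could build it for a chosen depth. The sketch below is only an illustration of that idea: nestedBracketRegex is a hypothetical helper name, and the generated pattern is a slightly more regular variant of the regex above, not a character-for-character copy of it.

```javascript
// Hypothetical helper: builds a regex that matches one [...] group containing
// at most `depth` further levels of nested brackets.
function nestedBracketRegex(depth) {
  const plain = '[^\\[\\]]*'; // any run of non-bracket characters
  let inner = plain;          // contents of a group with no nesting at all
  for (let i = 0; i < depth; i++) {
    // plain text, then any number of (nested group followed by plain text)
    inner = plain + '(?:\\[' + inner + '\\]' + plain + ')*';
  }
  return new RegExp('\\[' + inner + '\\]');
}

const sample = '[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ';
const nestedMatch = sample.match(nestedBracketRegex(2));
// nestedMatch[0] is "[ABC[D][E[FG]]HIJK[LMN]]"
```

If the depth is too small for the input, the regex does not fail outright but falls back to the first group that is shallow enough (with depth 1, the same string yields "[D]"), so the depth has to be chosen with the data in mind.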
UPDATE
Use XRegExp:
var str = '[ABC[D][E[FG]]HIJK[LMN]]OPQR[STUVW]XYZ';
// First match:
var res = XRegExp.matchRecursive(str, '\\[', ']');
document.body.innerHTML = "Getting the first match:<br/><pre>" + JSON.stringify(res, 0, 4) + "</pre><br/>And now, multiple matches (add \"g\" modifier when defining the XRegExp)";
// Multiple matches:
res = XRegExp.matchRecursive(str, '\\[', ']', 'g');
document.body.innerHTML += "<pre>" + JSON.stringify(res, 0, 4) + "</pre>";
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/2.0.0/xregexp-all-min.js"></script>

Related

How do i make a nested match in regex?

var matches = pattern.match(/\[(.+?)\]/g);
matched against:
[e[1]]
returns "[e[1]". I assume it is a problem with nesting; how do I fix this?
If you are only interested in how to match substrings inside fixed delimiters, you may use XRegExp's XRegExp.matchRecursive:
Returns an array of match strings between outermost left and right delimiters, or an array of objects with detailed match parts and position data. An error is thrown if delimiters are unbalanced within the data.
The delimiters are lost, but since you know what they are, you can restore them in all the matches afterwards.
var str = '[e[1]] [ [e[[2]34]]]';
document.body.innerHTML = XRegExp.matchRecursive(str, '\\[', ']', 'g').map(x => '[' + x + ']');
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/2.0.0/xregexp-all-min.js"></script>

How can I get a substring located between 2 quotes?

I have a string that looks like this: "the word you need is 'hello' ".
What's the best way to put 'hello' (but without the quotes) into a javascript variable? I imagine that the way to do this is with regex (which I know very little about) ?
Any help appreciated!
Use match():
> var s = "the word you need is 'hello' ";
> s.match(/'([^']+)'/)[1];
"hello"
This will match a starting ', followed by anything except ', and then the closing ', storing everything in between in the first captured group.
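If the string can contain several quoted words, the same pattern can be applied globally. The sketch below assumes an ES2020 engine (for String.prototype.matchAll); the helper name allQuoted is made up for illustration.

```javascript
// Hypothetical helper: collects every single-quoted substring, without the quotes.
function allQuoted(text) {
  // matchAll requires the g flag; each result's [1] is the first captured group.
  return Array.from(text.matchAll(/'([^']+)'/g), (m) => m[1]);
}
```

For the string in the question this should give an array containing just hello, and an empty array when nothing is quoted.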
http://jsfiddle.net/Bbh6P/
var mystring = "the word you need is 'hello'";
var matches = mystring.match(/\'(.*?)\'/); //returns array
alert(matches[1]);
If you want to avoid regular expressions, you can use .split("'") to split the string at single quotes, then use jQuery's $.map() to return just the odd-indexed substrings, i.e. an array of all single-quoted substrings.
var str = "the word you need is 'hello'";
var singleQuoted = $.map(str.split("'"), function(substr, i) {
return (i % 2) ? substr : null;
});
CAUTION
This and other methods will get it wrong if one or more apostrophes (same as single quote) appear in the original string.
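To make the caution concrete, here is a small sketch with a made-up input that contains an apostrophe before the quoted word. It uses a plain filter instead of jQuery's $.map, so the empty trailing piece is kept in the result.

```javascript
// Made-up input: the apostrophe in "it's" is mistaken for an opening quote.
const tricky = "it's the word 'hello'";

// Regex approach: the match starts at the apostrophe, not at 'hello'.
const wrongMatch = tricky.match(/'([^']+)'/)[1]; // "s the word "

// Split approach: the odd-indexed pieces are shifted by the stray apostrophe.
const wrongSplit = tricky.split("'").filter((part, i) => i % 2 === 1);
// ["s the word ", ""]
```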

Javascript RegEx non-capturing prefix

I am trying to do some string replacement with RegEx in Javascript. The scenario is a single line string containing long comma-delimited list of numbers, in which duplicates are possible.
An example string is: 272,2725,2726,272,2727,297,272 (The end may or may not end in a comma)
In this example, I am trying to match each occurrence of the whole number 272. (3 matches expected)
The example regex I'm trying to use is: (?:^|,)272(?=$|,)
The problem I am having is that the second and third matches are including the leading comma, which I do not want. I am confused because I thought (?:^|,) would match, but not capture. Can someone shed light on this for me? An interesting bit is that the trailing comma is excluded from the result, which is what I want.
For what it is worth, if I were using C# there is syntax for prefix matching that does what I want: (?<=^|,)
However, it appears to be unsupported in JavaScript.
Lastly, I know I could workaround it using string splitting, array manipulation and rejoining, but I want to learn.
Use word boundaries instead:
\b272\b
ensures that only 272 matches, but not 2725.
(?:...) matches and doesn't capture - but whatever it matches will be part of the overall match.
A lookaround assertion like (?=...) is different: It only checks if it is possible (or impossible) to match the enclosed regex at the current point, but it doesn't add to the overall match.
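The difference can be seen by running the patterns against the example list. This is a minimal sketch; the replacement text X is arbitrary, and the last line assumes an engine with ES2018 lookbehind support.

```javascript
const list = '272,2725,2726,272,2727,297,272';

// The non-capturing group still consumes the comma, so it is part of the match.
const withGroup = list.match(/(?:^|,)272(?=$|,)/g); // ['272', ',272', ',272']

// Word boundaries are zero-width, so only the digits are matched.
const withBoundary = list.match(/\b272\b/g); // ['272', '272', '272']

// When replacing, capture the prefix and put it back with $1.
const replaced = list.replace(/(^|,)272(?=$|,)/g, '$1X');
// 'X,2725,2726,X,2727,297,X'

// Lookbehind, the construct known from C#, works in ES2018+ engines.
const withLookbehind = list.match(/(?<=^|,)272(?=$|,)/g); // ['272', '272', '272']
```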
Here is a way to emulate a lookbehind in JavaScript that has worked in all the cases I needed. This is only an example; one can do much more complex and flexible things. The main point is that, in some cases, it is possible to create a RegExp non-capturing prefix (lookbehind) construct in JavaScript.
This example is designed to extract all fields that are surrounded by braces '{...}'. The braces are not returned with the field. It is just an example to show the idea at work, not necessarily a prelude to an application.
function testGetSingleRepeatedCharacterInBraces()
{
var leadingHtmlSpaces = ' ' ;
// The '(?:\b|\B(?={))' acts as a prefix non-capturing group.
// That is, this works (?:\b|\B(?=WhateverYouLike))
var regex = /(?:\b|\B(?={))(([0-9a-zA-Z_])\2{4})(?=})/g ;
var string = '' ;
string = 'Message has no fields' ;
document.write( 'String => "' + string
+ '"<br>' + leadingHtmlSpaces + 'fields => '
+ getMatchingFields( string, regex )
+ '<br>' ) ;
string = '{LLLLL}Message {11111}{22222} {ffffff}abc def{EEEEE} {_____} {4444} {666666} {55555}' ;
document.write( 'String => "' + string
+ '"<br>' + leadingHtmlSpaces + 'fields => '
+ getMatchingFields( string, regex )
+ '<br>' ) ;
} ;
function getMatchingFields( stringToSearch, regex )
{
var matches = stringToSearch.match( regex ) ;
return matches ? matches : [] ;
} ;
Output:
String => "Message has no fields"
fields =>
String => "{LLLLL}Message {11111}{22222} {ffffff}abc def{EEEEE} {_____} {4444} {666666} {55555}"
fields => LLLLL,11111,22222,EEEEE,_____,55555

Remove characters from a string [duplicate]

What are the different ways I can remove characters from a string in JavaScript?
Using replace() with regular expressions is the most flexible/powerful. When replace() is given a plain string as the search pattern, it replaces only the first instance; to replace every instance of a search pattern you need a regular expression with the g flag (or, in ES2021 and later, replaceAll()).
For example:
var str = "foo gar gaz";
// returns: "foo bar gaz"
str.replace('g', 'b');
// returns: "foo bar baz"
str = str.replace(/g/gi, 'b');
In the latter example, the trailing /gi indicates case-insensitivity and global replacement (meaning that not just the first instance should be replaced), which is what you typically want when you're replacing in strings.
To remove characters, use an empty string as the replacement:
var str = "foo bar baz";
// returns: "foo r z"
str.replace(/ba/gi, '');
A one-liner that removes a list of characters (more than one at once), for example removing +, -, space, ( and ) from a telephone number:
var str = "+(48) 123-456-789".replace(/[-+()\s]/g, ''); // result: "48123456789"
We use the regular expression [-+()\s], where the unwanted characters are placed between [ and ]
(\s is the escape that matches any whitespace character; for more information, search for 'character escapes in regexp')
I know this is old, but if you do a split and then a join, it will remove all occurrences of a particular character, i.e.:
var str = theText.split('A').join('');
will remove all occurrences of 'A' from the string. Note that this is case sensitive: any lowercase 'a' is left untouched.
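A short sketch of how split/join treats case, next to a case-insensitive regex for comparison. The sample sentence is made up.

```javascript
const text = 'A banana and an Apple';

// split/join compares characters exactly, so only the uppercase 'A' is removed.
const removedUpper = text.split('A').join(''); // ' banana and an pple'

// A regex with the i flag removes both 'A' and 'a'.
const removedBoth = text.replace(/a/gi, ''); // ' bnn nd n pple'
```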
You can use replace function.
str.replace(regexp|substr, newSubstr|function)
Another method that no one has talked about so far is using slice (or the older substr) to produce strings out of another string. This is useful if your string has a defined length and the characters you're removing are on either end of the string, or within some "static dimension" of the string.
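const removeChar = (str: string, charToBeRemoved: string): string => {
  // Index of the first occurrence, or -1 if the character is absent
  const charIdx: number = str.indexOf(charToBeRemoved);
  if (charIdx === -1) return str; // nothing to remove
  const part1 = str.slice(0, charIdx); // everything before the character
  const part2 = str.slice(charIdx + 1); // everything after it
  return part1 + part2;
};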

remove unwanted commas in JavaScript

I want to remove all unnecessary commas from the start/end of the string.
e.g. google, yahoo,, , should become google, yahoo.
If possible ,google,, , yahoo,, , should become google,yahoo.
I've tried the below code as a starting point, but it seems to be not working as desired.
trimCommas = function(s) {
s = s.replace(/,*$/, "");
s = s.replace(/^\,*/, "");
return s;
}
In your example you also want to trim the commas if there are spaces between them at the start or at the end, so use something like this:
str.replace(/^[,\s]+|[,\s]+$/g, '').replace(/,[,\s]*,/g, ',');
Note the use of the 'g' modifier for global replace.
You need this:
s = s.replace(/[,\s]{2,}/g,","); //Collapses runs of two or more commas / spaces into a single comma
s = s.replace(/^,*/,""); //Removes all commas from the beginning
s = s.replace(/,*$/,""); //Removes all commas from the end
EDIT: Made all the changes - should work now.
My take:
var cleanStr = str.replace(/^[\s,]+/,"")
.replace(/[\s,]+$/,"")
.replace(/\s*,+\s*(,+\s*)*/g,",")
This one will work with Opera, Internet Explorer, whatever.
Actually tested this last one, and it works!
What you need to do is replace all groups of "space and comma" with a single comma and then remove commas from the start and end:
trimCommas = function(str) {
str = str.replace(/[,\s]*,[,\s]*/g, ",");
str = str.replace(/^,/, "");
str = str.replace(/,$/, "");
return str;
}
The first one replaces every sequence of white space and commas with a single comma, provided there's at least one comma in there. This handles the edge case left in the comments for "Internet Explorer".
The second and third get rid of the comma at the start and end of string where necessary.
You can also add (to the end):
str = str.replace(/[\s]+/g, " ");
to collapse multi-spaces down to one space and
str = str.replace(/,/g, ", ");
if you want them to be formatted nicely (space after each comma).
A more generalized solution would be to pass parameters to indicate behaviour:
Passing true for collapse will collapse the spaces within a section (a section being defined as the characters between commas).
Passing true for addspace will use ", " to separate sections rather than just "," on its own.
That code follows. It may not be necessary for your particular case but it might be better for others in terms of code re-use.
trimCommas = function(str,collapse,addspace) {
str = str.replace(/[,\s]*,[,\s]*/g, ",").replace(/^,/, "").replace(/,$/, "");
if (collapse) {
str = str.replace(/[\s]+/g, " ");
}
if (addspace) {
str = str.replace(/,/g, ", ");
}
return str;
}
First hit on Google for "Javascript Trim": http://www.somacon.com/p355.php. You seem to have implemented this using commas, and I don't see why it would be a problem (though you escaped the comma in the second regex and not in the first).
Not quite as sophisticated, but simple with:
',google,, , yahoo,, ,'.replace(/\s/g, '').replace(/,+/g, ',');
You should be able to use only one replace call:
/^( *, *)+|(, *(?=,|$))+/g
Test:
'google, yahoo,, ,'.replace(/^( *, *)+|(, *(?=,|$))+/g, '');
"google, yahoo"
',google,, , yahoo,, ,'.replace(/^( *, *)+|(, *(?=,|$))+/g, '');
"google, yahoo"
Breakdown:
/
^( *, *)+ # Match start of string followed by zero or more spaces
# followed by , followed by zero or more spaces.
# Repeat one or more times
| # regex or
(, *(?=,|$))+ # Match , followed by zero or more spaces which have a comma
# after it or EOL. Repeat one or more times
/g # the `g` modifier repeats the match until there are no more matches
(?=...) is a lookahead: it does not advance the match position but only verifies that the given characters follow the match. In our case we look for , or the end of the string.
match() is a much better tool for this than replace():
str = " aa, bb,, cc , dd,,,";
newStr = str.match(/[^\s,]+/g).join(",")
alert("[" + newStr + "]")
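One thing to watch with this approach is that match() returns null when nothing matches, so calling join() directly would throw on an input such as " , ,". A minimal sketch with a fallback follows; the name cleanList is hypothetical.

```javascript
// Hypothetical helper: keeps every run of characters that are neither
// whitespace nor commas, and joins the runs with single commas.
function cleanList(str) {
  const parts = str.match(/[^\s,]+/g) || []; // match() yields null when nothing is found
  return parts.join(',');
}
```

Because whitespace is treated as a separator as well, an item that contains a space (for example "new york") would be split into two items, so this only suits lists whose items have no internal spaces.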
To collapse runs of repeated commas such as ",,", ",,," or ",,,," into a single ",", you can use the code below.
var abc = new String("46590,26.91667,75.81667,,,45346,27.18333,78.01667,,,45630,12.97194,77.59369,,,47413,19.07283,72.88261,,,45981,13.08784,80.27847,,");
var pqr = abc.replace(/,{2,}/g, ','); // collapse any run of two or more commas in one pass
alert(pqr);
