I need to extract values from a string using a regex (for performance reasons).
Cases might be as follows:
RED,100
RED,"100"
RED,"100,"
RED,"100\"ABC\"200"
The resulting separated [label, value] array should be:
['RED','100']
['RED','100']
['RED','100,']
['RED','100"ABC"200']
I looked into existing solutions, and even a popular library just splits the entire string to get the values,
e.g. 'RED,100'.split(/,/) might do the job.
But I am trying to write a regex that splits on a comma only if that comma is not enclosed in a quoted value.
This may not be standard CSV behaviour, but it is very easy for end users to enter values:
enter label,value. Anything goes inside the value if it is surrounded by quotes. To include a quote, escape it with a backslash.
Any help is appreciated.
You can use this regex that takes care of escaped quotes in string:
/"[^"\\]*(?:\\.[^"\\]*)*"|[^,"]+/g
RegEx Explanation:
": Match a literal opening quote
[^"\\]*: Match 0 or more of any character that is not \ and not a quote
(?:\\.[^"\\]*)*: Followed by an escaped character and then more non-quote, non-\ characters. Match 0 or more of this combination to get through all escaped characters
": Match the closing quote
|: OR (alternation)
[^,"]+: Match 1 or more non-quote, non-comma characters
const regex = /"[^"\\]*(?:\\.[^"\\]*)*"|[^,"]+/g;
const arr = [`RED,100`, `RED,"100"`, `RED,"100,"`,
  `RED,"100\\"ABC\\"200"`];
let m;
for (var i = 0; i < arr.length; i++) {
  var str = arr[i];
  var result = [];
  while ((m = regex.exec(str)) !== null) {
    result.push(m[0]);
  }
  console.log("Input:", str, ":: Result =>", result);
}
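Note that the matches above still include the surrounding quotes and the backslashes. A minimal sketch of a post-processing step, assuming the unquoted output listed in the question is what is wanted (parseLine is a hypothetical helper name, not part of the answer):

```javascript
// Hypothetical helper: tokenize with the regex from the answer, then strip
// the surrounding quotes and unescape backslash-escaped characters.
function parseLine(line) {
  const regex = /"[^"\\]*(?:\\.[^"\\]*)*"|[^,"]+/g;
  return (line.match(regex) || []).map((field) =>
    field.startsWith('"')
      ? field.slice(1, -1).replace(/\\(.)/g, "$1") // drop quotes, turn \x into x
      : field
  );
}

console.log(parseLine('RED,100'));               // [ 'RED', '100' ]
console.log(parseLine('RED,"100,"'));            // [ 'RED', '100,' ]
console.log(parseLine('RED,"100\\"ABC\\"200"')); // [ 'RED', '100"ABC"200' ]
```

The regex is created inside the function so that no lastIndex state is shared between calls.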
You could use String#match and take only the groups.
var array = ['RED,100', 'RED,"100"', 'RED,"100,"', 'RED,"100\"ABC\"200"'];
console.log(array.map(s => s.match(/^([^,]+),(.*)$/).slice(1)))
Sorry if the wording is bad. I'm trying to find out how to pass a multi-character string into my dynamically built regex.
The regex in my else branch works when a single character is passed in, so I'm trying to do the same thing with multiple characters in the if branch.
const delimiter = str.slice(0, str.indexOf('\n'));
const strLength = delimiter.length;
if (delimiter[0] === '[' && delimiter.charAt(strLength - 1) === ']') {
const customDelimiter = delimiter.slice(delimiter.indexOf(delimiter[1]), delimiter.indexOf(delimiter.charAt(strLength - 1)));
console.log(customDelimiter) // => '***'
const regex = new RegExp(`,|\\n|\\${customDelimiter}`,'g');
return strArr = str.split(regex).filter(Boolean);
} else {
const firstChar = str.slice(0, 1); // => '*'
const regex = new RegExp(`,|\\n|\\${firstChar}`,'g');
return strArr = str.split(regex).filter(Boolean);
}
So for example I want this string:
'[***]\n11***22***33' to equal 66 because it should be split into the array [11, 22, 33] using the '***' delimiter. I get an error message saying: "SyntaxError: Invalid regular expression: /,|\n|***/: Nothing to repeat".
When you use * as the delimiter, the regex becomes ,|\n|\*, which is valid.
It matches a ',', a '\n' or a literal '*' character.
In your string [***]\n11***22***33 it matches the newline and each individual *.
But when you use *** as the delimiter, the regex becomes ,|\n|\***, which is invalid. The backslash escapes only the first *, leaving two unescaped * at the end. An unescaped * in a regex means 0 or more of the preceding pattern, so you cannot have two of them in a row.
This is a special case because * has a special meaning in regex.
A delimiter made of characters with no special meaning in regex would not trigger this error.
A simpler solution is to use JavaScript's split function with plain strings.
First split the string on \n:
let splitStr = str.split('\n');
// This returns ["[***]", "11***22***33"]
Then split the element at index 1 of splitStr using the delimiter:
splitStr[1].split('***');
// or: splitStr[1].split(customDelimiter)
// This returns ["11", "22", "33"]
With this approach you don't need an if/else to distinguish single-character from multi-character delimiters.
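If you would rather keep a single dynamic regex, the delimiter has to be escaped character by character before it is interpolated. A rough sketch, assuming the input always starts with a bracketed delimiter line as in the question (escapeRegExp and splitNumbers are hypothetical helper names):

```javascript
// Hypothetical helper: put a backslash in front of every regex metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assumed input format: "[delimiter]\n" followed by the numbers.
function splitNumbers(str) {
  const newline = str.indexOf("\n");
  const header = str.slice(0, newline);
  const delimiter = header.slice(1, -1); // strip the "[" and "]"
  const regex = new RegExp(`,|\\n|${escapeRegExp(delimiter)}`, "g");
  return str.slice(newline + 1).split(regex).filter(Boolean).map(Number);
}

console.log(splitNumbers("[***]\n11***22***33")); // [ 11, 22, 33 ]
```

With the escaping in place, '***' becomes \*\*\* in the pattern, so every asterisk is matched literally.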
var input = [paul, Paula, george];
var newReg = \paula?\i
for(var text in input) {
if (newReg.test(text) == true) {
input[input.indexOf(text)] = george
}
}
console.log(input)
I don't know what's wrong with my code. It should change paul and Paula to george, but when I run it, it says there's an illegal character.
The backslash (\) is an escape character in JavaScript (as in many other C-like languages). This means that when JavaScript encounters a backslash, it tries to escape the following character. For instance, \n is a newline character (rather than a backslash followed by the letter n).
So that's what is causing your error: you need to replace \paula?\i with /paula?/i
You need to replace \ with / in your regexp pattern.
You should wrap the strings in quotes (").
You need to index your array correctly: val is just the index of the word, not the word itself.
var input = ["paul", "Paula", "george"];
var newReg = /paula?/i;
for (var val in input) {
if (newReg.test(input[val]) == true) {
input[input.indexOf(input[val])] = "george";
}
}
console.log(input);
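As an alternative sketch (not the answerer's code), Array.prototype.map avoids the index bookkeeping entirely. The variable names here are illustrative only:

```javascript
const input = ["paul", "Paula", "george"];
const pattern = /paula?/i; // no g flag, so test() keeps no state between calls

// Build a new array where every matching name is replaced.
const output = input.map((name) => (pattern.test(name) ? "george" : name));

console.log(output); // [ 'george', 'george', 'george' ]
```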
I have a string object that is returned by an API. It looks like this:
{Apple},{"A tree"},{Three2},{123},{A bracket {},{Two brackets {}},{}
I only need to split at commas that have } and { on either side, and I want to keep those braces as part of the returned result. Doing split("},{") strips the braces at every split point, so only the first entry keeps its leading brace and only the last keeps its trailing one, and when only one element is returned, I have to make additional checks so that I don't add extra braces to the first and last (here the same) element.
I hope there is an elegant RegExp to split at commas surrounded by } and {.
You need to use a positive lookahead to match only a comma that is followed by an opening curly brace. I've tested this and it works:
var apiResponse = "{Apple},{\"A tree\"},{Three2},{123},{A bracket {},{Two brackets {}},{}";
var split = apiResponse.split(/,(?=\{)/);
console.log("Split length is " + split.length);
for (var i = 0; i < split.length; ++i) {
  console.log("split[" + i + "] is: " + split[i]);
}
The (?=\{) means "must be immediately followed by an opening curly brace".
To read about lookaheads, see this regex tutorial.
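The lookahead alone only checks what follows the comma. On engines that implement ES2018 lookbehind, a sketch that also requires a closing brace before the comma could look like this (this assumes a reasonably recent browser or Node.js version):

```javascript
const apiResponse = '{Apple},{"A tree"},{Three2},{123},{A bracket {},{Two brackets {}},{}';

// Split only at commas that are preceded by "}" and followed by "{".
// Neither brace is consumed, so both stay in the resulting pieces.
const parts = apiResponse.split(/(?<=\}),(?=\{)/);

console.log(parts.length); // 7
console.log(parts);
```

A response with a single element, such as "{Apple}", comes back as a one-element array with its braces intact.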
var _data = '{Apple},{"A tree"},{Three2},{123},{A bracket {},{Two brackets {}},{}';
var _items = [];
var re = /(^|,){(.*?)}(?=,{|$)/g;
var m;
while ((m = re.exec(_data)) !== null){
_items.push(m[2]);
}
You can test it out using jsFiddle http://jsfiddle.net/wao20/SgFx7/24/
Regex breakdown:
(^|,)     Start of the string or a comma
{         A literal bracket "{"
(.*?)     Non-greedy match between the two brackets (for more info see http://javascript.info/tutorial/greedy-and-lazy)
}         A literal bracket "}"
(?=,{|$)  Non-consuming lookahead (matches ",{" or the end of the string). Without the lookahead the match would eat the comma and you would end up with only every other item.
Update: Changed regex to address Robin's comments.
/(^|,)\{(.*?)\}(?=,|$)/g to /(^|,){(.*?)}(?=,{|$)/g
This should work for the string as provided - it doesn't account for whitespace between braces and commas, nor does it retain the brace-comma-brace pattern within quotes.
var str = '{Apple},{"A tree"},{Three2},{123},{A bracket {},{Two brackets {}},{}';
var parts = [];
var nextIndex = function(str) {
return (str.search(/},{/) > -1) ? str.search(/},{/) + 1 : null;
};
while (nextIndex(str)) {
parts.push(str.slice(0, nextIndex(str)));
str = str.slice(nextIndex(str) + 1);
}
parts.push(str); // Final piece
console.log(parts);
I have the strings "add_dinner", "add_meeting" and "add_fuel_surcharge", and I want to get the characters that follow "add_" (dinner, meeting, fuel_surcharge).
[^a][^d]{2}[^_]\w+
I have tried this one, but it only works for "add_dinner"
[^add_]\w+
This one works for "add_fuel_surcharge", but takes "inner" from "add_dinner"
Please help me understand.
Use capturing groups:
/^add_(\w+)$/
Check the returned array to see the result.
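JavaScript only gained lookbehind assertions in ES2018, so to support older engines you need to use a capturing group:
var subject = "add_fuel_surcharge"; // example input
var myregexp = /add_(\w+)/;
var match = myregexp.exec(subject);
var result;
if (match != null) {
  result = match[1]; // the text captured by (\w+)
}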
[^add_] is a character class that matches a single character except a, d or _. When applied to add_dinner, the first character it matches is i, and \w+ then matches nner.
The [^...] construct matches any single character except the ones listed. So [^add_] matches any single character other than "a", "d" or "_".
If you want to retrieve the bit after the _ you can do this:
/add_(\w+)/
Where the parentheses "capture" the part of the expression inside. So to get the actual text from a string:
var s = "add_meeting";
var result = s.match(/add_(\w+)/)[1];
This assumes the string will match such that you can directly get the second element in the returned array that will be the "meeting" part that matched (\w+).
If there's a possibility that you'll be testing a string that won't match you need to test that the result of match() is not null.
(Or, possibly easier to understand: result = "add_meeting".split("_")[1];)
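To make the null check concrete, here is a small sketch (afterAdd is a hypothetical helper name). It also shows why the capturing group is safer than split("_")[1], which would return only "fuel" for "add_fuel_surcharge":

```javascript
// Hypothetical helper: return the text after "add_", or null when the
// string does not have that prefix.
function afterAdd(subject) {
  const match = /^add_(\w+)$/.exec(subject);
  return match ? match[1] : null;
}

const samples = ["add_dinner", "add_meeting", "add_fuel_surcharge", "remove_fuel"];
console.log(samples.map(afterAdd));
// [ 'dinner', 'meeting', 'fuel_surcharge', null ]
```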
You can also extract the part after the first _ with a plain JavaScript for loop:
var str = ['add_dinner', 'add_meeting', 'add_fuel_surcharge'];
var filterString = [];
for(var i = 0; i < str.length; i ++){
if(str[i].indexOf("_")>-1){
filterString.push(str[i].substring(str[i].indexOf("_") + 1, str[i].length));
}
}
alert(filterString.join(", "));
I have a problem. I have a string, "\,str\,i,ing", and I need to split it at commas that are not preceded by a backslash. For my string the result should be ["\,str\,i", "ing"]. I am using the following regex:
myString.split("[^\],", 2)
but it doesn't work.
Well, this is a ridiculous way to work around the lack of lookbehind, but it seems to give the correct result.
"\\,str\\,i,ing".split('').reverse().join('').split(/,(?=[^\\])/).map(function(a){
return a.split('').reverse().join('');
}).reverse();
//=> ["\,str\,i", "ing"]
Not sure about your expected output, but you are passing a string, not a regex. Use:
var arr = "\\,str\\,i,ing".split(/[^\\],/, 2); // backslashes doubled so they survive the string literal
console.log(arr);
To split using a regex, wrap the pattern in /.../
This is not easily possible in JS engines without lookbehind support (it was only added in ES2018). Even if you used a real regex, it would eat the last character before the comma:
> "xyz\\,xyz,xyz".split(/[^\\],/, 2)
["xyz\\,xy", "xyz"]
If you don't want the z to be eaten, I'd suggest:
var str = "....";
var parts = str.split(",").reduce(function(res, part) {
  var l = res.length;
  // Append to the previous part if there is one (not the first) and either
  // its comma was escaped or the limit of 2 parts has been reached.
  if (l && (res[l-1].slice(-1) == "\\" || l >= 2))
    res[l-1] += "," + part;
  else
    res.push(part);
  return res;
}, []);
Reading between the lines, it looks like you want to split a string by , characters that are not preceded by \ characters.
It would be really great if JavaScript had a regular expression lookbehind (and negative lookbehind) pattern, but engines older than ES2018 do not. What every engine does have is lookahead ((?=...)) and negative lookahead ((?!...)). Make sure to review the documentation.
You can use these as a lookbehind if you reverse the string:
var str,
reverseStr,
arr,
reverseArr;
//don't forget to escape your backslashes
str = '\\,str\\,i,ing';
//reverse your string
reverseStr = str.split('').reverse().join('');
//split the reversed string on `,`s that aren't followed by `\`
reverseArr = reverseStr.split(/,(?!\\)/);
//reverse the reversed array, and reverse each string in the array
arr = reverseArr.reverse().map(function (val) {
return val.split('').reverse().join('');
});
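Engines that implement ES2018 do support negative lookbehind, so where older browsers are not a concern the reversal can be skipped. A minimal sketch under that assumption:

```javascript
// Backslashes are doubled so that they survive the string literal.
const str = "\\,str\\,i,ing";

// Split at every comma that is not immediately preceded by a backslash.
const parts = str.split(/(?<!\\),/);

console.log(parts); // [ '\\,str\\,i', 'ing' ]
```

Like the reversed-string version, this treats any backslash directly before a comma as an escape, including one that is itself escaped.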
You picked a tough character to match: a backslash preceding a comma is apt to disappear while you pass it around in a string literal, since '\,'==','...
var s= 'My dog, the one with two \\, blue \\,eyes, is asleep.';
var a= [], M, rx=/(\\?),/g;
while((M= rx.exec(s))!= null){
if(M[1]) continue;
a.push(s.substring(0, rx.lastIndex-1));
s= s.substring(rx.lastIndex);
rx.lastIndex= 0;
};
a.push(s);
/* returned value: (Array)
My dog
the one with two \, blue \,eyes
is asleep.
*/
Find something which will not be present in your original string, say "###". Replace "\\," with it. Split the resulting string by ",". Replace "###" back with "\\,".
Something like this:
<script type="text/javascript">
var s1 = "\\,str\\,i,ing";
var s2 = s1.replace(/\\,/g,"###");
console.log(s2);
var s3 = s2.split(",");
for (var i=0;i<s3.length;i++)
{
s3[i] = s3[i].replace(/###/g,"\\,");
}
console.log(s3);
</script>