I need to match all word characters (letters and digits, \w) in a string that are not inside single quotes (').
For instance, I have the string:
param : 'test' .param4 'zzzz' param8 * 'rrrr'
From that string I need to get:
- param
- param4
- param8
Thanks in advance.
You can use this lookahead based regex:
/(?=(?:(?:[^']*'){2})*[^']*$)\b\w+\b/gm
RegEx Demo
This regex matches a word only if it is outside single quotes: the lookahead checks that an even number of quotes follows the word, up to the end of the line. This assumes the quotes are balanced and never escaped.
Code:
var re = /(?=(?:(?:[^']*'){2})*[^']*$)\b\w+\b/gm;
var str = 'param : \'test\' .param4 \'zzzz\' param8 * \'rrrr\' class2.class3*dsaasd';
var m;
var result = [];
while ((m = re.exec(str)) !== null) {
  // Guard against an infinite loop on zero-length matches
  if (m.index === re.lastIndex)
    re.lastIndex++;
  result.push(m[0]);
}
console.log(result);
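An alternative that avoids the lookahead is to let the regex consume each quoted run first and capture only the words found outside of them. The following is a minimal sketch of that idea, not part of the answer above; the helper name wordsOutsideQuotes is hypothetical, and like the lookahead version it assumes balanced, unescaped single quotes.

```javascript
// Hypothetical helper: returns the words that lie outside single-quoted runs.
// Assumes quotes are balanced and never escaped.
function wordsOutsideQuotes(input) {
  // First alternative swallows a whole quoted run; second captures a bare word.
  var re = /'[^']*'|(\w+)/g;
  var words = [];
  var m;
  while ((m = re.exec(input)) !== null) {
    // Group 1 is set only when the match was a word outside quotes.
    if (m[1] !== undefined) {
      words.push(m[1]);
    }
  }
  return words;
}

console.log(wordsOutsideQuotes("param : 'test' .param4 'zzzz' param8 * 'rrrr'"));
// → [ 'param', 'param4', 'param8' ]
```

Because each quoted run is matched once and skipped, the string is scanned in a single pass instead of re-checking the rest of the line after every word.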
Original string:
some text "some \"string\"right here "
Want to get:
"some \"string\"right here"
I am using the following regex:
/\"(.*?)\"/g
Parsing the string correctly with a parser
With a JavaScript regex, it is impossible to start matching at the correct double quote. You will either match an escaped one, or you will fail to match the correct double quote after a literal \ before a quote. Thus, the safest way is to use a parser. Here is a sample one:
var s = "some text \\\"extras\" some \\\"string \\\" right\" here \"";
console.log("Incorrect (with regex): ", s.match(/"([^"\\]*(?:\\.[^"\\]*)*)"/g));
var res = [];
var tmp = "";
var in_quotes = false;
var in_entity = false;
for (var i = 0; i < s.length; i++) {
  if (s[i] === '\\' && in_entity === false) { // a backslash starts an escape sequence
    in_entity = true;
    if (in_quotes === true) {
      tmp += s[i];
    }
  } else if (in_entity === true) { // the escaped char: never opens or closes a match
    in_entity = false;
    if (in_quotes === true) {
      tmp += s[i];
    }
  } else if (s[i] === '"' && in_quotes === false) { // start a new match
    in_quotes = true;
    tmp += s[i];
  } else if (s[i] === '"' && in_quotes === true) { // append the closing quote and store the match
    tmp += s[i];
    res.push(tmp);
    tmp = "";
    in_quotes = false;
  } else if (in_quotes === true) { // append a char to the match
    tmp += s[i];
  }
}
console.log("Correct results: ", res);
Not-so-safe regex approach
It is not possible to match the string you need with a lazy dot pattern, since .*? stops at the first " it meets, escaped or not. If you know your string will never have an escaped quote before a quoted substring, and if you are sure there are no literal \ before double quotes (these conditions are very strict, and the regex is only safe when they hold), you can use
/"([^"\\]*(?:\\.[^"\\]*)*)"/g
See the regex demo
" - match a quote
([^"\\]*(?:\\.[^"\\]*)*) - capturing group 1, which consists of:
  [^"\\]* - 0+ chars other than \ and "
  (?:\\.[^"\\]*)* - zero or more sequences of:
    \\. - any escaped symbol
    [^"\\]* - 0+ chars other than \ and "
" - trailing quote
JS demo:
var re = /"([^"\\]*(?:\\.[^"\\]*)*)"/g;
var str = `some text "some \\"string\\"right here " some text "another \\"string\\"right here "`;
var res = [];
var m;
while ((m = re.exec(str)) !== null) {
res.push(m[1]);
}
document.body.innerHTML = "<pre>" + JSON.stringify(res, 0, 4) + "</pre>"; // Just for demo
console.log(res); // or another result demo
Safe regex approach
Complementing @WiktorStribiżew's answer, there is a technique to start matching at the correct double quote using regex. It consists of matching both quoted and unquoted text with a pattern of the form:
/"(quoted)"|unquoted/g
As you can see, the quoted text is captured by a group, so we only consider the text captured in match[1].
Regex
/"([^"\\]*(?:\\.[^"\\]*)*)"|[^"\\]*(?:\\.[^"\\]*)*/g
Code
var regex = /"([^"\\]*(?:\\.[^"\\]*)*)"|[^"\\]*(?:\\.[^"\\]*)*/g;
var s = "some text \\\"extras\" some \\\"string \\\" right\" here \"";
var match;
var res = [];
while ((match = regex.exec(s)) !== null) {
if (match.index === regex.lastIndex)
regex.lastIndex++;
if( match[1] != null )
res.push(match[1]); //Append to result only group 1
}
console.log("Correct results (regex technique): ",res)
You can use this regex :
/[^\\](\".*?[^\\]\")/g
[^\\] matches any character other than \, so \" will not be picked up as the start or end of your match.
In order to match from quote to quote while ignoring any simple escaped quotes (\"):
(?:[^\\]|^)(\"(?:.*?[^\\]){0,1}\")
Meaning:
(?: - start of a non-capturing group
[^\\] - match one char that is not a backslash
|^ - or match the beginning of the string
( - start of the capturing group
\" - a quote (one that follows a non-backslash or the start of the string)
(?:.*?[^\\]){0,1} - the shortest substring ending with a non-backslash, zero or one times (that is, either such a substring or an empty one)
\" - the closing quote mark
Edit:
Wiktor Stribiżew correctly pointed out that my initial answer fails for some inputs, for example \\" (an escaped backslash followed by a quote), which should be treated like a plain " in your case. To handle this specific case you can use
(?:[^\\]|^)((?:\\\\)*\"(?:.*?[^\\]){0,1}(?:\\\\)*\")
But for a fully correct treatment of escapes, refer to Wiktor's answer.
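As a quick check of the simpler pattern, here is a minimal sketch run against the string from the question. It assumes the groups are written as non-capturing (?:...), so that the quoted text lands in group 1; the variable names are illustrative only.

```javascript
// Assumption: the helper groups are non-capturing, so group 1 is the quoted text.
var re = /(?:[^\\]|^)("(?:.*?[^\\]){0,1}")/g;

// The string from the question: some text "some \"string\"right here "
var str = 'some text "some \\"string\\"right here "';

var m = re.exec(str);
var quoted = m ? m[1] : null;
console.log(quoted); // → "some \"string\"right here "
```

The escaped quotes are skipped because each of them is preceded by a backslash, which [^\\] refuses to match just before the closing quote.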
I want to write a regular expression that allows only the characters a-z and A-Z between two colons (: :).
For example:
My :head: is on :fire:
I have found something like this:
/:.+?:/g
which allows any character between the colons.
How can I allow only the characters a-z and A-Z?
You can use /:[a-zA-Z]*:/g regex.
Have a look at the online demo.
Example code:
var re = /:[a-zA-Z]*:/g;
var str = ':mystring:';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
Just put A-Z, a-z inside a character class and place it between the two colons.
:[A-Za-z]+:
The + after the character class repeats the previous token one or more times, so [A-Za-z]+ matches one or more letters.
In javascript, you need to use match function like
> var str = 'My :head: is on :fire:'
undefined
> str.match(/:[A-Za-z]+:/g)
[ ':head:', ':fire:' ]
:[a-z]+:
Try this with i modifier.
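A minimal sketch of the case-insensitive variant, using the sample sentence from the question. The capture group and the names tokens and names are additions for illustration; matchAll assumes an ES2020-capable engine.

```javascript
var str = 'My :head: is on :fire:';

// Whole tokens, colons included; the i flag makes [a-z] match A-Z too.
var tokens = str.match(/:[a-z]+:/gi);
console.log(tokens); // → [ ':head:', ':fire:' ]

// Only the names between the colons, via a capture group.
var names = Array.from(str.matchAll(/:([a-z]+):/gi), function (m) {
  return m[1];
});
console.log(names); // → [ 'head', 'fire' ]

// A token containing a digit is rejected.
console.log(':h3ad:'.match(/:[a-z]+:/gi)); // → null
```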
I need a JavaScript regex pattern to test a schema variable, so that it matches either of the following:
It can start with any characters followed by "_water_glass", with nothing after water_glass, like "xxxx_water_glass",
or
it can be just "water_glass": no characters are required before water_glass, and nothing may follow it.
Could anyone please help me get the regex pattern?
Simply try /^(?:.*_)?water_glass$/
var re = /^(?:.*_)?water_glass$/mg;
var str = 'xxxx_water_glass\nwater_glass\nwater_glass_xxxx';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
DEMO https://regex101.com/r/gB9zL7/2
Here you are:
^(?:.+_|)water_glass$
Details:
^ - start of string
(?:.+_|) - either 1+ chars other than line break chars, as many as possible, up to and including the last _, or an empty string
water_glass - a water_glass substring
$ - end of string.
See this regex demo and a demo code below:
var re = /^(?:.+_|)water_glass$/gm;
var str = 'xxxx_water_glass\nwater_glass';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
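Since the goal is to test a single schema value rather than to collect matches, RegExp.prototype.test without the g and m flags may be enough. A minimal sketch, with sample inputs that are assumptions rather than values from the question:

```javascript
// No g/m flags: ^ and $ anchor the whole value, and test() keeps no state.
var re = /^(?:.+_|)water_glass$/;

// Sample inputs (assumed for illustration).
var samples = ['xxxx_water_glass', 'water_glass', 'water_glass_x', 'xwater_glass'];
var results = samples.map(function (s) {
  return re.test(s);
});

console.log(results); // → [ true, true, false, false ]
```

Leaving out the g flag matters here: with g, test() advances lastIndex between calls and can give surprising results when the same regex object is reused.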
I always have a hard time with regex.
I'm trying to select the text between two delimiters (taking into account what comes before and after):
'window.API=' and ';' //for window.API= '--API--';
and other cases like:
'window.img_cdn=' and ';' //for window.img_cdn= '--imgCDN--';
Any tips on which regex concepts I should use would be a great help!
If you want to capture the content between 'xx' you can use a regex like this:
'(.*?)'
working demo
For the sample text:
window.API= '--API--';
window.img_cdn= '--imgCDN--';
You will capture:
MATCH 1
1. [13-20] `--API--`
MATCH 2
1. [40-50] `--imgCDN--`
The javascript code you can use is:
var re = /'(.*?)'/g;
var str = 'window.API= \'--API--\';\nwindow.img_cdn= \'--imgCDN--\';';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
On the other hand, if you specifically want to capture the content for only those entries, then you can use this regex:
window\.(?:API|img_cdn).*?'(.*?)'
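If several such assignments need to be read at once, a variation with two capture groups can collect them into an object. This is a sketch under the assumption that every entry has the form window.NAME = '...'; as in the sample text; the names source and settings are illustrative.

```javascript
var source = "window.API= '--API--';\nwindow.img_cdn= '--imgCDN--';";

// Group 1: the property name after "window."; group 2: the quoted value.
var re = /window\.(\w+)\s*=\s*'(.*?)';/g;

var settings = {};
var m;
while ((m = re.exec(source)) !== null) {
  settings[m[1]] = m[2];
}

console.log(settings); // → { API: '--API--', img_cdn: '--imgCDN--' }
```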
If you want to match any text between a <some string>= sign and a semicolon, here you go:
(?:[\w\.]+\s*=\s')(.+)(?:';)$
This pattern still matches the full string when it contains an escaped apostrophe, e.g. //for window.img_cdn = '--imgCDN and \'semicolon\'--';
JavaScript code:
var re = /(?:[\w\.]+\s*=\s')(.+)(?:';)$/gm;
var str = '//for window.img_cdn= \'--imgCDN--\';\n//for window.img_cdn = \'--imgCDN and semicolon = ;;;--\';';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// view results
}
The required text is in the 1st captured group. In case there is a semicolon in the text you are looking for, you will correctly match it due to the $ anchor.
See demo here
I know how to find the first occurrence of a predefined character, like a.indexOf("R"). But what if I want to find the first occurrence of any character A-Z? Say my string contains digits and other special characters, and I'm only interested in a "normal" letter.
use regex:
a.match(/[A-Za-z]/);
Use regex.
var a = " $#714 Abcd";
var answer = a.match(/[A-Za-z]/)[0];
console.log(answer); // 'A'
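If the position is wanted rather than the letter itself (the closest equivalent of indexOf), String.prototype.search returns the index of the first match, or -1 when there is none. A minimal sketch using the same sample string; the variable names are illustrative:

```javascript
var a = " $#714 Abcd";

// search() returns the index of the first match, like indexOf for a pattern.
var position = a.search(/[A-Za-z]/);
console.log(position); // → 7

// The match object carries the same index alongside the matched letter.
var m = a.match(/[A-Za-z]/);
console.log(m[0], m.index); // → A 7

// No letter at all: search() returns -1 and match() returns null.
var missing = "123 $#".search(/[A-Za-z]/);
console.log(missing); // → -1
```

Note that match(...)[0] throws when the string has no letter, because match returns null in that case, so check the result before indexing.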
You can use regex in order to match this expression:
a.match(/^[A-Za-z][A-Za-z0-9]*$/)
This checks that the first character is a letter and that the rest of the string is alphanumeric.
This function returns an array with the position in the string of the first occurrence of each distinct letter:
var s = 'asdlkn akn dlkandl nl ndvds';
function getFirstOccurrenceOffset(s) {
s = s.split("").reverse().join(""); // reverse the string
var re = /([A-Z])(?!.*\1)/gi;
var m;
var result = Array();
while ((m = re.exec(s)) !== null)
result.push(s.length - m.index - 1);
return result.reverse();
}
console.log(getFirstOccurrenceOffset(s));
The function is case insensitive, but you can remove the i modifier to make it case sensitive.
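The same result can be obtained without reversing the string, by walking it once and remembering which letters have been seen. This is a sketch of that alternative, not the function above; the name firstOffsetsByLoop is hypothetical, and it is case insensitive like the original.

```javascript
// Hypothetical alternative: one pass, recording each letter the first time it appears.
function firstOffsetsByLoop(s) {
  var seen = {};
  var result = [];
  for (var i = 0; i < s.length; i++) {
    var ch = s[i].toLowerCase(); // fold case so 'A' and 'a' count as the same letter
    if (/[a-z]/.test(ch) && !seen[ch]) {
      seen[ch] = true;
      result.push(i);
    }
  }
  return result;
}

console.log(firstOffsetsByLoop('abca b')); // → [ 0, 1, 2 ]
console.log(firstOffsetsByLoop('Aab'));    // → [ 0, 2 ]
```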