I'm trying to get all the power calculations out of a string using regular expressions. I tried the following code:
var regex = new RegExp('[0-9]+[^]{1}[0-9]+');
regex.exec('1^2');
This works and returns 1^2, but when I try the following string:
regex.exec('1+1^2');
it returns 1+1.
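This is because [^xyz] means "not x, y, or z": ^ is the "not" operator in character classes ([...]). In JavaScript, the empty negated class [^] therefore matches any character at all, including the +. To fix this, simply escape the ^ (one backslash to escape the ^, and another to escape the first backslash, since the pattern is written as a string literal, where the backslash is itself a special character):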
var regex = new RegExp('[0-9]+[\\^]{1}[0-9]+');
Also, you don't need to use character classes and the {1} if you only have one character; just do this:
var regex = new RegExp('[0-9]+\\^[0-9]+');
Finally, one more improvement - you can use literal regular expression syntax (/.../) so you don't need two backslashes:
var regex = /[0-9]+\^[0-9]+/;
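To collect every power expression from a string, as the question asks, the corrected pattern also needs the g flag. A minimal sketch, where the sample input and the variable names are made up for illustration:

```javascript
// Corrected pattern, with the global flag so that match() returns every hit.
const powerRegex = /[0-9]+\^[0-9]+/g;
const powers = '1+1^2+3^4'.match(powerRegex);
console.log(powers); // [ '1^2', '3^4' ]

// For comparison, the pattern from the question: [^] accepts any character,
// so the "+" satisfies it and the match stops before the real "^".
const brokenRegex = new RegExp('[0-9]+[^]{1}[0-9]+');
const brokenResult = brokenRegex.exec('1+1^2')[0];
console.log(brokenResult); // 1+1
```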
[^] in regex terms is a character class ([]) that's been inverted (^). e.g. [^abc] is "any character that is NOT a, b, or c". You need to escape the caret: [\^].
As well, {1} is redundant. Any character class or individual character in a regex has an implied {1} on it, so /a{1}b{1}c{1}/ is just a very verbose way of saying /abc/.
As well, a single-char character class is also redundant. /[a]/ is exactly the same as /a/.
I am getting a string containing newlines (\n), tabs (\t) and lowercase letters [a-z]. It is possible to split it by matching /\n|\t/. AFAIK the dot represents the wildcard.
Therefore I was wondering, why /\n|\t/ doesn't match the same things as /\\./
var text = 'test1 \ntest2';
text.split(/\n/) //['test1', 'test2']
text.split(/\./) //['test1 \ntest2']
text.split(/\\./) //['test1 \ntest2']
Shouldn't the \\. match the \n (newline)?
Let me try and answer all the points:
AFAIK the dot represents the wildcard.
No, in regex, we do not use the term "wildcard". It is a special regex (meta)character. A dot in JavaScript regex matches any character but a newline.
I was wondering, why /\n|\t/ doesn't match the same things as /\\./
Because /\n|\t/ matches 1 symbol, either a newline or tab, while the regex /\\./ matches a literal \ and a character other than a newline.
The \n and \t are escape sequences. That means the \ is not a literal backslash; together with the following symbol it forms a single unit that stands for a character that cannot easily be written otherwise. Indeed, how can we write a line break on paper with a pen? No way!
See more about JavaScript character escape sequences here.
Now,
text.split(/\n/) //['test1', 'test2']
True, your input string contains a line break, thus, you get two elements in the resulting array
text.split(/\./) //['test1 \ntest2']
No match was found because \. matches a literal dot. A dot that is escaped (that has a literal \ before it) in the regex stops being a special regex metacharacter, and just matches its literal representation. Your string has no dot, thus, no matches.
text.split(/\\./) //['test1 \ntest2']
Again, no match is found, as /\\./ looks for a literal \ followed by any character but a newline.
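The three cases above can be checked side by side. This is only a sketch: the second input string, which does contain a literal backslash, is an assumption added for contrast and is not part of the question.

```javascript
const text = 'test1 \ntest2';     // contains a real newline and no backslash
const escaped = 'test1 \\ntest2'; // contains a literal backslash followed by "n"

const byNewline = text.split(/\n/);        // splits at the newline
const byLiteralDot = text.split(/\./);     // no "." in the input, nothing to split on
const byBackslash = text.split(/\\./);     // no "\" in the input, nothing to split on
const escapedSplit = escaped.split(/\\./); // matches the two characters "\" and "n"

console.log(byNewline.length);    // 2
console.log(byLiteralDot.length); // 1
console.log(byBackslash.length);  // 1
console.log(escapedSplit.length); // 2
```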
A hint: test your expressions at regex101.com; it will tell you on the right what your regex can match.
Here, you are using regex literal notation (/.../). In literal notation the pattern is not parsed as a string first, so a backslash only has to be escaped once, for the regex engine. If you used constructor notation (i.e. RegExp(....)), you would have to use double escaping. E.g.
var re = /\\./; // is equal to
var re = new RegExp("\\\\.");
See more about constructor and literal notations at MDN RegExp help page.
\n gets evaluated to a newline character, so the string contains no literal backslash for /\\./ to match. If you do a quick console.log('\n'); you can see the output of that.
I am a bit new to regular expressions in JavaScript.
I am trying to write a function called parseRegExpression()
which parses the attributes passed and generates key/value pairs.
It works fine with the input:
"iconType:plus;iconPosition:bottom;"
But it is not able to parse the input:
"type:'date';locale:'en-US';"
Basically the - sign is being ignored. The code is at:
http://jsfiddle.net/visibleinvisibly/ZSS5G/
The regular expression for a key/value pair is as below:
/[a-z|A-Z|-]*\s*:\s*[a-z|A-Z|'|"|:|-|_|\/|\.|0-9]*\s*;|[a-z|A-Z|-]*\s*:\s*[a-z|A-Z|'|"|:|-|_|\/|\.|0-9]*\s*$/gi;
There are a few problems:
A | inside a character class means a literal | character, not an alternation.
A . inside a character class means a literal . character, so there's no need to escape it.
A - as the first or last character inside a character class means a literal - character, otherwise it means a character range.
There's no need to use [a-zA-Z] when you use the case-insensitive modifier (i); [a-z] is enough.
The only difference between your alternations is the last bit; this can be simplified significantly by just limiting your alternation to the part that is different.
This should be equivalent to your original pattern:
/[a-z-]*\s*:\s*[a-z0-9'":_\/.-]*\s*(?:;|$)/gi
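A quick way to see the effect, as a sketch: the first regex below isolates the value character class from the question (the ^...$ anchors are added here only for the test), and the second is the simplified pattern above.

```javascript
// Value class copied from the question: "|-|" is a range from "|" to "|",
// so the hyphen itself is never part of the class.
const originalValueClass = /^[a-z|A-Z|'|"|:|-|_|\/|\.|0-9]*$/i;
const acceptsHyphen = originalValueClass.test('en-US'); // false
const acceptsPipe = originalValueClass.test('a|b');     // true: "|" is a literal here

// Simplified pattern with the hyphen placed last in each class.
const fixedPattern = /[a-z-]*\s*:\s*[a-z0-9'":_\/.-]*\s*(?:;|$)/gi;
const matches = "type:'date';locale:'en-US';".match(fixedPattern);
console.log(matches); // the two items: type:'date'; and locale:'en-US';
```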
You can avoid the regex:
var test1 = "iconType:plus;iconPosition:bottom;";
var test2 = "type:'date';locale:'en-US';";
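function toto(str) {
  var result = [];
  var temp = str.split(';');
  // The trailing ';' leaves an empty last element, so stop one short.
  for (var i = 0; i < temp.length - 1; i++) {
    // Split at the first ':' only, so a value may itself contain ':'.
    var pos = temp[i].indexOf(':');
    result[i] = [temp[i].slice(0, pos), temp[i].slice(pos + 1)];
  }
  return result;
}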
console.log(toto(test1));
console.log(toto(test2));
Inside a character set atom [...] the pipe char | is just a regular char and doesn't mean "or".
A character set atom lists characters or ranges you want to accept (or exclude if the character set starts with ^) and "or" is implicit.
You can use a backslash in a character set if you need to include/exclude a close bracket ], the ^ sign, the dash - that is used for ranges, the backslash \ itself, an unprintable character, or if you want to specify a non-ASCII Unicode char by its code instead of literally.
Regular expression syntax however also lets you avoid backslash-escaping in a character set atom by placing the character in a position where it cannot have the special meaning... for example a dash - as first or last in the set (it cannot mean a range there).
Note also that if the values you need to match can be quoted strings, including backslash escaping, the regular expression is more complex, for example
'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"
matches a single-quoted or double-quoted string including backslash escaping, the meaning being:
A single quote '
Zero or more of either:
Any char except the single quote ' or the backslash \
A pair composed of a backslash \ followed by any char
A single quote '
or the same with double quotes " instead.
Note that the groups have been delimited with (?:...) instead of plain (...) to avoid capture
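As a rough check of the quoted-string pattern, here it is applied with the g flag to a made-up input (the input and variable names are assumptions for illustration). String.raw is used only so that the backslashes in the sample appear exactly as typed.

```javascript
// The pattern from above, as a regex literal with the global flag.
const quoted = /'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"/g;

// Sample input containing escaped quotes inside both kinds of quoted string.
const input = String.raw`name:'O\'Brien';title:"say \"hi\"";`;
const found = input.match(quoted);

console.log(found.length); // 2
console.log(found[0]);     // 'O\'Brien'
console.log(found[1]);     // "say \"hi\""
```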
It doesn't match hyphens because it is interpreting |-| as a range that starts at | and ends at |. (I would have expected that to be treated as a syntax error, but there you have it. It works the same in every regex flavor I've tried, too.)
Have a look at this regex:
/(?:^|;)([a-z-]*)\s*:\s*([a-z'":_\/.0-9-]*)\s*(?=;|$)/ig
As suggested by the other responders, I collapsed it to one alternative, removed the unneeded pipes, and escaped the hyphen by moving it to the end. I also anchored it at the beginning as well as the end. Or anchored it as well as I can, anyway. I used a lookahead to match the trailing semicolon so it will still be there when the next match starts. It's far from foolproof, but it should work okay as long as the input is well formed.
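One possible way to use the two capture groups, sketched with String.prototype.matchAll (which requires the g flag); the pairs object and the loop are illustrative and not part of the original fiddle:

```javascript
const pairRegex = /(?:^|;)([a-z-]*)\s*:\s*([a-z'":_\/.0-9-]*)\s*(?=;|$)/ig;

// Collect key/value pairs from the capture groups: m[1] is the key, m[2] the value.
const pairs = {};
for (const m of "type:'date';locale:'en-US';".matchAll(pairRegex)) {
  pairs[m[1]] = m[2];
}

// The values keep their surrounding quotes, because the quote characters
// are part of the value character class.
console.log(pairs.type);   // 'date'
console.log(pairs.locale); // 'en-US'
```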
Replace the regular expressions in your code as follows:
regExpKeyValuePair = /[-a-z]*\s*:\s*[-a-z'":_\/.0-9]*\s*;|[-a-z]*\s*:\s*[-a-z'":_\/.0-9]*\s*$/gi;
regExpKey = /[-a-z]*/gi;
regExpValue = /[-a-z:_\/.0-9]*/gi;
You don't need to escape . inside [].
No need to put | between elements inside [].
Because you are using the /i flag, [A-Z] is not needed.
- should be at the beginning or at the end of the class.
I tried many ways to get a single backslash from an executed string (I don't mean an input from HTML).
I can get special characters such as tab, newline and many others and then escape them to \\t or \\n or \\(some other character), but I cannot get a single backslash when a non-special character is next to it.
I don't want something like:
str = "\apple"; // I want this, to return:
console.log(str); // \apple
and if I try to get character at 0 then I get a instead of \.
(See ES2015 update at the end of the answer.)
You've tagged your question both string and regex.
In JavaScript, the backslash has special meaning both in string literals and in regular expressions. If you want an actual backslash in the string or regex, you have to write two: \\.
The following string starts with one backslash; the first one you see in the literal is an escape character starting an escape sequence. The \\ escape sequence tells the parser to put a single backslash in the string:
var str = "\\I have one backslash";
The following regular expression will match a single backslash (not two); again, the first one you see in the literal is an escape character starting an escape sequence. The \\ escape sequence tells the parser to put a single backslash character in the regular expression pattern:
var rex = /\\/;
If you're using a string to create a regular expression (rather than using a regular expression literal as I did above), note that you're dealing with two levels: The string level, and the regular expression level. So to create a regular expression using a string that matches a single backslash, you end up using four:
// Matches *one* backslash
var rex = new RegExp("\\\\");
That's because first, you're writing a string literal, but you want to actually put backslashes in the resulting string, so you do that with \\ for each one backslash you want. But your regex also requires two \\ for every one real backslash you want, and so it needs to see two backslashes in the string. Hence, a total of four. This is one of the reasons I avoid using new RegExp(string) whenever I can; I get confused easily. :-)
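The two levels can be checked directly. A small sketch, with variable names chosen only for illustration:

```javascript
// Regex literal: one level of escaping, so two backslashes match one.
const fromLiteral = /\\/;
// String passed to the constructor: two levels of escaping, so four.
const fromString = new RegExp("\\\\");

// Both end up with the same pattern text (two backslash characters).
const sameSource = fromLiteral.source === fromString.source;

// "a\\b" is the three characters a, \ and b.
const sample = "a\\b";

console.log(sameSource);              // true
console.log(sample.length);           // 3
console.log(fromString.test(sample)); // true
```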
ES2015 and ES2018 update
Fast-forward to 2015, and as Dolphin_Wood points out the new ES2015 standard gives us template literals, tag functions, and the String.raw function:
// Yes, this unlikely-looking syntax is actually valid ES2015
let str = String.raw`\apple`;
str ends up having the characters \, a, p, p, l, and e in it. Just be careful there are no ${ in your template literal, since ${ starts a substitution in a template literal. E.g.:
let foo = "bar";
let str = String.raw`\apple${foo}`;
...ends up being \applebar.
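For comparison, here is how the three ways of writing the string behave. This is a sketch; the variable names are illustrative.

```javascript
// "\a" is not a recognised escape, so the backslash is silently dropped.
const plain = "\apple";
// Doubling the backslash yields one real backslash in the string.
const doubled = "\\apple";
// String.raw keeps the backslash exactly as typed.
const raw = String.raw`\apple`;

console.log(plain.length);   // 5
console.log(doubled.length); // 6
console.log(raw === doubled); // true
console.log(raw.charAt(0) === "\\"); // true
```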
Try String.raw method:
str = String.raw`\apple` // "\apple"
Reference here: String.raw()
\ is an escape character; when followed by a non-special character it doesn't become a literal \. Instead, you have to double it: \\.
console.log("\apple"); //-> "apple"
console.log("\\apple"); //-> "\apple"
With an ordinary string literal there is no way to get the original, raw string definition or to write a string without escape characters; since ES2015, String.raw offers this for template literals.
Please try the code below; it works for me, and I get the output with a backslash:
String sss="dfsdf\\dfds";
System.out.println(sss);
Using http://www.regular-expressions.info/javascriptexample.html, I tested the following regex:
^\\{1}([0-9])+
This is designed to match a backslash and then a number.
It works there.
If I then try this directly in code:
var reg = /^\\{1}([0-9])+/;
reg.exec("/123")
I get no matches!
What am I doing wrong?
Update:
Regarding the update to your question: the regex then has to be:
var reg = /^\/(\d+)/;
You have to escape the slash inside the regex with \/.
The backslash needs to be escaped in the string too:
reg.exec("\\123")
Otherwise \1 will be treated as the start of an escape sequence rather than as a backslash followed by a digit.
Btw, the regular expression can be simplified:
var reg = /^\\(\d+)/;
Note that I moved the quantifier + inside the capture group, otherwise it will only capture a single digit (namely 3) and not the whole number 123.
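The difference between the two placements of + shows up in the captured group. A short sketch, with illustrative variable names:

```javascript
// "\\123" is the four characters \, 1, 2 and 3.
const input = "\\123";

// Quantifier outside the group: the group is overwritten on each repetition,
// so only the last digit is kept.
const lastDigitOnly = /^\\(\d)+/.exec(input);
// Quantifier inside the group: the group captures the whole number.
const wholeNumber = /^\\(\d+)/.exec(input);

console.log(lastDigitOnly[1]); // 3
console.log(wholeNumber[1]);   // 123
```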
You need to escape the backslash in your string:
"\\123"
Also, for various implementation bugs, you may want to set reg.lastIndex = 0;.
In addition, {1} is completely redundant, you can simplify your regex to /^\\(\d)+/.
One last note: (\d)+ will only capture the last digit, you may want (\d+).
I'm trying to write a regular expression that will check for numbers, spaces, parentheses, + and -.
This is what I have so far:
/\d|\s|\-|\)|\(|\+/g
but I'm getting this error: unmatched ) in regular expression
Any suggestions will help.
Thanks
Use a character class:
/[\d\s()+-]/g
This matches a single character if it's a digit \d, whitespace \s, literal (, literal ), literal + or literal -. Putting - last in a character class is an easy way to make it a literal -; otherwise it may become a range definition metacharacter (e.g. [A-Z]).
Generally speaking, instead of matching one character at a time as alternates (e.g. a|e|i|o|u), it's much more readable to use a character class instead (e.g. [aeiou]). It's more concise, more readable, and it naturally groups the characters together, so you can do e.g. [aeiou]+ to match a sequence of vowels.
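A small sketch of the character class in use. The sample phone number and the anchored variant with ^...+$ are assumptions added for illustration; the anchored form checks that a whole string consists only of allowed characters.

```javascript
// Matches one allowed character at a time; "-" is last, so it is literal.
const allowedChar = /[\d\s()+-]/g;
// Anchored variant: the entire string must consist of allowed characters.
const onlyAllowed = /^[\d\s()+-]+$/;

const phone = '+1 (888) 111-2222';
const everyCharMatches = phone.match(allowedChar).length === phone.length;
const validPhone = onlyAllowed.test(phone);
const invalidPhone = onlyAllowed.test('888-CALL');

// A class combines naturally with a quantifier to match a run of characters.
const vowelRuns = 'queueing'.match(/[aeiou]+/g);

console.log(everyCharMatches); // true
console.log(validPhone);       // true
console.log(invalidPhone);     // false
console.log(vowelRuns);        // [ 'ueuei' ]
```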
References
regular-expressions.info/Character Class
Caveat
Beginners sometimes write a character class as [a|e|i|o|u], or worse, [this|that], expecting | to mean "or". This is wrong. A character class by itself matches one and exactly one character from the input.
Related questions
Regex: why doesn’t [01-12] range work as expected?
Here is an awesome Online Regular Expression Editor / Tester! Here is your [\d\s()+-] there.
/^[\d\s\(\)\-]+$/
This expression matches only digits, parentheses, white spaces, and minus signs.
example:
888-111-2222
888 111 2222
8881112222
(888)111-2222
...
You need to escape your parentheses, because parentheses are used as special syntax in regular expressions:
instead of '(':
\(
instead of ')':
\)
Also, this won't work with '+' for the same reason:
\+
Edit: you may want to use a character class instead of the 'or' notation with '|' because it is more readable:
[\s\d()+-]
Try this:
[\d\s-+()]