I can't find any definitive information on what / means in a JavaScript regex.
The code replace(/\r/g, '');
What I'm able to figure out is this:
/ = I don't know
\r = carriage return
/g = I don't know, but it may mean 'the match must occur at the point where the previous match ended.'
The slashes indicate the start and end of the regular expression.
The g at the end is a flag and indicates it is a global search.
From the docs:
Regular expressions have four optional flags that allow for global and
case insensitive searching. To indicate a global search, use the g
flag. To indicate a case-insensitive search, use the i flag. To
indicate a multi-line search, use the m flag. To perform a "sticky"
search, that matches starting at the current position in the target
string, use the y flag. These flags can be used separately or together
in any order, and are included as part of the regular expression.
To include a flag with the regular expression, use this syntax:
var re = /pattern/flags;
To add a little more detail, the / characters are part of the regular expression literal syntax in JavaScript/ECMAScript. The / characters are used during lexical analysis to determine that a regular expression pattern is present between them, and that anything immediately following the closing / is a set of regular expression flags. The ECMAScript standard defines this in its grammar, for your perusal:
RegularExpressionLiteral :: / RegularExpressionBody / RegularExpressionFlags
A good analogy for the / in regular expressions is the " or ' that surround string literals in JavaScript.
As others have pointed out, you should read the docs! That said:
Think of the forward slash as quotation marks for regular expressions. The slashes contain the expression but are not themselves part of the expression. (If you want to test for a forward slash, you have to escape it with a backwards slash.) The lowercase g specifies that this is a global search, i.e., find all matches rather than stopping at the first match.
As is indicated here, the forward slashes are not a part of the expression itself, but denote the beginning and ending of the expression.
To add to metadept's answer:
the g bit is the global indicator - see What does the regular expression /_/g mean? - i.e. replace all occurrences, not just the first one
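To make the role of the slashes and the g flag concrete, here is a minimal sketch. The input string and variable names are made up for illustration; only replace and the regex literals come from the question.

```javascript
// Hypothetical input with Windows-style (\r\n) line endings.
const input = "line one\r\nline two\r\nline three";

// No g flag: only the first carriage return is removed.
const firstOnly = input.replace(/\r/, "");

// With the g flag: every carriage return is removed.
const allRemoved = input.replace(/\r/g, "");

// The slashes only delimit the literal; to match a real "/" escape it as \/.
const dashed = "a/b/c".replace(/\//g, "-");

console.log(JSON.stringify(firstOnly));  // "line one\nline two\r\nline three"
console.log(JSON.stringify(allRemoved)); // "line one\nline two\nline three"
console.log(dashed);                     // a-b-c
```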
Is it possible to write a regex that returns the converse of a desired result? Regexes are usually inclusive - finding matches. I want to be able to transform a regex into its opposite - asserting that there are no matches. Is this possible? If so, how?
http://zijab.blogspot.com/2008/09/finding-opposite-of-regular-expression.html states that you should bracket your regex with /^((?!^ MYREGEX ).)*$/, but this doesn't seem to work. If I have regex /[a|b]./, the string "abc" returns false with both my regex and the converse suggested by zijab, /^((?!^[a|b].).)*$/. Is it possible to write a regex's converse, or am I thinking incorrectly?
Couldn't you just check to see if there are no matches? I don't know what language you are using, but how about this pseudocode?
if (!'Some String'.match(someRegularExpression)) {
    // do something...
}
If you can only change the regex, then the one you got from your link should work:
/^((?!REGULAR_EXPRESSION_HERE).)*$/
The reason your inverted regex isn't working is because of the '^' inside the negative lookahead:
/^((?!^[ab].).)*$/
      ^ # WRONG
Maybe it's different in vim, but in every regex flavor I'm familiar with, the caret matches the beginning of the string (or the beginning of a line in multiline mode). But I think that was just a typo in the blog entry.
You also need to take into account the semantics of the regex tool you're using. For example, in Perl, this is true:
"abc" =~ /[ab]./
But in Java, this isn't:
"abc".matches("[ab].")
That's because the regex passed to the matches() method is implicitly anchored at both ends (i.e., /^[ab].$/).
Taking the more common Perl semantics, /[ab]./ means the target string contains a sequence consisting of an 'a' or 'b' followed by any single (non-line-separator) character. In other words, at ANY point, the condition is TRUE. The inverse of that statement is that at EVERY point the condition is FALSE. That means, before you consume each character, you perform a negative lookahead to confirm that the character isn't the beginning of a matching sequence:
(?![ab].).
And you have to examine every character, so the regex has to be anchored at both ends:
/^(?:(?![ab].).)*$/
That's the general idea, but I don't think it's possible to invert every regex--not when the original regexes can include positive and negative lookarounds, reluctant and possessive quantifiers, and who-knows-what.
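As a rough check of the idea in JavaScript (which uses the Perl-style "contains" semantics for test), the sketch below compares the inverted pattern with simply negating the original. The sample strings are made up, and none of them contain line breaks, since . would not consume those.

```javascript
const original = /[ab]./;
const inverted = /^(?:(?![ab].).)*$/;

// Hypothetical sample inputs without line terminators.
const samples = ["abc", "xyz", "xa", "bx", ""];

// For each sample, the inverted pattern should agree with !original.test(s).
const results = samples.map((s) => ({
  input: s,
  original: original.test(s),
  inverted: inverted.test(s),
}));

console.log(results);
```

Note that "xa" does not match the original, because the 'a' has no character after it, so the inverted pattern accepts it.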
You can invert the character set by writing a ^ at the start ([^…]). So the opposite expression of [ab] (match either a or b) is [^ab] (match neither a nor b).
But the more complex your expression gets, the more complex is the complementary expression too. An example:
Say you want to match the literal foo. An expression that matches anything except a string containing foo would have to match either
any string that’s shorter than foo (^.{0,2}$), or
any three-character string that’s not foo (^([^f]..|f[^o].|fo[^o])$), or
any longer string that does not contain foo.
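All together, something like this is the idea:
^[^fo]*(f+($|[^o]|o($|[^fo]*)))*$
But note: this applies only to foo, and the pattern as written is not a complete complement (for example, it rejects the string "o", which does not contain foo).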
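Because hand-built complements are easy to get subtly wrong, one hedged way to sanity-check a candidate is to compare it with a plain substring test. The sketch below does that for a lookahead-based pattern (a swapped-in technique, not the hand-built expression above); the sample strings and names are illustrative only.

```javascript
// Lookahead-based "does not contain foo" pattern (assumes no line breaks in the input).
const noFoo = /^(?:(?!foo).)*$/;

// Hypothetical sample inputs.
const samples = ["", "o", "fo", "foo", "xfoo", "food", "ffo", "fofo", "bar"];

// A sample is a mismatch if the pattern accepts a string that contains foo,
// or rejects a string that does not.
const mismatches = samples.filter((s) => noFoo.test(s) === s.includes("foo"));

console.log(mismatches); // []
```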
You can also do this in Python by using re.split and splitting on your regular expression, which returns all the parts that don't match the regex (see "how to find the converse of a regex").
In perl you can anti-match with $string !~ /regex/;.
With grep, you can use --invert-match or -v.
Java regexes have an interesting way of doing this (can test here): you create a possessive optional match for the string you want to exclude, and then match data after it. If the possessive match fails, it's optional, so it doesn't matter; if it succeeds, the second expression still needs some extra data to match, and so the whole match fails.
It looks counter-intuitive, but works.
E.g., (foo)?+.+ matches bar, foox and xfoo but won't match foo (or an empty string).
It might be possible in other dialects, but I couldn't get it to work myself (they seem more willing to backtrack if the second match fails?).
I have a regular expression which removes html code from a String :
var html = "<p>Dear sms,</p><p>This is a test notification for push message from center II.</p>";
var text = html.replace(/(<([^>]+)>)/ig, "");
alert(text);
This is the expression working on jsfiddle : http://jsfiddle.net/VgHr3/53/
The regular expression itself is /(<([^>]+)>)/ig. I don't fully understand how this expression works. Can someone provide an explanation? I can find out how each character behaves by itself by reading a cheat sheet: http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/
But what is the significance of "/ig" ?
Those are flags (pattern modifiers). Your cheat sheet actually lists them on the right side:
Regular Expressions Pattern Modifiers
g Global match
i Case-insensitive
m Multiple lines
s Treat string as single line
x Allow comments and white space in pattern
e Evaluate replacement
U Ungreedy pattern
Note that not all of these flags are supported by the JavaScript regular expression engine. For an authoritative list, see this MDN article.
So the "g" flag makes it global, so it replaces this pattern wherever it is found, instead of just the first instance (which is the default behavior of the replace method).
The "i" flag makes it case-insensitive, so a pattern like [a-z]+ will match "foo" and "FOO". However, because your pattern only involves < and > characters, this flag is useless.
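A short sketch of both flags in action, using the HTML string from the question. The variable names and the "foo FOO" sample are illustrative only.

```javascript
const html = "<p>Dear sms,</p><p>This is a test notification for push message from center II.</p>";

// g: strip every tag, not just the first one.
const allTags = html.replace(/(<([^>]+)>)/g, "");

// Without g, only the first <p> is removed.
const firstTag = html.replace(/(<([^>]+)>)/, "");

// i: only matters when the pattern contains letters.
const withI = "foo FOO".match(/[a-z]+/gi);
const withoutI = "foo FOO".match(/[a-z]+/g);

console.log(allTags);  // Dear sms,This is a test notification for push message from center II.
console.log(firstTag); // Dear sms,</p><p>This is a test notification for push message from center II.</p>
console.log(withI);    // [ 'foo', 'FOO' ]
console.log(withoutI); // [ 'foo' ]
```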
I'm really struggling with the Javascript version of Regular Expression matching, despite knowing how to do it in other languages like C# and PHP.
I wish to match {ANYCHARACTERS}.
It must have:
a { at the start
a } at the end
1 or more characters between (any characters, symbols etc.)
So far I have the following:
<script type="text/javascript">
// The string that I want to perform a match on
var str = "{ASTRINGINHERE£$%^&*éáó}";
// My matching expression
var patt1 = ^/{(.*){1,*}/}$/i;
// Write the matched result
document.write(str.match(patt1));
</script>
As written, your current pattern should result in a javascript syntax error. Here are the problems I see:
You have your ^ character outside the actual regular expression.
You have two regular expression ending characters (/).
See @kopischke's answer on why I removed the {1,} portion.
This should resolve your issues:
/^{(.+)}$/i
The string start / string end codes belong inside the regex. Also, your repetition code is unnecessarily complex. Finally, there is no need to indicate case independence when you match any character. This should do:
patt1 = /^{.+}$/
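A minimal sketch of the corrected pattern in use. Escaping the braces as \{ and \} is an assumption made here for unambiguity (it also works when the u flag is added); the test strings other than the one from the question are made up.

```javascript
// Anchors live inside the slashes; the capture group holds the text between the braces.
const pattern = /^\{(.+)\}$/;

const str = "{ASTRINGINHERE£$%^&*éáó}";
const match = str.match(pattern);
const inner = match ? match[1] : null;

// Needs at least one character between the braces, and nothing outside them.
const emptyBraces = "{}".match(pattern);
const leadingText = "x{abc}".match(pattern);

console.log(inner);       // ASTRINGINHERE£$%^&*éáó
console.log(emptyBraces); // null
console.log(leadingText); // null
```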
replace(/[^0-9]/g,''));
Replace is a method
What does / indicate?
What does ^ indicate along with 0-9
What does /g indicate?
Do we need to start a regular expression with / or can we start with anything?
The / introduces a regular expression literal (just like " and ' introduce string literals). A regular expression literal is in the form /expression/flags, where expression is the body of the expression, and flags are optional flags (i for case-insensitive, g for global, m for multi-line stuff).
The ^ as the first character within [] means any character not matching the following. So [^0-9] means "any character except 0 through 9".
The /g ends the regular expression literal and includes the "global" flag on it. Without the g, replace would only replace the first match, not all of them.
In all, what that does is replace any character that isn't 0 through 9 with an empty string, i.e., it removes non-digits. It could be written more simply as:
var result = str.replace(/\D/g, '');
...because \D (note that's an upper-case D) means "non-digit".
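For comparison, a small sketch showing both spellings side by side, plus what happens without the g flag. The phone-number-like input is invented for the example.

```javascript
// Hypothetical input mixing digits and other characters.
const raw = "(555) 123-4567 ext. 89";

const viaClass = raw.replace(/[^0-9]/g, ""); // negated character class
const viaShorthand = raw.replace(/\D/g, ""); // \D shorthand for "non-digit"
const noGlobal = raw.replace(/[^0-9]/, "");  // no g: only the first non-digit goes

console.log(viaClass);     // 555123456789
console.log(viaShorthand); // 555123456789
console.log(noGlobal);     // 555) 123-4567 ext. 89
```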
MDC has a decent page on regular expressions.
The / and / are the start and end of the regex pattern; the g means global (anything after the 2nd / is an optional modifier for the regex).
^ at the start of [] means not.
So in this case it'll remove any character that isn't a number.
See the manual for replace
See regular expression literals
See using special characters
See searching with flags
replace is method of string type
/ / indicates there's a regular expression inside of them
^ inside [] means "not"
"g" means to replace globally
regular expression literals in JavaScript should be put inside a pair of "/"
This W3 Schools tutorial should cover most of the basics. This other tutorial covers the flags, such as /g, which can be passed to the regex engine.
yes
start and end of regex
not, that just basically means, match any non-digit
global replacement, the effect of not having that is replacement only done for the first encounter.
At least in javascript, yes you have to use /.
/ indicates the beginning and end of the regexp. Hence in your case [^0-9] is the regex.
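^ as the first character inside [] negates the character class, so [^0-9] matches any non-digit (outside of [], ^ indicates the start of a line)
/g indicates that the substitution takes place for all the matches (global), and not only for the first match.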
/g enables "global" matching. When using the replace() method, specify this modifier to replace all matches, rather than only the first one.
/ start regex
^ match except the symbols 0-9
Well, as to creating one, this forum is not the best for that -- it is a rather large question, best left to one of the best resources on RegExp that I know of.
It looks like you're in JS, so:
replace is a method of String. It replaces the provided expression with the second string, in this case nothing.
In JavaScript, / must begin and end every regex literal, and any / in the middle must be escaped with a \ (so it looks like this: \/). In other languages (PHP and Perl being some of the most prominent), you can use other delimiter characters such as ~ and #.
^ inside of [] means negation and - means range, so [^0-9] means "not 0, 1, 2, 3, 4, 5, 6, 7, 8 or 9". [0-9] does have a shorthand of \d, so /[^\d]/g is a valid, alternate way to say the same thing.
/g means "global", as in "match all occurrences, not just the first".
Your expression means, "replace all non-digits with nothing".
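As a side note that goes slightly beyond the answer above: the slashes are only required for regex literals. The built-in RegExp constructor takes the pattern as a string, with the flags as a second argument. The sketch below, with a made-up sample string, shows the literal, the \d shorthand, and the constructor form producing the same result.

```javascript
const literal = /[^0-9]/g;                     // regex literal
const shorthand = /[^\d]/g;                    // same class written with \d
const constructed = new RegExp("[^0-9]", "g"); // no slashes needed here

// Hypothetical input.
const sample = "Order #42, item 7";

const a = sample.replace(literal, "");
const b = sample.replace(shorthand, "");
const c = sample.replace(constructed, "");

console.log(a, b, c);            // 427 427 427
console.log(constructed.source); // [^0-9]
console.log(constructed.flags);  // g
```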
The / encapsulate your pattern (you need to escape / with \ if you want to use it in pattern)
and the trailing character after the slashes are modifiers. 'g' in this case means global search (i.e. find all matches)
^ is negation.. [0-9] is range indicating all numbers from 0 to 9.
so [^0-9] means anything except numbers
So This regex basically replaces anything except numbers in the string with '' (i.e. remove them)
Regex has lots of other features, you should research them!
What it does: Removes all non-numeric (0-9) characters.
The forward slash (/) is used when you declare RegExp literals
The [^0-9] means any character OTHER THAN 0-9. The ^ means "other than". You can remove it and it'll look for only a character 0-9.
The /g represents global replacement.
So this will look for any non-number character and replace it with nothing.
As Shamim notes, regular-expressions.info is a great site. Best of luck!
You can try out javascript regex's on this site: http://regexpal.com/
Coupled with http://www.regular-expressions.info/tutorial, it's a great resource for learning.
Can someone explain this regular expression to validate email?
var emailExp = /^[\w\-\.\+]+\@[a-zA-Z0-9\.\-]+\.[a-zA-z0-9]{2,4}$/;
I need to know what these individual elements do:
"/^" and "\" and "\.\-" and "$" //Please explain individually
Thanks in advance
Quick explanation
/
JavaScript regular expression literals start with a / and end with another one. Everything in between is the regular expression. After the second / there may be switches like g (global) and/or i (ignore case), e.g. var rx = /.+/gi;
^
Start of a text line (so nothing can be prepended before the email address). This also comes in handy in multi-line texts.
\
Used to escape special characters. A dot/full-stop . is a special character and represents any single character, but when written as \. it means a literal dot/full-stop. Characters that need to be escaped are those used in regular expression syntax (parentheses, curly braces, square brackets etc.). You'll know them when you learn the syntax.
\.\-
Two escaped characters. Dot/full-stop and a minus/hyphen. So it means .-
$
End of line.
Learn regular expressions
They are one of the imperative things every developer should understand to some extent. At least some basic knowledge is mandatory.
Some resources
General regular expression syntax resource
http://www.regular-expressions.info/
JavaScript related regular expressions
https://developer.mozilla.org/en/Core_JavaScript_1.5_Guide/Regular_Expressions
/
The start of the expression
^
The start of the string (since it appears at the start of the expression)
\
Nothing outside the context of the character that follows it
\.\-
A full stop. A hyphen.
$
The end of the string
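The effect of each element is easier to see on tiny patterns than on the full email expression. The following sketch uses made-up strings; only the ^, $, \. and \- elements come from the question.

```javascript
// An unescaped dot matches any single character; \. matches only a literal dot.
const dotAny = /a.c/.test("abc");          // true
const dotLiteralMiss = /a\.c/.test("abc"); // false
const dotLiteralHit = /a\.c/.test("a.c");  // true

// Without anchors the pattern may match anywhere; ^ and $ pin it to the whole string.
const unanchored = /cat/.test("concat");   // true
const anchored = /^cat$/.test("concat");   // false
const anchoredExact = /^cat$/.test("cat"); // true

// Inside a character class, \. and \- stand for a literal dot and hyphen.
const dotOrHyphen = /^[\.\-]+$/.test(".-.-"); // true

console.log(dotAny, dotLiteralMiss, dotLiteralHit);
console.log(unanchored, anchored, anchoredExact);
console.log(dotOrHyphen);
```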
The other posters have done an excellent job at explaining this regex, but if your goal is to actually do e-mail validation in JavaScript, please check out this StackOverflow thread.