I am looking for the occurrence of the pattern a:b. This could be of the form
a: b or
a :b or
a : b
(note the optional spaces).
I am new to regexes and tried something of the form a\s:\sb, but it didn't work.
Can somebody point me to the right one?
Thanks..
Your current regex is 'a\s:\sb'. Since you didn't make the '\s' part of the pattern optional, this only matches 'a[SPACE]:[SPACE]b', where I am using [SPACE] as a stand-in for space, tab, or any other whitespace character. Instead, you can use 'a\s?:\s?b', which makes the whitespace optional (zero or one whitespace character on each side of the colon).
For more regular expression information, I would recommend the Perl regular expression tutorial.
Try a\s*:\s*b. Also, this is handy to test: http://www.pagecolumn.com/tool/regtest.htm
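To make the difference between the two suggestions concrete, here is a small sketch in JavaScript. The question doesn't name a language, so that choice is an assumption, and the variable names and the extra doubled-space input are made up for illustration.

```javascript
// Patterns from the thread: the original attempt and the two suggested fixes.
const original = /a\s:\sb/;     // exactly one whitespace character on each side
const zeroOrOne = /a\s?:\s?b/;  // at most one whitespace character on each side
const zeroOrMore = /a\s*:\s*b/; // any amount of whitespace on each side

// Illustrative inputs: the four forms from the question, plus doubled spaces.
const samples = ["a:b", "a: b", "a :b", "a : b", "a  :  b"];

// Collect the samples that a given pattern matches.
function matching(pattern) {
  return samples.filter((s) => pattern.test(s));
}

const results = {
  original: matching(original),
  zeroOrOne: matching(zeroOrOne),
  zeroOrMore: matching(zeroOrMore),
};
console.log(results);
```

With these inputs, \s? stops matching as soon as there is more than one whitespace character on a side, so \s* is probably the safer choice if the amount of whitespace can vary.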
Is it possible to write a regex that returns the converse of a desired result? Regexes are usually inclusive - finding matches. I want to be able to transform a regex into its opposite - asserting that there are no matches. Is this possible? If so, how?
http://zijab.blogspot.com/2008/09/finding-opposite-of-regular-expression.html states that you should bracket your regex with /^((?!^ MYREGEX ).)*$/, but this doesn't seem to work. If I have the regex /[a|b]./, the string "abc" returns false with both my regex and the converse suggested by zijab, /^((?!^[a|b].).)*$/. Is it possible to write a regex's converse, or am I thinking incorrectly?
Couldn't you just check to see if there are no matches? I don't know what language you are using, but how about this pseudocode?
if (!'Some String'.match(someRegularExpression)) {
    // do something...
}
If you can only change the regex, then the one you got from your link should work:
/^((?!REGULAR_EXPRESSION_HERE).)*$/
The reason your inverted regex isn't working is because of the '^' inside the negative lookahead:
/^((?!^[ab].).)*$/
      ^   # WRONG
Maybe it's different in vim, but in every regex flavor I'm familiar with, the caret matches the beginning of the string (or the beginning of a line in multiline mode). But I think that was just a typo in the blog entry.
You also need to take into account the semantics of the regex tool you're using. For example, in Perl, this is true:
"abc" =~ /[ab]./
But in Java, this isn't:
"abc".matches("[ab].")
That's because the regex passed to the matches() method is implicitly anchored at both ends (i.e., /^[ab].$/).
Taking the more common, Perl semantics, /[ab]./ means the target string contains a sequence consisting of an 'a' or 'b' followed by at least one (non-line separator) character. In other words, at ANY point, the condition is TRUE. The inverse of that statement is, at EVERY point the condition is FALSE. That means, before you consume each character, you perform a negative lookahead to confirm that the character isn't the beginning of a matching sequence:
(?![ab].).
And you have to examine every character, so the regex has to be anchored at both ends:
/^(?:(?![ab].).)*$/
That's the general idea, but I don't think it's possible to invert every regex--not when the original regexes can include positive and negative lookarounds, reluctant and possessive quantifiers, and who-knows-what.
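As a rough illustration of that idea, the sketch below wraps a pattern source in the anchored negative-lookahead form and compares it with the original. It is written in JavaScript as an assumption (the thread mentions Perl and Java), invertPattern is a hypothetical helper name, and it assumes single-line input, since . does not match line terminators by default.

```javascript
// Hypothetical helper: builds the "no match at any position" form of a pattern.
function invertPattern(source) {
  // Before consuming each character, assert that `source` does not match here.
  return new RegExp("^(?:(?!" + source + ").)*$");
}

const original = /[ab]./;                // true if the string contains a match
const inverted = invertPattern("[ab]."); // true if it contains no match

for (const sample of ["abc", "xyz", "xa", ""]) {
  console.log(JSON.stringify(sample), original.test(sample), inverted.test(sample));
}
```

For these simple inputs the two results are always opposite, but, as noted above, that is not guaranteed for arbitrary patterns.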
You can invert the character set by writing a ^ at the start ([^…]). So the opposite expression of [ab] (match either a or b) is [^ab] (match neither a nor b).
But the more complex your expression gets, the more complex is the complementary expression too. An example:
You want to match the literal foo. An expression that matches anything except a string containing foo would have to match either
any string that’s shorter than foo (^.{0,2}$), or
any three-character string that’s not foo (^([^f]..|f[^o].|fo[^o])$), or
any longer string that does not contain foo.
All together this may work:
^[^fo]*(f+($|[^o]|o($|[^fo]*)))*$
But note: this applies only to foo.
You can also do this (in Python) by using re.split to split on your regular expression, which returns all the parts that don't match the regex; see "how to find the converse of a regex".
In perl you can anti-match with $string !~ /regex/;.
With grep, you can use --invert-match or -v.
Java regexes have an interesting way of doing this: you create a possessive optional match (the ?+ quantifier) for the string you want to exclude, and then match data after it. If the optional match fails, it doesn't matter, because it is optional. If it succeeds, the second expression still needs some extra data to match, and because a possessive quantifier never gives back what it matched, the overall match fails when nothing is left.
It looks counter-intuitive, but works.
E.g., (foo)?+.+ matches bar, foox and xfoo but won't match foo (or an empty string) when the whole string has to match, as with matches().
It might be possible in other dialects, but I couldn't get it to work myself (they seem more willing to backtrack if the second match fails?).
I'm formatting a datetime string in javascript by Regex, so I want:
Find and replace all d characters when d is not within other alphabetic characters. like this:
Find and replace all dd characters when dd is not within other alphabetic characters. like this:
I tested the /\bd\b/mg pattern, but its result is not always what I want.
How should I exclude unwanted cases in the following command:
str = str.replace(/\bd\b/mg, number);
The regular expression you posted treats _ as a word character, so \b does not match between _ and d, and the character is not replaced as expected.
To handle this character as well, either before or after the d to be replaced, you can use expressions similar to these:
To replace d:
/(\b|_)(d)(\b|_)/mg
To replace dd:
/(\b|_)(dd)(\b|_)/mg
Or to replace both in the same way (if it's acceptable):
/(\b|_)(d|dd)(\b|_)/mg
In comments under this answer in another thread on StackOverflow, it was also suggested to use a library that can format dates, instead of implementing it yourself.
UPDATE: As someone mentioned, another issue is that including _ in the regular expression removes it during the replacement. However, you can call replace and refer to the capturing groups, like this:
str.replace(/(\b|_)(d)(\b|_)/mg, "$1" + number + "$3")
I've updated earlier expressions posted in this answer to work with this method.
Please note that I'm not aware of all the cases you want to consider, so if you have any problems using the suggested solution, or it does not work as expected in your case, please let me know so I can try to help.
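A minimal sketch of the capturing-group approach, for illustration only: replaceDayToken and the sample format strings are hypothetical, and a replacement function is used instead of the "$1" + number + "$3" string purely as a design choice, so that the inserted digits can never be read as part of a group reference.

```javascript
// Hypothetical helper: replaces a standalone "d" token with the given day,
// keeping whatever was matched on either side (nothing, or an underscore).
function replaceDayToken(format, day) {
  return format.replace(/(\b|_)(d)(\b|_)/g, (match, before, token, after) => {
    return before + String(day) + after;
  });
}

console.log(replaceDayToken("d/m/y", 7));  // "7/m/y"
console.log(replaceDayToken("day_d", 7));  // "day_7"
console.log(replaceDayToken("dd-d", 7));   // "dd-7"
```

One limitation of this sketch: because the surrounding underscore is consumed by the match, two tokens that share a single underscore (such as _d_d_) would not both be replaced in one pass.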
You could use a lookahead and, if your regex flavor supports it, a lookbehind as well.
Example lookahead which checks that the following character is not alphabetic:
(?=[^a-zA-Z])
Older JavaScript engines (before ES2018) don't support lookbehind, so there you will need to use a capturing group and refer back to it.
For such JS engines, capture the part in the outermost parentheses and then refer to it by its group number ($1, $2... in a JavaScript replacement string):
[^a-zA-Z](d(?=[^a-zA-Z]))
Flavors with lookbehind support can use:
(?<=[^a-zA-Z])d(?=[^a-zA-Z])
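Here is a hedged variation on the lookaround idea in JavaScript. It assumes an engine with lookbehind support (ES2018 or later, e.g. current Node.js), and it swaps in negative lookarounds, (?<![a-zA-Z]) and (?![a-zA-Z]), so that a d at the very start or end of the string also qualifies. The function name and sample input are made up for illustration.

```javascript
// Hypothetical helper: replaces every "d" that has no letter directly
// before or after it. Requires lookbehind support (ES2018+).
function replaceStandaloneD(text, value) {
  return text.replace(/(?<![a-zA-Z])d(?![a-zA-Z])/g, () => String(value));
}

console.log(replaceStandaloneD("d/dd and_d", 3)); // "3/dd and_3"
```

Since lookarounds consume nothing, the characters around d are left untouched and no group references are needed in the replacement.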
I'm trying to create a function with a regex that can decide if my string value is correct or not. It should be true, if the string begins with lower or uppercase alphabetical characters or underscore. If it begins with any others, the function must return false.
My test input is something like this: ".dasfh"
The expressions I tried to use were [_a-zA-Z]... and [:alpha:]..., but both of them returned true.
I tried a bit easier task also:
"Hadfg" where the expression is [a-z]...: returns true
BUT
"hadfg" where the expression is [A-Z]...: returns false
Could anybody help me to understand this behaviour?
You're trying to require that the first character in the string is something in particular, which means you have to tell the regex that it must look at the first character in the string.
The regex engine just tries to find any match in the entire string.
All you're telling it with [a-z] is "find me a lowercase character anywhere in the string". This means that:
"Hadfg" will equal true because it can find a, d, f or g as a match.
"HADFG" will equal false because there are no lowercase letters.
The same will happen for "hADFG" when matched with [A-Z], for instance: it will be able to find an A, D, F or G as a match, whereas "hadfg" will return false because there is no uppercase character.
What you are looking for here is ^ in your regex: it is an anchor that matches the start of the string (or the start of a line in multiline mode).
So when you apply this to your regex it will look like this: /^[a-z]/.
The regex on the previous line basically says "from the start of the string, is the first character following up a lowercase a-z?"
Try it out and you'll see.
For your solution you'd need /^[_a-zA-Z]/ to check if the first character is an _, a-z or A-Z character.
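A quick way to see the effect of the anchor, as a sketch in JavaScript (the function and variable names are invented for illustration; the inputs come from the question):

```javascript
// Unanchored: succeeds if a lowercase letter appears anywhere in the string.
const anywhere = /[a-z]/;

// Anchored: succeeds only if the first character is a letter or an underscore.
const atStart = /^[_a-zA-Z]/;

// Hypothetical helper wrapping the anchored check.
function startsWithLetterOrUnderscore(value) {
  return atStart.test(value);
}

console.log(anywhere.test(".dasfh"));                // true
console.log(startsWithLetterOrUnderscore(".dasfh")); // false
console.log(startsWithLetterOrUnderscore("Hadfg"));  // true
console.log(startsWithLetterOrUnderscore("_value")); // true
```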
For reference, you can find cheatsheets within these tools (and test your regexes with them, of course!):
Regexr - My personal favorite (Uses your browsers JS regex engine)
Rubular - A Ruby regex tester
Regex101 - A Python / PCRE / PHP / JavaScript regex tester
And for a reference or tutorial (I'd recommend reading from start to finish if you want to start understanding regexes and how they work) there's regular-expressions.info.
Regex is never easy and be careful with what you do with it, it's a powerful but sometimes ugly beast to deal with :)
PS
I see you tagged your question as email-validation, so I'll add a little bonus regex that checks the minimum requirements for something to look like an email address. I use this one personally:
.+@.+\..{2,}
which when broken up, looks like this:
.+ - one or more of any character
@ - followed by a literal @ character
.+ - one or more of any character
\. - followed by a literal . character
.{2,} - two or more of any character
Optionally you could replace {2,} with a + to make it one or more but this would allow a TLD with 1 character.
To see an RFC email regex at work, check this link.
When I look at that regex I basically just want to cry in a corner somewhere. There are definitely things you cannot do in an email address that my regex doesn't catch, but at least it makes sure the input looks like something e-mailable. If a new user decides to fill in some bull, that's not my problem anymore, and I wouldn't want to force them to change one character just because the huge regex doesn't agree with it either.
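For what it's worth, here is how that minimal check might be used in JavaScript, assuming @ is the separator in the pattern. isPlausibleEmail and the sample addresses are hypothetical, and this is only a plausibility check, not real validation.

```javascript
// Minimal "looks like an email" pattern: something, @, something, a dot,
// and at least two more characters.
const looksLikeEmail = /.+@.+\..{2,}/;

// Hypothetical helper around the pattern.
function isPlausibleEmail(value) {
  return looksLikeEmail.test(value);
}

console.log(isPlausibleEmail("user@example.com")); // true
console.log(isPlausibleEmail("user@example"));     // false (no dot after the @)
console.log(isPlausibleEmail("a@b.c"));            // false (only one character after the dot)
```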
Is there any reason the following string should fail the regular expression below?
String: "http://devices/"
Expression:
/^(http:\/\/|https:\/\/|ftp:\/\/|www.|pop:\/\/|imap:\/\/){1}([0-9A-Za-z]+\.)/.test(input.val())
Thank you for your consideration.
Yes it will fail because of the last dot . in your regular expression.
/^ ... \.)/
       ^^
There is not one in the string you are validating against.
http://devices/
              ^ Should be a dot, not a forward slash
If you are planning on using regex to do this, I would probably prefer using a RegExp Object to avoid all the escaping, or group the prefixes together using a non-capturing group.
/^((?:https?|ftp|pop|imap):\/{2}|www\.) ... $/
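To check this behaviour, the sketch below runs the question's expression and a grouped variant against the failing string and against a host that does contain a dot. The grouped pattern fills the elided middle part with the question's own second group and leaves off the trailing $, which is an assumption made for illustration; http://devices.local/ is an invented test input.

```javascript
// The expression from the question.
const original = /^(http:\/\/|https:\/\/|ftp:\/\/|www.|pop:\/\/|imap:\/\/){1}([0-9A-Za-z]+\.)/;

// Same idea with the prefixes grouped in a non-capturing group (assumed completion).
const grouped = /^((?:https?|ftp|pop|imap):\/{2}|www\.)([0-9A-Za-z]+\.)/;

for (const url of ["http://devices/", "http://devices.local/", "www.example.com"]) {
  console.log(url, original.test(url), grouped.test(url));
}
```

Both patterns reject http://devices/ for the same reason: after the letters of the host name they require a literal dot, and the string has a slash there.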
The run of letters and digits after the prefix must be followed by a period; see the "\." at the end of the regex.
You can use http://rubular.com/ to test simple regex expressions and what their matches are.
It's failing because you are using:
^(http:\/\/|https:\/\/|ftp:\/\/|www.|pop:\/\/|imap:\/\/){1}([0-9A-Za-z]+\.)
and you should use:
^(http:\/\/|https:\/\/|ftp:\/\/|www.|pop:\/\/|imap:\/\/){1}([0-9A-Za-z]+.)
You don't have to escape the final . (unescaped, it matches any character, including the trailing /).
You also need to close the regex with a $.
In the final part, .), the dot should be optional for this string to validate.
To satisfy "http://devices/", the regex (in Java at least) is:
^((http://)|(https://)|(ftp://)|(pop://)|(imap://)){1}(www.)?([0-9A-Za-z]+)(.)?([0-9A-Za-z]+)?$
Are those / at the beginning and the end code delimiters?
I want to match a string with following regular expression -
^\d{4}-\d{5}$|^\d{4}-\d{6}$
which is a regex for a zip code with 4 digits, then a dash, then 5 or 6 digits.
I believe my regex is correct, as I have tested it in an online regex tester.
To match my string against the above regex in jQuery, I am using:
var regExpTest = new RegExp("^\d{4}-\d{5}$|^\d{4}-\d{6}$");
alert(regExpTest.test("1234-123456"));
But I am always getting false. Can anyone please tell me what is going wrong here?
Thank you!
Because the regular expression constructor takes a string as its argument, you need to escape the backslash \ wherever you use it. In your example, anywhere you have a \d needs to be \\d. You can see what happens if you don't by testing your code in Firebug or Chrome's developer tools:
new RegExp("^\d{4}-\d{5}$|^\d{4}-\d{6}$");
//-> /^d{4}-d{5}$|^d{4}-d{6}$/
Notice the backslashes are gone? Now watch what happens when we escape each backslash:
new RegExp("^\\d{4}-\\d{5}$|^\\d{4}-\\d{6}$");
//-> /^\d{4}-\d{5}$|^\d{4}-\d{6}$/
So that should fix your problem. However, it's much easier to use the literal grammar for regular expressions when you're not using a variable to create them:
var regExpTest = /^\d{4}-\d{5}$|^\d{4}-\d{6}$/;
alert(regExpTest.test("1234-123456"));
//-> "true"
This way, you can write the expression without having to worry about double-escaping.
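The sketch below puts the variants side by side. Two things are assumptions for illustration rather than part of the answer above: the pattern is shortened to the equivalent ^\d{4}-\d{5,6}$, and String.raw is shown as one more way to build the pattern from a string without doubling the backslashes.

```javascript
// "\d" inside an ordinary string literal is just "d", so the digit class is lost.
const unescaped = new RegExp("^\d{4}-\d{5,6}$");

// Doubling the backslashes keeps \d intact.
const escaped = new RegExp("^\\d{4}-\\d{5,6}$");

// String.raw leaves backslashes untouched, so no doubling is needed.
const raw = new RegExp(String.raw`^\d{4}-\d{5,6}$`);

// A regex literal needs no string escaping at all.
const literal = /^\d{4}-\d{5,6}$/;

console.log(unescaped.source); // ^d{4}-d{5,6}$
for (const re of [unescaped, escaped, raw, literal]) {
  console.log(re.test("1234-123456"));
}
```

When the pattern really does have to be built from variables, the escaped or String.raw forms are the ones to reach for; otherwise the literal is the simplest.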