I'm after a regular expression that matches a UK Currency (ie. £13.00, £9,999.99 and £12,333,333.02), but does not allow negative (-£2.17) or zero values (£0.00 or 0).
I've tried to create one myself, but I've got in a right muddle!
Any help gratefully received.
Thanks!
This'll do it (well mostly...)
/^£?[1-9]\d{0,2}(,\d{3})*(\.\d{2})?$/
Leverages the ^ and $ to make sure no negative or other character is in the string, and assumes that commas will be used. The pound symbol, and pence are optional.
edit: realised you said non-zero so replaced the first \d with [1-9]
Update: it's been pointed out the above won't match £0.01. The below improvement will but now there's a level of complexity where it may quite possibly be better to test /[1-9]/ first and then the above - haven't benchmarked it.
/^£?(([1-9]\d{0,2}(,\d{3})*(\.\d{2})?)|(0\.[1-9]\d)|(0\.0[1-9]))$/
Brief explanation:
Match beginning of string followed by optional "£"
Then match either:
a >=£1 amount with potential for comma separated groupings and optional pence
OR a <£1 >=£0.10 amount
OR a <=£0.09 amount
Then match end of string
The more fractional-pence digits you need the regex to support (none in the above), the less efficient it becomes.
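The three alternatives described above can be checked against the examples from the question. This is only a minimal sketch: the helper name isUkAmount is hypothetical, and the leading group is written as [1-9]\d{0,2} on the assumption that amounts such as £10.00 should be accepted.

```javascript
// Hypothetical helper; the name isUkAmount does not come from the thread.
// Alternatives: amounts of £1 or more | £0.10 to £0.99 | £0.01 to £0.09.
const ukAmount = /^£?(?:[1-9]\d{0,2}(?:,\d{3})*(?:\.\d{2})?|0\.[1-9]\d|0\.0[1-9])$/;

function isUkAmount(text) {
  return ukAmount.test(text);
}

const samples = ['£13.00', '£9,999.99', '£12,333,333.02', '£0.01', '-£2.17', '£0.00', '0'];
for (const sample of samples) {
  console.log(sample, isUkAmount(sample));
}
```

The first four samples should be accepted and the last three rejected.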
Under Unix/Linux, it's not always possible to type in the '£' sign in a JavaScript file, so I tend to use its hexadecimal representation, thus:
/^\xA3?\d{1,3}?([,]\d{3}|\d)*?([.]\d{1,2})?$/
This seems to take care of all combinations of UK currency amounts representation that I have come across.
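A quick way to convince yourself that \xA3 and a literal £ are interchangeable is to run the pattern quoted above against a few strings. This sketch assumes the source file is saved as UTF-8 so that the literal £ in the test strings is read correctly.

```javascript
// \xA3 is the hexadecimal escape for character code 163, the pound sign.
console.log('\xA3' === '£');

const hexPound = /^\xA3?\d{1,3}?([,]\d{3}|\d)*?([.]\d{1,2})?$/;
console.log(hexPound.test('£1,234.50')); // amount with pound sign
console.log(hexPound.test('1234'));      // pound sign is optional
console.log(hexPound.test('£'));         // no digits at all
```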
/^\xA3?\d{1,}(?:\,?\d+)*(?:\.\d{1,2})?$/;
Explanation:
^ Matches the beginning of the string, or the beginning of a line.
\xA3 Matches a "£" character (char code 163)
? Quantifier for match between 0 and 1 of the preceding token.
\d Matches any digit character (0-9).
{1,} Match 1 or more of the preceding token.
(?: Groups multiple tokens together without creating a capture group.
\, Matches a "," character (char code 44).
{1,2} Match between 1 and 2 of the preceding token.
$ Matches the end of the string, or the end of a line if the multiline flag (m) is enabled.
You could just make two passes:
/^£\d{1,3}(,\d{3})*(\.\d{2})?$/
to validate the format, and
/[1-9]/
to ensure that at least one digit is non-zero.
This is less efficient than doing it in one pass, of course (thanks, annakata, for the benchmark information), but for a first implementation, just "saying what you want" can significantly reduce developing time.
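The two passes can be wrapped in one function. A minimal sketch, where the helper name isPositiveUkAmount is hypothetical and the two patterns are the ones quoted above:

```javascript
// Hypothetical helper; only the two patterns come from the answer above.
const formatOk = /^£\d{1,3}(,\d{3})*(\.\d{2})?$/;
const hasNonZeroDigit = /[1-9]/;

function isPositiveUkAmount(text) {
  // Pass 1 checks the shape of the amount; pass 2 rejects all-zero values.
  return formatOk.test(text) && hasNonZeroDigit.test(text);
}

console.log(isPositiveUkAmount('£0.01'));
console.log(isPositiveUkAmount('£0.00'));
console.log(isPositiveUkAmount('-£2.17'));
```

Only the first call should return true.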
My requirement is to validate a NON-ZERO number. The regular expression that I used is the following.
^[1-9]?\d+(\.\d)?\d*$
VALID VALUES should be
2
22
2222 (up to any number of digits)
2.2222 (up to any number of decimal places)
INVALID VALUES
0
(. without decimal values)
0.1 (any number of decimals, but start digit is 0)
2.4.5 (more than one .)
Basically, any value that starts with 0, has more than one ., or has a . with no digits after it is INCORRECT.
^[1-9]\d*(?:\.\d+)?$
https://regex101.com/r/wHZUoW/1
-Since you don't want the number to start with 0, you shouldn't make the [1-9] at the beginning optional with ?.
-As general good practice, a non-capturing group (?: ... ) is used instead of a capturing group, because the contents do not need to be referenced later.
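The pattern can be run against the valid and invalid values listed in the question. In this sketch the input '2.' is an assumption standing in for the "(. without decimal values)" case.

```javascript
const nonZeroNumber = /^[1-9]\d*(?:\.\d+)?$/;

const valid = ['2', '22', '2222', '2.2222'];
const invalid = ['0', '2.', '0.1', '2.4.5']; // '2.' is an assumed example

console.log(valid.every((s) => nonZeroNumber.test(s)));
console.log(invalid.some((s) => nonZeroNumber.test(s)));
```

The first line should print true and the second false.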
You can use this regex to validate your numbers:
^(?=[1-9])\d*\.?\d+$
It uses a regex proposed by @WiktorStribiżew as a comment to this question to match decimal numbers (^\d*\.?\d+$), and adds a positive lookahead to ensure that the first character of the number is not 0. Note that if you want to allow numbers such as .3, you should add . to the lookahead character class i.e. ^(?=[1-9.])\d*\.?\d+$
Demo on regex101
[1-9]+[0-9]*(\.[0-9]+)? should work for your case.
[1-9]+ will make sure that the expression starts with a number different than 0.
[0-9]* will make sure to allow the expression to have zeros after the first digit.
(\.[0-9]+)? will allow an extension to the expression, which has to have a . and at least 1 number after it. The ? in the end makes it optional.
By the way, I really like this website to test my regular expressions: https://regexr.com/, you should try it.
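One caveat worth checking with the expression above: as written it has no anchors, so whether it rejects a value like 0.1 depends on how it is applied. A small sketch, assuming it is used with JavaScript's test(), which looks for a match anywhere in the string:

```javascript
const unanchored = /[1-9]+[0-9]*(\.[0-9]+)?/;
const anchored = /^[1-9]+[0-9]*(\.[0-9]+)?$/;

console.log(unanchored.test('0.1'));  // finds the "1" inside the string
console.log(anchored.test('0.1'));    // whole string must match
console.log(anchored.test('2.2222'));
```

With ^ and $ added, the whole input has to fit the pattern rather than just part of it.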
I've written a regular expression that matches any number of letters with any number of single spaces between the letters. I would like that regular expression to also enforce a minimum and maximum number of characters, but I'm not sure how to do that (or if it's possible).
My regular expression is:
[A-Za-z](\s?[A-Za-z])+
I realized it was only matching two sets of letters surrounding a single space, so I modified it slightly to fix that. The original question is still the same though.
Is there a way to enforce a minimum of three characters and a maximum of 30?
Yes
Just like + means one or more you can use {3,30} to match between 3 and 30
For example [a-z]{3,30} matches between 3 and 30 lowercase alphabet letters
From the documentation of the Pattern class
X{n,m} X, at least n but not more than m times
In your case, matching 3-30 letters followed by spaces could be accomplished with:
([a-zA-Z]\s){3,30}
That version requires trailing whitespace. If you don't want that, you can use (2-29 times letter+space, then a letter):
([a-zA-Z]\s){2,29}[a-zA-Z]
If you'd like whitespaces to count as characters you need to divide that number by 2 to get
([a-zA-Z]\s){1,14}[a-zA-Z]
You can add \s? to that last one if the trailing whitespace is optional. These were all tested on RegexPlanet
If you'd like the entire string altogether to be between 3 and 30 characters you can use lookaheads adding (?=^.{3,30}$) at the beginning of the RegExp and removing the other size limitations
All that said, in all honesty I'd probably just test the String's .length property. It's more readable.
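A sketch of the .length approach, assuming JavaScript and a hypothetical helper name isValidName. The regex only checks the shape (letters separated by at most one space) and the length is checked separately:

```javascript
// Hypothetical helper; the shape pattern is an assumption based on the question.
function isValidName(text) {
  const shape = /^[A-Za-z](?: ?[A-Za-z])*$/;
  return text.length >= 3 && text.length <= 30 && shape.test(text);
}

console.log(isValidName('AAA'));
console.log(isValidName('A A'));
console.log(isValidName('AA'));
console.log(isValidName('AA  A'));
```

The first two calls should return true; the third is too short and the fourth has two consecutive spaces.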
This is what you are looking for
^[a-zA-Z](\s?[a-zA-Z]){2,29}$
^ is the start of string
$ is the end of string
(\s?[a-zA-Z]){2,29} would match (\s?[a-zA-Z]) 2 to 29 times..
Actually Benjamin's answer will lead to the complete solution to the OP's question.
Using lookaheads it is possible to restrict the total number of characters AND restrict the match to a set combination of letters and (optional) single spaces.
The regex that solves the entire problem would become
(?=^.{3,30}$)^([A-Za-z][\s]?)+$
This will match AAA and A A, and fail to match AA  A since there are two consecutive spaces.
I tested this at http://regexpal.com/ and it does the trick.
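The same checks can be reproduced in a few lines of JavaScript. This is only a sketch of the lookahead pattern quoted above, not an exhaustive test:

```javascript
// The lookahead bounds the total length; the rest restricts the characters.
const bounded = /(?=^.{3,30}$)^([A-Za-z][\s]?)+$/;

console.log(bounded.test('AAA'));
console.log(bounded.test('A A'));
console.log(bounded.test('AA  A')); // two consecutive spaces
console.log(bounded.test('AA'));    // shorter than three characters
```

Note that the optional [\s]? after each letter also allows a single trailing whitespace character.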
You should use
[a-zA-Z ]{20}
[For allowed characters]{for limiting of the number of characters}
Thanks for taking a look.
My goal is to come up with a regexp that will match input that contains no digits, whitespace or the symbols !#£$%^&*()+= or any other symbol I may choose.
I am however struggling to grasp precisely how regular expressions work.
I started out with the simple pattern /\D/, which from my understanding will match the first non-digit character it can find. This would match the string 'James' which is correct but also 'James1' which I don't want.
So, my understanding is that if I want to ensure that a pattern is not found anywhere in a given string, I use the ^ and $ characters, as in /^\D$/. Now because this will only match a single character that is not a digit, I needed to use + to specify that 1 or more digits should not be found in the entire string, giving me the expression /^\D+$/. Brilliant, it no longer matches 'James1'.
Question 1
Is my reasoning up to this point correct?
The next requirement was to ensure no whitespace is in the given string. \s will match a single whitespace and [^\s] will match the first non-whitespace character. So, from my understanding I just had to add this to what I have already to match strings that contain no digits and no whitespace. Again, because [^\s] will only match a single non-whitespace character, I used + to match one or more non-whitespace characters, giving the new regexp of /^\D+[^\s]+$/.
This is where I got lost, as the expression now matches 'James1' or even 'James Smith25'. What? Massively confused at this point.
Question 2
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Question 3
How would I go about writing the regular expression I'm trying to solve?
While I am keen to solve the problem I am more interested in figuring where my understanding of regular expressions is lacking, so any explanations would be helpful.
Not quite. ^ and $ are "anchors": they mean "start" and "end". It's a little more complicated than that, but for now you can take them to mean the start and end of a line; look up the various regular expression modifiers if you want to learn more. Unfortunately ^ has an overloaded meaning: as the first character inside square brackets it means "not", which is the meaning you are already acquainted with. It's very important to understand the difference between these two meanings, and that the definition in your head applies only to character class matching!
Contributing further to your confusion is that \d means "a numerical digit" and \D means "not a numerical digit". Similarly \s means "a whitespace (space/tab/newline/etc.) character" and \S means "not a whitespace character."
It's worth noting that \d is effectively a shortcut for [0-9] (note that - has a special meaning inside square brackets), and \D is a shortcut for [^0-9].
The reason it's matching strings that contain spaces is that you've asked for "1+ non-digit characters followed by 1+ non-space characters", and a space is itself a non-digit character, so it'll match lots of strings! I think that perhaps you don't understand that regular expressions match bits of strings: you're not adding constraints as you go, but rather building up bits of matchers that will match corresponding bits of strings.
/^[^\d\s!#£$%^&*()+=]+$/ is the answer you're looking for - I'd look at it like this:
i. [] - match a range of characters
ii. []+ - match one or more of that range of characters
iii. [^\d\s]+ - match one or more characters that do not match \d (numerical digit) or \s (whitespace)
iv. [^\d\s!#£$%^&*()+=]+ - here's a bunch of other characters I don't want you to match
v. ^[^\d\s!#£$%^&*()+=]+$ - now there are anchors applied, so this matcher has to apply to the whole line otherwise it fails to match
A useful website to explore regexes is http://regexr.com/3b9h7 - which I supply with my suggested solution as an example. Edit: Pruthvi Raj's link to Debuggex is awesome!
Is my reasoning up to this point correct?
Almost. /\D/ matches any character other than a digit, but not just the first one (if you use g option).
and [^\s] will match the first non-whitespace character
Almost, [^\s] will match any non-whitespace character, not just the first one (if you use g option).
/^\D+[^\s]+$/ matching strings that contain spaces?
Yes, it does, because \D matches a space (space is not a digit).
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Because \D+ in /^\D+[^\s]+$/ can match spaces.
Conclusion:
Use
^[^\d\s!#£$%^&*()+=]+$
It will match strings that have no digits and spaces, and the symbols you do not allow.
Mind that to match a literal -, ] or [ within a character class, you either need to escape them, or put them at the start or end of the expression. To play it safe, escape them.
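To see the difference between the two patterns discussed in this thread, here is a small sketch that runs the asker's sample strings through both:

```javascript
// One negated class, anchored at both ends: every character must be allowed.
const allowed = /^[^\d\s!#£$%^&*()+=]+$/;
// The asker's attempt: \D+ also consumes spaces, [^\s]+ also consumes digits.
const attempt = /^\D+[^\s]+$/;

for (const s of ['James', 'James1', 'James Smith25']) {
  console.log(s, allowed.test(s), attempt.test(s));
}
```

Only 'James' should pass the first pattern, while all three strings pass the second.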
Just insert every character you don't want to include in a negated character class as follows:
^[^\s\d!#£$%^&*()+=]*$
DEMO
Debuggex Demo
^ - start of the string
[^...] - matches one character that is not in `...`
\s - matches a whitespace (space, newline,tab)
\d - matches a digit from 0 to 9
* - a quantifier that repeats the immediately preceding element 0 or more times
so the regex matches any string that
1. has a beginning
2. contains 0 or more characters that are not whitespace, a digit, or any of the symbols in the character class (in this example !#£$%^&*()+=), i.e. characters that are not listed in the character class `[...]`
3. has an ending
NOTE:
If the symbols you don't want also include a hyphen (-), don't put it between other characters, because it is a metacharacter inside a character class; put it last in the character class.
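The effect of hyphen placement is easy to demonstrate. In this sketch the classes [+-=] and [+=-] are illustrative examples, not patterns from the answer:

```javascript
// In the middle, '-' creates a range: '+' through '=' includes the digits.
console.log(/[+-=]/.test('5'));
// At the end, '-' is a literal hyphen and no range is formed.
console.log(/[+=-]/.test('5'));
console.log(/[+=-]/.test('-'));

// The answer's pattern with a hyphen added last to the negated class.
const hyphenLast = /^[^\s\d!#£$%^&*()+=-]*$/;
console.log(hyphenLast.test('well-known'));
console.log(hyphenLast.test('wellknown'));
```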
I need to build a JavaScript regular expression with the following constraints:
The input string needs to be at least 6 characters long
The input string needs to contain at least 1 alphabetical character
The input string needs to contain at least 1 non-alphabetical character
I'm seriously lacking a lookback feature in JavaScript. The thing I came up with:
((([a-zA-Z][^a-zA-Z])|([^a-zA-Z][a-zA-Z]))....)|
(.(([a-zA-Z][^a-zA-Z])|([^a-zA-Z][a-zA-Z]))...)|
(..(([a-zA-Z][^a-zA-Z])|([^a-zA-Z][a-zA-Z]))..)|
(...(([a-zA-Z][^a-zA-Z])|([^a-zA-Z][a-zA-Z])).)|
(....(([a-zA-Z][^a-zA-Z])|([^a-zA-Z][a-zA-Z])))
This looks pretty long. Is there a better way?
How I came to this:
Regex for alphabetical character is [a-zA-Z]
Regex for non-alphabetical character is [^a-zA-Z]
So I need to look for a [a-zA-Z][^a-zA-Z] or [^a-zA-Z][a-zA-Z] so (([a-zA-Z][^a-zA-Z])|([^a-zA-Z][a-zA-Z])).
I need to check for n preceding characters and 6-n succeeding characters.
/^(?=.{6})(?=.*[a-zA-Z])(?=.*[^a-zA-Z])/
This means:
^ - start of the string
(?= ... ) - followed by (i.e. an independent submatch; it won't move the current match position)
.{6} - six characters ("start of string followed by six characters" implements the "must be at least six characters long" rule)
.* - 0 or more of any character (except newline - may need to fix this?)
[a-zA-Z] - a letter (.*[a-zA-Z] therefore finds any string with a letter anywhere in it (technically it finds the last letter in it))
[^a-zA-Z] - a non-letter character
In summary: Starting from the beginning of the string, we try to match each of the following in turn:
6 characters (if we find those, the string must be 6 characters long (or more))
an arbitrary string followed by a letter
an arbitrary string followed by a non-letter
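A short sketch exercising the three lookaheads, with sample inputs chosen here for illustration:

```javascript
// Each lookahead starts from the beginning of the string and consumes nothing.
const rule = /^(?=.{6})(?=.*[a-zA-Z])(?=.*[^a-zA-Z])/;

console.log(rule.test('abc123')); // long enough, has a letter and a non-letter
console.log(rule.test('abcdef')); // no non-letter
console.log(rule.test('123456')); // no letter
console.log(rule.test('ab1'));    // too short
```

Only the first call should return true.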
Use this regex...
/^(?=.{6,})(?=.*[a-zA-Z])(?=.*[^a-zA-Z]).*$/
  --------  ------------- --------------
     ^            ^             ^
     |            |             |->checks for a single non-alphabet
     |            |->checks for a single alphabet
     |->checks for 6 to many characters
(?=) is a zero-width lookahead which checks for a match. It doesn't consume characters. This is the reason why we can use multiple lookaheads back to back.
Similar answer to others, thus this doesn't need much explanation, I think the best way is to do
/^(?=.*[a-zA-Z])(?=.*[^a-zA-Z]).{6,}$/
This starts at the beginning of the string, looks ahead for an alphabetical character, looks ahead for a non-alphabetical character and, in the end, matches a string of 6+ chars. I think there's no need for a lookahead about length.
I'm trying to get regex for minimum requirements of a password to be minimum of 6 characters; 1 uppercase, 1 lowercase, and 1 number. Seems easy enough? I have not had any experience in regex's that "look ahead", so I would just do:
if (!pwStr.match(/[A-Z]+/) || !pwStr.match(/[a-z]+/) || !pwStr.match(/[0-9]+/) ||
    pwStr.length < 6) {
  // was not successful
}
But I'd like to optimize this to one regex and level up my regex skillz in the process.
^.*(?=.{6,})(?=.*[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$
^.*
Start of Regex
(?=.{6,})
Passwords will contain at least 6 characters in length
(?=.*[a-zA-Z])
Passwords will contain at least 1 letter (to require both an upper and a lower case letter, use separate lookaheads: (?=.*[a-z])(?=.*[A-Z]))
(?=.*\d)
Passwords will contain at least 1 number
(?=.*[!#$%&? "])
Passwords will contain at least 1 of the given special characters
.*$
End of Regex
here is the website that you can check this regex - http://rubular.com/
Assuming that a password may consist of any characters, have a minimum length of at least six characters and must contain at least one upper case letter and one lower case letter and one decimal digit, here's the one I'd recommend: (commented version using python syntax)
import re

re_pwd_valid = re.compile(r"""
    # Validate password 6 char min with one upper, lower and number.
    ^                  # Anchor to start of string.
    (?=[^A-Z]*[A-Z])   # Assert at least one upper case letter.
    (?=[^a-z]*[a-z])   # Assert at least one lower case letter.
    (?=[^0-9]*[0-9])   # Assert at least one decimal digit.
    .{6,}              # Match password with at least 6 chars
    $                  # Anchor to end of string.
    """, re.VERBOSE)
Here it is in JavaScript:
re_pwd_valid = /^(?=[^A-Z]*[A-Z])(?=[^a-z]*[a-z])(?=[^0-9]*[0-9]).{6,}$/;
Additional: If you ever need to require more than one of the required chars, take a look at my answer to a similar password validation question
Edit: Changed the lazy dot star to greedy char classes. Thanks Erik Reppen - nice optimization!
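For illustration, here is the JavaScript pattern run against a few sample passwords, together with one possible way to require two digits instead of one. The two-digit variant is an assumption about how such a rule could be written, not the content of the linked answer, and the variable names are hypothetical.

```javascript
const rePwdValid = /^(?=[^A-Z]*[A-Z])(?=[^a-z]*[a-z])(?=[^0-9]*[0-9]).{6,}$/;

console.log(rePwdValid.test('Passw0rd'));  // upper, lower, digit, 8 chars
console.log(rePwdValid.test('password1')); // no upper case letter
console.log(rePwdValid.test('Pw0'));       // too short

// Assumed variant: repeat the "non-digits then a digit" group twice.
const reTwoDigits = /^(?=[^A-Z]*[A-Z])(?=[^a-z]*[a-z])(?=(?:[^0-9]*[0-9]){2}).{6,}$/;
console.log(reTwoDigits.test('Passw0rd'));  // only one digit
console.log(reTwoDigits.test('Passw0rd1')); // two digits
```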
My experience is if you can separate out Regexes, the better the code will read. You could combine the regexes with positive lookaheads (which I see was just done), but... why?
Edit:
Ok, ok, so if you have some configuration file where you could pass a string to compile into a regex (which I've seen done and have done before) I guess it is worth the hassle. But otherwise, even if the answers provided are corrected to match what you need, I'd still advise against it unless you intend to create such a thing. Separate regexes are just so much nicer to deal with.
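One way the separate-regex approach might look, as a sketch only: the rule table, the messages and the function name missingRequirements are all hypothetical. A side benefit is that it can report which requirement failed.

```javascript
// Hypothetical rule table; each rule is checked on its own.
const rules = [
  { test: (pw) => pw.length >= 6, message: 'at least 6 characters' },
  { test: (pw) => /[A-Z]/.test(pw), message: 'an uppercase letter' },
  { test: (pw) => /[a-z]/.test(pw), message: 'a lowercase letter' },
  { test: (pw) => /[0-9]/.test(pw), message: 'a digit' },
];

function missingRequirements(pw) {
  // Collect the message of every rule the password does not satisfy.
  return rules.filter((rule) => !rule.test(pw)).map((rule) => rule.message);
}

console.log(missingRequirements('password'));
console.log(missingRequirements('Passw0rd'));
```

An empty array means the password satisfies every rule.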
I haven't tested thoroughly but here's a more efficient version of Amit's. I think his also allowed unspecified characters into the mix (which wasn't technically listed as a rule). This one won't go berserk on you if you accidentally target a large hunk of text, it will fail sooner on strings that are too long and it only allows the characters in the final class.
'.' should be used sparingly. Think of the looping it has to do to determine a match with all the characters it can represent. It's much more efficient to use negating classes.
^(?=[^0-9]{0,9}[0-9])(?=[^a-z]{0,9}[a-z])(?=[^A-Z]{0,9}[A-Z])(?=[^##$%]{0,9}[##$%])[0-9a-zA-Z##$%]{6,10}$
There's nothing wrong with trying to find the ideal regEx. But split it up when you need to.
RegEx tends to be explained poorly. I'll add a breakdown:
a - a single 'a' character
ab - a single 'a' character followed by a single b character
a* - 0 or more 'a' characters
a+ - one or more 'a' characters
a+b - one or any number of a characters followed by a single b character.
a{6,} - at least 6 'a' characters (would match more)
a{6,10} - 6-10 'a' characters
a{10} - exactly 10 'a' characters iirc - not very useful
^ - beginning of a string - so ^a+ would not match 'baaaa'
$ - end of a string - b$ would not find a match in 'aaaba'
[] signifies a character class. You can put a variety of characters inside it and every character will be checked. By itself only whatever string character you happen to be on is matched against. It can be modified by + and * as above.
[ab]+c - one or any number of a or b characters followed by a single c character
[a-zA-Z0-9] - any letter, any number - there are a bunch of \<some key> characters representing sets, like \d for digits. \w is basically [a-zA-Z0-9_]
note: '\' is the escape character inside character classes. [a\-z] for 'a' or '-' or 'z' rather than anything from a to z which is what [a-z] means
[^<stuff>] a character class with the caret in front means everything but the characters or <stuff> listed - this is critical to performance in regEx matches hitting large strings.
. - wildcard character representing most characters (the exceptions are line terminators such as \n and \r). Not a big deal in very small sets of characters but avoid using it.
(?=<regex stuff>) - a lookahead. Doesn't move the parser further down the string if it matches. If a lookahead fails, the whole match fails. If it succeeds, you go back to the same character before it. That's why we can string a bunch together to search if there's at least one of a given character.
So:
^ - at the beginning followed by whatever is next
(?=[^0-9]{0,9}[0-9]) - look for a digit from 0-9 preceded by 0 to 9 characters that aren't digits - the next lookahead starts at the same place
etc. on the lookaheads
[0-9a-zA-Z##$%]{6,10} - 6-10 of any letter, number, or ##$% characters
The trailing '$' is still needed: {6,10} only caps how much this pattern consumes, so without '$' a longer string would match on its first 10 characters
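Whether the trailing $ matters depends on how the pattern is applied. With JavaScript's test(), which accepts a match on any part of the string, a sketch like the following suggests it does. The special characters are assumed here to be #, $ and %, and the variable names are hypothetical.

```javascript
// Same lookaheads in both; only the trailing $ differs.
const withEnd = /^(?=[^0-9]{0,9}[0-9])(?=[^a-z]{0,9}[a-z])(?=[^A-Z]{0,9}[A-Z])(?=[^#$%]{0,9}[#$%])[0-9a-zA-Z#$%]{6,10}$/;
const withoutEnd = /^(?=[^0-9]{0,9}[0-9])(?=[^a-z]{0,9}[a-z])(?=[^A-Z]{0,9}[A-Z])(?=[^#$%]{0,9}[#$%])[0-9a-zA-Z#$%]{6,10}/;

const tooLong = 'aB3#xyaB3#xyz'; // 13 characters

console.log(withEnd.test('aB3#xy'));
console.log(withEnd.test(tooLong));
console.log(withoutEnd.test(tooLong)); // matches on the first 10 characters
```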