I've written a regular expression that matches any number of letters with any number of single spaces between the letters. I would like that regular expression to also enforce a minimum and maximum number of characters, but I'm not sure how to do that (or if it's possible).
My regular expression is:
[A-Za-z](\s?[A-Za-z])+
I realized it was only matching two sets of letters surrounding a single space, so I modified it slightly to fix that. The original question is still the same though.
Is there a way to enforce a minimum of three characters and a maximum of 30?
Yes
Just as + means one or more, you can use {3,30} to match between 3 and 30 repetitions.
For example, [a-z]{3,30} matches between 3 and 30 lowercase letters.
From the documentation of the Pattern class
X{n,m} X, at least n but not more than m times
In your case, matching 3-30 letters, each followed by a whitespace character, could be accomplished with:
([a-zA-Z]\s){3,30}
That works if you require trailing whitespace. If you don't, you can use the following (letter plus space 2-29 times, then a final letter):
([a-zA-Z]\s){2,29}[a-zA-Z]
If you'd like the whitespace to count as characters too, you need to divide those numbers by 2, which gives (3 to 29 characters in total):
([a-zA-Z]\s){1,14}[a-zA-Z]
You can add \s? to that last one if the trailing whitespace is optional. These were all tested on RegexPlanet.
If you'd like the entire string to be between 3 and 30 characters long, you can use a lookahead: add (?=^.{3,30}$) at the beginning of the regex and remove the other size limits.
All that said, in all honesty I'd probably just test the String's .length property. It's more readable.
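A minimal sketch of that last suggestion, written in Python purely for illustration (the thread itself is about Java/JavaScript): let the regex check only the shape and check the length separately. The name is_valid_name and its parameters are made up for this example.

```python
import re

# Shape only: letters separated by at most one whitespace character.
SHAPE = re.compile(r"[A-Za-z](?:\s?[A-Za-z])*")

def is_valid_name(text, min_len=3, max_len=30):
    """Return True if text has the right shape and a total length in range."""
    in_range = min_len <= len(text) <= max_len
    return in_range and SHAPE.fullmatch(text) is not None

print(is_valid_name("a b"))    # letters with a single space, length 3
print(is_valid_name("a  b"))   # two consecutive spaces
```

Here spaces count toward the length, which is usually what "3 to 30 characters" means; count only the letters instead if that is the requirement.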
This is what you are looking for
^[a-zA-Z](\s?[a-zA-Z]){2,29}$
^ is the start of string
$ is the end of string
(\s?[a-zA-Z]){2,29} matches (\s?[a-zA-Z]) 2 to 29 times, so the string holds 3 to 30 letters in total; the optional spaces are not counted.
Actually Benjamin's answer will lead to the complete solution to the OP's question.
Using lookaheads it is possible to restrict the total number of characters AND restrict the match to a set combination of letters and (optional) single spaces.
The regex that solves the entire problem would become
(?=^.{3,30}$)^([A-Za-z][\s]?)+$
This will match "AAA" and "A A", and it will fail to match "AA  A" (AA, two spaces, A) since there are two consecutive spaces.
I tested this at http://regexpal.com/ and it does the trick.
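For anyone who wants to reproduce that check locally, here is a small sketch using Python's re module (an assumption on my part; the lookahead syntax is the same in Java and JavaScript). The helper name accepts is made up.

```python
import re

# Length lookahead first, then the letters-and-single-spaces shape.
PATTERN = re.compile(r"(?=^.{3,30}$)^([A-Za-z][\s]?)+$")

def accepts(text):
    """Return True if the whole string satisfies the pattern."""
    return PATTERN.search(text) is not None

for candidate in ["AAA", "A A", "AA  A", "AA", "A" * 31, "AA "]:
    print(repr(candidate), accepts(candidate))
```

One thing this shows: because the optional \s comes after each letter, a trailing space ("AA ") is accepted as well. Put the \s? before the letter, as in ^[A-Za-z](\s?[A-Za-z])*$, if that is not wanted.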
You should use
[a-zA-Z ]{20}
[allowed characters]{number of characters}. Note that {20} means exactly 20; a range is written {min,max}, for example {3,30}.
Related
I have a regex which finds all kinds of money amounts denoted in dollars, like $290, USD240, $234.45, 234.5$, 234.6usd
(\$)[0-9]+\.?([0-9]*)|usd+[0-9]+\.?([0-9]*)|[0-9]+\.?[0-9]*usd|[0-9]+\.?[0-9]*(\$)
This seems to work, but how can I reduce the complexity of my regex?
It is possible to make the regex a bit shorter by collapsing the currency indicators:
You can say USD OR $ amount instead of USD amount OR $ amount. This results in the following regex:
((\$|usd)[0-9]+\.?([0-9]*))|([0-9]+\.?[0-9]*(\$|usd))
I'm not sure if you'll find this less complex, but at least it's easier to read because it's shorter.
The character set [0-9] can also be replaced by \d -- the character class which matches any digit -- making the regex even shorter.
Doing this, the regex will look as follows:
((\$|usd)\d+\.?\d*)|(\d+\.?\d*(\$|usd))
Update:
According to @Toto, this regex would be more performant using non-capturing groups (I also removed the unnecessary capture group, as pointed out by @Simon MᶜKenzie):
(?:\$|usd)\d+\.?\d*|\d+\.?\d*(?:\$|usd)
Amounts like $.0 are not matched by the regex, as @Gangnus pointed out. I updated the regex to fix this:
((\$|usd)((\d+\.?\d*)|(\.\d+)))|(((\d+\.?\d*)|(\.\d+))(\$|usd))
Note that I changed \d+\.?\d* into ((\d+\.?\d*)|(\.\d+)): It now either matches one or more digits, optionally followed by a dot, followed by zero or more digits; OR a dot followed by one or more digits.
Without unnecessary capturing groups and using non-capturing groups:
(?:\$|usd)(?:\d+\.?\d*|\.\d+)|(?:\d+\.?\d*|\.\d+)(?:\$|usd)
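A quick way to try that final pattern on the amounts from the question, sketched in Python. The IGNORECASE flag is my addition so that USD240 matches as well; the pattern itself only spells usd in lowercase. The function name find_amounts is made up.

```python
import re

# Final pattern from above, with case-insensitive matching added.
MONEY = re.compile(
    r"(?:\$|usd)(?:\d+\.?\d*|\.\d+)|(?:\d+\.?\d*|\.\d+)(?:\$|usd)",
    re.IGNORECASE,
)

def find_amounts(text):
    """Return every dollar amount found in text, in order of appearance."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return MONEY.findall(text)

print(find_amounts("$290, USD240, $234.45, 234.5$, 234.6usd and $.0"))
# ['$290', 'USD240', '$234.45', '234.5$', '234.6usd', '$.0']
```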
Try this
^(?:\$|usd)?(?:\d+\.?\d*)(?:\$|usd)?$
By reducing the complexity you are also reducing the correctness. The following regex works correctly, although it does not handle letter case (that could be managed with a flag). All the other current answers here simply lack a correct subpattern for the decimal number.
^\s*(?:(?:(?:-?(?:usd|\$)|(?:usd|\$)-)(?:(?:0|[1-9]\d*)?(?:\.\d+)?(?<=\d)))|(?:-?(?:(?:0|[1-9]\d*)?(?:\.\d+)?(?<=\d))(?:usd|\$)))\s*$
Look here at the test results.
Make the expression correct first, and only after that try to shorten it.
I am trying to create a regular expression (Java/JavaScript) that matches the following regex, but only when there are fewer than 13 characters total (and a minimum of 4).
(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ DDD)?) ← originally posted
(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ [A-Z]+)?)
These values should (and do) match:
MED-123
COTA-1224
MED4
COTB-892K777
MED-33 DDD
MED-234J5678
This value matches, but I don't want it to (I only want a match if there are fewer than 13 characters total):
COT-1111J11111111111111
See http://regexr.com/3bs7b http://regexr.com/3bsfv
I have tried grouping my expression and putting {4,12} at the end, but that just makes it look for 4 to 12 instances of the whole expression matching.
I feel like I am missing something simple...thanks in advance for your help!
You can use negative look-ahead:
(?!.{13,})(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ DDD)?)
Since your expression already makes sure that a match starts with COT or MED and that at least one digit follows, it already guarantees at least 4 characters.
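Here is a small sketch of how that behaves on the sample values, using Python's re module for illustration. It assumes each value is tested on its own as a whole string, because the lookahead counts everything from the start of the match to the end of the line, not just the matched code. The name is_valid_code is made up.

```python
import re

# Negative lookahead: refuse to start a match if 13 or more characters follow.
CODE = re.compile(
    r"(?!.{13,})(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ DDD)?)"
)

def is_valid_code(value):
    """Return True if the whole value is a code of at most 12 characters."""
    return CODE.fullmatch(value) is not None

for value in ["MED-123", "COTB-892K777", "MED-33 DDD", "COT-1111J11111111111111"]:
    print(value, is_valid_code(value))
```

If the codes are embedded in longer text, the lookahead would also reject short codes that are followed by 13 or more other characters on the same line, so this form suits validation of single values best.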
"I have tried grouping my expression and putting {4,12} at the end, but that just makes it look for 4 to 12 instances of the whole expression matching."
This looks for 4 to 12 instances of the whole expression because you didn't add a word boundary \b. Your regex works fine, just add a word boundary and your desired outcome would be achieved. Take a look at this DEMO.
Your regex seems rather clumsy and is a little hard to read. It is also limited to certain characters, for example J and K, unless that is what you want. For a more general pattern, you can try this:
(COT|MED)[AB]?-?[\dJK]{1,8}(\s+D{1,3})?\b
(COT|MED): matches either COT or MED
[AB]?: matches A or B which is optional because of the presence of ?
-?: matches - which is also optional
[\dJK]{1,8}: matches a digit, J, or K, at least once and at most eight times.
(\s+D{1,3})?: matches one or more whitespace characters followed by one to three D characters; the whole group is optional.
\b: with respect to your question this seems to be the most important part. It asserts a word boundary at the end of the match, so the match cannot be followed immediately by another letter, digit, or underscore. A value that runs on past the pattern therefore does not match.
See the demo here DEMO2
The answer you are looking for is
(?!\S{13})(?:COT|MED)[ABCD]?-?\d{1,4}(?:[JK]+\d*|(?: [A-Z]+)?)
See regex demo
Note that it is almost impossible to check the length of a phrase that is not a whole string or that has spaces inside since boundaries are a bit "blurred". Thus, (?!\S{13}) is a kind of a workaround that just makes sure you do not have a string without whitespace that is 13 characters long or longer.
The regex breakdown:
(?!\S{13}) - Check if the substring that follows does not consist of 13 non-whitespace characters
(?:COT|MED) - Either of the values in the alternation (COT or MED)
[ABCD]?-? - Optional A, B, C, D and then an optional -
\d{1,4} - 1 to 4 digits
(?:[JK]+\d*|(?: [A-Z]+)?) - a group of 2 alternatives:
[JK]+\d* - J or K, 1 or more times, and then 0 or more digits
(?: [A-Z]+)? - optional space and 1 or more Latin uppercase letters
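To see the (?!\S{13}) workaround in action on codes embedded in a sentence, here is a short sketch in Python (chosen only for illustration; the sample sentence and the name find_codes are made up).

```python
import re

# Same pattern as above: skip any start position followed by 13+ non-whitespace characters.
CODE = re.compile(r"(?!\S{13})(?:COT|MED)[ABCD]?-?\d{1,4}(?:[JK]+\d*|(?: [A-Z]+)?)")

def find_codes(text):
    """Return all codes found in text; over-long runs are skipped entirely."""
    # No capturing groups are used, so findall returns the full matches.
    return CODE.findall(text)

print(find_codes("Ship MED-123, MED-33 DDD and COT-1111J11111111111111 today."))
# ['MED-123', 'MED-33 DDD']
```

Note that the optional " [A-Z]+" part will also swallow an unrelated uppercase word that happens to follow a code, which is one of the "blurred boundary" effects mentioned above.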
As this answer suggests, you could solve it with a lookahead that limits the total length of the string:
^(?=.{4,12}$)(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ DDD)?)$
I have been juggling with this for a while. I hope you can give me some pointers.
All I want to achieve is that the string contains EXACTLY 4 '-' characters and 10 digits, in any given order.
I created this regex : ^(-\d-){10}$
It enforces a count of 10 for the digits, but I can't find a way to enforce a count of 4 for the '-' characters.
Thanks
Ok, here's a pattern:
^(?=(?:\d*?-){4}\d*$)(?=(?:-*?\d){10}-*$).{14}$
Demo
Explanation:
The main part is ^.{14}$ which simply checks there are 14 characters in the string.
Then, there are two lookaheads at the start:
(?=(?:\d*?-){4}\d*$)
(?=(?:-*?\d){10}-*$)
The first one checks the hyphens and the second one checks the digits, making sure each count is correct. Both match the entire input string and are very similar, so let's just take a look at the first one.
(?:\d*?-){4} matches any number of digits (or none) followed by a hyphen, four times. After this match, we know there are four hyphens. (I used an ungreedy quantifier (*?) just to prevent useless backtracking, as an optimization)
\d*$ just makes sure the rest of the string is only made of digits.
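As a sanity check, the pattern can be compared with a plain counting function. This is a sketch in Python, used here only for illustration; the ASCII flag is my addition so that \d means exactly 0-9, and both function names are made up.

```python
import re

# The two lookaheads count hyphens and digits; .{14} fixes the total length.
PATTERN = re.compile(
    r"^(?=(?:\d*?-){4}\d*$)(?=(?:-*?\d){10}-*$).{14}$",
    re.ASCII,
)

def matches_regex(text):
    return PATTERN.fullmatch(text) is not None

def matches_by_counting(text):
    """Same rule without a regex: exactly 10 digits, 4 hyphens, nothing else."""
    digits = sum(ch in "0123456789" for ch in text)
    hyphens = text.count("-")
    return digits == 10 and hyphens == 4 and len(text) == 14

for sample in ["12-34-56-78-90", "----1234567890", "1234567890---", "12-34-56-78-9a"]:
    print(sample, matches_regex(sample), matches_by_counting(sample))
```

If a regex is not a hard requirement, the counting version is arguably easier to read and to change later.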
Two quick questions:
What would be a RegEx string for three letters and two numbers with space before and after them (i.e. " LET 12 ")?
Would you happen to know any good RegEx resources/tools?
For a good resource, try this website and the program RegexBuddy. You may even be able to figure out the answer to your question yourself using these sites.
To start you off you want something like this:
/^[a-zA-Z]{3}\s+[0-9]{2}$/
But the exact details depend on your requirements. It's probably a better idea that you learn how to use regular expressions yourself and then write the regular expression instead of just copying the answers here. The small details make a big difference. Examples:
What is a "letter"? Just A-Z or also foreign letters? What about lower case?
What is a "number"? Just 0-9 or also foreign numerals? Only integers? Only positive integers? Can there be leading zeros?
Should there be a single space between the letters and numbers? Or any amount of any whitespace? Even none?
Do you want to search for this string in a larger text? Or match a line exactly?
etc.
The answers to these questions will change the regular expression. It would be much faster for you in the long run to learn how to create the regular expression than to completely specify your requirements and wait for other people to reply.
I forgot to mention that there will be a space before and after. How do I include that?
Again you need to consider the questions:
Do you mean just one space or any amount of spaces? Possibly not always a space but only sometimes?
Do you mean literally a space character or any whitespace characters?
My guess is:
/^\s+[a-zA-Z]{3}\s+[0-9]{2}\s+$/
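A quick way to try that guess against the example from the question, sketched in Python for illustration (the delimiting slashes are dropped because Python takes the pattern as a string; the name is_padded_code is made up).

```python
import re

# One or more whitespace characters, 3 letters, whitespace, 2 digits, whitespace.
PADDED = re.compile(r"^\s+[a-zA-Z]{3}\s+[0-9]{2}\s+$")

def is_padded_code(text):
    """Return True if text looks like ' LET 12 ' including the surrounding spaces."""
    return PADDED.search(text) is not None

print(is_padded_code(" LET 12 "))  # the example from the question
print(is_padded_code("LET 12"))    # no surrounding spaces
```

Swapping a \s+ for a single literal space, or for \s*, is how the answers to the questions above change the outcome.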
/[a-z]{3} [0-9]{2}/i will match 3 letters followed by a space, and then 2 digits. [a-z] is a character class containing the letters a through z, and the {3} means that you want exactly 3 members of that class. The space character matches a literal space (alternatively, you could use \s, which is a "shorthand" character class that matches any whitespace character). The i at the end is a pattern modifier specifying that your pattern is case-insensitive.
If you want the entire string to only be that, you need to anchor it with ^ and $:
/^[a-z]{3} [0-9]{2}$/i
Regular expression resources:
http://www.regular-expressions.info - great tutorial with a lot of information
http://rexv.org/ - online regular expression tester that supports a variety of engines.
^([A-Za-z]{3}) ([0-9]{2})$ assuming one space between the letters/numbers, as in your example. This will capture the letters and numbers separately.
I use http://gskinner.com/RegExr/ - it allows you to build a regex and test it with your own text.
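To illustrate what "capture separately" means for the pattern above, here is a minimal sketch in Python (an assumption for illustration; the helper name split_code is made up).

```python
import re

# Group 1 captures the three letters, group 2 the two digits.
PATTERN = re.compile(r"^([A-Za-z]{3}) ([0-9]{2})$")

def split_code(text):
    """Return (letters, digits) if text matches, otherwise None."""
    match = PATTERN.fullmatch(text)
    return match.groups() if match else None

print(split_code("LET 12"))   # ('LET', '12')
print(split_code("LET  12"))  # None, because exactly one space is required
```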
As you can probably tell from the wide variety of answers, RegEx is a complex subject with a wide variety of opinions and preferences, and often more than one way of doing things. Here's my preferred solution.
^[a-zA-Z]{3}\s*\d{2}$
I used [a-zA-Z] instead of \w because \w also matches digits and underscores.
The \s* is to allow zero or more whitespace characters.
I try to use shorthand character classes wherever possible, which is why I went with \d.
\w{3}\s{1}\d{2}
And I like this site.
EDIT: [a-zA-Z]{3}\s{1}\d{2} - \w matches digits (and underscores) too.
Try this regular expression:
[^"\r\n]{3,}
I'm after a regular expression that matches a UK Currency (ie. £13.00, £9,999.99 and £12,333,333.02), but does not allow negative (-£2.17) or zero values (£0.00 or 0).
I've tried to create one myself, but I've got in a right muddle!
Any help gratefully received.
Thanks!
This'll do it (well mostly...)
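/^£?[1-9]\d{0,2}(,\d{3})*(\.\d{2})?$/
Leverages the ^ and $ to make sure no negative sign or other character is in the string, and assumes that commas will be used. The pound symbol and pence are optional.
edit: realised you said non-zero, so the leading digit is now [1-9] instead of \d. The remaining digits of the first group stay \d (i.e. [1-9]\d{0,2}) so that amounts such as £10.00 and £100 still match.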
Update: it's been pointed out the above won't match £0.01. The below improvement will but now there's a level of complexity where it may quite possibly be better to test /[1-9]/ first and then the above - haven't benchmarked it.
/^£?(([1-9]\d{0,2}(,\d{3})*(\.\d{2})?)|(0\.[1-9]\d)|(0\.0[1-9]))$/
Brief explanation:
Match beginning of string followed by optional "£"
Then match either:
a >=£1 amount with optional comma-separated groupings and optional pence
OR a <£1 >=£0.10 amount
OR a <=£0.09 amount (but not zero)
Then match end of string
The more fractional-pence digits you need to support (none in the above), the less efficient the regex becomes.
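A sketch for checking the three branches described above against the amounts from the question, written in Python for illustration. The pattern below writes the leading group as [1-9]\d{0,2}, which also accepts amounts such as £10.00 and £100, and uses non-capturing groups; the name is_positive_amount is made up.

```python
import re

# Branch 1: >= £1 with comma groupings; branch 2: 0.10 to 0.99; branch 3: 0.01 to 0.09.
UK_AMOUNT = re.compile(
    r"^£?(?:[1-9]\d{0,2}(?:,\d{3})*(?:\.\d{2})?|0\.[1-9]\d|0\.0[1-9])$"
)

def is_positive_amount(text):
    """Return True for a non-zero, non-negative UK currency amount."""
    return UK_AMOUNT.fullmatch(text) is not None

for amount in ["£13.00", "£9,999.99", "£12,333,333.02", "£0.01", "-£2.17", "£0.00", "0"]:
    print(amount, is_positive_amount(amount))
```

Amounts without thousands separators, such as £1000, are rejected, in line with the "commas will be used" assumption stated earlier.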
Under Unix/Linux, it's not always possible to type in the '£' sign in a JavaScript file, so I tend to use its hexadecimal representation, thus:
/^\xA3?\d{1,3}?([,]\d{3}|\d)*?([.]\d{1,2})?$/
This seems to take care of all combinations of UK currency amounts representation that I have come across.
/^\xA3?\d{1,}(?:\,?\d+)*(?:\.\d{1,2})?$/
Explanation:
^ Matches the beginning of the string, or the beginning of a line.
\xA3 Matches a "£" character (char code 163)
? Quantifier for match between 0 and 1 of the preceding token.
\d Matches any digit character (0-9).
{1,} Match 1 or more of the preceding token.
(?: Groups multiple tokens together without creating a capture group.
\, Matches a "," character (char code 44).
{1,2} Match between 1 and 2 of the preceding token.
$ Matches the end of the string, or the end of a line if the multiline flag (m) is enabled.
You could just make two passes:
/^£\d{1,3}(,\d{3})*(\.\d{2})?$/
to validate the format, and
/[1-9]/
to ensure that at least one digit is non-zero.
This is less efficient than doing it in one pass, of course (thanks, annakata, for the benchmark information), but for a first implementation, just "saying what you want" can significantly reduce development time.
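The two passes could be wired together roughly like this. It is a sketch in Python for illustration only; the function name is_valid_amount is made up and the two patterns are the ones given above.

```python
import re

# Pass 1: overall format (pound sign, comma groupings, optional pence).
FORMAT = re.compile(r"^£\d{1,3}(,\d{3})*(\.\d{2})?$")
# Pass 2: at least one non-zero digit somewhere in the string.
NONZERO = re.compile(r"[1-9]")

def is_valid_amount(text):
    """Return True if text is well formed and not a zero amount."""
    return FORMAT.fullmatch(text) is not None and NONZERO.search(text) is not None

for amount in ["£13.00", "£0.01", "£0.00", "-£2.17", "0"]:
    print(amount, is_valid_amount(amount))
```

Keeping the two checks separate also makes it easy to report different error messages for "badly formatted" and "zero amount".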