Regex to match exactly 5 numbers and one optional space - javascript

I recently needed to create a regular expression to check input in JavaScript. The input could be 5 or 6 characters long and had to contain exactly 5 numbers and one optional space, which could be anywhere in the string. I am not regex-savvy at all and even though I tried looking for a better way, I ended up with this:
(^\d{5}$)|(^ \d{5}$)|(^\d{5} $)|(^\d{1} \d{4}$)|(^\d{2} \d{3}$)|(^\d{3} \d{2}$)|(^\d{4} \d{1}$)
This does what I need, so the allowed inputs are (if 0 is any number)
'00000'
' 00000'
'0 0000'
'00 000'
'000 00'
'0000 0'
'00000 '
I doubt that this is the only way to achieve such matching with regex, but I haven't found a way to do it in a cleaner way. So my question is, how can this be written better?
Thank you.
Edit:
So, it is possible! Tom Lord's answer does what I needed with regular expressions, so I marked it as a correct answer to my question.
However, soon after I posted this question, I realized that I wasn't thinking right: since every other input in the project was easily 'validatable' with regex, I immediately assumed I could validate this one with it as well.
Turns out I could just do this:
const validate = function(value) {
    const v = value.replace(/\s/g, '');
    const regex = new RegExp('^\\d{5}$');
    return regex.test(v);
};
Thank you all for the cool answers and ideas! :)
Edit2: I forgot to mention a possibly quite important detail, which is that the input is limited, so the user can only enter up to 6 characters. My apologies.

Note: Using a regular expression to solve this problem might not be the best answer. As answered below, it may be easier to just count the digits and spaces with a simple function! However, since the question was asking for a regex answer, and in some scenarios you may be forced to solve this with a regex (e.g. if you're tied down to a certain library's implementation), the following answer may be helpful:
This regex matches lines containing exactly 5 digits:
^(?=(\D*\d){5}\D*$)
This regex matches lines containing one optional space:
^(?=[^ ]* ?[^ ]*$)
If we put them together, and also ensure that the string contains only digits and spaces ([\d ]*$), we get:
^(?=(\D*\d){5}\D*$)(?=[^ ]* ?[^ ]*$)[\d ]*$
You could also use [\d ]{5,6} instead of [\d ]* on the end of that pattern, to the same effect.
Explanation:
This regular expression is using lookaheads. These are zero-width pattern matchers, which means both parts of the pattern are "anchored" to the start of the string.
\d means "any digit", and \D means "any non-digit".
A literal space means "space", and [^ ] means "any non-space".
The \D*\d is being repeated 5 times, to ensure exactly 5 digits are in the string.
Note that if you actually wanted the "optional space" to include things like tabs, then you could instead use \s and \S.
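As a quick sanity check, the combined pattern can be run against the inputs listed in the question. This is only a sketch: the names `fiveDigitsOneSpace`, `shouldMatch` and `shouldFail` are made up for illustration and are not part of the original answer.

```javascript
// The combined pattern: two lookaheads plus a whitelist of allowed characters.
const fiveDigitsOneSpace = /^(?=(\D*\d){5}\D*$)(?=[^ ]* ?[^ ]*$)[\d ]*$/;

// Inputs the question says must be accepted.
const shouldMatch = ['00000', ' 00000', '0 0000', '00 000', '000 00', '0000 0', '00000 '];
// A few inputs that each break one of the three rules.
const shouldFail = ['0000', '000000', '0 0 000', '0000x0', '00\t000'];

for (const s of shouldMatch) {
  console.log(JSON.stringify(s), fiveDigitsOneSpace.test(s)); // true for each
}
for (const s of shouldFail) {
  console.log(JSON.stringify(s), fiveDigitsOneSpace.test(s)); // false for each
}
```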
Update: Since this question appears to have gotten quite a bit of traction, I wanted to clarify something about this answer.
There are several "simpler" variant solutions to my answer above, such as:
// Only look for digits and spaces, not "non-digits" and "non-spaces":
^(?=( ?\d){5} *$)(?=\d* ?\d*$)
// Like above, but also simplifying the second lookahead:
^(?=( ?\d){5} *$)\d* ?\d*$
// Or even splitting it into two, simpler, problems with an "or" operator:
^(?:\d{5}|(?=\d* \d*$).{6})$
Or, if we can assume that any string containing a space is exactly 6 characters long, then even just this is sufficient:
^(?:\d{5}|\d* \d*)$
So with that in mind, why might you want to use the original solution, for similar problems? Because it's generic. Look again at my original answer, re-written with free-spacing:
^
(?=(\D*\d){5}\D*$) # Must contain exactly 5 digits
(?=[^ ]* ?[^ ]*$) # Must contain 0 or 1 spaces
[\d ]*$ # Must contain ONLY digits and spaces
This pattern of using successive look-aheads can be used in various scenarios, to write patterns that are highly structured and (perhaps surprisingly) easy to extend.
For example, suppose the rules changed and you now wanted to match 2-3 spaces, 1 . and any number of hyphens. It's actually very easy to update the regex:
^
(?=(\D*\d){5}\D*$) # Must contain exactly 5 digits
(?=([^ ]* ){2,3}[^ ]*$) # Must contain 2 or 3 spaces
(?=[^.]*\.[^.]*$) # Must contain 1 period
[\d .-]*$ # Must contain ONLY digits, spaces, periods and hyphens
...So in summary, there are "simpler" regex solutions, and quite possibly a better non-regex solution to OP's specific problem. But what I have provided is a generic, extensible design pattern for matching patterns of this nature.
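JavaScript's RegExp has no free-spacing flag, so one way to keep the commented layout is to assemble the pattern from separately commented pieces. A minimal sketch, where `rules` and `composed` are illustrative names rather than anything from the answer:

```javascript
// Each rule is its own small regex literal; `.source` gives its pattern text.
const rules = [
  /(?=(\D*\d){5}\D*$)/, // Must contain exactly 5 digits
  /(?=[^ ]* ?[^ ]*$)/,  // Must contain 0 or 1 spaces
  /[\d ]*$/,            // Must contain ONLY digits and spaces
];

// Join the pieces behind a single start-of-string anchor.
const composed = new RegExp('^' + rules.map(r => r.source).join(''));

console.log(composed.source);
console.log(composed.test('00 000'));  // true
console.log(composed.test('0 0 000')); // false
```

Adding or changing a rule then means editing one array entry, which mirrors how the free-spacing version is extended.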

I suggest first checking for exactly five digits ^\d{5}$ OR looking ahead for a single space between digits ^(?=\d* \d*$) among six characters .{6}$.
Combining those partial expressions yields ^\d{5}$|^(?=\d* \d*$).{6}$:
let regex = /^\d{5}$|^(?=\d* \d*$).{6}$/;
console.log(regex.test('00000')); // true
console.log(regex.test(' 00000')); // true
console.log(regex.test('00000 ')); // true
console.log(regex.test('00 000')); // true
console.log(regex.test('  00000')); // false
console.log(regex.test('00000  ')); // false
console.log(regex.test('00  000')); // false
console.log(regex.test('00 0 00')); // false
console.log(regex.test('000 000')); // false
console.log(regex.test('0000')); // false
console.log(regex.test('000000')); // false
console.log(regex.test('000 0')); // false
console.log(regex.test('000 0x')); // false
console.log(regex.test('0000x0')); // false
console.log(regex.test('x00000')); // false
Alternatively match the partial expressions separately via e.g.:
/^\d{5}$/.test(input) || input.length == 6 && /^\d* \d*$/.test(input)

This seems more intuitive to me and is O(n):
function isInputValid(input) {
    const length = input.length;
    if (length != 5 && length != 6) {
        return false;
    }
    let spaceSeen = false;
    let digitsSeen = 0;
    for (let character of input) {
        if (character === ' ') {
            if (spaceSeen) {
                return false;
            }
            spaceSeen = true;
        }
        else if (/^\d$/.test(character)) {
            digitsSeen++;
        }
        else {
            return false;
        }
    }
    return digitsSeen == 5;
}

You can split it in half:
var input = '0000 0';
// At most one space, and exactly 5 digits once that space is removed
if (/^[^ ]* ?[^ ]*$/.test(input) && /^\d{5}$/.test(input.replace(/ /, '')))
    console.log('Match');

Here's a simple regex to do the job:
^(?=[\d ]{5,6}$)\d*\s?\d*$
Explanation:
^ asserts position at start of the string
Positive Lookahead (?=[\d ]{5,6}$)
Assert that the Regex below matches
Match a single character present in the list below [\d ]{5,6}
{5,6} Quantifier — Matches between 5 and 6 times, as many times as possible, giving back as needed (greedy)
\d matches a digit (equal to [0-9])
(space) matches the space character literally (case sensitive)
$ asserts position at the end of the string
\d* matches a digit (equal to [0-9])
Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equal to [\r\n\t\f\v ])
\d* matches a digit (equal to [0-9])
Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of the string
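Running the pattern against a few inputs is a cheap way to compare it with the requirement in the question. In the sketch below (the name `simple` is just for illustration) it accepts the intended inputs, but it also appears to accept six digits with no space and four digits plus a space, so a separate digit count may still be needed:

```javascript
const simple = /^(?=[\d ]{5,6}$)\d*\s?\d*$/;

console.log(simple.test('00000'));   // true
console.log(simple.test('00 000'));  // true
console.log(simple.test('0 0 000')); // false (two spaces)
console.log(simple.test('000000'));  // true, although it has 6 digits
console.log(simple.test('0000 '));   // true, although it has only 4 digits
```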

var string = "12345 ";
var digits = string.replace(/\s/g, '');
if (string.length <= 6 && /^\d{5}$/.test(digits)) {
    alert("valid");
}
You could simply check the length and whether what remains is a valid 5-digit number...

This is how I would do it without regex:
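string => {
    const [spaces, digits] = [...string].reduce(
        ([spaces, digits], char) =>
            [spaces + (char === ' '), digits + /\d/.test(char)],
        [0, 0]
    );
    // The space is optional, and nothing but digits and spaces is allowed
    return spaces <= 1 && digits === 5 && spaces + digits === string.length;
};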

Related

regex lookahead confusion [duplicate]

I want a regex which returns true when there are at least 5 characters and 2 digits. For that, I use a lookahead (i.e. (?=...)).
// this one works
let pwRegex = /(?=.{5,})(?=\D*\d{2})/;
let result = pwRegex.test("bana12");
console.log("result", result) // true
// this one won't
pwRegex = /(?=.{5,})(?=\d{2})/;
result = pwRegex.test("bana12");
console.log("result", result) // false
Why do we need to add \D* to make it work?
To me, \d{2} seems looser than \D*\d{2}, so shouldn't it accept the test string too?
Your lookaheads only test from the current match position. Since you don't match anything, this means from the start. Since bana12 doesn't start with two digits, \d{2} fails. It's as simple as that ;)
Also, note that having \d{2} means your digits have to be adjacent. Is that your intention?
To simply require 2 digits that don't need to be adjacent, try
/(?=.{5,})(?=\D*\d\D*\d)/
Note that lookaheads are zero-width assertions and when their patterns are matched the regex index stays at the same place where it has been before. The lookaheads in the patterns above are executed at the same locations.
The /(?=.{5,})(?=\d{2})/ pattern will match a location that has any 5 chars other than line break chars immediately to the right of the current location and the first 2 chars in this 5 char substring are digits.
You need to add \D* to let other types of chars before the 2 digits.
See more about that behavior at Lookarounds Stand their Ground.
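The point that both lookaheads are tested at the same position can be seen by running the patterns on a few strings. A small sketch; the variable names are chosen here for illustration:

```javascript
const adjacentAtStart = /(?=.{5,})(?=\d{2})/;      // two digits right at the match position
const adjacentAnywhere = /(?=.{5,})(?=\D*\d{2})/;  // non-digits may come first
const twoDigitsApart = /(?=.{5,})(?=\D*\d\D*\d)/;  // the digits need not be adjacent

console.log(adjacentAtStart.test('bana12'));  // false
console.log(adjacentAtStart.test('12bana'));  // true
console.log(adjacentAnywhere.test('bana12')); // true
console.log(adjacentAnywhere.test('b1ana2')); // false
console.log(twoDigitsApart.test('b1ana2'));   // true
```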

Regex expression for serial numbers

I have the following specifications for a regex:
-> The string starts with a string of three numbers
-> It is followed by a '-'
-> That is followed by three uppercase vowels
-> That is followed by a '-'
-> That is followed by three numbers
-> That is followed by a final '-'
-> That is followed by the last three uppercase vowels.
-> Second set of numbers can not equal the first.
-> The second group of letters can not equal the first.
-> The groups of numbers may not contain zero.
A passable string is:
368-IOU-789-AIO.
An invalid string is:
368-AEO-368-AEI
354-AOU-431-AOU
Currently, I have something like this:
([0-9]+[0-9]+[0-9]+[/AEIOU/]+[0-9]+[0-9]+[0-9])
What you have won't work since + means "one or more of". For example, the sequence [0-9]+[0-9]+[0-9]+ will match anywhere between three and an infinite number of digits.
In addition, your current attempt:
allows for one to an infinite number of vowels (and possibly / character);
doesn't require a vowel set at the end;
doesn't require the - separators; and
may allow for arbitrary content before and after the match.
You should be able to use the {count} specifier to get an exact quantity. All but one of those limitations can be done with any basic regex engine, with something like:
^[1-9]{3}-[AEIOU]{3}-[1-9]{3}-[AEIOU]{3}$
The ^ and $ anchors mean start and end of string, [1-9]{3} gives you exactly three non-zero digits, [AEIOU]{3} gives you exactly three vowels, and - gives you the literal separator character.
The "groups cannot be identical" rule is a little more problematic. I would just post process for that to ensure it's not violated. The following pseudo-code is what I mean:
def isValid(str):
    if not str.regex_match("^[1-9]{3}-[AEIOU]{3}-[1-9]{3}-[AEIOU]{3}$"):
        return false
    return str[0..2] != str[8..10]
       and str[4..6] != str[12..14]
The alternative will be a rather complex regex that future developers will probably curse you for inflicting on them :-)
Note that your "The groups of numbers may not contain zero" is a little ambiguous in that it may mean no zeros are allowed or just 000 is not allowed. I've assumed the former but it's easily adjustable to cater for the latter:
def isValid(str):
    if not str.regex_match("^[0-9]{3}-[AEIOU]{3}-[0-9]{3}-[AEIOU]{3}$"):
        return false
    return str[0..2] != str[8..10]
       and str[4..6] != str[12..14]
       and str[0..2] != "000"
       and str[8..10] != "000"
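In JavaScript, the same post-processing idea might look like the sketch below. It assumes the stricter reading (no zeros at all); `isValidSerial` is a name made up here, and capture groups stand in for the fixed string indices of the pseudo-code.

```javascript
function isValidSerial(str) {
  // Capture the four groups so they can be compared after the shape check.
  const m = /^([1-9]{3})-([AEIOU]{3})-([1-9]{3})-([AEIOU]{3})$/.exec(str);
  if (m === null) {
    return false; // wrong overall shape
  }
  const [, digits1, vowels1, digits2, vowels2] = m;
  // The second groups must differ from the first ones.
  return digits1 !== digits2 && vowels1 !== vowels2;
}

console.log(isValidSerial('368-IOU-789-AIO')); // true
console.log(isValidSerial('368-AEO-368-AEI')); // false (digit groups repeat)
console.log(isValidSerial('354-AOU-431-AOU')); // false (vowel groups repeat)
```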
You can use capture group and backreferences
^([0-9]{3})-([AEIOU]{3})-(?:(?!\1)[0-9]){3}-(?:(?!\2)[AEIOU]){3}$
Regarding "The groups of numbers may not contain zero": if you meant that only the digits 1 to 9 are allowed, you can replace [0-9] with [1-9].
If you only want to rule out 000, you can add a negative lookahead ^(?!.*000) instead.
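The backreference pattern can be checked against the example strings from the question. A minimal sketch using the [1-9] variant; `serial` is an illustrative name:

```javascript
// (?!\1) and (?!\2) reject a second group that repeats the first one.
const serial = /^([1-9]{3})-([AEIOU]{3})-(?:(?!\1)[1-9]){3}-(?:(?!\2)[AEIOU]){3}$/;

console.log(serial.test('368-IOU-789-AIO')); // true
console.log(serial.test('368-AEO-368-AEI')); // false (digit groups repeat)
console.log(serial.test('354-AOU-431-AOU')); // false (vowel groups repeat)
console.log(serial.test('368-IOU-708-AIO')); // false (contains a zero)
```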

How does the following code mean two consecutive numbers?

This is from an exercise on FCC beta and I cannot understand how the following code means two consecutive numbers, seeing as \D* means zero or more non-digits and \d means a digit. How does this add up to two numbers in a regexp?
let checkPass = /(?=\w{5,})(?=\D*\d)/;
This does not match two numbers. It doesn't really match anything except an empty string, as there is nothing preceding the lookahead.
If you want to match two digits, you can do something like this:
(\d)(\d)
Or if you really want to do a positive lookahead with the (?=\D*\d) section, you will have to do something like this:
\d(?=\D*\d)
This will match any digit which is followed by a bunch of non-digits and a single digit. A few examples (matched numbers highlighted):
2 hhebuehi3
^
245673
^^^^^
2v jugn45
^      ^
To also capture the second digit, you will have to put brackets around both numbers. Ie:
(\d)(?=\D*(\d))
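To see which digits the capturing variant picks up, the matches can be listed with `String.prototype.matchAll`, which requires the g flag. A small sketch; `digitPairs` and `pairs` are names invented here:

```javascript
// Each match is a digit (group 1) plus the next digit found after
// any run of non-digits (group 2, captured inside the lookahead).
const digitPairs = /(\d)(?=\D*(\d))/g;

function pairs(text) {
  return [...text.matchAll(digitPairs)].map(m => [m.index, m[1], m[2]]);
}

console.log(pairs('2 hhebuehi3')); // [[0, '2', '3']]
console.log(pairs('2v jugn45'));   // [[0, '2', '4'], [7, '4', '5']]
console.log(pairs('abc'));         // []
```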
In order to do what your original example wants, ie:
number
5+ \w characters
a non-number character
a number
... you will need to precede your original example with a \d character. This means that your lookaheads will actually match something which isn't just an empty string:
\d(?=\w{5,})(?=\D*\d)
IMPORTANT EDIT
After playing around a bit more with a JavaScript online console, I have worked out the problem with your original Regex.
The original regex matches a string with 5 or more word characters, including at least 1 digit. This can match two digits, but it can also match 1 digit, 3 digits, 12 digits, etc. In order to require two consecutive digits in a string of 5-or-more characters, you should specify the number of digits you want in the second lookahead:
let regex = /(?=\w{5,})(?=\D*\d{2})/;
let string1 = "abcd2";
let regex1 = /(?=\w{5,})(?=\D*\d)/;
console.log("string 1 & regex 1: " + regex1.test(string1));
let regex2 = /(?=\w{5,})(?=\D*\d{2})/;
console.log("string 1 & regex 2: " + regex2.test(string1));
let string2 = "abcd23";
console.log("string 2 & regex 2: " + regex2.test(string2));
My original answer was about Regex in a vacuum and I glossed over the fact that you were using Regex in conjunction with JavaScript, which works a little differently when comparing Regex to a string. I still don't know why your original answer was supposed to match two numbers, but I hope this is a bit more helpful.
(?=...) Positive lookahead
\w{5,} matches any word character (equal to [a-zA-Z0-9_])
{5,} matches between 5 and unlimited times
\D* matches any character that's not a digit (equal to [^0-9])
* matches between zero and unlimited times
\d matches a digit (equal to [0-9])
This expression is global - so tries to match all
You can always check your expression using regex101

JQuery match with RegEx not working

I have a filename that will be something along the lines of this:
Annual-GDS-Valuation-30th-Dec-2016-082564K.docx
It will contain 5 numbers followed by a single letter, but it may be in a different position in the file name. The leading zero may or may not be there, but it is not required.
This is the code I came up with after checking examples; however, SelectedFileClientID is always null:
var SelectedFileClientID = files.match(/^d{5}\[a-zA-Z]{1}$/);
I'm not sure what it is I am doing wrong.
Edit:
The 0 has nothing to do with the code I am trying to extract. It may or may not be there, and it could even be a completely different character, or more than one, but has nothing to do with it at all. The client has decided they want to put additional characters there.
There are at least 3 issues with your regex: 1) the pattern is enclosed with anchors, and thus requires a full string match, 2) the d matches a letter d, not a digit, you need \d to match a digit, 3) a \[ matches a literal [, so the character class is ruined.
Use
/\d{5}[a-zA-Z]/
Details:
\d{5} - 5 digits
[a-zA-Z] - an ASCII letter
JS demo:
var s = 'Annual-GDS-Valuation-30th-Dec-2016-082564K.docx';
var m = s.match(/\d{5}[a-zA-Z]/);
console.log(m[0]);
All right, there are a few things wrong...
var matches = files.match(/\-0?(\d{5}[a-zA-Z])\.[a-z]{3,}$/);
var SelectedFileClientID = matches ? matches[1] : '';
So:
First, I get the matches on your string -- .match()
Then, your file name will not start with the digits - so drop the ^
You had forgotten the backslash for digits: \d
Do not backslash your square bracket - it's here used as a regular expression token
no need for the {1} for your letters: the square bracket content is enough as it will match one, and only one letter.
Hope this helps!
Try this pattern: \d{5}[a-zA-Z]
Try: 0?\d{5}[a-zA-Z]
As you mentioned, 0 may or may not be there, so 0? will take that into account.
Alternatively, it can be done like this, which can match any random character:
(\w+|\W+|\d+)?\d{5}[a-zA-Z]

Regex to match only when expression match is no more than 12 characters long

I am trying to create a regular expression (Java/JavaScript) that matches the following regex, but only when there are fewer than 13 characters total (and a minimum of 4).
(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ DDD)?) ← originally posted
(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ [A-Z]+)?)
These values should (and do) match:
MED-123
COTA-1224
MED4
COTB-892K777
MED-33 DDD
MED-234J5678
This value matches, but I don't want it to (I want to match only if there are no more than 12 characters in total):
COT-1111J11111111111111
See http://regexr.com/3bs7b http://regexr.com/3bsfv
I have tried grouping my expression and putting {4,12} at the end, but that just makes it look for 4 to 12 instances of the whole expression matching.
I feel like I am missing something simple...thanks in advance for your help!
You can use negative look-ahead:
(?!.{13,})(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ DDD)?)
Since your expression already makes sure that a match starts with COT or MED and that there is at least one digit after that, it already guarantees that there are at least 4 characters.
"I have tried grouping my expression and putting {4,12} at the end, but that just makes it look for 4 to 12 instances of the whole expression matching."
This looks for 4 to 12 instances of the whole expression because you didn't add a word boundary \b. Your regex works fine, just add a word boundary and your desired outcome would be achieved. Take a look at this DEMO.
Your regex seems rather clumsy and is a little hard to read. It is also limited to certain characters, for example J and K, unless you want it to be that way. For a more general pattern, you can check this out:
(COT|MED)[AB]?-?[\dJK]{1,8}(\s+D{1,3})?\b
(COT|MED): matches either COT or MED
[AB]?: matches A or B which is optional because of the presence of ?
-?: matches - which is also optional
[\dJK]{1,8}: This matches a number,or J or K with a length of at least one character and a maximum of eight characters.
(\s+D{1,3})?: matches one or more whitespace characters followed by one to three D characters; the whole group is optional
\b: with respect to your question this seems to be the most important and it creates a boundary for the words that have already been matched. This means that anything exceeding the matched pattern would not be captured.
The answer you are looking for is
(?!\S{13})(?:COT|MED)[ABCD]?-?\d{1,4}(?:[JK]+\d*|(?: [A-Z]+)?)
Note that it is almost impossible to check the length of a phrase that is not a whole string or that has spaces inside since boundaries are a bit "blurred". Thus, (?!\S{13}) is a kind of a workaround that just makes sure you do not have a string without whitespace that is 13 characters long or longer.
The regex breakdown:
(?!\S{13}) - Check if the substring that follows does not consist of 13 non-whitespace characters
(?:COT|MED) - Any of the values in the alternation (COT or MED)
[ABCD]?-? - Optional A, B, C, D and then an optional -
\d{1,4} - 1 to 4 digits
(?:[JK]+\d*|(?: [A-Z]+)?) - a group of 2 alternatives:
[JK]+\d* - J or K, 1 or more times, and then 0 or more digits
(?: [A-Z]+)? - optional space and 1 or more Latin uppercase letters
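As this answer suggests, you could solve it with two lookaheads, one for the format and one for the length (assuming the code runs to the end of the string):
(?=(COT|MED)[ABCD]?-?[0-9]{1,4}(([JK]+[0-9]*)|(\ DDD)?)$)(?=.{4,12}$)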
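For whole-string validation in JavaScript, the length limit can be tried out with a lookahead anchored at the end of the string. The sketch below assumes the code is the entire input and uses the 4 to 12 character limit from the question; `codePattern` and `samples` are illustrative names:

```javascript
// (?=.{4,12}$) checks the total length before the real pattern runs.
const codePattern = /^(?=.{4,12}$)(?:COT|MED)[ABCD]?-?\d{1,4}(?:[JK]+\d*|(?: [A-Z]+)?)$/;

const samples = [
  'MED-123', 'COTA-1224', 'MED4', 'COTB-892K777', 'MED-33 DDD', 'MED-234J5678',
  'COT-1111J11111111111111',
];
for (const s of samples) {
  console.log(s, codePattern.test(s)); // only the last sample prints false
}
```

If the code has to be found inside a longer text instead, the anchors no longer apply and a workaround such as the (?!\S{13}) check is needed.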
