regex to extract number and date from string - javascript

I'm trying to extract a date, percentage or number from a string. The strings can be:
the response value 10 (from here I want to extract 10)
the response value 10/12/2014 (from here I want to extract 10/12/2014)
the response value 08/2015 (from here I want to extract 08/2015)
I've written the regex (?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?), which matches 12/12/2014, 10 and 02/2012.
I'm also trying to modify the same regex to get 10, 08/2015 and 10/10/2015, but I can't work out how.
How can this be achieved?

To match your example data, you could use an alternation that matches either 2 digits / 4 digits, or 2 digits followed by an optional /2 digits/4 digits part.
\b(?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?)\b
Explanation
\b Word boundary, prevents the match from being part of a larger word
(?: Non capture group
\d{2}\/\d{4} Match 2 digits/4 digits
| Or
\d{2} Match 2 digits
(?:\/\d{2}\/\d{4})? Optionally match /2 digits/4 digits
) Close group
\b Word boundary
Regex demo
Note that 2 and 4 digits could also match 99 and 9999. If you want to make your match more specific, this page can be helpful https://www.regular-expressions.info/dates.html
const pattern = /\b(?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?)\b/;
[
"the response value 10",
"the response value 10/12/2014",
"the response value 08/2015"
].forEach(s => console.log(s.match(pattern)[0]));

Just for fun (regex is fun) an alternative to the accepted answer:
\b(?:(?:\d\d\/){1,2}\d{4}|\d\d)\b
See the Online Demo
\b - Match a word boundary.
(?: - 1st Non-capturing group.
(?: - 2nd Non-capturing group.
\d\d\/ - Match two digits and a literal forward slash.
){1,2} - Close 2nd non-capturing group and use it once or twice.
\d{4} - Match four digits.
| - Alternation (OR).
\d\d) - Two digits and close 1st non capturing group.
\b - Match a word boundary.
Maybe we can do this even without alternation:
\b\d\d(?:(?:\/\d\d){1,2}\d\d)?\b
See the Online Demo
\b - Match a word boundary.
\d\d - Match two digits.
(?: - 1st Non-capturing group.
(?: - 2nd Non-capturing group.
\/\d\d - Match a literal slash and two digits.
){1,2} - Close 2nd non-capturing group and use it once or twice.
\d\d - Match two digits.
)? - Close 1st non-capturing group and make it optional.
\b - Match a word boundary.
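As a quick check, the three patterns discussed so far can be run against the sample strings from the question. This is only a sketch: the labels and variable names (`patterns`, `samples`, `results`) are illustrative and not taken from the original posts.

```javascript
// The three patterns from the answers above, under illustrative labels.
const patterns = {
  alternation: /\b(?:\d{2}\/\d{4}|\d{2}(?:\/\d{2}\/\d{4})?)\b/,
  repeatedGroup: /\b(?:(?:\d\d\/){1,2}\d{4}|\d\d)\b/,
  noAlternation: /\b\d\d(?:(?:\/\d\d){1,2}\d\d)?\b/,
};

// Sample strings from the question.
const samples = [
  "the response value 10",
  "the response value 10/12/2014",
  "the response value 08/2015",
];

// For each pattern, collect the first match in every sample (null if none).
const results = {};
for (const [name, re] of Object.entries(patterns)) {
  results[name] = samples.map((s) => {
    const m = s.match(re);
    return m ? m[0] : null;
  });
  console.log(name, results[name]);
}
```

On these three inputs each pattern extracts 10, 10/12/2014 and 08/2015 respectively.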

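The match method accepts a RegExp and returns an array: the whole match at index 0, followed by one entry per capture group:
var date = "12/12/2014";
var arr = date.match(/(\d{2})[\/](\d{2})[\/](\d{4})/);
console.log(arr[0]); // whole match: "12/12/2014"
console.log(arr[1]); // first group: "12"
console.log(arr[2]); // second group: "12"
console.log(arr[3]); // third group: "2014"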


REGEX for matching number with two and three numbered patterns together

I have a sequence of 5 digits. I'd like it to match as long as it contains three of one digit and two of a different digit; placement does not matter. The sequence can be any string of 5 digits between 1 and 5.
Examples of matches would be:
33322
24422
52225
44111
54545
*basically any grouping of 2 and 3 of the same numbers needs to match.
Best I've come up with so far:
^([0-9])\1{2}|([0-9])\1{1}$
I am not so good with regex, any help would be greatly appreciated.
You can use
^(?=[1-5]{5}$)(?=.*(\d)(?:.*\1){2})(?=.*(?!\1)(\d).*\2)\d+$
^(?=.*(\d)(?:.*\1){2})(?=.*(?!\1)(\d).*\2)[1-5]{5}$
See the regex demo.
If you want to allow any digits, replace [1-5] with \d.
Details:
^ - start of string
(?=[1-5]{5}$) - there must be five digits from 1 to 5 till end of string (this lookahead makes non-matching strings fail quicker)
(?=.*(\d)(?:.*\1){2}) - a positive lookahead that requires any zero or more chars as many as possible, followed with a digit (captured into Group 1) and then two sequences of any zero or more chars as many as possible and the same digit as captured into Group 1 immediately to the right of the current location
(?=.*(?!\1)(\d).*\2) - a positive lookahead that requires any zero or more chars as many as possible, followed with a digit (captured into Group 2) that is not equal to the digit in Group 1, and then any zero or more chars as many as possible and the same digit as captured into Group 2 immediately to the right of the current location
\d+ - one or more digits
$ - end of string.
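The breakdown above can be tried out on the examples from the question plus a few inputs that should be rejected. This is a small sketch; the names `twoAndThree`, `shouldMatch` and `shouldFail` are made up for illustration, and the rejected inputs are my own additions.

```javascript
// Second pattern from the answer: two lookaheads, then exactly five digits 1-5.
const twoAndThree = /^(?=.*(\d)(?:.*\1){2})(?=.*(?!\1)(\d).*\2)[1-5]{5}$/;

// Examples from the question.
const shouldMatch = ["33322", "24422", "52225", "44111", "54545"];

// Illustrative negatives: five of a kind, four plus one, all different,
// three plus one plus one, too short, digits outside 1-5.
const shouldFail = ["33333", "44441", "12345", "33321", "3332", "66677"];

const accepted = shouldMatch.filter((s) => twoAndThree.test(s));
const rejected = shouldFail.filter((s) => !twoAndThree.test(s));

console.log("accepted:", accepted);
console.log("rejected:", rejected);
```

All five examples are accepted and all six negatives are rejected, because only one digit can occur three times in a five-character string, and the second lookahead then demands a different digit that occurs at least twice.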
There are many ways to do that. One is to match the following regular expression.
^(?=([1-5]).*\1)(?=.+(?!\1)([1-5]).*\2)(?:\1|\2){5}$
The idea is as follows.
use a positive lookahead to match and save to capture group 1 the first digit and require it to appear at least twice;
use a positive lookahead to match and save to capture group 2 a digit that is different from the digit in capture group 1 and require it to appear at least twice;
match a five-character string that contains only the digits in the two capture groups.
Demo
The regular expression can be broken down as follows.
^ # match beginning of string
(?= # begin a positive lookahead
([1-5]) # match a digit 1-5 and save to capture group 1
.* # match zero or more characters
\1 # match the digit in capture group 1
) # end positive lookahead
(?= # begin a positive lookahead
.+ # match one or more characters
(?!\1) # next character is not the digit in capture group 1
([1-5]) # match a digit 1-5 and save to capture group 2
.* # match zero or more characters
\2 # match the digit in capture group 2
) # end positive lookahead
(?:\1|\2){5}$ # match a 5-character string comprised of the digits
# in the two capture groups
Here's a second expression that could be used:
^(?=([1-5])\1*(?!\1)([1-5])(?:\1*\2){1,2}\1*$).{5}$
Demo
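Since the input space is tiny (5^5 = 3125 strings), the patterns from both answers can be compared exhaustively against a plain counting check. The helper names below (`hasTwoAndThree`, `allCandidates`, `candidatePatterns`) are hypothetical, and the counting function is my own reference implementation, not something from the answers.

```javascript
// Reference check without regex: exactly two distinct digits, occurring 2 and 3 times.
function hasTwoAndThree(s) {
  const counts = new Map();
  for (const ch of s) counts.set(ch, (counts.get(ch) || 0) + 1);
  const sizes = [...counts.values()].sort((a, b) => a - b);
  return sizes.length === 2 && sizes[0] === 2 && sizes[1] === 3;
}

// Every 5-character string over the digits 1-5.
function allCandidates() {
  const out = [];
  for (let n = 0; n < 5 ** 5; n++) {
    let s = "";
    for (let i = 0, rest = n; i < 5; i++, rest = Math.floor(rest / 5)) {
      s += String((rest % 5) + 1);
    }
    out.push(s);
  }
  return out;
}

// Patterns taken from the two answers above.
const candidatePatterns = [
  /^(?=.*(\d)(?:.*\1){2})(?=.*(?!\1)(\d).*\2)[1-5]{5}$/,
  /^(?=([1-5]).*\1)(?=.+(?!\1)([1-5]).*\2)(?:\1|\2){5}$/,
  /^(?=([1-5])\1*(?!\1)([1-5])(?:\1*\2){1,2}\1*$).{5}$/,
];

// Record every (pattern, input) pair where the regex and the counting check disagree.
const disagreements = [];
for (const s of allCandidates()) {
  for (const re of candidatePatterns) {
    if (re.test(s) !== hasTwoAndThree(s)) disagreements.push([String(re), s]);
  }
}
console.log("disagreements:", disagreements.length);
```

A check like this is cheap here and gives more confidence than a handful of hand-picked examples.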

Javascript: Regex to exclude whitespace and special characters

I need a regex to validate,
Should be of length 18
First 5 characters should be either (xyz34|xyz12)
Remaining 13 characters should be alphanumeric: only letters and numbers, no whitespace or special characters allowed.
I have a pattern like here, '/^(xyz34|xyz12)((?=.*[a-zA-Z])(?=.*[0-9])){13}/g'
But this allows whitespace and special characters (such as $ and %), which violates rule #3.
Any suggestions on how to exclude whitespace and special characters and strictly check that the remaining characters are letters and numbers?
You should not quantify lookarounds. They are non-consuming patterns, i.e. the consecutive positive lookaheads check the presence of their patterns but do not advance the regex index; they check the text at the same position. It makes no sense to repeat them 13 times. ^(xyz34|xyz12)((?=.*[a-zA-Z])(?=.*[0-9])){13} is equal to ^(xyz34|xyz12)(?=.*[a-zA-Z])(?=.*[0-9]), and means the string can start with xyz34 or xyz12 and then should have at least 1 letter and at least 1 digit.
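To see the non-consuming behaviour concretely, here is a minimal sketch comparing the quantified and unquantified versions. The test string `bad` and the variable names are made up for illustration.

```javascript
const quantified = /^(xyz34|xyz12)((?=.*[a-zA-Z])(?=.*[0-9])){13}/;
const unquantified = /^(xyz34|xyz12)(?=.*[a-zA-Z])(?=.*[0-9])/;

// 18 characters, but the tail contains "$" and a space.
const bad = "xyz34abc$ef ghijk1";

// Both report a match: the lookaheads only peek ahead for a letter and a digit.
const quantifiedResult = quantified.test(bad);
const unquantifiedResult = unquantified.test(bad);
console.log(quantifiedResult, unquantifiedResult);

// The lookaheads consume nothing, so the overall match is just the prefix.
const consumed = bad.match(quantified)[0];
console.log(consumed);
```

The matched text is only the 5-character prefix, which shows that nothing after it was ever validated character by character.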
You may consider fixing the issue by using a consuming pattern like this:
If you do not care if the last 13 chars contain only digits or only letters, use the patterns suggested by other users, like /^(?:xyz34|xyz12)[a-zA-Z\d]{13}$/ or /^xyz(?:34|12)[a-zA-Z0-9]{13}$/
If there must be at least 1 digit and at least 1 letter among those 13 alphanumeric chars, use /^xyz(?:34|12)(?=[a-zA-Z]*\d)(?=\d*[a-zA-Z])[a-zA-Z\d]{13}$/.
See the regex demo #1 and the regex demo #2.
NOTE: these are regex literals, do not use them inside single- or double quotes!
Details
^ - start of string
xyz - a common prefix
(?:34|12) - a non-capturing group matching 34 or 12
(?=[a-zA-Z]*\d) - there must be at least 1 digit after any 0+ letters to the right of the current location
(?=\d*[a-zA-Z]) - there must be at least 1 letter after any 0+ digits to the right of the current location
[a-zA-Z\d]{13} - 13 letters or digits
$ - end of string.
JS demo:
var strs = ['xyz34abcdefghijkl1','xyz341bcdefghijklm','xyz34abcdefghijklm','xyz341234567890123','xyz14a234567890123'];
var rx = /^xyz(?:34|12)(?=[a-zA-Z]*\d)(?=\d*[a-zA-Z])[a-zA-Z\d]{13}$/;
for (var s of strs) {
console.log(s, "=>", rx.test(s));
}
.* will match any string. For your requirement you can use this:
/^xyz(34|12)[a-zA-Z0-9]{13}$/g
regex fiddle
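A side note on the g flag in the pattern above, which the answer itself does not discuss: when a global regex object is reused with test(), matching resumes from its lastIndex property, so repeated calls on the same input can alternate between true and false. A small sketch, with illustrative variable names:

```javascript
const withGlobal = /^xyz(34|12)[a-zA-Z0-9]{13}$/g;
const withoutGlobal = /^xyz(34|12)[a-zA-Z0-9]{13}$/;
const input = "xyz34abcdefghijkl1"; // 18 characters, valid

// With g, a successful test() leaves lastIndex at the end of the match,
// so the next call starts there, fails, and resets lastIndex to 0.
const globalResults = [
  withGlobal.test(input),
  withGlobal.test(input),
  withGlobal.test(input),
];

// Without g, every call starts from index 0.
const plainResults = [
  withoutGlobal.test(input),
  withoutGlobal.test(input),
  withoutGlobal.test(input),
];

console.log(globalResults); // [true, false, true]
console.log(plainResults);  // [true, true, true]
```

For validating a whole string with ^ and $, the g flag adds nothing, so dropping it is the simpler choice.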
/^(xyz34|xyz12)[a-zA-Z0-9]{13}$/
This should work,
^ asserts position at the start of the string (no m flag is used)
1st Capturing Group (xyz34|xyz12)
1st Alternative xyz34 matches the characters xyz34 literally (case sensitive)
2nd Alternative xyz12 matches the characters xyz12 literally (case sensitive)
Match a single character present in the list below [a-zA-Z0-9]{13}
{13} Quantifier: matches exactly 13 times
$ asserts position at the end of the string

How do I enforce that certain characters must be present when there is an optional character before them?

I would like to capture a string that meets the criteria:
may be empty
if it is not empty it must have up to three digits (-> \d{1,3})
may be optionally followed by a uppercase letter ([A-Z]?)
may be optionally followed by a forward slash (i.e. /) (-> \/?); if it is followed by a forward slash it must have from one to three digits (-> \d{1,3})
Here's a valid input:
35
35A
35A/44
Here's invalid input:
34/ (note the lack of a digit after '/')
I've come up with ^\d{0,3}[A-Z]{0,1}/?[1,3]?$, which satisfies conditions 1-3. How do I deal with condition 4? My regex fails on two occasions:
it fails to match when there is a digit, a forward slash and a digit, e.g. 77A/7
it matches, but shouldn't, when there is a digit and a forward slash, e.g. 77/
You may use
/^(?:\d{1,3}[A-Z]?(?:\/\d{1,3})?)?$/
See the regex demo
Details
^ - start of string
(?:\d{1,3}[A-Z]?(?:\/\d{1,3})?)? - an optional non-capturing group:
\d{1,3} - one to three digits
[A-Z]? - an optional uppercase ASCII letter
(?:\/\d{1,3})? - an optional non-capturing group:
\/ - a / char
\d{1,3} - 1 to 3 digits
$ - end of string.
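If the individual parts are needed after validation, the same structure can be written with named capture groups. This is a sketch under assumptions: the group names (`number`, `letter`, `suffix`) and the `parse` helper are hypothetical and not part of the answer.

```javascript
// Same shape as the pattern above, with named groups for each part.
const partsRe = /^(?:(?<number>\d{1,3})(?<letter>[A-Z])?(?:\/(?<suffix>\d{1,3}))?)?$/;

// Returns null for invalid input, otherwise the parts (null when absent).
function parse(input) {
  const m = input.match(partsRe);
  if (!m) return null;
  const { number, letter, suffix } = m.groups;
  return {
    number: number ?? null,
    letter: letter ?? null,
    suffix: suffix ?? null,
  };
}

console.log(parse("35A/44")); // { number: '35', letter: 'A', suffix: '44' }
console.log(parse("35"));     // { number: '35', letter: null, suffix: null }
console.log(parse("34/"));    // null
console.log(parse(""));       // { number: null, letter: null, suffix: null }
```

The empty string is accepted because the whole outer group is optional, which corresponds to the "may be empty" requirement.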
This should work. You were matching an optional slash and then an optional digit from 1 to 3; this matches an optional combination of a slash and 1-3 of any digits. Also, your original regex could match 0 digits at the beginning; I believe that this was in error, so I fixed that.
var regex = /^(\d{1,3}[A-Z]{0,1}(\/\d{1,3})?)?$/g;
console.log("77A/7 - "+!!("77A/7").match(regex));
console.log("77/ - "+!!("77/").match(regex));
console.log("35 - "+!!("35").match(regex));
console.log("35A - "+!!("35A").match(regex));
console.log("35A/44 - "+!!("35A/44").match(regex));
console.log("35/44 - "+!!("35/44").match(regex));
console.log("34/ - "+!!("34/").match(regex));
console.log("A/3 - "+!!("A/3").match(regex));
console.log("[No string] - "+!!("").match(regex));

Regex validation for mixed digits for a max of 6 characters

I need a regex that validates a string with a total length of 6 characters, made up of 4-6 uppercase letters or digits and 0-2 spaces.
I tried like
^[A-Z0-9]{4,6}+[\s]{0,2}$
but it results in a max length of 8 characters, but I need a max of 6 characters.
If the alphanumeric chars should only appear at the start of the string and the whitespaces can appear at the end (i.e. the order of the alphanumerics and whitespaces matters), you may use
/^(?=.{6}$)[A-Z0-9]{4,6}\s*$/
See the regex demo
Details
^ - start of string
(?=.{6}$) - the string length is restricted to exactly 6 non-line break chars
[A-Z0-9]{4,6} - 4, 5 or 6 uppercase ASCII letters or digits
\s* - 0+ whitespaces (but actually, only 0, 1 or 2 will be possible to add as the total length is already validated with the lookahead)
$ - end of string.
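A short sketch of how the length lookahead and the consuming part work together. The sample inputs and the names `fixedOrder` and `cases` are illustrative only.

```javascript
const fixedOrder = /^(?=.{6}$)[A-Z0-9]{4,6}\s*$/;

// Illustrative inputs mapped to the expected outcome.
const cases = {
  "ABCD  ": true,   // 4 alphanumerics + 2 spaces
  "ABCDE ": true,   // 5 alphanumerics + 1 space
  "ABCDEF": true,   // 6 alphanumerics
  "ABCD ": false,   // only 5 characters in total
  "ABCDEF ": false, // 7 characters in total
  "ABC   ": false,  // only 3 alphanumerics
  "AB  CD": false,  // spaces in the middle are not allowed by this variant
};

// Collect every input whose actual result differs from the expected one.
const failures = Object.entries(cases).filter(
  ([input, expected]) => fixedOrder.test(input) !== expected
);
console.log("unexpected results:", failures.length);
```

The lookahead alone fixes the total length; the consuming part then decides which characters are allowed and in which order.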
If you want to match the alphanumeric and whitespaces anywhere inside the string, you need a lookaround based regex like
^(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$)(?=(?:\S*\s){0,2}\S*$)[A-Z0-9\s]{6}$
See the regex demo
Details
^ - start of string
(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$) - a positive lookahead that requires the presence of 4 to 6 letters or digits anywhere inside the string
(?=(?:\S*\s){0,2}\S*$) - a positive lookahead that requires the presence of 0 to 2 whitespaces anywhere inside the string
[A-Z0-9\s]{6} - 6 ASCII uppercase letters, digits or whitespaces
$ - end of string.
To shorten the pattern, the second lookahead can be written as (?!(?:\S*\s){3}), it will fail the match if there are 3 whitespace chars anywhere inside the string. See the regex demo.
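The two variants of the order-independent pattern (with the positive and with the shortened negative lookahead) can be compared on a few inputs. The inputs and variable names below are made up for illustration.

```javascript
// Variant with the positive whitespace lookahead.
const anywhere = /^(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$)(?=(?:\S*\s){0,2}\S*$)[A-Z0-9\s]{6}$/;
// Variant with the shortened negative lookahead.
const anywhereShort = /^(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$)(?!(?:\S*\s){3})[A-Z0-9\s]{6}$/;

// Illustrative inputs mapped to the expected outcome.
const anywhereCases = {
  "AB 1 C": true,  // 4 alphanumerics, 2 spaces, mixed order
  " ABCD ": true,  // spaces at both ends
  "ABCDEF": true,  // no spaces
  "A B C ": false, // 3 alphanumerics, 3 spaces
  "ABC   ": false, // 3 alphanumerics, 3 spaces
  "AB CD": false,  // only 5 characters
  "ab cd1": false, // lowercase letters are not allowed
};

// Collect every input where either variant differs from the expected result.
const anywhereFailures = Object.entries(anywhereCases).filter(
  ([input, expected]) =>
    anywhere.test(input) !== expected || anywhereShort.test(input) !== expected
);
console.log("unexpected results:", anywhereFailures.length);
```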
You can use | characters to accommodate several cases into one.
const regex = /(^[A-Z0-9]{4}\s{2}$)|(^[A-Z0-9]{5}\s$)|(^[A-Z0-9]{6}$)/g;
alert(regex.test(prompt('Enter input, including space(s)')));
If you want to match zero, one or two spaces at the end, you could use an alternation for those 3 cases.
^(?:[A-Z0-9]{4}[ ]{2}|[A-Z0-9]{5}[ ]|[A-Z0-9]{6})$
Regex demo
Explanation
^ Assert the start of the string
(?: Non capturing group
[A-Z0-9]{4}[ ]{2} Match uppercase or digit 4 times followed by 2 spaces
| Or
[A-Z0-9]{5}[ ] Match uppercase or digit 5 times followed by 1 space
| Or
[A-Z0-9]{6} Match uppercase or digit 6 times
) Close non capturing group
$ Assert the end of the string

Capture between pattern of digits

I'm stuck trying to capture a structure like this:
1:1 wefeff qwefejä qwefjk
dfjdf 10:2 jdskjdksdjö
12:1 qwe qwe: qwertyå
I would want to match everything between the digits, followed by a colon, followed by another set of digits. So the expected output would be:
match 1 = 1:1 wefeff qwefejä qwefjk dfjdf
match 2 = 10:2 jdskjdksdjö
match 3 = 12:1 qwe qwe: qwertyå
Here's what I have tried:
\d+\:\d+.+
But that fails if there are word characters spanning two lines.
I'm using a javascript based regex engine.
You may use a regex based on a tempered greedy token:
/\d+:\d+(?:(?!\d+:\d)[\s\S])*/g
The \d+:\d+ part will match one or more digits, a colon, one or more digits and (?:(?!\d+:\d)[\s\S])* will match any char, zero or more occurrences, that do not start a sequence of one or more digits followed with a colon and a digit. See this regex demo.
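Applied to the sample text from the question, the tempered greedy token pattern behaves as follows. This is a sketch: the whitespace normalisation step is my own addition for display purposes, and the variable names are illustrative.

```javascript
// Sample text from the question, joined with line breaks.
const text = [
  "1:1 wefeff qwefejä qwefjk",
  "dfjdf 10:2 jdskjdksdjö",
  "12:1 qwe qwe: qwertyå",
].join("\n");

const tempered = /\d+:\d+(?:(?!\d+:\d)[\s\S])*/g;

// Raw matches keep the line breaks and trailing whitespace;
// collapse whitespace runs and trim each match for readability.
const rawMatches = text.match(tempered) || [];
const matches = rawMatches.map((m) => m.replace(/\s+/g, " ").trim());

matches.forEach((m, i) => console.log(`match ${i + 1} = ${m}`));
```

Note that "qwe:" in the last line does not end a match, because the colon there is not preceded by a digit.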
As the tempered greedy token is a resource consuming construct, you can unroll it into a more efficient pattern like
/\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g
See another regex demo.
Now, the tempered greedy token (?:(?!\d+:\d)[\s\S])* is turned into a pattern that matches strings linearly:
\D* - 0+ non-digit symbols
(?: - start of a non-capturing group matching zero or more sequences of:
\d - a digit that is...
(?!\d*:\d) - not followed with 0+ digits, : and a digit
\D* - 0+ non-digit symbols
)* - end of the non-capturing group.
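A quick way to gain confidence in the unrolled version is to check that it returns the same matches as the tempered greedy token on a couple of inputs. The second input, which contains standalone digits, and all variable names are made up for illustration; agreement on these inputs is a sanity check, not a proof of equivalence.

```javascript
const temperedRe = /\d+:\d+(?:(?!\d+:\d)[\s\S])*/g;
const unrolledRe = /\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g;

const inputs = [
  "1:1 wefeff qwefejä qwefjk\ndfjdf 10:2 jdskjdksdjö\n12:1 qwe qwe: qwertyå",
  "3:4 room 12 has 7 chairs 5:6 end", // standalone digits must not end a match
];

// For each input, compare the full list of matches from both patterns.
const comparison = inputs.map((input) => {
  const a = input.match(temperedRe) || [];
  const b = input.match(unrolledRe) || [];
  return { count: a.length, same: JSON.stringify(a) === JSON.stringify(b) };
});

console.log(comparison);
```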
You can include ñ/Ñ or leave them out, but this should work:
\d+?:\d+? [a-zñA-ZÑ ]*
Edited:
If you want to include line breaks, you can add \n or \r to the set:
\d+?:\d+? [a-zñA-ZÑ\n ]*
\d+?:\d+? [a-zñA-ZÑ\r ]*
Give it a try! Also tested on https://regex101.com/
For more characters:
^[a-zA-Z0-9!##\$%\^\&*)(+=._-]+$
