I have a set of hyphen-separated IDs. Each ID is at least 6 characters long and consists of alphanumeric characters, optionally with some special characters at the end. IDs made up of digits only are not allowed.
For example:
YUIO-10GB-BG4 ==> Should match
U-VI1.1-100-WX-Y9 ==> Should match
1-800-553-6387 ==> Shouldn't match because all are digits
T-Series ==> Shouldn't match as only 2 letters are capital
I am trying the pattern below with these rules, but it fails on some test inputs.
((?=\S{6,})[A-Z]{1,}(([A-Z0-9./+~]+){0,}-){1,}[A-Z0-9./+~]+=*)
https://regex101.com/r/d8MFRE/5
You may use the following regex to get all occurrences; its (?=\S*[A-Z]) lookahead filters out those that do not contain an uppercase letter:
/(?:^|\s)(?=\S{6,})(?=\S*[A-Z])([A-Z0-9./+~]+(?:-[A-Z0-9./+~]+)+=*)(?!\S)/g
See the regex demo.
Details
(?:^|\s) - a start of string or whitespace
(?=\S{6,}) - there must be 6 or more non-whitespace chars immediately to the right
(?=\S*[A-Z]) - there must be at least 1 uppercase ASCII letter after 0+ non-whitespace chars
([A-Z0-9./+~]+(?:-[A-Z0-9./+~]+)+=*) - Group 1:
[A-Z0-9./+~]+ - 1+ uppercase ASCII letters, digits, ., /, +, ~
(?:-[A-Z0-9./+~]+)+ - 1+ occurrences of:
- - a - char
[A-Z0-9./+~]+ - 1+ uppercase ASCII letters, digits, ., /, +, ~
=* - 0+ = symbols
(?!\S) - a whitespace or end of string.
See the JS demo:
var s = "1-2-444555656-54545 800-CVB-4=\r\nThe ABC-CD40N= is also supported onslots GH-K on the 4000 Series ISRs using the \r\nXYZ-X-THM . This SM-X-NIM-REW34= information is not captured in the table above \r\nTERMS WITH ONLY DIGITS SHOUD NOT MATCH --> 1-800-553-6387 \r\nNumber of chars less than 6 SHOULD NOT match ---> IP-IP \r\nGH-K\r\nVA-V etc\r\n\r\nFollowing Should match\r\nYUIO-10GB-BG4: Supports JK-X6824-UIO-XK++= U-VI1.1-100-WX-Y9\r\nXX-123-UVW-3456\r\nVA-V-W-K9\r\nVA-V-W\r\n\r\nThe following term is not matching as there is no Alphabet in first term-----------> 800-CVB-4= \r\nThis should match\r\n\r\nCD-YT-GH-40G-R9(=) \r\nCRT7.0-TPS8K-F\r\nJ-SMBYTRAS-SUB=\r\n===============================\r\n\r\nBelow terms should NOT match\r\nGH-K\r\nVA-V-W\r\nST-M UCS T-Series <-- Should NOT match\r\n\r\n";
var m, res=[];
var rx = /(?:^|\s)(?=\S{6,})(?=\S*[A-Z])([A-Z0-9./+~]+(?:-[A-Z0-9./+~]+)+=*)(?!\S)/g;
while(m=rx.exec(s)) {
res.push(m[1]);
}
console.log(res);
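The same extraction can be wrapped in a small helper. This is only a sketch, assuming an engine that supports String.prototype.matchAll; the name extractIds and the sample string (the four examples from the question joined by spaces) are illustrative and not part of the original answer.

```javascript
// Hypothetical helper: collect Group 1 of every match of the regex above.
function extractIds(text) {
  const rx = /(?:^|\s)(?=\S{6,})(?=\S*[A-Z])([A-Z0-9./+~]+(?:-[A-Z0-9./+~]+)+=*)(?!\S)/g;
  // matchAll requires the g flag and yields one match object per occurrence.
  return Array.from(text.matchAll(rx), (m) => m[1]);
}

// The four examples from the question, separated by single spaces.
const idSample = "YUIO-10GB-BG4 U-VI1.1-100-WX-Y9 1-800-553-6387 T-Series";
const ids = extractIds(idSample);
console.log(ids); // only the first two tokens are kept
```

The digits-only token fails the (?=\S*[A-Z]) lookahead, and T-Series fails because the lowercase letters are outside the allowed character class.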
Related
Allowable characters:
uppercase A to Z
lowercase a to z
hyphen
apostrophe
single quote
space
full stop
numerals 0 to 9
validations:
Must contain alphabetic characters
Cannot have consecutive non alpha characters except for full stop followed by a space OR apostrophe can be followed by a space
Cannot have non-alphabetic characters at the start (except for apostrophe)
Can end with a full stop
I have a regex with the below validation. Use this as reference
/^(?=[a-zA-Z0-9`'. -]+$)(?!.*[0-9'` -]{2})[a-zA-Z'][^\r\n.]*(?:\.[ a-z][^\r\n.]*)*$/;
Need to add the below validations to the above regex
Can end with a full stop
An apostrophe can be followed by a space.
Examples Valid
'Andy
Andy.
Andy' De'Wall
Andy. DeWa2e
A2dy'
Examples Invalid
2Andy
A2'ndy.
Andy'-Wall
Andy. DeWa23
A24dy'
You could:
assert that there are no 2 consecutive chars out of digits, ., ' or -
assert that no digit, hyphen or space is followed by a space
assert at least a char A-Z a-z
start the match with either ' or a char A-Z a-z
For example
^(?!.*[0-9.'-]{2})(?!.*[0-9 -] )(?=[^A-Za-z\n]*[A-Za-z])[A-Za-z'][A-Za-z0-9.' -]*$
Regex demo
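A quick way to check the pattern against the examples listed in the question. This is a minimal sketch; the variable names are illustrative.

```javascript
// Pattern from the answer above (no g flag, so test() is stateless).
const nameRx = /^(?!.*[0-9.'-]{2})(?!.*[0-9 -] )(?=[^A-Za-z\n]*[A-Za-z])[A-Za-z'][A-Za-z0-9.' -]*$/;

// Examples copied from the question.
const validNames = ["'Andy", "Andy.", "Andy' De'Wall", "Andy. DeWa2e", "A2dy'"];
const invalidNames = ["2Andy", "A2'ndy.", "Andy'-Wall", "Andy. DeWa23", "A24dy'"];

const allValidPass = validNames.every((s) => nameRx.test(s));
const allInvalidFail = invalidNames.every((s) => !nameRx.test(s));
console.log(allValidPass, allInvalidFail);
```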
I'm trying to create a regular expression that matches strings such as:
N1-112S
So far I have succeeded with the following (although I'm not really sure why it works):
item.match(/^\D.-/)
I'd like to further bolster the results by ensuring that the last character is A-Z as well.
I'd appreciate some help on a good regular expression for matching this pattern. Thanks!
If you plan to match a string that starts with an uppercase ASCII letter, then has a digit, then a hyphen, then 1 or more digits and then an ASCII letter at the end of the string use
/^[A-Z]\d-\d+[A-Z]$/.test(item)
See the regex demo. Also, to test if a regex matches some string or not, I'd recommend RegExp#test.
Pattern details
^ - start of string
[A-Z] - an uppercase ASCII letter
\d - an ASCII digit
- - a hyphen
\d+ - 1+ digits
[A-Z] - an uppercase ASCII letter
$ - end of string.
Variations
To match any alphanumeric chars after hyphen till the end of string, you need to change the above pattern a bit:
/^[A-Z]\d-[\dA-Z]*[A-Z]$/
The \d+ after the hyphen is changed to [\dA-Z]*, any 0 or more ASCII digits or uppercase letters.
If there can be any chars after -, use .* or [^] instead of a \d+:
/^[A-Z]\d-.*[A-Z]$/
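To see how the three variants differ, they can be run side by side. This is a sketch only; apart from N1-112S, the sample inputs are made up for illustration.

```javascript
const strictRx = /^[A-Z]\d-\d+[A-Z]$/;        // digits only after the hyphen
const alnumRx = /^[A-Z]\d-[\dA-Z]*[A-Z]$/;    // digits or uppercase letters after the hyphen
const anyRx = /^[A-Z]\d-.*[A-Z]$/;            // any chars (except line breaks) after the hyphen

// "N1-112S" is from the question; the rest are hypothetical test inputs.
const items = ["N1-112S", "N1-11AS", "N1-1 2S", "N1-112"];
const variantResults = items.map((item) => [
  item,
  strictRx.test(item),
  alnumRx.test(item),
  anyRx.test(item),
]);
variantResults.forEach((row) => console.log(row.join("\t")));
```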
I need a regex that validates a string with a total length of 6 characters: 4-6 uppercase letters/digits and 0-2 spaces.
I tried
^[A-Z0-9]{4,6}+[\s]{0,2}$
but it allows up to 8 characters, and I need a maximum of 6.
If the alphanumeric chars should only appear at the start of the string and the whitespaces can appear at the end (i.e. the order of the alphanumerics and whitespaces matters), you may use
/^(?=.{6}$)[A-Z0-9]{4,6}\s*$/
See the regex demo
Details
^ - start of string
(?=.{6}$) - the string length is restricted to exactly 6 non-line break chars
[A-Z0-9]{4,6} - 4, 5 or 6 uppercase ASCII letters or digits
\s* - 0+ whitespaces (but actually, only 0, 1 or 2 will be possible to add as the total length is already validated with the lookahead)
$ - end of string.
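A short check of this pattern, as a sketch with made-up inputs (none of them come from the question):

```javascript
const trailingWsRx = /^(?=.{6}$)[A-Z0-9]{4,6}\s*$/;

// [input, expected result]; all inputs are hypothetical.
const trailingCases = [
  ["ABCD12", true],   // 6 alphanumerics
  ["ABCDE ", true],   // 5 alphanumerics + 1 space
  ["ABCD  ", true],   // 4 alphanumerics + 2 spaces
  ["ABC   ", false],  // only 3 alphanumerics
  ["ABCD12 ", false], // 7 chars in total, rejected by the lookahead
  ["AB CD1", false],  // whitespace is not at the end
];
const trailingOk = trailingCases.every(([s, want]) => trailingWsRx.test(s) === want);
console.log(trailingOk);
```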
If you want to match the alphanumeric and whitespaces anywhere inside the string, you need a lookaround based regex like
^(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$)(?=(?:\S*\s){0,2}\S*$)[A-Z0-9\s]{6}$
See the regex demo
Details
^ - start of string
(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$) - a positive lookahead that requires the presence of 4 to 6 letters or digits anywhere inside the string
(?=(?:\S*\s){0,2}\S*$) - a positive lookahead that requires the presence of 0 to 2 whitespaces anywhere inside the string
[A-Z0-9\s]{6} - 6 ASCII uppercase letters, digits or whitespaces
$ - end of string.
To shorten the pattern, the second lookahead can be written as (?!(?:\S*\s){3}), it will fail the match if there are 3 whitespace chars anywhere inside the string. See the regex demo.
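Both forms of the lookaround pattern can be compared on a few inputs. This is a sketch with hypothetical sample strings; it only illustrates that the two variants agree on them.

```javascript
// Full form with two positive lookaheads.
const anywhereRx = /^(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$)(?=(?:\S*\s){0,2}\S*$)[A-Z0-9\s]{6}$/;
// Shortened form: the second lookahead is replaced with a negative one.
const anywhereShortRx = /^(?=(?:[^A-Z0-9]*[A-Z0-9]){4,6}[^A-Z0-9]*$)(?!(?:\S*\s){3})[A-Z0-9\s]{6}$/;

// [input, expected result]; all inputs are made up for illustration.
const anywhereCases = [
  ["AB CD1", true],   // whitespace in the middle
  [" ABCD ", true],   // whitespace on both sides
  ["ABCD12", true],   // no whitespace at all
  ["A B C ", false],  // only 3 alphanumerics
  ["abcd12", false],  // lowercase letters are not allowed
  ["AB CD12", false], // 7 chars in total
];
const anywhereOk = anywhereCases.every(
  ([s, want]) => anywhereRx.test(s) === want && anywhereShortRx.test(s) === want
);
console.log(anywhereOk);
```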
You can use | characters to accommodate several cases into one.
const regex = /(^[A-Z0-9]{4}\s{2}$)|(^[A-Z0-9]{5}\s$)|(^[A-Z0-9]{6}$)/; // no g flag: with g, test() keeps lastIndex between calls
alert(regex.test(prompt('Enter input, including space(s)')));
If you want to match zero, one or two spaces at the end, you could use an alternation for those 3 cases.
^(?:[A-Z0-9]{4}[ ]{2}|[A-Z0-9]{5}[ ]|[A-Z0-9]{6})$
Regex demo
Explanation
^ Assert the start of the string
(?: Non capturing group
[A-Z0-9]{4}[ ]{2} Match uppercase or digit 4 times followed by 2 spaces
| Or
[A-Z0-9]{5}[ ] Match uppercase or digit 5 times followed by 1 space
| Or
[A-Z0-9]{6} Match uppercase or digit 6 times
) Close non capturing group
$ Assert the end of the string
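One practical difference from the \s-based /^(?=.{6}$)[A-Z0-9]{4,6}\s*$/ pattern shown earlier is that [ ] accepts literal spaces only, while \s also accepts tabs and other whitespace. A minimal sketch, with hypothetical inputs:

```javascript
const spacesOnlyRx = /^(?:[A-Z0-9]{4}[ ]{2}|[A-Z0-9]{5}[ ]|[A-Z0-9]{6})$/;
const anyWhitespaceRx = /^(?=.{6}$)[A-Z0-9]{4,6}\s*$/;

const spaced = "ABCD  ";   // 4 alphanumerics + 2 spaces
const tabbed = "ABCD\t\t"; // 4 alphanumerics + 2 tabs

const spacedResult = [spacesOnlyRx.test(spaced), anyWhitespaceRx.test(spaced)];
const tabbedResult = [spacesOnlyRx.test(tabbed), anyWhitespaceRx.test(tabbed)];
console.log(spacedResult, tabbedResult);
```

Which behaviour is wanted depends on whether tabs should count as valid padding.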
I have
/^[a-zA-Z][a-zA-Z '-]*[a-zA-Z]$/g
This regex doesn't allow a string to begin or end with a space, ' or - character.
However, if I pass a single-character string like a, it is also rejected as invalid.
Please suggest how to accept a single letter while still rejecting a leading or trailing space, ' or -.
Thanks a lot.
a - correct
a - incorrect
'a - incorrect
Your regex requires an input that starts with a letter, then has 0+ chars like letters, space, single quote and hyphen, and then an obligatory letter. Wrap the last 2 parts of the pattern with an optional non-capturing group:
/^[a-z](?:[a-z '-]*[a-z])?$/i
^^^^^^^^^^^^^^^^^^^
The i case insensitive modifier will make the pattern a bit shorter.
Details
^ - start of string
[a-z] - an ASCII letter
(?:[a-z '-]*[a-z])? - 1 or 0 occurrences of:
[a-z '-]* - 0+ ASCII letters, spaces, ' or -
[a-z] - an ASCII letter
$ - end of string.
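A quick check of the optional-group version. This is a sketch; the inputs are illustrative, and "Mary-Jane O'Neil" in particular is a made-up sample.

```javascript
const nameLikeRx = /^[a-z](?:[a-z '-]*[a-z])?$/i;

// [input, expected result]
const nameLikeCases = [
  ["a", true],                 // a single letter is now accepted
  [" a", false],               // leading space
  ["a ", false],               // trailing space
  ["'a", false],               // leading quote
  ["a-", false],               // trailing hyphen
  ["Mary-Jane O'Neil", true],  // separators are fine inside the string
];
const nameLikeOk = nameLikeCases.every(([s, want]) => nameLikeRx.test(s) === want);
console.log(nameLikeOk);
```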
I'm stuck trying to capture a structure like this:
1:1 wefeff qwefejä qwefjk
dfjdf 10:2 jdskjdksdjö
12:1 qwe qwe: qwertyå
I want each match to start at digits, a colon and more digits, and to run up to the next such marker. So the expected output would be:
match 1 = 1:1 wefeff qwefejä qwefjk dfjdf
match 2 = 10:2 jdskjdksdjö
match 3 = 12:1 qwe qwe: qwertyå
Here's what I have tried:
\d+\:\d+.+
But that fails if there are word characters spanning two lines.
I'm using a javascript based regex engine.
You may use a regex based on a tempered greedy token:
/\d+:\d+(?:(?!\d+:\d)[\s\S])*/g
The \d+:\d+ part will match one or more digits, a colon, one or more digits and (?:(?!\d+:\d)[\s\S])* will match any char, zero or more occurrences, that does not start a sequence of one or more digits followed with a colon and a digit. See this regex demo.
As the tempered greedy token is a resource consuming construct, you can unroll it into a more efficient pattern like
/\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g
See another regex demo.
Now, the (?:(?!\d+:\d)[\s\S])* is turned into a pattern that matches strings linearly:
\D* - 0+ non-digit symbols
(?: - start of a non-capturing group matching zero or more sequences of:
\d - a digit that is...
(?!\d*:\d) - not followed with 0+ digits, : and a digit
\D* - 0+ non-digit symbols
)* - end of the non-capturing group.
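Both patterns can be tried on the sample from the question. This is a sketch only: the tidy helper is a hypothetical addition that collapses whitespace (including the line break) so the output resembles the expected matches listed in the question.

```javascript
// Sample text from the question, with \n as the line separator.
const text = "1:1 wefeff qwefejä qwefjk\ndfjdf 10:2 jdskjdksdjö\n12:1 qwe qwe: qwertyå";

const temperedRx = /\d+:\d+(?:(?!\d+:\d)[\s\S])*/g;   // tempered greedy token
const unrolledRx = /\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g; // unrolled version

// Hypothetical helper: collapse runs of whitespace and trim the ends.
const tidy = (s) => s.replace(/\s+/g, " ").trim();

const temperedMatches = (text.match(temperedRx) || []).map(tidy);
const unrolledMatches = (text.match(unrolledRx) || []).map(tidy);
console.log(temperedMatches);
console.log(unrolledMatches);
```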
You can include ñ/Ñ or leave them out, but this should work:
\d+?:\d+? [a-zñA-ZÑ ]*
Edited:
If you want to include line breaks, you can add \n or \r to the set:
\d+?:\d+? [a-zñA-ZÑ\n ]*
\d+?:\d+? [a-zñA-ZÑ\r ]*
Give it a try! Also tested on https://regex101.com/
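With a character-class approach, a match stops at the first character that is not listed in the set, so every character that can occur in the text has to be added. A minimal sketch on the question's sample; the wider class with ä, ö, å and : is an assumption made here for illustration and is not part of the answer above.

```javascript
// Sample text from the question, with \n as the line separator.
const sampleLines = "1:1 wefeff qwefejä qwefjk\ndfjdf 10:2 jdskjdksdjö\n12:1 qwe qwe: qwertyå";

// The set from the answer, with both \r and \n added in one class.
const classRx = /\d+?:\d+? [a-zñA-ZÑ\r\n ]*/g;
// Assumed extension: also list the extra letters and the colon used in the sample.
const widerClassRx = /\d+?:\d+? [a-zñA-ZÑäöå:\r\n ]*/g;

const classMatches = sampleLines.match(classRx);
const widerMatches = sampleLines.match(widerClassRx);
console.log(classMatches); // each match is cut short at ä, ö or :
console.log(widerMatches); // each match runs up to the next digit
```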
for more chars:
^[a-zA-Z0-9!##\$%\^\&*)(+=._-]+$