RegEx start and finish with letter, allow commas and dashes - javascript

I've got this regex:
/^[\a-zøåæäöüß][\a-z0-9øåæäöüß]*(?:\-?[\a-z0-9øåæäöüß,]+)*$/i
It works fine for a crazy input like "K61-283øÅ,æk-ken,a-sd", but it fails on the cases "word," (so, when there's just one comma).
Also, how can I restrict it so that a letter must follow every comma or dash (so basically, every word starts with a letter)?
The rule is: start with a letter and end with alphanumeric; allow alphanumeric, dashes and commas; after each dash or comma there should be a letter

You may use
/^[a-zøåæäöüß][a-z0-9øåæäöüß]*(?:[-,][a-zøåæäöüß][a-z0-9øåæäöüß]*)*$/i
See the regex demo
Details:
^ - start of string
[a-zøåæäöüß] - a letter from the defined set
[a-z0-9øåæäöüß]* - 0+ digits or letters from the defined set
(?:[-,][a-zøåæäöüß][a-z0-9øåæäöüß]*)* - zero or more sequences of:
[-,] - a - or ,
[a-zøåæäöüß] - a letter from the defined set
[a-z0-9øåæäöüß]* - 0+ digits or letters from the defined set
$ - end of string.
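As a quick sanity check, the pattern above can be run against a few inputs. This is only a sketch: the helper name isValidList and the sample strings are made up for illustration and are not part of the original answer.

```javascript
// The pattern is the one from the answer above; isValidList is a hypothetical name.
const listPattern = /^[a-zøåæäöüß][a-z0-9øåæäöüß]*(?:[-,][a-zøåæäöüß][a-z0-9øåæäöüß]*)*$/i;

function isValidList(input) {
  return listPattern.test(input);
}

// Illustrative inputs: valid words, a trailing comma, a digit after a dash,
// a leading digit, and a doubled separator.
const samples = ["word", "word,", "K61,æk-ken,a-sd", "K61-283", "1abc", "a--b"];
for (const s of samples) {
  console.log(JSON.stringify(s), "=>", isValidList(s));
}
```

Only "word" and "K61,æk-ken,a-sd" should pass here, since every other sample either starts a word with a non-letter or ends on a separator.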

Update 2:
There are two ways to look at your requirements.
The top-down view
We treat the input as a list of one or more words, separated by comma or dash:
INPUT = WORD (?: [,\-] WORD )*
Each word consists of a letter, followed by zero or more letters or digits:
WORD = LETTER [ LETTER DIGIT ]*
Translated into JavaScript regex syntax this gives us:
WORD = [a-zøåæäöüß][a-zøåæäöüß\d]*
And for the whole input (with anchors):
/^[a-zøåæäöüß][a-zøåæäöüß\d]*(?:[,\-][a-zøåæäöüß][a-zøåæäöüß\d]*)*$/i
(This is Wiktor Stribiżew's answer.)
The bottom-up view
We start by looking at the allowed characters. We know that the first character has to be a letter. After that, there can be zero or more input elements:
INPUT = LETTER ELEMENT*
Each element is either
a letter or digit, or
a comma or dash, followed by a letter:
ELEMENT = [ LETTER DIGIT ] | [ COMMA DASH ] LETTER
Translating this into JavaScript gives us:
/^[a-zøåæäöüß](?:[a-zøåæäöüß\d]|[,\-][a-zøåæäöüß])*$/i
These two regexes are equivalent. The bottom-up regex is shorter and contains less repetitive code. On the other hand, the top-down regex may run faster on some regex engines if the input strings are mostly alphanumeric, with relatively few dashes/commas. On the gripping hand, if your inputs are short, you probably don't care about minuscule performance differences.
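The equivalence claim can be spot-checked by running both patterns over the same inputs and looking for disagreements. This is a sketch, not a proof: the variable names and the sample list are assumptions chosen for illustration.

```javascript
// Both patterns are copied from the text above.
const topDown = /^[a-zøåæäöüß][a-zøåæäöüß\d]*(?:[,\-][a-zøåæäöüß][a-zøåæäöüß\d]*)*$/i;
const bottomUp = /^[a-zøåæäöüß](?:[a-zøåæäöüß\d]|[,\-][a-zøåæäöüß])*$/i;

// Illustrative inputs, including a few that should be rejected.
const inputs = ["a", "a1", "a-b", "a,b2-c", "a-", "a,1", "-a", "a,,b", ""];

// Collect every input on which the two patterns give different answers.
const disagreements = inputs.filter((s) => topDown.test(s) !== bottomUp.test(s));
console.log(disagreements.length === 0 ? "same results on all samples" : disagreements);
```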
Here's a direct encoding of your (revised) requirements:
/^[a-zøåæäöüß](?:(?:[a-zøåæäöüß\d]|[,\-][a-zøåæäöüß])*[,\-]?[a-zøåæäöüß])?$/i
The idea is to match a letter, followed by either
the end of the string (this handles input strings of length 1), or
a list of 0 or more intermediates, optionally followed by a comma or dash, followed by another letter
Each intermediate is either
a letter, or
a digit, or
a comma or a dash followed by a letter
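To see what this revised pattern accepts, here is a small sketch with made-up inputs. Note that, as written, it requires the last character to be a letter, so a trailing digit is rejected.

```javascript
// Pattern copied from the text above; the variable names are illustrative.
const revised = /^[a-zøåæäöüß](?:(?:[a-zøåæäöüß\d]|[,\-][a-zøåæäöüß])*[,\-]?[a-zøåæäöüß])?$/i;

const checks = ["a", "ab", "a1b", "a-b", "a,b-c9d", "a1", "a,", "a-1"];
for (const s of checks) {
  console.log(JSON.stringify(s), "=>", revised.test(s));
}
```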

Try this out: (allows letters and digits after comma or dash)
/^[a-zøåæäöüß]([a-z0-9øåæäöüß]|(,|-)[a-z0-9øåæäöüß])*[a-zøåæäöüß]$/i
or this: (allows letters after comma or dash)
/^[a-zøåæäöüß]([a-z0-9øåæäöüß]|(,|-)[a-zøåæäöüß])*[a-zøåæäöüß]$/i

Related

Javascript: Regex to exclude whitespace and special characters

I need a regex to validate,
Should be of length 18
First 5 characters should be either (xyz34|xyz12)
The remaining 13 characters should be alphanumeric (letters and digits only); no whitespace or special characters are allowed.
I have a pattern like here, '/^(xyz34|xyz12)((?=.*[a-zA-Z])(?=.*[0-9])){13}/g'
But this allows whitespace and special characters ($, %, etc.), which violates rule #3.
Any suggestion to exclude this whitespace and special characters and to strictly check that it must be letters and numbers?
You should not quantify lookarounds. They are non-consuming patterns, i.e. the consecutive positive lookaheads check the presence of their patterns but do not advance the regex index; they all check the text at the same position. It makes no sense to repeat them 13 times. ^(xyz34|xyz12)((?=.*[a-zA-Z])(?=.*[0-9])){13} is equal to ^(xyz34|xyz12)(?=.*[a-zA-Z])(?=.*[0-9]), and means the string can start with xyz34 or xyz12 and then should have at least 1 letter and at least 1 digit.
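A short sketch makes the point concrete: the quantified and unquantified versions give the same answers, and both accept whitespace and special characters because nothing after the prefix is actually consumed. The variable names and sample strings are illustrative only.

```javascript
// The asker's pattern (without the g flag, which would make test() stateful).
const quantified = /^(xyz34|xyz12)((?=.*[a-zA-Z])(?=.*[0-9])){13}/;
// The same pattern with the lookaheads applied once.
const single = /^(xyz34|xyz12)(?=.*[a-zA-Z])(?=.*[0-9])/;

const inputs = ["xyz34 $a1", "xyz12abc", "xyz34abcdefghijkl1", "abc"];
for (const s of inputs) {
  console.log(JSON.stringify(s), "=>", quantified.test(s), single.test(s));
}
```

The first sample contains a space and a dollar sign and is far shorter than 18 characters, yet both patterns accept it.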
You may consider fixing the issue by using a consuming pattern like this:
If you do not care if the last 13 chars contain only digits or only letters, use the patterns suggested by other users, like /^(?:xyz34|xyz12)[a-zA-Z\d]{13}$/ or /^xyz(?:34|12)[a-zA-Z0-9]{13}$/
If there must be at least 1 digit and at least 1 letter among those 13 alphanumeric chars, use /^xyz(?:34|12)(?=[a-zA-Z]*\d)(?=\d*[a-zA-Z])[a-zA-Z\d]{13}$/.
See the regex demo #1 and the regex demo #2.
NOTE: these are regex literals, do not use them inside single- or double quotes!
Details
^ - start of string
xyz - a common prefix
(?:34|12) - a non-capturing group matching 34 or 12
(?=[a-zA-Z]*\d) - there must be at least 1 digit after any 0+ letters to the right of the current location
(?=\d*[a-zA-Z]) - there must be at least 1 letter after any 0+ digits to the right of the current location
[a-zA-Z\d]{13} - 13 letters or digits
$ - end of string.
JS demo:
var strs = ['xyz34abcdefghijkl1','xyz341bcdefghijklm','xyz34abcdefghijklm','xyz341234567890123','xyz14a234567890123'];
var rx = /^xyz(?:34|12)(?=[a-zA-Z]*\d)(?=\d*[a-zA-Z])[a-zA-Z\d]{13}$/;
for (var s of strs) {
  console.log(s, "=>", rx.test(s));
}
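On the note about regex literals versus quoted strings: the sketch below shows the difference, using the simpler "letters or digits" pattern from this answer. The variable names are made up for illustration.

```javascript
// Regex literal: backslashes are written once.
const literal = /^xyz(?:34|12)[a-zA-Z\d]{13}$/;

// Same pattern via the RegExp constructor: no surrounding slashes,
// and each backslash must be doubled inside the string.
const fromString = new RegExp("^xyz(?:34|12)[a-zA-Z\\d]{13}$");

// Wrapping the literal in quotes just produces a string, which has no test() method.
const quoted = "/^xyz(?:34|12)[a-zA-Z\\d]{13}$/";

const sample = "xyz34abcdefghijkl1";
console.log(literal.test(sample), fromString.test(sample), typeof quoted.test);
```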
.* will match any string. For your requirement you can use this:
/^xyz(34|12)[a-zA-Z0-9]{13}$/g
regex fiddle
/^(xyz34|xyz12)[a-zA-Z0-9]{13}$/
This should work,
^ asserts position at the start of the string
1st Capturing Group (xyz34|xyz12)
1st Alternative xyz34 matches the characters xyz34 literally (case sensitive)
2nd Alternative xyz12 matches the characters xyz12 literally (case sensitive)
Match a single character present in the list below [a-zA-Z0-9]{13}
{13} Quantifier — Matches exactly 13 times

JS regex for proper names: How to force capital letter at the start of each word?

I want a JS regex that only matches names with capital letters at the beginning of each word and lowercase letters thereafter. (I don't care about technical accuracy as much as visual consistency — avoiding people using, say, all caps or all lower cases, for example.)
I have the following Regex from this answer as my starting point.
/^[a-z ,.'-]+$/gmi
Here is a link to the following Regex on regex101.com.
As you can see, it matches strings like jane doe which I want to prevent. And only want it to match Jane Doe instead.
How can I accomplish that?
Match [A-Z] initially, then use your original character set afterwards (sans space), and make sure not to use the case-insensitive flag:
/^[A-Z][a-z,.'-]+(?: [A-Z][a-z,.'-]+)*$/g
https://regex101.com/r/y172cv/1
You might want the non-word characters to only be permitted at word boundaries, to ensure there are alphabetical characters on each side of, eg, ,, ., ', and -:
^[A-Z](?:[a-z]|\b[,.'-]\b)+(?: [A-Z](?:[a-z]|\b[,.'-]\b)+)*$
https://regex101.com/r/nP8epM/2
If you want a capital letter at the beginning and lowercase letters following where the name can possibly end on one of ,.'- you might use:
^[A-Z][a-z]+[,.'-]?(?: [A-Z][a-z]+[,.'-]?)*$
^ Start of string
[A-Z][a-z]+ Match an uppercase char, then 1+ lowercase chars a-z
[,.'-]? Optionally match one of ,.'-
(?: Non capturing group
" [A-Z][a-z]+[,.'-]?" Match a literal space, then repeat the same pattern as before
)* Close group and repeat 0+ times to also match a single name
$ End of string
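A brief sketch of how this pattern behaves on a few names. The sample names are invented for illustration; the g flag is left off here because test() on a global regex keeps state in lastIndex between calls.

```javascript
// Pattern from the breakdown above, without the g flag.
const properName = /^[A-Z][a-z]+[,.'-]?(?: [A-Z][a-z]+[,.'-]?)*$/;

const names = ["Jane Doe", "Jane", "jane doe", "JANE DOE", "Jane doe", "Jane  Doe"];
for (const n of names) {
  console.log(JSON.stringify(n), "=>", properName.test(n));
}
```

Only the first two samples should pass; the last one fails because it contains two consecutive spaces.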
Regex demo
Here's my solution to this problem
const str = "jane dane"
console.log(str.replace(/(^\w{1})|(\s\w{1})/g, (v) => v.toUpperCase()));
First, (^\w{1}) finds the first character of the first word. The pipe | acts as an OR, so the second alternative, (\s\w{1}), matches a whitespace character followed by the first character of any later word (for example, the last name). The g flag makes replace process every such match in the string rather than only the first one.
Finally, the callback upper-cases each match. This works for any name containing first, middle and last names.

Regex allowing no leading whitespace but allowing anywhere else and requiring at least one uppercase and one lowercase letters

I want regex code to validate usernames that have:
length between 6 and 30
contain at least one letter from A-Z
contain at least one digit from 0-9
not contain a space at the beginning but it might have at the end or in the middle.
may contain special characters
So far I have tried this:
^[\S](?=.*\d)(?=.*[A-Z]).{6,30}$
It works quite well, but when the only uppercase letter is at the beginning it doesn't validate my password.
Test12 34 ----> Doesn't accept but should accept
TesT12 34 ----> Accept
tesT12 34 ----> Accept
The problem arises because the \S is at the pattern start before the lookaheads. That means, that the lookaheads that require an uppercase ASCII letter and a digit only check them after the first character in string.
Now, there can be two scenarios: 1) there may be any amount of whitespace characters in the string, except at its beginning, 2) only one whitespace char is allowed in the string, and not at the beginning
Scenario 1
Put \S after the lookaheads and decrement the limiting quantifier values to set the limits to 6-30 as \S already matches and consumes the first char:
^(?=.*\d)(?=.*[A-Z])\S.{5,29}$
                    ^^  ^^^^
See the regex demo.
JS test:
var rx = /^(?=.*\d)(?=.*[A-Z])\S.{5,29}$/;
var vals = [ "Test12 34", "TesT12 34", "tesT12 34" ];
for (var s of vals) {
  console.log(s, "=>", rx.test(s));
}
Scenario 2
Use
^(?=.{6,30}$)(?=.*\d)(?=.*[A-Z])\S+(?:\s\S*)?$
The length is restricted with the positive lookahead at the beginning ((?=.{6,30}$)) and the consuming \S+(?:\s\S*)? pattern will only allow a single \s whitespace (1 or 0 due to the last ? - one or zero occurrences quantifier) and it can be in the middle (as the first \S is quantified with +, one or more occurrences quantifier) or end (as the \S after \s is *-quantified, zero or more occurrences quantifier).
See the regex demo.
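A JS test for Scenario 2 along the same lines as the one above. The sample values are assumptions chosen to exercise each rule (length, digit, uppercase letter, leading space, more than one space).

```javascript
// Scenario 2 pattern from the text above.
const username = /^(?=.{6,30}$)(?=.*\d)(?=.*[A-Z])\S+(?:\s\S*)?$/;

const candidates = [
  "Test12 34",   // one space in the middle
  "Test1234 ",   // one space at the end
  " Test1234",   // leading space
  "Test 12 34",  // two spaces
  "Te1",         // too short
  "test12 34",   // no uppercase letter
];
for (const s of candidates) {
  console.log(JSON.stringify(s), "=>", username.test(s));
}
```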

C# Regex for three digit and an alpha

I have a field that needs a regex: the first three characters must be digits and the fourth character must be a letter. I need the regex in both C# and JavaScript.
My following regex works for the three digits:
@"\A(\d){3}\Z";
How do I add the fourth character, which has to be a letter?
If by alpha you mean only latin letters, you can do this:
^\d{3}[a-zA-Z]$
You can't use \A and \Z in JavaScript but they're equivalent to ^ and $ unless you use the m option.
If you need the full Unicode range of letters, use \p{L} instead of [a-zA-Z]. In JavaScript, \p{L} only works with the u flag (ES2018 and later); in older engines you'd have to include the relevant Unicode ranges by hand in the character class.
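On the JavaScript side, a minimal sketch of both variants might look like this. The variable names are illustrative, and the second pattern assumes an engine that implements ES2018 Unicode property escapes (enabled by the u flag).

```javascript
// Three digits followed by one ASCII letter.
const asciiCode = /^\d{3}[a-zA-Z]$/;

// Three digits followed by one letter from any script.
// Assumption: the runtime supports \p{L} with the u flag (ES2018+).
const unicodeCode = /^\d{3}\p{L}$/u;

const values = ["123a", "123Z", "123\u00e9", "12a4", "1234", "123ab"];
for (const v of values) {
  console.log(JSON.stringify(v), "=>", asciiCode.test(v), unicodeCode.test(v));
}
```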
I think what you want to know about is Character Classes or Character Sets
With a "character class", also called "character set", you can tell the regex engine to match only one out of several characters. Simply place the characters you want to match between square brackets. [...]
and
[...] You can use a hyphen inside a character class to specify a range of characters. [0-9] matches a single digit between 0 and 9. You can use more than one range. [0-9a-fA-F] matches a single hexadecimal digit, case insensitively. You can combine ranges and single characters. [0-9a-fxA-FX] matches a hexadecimal digit or the letter X. Again, the order of the characters and the ranges does not matter. [...]
So basically you can match character "A" to "Z" by placing them in square brackets and using a hyphen to indicate range
[A-Z]
You can match multiple sets, so if you also need "a" to "z" (lowercase) you can include
[A-Za-z]

RegExp extract specific string followed by any number with leading / trailing whitespace

I want to extract a string from another using JavaScript / RegExp.
Here is what I got:
var string = "wp-button wp-image-45 wp-label";
string.match(/(?:(?:.*)?\s+)?(wp-image-([0-9]+))(:?\s(?:.*)?)?/);
// returns: ["wp-button ", "wp-image-45", "45", undefined]
I just want to have "wp-image-45", so:
(Optional) any character
(Optional) followed by whitespace
(Required) followed by "wp-image-"
(Required) followed by any number
(Optional) followed by whitespace
(Optional) followed by any character
What is missing here? Is it just some kind of bracketing or more?
I also tried
string.match(/(?:(?:.*)?\s+)?(?=(wp-image-([0-9]+)))(?=(:?\s(?:.*)?)?)/)
Edit: In the end I just want to have the number. But I'd also like to get this intermediate step working.
Regexps are not required to start matching at the beginning of the string, so your attempts to match whitespace and any character aren't necessary. Also, "any character" includes whitespace (except newlines in certain modes).
This should be all you need:
string.match(/\bwp-image-(\d+)\b/)
This will capture, for example, "wp-image-123" into matching group 0, and "123" into matching group 1.
\b means "word boundary", which ensures that you won't match "abcwp-image-123def". A word boundary is defined as any place where a non-word character is followed by a word character, or vice versa (the start and end of the string count as non-word positions). A word character is a letter, a digit or an underscore.
Also, I used \d instead of [0-9] simply out of convenience. In JavaScript the two are equivalent; in some other regex engines (.NET, for example) \d also matches digits from other scripts, but that won't make a difference in your case.
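Since the end goal is just the number, the match can be wrapped in a small helper that also handles the no-match case (match returns null then). This is a sketch; the function name extractImageId is hypothetical.

```javascript
// Hypothetical helper: returns the numeric id from a "wp-image-N" class, or null.
function extractImageId(classNames) {
  const m = classNames.match(/\bwp-image-(\d+)\b/);
  // m[0] is the whole match ("wp-image-45"), m[1] is the captured number ("45").
  return m ? Number(m[1]) : null;
}

console.log(extractImageId("wp-button wp-image-45 wp-label"));
console.log(extractImageId("abcwp-image-123def"));
console.log(extractImageId("wp-button wp-label"));
```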
If all of that surrounding stuff is optional and all you want is the number then there's no point to matching for any of that stuff except for that "wp-image-" prefix, just do:
var string = "wp-button wp-image-45 wp-label";
string.match(/wp-image-([0-9]+)/);
