Difficult regular expression for name validation

I'm trying to write a regular expression to check whether or not a proposed name is valid in a gaming platform.
Rules:
Name must contain at least 3 and no more than 20 letters
Name must start with an uppercase letter
Name must never have two uppercase letters in a row
Spaces are allowed, but must be preceded by a letter and be followed by an uppercase letter
Hyphens are allowed, but must be preceded by a letter and be followed by a lowercase letter
All uppercase letters must be followed by a lowercase letter unless they are followed by a space or hyphen
I know I can check separately for the length of the string so the first rule is irrelevant, but I figured I'd list it for good measure.
Test cases (Pass):
Foo
Hello World
Hello-world
Bigsby Platt-slatt
Test cases (Fail):
foo
Hello world
Hello-World
33333333333
What regular expression can I use to solve this? Is it reasonable to expect to do this using only regular expressions, or will the pattern need to be analyzed using a different method?
Thanks

This is a possible regular expression:
(?!.*[A-Z]{2})(?!.*[^A-Za-z][ -])(?!.* ([^A-Z]|$))(?!.*-([^a-z]|$))^[A-Z].{2,19}$
See demo on regex101.com.
Explanation:
Several of the rules can be expressed as "cannot contain" kind of rules, and they are easy to implement with negative look-ahead ((?! ... )):
No two capitals in sequence:
(?!.*[A-Z]{2})
No non-letter followed by either a space or hyphen:
(?!.*[^A-Za-z][ -])
No space that is followed by a non-capital or end of string ($):
(?!.* ([^A-Z]|$))
No hyphen followed by a non-lowercase or end of string:
(?!.*-([^a-z]|$))
Finally, the actual match is done with this: a capital followed by 2 - 19 characters:
^[A-Z].{2,19}$
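As a quick sanity check, the pattern can be run against the test cases from the question. This is only a minimal sketch: isValidName is a hypothetical helper name, and the pattern is copied unchanged from the answer.

```javascript
// The pattern from the answer, unchanged. No g flag, so test() keeps no state.
const nameRegex = /(?!.*[A-Z]{2})(?!.*[^A-Za-z][ -])(?!.* ([^A-Z]|$))(?!.*-([^a-z]|$))^[A-Z].{2,19}$/;

// Test cases copied from the question.
const shouldPass = ["Foo", "Hello World", "Hello-world", "Bigsby Platt-slatt"];
const shouldFail = ["foo", "Hello world", "Hello-World", "33333333333"];

// isValidName is a hypothetical helper name, not part of the original answer.
function isValidName(name) {
  return nameRegex.test(name);
}

for (const name of [...shouldPass, ...shouldFail]) {
  console.log(JSON.stringify(name), isValidName(name));
}
```

Note that the final part uses . for the remaining characters, so a string such as "A33" is accepted by the pattern as written. If only letters, spaces and hyphens should be allowed, replacing . with [A-Za-z -] is one possible tightening.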

Related

JS regex for proper names: How to force capital letter at the start of each word?

I want a JS regex that only matches names with capital letters at the beginning of each word and lowercase letters thereafter. (I don't care about technical accuracy as much as visual consistency — avoiding people using, say, all caps or all lower cases, for example.)
I have the following Regex from this answer as my starting point.
/^[a-z ,.'-]+$/gmi
Here is a link to the following Regex on regex101.com.
As you can see, it matches strings like jane doe, which I want to prevent. I only want it to match Jane Doe instead.
How can I accomplish that?

Match [A-Z] initially, then use your original character set afterwards (sans space), and make sure not to use the case-insensitive flag:
/^[A-Z][a-z,.'-]+(?: [A-Z][a-z,.'-]+)*$/g
https://regex101.com/r/y172cv/1
You might want the non-word characters to only be permitted at word boundaries, to ensure there are alphabetical characters on each side of, eg, ,, ., ', and -:
^[A-Z](?:[a-z]|\b[,.'-]\b)+(?: [A-Z](?:[a-z]|\b[,.'-]\b)+)*$
https://regex101.com/r/nP8epM/2
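A small sketch comparing the two patterns from this answer; the names loose and strict are just labels chosen for illustration. The g flag is left off here because test() on a global regex remembers lastIndex between calls, which can give surprising results when the same regex object is reused.

```javascript
// First pattern from the answer (punctuation allowed anywhere after the capital).
const loose = /^[A-Z][a-z,.'-]+(?: [A-Z][a-z,.'-]+)*$/;
// Second pattern (punctuation only between word characters, via \b).
const strict = /^[A-Z](?:[a-z]|\b[,.'-]\b)+(?: [A-Z](?:[a-z]|\b[,.'-]\b)+)*$/;

// Sample inputs are made up for this illustration.
const samples = ["Jane Doe", "jane doe", "JANE DOE", "Mary-jane Doe", "Jane-"];

for (const s of samples) {
  console.log(JSON.stringify(s), "loose:", loose.test(s), "strict:", strict.test(s));
}
```

The two patterns agree on everything in this list except "Jane-": the first accepts the trailing hyphen, the second rejects it because there is no word boundary after it.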

If you want a capital letter at the beginning and lowercase letters following where the name can possibly end on one of ,.'- you might use:
^[A-Z][a-z]+[,.'-]?(?: [A-Z][a-z]+[,.'-]?)*$
^ Start of string
[A-Z][a-z]+ Match an uppercase char, then 1+ lowercase chars a-z
[,.'-]? Optionally match one of ,.'-
(?: Non capturing group
[A-Z][a-z]+[,.'-]? Match a space, then repeat the same pattern as before
)* Close group and repeat 0+ times to also match a single name
$ End of string
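To see how this pattern behaves, here is a short sketch with a few made-up inputs; trailingPunct is a hypothetical variable name.

```javascript
// Pattern from this answer: each word is Capital + lowercase letters,
// optionally followed by exactly one of , . ' -
const trailingPunct = /^[A-Z][a-z]+[,.'-]?(?: [A-Z][a-z]+[,.'-]?)*$/;

// Sample inputs are made up for this illustration.
const inputs = ["Jane Doe", "Jane Doe.", "Smith, John", "jane doe", "Jane-doe", "J Doe"];

for (const s of inputs) {
  console.log(JSON.stringify(s), trailingPunct.test(s));
}
```

"Jane-doe" is rejected because the optional punctuation may only be followed by a space or the end of the string, and "J Doe" is rejected because each word needs at least one lowercase letter after the capital.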

Here's my solution to this problem
const str = "jane dane"
console.log(str.replace(/(^\w{1})|(\s\w{1})/g, (v) => v.toUpperCase()));
So first find the first letter of the first word (^\w{1}), then use the pipe operator |, which serves as an OR in regex, to also look for a letter that is preceded by a space (\s\w{1}), such as the first letter of a last name. The /g flag makes the replacement continue through the whole string, so every occurrence of either condition is replaced.
Finally, the callback function uppercases each match. This works for any name containing a first, middle and last name.
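Wrapped in a function, the same idea could look like the sketch below. capitalizeName is a hypothetical name, and \w{1} is shortened to \w, which is equivalent.

```javascript
// Same replace() call as above, wrapped in a hypothetical helper.
function capitalizeName(name) {
  // Uppercase the first character of the string and any character after whitespace.
  return name.replace(/(^\w)|(\s\w)/g, (match) => match.toUpperCase());
}

console.log(capitalizeName("jane dane"));      // Jane Dane
console.log(capitalizeName("jane mary dane")); // Jane Mary Dane
console.log(capitalizeName("jANE dANE"));      // JANE DANE
```

As the last call shows, this only uppercases the first letters; it does not lowercase the rest. It also rewrites the input rather than validating it, so it complements the matching regexes above rather than replacing them.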

JavaScript regular expressions to match no digits, whitespace and selected symbols

Thanks for taking a look.
My goal is to come up with a regexp that will match input that contains no digits, whitespace or the symbols !#£$%^&*()+= or any other symbol I may choose.
I am however struggling to grasp precisely how regular expressions work.
I started out with the simple pattern /\D/, which from my understanding will match the first non-digit character it can find. This would match the string 'James' which is correct but also 'James1' which I don't want.
So, my understanding is that if I want to ensure that a pattern is not found anywhere in a given string, I use the ^ and $ characters, as in /^\D$/. Now because this will only match a single character that is not a digit, I needed to use + to specify that 1 or more digits should not be found in the entire string, giving me the expression /^\D+$/. Brilliant, it no longer matches 'James1'.
Question 1
Is my reasoning up to this point correct?
The next requirement was to ensure no whitespace is in the given string. \s will match a single whitespace and [^\s] will match the first non-whitespace character. So, from my understanding I just had to add this to what I have already to match strings that contain no digits and no whitespace. Again, because [^\s] will only match a single non-whitespace character, I used + to match one or more non-whitespace characters, giving the new regexp of /^\D+[^\s]+$/.
This is where I got lost, as the expression now matches 'James1' or even 'James Smith25'. What? Massively confused at this point.
Question 2
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Question 3
How would I go about writing the regular expression I'm trying to solve?
While I am keen to solve the problem I am more interested in figuring where my understanding of regular expressions is lacking, so any explanations would be helpful.

Not quite; ^ and $ are actually "anchors": they mean "start" and "end". It's a little more complicated than that, but for now you can consider them to mean the start and end of a line; look up the various modifiers (flags) on regular expressions if you're interested in learning more about this. Unfortunately ^ has an overloaded meaning: as the first character inside square brackets it means "not", which is the meaning you are already acquainted with. It's very important that you understand the difference between these two meanings, and that the definition in your head applies only to character class matching!
Contributing further to your confusion is that \d means "a numerical digit" and \D means "not a numerical digit". Similarly \s means "a whitespace (space/tab/newline/etc.) character" and \S means "not a whitespace character."
It's worth noting that \d is effectively a shortcut for [0-9] (note that - has a special meaning inside square brackets), and \D is a shortcut for [^0-9].
The reason it's matching strings that contain spaces is that you've asked for "1+ non-digit characters followed by 1+ non-space characters", so it'll match lots of strings! I think that perhaps you don't understand that regular expressions match bits of strings: you're not adding constraints as you go, but rather building up bits of matchers that will match corresponding bits of strings.
/^[^\d\s!#£$%^&*()+=]+$/ is the answer you're looking for - I'd look at it like this:
i. [] - match a range of characters
ii. []+ - match one or more of that range of characters
iii. [^\d\s]+ - match one or more characters that do not match \d (numerical digit) or \s (whitespace)
iv. [^\d\s!#£$%^&*()+=]+ - here's a bunch of other characters I don't want you to match
v. ^[^\d\s!#£$%^&*()+=]+$ - now there are anchors applied, so this matcher has to apply to the whole line otherwise it fails to match
A useful website to explore regexes is http://regexr.com/3b9h7 - which I supply with my suggested solution as an example. Edit: Pruthvi Raj's link to Debuggex is awesome!
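To make the "bits of matchers" point concrete, here is a small sketch that runs the patterns from the question and the suggested answer against a few strings. The capture groups in split are added purely for illustration, to show which part of the string each piece consumed; the sample strings beyond those in the question are made up.

```javascript
const anyNonDigit = /\D/;                       // finds a non-digit anywhere
const onlyNonDigits = /^\D+$/;                  // whole string is non-digits
const confusing = /^\D+[^\s]+$/;                // non-digits, THEN non-whitespace
const answer = /^[^\d\s!#£$%^&*()+=]+$/;        // whole string from one negated class

for (const s of ["James", "James1", "James Smith25", "James!"]) {
  console.log(JSON.stringify(s), {
    anyNonDigit: anyNonDigit.test(s),
    onlyNonDigits: onlyNonDigits.test(s),
    confusing: confusing.test(s),
    answer: answer.test(s),
  });
}

// Same as `confusing`, with capture groups to show how the string is divided.
const split = /^(\D+)([^\s]+)$/;
const [, first, second] = split.exec("James Smith25");
console.log(first);  // James Smith
console.log(second); // 25
```

The space is consumed by \D+ (a space is not a digit) and the digits by [^\s]+ (a digit is not whitespace), which is why the two pieces together do not act as two constraints on the whole string.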

Is my reasoning up to this point correct?
Almost. /\D/ matches any character other than a digit, but not just the first one (if you use g option).
and [^\s] will match the first non-whitespace character
Almost, [^\s] will match any non-whitespace character, not just the first one (if you use g option).
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Because \D+ in /^\D+[^\s]+$/ can match spaces: \D matches any character that is not a digit, and a space is not a digit.
Conclusion:
Use
^[^\d\s!#£$%^&*()+=]+$
It will match strings that have no digits and spaces, and the symbols you do not allow.
Mind that matching a literal -, ] or [ inside a character class needs some care. In JavaScript, ] has to be escaped (\]), while - can either be escaped (\-) or placed at the start or end of the class; [ usually works unescaped. To play it safe, escape them.

Just insert every character you don't want to include in a negated character class as follows:
^[^\s\d!#£$%^&*()+=]*$
DEMO
Debuggex Demo
^ - start of the string
[^...] - matches one character that is not in `...`
\s - matches a whitespace character (space, newline, tab)
\d - matches a digit from 0 to 9
* - a quantifier that repeats the immediately preceding element 0 or more times
so the regex matches any string that
1. has a beginning
2. contains 0 or more characters that are not whitespace, a digit, or any of the symbols included in the character class (in this example !#£$%^&*()+=), i.e., characters that are not listed in the character class `[...]`
3. has an ending
NOTE:
If the symbols you don't want it to have also include - (a hyphen), don't put it in between other characters, because there it is a metacharacter in a character class; put it at the end of the character class.
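Two details of this answer can be checked with a short sketch: where the hyphen sits inside a character class, and the difference between * and +. The variable names are made up for illustration.

```javascript
// A hyphen between two characters forms a range: here from "+" (U+002B)
// to "=" (U+003D), which happens to include the digits 0-9.
const accidentalRange = /[+-=]/;
// At the end of the class, or escaped, the hyphen is a literal character.
const hyphenLast = /[+=-]/;
const hyphenEscaped = /[+\-=]/;

console.log(accidentalRange.test("5")); // true
console.log(hyphenLast.test("5"));      // false
console.log(hyphenLast.test("-"));      // true
console.log(hyphenEscaped.test("-"));   // true

// With *, zero characters are enough, so the empty string matches;
// with + at least one allowed character is required.
const zeroOrMore = /^[^\s\d!#£$%^&*()+=]*$/;
const oneOrMore = /^[^\s\d!#£$%^&*()+=]+$/;

console.log(zeroOrMore.test("")); // true
console.log(oneOrMore.test(""));  // false
```

Whether the empty string should count as valid depends on the use case; if empty input must be rejected, the + variant from the other answers is probably the safer choice.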

JavaScript Regular Expression OR Operator

Saw a challenge on Twitter so I've been working my way through it, granted I am not the best with Regular Expressions. This is what I have so far:
var pass_regex = new RegExp(/^[a-z][A-Z][0-9]|[!@#$%^&*()_]+$/);
I am trying to match a password input that contains:
1 Lowercase Letter
1 Uppercase Letter
1 Digit OR Special Character
Where I am getting stuck is on the 'OR' part, I thought the pipe separator between [0-9] and my set of special characters would work but it doesn't seem to. Trying to better understand how you would use regular expressions to check for 1 Digit OR 1 Special Character. Thank you in advance for any help provided.

At least one:
You need to use a positive lookahead based regex for checking multiple conditions.
^(?=.*?[A-Z])(?=.*?[a-z]).*?[\W\d].*
OR
^(?=.*?[A-Z])(?=.*?[a-z]).*?[!@#$%^&*()_\d].*
(?=.*?[A-Z]) Asserts that there must be at least one uppercase letter.
(?=.*?[a-z]) At least one lowercase letter.
.*? non-greedy match of any character zero or more times.
If the above conditions are satisfied, then match the string, which must also contain at least one character from the given list [!@#$%^&*()_\d]. \d in this list matches any digit character.
.* matches the following zero or more characters.
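A rough sketch of why the original attempt behaves unexpectedly and how the lookahead version differs. It assumes the intended special-character set is !@#$%^&*()_ and the sample passwords are made up.

```javascript
// Original attempt: | has the lowest precedence, so this reads as
// (^[a-z][A-Z][0-9])  OR  ([!@#$%^&*()_]+$), not "digit OR special".
const original = /^[a-z][A-Z][0-9]|[!@#$%^&*()_]+$/;

// Lookahead version from the answer, with the explicit symbol list.
const withLookaheads = /^(?=.*?[A-Z])(?=.*?[a-z]).*?[!@#$%^&*()_\d].*/;

for (const pw of ["aB1", "!!!", "Passw0rd", "Pass!word", "password1", "Password"]) {
  console.log(JSON.stringify(pw), "original:", original.test(pw), "lookaheads:", withLookaheads.test(pw));
}
```

The original pattern accepts "!!!" (it ends in special characters) and rejects "Passw0rd", while the lookahead version does the opposite, because each lookahead checks the whole string independently of character order.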

How to write Regular expression for minimum one character in javascript?

I have a small requirement for a regular expression: I need a minimum of one letter of the alphabet, followed by numbers and special characters. I tried the following regular expressions but I'm not getting the solution.
/^[a-zA-Z0-9\-\_\/\s,.]+$/
and
/^([a-zA-Z0-9]+)$/

I need minimum of one letter of Alphabets
[a-z]+
and followed by numbers and special characters.
[0-9_\/\s,.-]+
Combined together you would get this:
/^[a-z]+[0-9_\/\s,.-]+$/i
The /i modifier is added for case insensitive matching of alphabetical characters.

Try this regex:
/^[a-z][\d_\s,.]+$/i
To clarify what this does:
^[a-z] // must start with a letter (only one); add '+' for "at least one"
[\d_\s,.]+$ // followed by at least one number, underscore, space, comma or dot.
/i // case-insensitive

You need the other character selection to be separate. I'm confused as to what "numbers and special characters" means, but try:
/^[a-z]+[^a-z]+$/i
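Since the question is ambiguous, it may help to compare the three suggested patterns side by side. The sketch below uses made-up inputs, and the variable names are only labels for the three answers in order.

```javascript
const lettersThenListed = /^[a-z]+[0-9_\/\s,.-]+$/i; // 1+ letters, then 1+ listed characters
const oneLetterThenListed = /^[a-z][\d_\s,.]+$/i;    // exactly one letter, then 1+ listed characters
const lettersThenAnything = /^[a-z]+[^a-z]+$/i;      // 1+ letters, then 1+ non-letters

for (const s of ["a123", "abc123", "abc-1", "abc!!", "abc", "123"]) {
  console.log(JSON.stringify(s), {
    lettersThenListed: lettersThenListed.test(s),
    oneLetterThenListed: oneLetterThenListed.test(s),
    lettersThenAnything: lettersThenAnything.test(s),
  });
}
```

All three reject "abc" and "123", because each requires at least one letter followed by at least one non-letter; they differ in how many leading letters and which trailing characters they tolerate.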

regex to disallow ._ or .- (email validation / javascript)

Here's my current regex:
^([-a-zA-Z0-9'_+\/]+([-.'_+\/][-a-zA-Z0-9'_+\/]+)*)@(([a-zA-Z0-9]+((\.|[-]{1,2})[a-zA-Z0-9]+)*)\.[a-zA-Z]{2,6})$
to validate an email address (and yes I know I shouldn't try and validate email addresses except on the simplest of terms, however our email vendor will reject special characters, etc. ).
This regex satisfies all of the requirements except one -
"No hyphen or underscore directly after a period"
Regex is not my specialty, although I was able to get here. Any help would be appreciated.
Thanks.

Your regex (leaving aside the grouping parentheses (...)) starts with ^[-a-zA-Z0-9'_+\/]+, which means the beginning ^ is followed by one or more (+) of the allowed characters [...]. In this case they are hyphen, lowercase/uppercase letters, digits, apostrophe, underscore, plus or forward slash.
The second part is what you need to change. In your regex it is ([-.'_+\/][-a-zA-Z0-9'_+\/]+)*, which is a pattern that may occur multiple times but may also be absent (*). The pattern has two parts: one of the allowed characters (hyphen, period/dot, apostrophe, underscore, plus or forward slash), followed by one or more of hyphen, lowercase/uppercase letters, digits, apostrophe, underscore, plus or forward slash.
If you remove the period/dot from the first part of that pattern, the character will no longer be allowed at all. Because you want the period/dot to be allowed, but followed by a different character set, an alternative pattern has to be defined.
If this second part is changed from your ([-.'_+\/][-a-zA-Z0-9'_+\/]+)* to a pattern that has an alternative for the period/dot: ([-'_+\/][-a-zA-Z0-9'_+\/]+|\.[a-zA-Z0-9'+\/]+)*, then the final regex will do what you need. As you can see, |\.[a-zA-Z0-9'+\/]+ has been added, which reads: or (|) a single period/dot followed by one or more of lowercase/uppercase letters, digits, apostrophe, plus or forward slash.
The final regex then is:
^([-a-zA-Z0-9'_+\/]+([-'_+\/][-a-zA-Z0-9'_+\/]+|\.[a-zA-Z0-9'+\/]+)*)@(([a-zA-Z0-9]+((\.|[-]{1,2})[a-zA-Z0-9]+)*)\.[a-zA-Z]{2,6})$
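A minimal sketch comparing the regex before and after the change. It assumes the separator between the local part and the domain is @, and the example.com addresses are made up for the test.

```javascript
// Regex from the question (a period may be followed by "_" or "-").
const before = /^([-a-zA-Z0-9'_+\/]+([-.'_+\/][-a-zA-Z0-9'_+\/]+)*)@(([a-zA-Z0-9]+((\.|[-]{1,2})[a-zA-Z0-9]+)*)\.[a-zA-Z]{2,6})$/;
// Regex from the answer (a period gets its own alternative without "_" and "-").
const after = /^([-a-zA-Z0-9'_+\/]+([-'_+\/][-a-zA-Z0-9'_+\/]+|\.[a-zA-Z0-9'+\/]+)*)@(([a-zA-Z0-9]+((\.|[-]{1,2})[a-zA-Z0-9]+)*)\.[a-zA-Z]{2,6})$/;

const addresses = [
  "john.doe@example.com",
  "john_doe@example.com",
  "john._doe@example.com",
  "john.-doe@example.com",
];

for (const a of addresses) {
  console.log(a, "before:", before.test(a), "after:", after.test(a));
}
```

Both versions accept the first two addresses; only the changed version rejects the last two, where a hyphen or underscore directly follows a period.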
