Regex for first and last name - javascript

I'm trying to write a regexp that matches a name like: John Smith. The only rules are that the first and last name should each start with a capital letter and be at least 2 characters long. The last name is also limited to 20 characters, and there's a comma or whitespace between the names. So far I have this:
/[A-Z][a-z]+(\s|,)[A-Z][a-z]{19}/
It doesn't work when I test it on this site: http://www.regular-expressions.info/javascriptexample.html. I'm not sure what I missed. Any ideas?

Change the {19} to {1,19}. By itself, {19} means "match exactly 19 of the previous token" (here, the class [a-z]). {1,19} means "match between 1 and 19 of the previous token".
/[A-Z][a-z]+(\s|,)[A-Z][a-z]{1,19}/
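A minimal sketch of the difference between the two quantifiers. The sample names are made up for illustration and are not from the question:

```javascript
// Original pattern: {19} demands exactly 19 lowercase letters after the capital.
const original = /[A-Z][a-z]+(\s|,)[A-Z][a-z]{19}/;
// Fixed pattern: {1,19} accepts between 1 and 19 lowercase letters.
const fixed = /[A-Z][a-z]+(\s|,)[A-Z][a-z]{1,19}/;

// Hypothetical inputs; the last surname has 1 capital + 19 lowercase letters.
const samples = ["John Smith", "John,Smith", "John Abcdefghijklmnopqrst"];

for (const name of samples) {
  // Prints the name, then whether each pattern matches it.
  console.log(name, original.test(name), fixed.test(name));
}
```

Only the 20-letter surname satisfies the original pattern, while the fixed one accepts all three samples.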
UPDATE: People are commenting that this does not meet your requirements. As you described them, it's possibly a naive implementation of your requirements, but it is just your original implementation with the bug fixed. If you are actually looking for names, a less naive implementation might be:
/^[A-Z][-'a-zA-Z]+,?\s[A-Z][-'a-zA-Z]{0,19}$/
This will catch names with apostrophes or dashes, allows a space after the comma between the names if they are separated by a comma, and allows for single-letter last names. But as the commenters have pointed out, this still fails to match a bunch of legitimate names and matches stuff that is definitely not name-like.
It also adds anchors ^ and $ to mean the entire string must match. If you are looking for a substring, you can remove those anchors and add in word boundary checks instead:
/\b[A-Z][-'a-zA-Z]+,?\s[A-Z][-'a-zA-Z]{0,19}\b/
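To see how the anchored and the word-boundary variants differ, here is a small sketch. The sentence and the over-long surname are invented test inputs:

```javascript
// Anchored: the whole string must be a name.
const anchored = /^[A-Z][-'a-zA-Z]+,?\s[A-Z][-'a-zA-Z]{0,19}$/;
// Word boundaries: the name may appear anywhere inside a longer string.
const bounded = /\b[A-Z][-'a-zA-Z]+,?\s[A-Z][-'a-zA-Z]{0,19}\b/;

const sentence = "my name is John Smith";

console.log(anchored.test("John Smith"));  // whole string is a name
console.log(anchored.test("Smith, John")); // comma followed by a space
console.log(anchored.test(sentence));      // rejected: extra text around the name

// match() returns the matched substring at index 0, or null when nothing matches.
const found = sentence.match(bounded);
console.log(found ? found[0] : null);

// A surname of 1 capital + 24 letters exceeds the {0,19} limit.
const tooLong = "John S" + "a".repeat(24);
console.log(anchored.test(tooLong), bounded.test(tooLong));
```

Without anchors or boundaries, an over-long surname would still match on its first 20 characters, which is why the length limit only holds in these two variants.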

You need {0,19} not just {19}. The latter means "exactly 19 chars".
"John Smith".match(/^[A-Z][a-z]{0,19}[\s,][A-Z][a-z]{0,19}$/)
Of course, this regexp doesn't match many totally valid names like "José Ortega y Gasset" or "Charles de Batz-Castelmore d'Artagnan".
Depending upon how long acceptable surnames can be, you can replace "{0,19}" with "{1,19}" or "{2,19}". The same applies to first names.
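One way to make those bounds adjustable is to build the pattern from the desired name lengths. The helper name buildNamePattern and its parameters are hypothetical, introduced only for this sketch:

```javascript
// Hypothetical helper: builds a pattern for two names of minLen..maxLen
// characters each. The capital letter counts as one character, so the
// quantifier on [a-z] uses bounds that are one lower.
function buildNamePattern(minLen, maxLen) {
  const part = `[A-Z][a-z]{${minLen - 1},${maxLen - 1}}`;
  return new RegExp(`^${part}[\\s,]${part}$`);
}

// minLen = 1 reproduces the {0,19} pattern; minLen = 2 gives {1,19}.
console.log(buildNamePattern(1, 20).source);
console.log(buildNamePattern(1, 20).test("J S"));
console.log(buildNamePattern(2, 20).test("J S"));
console.log(buildNamePattern(2, 20).test("John Smith"));
```

With minLen set to 2, single-letter names such as "J S" are rejected, which matches the "at least 2 characters" rule from the question.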

Here are some expressions that might help for more complex names
(^[A-Z][a-z]*$) - A typical First Name or Last Name like Thomas
(^[A-Z][a-z][A-Z][a-z]*$) - names like McDonald
(^[A-Z][a-z]*(-|\s)[A-Z][a-z]*$) - names like Tonya-Smith or Tonya Smith
(^[A-Z]('|’)[A-Z][a-z]*$) - surnames like O’Reilly (as in Tim O’Reilly)
(^[a-z]+\s[A-Z][a-z]*$) - names like von Gogh
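A quick way to check each of these expressions is to pair it with a sample and run test(). The samples below are taken from the descriptions above; the table structure is just an illustration:

```javascript
// Each entry pairs one of the expressions above with a sample it should accept.
const patterns = [
  { regex: /^[A-Z][a-z]*$/, sample: "Thomas" },
  { regex: /^[A-Z][a-z][A-Z][a-z]*$/, sample: "McDonald" },
  { regex: /^[A-Z][a-z]*(-|\s)[A-Z][a-z]*$/, sample: "Tonya-Smith" },
  { regex: /^[A-Z][a-z]*(-|\s)[A-Z][a-z]*$/, sample: "Tonya Smith" },
  { regex: /^[A-Z]('|’)[A-Z][a-z]*$/, sample: "O’Reilly" },
  { regex: /^[a-z]+\s[A-Z][a-z]*$/, sample: "von Gogh" },
];

const results = patterns.map(({ regex, sample }) => regex.test(sample));
console.log(results);

// Each expression is anchored, so it validates one name part at a time:
// a full string such as "Tim O’Reilly" does not match the apostrophe pattern.
const fullStringMatches = /^[A-Z]('|’)[A-Z][a-z]*$/.test("Tim O’Reilly");
console.log(fullStringMatches);
```

Because every expression is anchored with ^ and $, they are meant to be applied to a single name part rather than to a complete "first last" string.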

/^[A-Z][a-z]+[\s,][A-Z][a-z]{1,19}$/.test("John Smith") // true

The {19} means that the last name must have exactly 19 lowercase characters after the first uppercase character.

This should work for you
\b[A-Z]+.+[?^ ][A-Z].{1,19}|\b[A-Z]+.+[?^,][A-Z].{1,19}
This starts with the beginning of a word, checks that the first letter is caps, matches the first word up to a white space or comma, then checks to make sure the first letter of the next word is capitalized, and matches everything up to 19 characters after that. Also makes sure each name is 2 or more characters long.

Related

How to create regex to check user name with numbers and space?

I want to create a regex that checks a user name against these rules:
the first two characters are letters a-z (no numbers allowed).
numbers are allowed after the first two letters.
a space is allowed, but not at the beginning.
after a space only a-z may follow, not a number or special character (john 2doe is invalid, but john d2oe is okay).
the rules above apply to every word in the string.
no special characters such as ~!@#$%^&*() are allowed.
This is what I have done so far:
/^([A-Za-z]{2,})/.test('john#~ doe') // true. - not good. it should be false.
I solved the first and the second rules, but how do I do the rest?
Here's one that should work for you
^([A-Za-z]{2}([A-Za-z0-9]|\s[A-Za-z]{2})*)*$
but note: your rules contradict each other slightly. This regex rejects john d2oe, because the first two characters of d2oe are not both letters. To accept it, use
^([A-Za-z]{2}([A-Za-z0-9]|\s[A-Za-z])*)*$
Here, we define the rule for a word and then match any number of them. Each word starts with 2 letters, followed by any number of letters and digits, or by a space followed by at least 1 letter (2 in the first pattern). The class is written as A-Za-z because the range A-z would also match the punctuation characters that sit between Z and a, such as _ and ^.
Take a look at the tests here, I hope this helps!
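As a rough check of the two variants, the sketch below runs them against the inputs mentioned in the question. The variable names are hypothetical, and the letter class is spelled out as A-Za-z because a range written as A-z also spans the ASCII punctuation between Z and a:

```javascript
// Strict variant: a space must be followed by two letters.
const strict = /^([A-Za-z]{2}([A-Za-z0-9]|\s[A-Za-z]{2})*)*$/;
// Lenient variant: a space must be followed by at least one letter.
const lenient = /^([A-Za-z]{2}([A-Za-z0-9]|\s[A-Za-z])*)*$/;

for (const input of ["john doe", "john d2oe", "john 2doe", "john#~ doe"]) {
  // Prints the input, then the strict and lenient verdicts.
  console.log(JSON.stringify(input), strict.test(input), lenient.test(input));
}

// Why A-z is avoided: the range includes [ \ ] ^ _ and the backtick.
const wideRange = /^[A-z]+$/.test("a_b");
const letterRange = /^[A-Za-z]+$/.test("a_b");
console.log(wideRange, letterRange);
```

Note that both variants also accept the empty string, because the outer group is allowed to repeat zero times.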
^([a-zA-Z]{2})([a-zA-Z0-9]*)([ ]*)([a-zA-Z]+)([a-zA-Z0-9]*)$
Passes:
john d2oe
jo hnd2oe
Fails:
john 2doe
j2hn d2oe
ohn d2oe

Regex - Allow capital letters just at the beginning of words

I have to check that capital letters appear only at the beginning of words.
My regex now looks like this:
/^([A-ZÁÉÚŐÓÜÖÍ]([a-záéúőóüöí]*\s?))+$/
It works for the beginning of words, but it fails when the problem is not at the beginning of a word.
For example, John JohnJ gets validated.
What should I change in my regex to make it work?
In your regex pattern the space is optional, allowing combinations like JJohn or JohnJ - the key is to make it required between words. There are two ways to do this:
Roll out your pattern:
/^[A-ZÁÉÚŐÓÜÖÍ][a-záéúőóüöí]*(?:\s[A-ZÁÉÚŐÓÜÖÍ][a-záéúőóüöí]*)*$/
Or make the space in your pattern required, but alternatively allow it to be the end of line (this allows a trailing space though).
/^(?:[A-ZÁÉÚŐÓÜÖÍ][a-záéúőóüöí]*(?:\s|$))+$/
In both patterns I have removed some superfluous groups of your original and turned all groups into non-capturing ones.
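The behaviour of the original pattern and the two suggested ones can be compared with a short sketch. The variable names and the trailing-space input are made up for illustration:

```javascript
// Pattern from the question: the space is optional, so "JohnJ" slips through.
const original = /^([A-ZÁÉÚŐÓÜÖÍ]([a-záéúőóüöí]*\s?))+$/;
// Rolled-out pattern: every further word must be preceded by a space.
const rolledOut = /^[A-ZÁÉÚŐÓÜÖÍ][a-záéúőóüöí]*(?:\s[A-ZÁÉÚŐÓÜÖÍ][a-záéúőóüöí]*)*$/;
// Space-or-end pattern: each word must be followed by a space or the end.
const spaceOrEnd = /^(?:[A-ZÁÉÚŐÓÜÖÍ][a-záéúőóüöí]*(?:\s|$))+$/;

for (const input of ["John Smith", "John JohnJ", "John "]) {
  console.log(
    JSON.stringify(input),
    original.test(input),
    rolledOut.test(input),
    spaceOrEnd.test(input)
  );
}
```

The only input on which the two suggested patterns disagree is the one with a trailing space, which the space-or-end pattern accepts and the rolled-out pattern rejects.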
You can do this: /^([A-ZÁÉÚŐÓÜÖÍ]{0,1}([a-záéúőóüöí]*\s?))+$/
With {a,b}, a is the minimum number of repetitions it will match, and b is the maximum.
If there is ALWAYS going to be a capital letter at the beginning, you can instead simply use: /^([A-ZÁÉÚŐÓÜÖÍ]{1}([a-záéúőóüöí]*\s?))+$/
In this case, {c} matches exactly c repetitions.
Here is a resource with good information.

Regex to allow any language characters in the form of full name and starting with letter

I'm trying to validate a name field, and I'd like to allow the end user to enter anything like Merianos Nikos, Μέριανος Νίκος (Greek), or characters from any other language in the same form.
The form is: first letter capital, remaining letters of the word lowercase, and at least two words.
Currently I have this regex /^([A-Z][a-z]*((\s)))+[A-Z][a-z]*$/ that works perfectly with English, but not with Greek and perhaps not with other languages.
Finally, I'd like to validate another field that has at least one word with the first letter capital, but this field can also contain characters after the word.
For the moment I use the following regex /^[\s\w\.\-_]+$/ that works, but again I have a problem with Greek and other languages.
You could do this through the use of Unicode Categories. Thus, the regular expression ^\p{Lu}\p{Ll}+( \p{Lu}\p{Ll}+)*$ will match an upper case letter followed by one or more lower case letters, followed by 0 or more other names, separated by spaces. An example of the expression is shown here.
With regards to your second point, you could use something of the sort ^\p{Lu}\p{Ll}*$, which will expect exactly 1 upper case letter followed by 0 or more lower case ones.
The expressions above assume that you do not have apostrophes, as in O'Brian, or dashes, as in Foo-bar, in your names. If you want to handle specifically Greek names, and you know for a fact that Greek names have neither apostrophes nor dashes in them, then this should not be much of a problem.
Usually one simply ensures that the name provided is not empty, rather than specifying some strict form. Please refer to this question for more information on the matter.
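In JavaScript, the \p{...} property escapes are only recognised when the regex carries the u flag (ES2018 and later). A minimal sketch of the first expression follows; the function name isCapitalizedName and the NFC normalisation step are assumptions added here, the latter so that accented letters typed as base letter plus combining mark are composed into a single letter before matching:

```javascript
// Unicode-aware version of the pattern; the u flag enables \p{Lu} and \p{Ll}.
const fullName = /^\p{Lu}\p{Ll}+( \p{Lu}\p{Ll}+)*$/u;

// Hypothetical helper: normalises to NFC so that accented letters are
// single code points, then applies the pattern.
function isCapitalizedName(input) {
  return fullName.test(input.normalize("NFC"));
}

console.log(isCapitalizedName("Merianos Nikos"));
console.log(isCapitalizedName("Μέριανος Νίκος"));
console.log(isCapitalizedName("merianos nikos"));
console.log(isCapitalizedName("O'Brian"));
```

As written, the pattern also accepts a single word; replacing the final * with + would require at least two words, as the question asks.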
^[\p{L}0-9]+$
This regex matches a string made up of letters from any language and digits.
function isFullname($fullname) {
    // The u modifier enables UTF-8 mode so \p{Ll} and \p{Lu} match non-ASCII letters.
    // PCRE has no g modifier; preg_match stops at the first match anyway.
    return preg_match("/^((?:\p{Ll}|\p{Lu}){2,30}\s?){2,4}$/u", $fullname);
}
This works for me because the name may also be written in lowercase letters.
Each first name or surname must have at least 2 and at most 30 characters, and the group may repeat at least 2 and at most 4 times.
It also accepts a name like McCésy (really? =)) ...

RegEx for exactly two or three words

Could someone help me out with this?
At first I was trying to figure out how to simply check for input containing one or two words, and I was able to find that that would be with \w* ?\w+ and for containing exactly two words would be with \w+ \w+ And I got to something like this (which is not working):
/^$|^([a-zA-ZčČćĆđĐšŠžŽ -])\w+ \w+$/
What I've since figured out is that it should contain not one or two, but two or three words. And since I was unable to figure out the RegEx for two words to start with, I had to ask for help here.
Like I said, I need it to allow entering only two or three words with no numbers and with the addition of these letters čČćĆđĐšŠžŽ and a -
Also I need it to ignore a blank input, that's why ^$| is there.
I am really, really new at this, so any help would be appreciated.
EXAMPLES:
Marko Marković
John Smith
Mary-Jane Austin
John III Johnson
Just replace every \w with your new definition of a "word" character. This is for exactly 2 words, with exactly 1 space in between:
/^$|^[a-zA-ZčČćĆđĐšŠžŽ-]+ [a-zA-ZčČćĆđĐšŠžŽ-]+$/
For exactly 2 or 3 words:
/^$|^[a-zA-ZčČćĆđĐšŠžŽ-]+ [a-zA-ZčČćĆđĐšŠžŽ-]+( [a-zA-ZčČćĆđĐšŠžŽ-]+)?$/
Note that I have removed the space in your character class, since it shouldn't be considered part of a "word", or your "word" count will mess up.
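Since the same character class appears three times, the pattern can be assembled from a single definition of a "word". This is only a sketch; the variable names are hypothetical and the inputs are the examples from the question plus a few invalid ones:

```javascript
// One "word": letters (including the extra ones) and dashes, no spaces or digits.
const word = "[a-zA-ZčČćĆđĐšŠžŽ-]+";

// Empty input, or two words, optionally followed by a third.
const twoOrThreeWords = new RegExp(`^$|^${word} ${word}( ${word})?$`);

const inputs = [
  "",
  "Marko Marković",
  "Mary-Jane Austin",
  "John III Johnson",
  "John",
  "John 3 Smith",
  "One Two Three Four",
];

for (const input of inputs) {
  console.log(JSON.stringify(input), twoOrThreeWords.test(input));
}
```

Keeping the word definition in one place means that adding another letter later only requires a change on a single line.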
You can use a Unicode regex to filter it.
[\p{L}\s-]+
\p{L} : matches any Unicode letter from any language.
\s : whitespace character.
- : dash ( - ).
You can see how it matches here.
For more about Unicode regexes you can refer to this.
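In JavaScript this class needs the u flag. On its own it only restricts which characters may appear, so the sketch below also shows one possible way to combine it with the two-or-three-words requirement. The second pattern is an assumption of this sketch, not part of the answer above:

```javascript
// Character filter only: any mix of letters, whitespace and dashes.
const nameChars = /^[\p{L}\s-]+$/u;

// Assumed combination: two or three words of letters/dashes, single spaces.
const twoOrThree = /^[\p{L}-]+(?: [\p{L}-]+){1,2}$/u;

for (const input of ["Marko Marković", "Mary-Jane Austin", "Жан Поль Сартр", "John", "John3 Smith"]) {
  // Prints the input, then the character-filter and word-count verdicts.
  console.log(JSON.stringify(input), nameChars.test(input), twoOrThree.test(input));
}
```

The character filter accepts a single word such as "John", whereas the combined pattern rejects it because it requires at least one space-separated second word.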

Javascript regex

I was trying to do a regex for someone else when I ran into this problem. The requirement was that the regex should return results from a set of strings that has, let's say, "apple" in it. For example, consider the following strings:
"I have an apple"
"You have two Apples"
"I give you one more orange"
The result set should have the first two strings.
The regex(es) I tried are:
/[aA]pple/ and /[^a-zA-Z0-9][aA]pple/
The problem with the first one is that words like "aapple", "bapple", etc (ok, so they are meaningless, but still...) test positive with it, and the problem with the second one is that when a string actually starts with the word "apple", "Apples and oranges", for example, it tests negative. Can someone explain why the second regex behaves this way and what the correct regex would be?
/(^.*?\bapples?\b.*$)/i
Edit: The above will match the entire string containing the word "apples", which I thought is what you were asking for. If you are just trying to see if the string contains the word, the following will work.
/\bapples?\b/i
The regex(es) I tried are:
/[aA]pple/ and /[^a-zA-Z0-9][aA]pple/
The first one just checks for the existence of the following characters, in order: a-p-p-l-e, regardless of what context they are used in. The \b, or word-boundary character, matches any spot where a non-word character and a word character meet, ala \W\w.
The second one is trying to match other characters before the occurrence of a-p-p-l-e, and is essentially the same as the first, except it requires another character in front of it.
The one I answered with works as follows. From the beginning of the string, it matches any characters (if they exist) non-greedily until it encounters a word boundary. If the string starts with apple, the beginning of the string is a word boundary, so it still matches. It then matches the letters a-p-p-l-e, and s if it exists, followed by another word boundary. It then matches all characters to the end of the string. The /i at the end means it's case-insensitive, so 'Apple', 'APPLE', and 'apple' are all valid.
If you have the time, I would highly recommend walking through the tutorial at http://regular-expressions.info. It really goes in-depth and talks about how the regular expression engines match different expressions. It helped me a ton.
To build on @tj111, the reason your second regex fails is that [^a-zA-Z0-9] requires that a character matches; that is, there is some character in that position, and its value is not contained in the set [a-zA-Z0-9]. Markers like \b are called "zero-width assertions". \b, in particular, matches at a boundary between a word character and a non-word character, or at the beginning or end of a string next to a word character. Because it is not matching against any character, its "width" is zero.
In sum, [^a-zA-Z0-9] requires that a character which does not take a particular value be present, while \b requires only that a boundary be present.
Edit: @tj111 has added most of this to his response. I'm in too late, again :)
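The difference between consuming a character and asserting a boundary can be seen in a few test() calls. The third pattern below is an illustrative alternative, not something proposed in the answers: it keeps the negated class but also allows the start of the string.

```javascript
// Requires an actual non-alphanumeric character before "apple".
const negatedClass = /[^a-zA-Z0-9][aA]pple/;
// Zero-width assertion: only requires a word boundary on both sides.
const wordBoundary = /\bapples?\b/i;
// Illustrative alternative: start of string OR a non-alphanumeric character.
const startOrNonAlnum = /(?:^|[^a-zA-Z0-9])[aA]pple/;

for (const input of ["I have an apple", "Apples and oranges", "bapple"]) {
  console.log(
    JSON.stringify(input),
    negatedClass.test(input),
    wordBoundary.test(input),
    startOrNonAlnum.test(input)
  );
}
```

At the start of "Apples and oranges" there is no character for the negated class to consume, so only the two patterns that can match at position 0 succeed.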
This works for apple and apples in any letter case:
var strings = ["I have an apple", "You have two Apples", "I give you one more orange"];
var result = [];
var pattern = /\bapples?\b/i;
for (var i = 0; i < strings.length; i++) {
    if (pattern.test(strings[i])) {
        result.push(strings[i]);
    }
}
Your second regex requires a nonalphanumeric character before the first a in apple. "apple" doesn't satisfy this. As others note, "\b" matches not a character, but a word boundary position.
/\bapple/i
\b is a word boundary.
To explain why your attempts do not work, the first one does not check that it is the beginning of the word, so it can have something before it. The second regex you gave says that something must be before the word "apple", but it can't be alphanumeric.
