I'm trying to create a good regular expression for people's surnames.
It should be valid if a surname is:
abcd
abcd'efg
abcd-efg
abcd, .efg
etc...
I also need to check that symbols do not repeat, so for example:
abcd''efg
abcd-',
Are invalid, but this one:
abcd, .efg
Can be valid.
At the moment I just created this:
^[a-z .',-_]+$
And now I'm trying to check for all the doubled symbols, but I haven't managed to get any further.
It's a bad idea. There is no international list of allowed characters that people could use in their names. Some surnames even contain Unicode symbols, so it will not be possible to write a regex that validates all of them correctly. Even if you can come up with a regex, it might be so generic that it wouldn't be effective.
Read this article for why you shouldn't be doing this: Falsehoods Programmers Believe About Names
If, after reading this insightful post by Amal Murali, you still want to do this with a regex, please see this:
/^(?![^'\-_\n]*['\-_][^'\-_\n]*['\-_])[a-z .',-_]+$/m
View a regex demo!
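One reading of the examples in the question is that no two punctuation characters may sit directly next to each other, while a space between them (as in abcd, .efg) is fine. That rule is an assumption, not something the question states. Under it, a minimal JavaScript sketch could look like the following. Note also that inside [a-z .',-_] the part ,-_ is read as a range from , to _, which lets in digits and uppercase letters, so the sketch puts the hyphen last in the class:

```javascript
// Assumed rule: two adjacent symbols from the set . ' , _ - are rejected.
// The negative lookahead scans the whole string for such a pair.
const surname = /^(?!.*[.',_-]{2})[a-z .',_-]+$/;

const samples = ["abcd", "abcd'efg", "abcd-efg", "abcd, .efg", "abcd''efg", "abcd-',"];
for (const s of samples) {
  console.log(JSON.stringify(s), surname.test(s));
}
```

This still rejects accented and non-Latin names, so the caveats above apply unchanged.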
I want to test if a user string is "ok so far", in that it might not be valid as a whole but it is a subset of a valid one.
I have a regex say ^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]$
such that "1234-1234-5678-5678" is valid
"1234-12" or even "1" does not match the pattern, but it's a valid subset of a valid format; in other words, the input is OK so far.
Is there a neat way of doing this without making many, many regexes? It's Friday.
I'm not sure I understood your problem correctly, but I think you want something like this:
^([0-9]{4}-){1,3}[0-9]{1,4}$
Working demo
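This matches one to three groups of four digits, each followed by a hyphen, and then a final group of one to four digits. Note that it requires at least one complete four-digit group, so a shorter prefix such as "1" is not matched.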
You can also shorten your regex with:
^(\d{4}-){1,3}\d{1,4}$
You could possibly use one final regex for validation of the form you currently have, and an on-the-fly regex that checks the user input as it is typed.
My idea would be to have ([0-9]{1,4}-)+
For your case this will check as one types:
/^(\d(\d(\d(\d(-(\d(\d(\d(\d(-(\d(\d(\d(\d(-(\d)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?$/
This regex will match key for key as you type, although it is a little cumbersome.
^([0-9]{1,4}|[0-9]{4}-[0-9]{0,4}|[0-9]{4}-[0-9]{4}-[0-9]{0,4}|[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{0,4})$
Here is a live example
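The nested-optional pattern above does not have to be typed by hand. As a rough sketch, it can be generated from a list of per-character fragments. The function name prefixRegex and the 4-4-4-4 digit layout are assumptions made for illustration (the layout is taken from the example "1234-1234-5678-5678"):

```javascript
// Build a regex that accepts every prefix of a fixed-length format.
// `tokens` holds one regex fragment per expected character.
// prefixRegex is a hypothetical helper, not part of any library.
function prefixRegex(tokens) {
  let body = "";
  // Wrap from the last character outward: (?:t1(?:t2(?:t3)?)?)?
  for (let i = tokens.length - 1; i >= 0; i--) {
    body = "(?:" + tokens[i] + body + ")?";
  }
  return new RegExp("^" + body + "$");
}

const d = "\\d";
const layout = [d, d, d, d, "-", d, d, d, d, "-", d, d, d, d, "-", d, d, d, d];
const okSoFar = prefixRegex(layout);

console.log(okSoFar.test("1"));        // partial input
console.log(okSoFar.test("1234-12"));  // partial input
console.log(okSoFar.test("12345"));    // fifth character should be a hyphen
```

The empty string also counts as "ok so far" here, and the final, strict regex is still needed once the user submits.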
^([a-z0-9_\.-])+#[yahoo]{5}\.([com]{3}\.)?[com]{3}$
This currently matches xxxx#yahoo.com. How can I rewrite this to match some additional domains, for example gmail.com and deadforce.com? I tried the following but it did not work. What am I doing wrong?
^([a-z0-9_\.-])+#[yahoo|gmail|deadforce]{5,9}\.([com]{3}\.)?[com]{3}$
Thank you in advance!
Your regex doesn't say what you think it says.
^([a-z0-9_\.-])+#[yahoo]{5}\.([com]{3}\.)?[com]{3}$
It says: any of the characters a-z, 0-9, _, ., or -, one or more times.
The later part, where you are trying to match yahoo.com, is incorrect. [yahoo]{5} says that y, a, h, or o, in any order, is allowed five times. The same goes for [com]{3}, so aaaaa.ooo would be valid here. I'm not sure what ([com]{3}\.)?[com]{3} was trying to say, but I presume you wanted to check for .com.
See character classes documentation here, http://www.regular-expressions.info/charclass.html.
What you want is
^([a-z0-9_.\-])+#yahoo\.com$
or for more domains use grouping,
^([a-z0-9_.\-])+#(yahoo|gmail|deadforce)\.com$
You haven't stated what language you are using so a real demo can't be given.
Functional demo, https://jsfiddle.net/qa9x9hua/1/
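To make the difference between a character class and a group concrete, here is a small JavaScript sketch. It keeps the # separator exactly as written in the question; the variable names are made up for illustration:

```javascript
// Character class: [yahoo]{5} means "five characters, each one of y, a, h, o".
const withClasses = /^[a-z0-9_.-]+#[yahoo]{5}\.[com]{3}$/;

// Group with alternation: one of the three literal domain names.
const withGroup = /^[a-z0-9_.-]+#(?:yahoo|gmail|deadforce)\.com$/;

console.log(withClasses.test("bob#aaaaa.ooo")); // accepted by the class version
console.log(withGroup.test("bob#aaaaa.ooo"));   // rejected by the group version
console.log(withGroup.test("bob#gmail.com"));   // accepted by the group version
```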
Email validation is a notoriously difficult problem, and many people have failed quite horribly at trying to validate them themselves.
PHP's filter_var has a filter just for emails. Use that to check for email address validity. See http://php.net/manual/en/function.filter-var.php
if (filter_var('bob#example.com', FILTER_VALIDATE_EMAIL)) {
// Email is valid
}
There's probably no downside to doing the domain check the easy way. Just check for the domain strings in the email address. e.g.
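if (
filter_var($email, FILTER_VALIDATE_EMAIL) &&
// Anchor with $ so the domain must come at the end of the address
preg_match('/#(yahoo|gmail|deadforce)\.com$/', $email)
) {
// Email is valid and on one of the allowed domains
}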
In terms of your original regular expression, quite a lot of it was incorrect, which is why you were having trouble changing it.
regexper shows what you've created.
([a-z0-9_\.-])+ should be [a-z0-9_\.-]+ or ([a-z0-9_\.-]+)
As written, the () capture only the last character matched by this section. If you want to capture the whole match, move the + inside the parentheses; if not, remove the parentheses.
[yahoo]{5} should be yahoo
That's matching 5 characters that are one of y,a,h,o so it would match hayoo etc.
\.([com]{3}\.)?[com]{3} should be \.com
Dunno what this was trying to accomplish but you only wanted .com
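The point about where the + sits relative to the parentheses can be checked quickly in JavaScript. This is only a sketch, with a made-up input string:

```javascript
// + outside the group: the group is repeated, and only the last repetition is kept.
const repeatedGroup = /^([a-z0-9_.-])+/.exec("bob.smith");

// + inside the group: the whole run is captured.
const wholeRun = /^([a-z0-9_.-]+)/.exec("bob.smith");

console.log(repeatedGroup[1]); // last character only
console.log(wholeRun[1]);      // the full string
```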
Take a look at http://www.regular-expressions.info/tutorial.html for a guide to regular expressions.
I am trying to validate a form so that the user enters their full name like the following:
First, Last
The user must have some string of alphabetic only chars, then a comma, then a space, then a last name, which can again be any string of chars.
This is my current regex..
var alphaExp = /^[a-zA-Z,+ a-zA-Z]+$/;
However, it lets the user submit something as simple as john. I want it to force them to submit something such as john, smith.
What you are doing is creating a character class ([]) that matches any one of the following characters: a-z, A-Z, comma, +, and space. With the trailing +, any run of those characters is accepted, which allows john to match the whole regex.
This is what you want:
/^[a-zA-Z]+, [a-zA-Z]+$/
However, I would like to advise you that making assumptions about names is a little wrong. What if some guy is named John O'Connor? Or Esteban Pérez? Or just Ng (only a first name)?
"Code that believes someone’s name can only contain certain characters is stupid, offensive, and wrong" - tchrist
Sure, you don't want to let people enter just gibberish, but leave an option for users to enter something that doesn't necessarily fit your idea of correctness, but is nonetheless correct.
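As a rough illustration of both points, the sketch below compares the original pattern, the corrected one, and a more permissive variant. The permissive variant is an assumption of mine rather than part of the answer: it uses the Unicode property escape \p{L} (which needs the u flag) and also allows apostrophes and hyphens.

```javascript
const original = /^[a-zA-Z,+ a-zA-Z]+$/;     // one big character class
const fixed = /^[a-zA-Z]+, [a-zA-Z]+$/;      // letters, comma, space, letters

// Hypothetical looser variant: any Unicode letter, plus ' and -
const permissive = /^[\p{L}'-]+, [\p{L}'-]+$/u;

console.log(original.test("john"));                  // the problem in the question
console.log(fixed.test("john"));
console.log(fixed.test("john, smith"));
console.log(fixed.test("Esteban, P\u00e9rez"));      // accented letter
console.log(permissive.test("Esteban, P\u00e9rez"));
console.log(permissive.test("John, O'Connor"));
```

Even the permissive variant still insists on two parts separated by a comma and a space, so a single name such as Ng is rejected.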
That's not how character sets work:
/^[a-zA-Z]+, [a-zA-Z]+$/
Things to consider:
Any validation you do on the client can be bypassed
Some people may have names with accented letters
Some cultures don't use just two names
^[a-zA-Z]+, [a-zA-Z]+$
Should work. However, do you want to prevent '.', as in J. Hooker? And extra words, as in 'Jan van Hoogstrum'? Note also that you are preventing any accented characters from being used. An option that allows several words (although it also allows digits and underscores) is to use \w:
^\w+( \w+)*$
Which would give you 'name name name...'. Be aware that in JavaScript \w only covers ASCII letters, digits, and underscore, so accented characters are still rejected there. 'First, last' is not a common way to enter your name, so you'll frustrate a lot of users that way.
The correct regexp for allowing only the first letter to be capital would be:
/^[A-Z][a-z]+, [A-Z][a-z]+$/
With this RegExp I can easily check if an email is valid or not:
RegExp(/^([\w-\.]+#([\w-]+\.)+[\w-]{2,4})?$/);
However, this just returns true for addresses such as:
example#example.com
I also want to accept:
*#example.com
What changes do I need to apply to my RegExp?
Thanks in advance
To answer your question literally, you can "augment" your regex:
RegExp(/^([\w.*-]+#([\w-]+\.)+[\w-]{2,4})?$/);
But this is a terrible regex for e-mail validation. Regex is the wrong tool for this. Why do you insist on doing it this way?
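A quick JavaScript check of the augmented pattern shows both what it fixes and why it is a weak validator. The # separator is kept as written in the question, and the variable name is made up:

```javascript
// The augmented pattern from above, written as a literal.
const augmented = /^([\w.*-]+#([\w-]+\.)+[\w-]{2,4})?$/;

console.log(augmented.test("example#example.com")); // still accepted
console.log(augmented.test("*#example.com"));       // now accepted too
console.log(augmented.test("-#-.--"));              // nonsense, but accepted
console.log(augmented.test(""));                    // the outer (...)? makes everything optional
```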
A couple of things: to accept *#foo.bar:
var expression = /^([\w-\.*]+#([\w-]+\.)+[\w-]{2,4})?$/; // no need to pass it to the RegExp constructor
But this expression also accepts -#-.--. Then again, regex and email aren't the best of friends. Based on your expression, here's a slightly less unreliable version:
var expression = /^[\w-\.\d*]+#[\w\d]+(\.\w{2,4})$/;
There is an expression somewhere on the net that validates all valid types of email addresses, though. Look into that to see why regex validation is almost always going to either exclude valid input or be too forgiving.
Checking email addresses is not that straightforward, cf. RFC 822, sec 6.1.
A good list of regexes can be found at http://www.regular-expressions.info/email.html, describing tradeoffs between RFC conformance and practicality.
I have this regex pattern:
\=[a-zA-Z\.\:\[\]_\(\)\&\$\%#\-\#\!0-9;=\?/\+\xBF\~]+[?\s+|?>]
and I have this HTML:
1.esc#xyz.com
2.johnross#zys.com
3.johnross#wen.com
Here is the problem:
I need to avoid the first and second, since they contain white space and are valid attributes.
But only the third one works, as it doesn't contain white space.
This means nothing should be selected with the above pattern.
Here is a direct link to test:
http://regexr.com?31r61
Please help!
Thanks,
EDIT:
If you just want to match unquoted attributes, this should work:
[<\s]+[\w]+(=[^\"][^\s>]*)
Kind of inelegant but let me know if that does what you want.
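To see what that pattern picks up, here is a small JavaScript sketch. The HTML snippet is invented for illustration, since the original markup is not shown above:

```javascript
// Hypothetical markup: one unquoted attribute and one quoted attribute.
const html = '<a href=foo class="bar">link</a>';

// Same pattern as above; the g flag lets matchAll list every hit.
const unquoted = /[<\s]+[\w]+(=[^\"][^\s>]*)/g;

const hits = [...html.matchAll(unquoted)].map((m) => m[1]);
console.log(hits); // only the unquoted attribute value is captured
```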
Which pattern are you trying to match? All three? And if so, which portion? The subject or the email? If you're just trying to match the subject, try using this as the pattern to match:
\=\"mailto:[^?]*\?subject=([^\"]*)\"\>
That will return a match where the group is the subject itself.
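As a sketch of how the subject group comes out in JavaScript, assume a link of the following shape. The markup and address are hypothetical, with the # separator written the way the addresses appear above:

```javascript
// Hypothetical markup; the real HTML from the question is not shown.
const link = '<a href="mailto:john#example.com?subject=Hello there">mail</a>';

// Same pattern as above, without the escapes that JavaScript does not need.
const subjectPattern = /="mailto:[^?]*\?subject=([^"]*)">/;

const found = subjectPattern.exec(link);
console.log(found ? found[1] : null); // the subject text, if present
```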
That is a wicked character class...
Why don't you try something a bit more reasonable? Try this:
\=".*?(?<!\\)"
That will match anything in the quotation marks after href=, if that's what you're trying to get. If you're looking for more than that, this regex can easily be modified.