Regex validation quick fix most likely - javascript

I am trying to validate a form so that the user enters their full name like the following:
First, Last
The user must have some string of alphabetic only chars, then a comma, then a space, then a last name, which can again be any string of chars.
This is my current regex..
var alphaExp = /^[a-zA-Z,+ a-zA-Z]+$/;
However, it lets the user submit something as simple as john. I want it to force them to submit something such as john, smith.

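What you are doing is creating a character class ([]) that matches any single one of the following characters: a-z, A-Z, a comma, a plus sign, or a space. Because the class is repeated with +, a run of letters alone, such as john, matches the whole regex.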
This is what you want:
/^[a-zA-Z]+, [a-zA-Z]+$/
However, I would like to advise you that making assumptions about names is a little wrong. What if some guy is named John O'Connor? Or Esteban Pérez? Or just Ng (only a first name)?
"Code that believes someone’s name can only contain certain characters is stupid, offensive, and wrong" - tchrist
Sure, you don't want to let people enter just gibberish, but leave an option for users to enter something that doesn't necessarily fit your idea of correctness, but is nonetheless correct.
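To make the difference concrete, here is a small sketch comparing the asker's character class with the sequence pattern above. The third pattern (permissive) is not from the answer; it is an assumed, more forgiving variant that uses Unicode property escapes so that apostrophes and accented letters pass. It still requires the "Last, First" shape, so a single name such as Ng would be rejected.

```javascript
// The asker's pattern: one character class repeated, so any run of
// letters, commas, plus signs, or spaces passes.
const original = /^[a-zA-Z,+ a-zA-Z]+$/;

// The corrected pattern: letters, then a comma and a space, then letters.
const strict = /^[a-zA-Z]+, [a-zA-Z]+$/;

// Assumed variant (not from the answer): any Unicode letter, plus
// apostrophes, spaces, dots, and hyphens inside each name part.
const permissive = /^\p{L}[\p{L}\p{M}' .-]*, \p{L}[\p{L}\p{M}' .-]*$/u;

console.log(original.test("john"));               // true
console.log(strict.test("john"));                 // false
console.log(strict.test("john, smith"));          // true
console.log(strict.test("O'Connor, John"));       // false
console.log(permissive.test("O'Connor, John"));   // true
console.log(permissive.test("Pérez, Esteban"));   // true
```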

That's not how character sets work:
/^[a-zA-Z]+, [a-zA-Z]+$/
Things to consider:
Any validation you do on the client can be bypassed
Some people may have names with accented letters
Some cultures don't use just two names

^[a-zA-Z]+, [a-zA-Z]+$
Should work. However, do you want to prevent '.', as in J. Hooker? And more words, like 'Jan van Hoogstrum'? Note also that you are preventing any accented characters from being used. An option (although it allows digits and underscores, and in JavaScript \w still only covers ASCII letters) is to use \w:
^\w+( \w+)*$
Which would give you 'name name name...'. 'First, last' is not a common way to enter your name, so you'll frustrate a lot of users that way.

The correct regexp for allowing only the first letter to be capital would be:
/^[A-Z][a-z]+, [A-Z][a-z]+$/

Related

How to select and replace certain words in input strings regardless of the full string. (More info for clarification below)

I am making a filter for a chat room I own.
I was successful in having it turn NSFW words into a bunch of symbols and asterisks to censor them, but many people bypass it by simply putting a backslash, period, or other symbol/letter after the word, because I only put in the words without the punctuation and symbols. They also come up with more creative methods such as eeeNSFWeee, so the filter doesn't count it as a word.
Is there a way to make it so that the filter will select certain characters that form a word in a string and replace them (with or without replacing the extra characters connected to the message)?
The Filter is made in javascript and Socket.io
Filter code:
const array = [
"NSFW",
"Bad Word",
"Innapropiate Word"
];
message = message
.split(" ")
.map((word) => (array.includes(word.toLowerCase()) ? "$#!%" : word))
.join(" ");
For example, if somebody typed "Bad Word" exactly like that (caps are not a problem), it would censor it successfully.
But if somebody typed "Bad Word." that would be a problem: because it has a period, it counts as a different word. That's what I need fixed.
There are a number of approaches you could take here.
You could use replace() if you just want to remove symbols. For example:
word.replace(/[&\/\\#,+()$~%.`'"!;\^:*?<>{}_\[\]]/g, '')
You could use Regular Expressions in general, which allows you to match on patterns instead of exact string matching.
You could also use more complex fuzzy matching libraries or custom fuzzy matching to accomplish your goal. This post may be helpful.
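As a rough sketch of the regular-expression approach, the snippet below builds one case-insensitive pattern from the word list and lets any non-alphanumeric characters sit between the letters of a banned word. The names bannedWords, escapeRegExp, buildFilter, and censor are made up for this illustration, and the word list is a placeholder. Because it matches inside longer strings (which is what catches eeeNSFWeee), it can also hit innocent words that happen to contain a banned one, so treat it as a starting point only.

```javascript
// Placeholder list; entries are matched case-insensitively.
const bannedWords = ["nsfw", "bad word"];

// Escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one pattern that allows any run of non-alphanumeric characters
// (spaces, dots, slashes, ...) between the letters of each banned word.
function buildFilter(words) {
  const alternatives = words.map((word) =>
    Array.from(word.replace(/\s+/g, ""))
      .map(escapeRegExp)
      .join("[^a-z0-9]*")
  );
  return new RegExp(alternatives.join("|"), "gi");
}

// Replace every match, wherever it occurs in the message.
function censor(message, pattern = buildFilter(bannedWords)) {
  return message.replace(pattern, () => "$#!%");
}

console.log(censor("Bad Word."));   // $#!%.
console.log(censor("eeeNSFWeee"));  // eee$#!%eee
console.log(censor("N.S.F.W"));     // $#!%
```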

How to get the valid part of a regex match

I want to test if a user string is "ok so far", in that it might not be valid as a whole but it is a subset of a valid one.
I have a regex say ^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]$
such that "1234-1234-5678-5678" is valid
"1234-12" or even "1" does not match the pattern, but it is a valid prefix of a valid string; in other words, the input is ok so far.
Is there a neat way of doing this without making many, many regexes? It's Friday.
Not sure if I understood well your problem, but I think you want to have something like this:
^([0-9]{4}-){1,3}[0-9]{1,4}$
Working demo
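This matches one to three complete groups of 4 digits, each followed by a dash, and then a final group of 1 to 4 digits. Note that it needs at least one complete group, so a bare "1" or a trailing "1234-" is not accepted.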
You can also shorten your regex with:
^(\d{4}-){1,3}\d{1,4}$
You could possibly use one final regex to validate the complete form you currently have, and a separate on-the-fly regex to check that the partial user input is valid so far.
My idea would be to have ([0-9]{1,4}-)+
For your case this will check as one types:
/^(\d(\d(\d(\d(-(\d(\d(\d(\d(-(\d(\d(\d(\d(-(\d)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?)?$/
This regex will match key for key as you type, although it is a little cumbersome.
^([0-9]{1,4}|[0-9]{4}-[0-9]{0,4}|[0-9]{4}-[0-9]{4}-[0-9]{0,4}|[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{0,4})$
Here is a live example
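A different way to get the same "ok so far" check without writing out a long regex is to compare the input against a template character by character. This is only a sketch under assumptions: the TEMPLATE string, the use of 'd' to mean "a digit", and the function names isOkSoFar and isComplete are invented here, and the template assumes four groups of four digits as in the example string.

```javascript
// Hypothetical template: 'd' stands for a digit, any other character
// must appear literally at that position.
const TEMPLATE = "dddd-dddd-dddd-dddd";

// True if the input is a prefix of some string that fits the template.
function isOkSoFar(input, template = TEMPLATE) {
  if (input.length > template.length) return false;
  for (let i = 0; i < input.length; i++) {
    const expected = template[i];
    const actual = input[i];
    const ok =
      expected === "d" ? actual >= "0" && actual <= "9" : actual === expected;
    if (!ok) return false;
  }
  return true;
}

// True only when the whole template has been filled in.
function isComplete(input, template = TEMPLATE) {
  return input.length === template.length && isOkSoFar(input, template);
}

console.log(isOkSoFar("1"));                    // true
console.log(isOkSoFar("1234-12"));              // true
console.log(isOkSoFar("12345"));                // false
console.log(isComplete("1234-1234-5678-5678")); // true
```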

Rewrite regex to accept conditional terms

^([a-z0-9_\.-])+@[yahoo]{5}\.([com]{3}\.)?[com]{3}$
This currently matches xxxx@yahoo.com. How can I rewrite it to match some additional domains, for example gmail.com and deadforce.com? I tried the following, but it did not work. What am I doing wrong?
^([a-z0-9_\.-])+@[yahoo|gmail|deadforce]{5,9}\.([com]{3}\.)?[com]{3}$
Thank you in advance!
Your regex doesn't say what you think it says.
^([a-z0-9_\.-])+@[yahoo]{5}\.([com]{3}\.)?[com]{3}$
The first part says: any of the characters a-z, 0-9, _, ., or -, one or more times.
The later part, where you are trying to match yahoo.com, is incorrect. [yahoo]{5} says that any of the characters y, a, h, or o may appear, exactly 5 times. The same goes for [com]{3}, so aaaaa.ooo would be valid here. I'm not sure what the ([com]{3}\.)?[com]{3} was trying to say, but I presume you wanted to check for .com.
See character classes documentation here, http://www.regular-expressions.info/charclass.html.
What you want is
^([a-z0-9_.\-])+@yahoo\.com$
or, for more domains, use grouping:
^([a-z0-9_.\-])+@(yahoo|gmail|deadforce)\.com$
You haven't stated what language you are using so a real demo can't be given.
Functional demo, https://jsfiddle.net/qa9x9hua/1/
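For illustration, here is a short JavaScript sketch of the difference between a character class and a group with alternation, assuming the separator is the usual @ of an email address. The names brokenDomain, emailPattern, and isAllowedEmail are invented for this example, and the pattern only checks the shape discussed above; it is not a general email validator.

```javascript
// A character class matches single characters, so this accepts any 5 to 9
// characters drawn from the letters (and the |) listed inside the brackets.
const brokenDomain = /^[yahoo|gmail|deadforce]{5,9}$/;

// A group with alternation matches one of the whole words instead.
const emailPattern = /^[a-z0-9_.-]+@(yahoo|gmail|deadforce)\.com$/;

// Hypothetical helper wrapping the pattern.
function isAllowedEmail(address) {
  return emailPattern.test(address);
}

console.log(brokenDomain.test("ooooo"));             // true
console.log(isAllowedEmail("xxxx@yahoo.com"));       // true
console.log(isAllowedEmail("xxxx@deadforce.com"));   // true
console.log(isAllowedEmail("xxxx@ooooo.com"));       // false
```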
Email validation is a notoriously difficult problem, and many people have failed quite horribly at trying to validate them themselves.
PHP's filter_var has a filter just for emails. Use that to check for email address validity. See http://php.net/manual/en/function.filter-var.php
if (filter_var('bob@example.com', FILTER_VALIDATE_EMAIL)) {
// Email is valid
}
There's probably no downside to doing the domain check the easy way. Just check for the domain strings in the email address. e.g.
if (
filter_var($email, FILTER_VALIDATE_EMAIL) &&
preg_match('/@(yahoo|gmail|deadforce)\.com$/', $email)
) {
// Email is valid
}
In terms of your original regular expression, quite a lot of it was incorrect, which is why you were having trouble changing it.
regexper shows what you've created.
([a-z0-9_\.-])+ should be [a-z0-9_\.-]+ or ([a-z0-9_\.-]+)
The () are only capturing results in this section. If you want results move the brackets, if not remove them.
[yahoo]{5} should be yahoo
That's matching 5 characters that are one of y,a,h,o so it would match hayoo etc.
\.([com]{3}\.)?[com]{3} should be \.com
Dunno what this was trying to accomplish but you only wanted .com
Take a look at http://www.regular-expressions.info/tutorial.html for a guide to regular expressions.

Javascript Simple check for webform to make sure it contains both numbers and letters (instead of my code)

I have this code:
if(address.length<=0)
{
msg.setAttribute("style", "color:red");
msg.innerHTML='Please enter address';
return false;
}
I would like to change so it checks whether the webform contains BOTH numbers and letters. Can you help me?
Thank you so much,
Jones
p.s.: So I want to make sure they also enter street name AND house number as well (example: 24 Sunshine street would be good, but if they forget house number, they would get the message).
That doesn't look like PHP at all. More like JavaScript...
Here's one way to do it in JavaScript:
var re = /^\d+\s+\D+$/;
if (re.test(address)) {
// We get here if the address is correctly formatted
}
else {
// We get here if the string is badly formatted
}
The regex works like this:
\d+ matches one or more digits
\s+ matches one or more whitespace characters
\D+ matches one or more non-digit characters (letters, but also spaces and punctuation)
If you want to accept both "24 Sunshine" and "Sunshine 24" you could instead use this (the alternatives are grouped so that ^ and $ apply to both of them):
/^(?:\d+\s+\D+|\D+\s+\d+)$/
And if we want to be extra safe and guard against the user entering extra leading or trailing spaces, we could either trim the string or use this regex:
/^\s*(?:\d+\s+\D+|\D+\s+\d+)\s*$/
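If the order of street name and house number does not matter at all, a simpler sketch is to run two separate tests, one for a letter and one for a digit. The function name hasStreetAndNumber is made up for this example, and it only looks for ASCII letters, which is an assumption about the expected input.

```javascript
// Returns true only if the address contains at least one letter
// and at least one digit, in any order.
function hasStreetAndNumber(address) {
  const text = address.trim();
  const hasLetter = /[A-Za-z]/.test(text);
  const hasDigit = /\d/.test(text);
  return hasLetter && hasDigit;
}

console.log(hasStreetAndNumber("24 Sunshine street")); // true
console.log(hasStreetAndNumber("Sunshine street"));    // false
console.log(hasStreetAndNumber("24"));                 // false
```

In the asker's snippet, this check could sit next to the existing address.length test and show the same red message when it returns false.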
Apart from a regular expression, which is a very nice and clear solution, you can use these PHP functions:
first ctype_alnum(), to check that the string consists only of letters and digits (note that it returns false if the string contains spaces, so strip them first), and then
ctype_alpha(), in case the above is true, to check whether the user forgot to enter the number.
In case you are interested, there is also ctype_digit(), for checking whether the user gave the number but left out the street name.
Or, if you just want a regex, this will do the job:
^[a-zA-Z]([a-zA-Z-]+\s)+\d{1,4}

Surname regular expression

I'm trying to create a good regular expression for people's Surname.
It should be valid if a Surname is:
abcd
abcd'efg
abcd-efg
abcd, .efg
etc...
I also need to test if symbols do not repeat... so for example:
abcd''efg
abcd-',
Are invalid but the one:
abcd, .efg
Can be valid.
At the moment I just created this:
^[a-z .',-_]+$
And now I'm trying to check for all the double symbols but I cannot go ahead successfully.
It's a bad idea. There is no international list of allowed characters that people could use in their names. Some surnames even contain Unicode symbols, so it will not be possible to write a regex that validates all of them correctly. Even if you can come up with a regex, it might be so generic that it wouldn't be effective.
Read this article for why you shouldn't be doing this: Falsehoods Programmers Believe About Names
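If, after reading Amal Murali's insightful post, you still want to do this with a regex, please see this (the hyphen is escaped inside the character class, because an unescaped ,-_ is a range that also lets digits and uppercase letters through):
/^(?![^'\-_\n]*['\-_][^'\-_\n]*['\-_])[a-z .',\-_]+$/m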
View a regex demo!
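For comparison, here is a sketch that splits the check into two simpler tests. It rests on two assumptions that the question does not fully settle: that "repeating" means an apostrophe, hyphen, or comma directly followed by another one of those, and that letters from any script should be allowed. The names ALLOWED, DOUBLED, and looksLikeSurname are invented for this example.

```javascript
// Assumption: letters from any script, plus space . ' , and - are allowed.
const ALLOWED = /^[\p{L}\p{M} .',-]+$/u;

// Assumption: "repeating" means two of ' - , directly next to each other.
const DOUBLED = /['\-,]{2}/;

function looksLikeSurname(value) {
  return ALLOWED.test(value) && !DOUBLED.test(value);
}

console.log(looksLikeSurname("abcd'efg"));   // true
console.log(looksLikeSurname("abcd, .efg")); // true
console.log(looksLikeSurname("abcd''efg"));  // false
console.log(looksLikeSurname("abcd-',"));    // false
```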
