optional characters in javascript regular expression - javascript

I am trying to build a regular expression in JavaScript that checks for 3 word characters; however, 2 of them are optional. So I have:
/^\w\w\w/i
What I am stumped on is how to make it so that the user does not have to enter the last two letters, but if they do, they have to be letters.

You can use this regular expression:
/^\w{1,3}$/i
The quantifier {1,3} means to repeat the preceding expression (\w) at least 1 and at most 3 times. Additionally, $ marks the end of the string similar to ^ for the start of the string. Note that \w does not just contain the characters a–z and their uppercase counterparts (so you don’t need to use the i modifier to make the expression case insensitive) but also the digits 0–9 and the low line character _.
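Since the question asks specifically for letters, here is a minimal sketch of a letters-only variant next to the \w version. The helper name isShortWord is hypothetical, and treating "letters" as ASCII a-z (case-insensitive) is an assumption about what the asker needs.

```javascript
// \w{1,3} accepts letters, digits and the underscore.
const wordChars = /^\w{1,3}$/;

// Assumed letters-only variant: ASCII a-z, made case-insensitive by the i flag.
const lettersOnly = /^[a-z]{1,3}$/i;

// Hypothetical helper: true for 1 to 3 ASCII letters and nothing else.
function isShortWord(input) {
  return lettersOnly.test(input);
}

console.log(wordChars.test("a_1")); // true
console.log(isShortWord("a_1"));    // false
console.log(isShortWord("Ab"));     // true
console.log(isShortWord("abcd"));   // false
```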

Like this:
/^\w\w?\w?$/i
The ? marks the preceding expression as optional.
The $ is necessary to anchor the end of the regex.
Without the $, it would also match a string such as a!! or abcd, because the regex would be satisfied by the leading word characters and ignore the rest. The $ forces the regex to match the entire string.
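The effect of the end anchor can be checked with a short sketch; the sample inputs are made up for illustration.

```javascript
const anchored = /^\w\w?\w?$/;
const unanchored = /^\w\w?\w?/;

// Without $, a valid prefix is enough for a match.
console.log(unanchored.test("a!!"));      // true (only "a" is matched)
console.log(unanchored.test("abcd"));     // true (only "abc" is matched)
console.log("abcd".match(unanchored)[0]); // "abc"

// With $, the whole string has to fit the pattern.
console.log(anchored.test("a!!"));  // false
console.log(anchored.test("abcd")); // false
console.log(anchored.test("ab"));   // true
```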

Related

Regular Expression allows more than specified characters

Hi, I am new to regular expressions. I tried creating a regular expression based on the conditions below:
Maximum 9 characters are allowed
First character must be upper case
Ending character must be 0-9
Must contain following special character ($,%,#)
/^[A-Z][a-z0-9A-Z$#%]{3,9}(?=.*[#$%]).\d+$/
What is wrong in my regular expression?
^[A-Z](?=.*[#$%])[a-z0-9A-Z$#%]{1,7}\d$
You need to move the lookahead to the start. \d+ should be \d. {3,9} should be {1,7}.
Breaking the regex down
/^[A-Z][a-z0-9A-Z$#%]{3,9}(?=.*[#$%]).\d+$/
^ # Match the start of a string
[A-Z] # First character must be a capital letter
[a-z0-9A-Z$#%]{3,9} # The next 3-9 characters must be alphanumeric or one of $, # and %.
(?=.*[#$%]) # Look-ahead, requiring that some character be one of $, # and % (note that this is strictly after the 3-9 character check)
. # Match any character
\d+ # Match one or more numeric digits
$ # Match the end of the string
Therefore a string like "Aaaa$55555555555555555" would be matched, even though it is far longer than 9 characters.
You need to change your look-ahead, probably moving it to before the 3-9 character check. You'll also want to make the length of that smaller, since you're explicitly allowing a capital letter as the first character and digit as the last character, so you'll probably want to match 1-7 characters instead of 3-9.
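As a rough check of the points above, the sketch below runs the original pattern and the corrected pattern from the first answer against a few inputs. The sample strings are invented for illustration.

```javascript
const original = /^[A-Z][a-z0-9A-Z$#%]{3,9}(?=.*[#$%]).\d+$/;
const corrected = /^[A-Z](?=.*[#$%])[a-z0-9A-Z$#%]{1,7}\d$/;

// 22 characters: the original pattern accepts it because \d+ is unbounded.
const tooLong = "Aaaa$" + "5".repeat(17);
console.log(original.test(tooLong));  // true
console.log(corrected.test(tooLong)); // false

console.log(corrected.test("Ab$1"));      // true
console.log(corrected.test("Abcdef$g1")); // true  (exactly 9 characters)
console.log(corrected.test("Abc1"));      // false (no $, # or %)
console.log(corrected.test("ab$1"));      // false (first character not upper case)
console.log(corrected.test("Ab$c"));      // false (does not end in a digit)
```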

How to capture group with optional number of letters/numbers followed by optional comma?

I'm altering the segment of a string in javascript and got the following working (first iteration -optional comma).
var foo = "wat:a,username:x,super:man"
foo.replace(/(username\:\w+)(?:,)*/,"go:home,");
//"wat:a,go:home,super:man"
The trick now is that I might actually replace a key/value with only the key ... so I need to capture the original group with both optional value + optional comma.
var foo = "wat:a,username:,super:man"
foo.replace(/ ????? /,"go:home,");
//"wat:a,go:home,super:man"
As a bonus I'd like the most concise way to capture both optional numbers/and letters (updating my original to also support)
var foo = "wat:a,username:999,super:man"
foo.replace(/ ????? /,"go:home,");
//"wat:a,go:home,super:man"
You need to replace the + (1 or more occurrences of the preceding subpattern) quantifier with the * (0 or more occurrences of the preceding subpattern).
See Quantifier Cheatsheet at rexegg.com:
A+ One or more As, as many as possible (greedy), giving up characters if the engine needs to backtrack (docile)
A* Zero or more As, as many as possible (greedy), giving up characters if the engine needs to backtrack (docile)
Besides, you are not using any of the capturing groups defined in the pattern, so I suggest removing them.
.replace(/username:\w*,*/,"go:home,")
And if you have just 1 optional ,, use just the ? quantifier (1 or 0 repetition of the preceding subpattern):
.replace(/username:\w*,?/,"go:home,")
Note that in case you can have any characters before the end of string or comma, you can also use Fede's suggestion of using a negated character class: /username:[^,]*,*/. The [^,]* matches any character (even a newline) other than a comma.
Also, please note that you do not need to escape a colon. The characters that must be escaped outside of character class to be treated as literals are ., *, +, ?, ^, $, {, (, ), |, [, ], \. See RegExp MDN reference.
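A minimal sketch applying the \w* and ,? suggestion to the three inputs from the question; the variable names are illustrative only.

```javascript
// \w* allows an empty value; ,? consumes at most one trailing comma.
const pattern = /username:\w*,?/;

const samples = [
  "wat:a,username:x,super:man",
  "wat:a,username:,super:man",
  "wat:a,username:999,super:man",
];

const results = samples.map((s) => s.replace(pattern, "go:home,"));
console.log(results);
// [ 'wat:a,go:home,super:man', 'wat:a,go:home,super:man', 'wat:a,go:home,super:man' ]
```

Note that when the key is the last segment, the replacement string still adds its own comma: "wat:a,username:x" becomes "wat:a,go:home,".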
I'm not sure if I understood your question, but if you want to match username:?? you can use below regex:
(username\:\w*)
Update: As stribizhev pointed out in his comment, \w* can do the trick. However, if you want to extend the regex to any characters besides letters or numbers, you can use:
(username\:[^,]*)
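The difference between \w* and [^,]* shows up once the value contains characters outside \w. A small sketch, using a made-up value:

```javascript
const wordOnly = /username:\w*,?/;
const anyValue = /username:[^,]*,?/;

// Hypothetical input whose value contains "." and "@".
const input = "wat:a,username:j.doe@example.com,super:man";

// \w* stops at the first non-word character, leaving the rest behind.
console.log(input.replace(wordOnly, "go:home,"));
// "wat:a,go:home,.doe@example.com,super:man"

// [^,]* consumes everything up to the next comma.
console.log(input.replace(anyValue, "go:home,"));
// "wat:a,go:home,super:man"
```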

JavaScript regular expressions to match no digits, whitespace and selected symbols

Thanks for taking a look.
My goal is to come up with a regexp that will match input that contains no digits, whitespace or the symbols !#£$%^&*()+= or any other symbol I may choose.
I am however struggling to grasp precisely how regular expressions work.
I started out with the simple pattern /\D/, which from my understanding will match the first non-digit character it can find. This would match the string 'James' which is correct but also 'James1' which I don't want.
So, my understanding is that if I want to ensure that a pattern is not found anywhere in a given string, I use the ^ and $ characters, as in /^\D$/. Now because this will only match a single character that is not a digit, I needed to use + to specify that 1 or more digits should not be found in the entire string, giving me the expression /^\D+$/. Brilliant, it no longer matches 'James1'.
Question 1
Is my reasoning up to this point correct?
The next requirement was to ensure no whitespace is in the given string. \s will match a single whitespace and [^\s] will match the first non-whitespace character. So, from my understanding I just had to add this to what I have already to match strings that contain no digits and no whitespace. Again, because [^\s] will only match a single non-whitespace character, I used + to match one or more non-whitespace characters, giving the new regexp of /^\D+[^\s]+$/.
This is where I got lost, as the expression now matches 'James1' or even 'James Smith25'. What? Massively confused at this point.
Question 2
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Question 3
How would I go about writing the regular expression I'm trying to solve?
While I am keen to solve the problem I am more interested in figuring where my understanding of regular expressions is lacking, so any explanations would be helpful.
Not quite. ^ and $ are actually "anchors": they mean "start" and "end". It's a little more complicated than that, but you can consider them to mean the start and end of a line for now; look up the various modifiers on regular expressions if you're interested in learning more about this. Unfortunately, ^ has an overloaded meaning: if used as the first character inside square brackets it means "not", which is the meaning you are already acquainted with. It's very important that you understand the difference between these two meanings and that the definition in your head actually applies only to character range matching!
Contributing further to your confusion is that \d means "a numerical digit" and \D means "not a numerical digit". Similarly \s means "a whitespace (space/tab/newline/etc.) character" and \S means "not a whitespace character."
It's worth noting that \d is effectively a shortcut for [0-9] (note that - has a special meaning inside square brackets), and \D is a shortcut for [^0-9].
The reason it's matching strings that contain spaces is that you've asked for "1+ non-digit characters followed by 1+ non-space characters", so it'll match lots of strings! I think that perhaps you don't understand that regular expressions match bits of strings: you're not adding constraints as you go, but rather building up bits of matchers that will match corresponding bits of strings.
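To see which part of the pattern consumes which part of the string, one can wrap each piece in a capturing group. This is only an illustrative sketch; the groups are added for inspection and are not part of the original pattern.

```javascript
// Same pattern as /^\D+[^\s]+$/, with capture groups added for inspection.
const inspected = /^(\D+)([^\s]+)$/;

const a = "James Smith25".match(inspected);
console.log(a[1]); // "James Smith"  (\D+ happily matches the space)
console.log(a[2]); // "25"           ([^\s]+ happily matches digits)

const b = "James1".match(inspected);
console.log(b[1]); // "James"
console.log(b[2]); // "1"
```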
/^[^\d\s!#£$%^&*()+=]+$/ is the answer you're looking for - I'd look at it like this:
i. [] - match a range of characters
ii. []+ - match one or more of that range of characters
iii. [^\d\s]+ - match one or more characters that do not match \d (numerical digit) or \s (whitespace)
iv. [^\d\s!#£$%^&*()+=]+ - here's a bunch of other characters I don't want you to match
v. ^[^\d\s!#£$%^&*()+=]+$ - now there are anchors applied, so this matcher has to apply to the whole line otherwise it fails to match
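A quick sketch of the final pattern against a handful of made-up inputs:

```javascript
const allowed = /^[^\d\s!#£$%^&*()+=]+$/;

console.log(allowed.test("James"));        // true
console.log(allowed.test("O'Neil-Smith")); // true  (' and - are not excluded)
console.log(allowed.test("James1"));       // false (digit)
console.log(allowed.test("James Smith"));  // false (whitespace)
console.log(allowed.test("James!"));       // false (excluded symbol)
console.log(allowed.test(""));             // false (+ needs at least one character)
```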
A useful website to explore regexes is http://regexr.com/3b9h7, which I supply with my suggested solution as an example. Edit: Pruthvi Raj's link to Debuggex is awesome!
Is my reasoning up to this point correct?
Almost. /\D/ matches any character other than a digit, but not just the first one (if you use g option).
and [^\s] will match the first non-whitespace character
Almost, [^\s] will match any non-whitespace character, not just the first one (if you use g option).
/^\D+[^\s]+$/ matching strings that contain spaces?
Yes, it does, because \D matches a space (space is not a digit).
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Because \D+ in /^\D+[^\s]+$/ can match spaces.
Conclusion:
Use
^[^\d\s!#£$%^&*()+=]+$
It will match strings that have no digits and spaces, and the symbols you do not allow.
Mind that to match a literal -, ] or [ inside a character class, you either need to escape it or, in the case of the hyphen, place it at the start or end of the class. In JavaScript, a literal ] inside a class must always be escaped. To play it safe, escape them.
Just insert every character you don't want to include in a negated character class as follows:
^[^\s\d!#£$%^&*()+=]*$
^ - start of the string
[^...] - matches one character that is not in `...`
\s - matches a whitespace character (space, newline, tab)
\d - matches a digit from 0 to 9
* - a quantifier that repeats the immediately preceding element 0 or more times
$ - end of the string
So the regex matches any string that:
1. has a beginning,
2. contains 0 or more characters that are not whitespace, digits, or any of the symbols included in the character class (in this example !#£$%^&*()+=), and
3. has an ending.
NOTE:
If the symbols you want to exclude also include a hyphen (-), don't put it between other characters, because it is a metacharacter inside a character class; put it last in the character class.
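The hyphen pitfall can be illustrated with a short sketch. The three character classes below are hypothetical variations on the pattern above, shortened for readability.

```javascript
// Hyphen last: it is a literal hyphen.
const hyphenLast = /^[^\s\d!#-]*$/;
// Hyphen between ! and #: it forms the range U+0021..U+0023, which includes ".
const hyphenRange = /^[^\s\d!-#]*$/;
// Escaped hyphen: literal regardless of position.
const hyphenEscaped = /^[^\s\d!\-#]*$/;

console.log(hyphenLast.test("a-b"));    // false (hyphen is excluded)
console.log(hyphenLast.test('a"b'));    // true
console.log(hyphenRange.test("a-b"));   // true  (hyphen is NOT excluded)
console.log(hyphenRange.test('a"b'));   // false (" falls inside the range)
console.log(hyphenEscaped.test("a-b")); // false
```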

Regex: string up to 20char long, without specific characters

I am trying to make regexp for validating string not containing
^ ; , & . < > | and having 1-20 characters. Any other Unicode characters are valid (asian letters for example).
How to do it?
You can use the following:
^[^^;,&.<>|]{1,20}$
Explanation:
^ assert starting of the string
[^ start of the negated character class ([^ ])
^;,&.<>| all the characters you don't want to match
] close the negated character class
{1,20} range of matches
$ assert ending of the string
It will match any character other than specified characters within range of 1-20.
Your regex \w[^;,&.<>|]{1,20} contains \w that might not match all Unicode letters (I guess your regex flavor does not match Unicode letters with \w). Anyway, the \w only matches 1 character in your pattern.
Also, you say you need to exclude ^ but it is missing in your pattern.
When you want to validate length, you also must use ^/$ anchors to mark the beginning and end of a string.
To create a pattern for some range that does not match specific characters, you need a negated character class with anchors around it, and the length is set with limiting quantifiers:
^[^^;,&.<>|]{1,20}$
Or (this version makes sure we only match at the beginning and end of the string, never a line):
\A[^^;,&.<>|]{1,20}\z
Note that inside a character class, almost all special characters do not require escaping (only some of them, none in your case). Even the ^ caret symbol.
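The question does not name a regex flavor, so the following is only a sketch of how the pattern behaves in JavaScript. One assumption worth flagging: without the u flag, JavaScript counts UTF-16 code units, so characters outside the Basic Multilingual Plane count as two; adding the u flag makes the quantifier count code points instead.

```javascript
const byCodeUnit = /^[^^;,&.<>|]{1,20}$/;
const byCodePoint = /^[^^;,&.<>|]{1,20}$/u;

console.log(byCodeUnit.test("日本語"));         // true
console.log(byCodeUnit.test("a^b"));            // false (contains ^)
console.log(byCodeUnit.test("a.b"));            // false (contains .)
console.log(byCodeUnit.test(""));               // false (needs at least 1 character)
console.log(byCodeUnit.test("x".repeat(21)));   // false (too long)

// 11 emoji are 22 UTF-16 code units but only 11 code points.
const emoji = "😀".repeat(11);
console.log(byCodeUnit.test(emoji));  // false
console.log(byCodePoint.test(emoji)); // true
```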

decoding a JS regular expression

I am going through some legacy code and I came across this regular expression:
var REGEX_STRING_REGEXP = /^\/(.+)\/([a-z]*)$/;
I am slightly confused as to what this regular expression signifies.
I have so far concluded the following:
Begin with /
Then any character (numeric, alphabetic, symbols, spaces)
then a forward slash
End with alphabetic characters
Can someone advise?
You can use a tool like Regexper to visualise your regular expressions.
regex: /^\/(.+)\/([a-z]*)$/
^ : anchor regex to start of string
\/ : a literal forward slash (escaped so that it does not end the regex literal)
(.+) : 1 or more instances of any character except a line terminator
\/ : another literal forward slash
([a-z]*) : 0 or more lowercase characters a-z
$ : anchor regex to end of string
In summary, your regular expression matches strings that start with a forward slash, followed by 1 or more characters of any kind (other than line terminators), then another forward slash, then 0 or more lowercase characters a-z up to the end of the string. Because .+ is greedy, the second forward slash in the pattern matches the last forward slash in the string. Lastly, since both (.+) and ([a-z]*) are surrounded by parentheses, they capture whatever they match when you use the expression in regular expression operations.
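The capture groups can be inspected with a short sketch. The helper parseRegexString is hypothetical, and the guess that the legacy code uses this pattern to split a regex written as a string into its body and flags is an assumption based on the constant's name.

```javascript
const REGEX_STRING_REGEXP = /^\/(.+)\/([a-z]*)$/;

// Hypothetical helper: split "/body/flags" into its two captured parts.
function parseRegexString(text) {
  const match = REGEX_STRING_REGEXP.exec(text);
  if (match === null) {
    return null;
  }
  const [, source, flags] = match;
  return { source, flags };
}

console.log(parseRegexString("/ab+c/gi"));          // { source: 'ab+c', flags: 'gi' }
console.log(parseRegexString("/string/something")); // { source: 'string', flags: 'something' }
console.log(parseRegexString("/a/b/i"));            // { source: 'a/b', flags: 'i' } (greedy .+)
console.log(parseRegexString("abc"));               // null
```

Note that [a-z]* accepts any lowercase letters, so the pattern does not check that the second group contains valid regex flags.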
I would suggest going to rubular, placing the regex ^/(.+)/([a-z]*)$ in the top field and playing with example strings in the test string box to better understand what strings will fit within that regex. (/string/something for example will work with your regular expression).
