Regular Expression matching space but not any non-word character Javascript - javascript

I want to write such one using javascript that allow any char or space but not any other non-word character:
david johan // pass
david johan mark // pass
david## johan // doesn't pass
I have used this
/^(([a-zA-Z]{3,30})+[ ]+([a-zA-Z]{3,30})+)+$/
but it doesn't work
any suggestions ?

Hard to tell exactly what you're after, but I think this will do it:
/^([a-z0-9]|\s)*$/i
The ^ means it needs to start with the code in parentheses, and the $ means it needs to end with one of those characters too. * means 0 or more of the preceding expression and the bit inside the parens means any letter in the range a-z or number 0-9 or (|) any space character, tab, new line etc (\s).
That should match any letter or number and it has the case insensitive flag (i) on it too, it also accounts for white space.
If it was okay to include _ then you could have used /^(\w|\s)*$/

You can use this regular expression:
^[a-zA-Z ]*$
or in Javascript:
var re = new RegExp('^[a-zA-Z ]*$');

try this pattern \w+\s+\w+
Demo
console.log(/\w+\s+\w+/g.test("david johan"))
console.log(/\w+\s+\w+/g.test("david johan mark "))
console.log(/\w+\s+\w+/g.test("david## johan"))
console.log(/\w+\s+\w+/g.test("david ## johan"))
console.log(/\w+\s+\w+/g.test("this should not match!"));

Related

How to modify this hashtag regex to check if the second character is a-z or A-Z?

I'm building on a regular expression I found that works well for my use case. The purpose is to check for what I consider valid hashtags (I know there's a ton of hashtag regex posts on SO but this question is specific).
Here's the regex I'm using
/(^|\B)#(?![0-9_]+\b)([a-zA-Z0-9_]{1,20})(\b|\r)/g
The only problem I'm having is I can't figure out how to check if the second character is a-z (the first character would be the hashtag). I only want the first character after the hashtag to be a-z or A-Z. No numbers or non-alphanumeric.
Any help much appreciated, I'm very novice when it comes to regular expressions.
As I mentioned in the comments, you can replace [a-zA-Z0-9_]{1,20} with [a-zA-Z][a-zA-Z0-9_]{0,19} so that the first character is guaranteed to be a letter and then followed by 0 to 19 word characters (alphanumeric or underscore).
However, there are other unnecessary parts in your pattern. It appears that all you need is something like this:
/(?:^|\B)#[a-zA-Z][a-zA-Z0-9_]{0,19}\b/g
Demo.
Breakdown of (?:^|\B):
(?: # Start of a non-capturing group (don't use a capturing group unless needed).
^ # Beginning of the string/line.
| # Alternation (OR).
\B # The opposite of `\b`. In other words, it makes sure that
# the `#` is not preceded by a word character.
) # End of the non-capturing group.
Note: You may also replace [a-zA-Z0-9_] with \w.
References:
Word Boundaries.
Difference between \b and \B in regex.
The below should work.
(^|\B)#(?![0-9_]+\b)([a-zA-Z][a-zA-Z0-9_]{0,19})(\b|\r)
If you only want to accept two or more letter hashtags then change {0,19} with {1,19}.
You can test it here
In your pattern you use (?![0-9_]+\b) which asserts that what is directly on the right is not a digit or an underscore and can match a lot of other characters as well besides an upper or lower case a-z.
If you want you can use this part [a-zA-Z0-9_]{1,20} but then you have to use a positive lookahead instead (?=[a-zA-Z]) to assert what is directly to the right is an upper or lower case a-z.
(?:^|\B)#(?=[a-zA-Z])[a-zA-Z0-9_]{1,20}\b
Regex demo

Regex match word after negated set

I'm currently trying to match the following cases with Regex.
Current regex
\.\/[^/]\satoms\s\/[^/]+\/index\.js
Cases
// Should match
./atoms/someComponent/index.js
./molecules/someComponent/index.js
./organisms/someComponent/index.js
// Should not match
./atomsdsd/someComponent/index.js
./atosdfms/someComponent/index.js
./atomssss/someComponent/index.js
However none of the cases are matching, what am I doing wrong?
Hope this will help you out. You have added some addition characters which lets your regex to fail.
Regex: \.\/(atoms|molecules|organisms)\/[^\/]+\/index\.js
1. \.\/ This will match ./
2. (atoms|molecules|organisms) This will match either atoms or molecules or organisms
3. \/[^\/]+\/ This will match / and then till /
4. index\.js This will match index.js
Regex demo
why not just this simpler pattern?
\.\/(atoms|molecules|organisms)\/.*?index\.js
Try the following:
\.\/(atoms|molecules|organisms)\/[a-zA-Z]*\/index\.js
Forward slashes (and other special characters) should be escaped with a back slash \.
\.\/(atoms|molecules|organisms)\/ matches '.atoms/' or .molecules or organisms strictly. Without the parenthesis it will match partial strings. the | is an alternation operator that matches either everything to the left or everything to the right.
[a-zA-Z]* will match a string of any length with characters in any case. a-z accounts for lower case while A-Z accounts for upper case. * indicates one or more characters. Depending on what characters may be in someCompenent you may need to account for numbers using [a-zA-Z\d]*.
\/index\.js will match '/index.js'

decoding a JS regular expression

I am going through some legacy code and I came across this regular express:
var REGEX_STRING_REGEXP = /^\/(.+)\/([a-z]*)$/;
I am slightly confused as to what this regular expression signifies.
I have so far concluded the following:
Begin with /
Then any character (numeric, alphabetic, symbols, spaces)
then a forward slash
End with alphabetic characters
Can someone advice?
You can use a tool like Regexper to visualise your regular expressions. If we pass your regular expression into Regexper, we'll be given the following visualisation:
Direct link to Regexper result.
regex: /^/(.+)/([a-z]*)$/
^ : anchor regex to start of line
(.+) : 1 or more instances of word characters, non-word characters, or digits
([a-z]*) : 0 or more instances of any single lowercase character a-z
$ : anchor regex to end of line
In summary, your regular expression is looking to match strings where it is the first forwardslash, then 1 or more instances of word characters, non-word characters, or digits followed, then another forwardslash, then 0 or more instances of any single lowercase character a-z. Lastly, since both (.+) and ([a-z]*) are surrounded in parenthesis, they will capture whatever matches when you use them to perform regular expression operations.
I would suggest going to rubular, placing the regex ^/(.+)/([a-z]*)$ in the top field and playing with example strings in the test string box to better understand what strings will fit within that regex. (/string/something for example will work with your regular expression).

I want to ignore square brackets when using javascript regex [duplicate]

This question already has answers here:
Why is this regex allowing a caret?
(3 answers)
Closed 1 year ago.
I am using javascript regex to do some data validation and specify the characters that i want to accept (I want to accept any alphanumeric characters, spaces and the following !&,'\- and maybe a few more that I'll add later if needed). My code is:
var value = userInput;
var pattern = /[^A-z0-9 "!&,'\-]/;
if(patt.test(value) == true) then do something
It works fine and excludes the letters that I don't want the user to enter except the square bracket and the caret symbols. From all the javascript regex tutorials that i have read they are special characters - the brackets meaning any character between them and the caret in this instance meaning any character not in between the square brackets. I have searched here and on google for an explanation as to why these characters are also accepted but can't find an explanation.
So can anyone help, why does my input accept the square brackets and the caret?
The reason is that you are using A-z rather than A-Za-z. The ascii range between Z (0x5a) and a (0x61) includes the square brackets, the caret, backquote, and underscore.
Your regex is not in line with what you said:
I want to accept any alphanumeric characters, spaces and the following !&,'\- and maybe a few more that I'll add later if needed
If you want to accept only those characters, you need to remove the caret:
var pattern = /^[A-Za-z0-9 "!&,'\\-]+$/;
Notes:
A-z also includesthe characters: [\]^_`.
Use A-Za-z or use the i modifier to match only alphabets:
var pattern = /^[a-z0-9 "!&,'\\-]+$/i;
\- is only the character -, because the backslash will act as special character for escaping. Use \\ to allow a backslash.
^ and $ are anchors, used to match the beginning and end of the string. This ensures that the whole string is matched against the regex.
+ is used after the character class to match more than one character.
If you mean that you want to match characters other than the ones you accept and are using this to prevent the user from entering 'forbidden' characters, then the first note above describes your issue. Use A-Za-z instead of A-z (the second note is also relevant).
I'm not sure what you want but I don't think your current regexp does what you think it does:
It tries to find one character is not A-z0-9 "!&,'\- (^ means not).
Also, I'm not even sure what A-z matches. It's either a-z or A-Z.
So your current regexp matches strings like "." and "Hi." but not "Hi"
Try this: var pattern = /[^\w"!&,'\\-]/;
Note: \w also includes _, so if you want to avoid that then try
var pattern = /[^a-z0-9"!&,'\\-]/i;
I think the issue with your regex is that A-z is being understood as all characters between 0x41 (65) and 0x7A (122), which included the characters []^_` that are between A-Z and a-z. (Z is 0x5A (90) and a is 0x61 (97), which means the preceding characters take up 0x5B thru 0x60).

Creating javascript regex tp replace characters using whitelist

I'm trying to create a regex which will replace all the characters which are not in the specified white list (letters,digits,whitespaces, brackets, question mark and explanation mark)
This is the code :
var regEx = /^[^(\s|\w|\d|()|?|!|<br>)]*?$/;
qstr += tempStr.replace(regEx, '');
What is wrong with it ?
Thank you
The anchors are wrong - they only allow the regex to match the entire string
The lazy quantifier is wrong - you wouldn't want the regex to match 0 characters (if you have removed the anchors)
The parentheses and pipe characters are wrong - you don't need them in a character class.
The <br> is wrong - you can't match specific substrings in a character class.
The \d is superfluous since it's already contained in \w (thanks Alex K.!)
You're missing the global modifier to make sure you can do more than one replace.
You should be using + instead of * in order not to replace lots of empty strings with themselves.
Try
var regEx = /[^\s\w()?!]+/g;
and handle the <br>s independently (before that regex is applied, or the brackets will be removed).
You'll want to use the g (global) modifier:
var regEx = /^[^(\s|\w|\d|()|?|!|<br>)]*?$/g; // <-- `g` goes there
qstr += tempStr.replace(regEx, '');
This allows your expression to match multiple times.

Categories