Regular Expression - Match any character except +, empty string should also be matched - javascript

I am having a bit of trouble with one part of a regular expression that will be used in JavaScript. I need a way to match any character other than the + character; an empty string should also match.
[^+] is almost what I want except it does not match an empty string. I have tried [^+]* thinking: "any character other than +, zero or more times", but this matches everything including +.

Add a {0,1} to it so that it will only match zero or one times, no more no less:
[^+]{0,1}
Or, as FailedDev pointed out, ? works too:
[^+]?
As expected, testing with Chrome's JavaScript console shows that "+" itself is not matched (only an empty match is returned), while other characters are matched:
x = "+"
y = "A"
x.match(/[^+]{0,1}/)
[""]
y.match(/[^+]{0,1}/)
["A"]
x.match(/[^+]?/)
[""]
y.match(/[^+]?/)
["A"]

[^+] means "match any single character that is not a +"
[^+]* means "match any number of characters that are not a +" - which almost seems like what I think you want, except that it will match zero characters if the first character (or even all of the characters) are +.
Use anchors to make sure that the expression validates the ENTIRE STRING:
^[^+]*$
means:
^ # assert at the beginning of the string
[^+]* # any character that is not '+', zero or more times
$ # assert at the end of the string

If you're just testing the string to see if it doesn't contain a +, then you should use:
^[^+]*$
This will match only if the ENTIRE string has no +.
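To make the difference between the unanchored and anchored patterns concrete, here is a small sketch. The helper name checkPlus and the third pattern (anchored, at most one character) are my own additions for illustration, not something from the answers above.

```javascript
// Unanchored: succeeds on any string, because it may match the empty string.
const optionalChar = /[^+]?/;
// Anchored: succeeds only when no "+" appears anywhere in the string.
const noPlusAnywhere = /^[^+]*$/;
// Anchored, at most one character: "" or a single non-"+" character.
const atMostOneNonPlus = /^[^+]?$/;

// Hypothetical helper that reports how each pattern treats a string.
function checkPlus(str) {
  return {
    optionalChar: optionalChar.test(str),
    noPlusAnywhere: noPlusAnywhere.test(str),
    atMostOneNonPlus: atMostOneNonPlus.test(str),
  };
}

console.log(checkPlus(""));    // all three true
console.log(checkPlus("A"));   // all three true
console.log(checkPlus("+"));   // optionalChar true, the anchored ones false
console.log(checkPlus("A+B")); // optionalChar true, the anchored ones false
```

The unanchored pattern is only useful as a building block inside a larger expression; on its own, test() with it can never return false.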

Related

Write a regex for usernames

I want a Regex for my mongoose schema to test if a username contains only letters, numbers and underscore, dash or dot. What I got so far is
/[a-zA-Z0-9-_.]/
but somehow it lets pass everything.
Your regex is set to match a string if it contains ANY of the contained characters, but it doesn't make sure that the string is composed entirely of those characters.
For example, /[a-zA-Z0-9-_.]/.test("a&") returns true, because the string contains the letter a, regardless of the fact that it also includes &.
To make sure all characters are one of your desired characters, use a regex that matches the beginning of the string ^, then your desired characters followed by a quantifier + (a plus means one or more of the previous set, a * would mean zero or more), then end of string $. So:
const reg = /^[a-zA-Z0-9-_.]+$/
console.log(reg.test("")) // false
console.log(reg.test("I-am_valid.")) // true
console.log(reg.test("I-am_not&")) // false
Try it like this, with start (^) and end ($) anchors:
^[a-zA-Z0-9-_.]+$
See demo : https://regex101.com/r/6v0nNT/3
/^([a-zA-Z0-9]|[-_\.])*$/
This regex should work.
^ matches at the beginning of the string. $ matches at the end of the string. This means it checks for the entire string.
The * allows it to match any number of characters or sequences of characters, including none, so an empty string also passes. This is required to match the entire username.
Now the parentheses are required for this as there is a | (or) used here. The first set of brackets is something you already included, and it covers capital/lowercase letters and numbers. The second set of brackets is used for the other characters. The . is escaped with a backslash because, outside a character class, it is a reserved character in regex denoting any character; inside brackets the escape is harmless but optional.
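As a quick check, the alternation form can be compared with a single character class. This is only a sketch; the names withAlternation, singleClass and nonEmpty are mine, and I moved the hyphen to the end of the class so it cannot be read as a range.

```javascript
// The pattern from the answer above: two classes joined by alternation.
const withAlternation = /^([a-zA-Z0-9]|[-_\.])*$/;
// Same set of characters in one class; "." and a trailing "-" are literal here.
const singleClass = /^[a-zA-Z0-9_.-]*$/;
// Using + instead of * additionally rejects the empty string.
const nonEmpty = /^[a-zA-Z0-9_.-]+$/;

for (const name of ["", "I-am_valid.", "I-am_not&"]) {
  console.log(
    JSON.stringify(name),
    withAlternation.test(name),
    singleClass.test(name),
    nonEmpty.test(name)
  );
}
// ""            true  true  false
// "I-am_valid." true  true  true
// "I-am_not&"   false false false
```

For a username you most likely want the + version, since an empty name should not validate.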

javascript regexp does not evaluate only one character

I'm using this regexp:
/[^+][a-z]/.test(str)
I'm trying to ensure that if there are any letters ([a-z]) in a string (str) not preceded by a plus ([^+]), a match is found and therefore it will return true.
It mostly works except when there is only one character in the string. For example, a returns false, even though there is no plus sign preceding it.
How can I ensure it works for all strings including one character strings. Thanks!
Add a ^ as an alternative to [^+]:
/(?:^|[^+])[a-z]/.test(str)
 ^^^^^^^^^^
The (?:^|[^+]) is a non-capturing alternation group matching either the start of the string (with ^) or (|) any char other than + (with [^+]).
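Here is a small comparison of the original pattern and the fixed one. I also added a negative lookbehind variant, which is my own suggestion rather than part of the answer; it assumes an engine with ES2018 lookbehind support, so check your target browsers before relying on it.

```javascript
// Original: needs some non-"+" character in front of the letter.
const original = /[^+][a-z]/;
// Fix from the answer: the letter may also sit at the start of the string.
const withStartAlternative = /(?:^|[^+])[a-z]/;
// Assumed alternative: a letter that is not immediately preceded by "+".
const withLookbehind = /(?<!\+)[a-z]/;

for (const str of ["a", "+a", "+ab", "1+a"]) {
  console.log(
    JSON.stringify(str),
    original.test(str),
    withStartAlternative.test(str),
    withLookbehind.test(str)
  );
}
// "a"   false true  true
// "+a"  false false false
// "+ab" true  true  true   (the "b" is not preceded by a plus)
// "1+a" false false false
```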

Why does my regexp not work when the strings end with spaces?

I am using this regexp - [^\s\da-zA-ZåäöÅÄÖ]+$ to filter out anything but A-Z, 0-9 plus the Swedish characters ÅÄÖ. It works as expected as long as the string isn't ending with whitespace, and I am a bit confused about what I need to correct to make it accept strings even if they end with whitespace. The \s is there but is apparently not enough.
What is wrong in my regexp?
"something #¤%&/()=?".replace(/[^\s\da-zA-ZåäöÅÄÖ]+$/, '') // => "something " (trailing symbols removed)
"something ending with whitespace #¤%&/()=? ".replace(/[^\s\da-zA-ZåäöÅÄÖ]+$/, '') // => unchanged, still ends with "#¤%&/()=? "
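You're using a negated character class ("anything that is not a space, a digit, a letter etc."), and the $ ties the match to the end of the string. A trailing space is excluded by the class, so nothing can match at the end and your regex fails to match.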
Drop the \s from it, and also the $ (which ties the match to the end of the string), and it should work.
If you do want to keep spaces inside the string and only remove them at the end, use
"something with whitespace #¤%&/()=? ".replace(/[^\s\da-zA-ZåäöÅÄÖ]+|\s+$/g, '')
Result (the space that preceded the removed symbols is kept, so one trailing space remains):
something with whitespace
Your regex says: "match one or more instances of the characters not in the following range, followed by end-of-string". This essentially means that your regex will match only sequences of not-allowed characters appearing at the end of the string. Since your test string ends with a whitespace, which is allowed by your logic, there's no 'sequence of not-allowed characters appearing at the end of the string' and so the regex doesn't match anything.
You can achieve your desired filtering if you remove the $ from the end of the regex and instead use the g flag to make it globally replace anything not in the specified character range with the empty string.
If you additionally want to trim trailing whitespace, it'd be better to do so using another regex, or a simpler trimEnd call (trimRight is an older alias for the same method).
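Putting the two steps together, a minimal sketch could look like this. The function name keepAllowed is hypothetical, and it assumes an engine that has String.prototype.trimEnd.

```javascript
// Hypothetical helper: drop every run of disallowed characters,
// then trim whatever whitespace is left at the end.
function keepAllowed(input) {
  return input.replace(/[^\s\da-zA-ZåäöÅÄÖ]+/g, "").trimEnd();
}

console.log(JSON.stringify(keepAllowed("something with whitespace #¤%&/()=? ")));
// "something with whitespace"
console.log(JSON.stringify(keepAllowed("Åke #1!")));
// "Åke 1"
```

Keeping the trim as a separate step means the character class only has to describe what is allowed, which is easier to read than an alternation with \s+$.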

can someone help to explain this regular expression in javascript?

This code is used to get rid of the MIME type from raw data, but I cannot understand how it works:
content.replace(/^[^,]*,/ , '')
it seems quite different from java.... any help will be appreciated.
Your MIME type is probably separated by a comma (,) and sits at the beginning of your raw data.
This regex says: take everything from the beginning (^) that is NOT a comma ([^,]*) (the star matches as many characters as there are before the first comma) and take the comma itself (,). Then replace it with nothing ('').
This one only gets the first occurrence because the ^ requires the match to start at the beginning of the string.
The first thing you need to know is that there are regex literals in JavaScript, constructed by pairs of slashes. So like "..." is a string, /.../ is a regex. That's actually the only difference your code shows as compared to a Java regex.
Then, [abc] within a regex is called a character class, meaning "one character out of a, b or c". Conversely, [^abc] is a negated character class, meaning "one character except a, b or c".
So your sample means:
/ # Start of regex literal
^ # Start the match at the start of the string
[^,]* # Match any number of characters except commas
, # Match a comma
/ # End of regex literal
The regular expression is the text between the two forward slashes. The first caret (^) means "at the beginning of the string". The brackets denote a character class, and the caret inside the brackets negates it, so [^,] means any character except a comma. The asterisk after the closing bracket means match zero or more of the characters defined by the character class (which again is any character except the comma), and finally the last comma means match the comma after all this. It's then used in a replace call, so the matched text will be replaced with the second parameter, in your case an empty string.
Basically it matches the first characters up to and including the first comma in the 'content' variable and then replaces it with an empty string.
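To see it in action, here is a short sketch. The function name and the sample data URI are made up for illustration; your actual raw data may look different.

```javascript
// Hypothetical wrapper around the replace call from the question.
function stripPrefixThroughFirstComma(content) {
  return content.replace(/^[^,]*,/, "");
}

// Assumed sample input: a data URI whose header ends at the first comma.
console.log(stripPrefixThroughFirstComma("data:text/plain;base64,SGVsbG8="));
// SGVsbG8=

// Only the part up to and including the FIRST comma is removed.
console.log(stripPrefixThroughFirstComma("a,b,c")); // b,c

// Without a comma there is no match, so the string comes back unchanged.
console.log(stripPrefixThroughFirstComma("no comma here")); // no comma here
```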

Help interpreting a javascript Regex

I have found the following expression which is intended to modify the id of a cloned html element e.g. change contactDetails[0] to contactDetails[1]:
var nel = 1;
var s = $(this).attr(attribute);
s.replace(/([^\[]+)\[0\]/, "$1["+nel+"]");
$(this).attr(attribute, s);
I am not terribly familiar with regex, but I have tried to interpret it with the help of The Regex Coach; however, I am still struggling. It appears that ([^\[]+) matches one or more characters which are not '[' and \[0\]/ matches [0]. The / in the middle I interpret as an 'include both', so I don't understand why the author has even included the first expression.
I don't understand what the $1 in the replace string is. If I use the Regex Coach replace functionality with simply [0] as the search and 1 as the replace, I get the correct result; however, if I change the JavaScript to s.replace(/\[0\]/, "["+nel+"]"); the string s remains unchanged.
I would be grateful for any advice as to what the original author intended and help in finding a solution which will successfully replace a number in square brackets anywhere within a search string.
Find
/ # Signifies the start of a regex expression like " for a string
([^\[]+) # Capture the character that isn't [ 1 or more times into $1
\[0\] # Find [0]
/ # Signifies the end of a regex expression
Replace
"$1[" # Insert the item captured above And [
+nel+ # New index
"]" # Close with ]
To create an expression that captures any digit, you can replace the 0 with \d+ which will match a digit 1 or more times.
s.replace(/([^\[]+)\[\d+\]/, "$1["+nel+"]");
The $1 is a backreference to the first group in the regex. Groups are the pieces inside (). So, in this case $1 will be replaced by whatever the ([^\[]+) part matched.
If the string was contactDetails[0] the resulting string would be contactDetails[1].
Note that this regex only replaces 0s inside square brackets. If you want to replace any number you will need something like:
([^\[]+)\[\d+\]
The \d matches any digit character. \d+ then becomes any sequence of at least one digit.
But your code will still not work, because Javascript strings are immutable. That means they can't be changed once created. The replace method returns a new string, instead of changing the original one. You should use:
s = s.replace(...)
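A jQuery-free sketch of the same fix, so it can be run on its own. The helper name reindex is hypothetical; in the original code the result would be passed to $(this).attr(attribute, s).

```javascript
// Hypothetical helper: returns a copy of `value` with the first bracketed
// number replaced by `newIndex`.
function reindex(value, newIndex) {
  return value.replace(/([^\[]+)\[\d+\]/, "$1[" + newIndex + "]");
}

let s = "contactDetails[0]";
s.replace(/([^\[]+)\[\d+\]/, "$1[1]"); // result discarded, s is unchanged
console.log(s); // contactDetails[0]

s = reindex(s, 1); // keep the returned string
console.log(s); // contactDetails[1]
```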
It looks like it replaces an array index of 0 with 1.
For example: array[0] goes to array[1]
Explanation:
([^\[]+) - This part means save everything that is not a [ into variable $1
\[0\]/ - This part limits Part 1 to save everything up to a [0]
"$1["+nel+"]" - Print out the contents of $1 (loaded from part 1) and add the brackets with the value of nel. (in your example nel = 1)
Square brackets define a set of characters to match. [abc] will match the letters a, b or c.
By adding the caret you are now specifying that you want characters not in the set. [^abc] will match any character that is not an a, b or c.
Because square brackets have special meaning in RegExps you need to escape them with a backslash if you want to match one. [ starts a character set, \[ matches a literal bracket. (Same concept for closing brackets.)
So, [^\[]+ captures 1 or more characters that are not [.
Wrapping that in parentheses "captures" the matched portion of the string (in this case "contactDetails") so that you can use it in the replacement.
$1 uses the "captured" string (i.e. "contactDetails") in the replacement string.
This regex matches "something" followed by a [0].
"something" is identified by the expression [^\[]+ which matches all characters that are not a [. You can see the () around this expression, because the match is reused with $1 later. The rest of your regex, that is \[0\], just matches the index [0]. The author had to write \[ and \] because [ and ] are special characters in regular expressions and have to be escaped.
$1 is a reference to the value of the first pair of parentheses. In your case the value of
[^\[]+
which matches one or more characters which are not a '['
The remaining part of the regexp matches string '[0]'.
So if s is 'foobar[0]' the result will be 'foobar[1]'.
[^\[] will match any character that is not [, and the '+' means one or more times. So [^\[]+ will match contactDetails. The parentheses will capture this for later use. The '\' is an escape symbol, so the \[0\] at the end will match [0]. The replace string will use $1, which is what was captured in the parentheses, and add the new index.
Your interpretation of the regular expression is correct. It is intended to match one or more characters which are not [, followed by a literal [0]. And used in the replace method, the match would be replaced with the match of the first grouping (that’s what $1 is replaced with) together with the sequence [ followed by the value of nel and ] (that’s how "$1["+nel+"]" is to be interpreted).
And again, a simple s.replace(/\[0\]/, "["+nel+"]") does the same. Except if there is nothing in front of [0], because in that case the first regex wouldn’t find a match.
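The difference between the two patterns, plus one way to handle the "anywhere within the string" part of the question, might look like the sketch below. The names withGroup, bracketOnly and replaceAllIndexes are my own; the g flag version is an assumption about what "anywhere" should mean (every bracketed number, not just the first).

```javascript
const withGroup = /([^\[]+)\[0\]/;
const bracketOnly = /\[0\]/;

// With nothing in front of [0], the grouped pattern cannot match.
console.log("[0]".replace(withGroup, "$1[1]")); // [0]
console.log("[0]".replace(bracketOnly, "[1]")); // [1]

// Hypothetical helper: replace every bracketed number in the string.
function replaceAllIndexes(value, newIndex) {
  return value.replace(/\[\d+\]/g, "[" + newIndex + "]");
}

console.log(replaceAllIndexes("rows[0].cells[0]", 2)); // rows[2].cells[2]
```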
