Do I need to escape dash character in regex? [duplicate] - javascript

This question already has answers here:
Regex - Should hyphens be escaped? [duplicate]
(3 answers)
Closed 7 years ago.
I'm trying to understand whether the dash character (-) needs to be escaped with a backslash in a regex.
Consider this:
var url = '/user/1234-username';
var pattern = /\/(\d+)\-/;
var match = pattern.exec(url);
var id = match[1]; // 1234
As you can see in the regex above, I'm trying to extract the numeric id from the URL, and I escaped the - character with a backslash \. But when I remove that backslash, everything still works. In other words, both of these work:
/\/(\d+)\-/
/\/(\d+)-/
Now I want to know which one is correct (standard). Do I need to escape the dash character in a regex?

You only need to escape the dash character if it could otherwise be interpreted as a range indicator (which can be the case inside a character class).
/-/ # matches "-"
/[a-z]/ # matches any letter in the range between ASCII a and ASCII z
/[a\-z]/ # matches "a", "-" or "z"
/[a-]/ # matches "a" or "-"
/[-z]/ # matches "-" or "z"

The - character has a special meaning only inside a character class [], so outside of one you don't need to escape it.
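As a quick check of the rules above, here is a minimal sketch that runs both forms of the pattern from the question and then compares a few character classes. The variable names are only illustrative.

```javascript
// Outside a character class the hyphen is an ordinary character,
// so escaping it is harmless but unnecessary.
const url = '/user/1234-username';
const escaped = /\/(\d+)\-/.exec(url);
const unescaped = /\/(\d+)-/.exec(url);
console.log(escaped[1], unescaped[1]); // 1234 1234

// Inside a character class the position of the hyphen matters.
const range = /^[a-z]$/;     // any single character from "a" to "z"
const literal = /^[a\-z]$/;  // "a", "-" or "z"
const trailing = /^[az-]$/;  // same set as above, hyphen placed last

console.log(range.test('m'), range.test('-'));     // true false
console.log(literal.test('m'), literal.test('-')); // false true
console.log(trailing.test('-'));                   // true
```

One caveat: with the u (unicode) flag, JavaScript is stricter about unnecessary escapes, and \- outside a character class is rejected as a syntax error, so the unescaped form is the safer habit there.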

Related

What is the regex that allows alphanumeric and '-' special character that also should be in between the text not at the begining or ending [duplicate]

This question already has answers here:
Regex to match '-' delimited alphanumeric words
(5 answers)
Closed 8 months ago.
No special characters are allowed except the hyphen, and the hyphen must not appear at the beginning or the end.
Other conditions:
-xnnw729 //not allowed
nsj28w- // not allowed
aks82-z2s0j // allowed
Some notes about your answer:
Using \w also matches \d and _
For a match only you don't need all the capture groups
If you want to validate the whole line, you can append $ to assert the end of the line
Using a plus sign in the character class [\w+\d+_] matches a + character and is the same as [\w+]
You can simplify your pattern to:
^\w+(?:-\w+)*$
The one I was looking for is
^([\w+\d+_]+)((-)([\w+\d+_]+))*
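To see how the two patterns above differ, here is a small sketch that runs the sample inputs from the question through the simplified pattern ^\w+(?:-\w+)*$ and then shows two side effects of the longer pattern. The helper name isValid is hypothetical.

```javascript
// Simplified pattern from the notes above: word characters, optionally
// followed by groups of "-" plus more word characters, anchored at both ends.
const isValid = (s) => /^\w+(?:-\w+)*$/.test(s);

console.log(isValid('-xnnw729'));    // false: starts with a hyphen
console.log(isValid('nsj28w-'));     // false: ends with a hyphen
console.log(isValid('aks82-z2s0j')); // true
console.log(isValid('a--b'));        // false: empty segment between hyphens
console.log(isValid('a_b-c'));       // true: \w also matches "_"

// The longer pattern has no end anchor, and "+" inside [...] is a literal plus.
const loose = /^([\w+\d+_]+)((-)([\w+\d+_]+))*/;
console.log(loose.test('nsj28w-'));  // true: the trailing "-" is simply not consumed
console.log(loose.exec('a+b')[0]);   // a+b
```

If underscores should also be rejected, [A-Za-z0-9] could replace \w in the simplified pattern; the question does not say either way.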

Regular Expression for matching content between bracket [duplicate]

This question already has answers here:
Regular Expression to find a string included between two characters while EXCLUDING the delimiters
(13 answers)
Closed 5 years ago.
I need a regex that selects something like [something but does not match [something]. For the first part, I know this works:
/\[[\w]+/g
But I don't know how to avoid matching when the content is enclosed between [ and ].
Regex can't necessarily solve your problem.
If you're always matching the entire string, you can ensure that the end of the string comes before any occurrence of the "]" character:
var reg = /\[[^\]]+$/;
/*
How is this regex working?
- The first two characters, `"\["`, mean to match a literal "["
- The next 5 characters, `"[^\]]"`, match ANY character except
the "]" character. The outer "[]" define a character class,
and when "^" appears as the first character of a character
class it means to invert the character class, so only accept
characters which AREN'T listed. Here the only character which
cannot be matched is the right-square-bracket, written escaped as "\]"
- Add one more character to the previous 5 - `"[^\]]+"` - and
you will match any number (one or more) of characters which
aren't the right-square-bracket.
- Finally, match the `"$"` character, which means "end of input".
This means that no "]" character can be matched before the input
ends.
*/
[
'[',
'[aaa',
'aa[bb',
'[[[[',
']]]]',
'[aaa]',
'[aaa[]',
'][aaa'
].forEach(function(val) {
console.log('Match "' + val + '"? ' + (reg.test(val) ? 'Yes.' : 'No.'));
});
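When the input is not a single token but a longer string, a negative lookahead is one alternative to anchoring at the end of the input. This is a different technique from the answer above, sketched under the assumption that the content after "[" consists of word characters only (as in the \w+ attempt from the question); the names openOnly and text are illustrative.

```javascript
// Match "[" plus word characters, but only if the run of word characters
// is not followed by another word character or by "]".
// The "another word character" part stops backtracking from
// turning "[foo]" into a shorter match such as "[fo".
const openOnly = /\[\w+(?![\w\]])/g;

const text = '[foo] and [bar baz [qux]';
console.log(text.match(openOnly));    // [ '[bar' ]
console.log('[aaa'.match(openOnly));  // [ '[aaa' ]
console.log('[aaa]'.match(openOnly)); // null
```

This only looks at the character right after the word, so input like "[a b]" would still produce a match for "[a"; whether that is acceptable depends on what the bracketed content can contain.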

Javascript regex to remove punctuation [duplicate]

This question already has answers here:
How can I strip all punctuation from a string in JavaScript using regex?
(16 answers)
Closed 7 years ago.
I'm having trouble with my regex. I'm sure something is not escaping properly.
function regex(str) {
str = str.replace(/(~|`|!|#|#|$|%|^|&|*|\(|\)|{|}|\[|\]|;|:|\"|'|<|,|\.|>|\?|\/|\\|\||-|_|+|=)/g,"")
document.getElementById("innerhtml").innerHTML = str;
}
<div id="innerhtml"></div>
<p><input type="button" value="Click Me" onclick="regex('test # . / | ) this');">
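* and + need to be escaped: unescaped, they are quantifiers with nothing to repeat, which is a syntax error. $ and ^ should be escaped as well. The pattern compiles without that, but they then act as anchors and never match the literal characters.
function regex (str) {
return str.replace(/(~|`|!|@|#|\$|%|\^|&|\*|\(|\)|{|}|\[|\]|;|:|\"|'|<|,|\.|>|\?|\/|\\|\||-|_|\+|=)/g,"")
}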
var testStr = 'test # . / | ) this'
document.write('<strong>before: </strong>' + testStr)
document.write('<br><strong>after: </strong>' + regex(testStr))
The accepted answer on the proposed duplicate question doesn't cover all the punctuation characters in the ASCII range. (A comment on the accepted answer does, though.)
A better way to write this regex is to put the characters into a character class.
/[~`!@#$%^&*(){}\[\];:"'<,.>?\/\\|_+=-]/g
In a character class, to match the literal characters:
^ does not need escaping, unless it is at the beginning of the character class.
- should be placed at the beginning of the character class (after the ^ in a negated character class) or at the end of a character class.
] has to be escaped to be specified as a literal character. [ does not need to be escaped (but I escape it anyway, as a habit, since some languages require [ to be escaped inside a character class).
$, *, +, ?, (, ), {, }, |, . lose their special meaning inside a character class.
In RegExp literal, / has to be escaped.
In RegExp, since \ is the escape character, if you want to specify a literal \, you need to escape it \\.
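The rules above can be tried out with a small sketch. The function name stripPunctuation and the sample strings are made up for illustration, and the class includes @, which I assume the list was meant to cover.

```javascript
// Character class version: most metacharacters are literal inside [...].
// Only "]", "\" and "/" (in a regex literal) are escaped here, "[" is escaped
// out of habit, and "-" sits at the end so it cannot form a range.
const PUNCTUATION = /[~`!@#$%^&*(){}\[\];:"'<,.>?\/\\|_+=-]/g;

function stripPunctuation(str) {
  return String(str).replace(PUNCTUATION, '');
}

console.log(stripPunctuation('Hello, world! (a-b) $5^2')); // Hello world ab 52

// For comparison: in an alternation, an unescaped "$" is an end-of-input
// anchor, so it matches an empty string instead of the literal character.
console.log('a$b'.replace(/(x|$)/g, ''));  // a$b
console.log('a$b'.replace(/(x|\$)/g, '')); // ab
```

Keeping the regex in a constant outside the function avoids rebuilding it on every call; with String.prototype.replace the g flag does not carry state between calls.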

Why have two '\' in Regex? [duplicate]

This question already has answers here:
Why do regex constructors need to be double escaped?
(5 answers)
Extra backslash needed in PHP regexp pattern
(4 answers)
Regex to replace single backslashes, excluding those followed by certain chars
(3 answers)
Closed 7 years ago.
function trim(str) {
var trimer = new RegExp("(^[\\s\\t\\xa0\\u3000]+)|([\\u3000\\xa0\\s\\t]+\x24)", "g");
return String(str).replace(trimer, "");
}
why have two '\' before 's' and 't'?
and what's this "[\s\t\xa0\u3000]" mean?
You're building the regex from a string literal.
In a literal string, the \ character is used to escape some other chars, for example \n (a new line) or \" (a double quote), and it must be escaped itself as \\. So when you want your string to have \s, you must write \\s in your string literal.
Thankfully JavaScript provides a better solution, Regular expression literals:
var trimer = /(^[\s\t\xa0\u3000]+)|([\u3000\xa0\s\t]+\x24)/g
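A short sketch of what the doubling does, using a shortened version of the character class from the question. The variable names are illustrative.

```javascript
// The string literal is processed first: "\\s" becomes the two characters \s,
// which is what the RegExp constructor then receives.
const fromString = new RegExp('^[\\s\\xa0\\u3000]+', 'g');
const fromLiteral = /^[\s\xa0\u3000]+/g;
console.log(fromString.source === fromLiteral.source); // true

// With a single backslash, the string literal drops it before the regex
// engine ever sees it: "\s" is just "s".
console.log('\s' === 's');    // true
console.log('\\s'.length);    // 2

const broken = new RegExp('^[\s]+'); // really /^[s]+/
console.log(broken.test('  x'));     // false
console.log(broken.test('sx'));      // true
```

The RegExp constructor is still the right tool when the pattern is assembled at run time; for a fixed pattern the literal form avoids the double escaping entirely.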
why have two '\' before 's' and 't'?
In a regex, \ is the escape character: it tells the regex engine that the next character has a special meaning. Because the pattern is written inside a string literal, each \ must itself be escaped with another \.
and what's this "[\s\t\xa0\u3000]" mean?
It means to match one of the following characters:
\s any whitespace character.
\t tab character.
\xa0 non-breaking space.
\u3000 ideographic (full-width) space.
This function is inefficient because every call converts a string to a regex and compiles that regex again. It would be more efficient to use a regex literal instead of a string and to create the regex once, outside the function, like the following:
var trimRegex = /(^[\s\t\xa0\u3000]+)|([\u3000\xa0\s\t]+$)/g;
function trim(str) {
return String(str).replace(trimRegex, "");
}
Furthermore, \s matches any whitespace, which includes tabs, the wide space and the non-breaking space, so you could simplify the regex to the following:
var trimRegex = /(^\s+)|(\s+$)/g;
Browsers now implement String.prototype.trim, so you can use that instead, with a polyfill for older browsers.
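As a sanity check on the simplification, this sketch compares the regex-based version with the built-in trim on a string padded with the characters from the original class. The names TRIM_RE, trimWithRegex and sample are made up for the example.

```javascript
// Compiled once, outside the function. The capture groups from the
// version above are not needed for a plain replace.
const TRIM_RE = /^\s+|\s+$/g;

function trimWithRegex(str) {
  return String(str).replace(TRIM_RE, '');
}

// Padding made of an ideographic space, a non-breaking space, a space and a tab.
const sample = '\u3000\xa0 \thello world\t \xa0\u3000';

console.log(JSON.stringify(trimWithRegex(sample)));   // "hello world"
console.log(trimWithRegex(sample) === sample.trim()); // true
```

Whitespace inside the string is left alone by both versions; only the leading and trailing runs are removed.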

Prepend backslash to selected characters with regex [duplicate]

This question already has answers here:
Is there a RegExp.escape function in JavaScript?
(18 answers)
Closed 9 years ago.
The code I use at the moment is ugly because I have to write "replace" separately for every special character.
var str = ":''>";
str.replace("'","\\'").replace(">","\\>");
I would like to prepend backslash to < > * ( ) and ? through regex.
Using a regex that matches the characters with a character set, you could try:
str = str.replace(/([<>*()?])/g, "\\$1");
DEMO: http://jsfiddle.net/8ar3Z/
It matches any of the characters inside the [ ] (the ones you specified), captures the match with the surrounding () so that it can be referenced as $1 in the replacement text, and prepends a backslash to it (written as \\ in the string literal).
UPDATE:
As suggested by @T.J.Crowder, the capture group () is unnecessary: $& refers to the whole match, so $1 can be changed to $&, written as:
str = str.replace(/[<>*()?]/g, "\\$&");
DEMO: http://jsfiddle.net/8ar3Z/1/
References:
Regex character set: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#special-character-set
$1 and $& use: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#Specifying_a_string_as_a_parameter
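The $& form above can be wrapped in a small helper. This is only a sketch: the name escapeSelected is hypothetical, and the character list is the one from the question.

```javascript
// Prepend a backslash to each of < > * ( ) and ?.
// "$&" in the replacement string stands for the matched text, and "\\"
// in the string literal is a single backslash.
function escapeSelected(str) {
  return String(str).replace(/[<>*()?]/g, '\\$&');
}

console.log(escapeSelected(":''>"));      // :''\>
console.log(escapeSelected('a<b>(c)?*')); // a\<b\>\(c\)\?\*
```

Unlike the chained replace("'", "\\'") calls in the question, which take string patterns and therefore replace only the first occurrence of each, the g flag here handles every occurrence in one pass.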
