Escape $ in regex replacement string - javascript

I want to turn the string dkfj-dkfj-sflj into dkfj-woop$dkfj-sflj.
Here's what I've tried:
var my_string = "dkfj-dkfj-sflj";
var regex = new RegExp("(\\w+)-(\\w+)-(\\w+)", "g");
console.log(my_string.replace(regex, "$1$woop[\$$2]$3"));
And my result is: dkfj-woop$2-sflj. Because the "$" is in front of the "$2" capture group, it messes up that capture group.
Assuming I want the structure of my regex and capture group string to stay the same, what's the right way to escape that "$" so it works?

That isn't how you escape a $ for replace. Backslash escaping works at the parser level; functions like replace cannot give special meaning to new escape sequences like \$ because they never see the backslash. The string literal "\$" is exactly equivalent to "$": both produce the same string. If you wanted to pass a backslash and a dollar sign to a function, it's the backslash itself that requires escaping: "\\$".
Regardless, replace expects you to escape a $ with $$. You need "$1$woop[$$$2]$3"; a $$ for the literal $, and $2 for the capture group.
Read Specifying a string as a parameter in the replace docs.
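To make both points concrete, here is a minimal sketch. The hyphens in the replacement string are my own assumption, added so that the result matches the target string from the question; the variable names are illustrative only.

```javascript
// "\$" is not a recognized escape sequence, so the parser simply drops the backslash.
const sameString = "\$" === "$";

const input = "dkfj-dkfj-sflj";
const pattern = /(\w+)-(\w+)-(\w+)/g;

// "$$" yields one literal "$"; the "$2" that follows is then read as capture group 2.
const replaced = input.replace(pattern, "$1-woop$$$2-$3");

console.log(sameString); // true
console.log(replaced);   // dkfj-woop$dkfj-sflj
```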

Use $$ in the replacement part to print a literal $ symbol. You don't need the square brackets in the replacement part: they are not a character class there and would be inserted literally. You can also write the pattern as a regex literal enclosed within forward slashes.
> var my_string = "dkfj-dkfj-sflj";
undefined
> my_string.replace(/(\w+)-\w+-(\w+)/, "$1-woop$$$1-$2")
'dkfj-woop$dkfj-sflj'

Related

Regex match with '\' slash and replace with '\\'?

I am converting a normal string into LaTeX format. So I wrote code to match each single backslash \ and replace it with a double backslash \\. Why I need it: refer to this link. I tried the code below:
function test(){
var tex="$$\left[ x=\left({{11}\over{2}}+{{\sqrt{3271}}\over{2\,3^{{{3}\over{2} $$";
var tex_form = tex.replace("/[\\\/\\\\\.\\\\]/g", "\\");
document.getElementById('demo').innerHTML=tex_form;//nothing get
}
test();
<p id="demo"></p>
Not getting any output data.But the match in this link
I want to replace each \ with \\.
There are these issues:
The string literal has no backslashes;
The regular expression is not a regular expression;
The class in the intended regular expression cannot match sequences, only single characters;
The replacement would not add backslashes, only replace with them.
Here you find the details on each point:
1. How to Encode Backslashes in String Literals
Your tex variable has no backslashes. This is because a backslash in a string literal is not taken as a literal backslash, but as an escape for interpreting the character that follows it.
When you have "$$\left...", then the \l means "literal l", and so the content of your variable will be:
$$left...
As an l does not need to be escaped, the backslash is completely unnecessary, and these two assignments result in the same string value:
var tex="$$\left[ x=\left({{11}\over{2}}+{{\sqrt{3271}}\over{2\,3^{{{3}\over{2} $$";
var tex="$$left[ x=left({{11}over{2}}+{{sqrt{3271}}over{2,3^{{{3}over{2} $$";
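To bring the point home, this will also represent the same value. Note that only characters without an escape meaning of their own can be prefixed this way: \f, \t, \v, \r, \x and backslash-digit sequences are real escape sequences and would change the string (or fail to parse), so those characters are left unescaped here:
var tex="\$\$\l\eft\[\ x\=\l\eft\(\{\{11\}\ov\er\{2\}\}\+\{\{\s\qrt\{3271\}\}\ov\er\{2\,3\^\{\{\{3\}\ov\er\{2\}\ \$\$";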
If you really want to have literal backslashes in your content (which I understand you do, as this is about LaTeX), then you need to escape each of those backslashes... with a backslash:
var tex="$$\\left[ x=\\left({{11}\\over{2}}+{{\\sqrt{3271}}\\over{2\\,3^{{{3}\\over{2} $$";
Now the content of your tex variable will be this string:
$$\left[ x=\left({{11}\over{2}}+{{\sqrt{3271}}\over{2\,3^{{{3}\over{2} $$
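A quick way to check this yourself is to compare the two spellings on a shortened version of the string. This is only a sketch; the shortened sample and the variable names are my own.

```javascript
// "\l" is not a known escape, so the backslash is dropped: no backslash is stored.
const unescaped = "$$\left[ x";
// "\\" is the escape sequence for exactly one backslash character.
const escaped = "$$\\left[ x";

console.log(unescaped); // $$left[ x
console.log(escaped);   // $$\left[ x
console.log(escaped.length - unescaped.length); // 1
```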
2. How to Code Regular Expression Literals
You are passing a string literal to the first argument of replace, while you really intend to pass a regular expression literal. You should leave out the quotes for that to happen. The / are the delimiters of a regular expression literal, not quotes:
/[\\\/\\\\\.\\\\]/g
This should not be wrapped in quotes. JavaScript understands the / delimiters as denoting a regular expression literal, including the optional modifiers at the end (like g here).
3. Classes are sets of single characters
This regular expression has unnecessary characters. The class [...] should list all individual characters you want to match. Currently you have these characters (after resolving the escapes):
\
/
\
\
.
\
\
It is overkill to have the backslash represented 5 times. Also, in JavaScript the forward slash and dot do not need to be escaped when occurring in a class. So the above regular expression is equivalent to this one:
/[\\/.]/g
Maybe this is, or is not, what you intended to match. To match several sequences of characters, you could use the | operator. This is just an example:
/\\\\|\\\/|\\\./g
... but I don't think you need this.
4. How to actually prefix with backslashes
It seems strange to me that you would want to replace a point or forward slash with a backslash. Probably you want to prefix those with a backslash. In that case make a capture group (with parentheses) and refer to it with $1 in this replace:
tex.replace(/([\\/.])/g, "\\$1");
Note again, that in the replacement string there is only one literal backslash, as the first one is an escape (see point 1 above).
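Putting the four points together, a sketch could look like the following. The shortened tex sample and the "a/b.c\d" test string are my own assumptions, not taken from the question.

```javascript
// Backslashes are doubled in the source so the string really contains them.
const texSource = "$$\\left[ x=\\left({{11}\\over{2}} $$";

// Double every backslash: the pattern /\\/ matches one backslash,
// and the replacement "\\\\" is a string holding two backslashes.
const doubled = texSource.replace(/\\/g, "\\\\");

// Prefix backslash, slash and dot with a backslash via the capture group.
const prefixed = "a/b.c\\d".replace(/([\\/.])/g, "\\$1");

console.log(doubled);  // $$\\left[ x=\\left({{11}\\over{2}} $$
console.log(prefixed); // a\/b\.c\\d
```

The backslash has no special meaning inside a replacement string, so "\\\\" inserts two backslashes as-is; only $ sequences are interpreted there.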
why the i need it
As the question you link to says, the \ character has special meaning inside a JavaScript string literal. It represents an escape sequence.
Not getting any output data.But the match in this link
The escape sequence is processed when the string literal is parsed by the JavaScript compiler.
By the time you apply your regular expression to the string, the backslashes have been consumed. They only exist in your source code, not in your data.
If you want to put a backslash character in your string, then you need to write the escape sequence for it (the \\) in the source code. You can't add them back in with JavaScript afterwards.
Not sure if I understood the problem, but try this code:
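var tex_form = tex.replace(/(\\)/g, "\\\\");
The pattern has to be a regex literal, without quotes around it. A group '(' ')' works here, and a class '[' ']' containing only the backslash would match the same characters. Either way, this only has an effect if tex really contains backslashes.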

Regular Expression in JS: \\. does not match \n

I am getting a string containing newlines (\n), tabs (\t) and lowercase letters [a-z]. It is possible to split it by matching /\n|\t/. AFAIK the dot represents the wildcard.
Therefore I was wondering, why /\n|\t/ doesn't match the same things as /\\./
var text = 'test1 \ntest2';
text.split(/\n/) //['test1 ', 'test2']
text.split(/\./) //['test1 \ntest2']
text.split(/\\./) //['test1 \ntest2']
Shouldn't the \\. match the \n (newline)?
Let me try and answer all the points:
AFAIK the dot represents the wildcard.
No, in regex, we do not use the term "wildcard". The dot is a special regex (meta)character. In a JavaScript regex it matches any character except a line terminator such as a newline, unless the s flag is set.
I was wondering, why /\n|\t/ doesn't match the same things as /\\./
Because /\n|\t/ matches 1 symbol, either a newline or tab, while the regex /\\./ matches a literal \ and a character other than a newline.
The \n and \t are escape sequences. That means that the \ is not a literal backslash; together with the following symbol it forms a single character that cannot easily be written otherwise. Indeed, how can we write a line break on paper with a pen? No way!
See more about JavaScript character escape sequences here.
Now,
text.split(/\n/) //['test1 ', 'test2']
True, your input string contains a line break, thus you get two elements in the resulting array.
text.split(/\./) //['test1 \ntest2']
No match was found because \. matches a literal dot. A dot that is escaped (that has a literal \ before it) in the regex stops being a special regex metacharacter, and just matches its literal representation. Your string has no dot, thus, no matches.
text.split(/\\./) //['test1 \ntest2']
Again, no match is found, as /\\./ looks for a literal \ followed by any character but a newline.
A hint: use your expressions at regex101.com, it will tell you what your regex can match on the right.
Here, with regex, you have a literal notation (/.../). In literal notation the pattern is not parsed as a string literal first, so a \ only has to be escaped once, for the regex engine. If you used the constructor notation (i.e. RegExp(....)), you would have to use double escaping. E.g.
var re = /\\./; // is equal to
var re = new RegExp("\\\\.");
See more about constructor and literal notations at MDN RegExp help page.
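The points above can be checked with a short sketch. The variable names are mine, and the s (dotAll) flag is an ES2018 addition that the question did not use.

```javascript
// The literal '\n' produces a real newline character; no backslash is stored.
const sample = "test1 \ntest2";
const hasBackslash = sample.includes("\\");

// The dot skips line terminators unless the s (dotAll) flag is set.
const dotMatchesNewline = /./.test("\n");
const dotAllMatchesNewline = /./s.test("\n");

// Literal and constructor notation describe the same pattern.
const literalForm = /\\./;
const constructorForm = new RegExp("\\\\.");

console.log(hasBackslash);         // false
console.log(dotMatchesNewline);    // false
console.log(dotAllMatchesNewline); // true
console.log(literalForm.source === constructorForm.source); // true
console.log(literalForm.test("a\\b")); // true: a real backslash followed by "b"
```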
\n gets evaluated to a newline character, so the string contains no backslash for /\\./ to match. If you do a quick console.log('\n'); you can see the output of that.

Understanding some JavaScript with a RegExp

I have the following js code
var regex = new RegExp('([\'"]?)((?:\\\\\\1|.)+?)\\1(,|$)', 'g'),
key = regex.exec( m ),
val = regex.exec( m );
I would like to understand it.
In particular:
why there are all those backslash in the definition of the RegExp? I can clearly see that \\1 is a reference to the first saved element. Why in a new RegExp using ' and not " we need to use \\1 and not simple \1?
why there is a comma between the two definitions of key and val? I may guess that it depends on the "instances" found using "g", but it is not very clear anyway to me.
I tried to execute the code with
m = 'batman, robin'
and the result is pretty a mess, and I cannot really explain it very well.
The code is taken from JQuery Cookbook, 2.12
why there are all those backslash in the definition of the RegExp?
"\\" is a string whose value is \. One backslash is used as an escape, the second for the value. Then, within the regex you also need to escape the backslash character again because backslash characters are used to mean special things within regex.
For example
"\\1"
is a string whose value is \1, which, in a regular expression, matches the first captured group.
"\\\\"
is a string whose value is \\, which, in a regular expression, matches a single \ character.
"\\\\\\1"
is a string whose value is \\\1, which, in a regular expression, matches a single \ followed by the first captured group.
This need to escape backslashes, and then escape them again is called "double escaping". The reason you need to double escape is so that you have the correct value within the regular expression. The first escape is to make sure that the string has the correct value, the second escape is so that the regular expression matches the correct pattern.
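If it helps, the three strings above can be inspected directly. This is just a sketch with made-up variable names and test inputs.

```javascript
const backref = "\\1";          // value: \1   (2 characters)
const oneSlash = "\\\\";        // value: \\   (2 characters)
const slashThenRef = "\\\\\\1"; // value: \\\1 (4 characters)

console.log(backref.length, oneSlash.length, slashThenRef.length); // 2 2 4

// As a pattern, \\ matches one real backslash in the input.
const matchesBackslash = new RegExp(oneSlash).test("a\\b");
// As a pattern, (a)\1 matches "a" followed by the same text again.
const matchesRepeat = new RegExp("(a)" + backref).test("aa");

console.log(matchesBackslash, matchesRepeat); // true true
```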
why there is a comma between the two definitions of key and val?
The code you posted is a variable declaration. It's easier to see when formatted:
var regex = ...,
key = ...,
val = ...;
Each of the variable names in the list is declared via the var keyword. It is the same as declaring the variables separately:
var regex,
key,
val;
regex = ...
key = ...
val = ...
Which is the same as declaring each var with a different var keyword:
var regex = ...
var key = ...
var val = ...
There's a difference between writing dynamic regex objects and static regex objects. When you initialize a regex object with a string, the string has to be transformed into a regex object. However, the '\' has a special meaning not only within regex patterns but also within JavaScript strings, hence the double escape.
Edit: Regarding your second question. You can do multiple declarations with comma, like so:
var one = 'one',
two = 'two',
three = 'three';
2nd Edit: Here's what happens with your string once it compiles into a RegEx object.
/(['"]?)((?:\\\1|.)+?)\1(,|$)/g
The regex is better represented as a regex literal:
var regex = /(['"]?)((?:\\\1|.)+?)\1(,|$)/g;
Backslashes are used to escape special characters. For example, if your regex needs to match a literal period, writing . will not work, since . matches any character: you need to "escape" the period with a backslash: \..
Backslashes that are not themselves part of an escape sequence must be escaped, so if you want to match just a backslash in the text, you must escape it with a backslash: \\.
The reason your regular expression is so complicated when passed into the RegExp constructor is because you are representing the above regular expression as a string, which adds another "layer" of escaping. Thus, every single backslash must be escaped by yet another backslash and because the string is enclosed in single quotes, your single quote must be escaped with yet another backslash:
var regex = new RegExp('([\'"]?)((?:\\\\\\1|.)+?)\\1(,|$)', 'g'),
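As a rough check that the two notations are equivalent, and of what the two exec calls return for the sample input from the question, something like this can be run. The names fromString and fromLiteral are mine.

```javascript
const fromString = new RegExp('([\'"]?)((?:\\\\\\1|.)+?)\\1(,|$)', 'g');
const fromLiteral = /(['"]?)((?:\\\1|.)+?)\1(,|$)/g;

// Both notations compile to the same pattern text.
console.log(fromString.source === fromLiteral.source); // true

// With the g flag, each exec call resumes where the previous match ended.
const m = 'batman, robin';
const key = fromString.exec(m);
const val = fromString.exec(m);

console.log(key[2]); // batman
console.log(val[2]); // " robin" (with the leading space)
```

The leading space in the second result is part of the "mess" mentioned in the question: the pattern does not skip whitespace after the comma.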

How do I ignore $1 replace backreferencing in javascript

I have a string that a user can edit at any time, and a regex that is being conducted on the string, to add it to an xml and then save it but they can add '$1' to the string. I just want the text '$1' to be saved but I have to perform a regular expression on the same string that $1 is in. It replaces the $1 with a character from the regex every time.
How do I find, and replace, the $1 in this string?
Example of what is happening:
string1 = '<item id="1">i have $100</item>'
regexp = new RegExp('<item id="1"([^<]|<[^\/]|<\/[^i]|<\/i[^t]|<\/it[^e]|<\/ite[^m]|<\/item[^>])*<\/item>');
data = '<data><item id="1">i have no money</item><item id="2">i have no money</item></data>'
data = data.replace(regexp, string1);
Results
<data><item id="1">i have >00</item><item id="2">i have no money</item></data>
If you have a variable string that you want to put in your replace() call which might possibly have $N's in it, you can prevent the $N from being treated as a backreference by replacing $ with $$. Apparently, unlike other special characters in JS regex, the $ character cannot be escaped with a \ - it must be escaped with a preceding $ (go figure).
In your example, you could do the following to fix the issue:
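data = data.replace(regexp, string1.replace(/\$/g, '$$$$'));
This turns every $ in string1 into $$, preventing them from being treated as backreferences. The pattern has to be a global regex, because a string pattern such as '$' only replaces the first occurrence, and the replacement '$$$$' is itself read as two literal $ characters.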
(Note: I found this little nugget here)
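Here is a sketch of two ways to insert user text verbatim. The simplified itemPattern and the sample strings are my own assumptions rather than the regex from the question. A global regex is used for the escaping step because a string pattern only replaces the first match.

```javascript
const userText = '<item id="1">i have $100 and $1</item>';
const xml = '<data><item id="1">i have no money</item><item id="2">i have no money</item></data>';
// Simplified stand-in for the question's pattern, with one capture group.
const itemPattern = /<item id="1">([^<]*)<\/item>/;

// Option 1: double every "$" first; the replacement "$$$$" means two literal "$".
const escapedText = userText.replace(/\$/g, "$$$$");
const viaEscaping = xml.replace(itemPattern, escapedText);

// Option 2: the return value of a replacer function is inserted as-is.
const viaFunction = xml.replace(itemPattern, () => userText);

// For comparison: the unescaped text gets its $1 expanded to the capture group.
const naive = xml.replace(itemPattern, userText);

console.log(viaEscaping === viaFunction); // true
console.log(naive === viaFunction);       // false
```

The function form avoids the escaping step entirely, which may be easier to maintain when the replacement comes from user input.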
This should only happen if you have a capturing group in the regex.
If you don't want your groups to capture, then place ?: inside the start of the group.
/foo(?:bar)/
You can escape the $ by doubling it (a backslash does not work here). E.g.:
var replacement = '<item id="1">i have $$100</item>';
Useful when you have capturing groups and need to write a $.

How can I use backslashes (\) in a string?

I tried many ways to get a single backslash from an executed (I don't mean an input from html).
I can get special characters as tab, new line and many others then escape them to \\t or \\n or \\(someother character) but I cannot get a single backslash when a non-special character is next to it.
I don't want something like:
str = "\apple"; // I want this, to return:
console.log(str); // \apple
and if I try to get character at 0 then I get a instead of \.
(See ES2015 update at the end of the answer.)
You've tagged your question both string and regex.
In JavaScript, the backslash has special meaning both in string literals and in regular expressions. If you want an actual backslash in the string or regex, you have to write two: \\.
The following string starts with one backslash, the first one you see in the literal is an escape character starting an escape sequence. The \\ escape sequence tells the parser to put a single backslash in the string:
var str = "\\I have one backslash";
The following regular expression will match a single backslash (not two); again, the first one you see in the literal is an escape character starting an escape sequence. The \\ escape sequence tells the parser to put a single backslash character in the regular expression pattern:
var rex = /\\/;
If you're using a string to create a regular expression (rather than using a regular expression literal as I did above), note that you're dealing with two levels: The string level, and the regular expression level. So to create a regular expression using a string that matches a single backslash, you end up using four:
// Matches *one* backslash
var rex = new RegExp("\\\\");
That's because first, you're writing a string literal, but you want to actually put backslashes in the resulting string, so you do that with \\ for each one backslash you want. But your regex also requires two \\ for every one real backslash you want, and so it needs to see two backslashes in the string. Hence, a total of four. This is one of the reasons I avoid using new RegExp(string) whenever I can; I get confused easily. :-)
ES2015 and ES2018 update
Fast-forward to 2015, and as Dolphin_Wood points out the new ES2015 standard gives us template literals, tag functions, and the String.raw function:
// Yes, this unlikely-looking syntax is actually valid ES2015
let str = String.raw`\apple`;
str ends up having the characters \, a, p, p, l, and e in it. Just be careful there are no ${ in your template literal, since ${ starts a substitution in a template literal. E.g.:
let foo = "bar";
let str = String.raw`\apple${foo}`;
...ends up being \applebar.
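The different spellings from this answer can be compared side by side. This is a small sketch and the variable names are mine.

```javascript
// "\\" in a string literal stores exactly one backslash.
const oneBackslash = "\\apple";
console.log(oneBackslash);    // \apple
console.log(oneBackslash[0]); // \

// Regex literal: two backslashes. Regex built from a string: four.
const rexLiteral = /\\/;
const rexFromString = new RegExp("\\\\");
console.log(rexLiteral.test(oneBackslash), rexFromString.test(oneBackslash)); // true true

// String.raw keeps the backslash as typed, without doubling.
const rawApple = String.raw`\apple`;
console.log(rawApple === oneBackslash); // true
```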
Try String.raw method:
str = String.raw`\apple` // "\apple"
Reference here: String.raw()
\ is an escape character, when followed by a non-special character it doesn't become a literal \. Instead, you have to double it \\.
console.log("\apple"); //-> "apple"
console.log("\\apple"); //-> "\apple"
There is no way to get the original, raw source text back from an ordinary quoted string literal, or to write one without escape characters. (Since ES2015, String.raw with a template literal covers this case.)
Please try the code below (note that it is Java, not JavaScript, but the same doubling rule applies). It works for me and the output contains the backslash:
String sss="dfsdf\\dfds";
System.out.println(sss);
