Replace array of chars in a string - javascript

I'm looking for a function that does replaceAll with any start and end character.
I know I can use the regex notation:
string=string.replace(/a/g,"b");
However, because the searched char is in a regex, I sometimes need to escape that character and sometimes not, which is annoying if I want to do this for a full list of chars:
convertEncoding= function(string) {
var charMap= {'"':"&quot;",'&':"&amp;",...}
for (startChar in charMap) {
endChar=charMap[startChar];
string= string.replaceAll(startChar,endChar);
}
}
Is there a good way to write that replaceAll function, other than the naive way of running a for loop and calling String.replace()?

You can escape the RegExp special characters in the strings, as described here: https://stackoverflow.com/a/6969486/519995
and then use a global regexp replace:
function escapeRegExp(str) {
return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}
for (startChar in charMap) {
endChar=charMap[startChar];
// assign back to string so that each replacement builds on the previous one
string = string.replace(new RegExp(escapeRegExp(startChar), 'g'), endChar);
}
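A possible variation on the loop above, as a minimal sketch: build one alternation from all the keys and look each match up in the map, so the input is scanned only once. The name replaceChars and the two map entries are assumptions for illustration (the map in the question is partly elided).

```javascript
// Escape regex metacharacters so a string can be used as a literal pattern.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\\/-]/g, "\\$&");
}

// Hypothetical helper: replaces every key of charMap in a single pass.
function replaceChars(input, charMap) {
  const keys = Object.keys(charMap);
  if (keys.length === 0) return input;
  // One alternation such as /"|&/g built from the escaped keys.
  const pattern = new RegExp(keys.map(escapeRegExp).join("|"), "g");
  // A callback avoids any special meaning of "$" in the replacement text.
  return input.replace(pattern, (match) => charMap[match]);
}

// Assumed example entries, not the full map from the question.
const charMap = { '"': "&quot;", "&": "&amp;" };
console.log(replaceChars('Tom & "Jerry"', charMap));
// Tom &amp; &quot;Jerry&quot;
```

Because every character of the input is visited once, text produced by one replacement (such as the & inside &quot;) is not replaced again by a later entry, which can happen with the loop depending on key order.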

Related

Wrong result when replacing with regex

I'm replacing a sub-string using replace function and regex expression.
However, after character escaping and replacement, I still have an extra '\' character. I'm not really familiar with regex; can someone guide me?
I have implemented the escape character function found here: Is there a RegExp.escape function in Javascript?
RegExp.escape= function(s) {
return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
};
const latexConversions = [
["\\cdot", "*"],
["\\right\)", ")"],
["\\left\(", "("],
["\\pi", "pi"],
["\\ln\((.*?)\)", "log($1)"],
["stdev\((.*?)\)", "std($1)"],
["stdevp\((.*?)\)", "std(\[$1\], \"uncorrected\")"],
["mean\((.*?)\)", "mean($1)"],
["\\sqrt\((.*?)\)", "sqrt($1)"],
["\\log\((.*?)\)", "log10($1)"],
["\(e\)", "e"],
["\\exp\((.*?)\)", "exp($1)"],
["round\((.*?)\)", "round($1)"],
["npr\((.*?),(.*?)\)", "($1!/($1-$2)!)"],
["ncr\((.*?),(.*?)\)", "($1!/($2!($1-$2)!))"],
["\\left\|", "abs("],
["\\right\|", ")"],
];
RegExp.escape = function (s) {
var t = s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
return t;
};
mathematicalExpression = "\\sqrt( )"
//Problem is here
mathematicalExpression = mathematicalExpression.replace(new RegExp(RegExp.escape(latexConversions[8][0]), 'g'), latexConversions[8][1]);
//Works
mathematicalExpression2 = mathematicalExpression.replace(/\\sqrt\((.*?)\)/g, "sqrt($1)");
alert("what I got: "+mathematicalExpression); // "\sqrt( )"
alert("Supposed to be: "+ mathematicalExpression2); // "sqrt( )"
I have a live example here: https://jsfiddle.net/nky342h5/2/
There are several misconceptions regarding the string literal "\\sqrt\((.*?)\)":
1. This string in raw characters is: \sqrt((.*?)). Note how there is no difference between the two opening parentheses: the backslash in the string literal was not very useful. In other words, "\(" === "("
2. Both opening parentheses will be escaped by RegExp.escape
3. Points 1 and 2 are equally true for the closing parentheses, for the dot, the asterisk and the question mark: they will be escaped by RegExp.escape.
In short, you have no way to distinguish that a character is intended as a literal or as a regex special symbol -- you are escaping all of them as if they were intended as literal characters.
The solution:
Since you already are encoding regex specific syntax in your strings (like (.*?)), you might as well use regex literals instead of string literals.
In the case you highlighted, instead of this:
["\\sqrt\((.*?)\)", "sqrt($1)"]
...use this:
[/\\sqrt\((.*?)\)/g, "sqrt($1)"]
And let your code do:
mathematicalExpression = mathematicalExpression.replace(...latexConversions[8]);
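To apply the whole table rather than a single entry, something along these lines should work. It is only a sketch: the table is shortened to three entries and the function name convertLatex is made up for the example.

```javascript
// Shortened conversion table using regex literals instead of strings.
const latexConversions = [
  [/\\cdot/g, "*"],
  [/\\sqrt\((.*?)\)/g, "sqrt($1)"],
  [/\\pi/g, "pi"],
];

// Hypothetical helper: applies every [pattern, replacement] pair in order.
function convertLatex(expression) {
  return latexConversions.reduce(
    (result, [pattern, replacement]) => result.replace(pattern, replacement),
    expression
  );
}

console.log(convertLatex("\\sqrt(2) \\cdot \\pi")); // sqrt(2) * pi
```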
Alternative
If for some reason regex literals are a no-go, then define your own special syntax for (.*?). For instance, use the symbol µ to denote that particular regex syntax.
Then your array pair would look like this:
["\\sqrt(µ)", "sqrt($1)"],
...and code:
mathematicalExpression = mathematicalExpression.replace(
new RegExp(RegExp.escape(latexConversions[8][0]).replace(/µ/g, '(.*?)'), 'g'),
latexConversions[8][1]
);
Note how here the (.*?) is introduced in the string after RegExp.escape has done its job.
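The placeholder idea can be wrapped in a small helper. This is a sketch under the assumption that µ never occurs as a literal character in your patterns; templateToRegExp is a made-up name.

```javascript
// Escape regex metacharacters (same character class as RegExp.escape above).
function escapeRegExp(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

// Hypothetical helper: escape the template first, then expand the µ placeholder.
function templateToRegExp(template) {
  return new RegExp(escapeRegExp(template).replace(/µ/g, "(.*?)"), "g");
}

const sqrtPattern = templateToRegExp("\\sqrt(µ)");
console.log(sqrtPattern.source);                            // \\sqrt\((.*?)\)
console.log("\\sqrt(x+1)".replace(sqrtPattern, "sqrt($1)")); // sqrt(x+1)
```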
Add an extra \ to the pattern string rather than escaping everything:
replace ["\\sqrt\((.*?)\)", "sqrt($1)"], with ["\\\\sqrt\((.*?)\)", "sqrt($1)"],
and change the final replace call to:
mathematicalExpression = mathematicalExpression.replace(new RegExp((latexConversions1[8][0]), 'g'), latexConversions1[8][1]);

Javascript regular expression: save matching value

I have the following expression in JS (typescript, but I think everyone understands what it transpiles to):
markString(text: string) {
const regEx = new RegExp(this.partToHighlight, 'ig');
return text.replace(regEx, `<mark>${this.partToHighlight}<\/mark>`);
}
The problem is that this way, with the 'ig'-option, the matching value can have any case, upper or lower, but is always replaced by the partToHighlight-value. Instead the function should read the matched value, save it and output it surrounded with the HTML-tags. How do I do this? I am pretty sure this is a duplicate question, but I couldn't find the one asked before.
You need to replace with the found match, $&:
markString(text: string) {
const regEx = new RegExp(this.partToHighlight, 'ig');
return text.replace(regEx, "<mark>$&</mark>");
}
Using $&, you replace each match with the same text that was found, so you do not need to hardcode the replacement or use a callback.
See "Specifying a string as a parameter" for more details.
As mentioned in the comments, you will need to use RegExp.lastMatch or $& to refer to the matched substring in your replace() method:
const regEx = new RegExp(this.partToHighlight, 'ig');
return text.replace(regEx, `<mark>$&<\/mark>`);
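One thing the snippets above leave open is what happens when partToHighlight contains regex metacharacters such as . or (. A minimal sketch, written as a plain function instead of a class method and assuming the search term should always be treated as literal text:

```javascript
// Escape regex metacharacters so the search term is matched literally.
function escapeRegExp(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

// Standalone version of markString; partToHighlight is passed in explicitly.
function markString(text, partToHighlight) {
  const regEx = new RegExp(escapeRegExp(partToHighlight), "ig");
  // $& inserts the text that actually matched, so the original case is kept.
  return text.replace(regEx, "<mark>$&</mark>");
}

console.log(markString("Foo and foo", "foo"));
// <mark>Foo</mark> and <mark>foo</mark>
console.log(markString("a.b axb", "a.b"));
// <mark>a.b</mark> axb
```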

Regex return undefined in a JavaScript String

I have a little code snippet where I use regular expressions to strip punctuation, numbers etc. from a string. I am getting undefined along with the output of my stripped string. Can someone explain what's happening? Thanks
var regex = /[^a-zA-z\s\.]|_/gi;
function ripPunct(str) {
if ( str.match(regex) ) {
str = str.replace(regex).replace(/\s+/g, "");
}
return str;
}
console.log(ripPunct("*#£#__-=-=_+_devide-00000110490and586#multiply.edu"));
You should pass a replacement string to the first replace call: without one, every match is replaced with the string "undefined". Also use A-Z, not A-z, in the pattern. There is no point in checking for a match before replacing; just call replace directly. The |_ alternative is redundant once the range is fixed, since [^a-zA-Z\s.] already matches an underscore, which is not among the characters listed in the class. Note that the class keeps whitespace (it contains \s), so keep the second chained replace only if you also want whitespace removed; the sample input has none.
var regex = /[^a-zA-Z\s.]/gi;
function ripPunct(str) {
return str.replace(regex, "");
}
console.log(ripPunct("*#£#__-=-=_+_devide-00000110490and586#multiply.edu"));
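For anyone wondering where the undefined comes from, here is a short sketch of the two effects involved; the inputs are made up for the demonstration.

```javascript
const punctuation = /[^a-zA-Z\s.]/g;

// With no replacement argument, undefined is converted to the string "undefined".
console.log("a-b".replace(punctuation));     // aundefinedb
console.log("a-b".replace(punctuation, "")); // ab

// The range A-z also covers the characters between "Z" and "a": [ \ ] ^ _ `
console.log(/[A-z]/.test("_")); // true
console.log(/[A-Z]/.test("_")); // false
```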

Unescaping multiple values in javascript

Here is the code to escape special characters:
function escapeRegExp(string){
return string.replace(/([.*+?^${}()|\[\]\/\\])/g, "\\$1");
}
How can I unescape the special characters to get the original strings?
I am getting confused with use of / and //.
That's pretty easy, you can use another function that removes the \ characters.
// Use this to escape
function escapeRegExp(string){
return string.replace(/([\.\*\+\?\^\$\{\}\(\)\|\[\]\/\\])/g, "\\$1");
}
// And this to unescape
function unescapeRegExp(string) {
return string.replace(/\\([\.\*\+\?\^\$\{\}\(\)\|\[\]\/\\])/g, "$1")
}
// EXAMPLE:
escapeRegExp(".?[]");
> "\.\?\[\]" (raw characters)
unescapeRegExp("\\.\\?\\[\\]"); // backslashes must be doubled inside a string literal
> ".?[]"
PS: I corrected your original function, the regular expression was wrong.
I guess you've taken this from MDN.
All you need to do to revert the escaping is remove every odd occurrence of the \ character.
function unescapeRegExp(string) {
return string.replace(/\\(.)/g, '$1');
}
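A quick round trip, as a sketch, to check that the two functions really are inverses. The sample string is made up; note that the backslash inside it has to be doubled in the string literal.

```javascript
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\/\\]/g, "\\$&");
}

// Drops the backslash in front of any escaped character.
function unescapeRegExp(string) {
  return string.replace(/\\(.)/g, "$1");
}

const original = "a.b\\c?";              // raw characters: a.b\c?
const escaped = escapeRegExp(original);
console.log(escaped);                    // a\.b\\c\?
console.log(unescapeRegExp(escaped) === original); // true
```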

Create RegExps on the fly using string variables

Say I wanted to make the following re-usable:
function replace_foo(target, replacement) {
return target.replace("string_to_replace",replacement);
}
I might do something like this:
function replace_foo(target, string_to_replace, replacement) {
return target.replace(string_to_replace,replacement);
}
With string literals this is easy enough. But what if I want to get a little more tricky with the regex? For example, say I want to replace everything but string_to_replace. Instinctually I would try to extend the above by doing something like:
function replace_foo(target, string_to_replace, replacement) {
return target.replace(/^string_to_replace/,replacement);
}
This doesn't seem to work. My guess is that it thinks string_to_replace is a string literal, rather than a variable representing a string. Is it possible to create JavaScript regexes on the fly using string variables? Something like this would be great if at all possible:
function replace_foo(target, string_to_replace, replacement) {
var regex = "/^" + string_to_replace + "/";
return target.replace(regex,replacement);
}
There's new RegExp(string, flags), where flags is a string of flag characters such as g or i. So
'GODzilla'.replace( new RegExp('god', 'i'), '' )
evaluates to
zilla
With string literals this is easy enough.
Not really! The example only replaces the first occurrence of string_to_replace. More commonly you want to replace all occurrences, in which case, you have to convert the string into a global (/.../g) RegExp. You can do this from a string using the new RegExp constructor:
new RegExp(string_to_replace, 'g')
The problem with this is that any regex-special characters in the string literal will behave in their special ways instead of being normal characters. You would have to backslash-escape them to fix that. Unfortunately, there's not a built-in function to do this for you, so here's one you can use:
function escapeRegExp(s) {
return s.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&')
}
Note also that when you use a RegExp in replace(), the replacement string now has a special character too, $. This must also be escaped if you want to have a literal $ in your replacement text!
function escapeSubstitute(s) {
return s.replace(/\$/g, '$$$$');
}
(Four $s because that is itself a replacement string—argh!)
Now you can implement global string replacement with RegExp:
function replace_foo(target, string_to_replace, replacement) {
var relit= escapeRegExp(string_to_replace);
var sub= escapeSubstitute(replacement);
var re= new RegExp(relit, 'g');
return target.replace(re, sub);
}
What a pain. Luckily if all you want to do is a straight string replace with no additional parts of regex, there is a quicker way:
s.split(string_to_replace).join(replacement)
...and that's all. This is a commonly-understood idiom.
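To see that the two approaches agree, and what goes wrong without the escaping, here is a small comparison sketch. The function names replaceViaRegExp and replaceViaSplit and the sample input are made up for the example.

```javascript
function escapeRegExp(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

function escapeSubstitute(s) {
  return s.replace(/\$/g, "$$$$");
}

// Literal global replacement through an escaped RegExp.
function replaceViaRegExp(target, search, replacement) {
  return target.replace(new RegExp(escapeRegExp(search), "g"), escapeSubstitute(replacement));
}

// Literal global replacement through split/join.
function replaceViaSplit(target, search, replacement) {
  return target.split(search).join(replacement);
}

const input = "price: 5.00 (was 5x00)";
console.log(replaceViaRegExp(input, "5.00", "$&1")); // price: $&1 (was 5x00)
console.log(replaceViaSplit(input, "5.00", "$&1"));  // price: $&1 (was 5x00)

// Unescaped: "." matches any character and "$&" expands to the match.
console.log(input.replace(new RegExp("5.00", "g"), "$&1")); // price: 5.001 (was 5x001)
```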
say I want to replace everything but string_to_replace
What does that mean? Do you want to replace all stretches of text not taking part in a match against the string? A replacement with ^ certainly doesn't do this, because ^ means a start-of-string token, not a negation. ^ is only a negation in [] character groups. There are also negative lookaheads (?!...), but there are problems with those in JScript, so you should generally avoid them.
You might try matching ‘everything up to’ the string, and using a function to discard any empty stretch between matching strings:
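var re= new RegExp('(.*?)($|'+escapeRegExp(string_to_replace)+')', 'g');
return target.replace(re, function(match, before, found) {
// before is the stretch preceding the match; found is the matched string (or '' at the end of the input)
return before===''? found : replacement+found;
});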
Here, again, a split might be simpler:
var parts= target.split(string_to_match);
for (var i= parts.length; i-->0;)
if (parts[i]!=='')
parts[i]= replacement;
return parts.join(string_to_match);
As the others have said, use new RegExp(pattern, flags) to do this. It is worth noting that you will be passing string literals into this constructor, so every backslash will have to be escaped. If, for instance you wanted your regex to match a backslash, you would need to say new RegExp('\\\\'), whereas the regex literal would only need to be /\\/. Depending on how you intend to use this, you should be wary of passing user input to such a function without adequate preprocessing (escaping special characters, etc.) Without this, your users may get some very unexpected results.
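A short sketch of the backslash point, comparing the constructor with the literal form:

```javascript
// Both of these match a single literal backslash.
const fromString = new RegExp("\\\\");
const fromLiteral = /\\/;
console.log(fromString.source === fromLiteral.source); // true
console.log(fromString.test("a\\b"));                  // true

// Forgetting to double the backslash silently changes the pattern:
// the string "\d+" is just "d+", so it no longer matches digits.
console.log(new RegExp("\\d+").test("42")); // true
console.log(new RegExp("\d+").test("42"));  // false
```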
Yes you can.
https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
function replace_foo(target, string_to_replace, replacement) {
var regex = new RegExp("^" + string_to_replace);
return target.replace(regex, replacement);
}
A really simple solution to this is this:
function replace(target, string_to_replace, replacement) {
return target.split(string_to_replace).join(replacement);
}
No need for Regexes at all
It also seems to be the fastest on modern browsers https://jsperf.com/replace-vs-split-join-vs-replaceall
I think I have a very good example for highlighting text in a string (it matches case-insensitively, but the highlighted text keeps its original case):
function getHighlightedText(basicString, filterString) {
if ((basicString === "") || (basicString === null) || (filterString === "") || (filterString === null)) return basicString;
return basicString.replace(new RegExp(filterString.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), 'gi'),
function(match)
{return "<mark>"+match+"</mark>"});
}
http://jsfiddle.net/cdbzL/1258/
