Javascript Regular Expressions - how to NOT match a substring between < and > - javascript

I'm using this regular expression:
var regex = /\<.*?.\>/g
to match with this string:
var str = 'This <is> a string to <use> to test the <regular> expression'
using a simple match:
str.match(regex)
and, as expected, I get:
["<is>", "<use>", "<regular>"]
(But without the backslashes, sorry for any potential confusion)
How can I get the reverse result? i.e. what regular expression do I need that does not return those items contained between < and >?
I tried /(^\<.*?\>)/g and various other similar combos including square brackets and stuff. I've got loads of cool results, just nothing that is quite what I want.
Where I'm going with this: Basically I want to search and replace occurrences of substrings but I want to exclude some of the search space, probably using < and >. I don't really want a destructive method as I don't want to break apart strings, change them, and worry about reconstructing them.
Of course I could do this 'manually' by searching through the string but I figured regular expressions should be able to handle this rather well. Alas, my knowledge is not where it needs to be!!

Here's a way to do custom replacement of everything outside of the tags, and to strip the tags from the tagged parts http://jsfiddle.net/tcATT/
var str = 'This <is> a string to <use> to test the <regular> expression';
// The regular expression matches everything, but each val is either a
// tagged value (<is> <regular>), or the text you actually want to replace
// you need to decide that in the replacer function
console.log(str.replace( /[^<>]+|<.*?>/g, function(val){
if(val.charAt(0) == '<' && val.charAt(val.length - 1) == '>') {
// Just strip the < and > from the ends
return val.slice(1,-1);
} else {
// Do whatever you want with val here, I'm upcasing for simplicity
return val.toUpperCase();
}
} ));
// outputs: "THIS is A STRING TO use TO TEST THE regular EXPRESSION"
To generalize it, you could use
function replaceOutsideTags(str, replacer) {
return str.replace( /[^<>]+|<.*?>/g, function(val){
if(val.charAt(0) == '<' && val.charAt(val.length - 1) == '>') {
// Just strip the < and > from the ends
return val.slice(1,-1);
} else {
// Let the caller decide how to replace the parts that need replacing
return replacer(val);
}
})
}
// And call it like
console.log(
replaceOutsideTags( str, function(val){
return val.toUpperCase();
})
);
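If the tagged parts should stay in the output untouched (the question asked for a non-destructive approach), a small variation could return them as-is. This is only a sketch and not part of the answer above: `replaceOutsideTagsKeepingTags` is a hypothetical name, and it assumes tags never nest and that brackets are balanced.

```javascript
// Hypothetical variant of replaceOutsideTags: tagged parts are returned unchanged.
function replaceOutsideTagsKeepingTags(str, replacer) {
  return str.replace(/<[^<>]*>|[^<>]+/g, function (val) {
    // A chunk that starts with '<' is a protected <...> part; leave it alone
    if (val.charAt(0) === '<') {
      return val;
    }
    // Everything else is handed to the caller's replacer
    return replacer(val);
  });
}

var sample = 'This <is> a string to <use> to test the <regular> expression';
var out = replaceOutsideTagsKeepingTags(sample, function (val) {
  return val.replace(/test/g, 'FOO');
});
console.log(out);
// "This <is> a string to <use> to FOO the <regular> expression"
```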

If I understand correctly, you want to apply some custom processing to a string except for the parts that are protected (enclosed in < and >)? If this is the case, you could do it like this:
// The function that processes unprotected parts
function process(s) {
// an example could be transforming whole part to uppercase:
return s.toUpperCase();
}
// The function that splits string into chunks and applies processing
// to unprotected parts
function applyProcessing (s) {
var a = s.split(/<|>/),
out = '';
for (var i=0; i<a.length; i++)
out += i%2
? a[i]
: process(a[i]);
return out;
}
// now we just call the applyProcessing()
var str1 = 'This <is> a string to <use> to test the <regular> expression';
console.log(applyProcessing(str1));
// This outputs:
// "THIS is A STRING TO use TO TEST THE regular EXPRESSION"
// and another string:
var str2 = '<do not process this part!> The <rest> of the a <string>.';
console.log(applyProcessing(str2));
// This outputs:
// "do not process this part! THE rest OF THE A string."
This is basically it. It returns the whole string with the unprotected parts processed.
Please note that the splitting will not work correctly if the angle brackets (< and >) are not balanced.
There are various places that could be improved but I'll leave that as an exercise to the reader. ;p
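One possible improvement, offered as a sketch rather than as what the author had in mind: splitting on a capturing group keeps each complete `<...>` token in the result array, so a stray bracket no longer flips which chunks count as protected. The names `applyProcessingSafe` and `upper` are hypothetical.

```javascript
function applyProcessingSafe(s, process) {
  // The capturing group makes split() keep every <...> token in the result,
  // so odd indices always hold one complete protected part.
  var parts = s.split(/(<[^<>]*>)/);
  var out = '';
  for (var i = 0; i < parts.length; i++) {
    out += i % 2
      ? parts[i].slice(1, -1) // protected: only strip the brackets
      : process(parts[i]);    // unprotected: apply the processing
  }
  return out;
}

function upper(s) { return s.toUpperCase(); }

console.log(applyProcessingSafe('The <rest> of the string', upper));
// "THE rest OF THE STRING"
console.log(applyProcessingSafe('a > b and <keep me>', upper));
// "A > B AND keep me"
```

In the second call the lone > is treated as ordinary text, whereas splitting on /<|>/ would shift every following chunk by one position.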

This is a perfect application for passing a regex argument to the core String.split() method:
var results = str.split(/<[^<>]*>/);
Simple!
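For reference, this is what the split produces for the string from the question (a quick check, reusing the question's `str`):

```javascript
var str = 'This <is> a string to <use> to test the <regular> expression';
var results = str.split(/<[^<>]*>/);
console.log(results);
// ["This ", " a string to ", " to test the ", " expression"]
```

Note that the tagged parts themselves are dropped, so the original string cannot be rebuilt from `results` alone.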

Using the variables you've already created, try using replace. It's non-destructive, too.
str.replace(regex, '');
--> "This  a string to  to test the  expression" (note the doubled spaces left where the tags were)

/\b[^<\W]\w*(?!>)\b/g
This works, test it out:
var str = 'This <is> a string to <use> to test the <regular> expression.';
var regex = /\<.*?.>/g;
console.dir(str.match(regex));
var regex2 = /\b[^<\W]\w*(?!>)\b/g;
console.dir(str.match(regex2));

Ah, okay, sorry - I misunderstood your question. This is a difficult problem to solve with pure regular expressions in JavaScript, because JavaScript didn't support lookbehinds before ES2018, and usually I think I would use lookaheads and lookbehinds to solve this. A (sort of contrived) way of doing it would be something like this:
str.replace(/((?:<[^>]+>)?)([^<]*)/g, function (m, sep, s) { return sep + s.replace('test', 'FOO'); })
// --> "This <is> a string to <use> to FOO the <regular> expression"
This also works on strings like "This test <is> a string to <use> to test the <regular> expression", and if you use /test/g instead of 'test' in the replacer function, it will also turn
"This test <is> a string to <use> to test the test <regular> expression"
into
"This FOO <is> a string to <use> to FOO the FOO <regular> expression"
UPDATE
And something like this would also strip the <> characters:
str.replace(/((?:<[^>]+>)?)([^<]*)/g, function (m, sep, s) { return sep.replace(/[<>]/g, '') + s.replace(/test/g, 'FOO'); })
"This test <is> a string to <use> to test the test <regular> expression"
--> "This FOO is a string to use to FOO the FOO regular expression"
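A lookahead alone can also express "not inside < and >", without any lookbehind. The following is a sketch under the assumption that brackets are balanced and never nested: a match counts as inside a tag if a > follows it before any other bracket. The variable names are made up for the example.

```javascript
var input = 'This test <is> a string to <test> to test the <regular> expression';

// Match "test" only when it is NOT followed by a closing ">" with no
// other bracket in between, i.e. when it sits outside of <...>.
var outsideTags = /test(?![^<>]*>)/g;

var replaced = input.replace(outsideTags, 'FOO');
console.log(replaced);
// "This FOO <is> a string to <test> to FOO the <regular> expression"
```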

Try this regex:
\b\w+\b(?!>)
UPDATE
To support spaces inside brackets try this one. It's not pure regex.match, but it works and it's much simpler than the answer above:
alert('This <is> a string to <use use> to test the <regular> expression'.split(/\s*<.+?>\s*/).join(' '));

Related

how to replace all strict pattern of string? [duplicate]

Given a string:
s = "Test abc test test abc test test test abc test test abc";
This seems to only remove the first occurrence of abc in the string above:
s = s.replace('abc', '');
How do I replace all occurrences of it?
As of August 2020: Modern browsers have support for the String.replaceAll() method defined by the ECMAScript 2021 language specification.
For older/legacy browsers:
function escapeRegExp(string) {
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}
function replaceAll(str, find, replace) {
return str.replace(new RegExp(escapeRegExp(find), 'g'), replace);
}
Here is how this answer evolved:
str = str.replace(/abc/g, '');
In response to comment "what's if 'abc' is passed as a variable?":
var find = 'abc';
var re = new RegExp(find, 'g');
str = str.replace(re, '');
In response to Click Upvote's comment, you could simplify it even more:
function replaceAll(str, find, replace) {
return str.replace(new RegExp(find, 'g'), replace);
}
Note: Regular expressions contain special (meta) characters, and as such it is dangerous to blindly pass an argument in the find function above without pre-processing it to escape those characters. This is covered in the Mozilla Developer Network's JavaScript Guide on Regular Expressions, where they present the following utility function (which has changed at least twice since this answer was originally written, so make sure to check the MDN site for potential updates):
function escapeRegExp(string) {
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}
So in order to make the replaceAll() function above safer, it could be modified to the following if you also include escapeRegExp:
function replaceAll(str, find, replace) {
return str.replace(new RegExp(escapeRegExp(find), 'g'), replace);
}
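To see why the escaping matters, here is a small comparison. It is only an illustration: `replaceAllUnsafe` and `replaceAllSafe` are hypothetical names for the unescaped and escaped versions shown above.

```javascript
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}
// Unescaped: `find` is interpreted as a regular expression pattern
function replaceAllUnsafe(str, find, replace) {
  return str.replace(new RegExp(find, 'g'), replace);
}
// Escaped: `find` is matched literally
function replaceAllSafe(str, find, replace) {
  return str.replace(new RegExp(escapeRegExp(find), 'g'), replace);
}

var price = '1.50 and 1x50';
console.log(replaceAllUnsafe(price, '1.5', '2.5')); // "2.50 and 2.50" (the dot matched "x" too)
console.log(replaceAllSafe(price, '1.5', '2.5'));   // "2.50 and 1x50"
```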
For the sake of completeness, I got to thinking about which method I should use to do this. There are basically two ways to do this as suggested by the other answers on this page.
Note: Extending the built-in prototypes in JavaScript is generally not recommended. I am providing these as extensions on the String prototype simply for purposes of illustration, showing different implementations of a hypothetical standard method on the String built-in prototype.
Regular Expression Based Implementation
String.prototype.replaceAll = function(search, replacement) {
var target = this;
return target.replace(new RegExp(search, 'g'), replacement);
};
Split and Join (Functional) Implementation
String.prototype.replaceAll = function(search, replacement) {
var target = this;
return target.split(search).join(replacement);
};
Not knowing too much about how regular expressions work behind the scenes in terms of efficiency, I tended to lean toward the split and join implementation in the past without thinking about performance. When I did wonder which was more efficient, and by what margin, I used it as an excuse to find out.
On my Chrome Windows 8 machine, the regular expression based implementation is the fastest, with the split and join implementation being 53% slower. In other words, the regular expression implementation is roughly twice as fast for the lorem ipsum input I used.
Check out this benchmark running these two implementations against each other.
As noted in the comment below by @ThomasLeduc and others, there could be an issue with the regular expression-based implementation if search contains certain characters which are reserved as special characters in regular expressions. The implementation assumes that the caller will escape the string beforehand or will only pass strings that are without the characters in the table in Regular Expressions (MDN).
MDN also provides an implementation to escape our strings. It would be nice if this was also standardized as RegExp.escape(str), but alas, it does not exist:
function escapeRegExp(str) {
return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // $& means the whole matched string
}
We could call escapeRegExp within our String.prototype.replaceAll implementation, however, I'm not sure how much this will affect the performance (potentially even for strings for which the escape is not needed, like all alphanumeric strings).
In the latest versions of most popular browsers, you can use replaceAll as shown here:
let result = "1 abc 2 abc 3".replaceAll("abc", "xyz");
// `result` is "1 xyz 2 xyz 3"
But check Can I use or another compatibility table first to make sure the browsers you're targeting have added support for it.
For Node.js and compatibility with older/non-current browsers:
Note: Don't use the following solution in performance critical code.
As an alternative to regular expressions for a simple literal string, you could use
str = "Test abc test test abc test...".split("abc").join("");
The general pattern is
str.split(search).join(replacement)
This used to be faster in some cases than using replaceAll and a regular expression, but that doesn't seem to be the case anymore in modern browsers.
Benchmark: https://jsben.ch/TZYzj
Conclusion:
If you have a performance-critical use case (e.g., processing hundreds of strings), use the regular expression method. But for most typical use cases, the split/join approach is well worth it for not having to worry about special characters.
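A short illustration of the "special characters" point, using a made-up sample string: split/join treats the search text literally, while the same text passed to new RegExp is read as a pattern.

```javascript
var text = 'cost (USD) + tax (USD)';

// split/join: "(USD)" is matched literally, parentheses included
var viaSplit = text.split('(USD)').join('[EUR]');
console.log(viaSplit); // "cost [EUR] + tax [EUR]"

// Unescaped RegExp: the parentheses become a capture group, so only "USD" matches
var viaRegex = text.replace(new RegExp('(USD)', 'g'), '[EUR]');
console.log(viaRegex); // "cost ([EUR]) + tax ([EUR])"
```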
Here's a string prototype function based on the accepted answer:
String.prototype.replaceAll = function(find, replace) {
var str = this;
return str.replace(new RegExp(find, 'g'), replace);
};
If your find contains special characters then you need to escape them:
String.prototype.replaceAll = function(find, replace) {
var str = this;
return str.replace(new RegExp(find.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), 'g'), replace);
};
Fiddle: http://jsfiddle.net/cdbzL/
Use word boundaries (\b)
'a cat is not a caterpillar'.replace(/\bcat\b/gi,'dog');
//"a dog is not a caterpillar"
This is a simple regex that avoids replacing parts of words in most cases. However, a dash - is still considered a word boundary. So conditionals can be used in this case to avoid replacing strings like cool-cat:
'a cat is not a cool-cat'.replace(/\bcat\b/gi,'dog');//wrong
//"a dog is not a cool-dog" -- nips
'a cat is not a cool-cat'.replace(/(?:\b([^-]))cat(?:\b([^-]))/gi,'$1dog$2');
//"a dog is not a cool-cat"
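In engines that support lookbehind assertions (ES2018 and later, which is an assumption about your target environment), the same idea can be written without capture groups, and it then also works at the very start and end of the string. A sketch with a hypothetical sample string:

```javascript
// "cat" must not be preceded or followed by a word character or a dash
var wholeCat = /(?<![\w-])cat(?![\w-])/gi;

var pets = 'cat, cool-cat, cat-like, caterpillar, cat';
var renamed = pets.replace(wholeCat, 'dog');
console.log(renamed);
// "dog, cool-cat, cat-like, caterpillar, dog"
```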
Basically, this question is the same as the question here:
Replace " ' " with " '' " in JavaScript
Regexp isn't the only way to replace multiple occurrences of a substring, far from it. Think flexible, think split!
var newText = "the cat looks like a cat".split('cat').join('dog');
Alternatively, you may want to avoid replacing parts of words, which the approved answer will do, too. You can get around this issue using regular expressions that are, I admit, somewhat more complex and, as a result, a tad slower, too:
var regText = "the cat looks like a cat".replace(/(?:(^|[^a-z]))(([^a-z]*)(?=cat)cat)(?![a-z])/gi,"$1dog");
The output is the same as the accepted answer, however, using the /cat/g expression on this string:
var oops = 'the cat looks like a cat, not a caterpillar or coolcat'.replace(/cat/g,'dog');
//returns "the dog looks like a dog, not a dogerpillar or cooldog" ??
Oops indeed, this probably isn't what you want. What is, then? IMHO, a regex that only replaces 'cat' conditionally (i.e., not part of a word), like so:
var caterpillar = 'the cat looks like a cat, not a caterpillar or coolcat'.replace(/(?:(^|[^a-z]))(([^a-z]*)(?=cat)cat)(?![a-z])/gi,"$1dog");
//returns "the dog looks like a dog, not a caterpillar or coolcat"
My guess is, this meets your needs. It's not foolproof, of course, but it should be enough to get you started. I'd recommend reading some more on these pages. This'll prove useful in perfecting this expression to meet your specific needs.
RegExp (regular expression) object
Regular-Expressions.info
Here is an example of .replace used with a callback function. In this case, it dramatically simplifies the expression and provides even more flexibility, like replacing with correct capitalisation or replacing both cat and cats in one go:
'Two cats are not 1 Cat! They\'re just cool-cats, you caterpillar'
.replace(/(^|.\b)(cat)(s?\b.|$)/gi,function(all,char1,cat,char2)
{
// Check 1st, capitalize if required
var replacement = (cat.charAt(0) === 'C' ? 'D' : 'd') + 'og';
if (char1 === ' ' && char2 === 's')
{ // Replace plurals, too
cat = replacement + 's';
}
else
{ // Do not replace if dashes are matched
cat = char1 === '-' || char2 === '-' ? cat : replacement;
}
return char1 + cat + char2;//return replacement string
});
//returns:
//Two dogs are not 1 Dog! They're just cool-cats, you caterpillar
These are the most common and readable methods.
var str = "Test abc test test abc test test test abc test test abc"
Method 1:
str = str.replace(/abc/g, "replaced text");
Method 2:
str = str.split("abc").join("replaced text");
Method 3:
str = str.replace(new RegExp("abc", "g"), "replaced text");
Method 4:
while(str.includes("abc")){
str = str.replace("abc", "replaced text");
}
Output:
console.log(str);
// Test replaced text test test replaced text test test test replaced text test test replaced text
Match against a global regular expression:
anotherString = someString.replace(/cat/g, 'dog');
For replacing a single time, use:
var res = str.replace('abc', "");
For replacing multiple times, use:
var res = str.replace(/abc/g, "");
str = str.replace(/abc/g, '');
Or try the replaceAll method, as recommended in this answer:
str = str.replaceAll('abc', '');
or:
var search = 'abc';
str = str.replaceAll(search, '');
EDIT: Clarification about replaceAll availability
The replaceAll method is added to String's prototype. This means it will be available for all string objects/literals.
Example:
var output = "test this".replaceAll('this', 'that'); // output is 'test that'.
output = output.replaceAll('that', 'this'); // output is 'test this'
Using RegExp in JavaScript could do the job for you. Just do something like the code below, and don't forget the g flag after the closing slash, which stands for global:
var str ="Test abc test test abc test test test abc test test abc";
str = str.replace(/abc/g, '');
If you want to reuse this, you could create a function for it, although that's not really recommended since it's only a one-line function. But again, if you use this heavily, you can write something like this:
String.prototype.replaceAll = String.prototype.replaceAll || function(string, replaced) {
return this.replace(new RegExp(string, 'g'), replaced);
};
And simply use it in your code over and over like below:
var str ="Test abc test test abc test test test abc test test abc";
str = str.replaceAll('abc', '');
But as I mentioned earlier, it won't make a huge difference in terms of lines to be written or performance. Only caching the function may give somewhat faster performance on long strings, and it is also good DRY practice if you want to reuse the code.
Say you want to replace all the 'abc' with 'x':
let some_str = 'abc def def lom abc abc def'.split('abc').join('x')
console.log(some_str) //x def def lom x x def
I was trying to think about something more simple than modifying the string prototype.
Use a regular expression:
str.replace(/abc/g, '');
Performance
On 27.12.2019 I ran tests on macOS v10.13.6 (High Sierra) for the chosen solutions.
Conclusions
The str.replace(/abc/g, ''); (C) is a good, fast cross-browser solution for all strings.
Solutions based on split-join (A,B) or replace (C,D) are fast.
Solutions based on while (E,F,G,H) are slow: usually about 4 times slower for small strings and about 3000 times (!) slower for long strings.
The recursive solutions (RA,RB) are slow and do not work for long strings.
I also created my own solution. It currently looks like the shortest one that does what the question asks:
str.split`abc`.join``
str = "Test abc test test abc test test test abc test test abc";
str = str.split`abc`.join``
console.log(str);
Details
The tests were performed on Chrome 79.0, Safari 13.0.4 and Firefox 71.0 (64 bit). The tests RA and RB use recursion.
Results
Short string - 55 characters
You can run the tests on your machine HERE. Results for Chrome:
Long string: 275 000 characters
The recursive solutions RA and RB give
RangeError: Maximum call stack size exceeded
For 1M characters they even break Chrome.
I tried to run the tests with 1M characters for the other solutions, but E, F, G and H take so long that the browser asks me to stop the script, so I shrank the test string to 275K characters. You can run the tests on your machine HERE. Results for Chrome:
Code used in tests
var t="Test abc test test abc test test test abc test test abc"; // .repeat(5000)
var log = (version,result) => console.log(`${version}: ${result}`);
function A(str) {
return str.split('abc').join('');
}
function B(str) {
return str.split`abc`.join``; // my proposition
}
function C(str) {
return str.replace(/abc/g, '');
}
function D(str) {
return str.replace(new RegExp("abc", "g"), '');
}
function E(str) {
while (str.indexOf('abc') !== -1) { str = str.replace('abc', ''); }
return str;
}
function F(str) {
while (str.indexOf('abc') !== -1) { str = str.replace(/abc/, ''); }
return str;
}
function G(str) {
while(str.includes("abc")) { str = str.replace('abc', ''); }
return str;
}
// src: https://stackoverflow.com/a/56989553/860099
function H(str)
{
let i = -1
let find = 'abc';
let newToken = '';
if (!str)
{
if ((str == null) && (find == null)) return newToken;
return str;
}
while ((
i = str.indexOf(
find, i >= 0 ? i + newToken.length : 0
)) !== -1
)
{
str = str.substring(0, i) +
newToken +
str.substring(i + find.length);
}
return str;
}
// src: https://stackoverflow.com/a/22870785/860099
function RA(string, prevstring) {
var omit = 'abc';
var place = '';
if (prevstring && string === prevstring)
return string;
prevstring = string.replace(omit, place);
return RA(prevstring, string)
}
// src: https://stackoverflow.com/a/26107132/860099
function RB(str) {
var find = 'abc';
var replace = '';
var i = str.indexOf(find);
if (i > -1){
str = str.replace(find, replace);
i = i + replace.length;
var st2 = str.substring(i);
if(st2.indexOf(find) > -1){
str = str.substring(0,i) + RB(st2, find, replace);
}
}
return str;
}
log('A ', A(t));
log('B ', B(t));
log('C ', C(t));
log('D ', D(t));
log('E ', E(t));
log('F ', F(t));
log('G ', G(t));
log('H ', H(t));
log('RA', RA(t)); // uses recursion
log('RB', RB(t)); // uses recursion
This snippet only presents the code used in the tests. It does not perform the tests itself!
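To actually time two of the candidates yourself, a very rough harness could look like the one below. This is a sketch and not the benchmark used for the results above: `timeIt` is a hypothetical helper, it assumes a global `performance.now()` (browsers and recent Node.js versions), and a loop like this is far less reliable than a dedicated benchmarking tool.

```javascript
// Hypothetical helper: run fn(input) `runs` times and report the elapsed milliseconds
function timeIt(fn, input, runs) {
  var start = performance.now();
  var result;
  for (var i = 0; i < runs; i++) {
    result = fn(input);
  }
  return { ms: performance.now() - start, result: result };
}

var input = 'Test abc test test abc test test test abc test test abc'.repeat(5000);

var splitJoin = timeIt(function (s) { return s.split('abc').join(''); }, input, 20);
var regexReplace = timeIt(function (s) { return s.replace(/abc/g, ''); }, input, 20);

console.log('split/join:', splitJoin.ms.toFixed(1), 'ms');
console.log('regex:     ', regexReplace.ms.toFixed(1), 'ms');
console.log('same output:', splitJoin.result === regexReplace.result);
```

The timings depend entirely on the machine and engine, so no expected numbers are given here.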
Replacing single quotes:
function JavaScriptEncode(text){
text = text.replace(/'/g,'&apos;')
// More encode here if required
return text;
}
Using
str = str.replace(new RegExp("abc", 'g'), "");
worked better for me than the previous answers. So new RegExp("abc", 'g') creates a regular expression that matches all occurrences ('g' flag) of the text ("abc"). The second argument is what each match gets replaced with, in your case the empty string ("").
str is the string, and we have to reassign it, because replace(...) only returns the result and does not modify the original string. In some cases you might want to use that.
This is the fastest version that doesn't use regular expressions.
Revised jsperf
replaceAll = function(string, omit, place, prevstring) {
if (prevstring && string === prevstring)
return string;
prevstring = string.replace(omit, place);
return replaceAll(prevstring, omit, place, string)
}
It is almost twice as fast as the split and join method.
As pointed out in a comment here, this will not work if your place variable contains omit, as in: replaceAll("string", "s", "ss"), because it will always be able to replace another occurrence of the word.
There is another jsperf with variants on my recursive replace that go even faster (http://jsperf.com/replace-all-vs-split-join/12)!
Update July 27th 2017: It looks like RegExp now has the fastest performance in the recently released Chrome 59.
Loop until the number of occurrences drops to 0, like this:
function replaceAll(find, replace, str) {
while (str.indexOf(find) > -1) {
str = str.replace(find, replace);
}
return str;
}
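Be aware that the loop above never terminates when replace itself contains find (for example replacing "-" with "--"), because every pass creates new matches. A sketch of a loop that scans forward with indexOf and never re-examines replaced text avoids that; `replaceAllByIndex` is a hypothetical name and its argument order differs from the function above.

```javascript
function replaceAllByIndex(str, find, replace) {
  if (find === '') return str; // an empty search string would never advance
  var result = '';
  var from = 0;
  var i;
  // Copy the text before each match, append the replacement, then skip the match
  while ((i = str.indexOf(find, from)) !== -1) {
    result += str.slice(from, i) + replace;
    from = i + find.length;
  }
  return result + str.slice(from);
}

console.log(replaceAllByIndex('a-b-c', '-', '--'));  // "a--b--c"
console.log(replaceAllByIndex('aaaaaa', 'aa', 'a')); // "aaa"
```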
If what you want to find is already in a string, and you don't have a regex escaper handy, you can use join/split:
function replaceMulti(haystack, needle, replacement)
{
return haystack.split(needle).join(replacement);
}
someString = 'the cat looks like a cat';
console.log(replaceMulti(someString, 'cat', 'dog'));
function replaceAll(str, find, replace) {
var i = str.indexOf(find);
if (i > -1){
str = str.replace(find, replace);
i = i + replace.length;
var st2 = str.substring(i);
if(st2.indexOf(find) > -1){
str = str.substring(0,i) + replaceAll(st2, find, replace);
}
}
return str;
}
I like this method (it looks a little cleaner):
text = text.replace(new RegExp("cat","g"), "dog");
String.prototype.replaceAll - ECMAScript 2021
The new String.prototype.replaceAll() method returns a new string with all matches of a pattern replaced by a replacement. The pattern can be either a string or a RegExp, and the replacement can be either a string or a function to be called for each match.
const message = 'dog barks meow meow';
const messageFormatted = message.replaceAll('meow', 'woof')
console.log(messageFormatted);
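The other argument forms mentioned above can be tried the same way. This short sketch assumes an environment with native replaceAll; note that a regular expression passed to it must carry the g flag, otherwise a TypeError is thrown.

```javascript
const message = 'dog barks meow meow';

// RegExp pattern (must be global)
console.log(message.replaceAll(/meow/g, 'woof'));
// "dog barks woof woof"

// Function replacement, called once per match
console.log(message.replaceAll('meow', (match) => match.toUpperCase()));
// "dog barks MEOW MEOW"

// A non-global RegExp is rejected
let threw = false;
try {
  message.replaceAll(/meow/, 'woof');
} catch (e) {
  threw = e instanceof TypeError;
}
console.log(threw); // true
```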
Of course in 2021 the right answer is:
String.prototype.replaceAll()
console.log(
'Change this and this for me'.replaceAll('this','that') // Normal case
);
console.log(
'aaaaaa'.replaceAll('aa','a') // Challenged case
);
If you don't want to deal with replace() + RegExp.
But what if the browser is from before 2020?
In this case we need a polyfill (code that makes older browsers support newer features). I think it will remain necessary for a few years.
I could not find a completely correct method among the answers, so I suggest this function, defined as a polyfill.
My suggested options for replaceAll polyfill:
replaceAll polyfill (with global-flag error) (more principled version)
if (!String.prototype.replaceAll) { // Check that the native function does not exist
Object.defineProperty(String.prototype, 'replaceAll', { // Define replaceAll as a prototype for (Mother/Any) String
configurable: true, writable: true, enumerable: false, // Editable & non-enumerable property (As it should be)
value: function(search, replace) { // Set the function by closest input names (For good info in consoles)
return this.replace( // Using native String.prototype.replace()
Object.prototype.toString.call(search) === '[object RegExp]' // IsRegExp?
? search.global // Is the RegEx global?
? search // So pass it
: function(){throw new TypeError('replaceAll called with a non-global RegExp argument')}() // If not throw an error
: RegExp(String(search).replace(/[.^$*+?()[{|\\]/g, "\\$&"), "g"), // Replace all reserved characters with '\' then make a global 'g' RegExp
replace); // passing second argument
}
});
}
replaceAll polyfill (With handling global-flag missing by itself) (my first preference) - Why?
if (!String.prototype.replaceAll) { // Check that the native function does not exist
Object.defineProperty(String.prototype, 'replaceAll', { // Define replaceAll as a prototype for (Mother/Any) String
configurable: true, writable: true, enumerable: false, // Editable & non-enumerable property (As it should be)
value: function(search, replace) { // Set the function by closest input names (For good info in consoles)
return this.replace( // Using native String.prototype.replace()
Object.prototype.toString.call(search) === '[object RegExp]' // IsRegExp?
? search.global // Is the RegEx global?
? search // So pass it
: RegExp(search.source, /\/([a-z]*)$/.exec(search.toString())[1] + 'g') // If not, make a global clone from the RegEx
: RegExp(String(search).replace(/[.^$*+?()[{|\\]/g, "\\$&"), "g"), // Replace all reserved characters with '\' then make a global 'g' RegExp
replace); // passing second argument
}
});
}
Minified (my first preference):
if(!String.prototype.replaceAll){Object.defineProperty(String.prototype,'replaceAll',{configurable:!0,writable:!0,enumerable:!1,value:function(search,replace){return this.replace(Object.prototype.toString.call(search)==='[object RegExp]'?search.global?search:RegExp(search.source,/\/([a-z]*)$/.exec(search.toString())[1]+'g'):RegExp(String(search).replace(/[.^$*+?()[{|\\]/g,"\\$&"),"g"),replace)}})}
Try it:
if(!String.prototype.replaceAll){Object.defineProperty(String.prototype,'replaceAll',{configurable:!0,writable:!0,enumerable:!1,value:function(search,replace){return this.replace(Object.prototype.toString.call(search)==='[object RegExp]'?search.global?search:RegExp(search.source,/\/([a-z]*)$/.exec(search.toString())[1]+'g'):RegExp(String(search).replace(/[.^$*+?()[{|\\]/g,"\\$&"),"g"),replace)}})}
console.log(
'Change this and this for me'.replaceAll('this','that')
); // Change that and that for me
console.log(
'aaaaaa'.replaceAll('aa','a')
); // aaa
console.log(
'{} (*) (*) (RegEx) (*) (\*) (\\*) [reserved characters]'.replaceAll('(*)','X')
); // {} X X (RegEx) X X (\*) [reserved characters]
console.log(
'How (replace) (XX) with $1?'.replaceAll(/(xx)/gi,'$$1')
); // How (replace) ($1) with $1?
console.log(
'Here is some numbers 1234567890 1000000 123123.'.replaceAll(/\d+/g,'***')
); // Here is some numbers *** *** ***.
console.log(
'Remove numbers under 233: 236 229 711 200 5'.replaceAll(/\d+/g, function(m) {
return parseFloat(m) < 233 ? '' : m
})
); // Remove numbers under 233: 236 711
console.log(
'null'.replaceAll(null,'x')
); // x
// The difference between my first preference and the original:
// Now, in 2022, browsers released after 2020 should throw an error (but this may change in the future)
// console.log(
// 'xyz ABC abc ABC abc xyz'.replaceAll(/abc/i,'')
// );
// Browsers < 2020:
// xyz xyz
// Browsers > 2020
// TypeError: String.prototype.replaceAll called with a non-global RegExp
Browser support:
Internet Explorer 9 and later (tested on Internet Explorer 11).
All other browsers (after 2012).
The result is the same as the native replaceAll when the first argument is:
null, undefined, Object, Function, Date, ... , RegExp, Number, String, ...
Ref: 22.1.3.19 String.prototype.replaceAll ( searchValue, replaceValue)
+ RegExp Syntax
Important note: As some professionals have mentioned, many of the recursive functions suggested in the answers will return the wrong result. (Try them with the challenged case from the snippet above.)
Some tricky methods like .split('searchValue').join('replaceValue'), or some well-managed functions, may give the same result, but definitely with much lower performance than native replaceAll() / polyfill replaceAll() / replace() + RegExp
Other methods of polyfill assignment
Naive, but supports even older browsers (better to avoid)
For example, we can support IE7+ too, by not using Object.defineProperty() and using my old naive assignment method instead:
if (!String.prototype.replaceAll) {
String.prototype.replaceAll = function(search, replace) { // <-- Naive method for assignment
// ... (Polyfill code Here)
}
}
And it should work well for basic uses on IE7+.
But as @sebastian-simon explained, that can cause secondary problems in more advanced uses. E.g.:
for (var k in 'hi') console.log(k);
// 0
// 1
// replaceAll <-- ?
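The difference comes down to enumerability. The sketch below adds two throwaway methods (the names `demoNaive` and `demoSafe` are hypothetical, chosen so the demo does not touch replaceAll), one by plain assignment and one through Object.defineProperty with enumerable: false, and then lists what for...in sees.

```javascript
// Plain assignment creates an enumerable property
String.prototype.demoNaive = function () { return 'naive'; };

// defineProperty lets us keep the property out of for...in loops
Object.defineProperty(String.prototype, 'demoSafe', {
  configurable: true, writable: true, enumerable: false,
  value: function () { return 'safe'; }
});

var keys = [];
for (var k in 'hi') keys.push(k);
console.log(keys); // ["0", "1", "demoNaive"]

// Clean up so the demo leaves String.prototype as it was
delete String.prototype.demoNaive;
delete String.prototype.demoSafe;
```

Both methods are callable while they exist, but only the naively assigned one leaks into the loop.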
Fully trustable, but heavy
In fact, my suggested option is a little optimistic: it trusts the environment (browser and Node.js), which is reasonable for roughly 2012-2021. The method is also a standard, well-known one, so it does not require any special consideration.
But there can be even older browsers or some unexpected problems, and polyfill libraries can cover more of these environment issues. So in case we need the maximum possible support, we can use polyfill libraries like:
https://polyfill.io/
Specifically for replaceAll:
<script src="https://polyfill.io/v3/polyfill.min.js?features=String.prototype.replaceAll"></script>
The simplest way to do this without using any regular expression is split and join, like the code here:
var str = "Test abc test test abc test test test abc test test abc";
console.log(str.split('abc').join(''));
var str = "ff ff f f a de def";
str = str.replace(/f/g,'');
alert(str);
http://jsfiddle.net/ANHR9/
while (str.indexOf('abc') !== -1)
{
str = str.replace('abc', '');
}
If the string contains a similar pattern like abccc, you can use this:
str.replace(/abc(\s|$)/g, "")
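Note that the pattern above also consumes the whitespace that follows each match. If that whitespace should be kept, a lookahead is one option. This is a sketch with a made-up sample string:

```javascript
var sampleText = 'abc abccc abc';

// Capturing the trailing whitespace removes it along with "abc"
var consumed = sampleText.replace(/abc(\s|$)/g, '');
console.log(JSON.stringify(consumed)); // "abccc "

// A lookahead only checks for whitespace or end of string without consuming it
var kept = sampleText.replace(/abc(?=\s|$)/g, '');
console.log(JSON.stringify(kept)); // " abccc "
```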
As of August 2020 there is a Stage 4 proposal to ECMAScript that adds the replaceAll method to String.
It's now supported in Chrome 85+, Edge 85+, Firefox 77+, Safari 13.1+.
The usage is the same as the replace method:
String.prototype.replaceAll(searchValue, replaceValue)
Here's an example usage:
'Test abc test test abc test.'.replaceAll('abc', 'foo'); // -> 'Test foo test test foo test.'
It's supported in most modern browsers, but there exist polyfills:
core-js
es-shims
It is supported in the V8 engine behind an experimental flag --harmony-string-replaceall.
Read more on the V8 website.
The previous answers are way too complicated. Just use the replace function like this:
str.replace(/your_regex_pattern/g, replacement_string);
Example:
var str = "Test abc test test abc test test test abc test test abc";
var res = str.replace(/abc/g, "");
console.log(res);
After several trials and a lot of failures, I found that the function below seems to be the best all-rounder when it comes to browser compatibility and ease of use. This is the only working solution for older browsers that I found. (Yes, even though old browsers are discouraged and outdated, some legacy applications still make heavy use of OLE browsers, such as old Visual Basic 6 applications or Excel .xlsm macros with forms.)
Anyway, here's the simple function.
function replaceAll(str, match, replacement){
return str.split(match).join(replacement);
}
If you are trying to ensure that the string you are looking for won't exist even after the replacement, you need to use a loop.
For example:
var str = 'test aabcbc';
str = str.replace(/abc/g, '');
When complete, you will still have 'test abc'!
The simplest loop to solve this would be:
var str = 'test aabcbc';
while (str != str.replace(/abc/g, '')){
str = str.replace(/abc/g, '');
}
But that runs the replacement twice for each cycle. Perhaps (at risk of being voted down) that can be combined for a slightly more efficient but less readable form:
var str = 'test aabcbc';
while (str != (str = str.replace(/abc/g, ''))){}
// alert(str); alerts 'test '!
This can be particularly useful when looking for duplicate strings.
For example, if we have 'a,,,b' and we wish to remove all duplicate commas.
[In that case, one could do .replace(/,+/g,','), but at some point the regex gets complex and slow enough to loop instead.]
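The loop described above can be wrapped in a small helper. This is only a sketch: the name replaceUntilStable and the maxPasses guard are assumptions added for illustration, not part of the original answer. It runs each pass once, stops when a pass changes nothing, and gives up if the replacement keeps reintroducing the pattern.

```javascript
// Hypothetical helper: repeat a global replace until the string stops changing.
function replaceUntilStable(input, pattern, replacement, maxPasses = 1000) {
  let current = input;
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = current.replace(pattern, replacement);
    if (next === current) {
      return current; // nothing left to replace
    }
    current = next;
  }
  // Reached only when the replacement keeps producing new matches.
  throw new Error('replaceUntilStable: still changing after ' + maxPasses + ' passes');
}

console.log(JSON.stringify(replaceUntilStable('test aabcbc', /abc/g, '')));
console.log(replaceUntilStable('a,,,b', /,,/g, ','));
```

The guard matters because a replacement that contains the search text would otherwise loop forever.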

javascript replace backslashes with forward slashes on path [duplicate]

Given a string:
s = "Test abc test test abc test test test abc test test abc";
This seems to only remove the first occurrence of abc in the string above:
s = s.replace('abc', '');
How do I replace all occurrences of it?
As of August 2020: Modern browsers have support for the String.replaceAll() method defined by the ECMAScript 2021 language specification.
For older/legacy browsers:
function escapeRegExp(string) {
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}
function replaceAll(str, find, replace) {
return str.replace(new RegExp(escapeRegExp(find), 'g'), replace);
}
Here is how this answer evolved:
str = str.replace(/abc/g, '');
In response to comment "what's if 'abc' is passed as a variable?":
var find = 'abc';
var re = new RegExp(find, 'g');
str = str.replace(re, '');
In response to Click Upvote's comment, you could simplify it even more:
function replaceAll(str, find, replace) {
return str.replace(new RegExp(find, 'g'), replace);
}
Note: Regular expressions contain special (meta) characters, and as such it is dangerous to blindly pass an argument in the find function above without pre-processing it to escape those characters. This is covered in the Mozilla Developer Network's JavaScript Guide on Regular Expressions, where they present the following utility function (which has changed at least twice since this answer was originally written, so make sure to check the MDN site for potential updates):
function escapeRegExp(string) {
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}
So in order to make the replaceAll() function above safer, it could be modified to the following if you also include escapeRegExp:
function replaceAll(str, find, replace) {
return str.replace(new RegExp(escapeRegExp(find), 'g'), replace);
}
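To see why the escaping step matters, here is a small comparison under the assumption that the search text is a single dot. The sample string and variable names are made up for illustration; escapeRegExp is the same utility quoted above.

```javascript
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}

const sample = 'a.b axb';

// Unescaped: "." is a metacharacter that matches any character.
const unescapedResult = sample.replace(new RegExp('.', 'g'), '-');

// Escaped: only the literal dot is replaced.
const escapedResult = sample.replace(new RegExp(escapeRegExp('.'), 'g'), '-');

console.log(unescapedResult); // every character was replaced
console.log(escapedResult);   // only the dot was replaced
```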
For the sake of completeness, I got to thinking about which method I should use to do this. There are basically two ways to do this as suggested by the other answers on this page.
Note: Extending the built-in prototypes in JavaScript is generally not recommended. I am providing these as extensions on the String prototype simply for purposes of illustration, showing different implementations of a hypothetical standard method on the String built-in prototype.
Regular Expression Based Implementation
String.prototype.replaceAll = function(search, replacement) {
var target = this;
return target.replace(new RegExp(search, 'g'), replacement);
};
Split and Join (Functional) Implementation
String.prototype.replaceAll = function(search, replacement) {
var target = this;
return target.split(search).join(replacement);
};
Not knowing too much about how regular expressions work behind the scenes in terms of efficiency, I tended to lean toward the split and join implementation in the past without thinking about performance. When I did wonder which was more efficient, and by what margin, I used it as an excuse to find out.
On my Chrome Windows 8 machine, the regular expression based implementation is the fastest, with the split and join implementation being 53% slower. In other words, the regular expression was roughly twice as fast for the lorem ipsum input I used.
Check out this benchmark running these two implementations against each other.
As noted in the comment below by #ThomasLeduc and others, there could be an issue with the regular expression-based implementation if search contains certain characters which are reserved as special characters in regular expressions. The implementation assumes that the caller will escape the string beforehand or will only pass strings that are without the characters in the table in Regular Expressions (MDN).
MDN also provides an implementation to escape our strings. It would be nice if this was also standardized as RegExp.escape(str), but alas, it does not exist:
function escapeRegExp(str) {
return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // $& means the whole matched string
}
We could call escapeRegExp within our String.prototype.replaceAll implementation, however, I'm not sure how much this will affect the performance (potentially even for strings for which the escape is not needed, like all alphanumeric strings).
In the latest versions of most popular browsers, you can use replaceAll
as shown here:
let result = "1 abc 2 abc 3".replaceAll("abc", "xyz");
// `result` is "1 xyz 2 xyz 3"
But check Can I use or another compatibility table first to make sure the browsers you're targeting have added support for it.
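Besides consulting a compatibility table, the check can also be made at runtime. The sketch below assumes a literal, non-empty search string; the helper name replaceAllCompat is hypothetical. A function is passed as the replacement so that the native path treats it literally, matching what split/join does.

```javascript
// Hypothetical helper: use native replaceAll when present, else fall back to split/join.
function replaceAllCompat(text, search, replacement) {
  if (typeof String.prototype.replaceAll === 'function') {
    // A replacer function avoids special handling of "$&", "$1", etc.
    return text.replaceAll(search, () => replacement);
  }
  return text.split(search).join(replacement);
}

console.log(replaceAllCompat('1 abc 2 abc 3', 'abc', 'xyz'));
```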
For Node.js and compatibility with older/non-current browsers:
Note: Don't use the following solution in performance critical code.
As an alternative to regular expressions for a simple literal string, you could use
str = "Test abc test test abc test...".split("abc").join("");
The general pattern is
str.split(search).join(replacement)
This used to be faster in some cases than using replaceAll and a regular expression, but that doesn't seem to be the case anymore in modern browsers.
Benchmark: https://jsben.ch/TZYzj
Conclusion:
If you have a performance-critical use case (e.g., processing hundreds of strings), use the regular expression method. But for most typical use cases, this is well worth not having to worry about special characters.
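As an illustration of the "special characters" point, the following sketch uses a made-up search text, 'C++'. With split/join it is handled literally, while passing it unescaped to new RegExp fails because "++" is not valid regular expression syntax.

```javascript
// split/join treats the search text literally.
const literalResult = 'C++ and C++'.split('C++').join('Rust');

// The same text as an unescaped pattern is rejected by the RegExp constructor.
let regexOutcome;
try {
  regexOutcome = 'C++ and C++'.replace(new RegExp('C++', 'g'), 'Rust');
} catch (err) {
  regexOutcome = err.name;
}

console.log(literalResult);
console.log(regexOutcome);
```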
Here's a string prototype function based on the accepted answer:
String.prototype.replaceAll = function(find, replace) {
var str = this;
return str.replace(new RegExp(find, 'g'), replace);
};
If your find contains special characters then you need to escape them:
String.prototype.replaceAll = function(find, replace) {
var str = this;
return str.replace(new RegExp(find.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'), 'g'), replace);
};
Fiddle: http://jsfiddle.net/cdbzL/
Use word boundaries (\b)
'a cat is not a caterpillar'.replace(/\bcat\b/gi,'dog');
//"a dog is not a caterpillar"
This is a simple regex that avoids replacing parts of words in most cases. However, a dash - is still considered a word boundary. So conditionals can be used in this case to avoid replacing strings like cool-cat:
'a cat is not a cool-cat'.replace(/\bcat\b/gi,'dog');//wrong
//"a dog is not a cool-dog" -- nips
'a cat is not a cool-cat'.replace(/(?:\b([^-]))cat(?:\b([^-]))/gi,'$1dog$2');
//"a dog is not a cool-cat"
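A possible alternative, assuming an engine with lookbehind support (ES2018 and later), is to assert that no word character or dash sits on either side of the word. This is a sketch rather than the answer's own pattern; the name wholeCat is made up. Unlike the capturing version above, it also matches cat at the very start or end of the string, because lookarounds do not consume a character.

```javascript
// "cat" not preceded or followed by a word character or a dash.
const wholeCat = /(?<![\w-])cat(?![\w-])/gi;

const dashed = 'a cat is not a cool-cat'.replace(wholeCat, 'dog');
const partial = 'a cat is not a caterpillar'.replace(wholeCat, 'dog');
const atEdges = 'cat sat on a cat'.replace(wholeCat, 'dog');

console.log(dashed);
console.log(partial);
console.log(atEdges);
```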
Basically, this question is the same as the question here:
Replace " ' " with " '' " in JavaScript
Regexp isn't the only way to replace multiple occurrences of a substring, far from it. Think flexible, think split!
var newText = "the cat looks like a cat".split('cat').join('dog');
Alternatively, you may want to avoid replacing parts of words, which the approved answer will do, too! You can get around this issue using regular expressions that are, I admit, somewhat more complex and, as a result, a tad slower, too:
var regText = "the cat looks like a cat".replace(/(?:(^|[^a-z]))(([^a-z]*)(?=cat)cat)(?![a-z])/gi,"$1dog");
The output is the same as the accepted answer, however, using the /cat/g expression on this string:
var oops = 'the cat looks like a cat, not a caterpillar or coolcat'.replace(/cat/g,'dog');
//returns "the dog looks like a dog, not a dogerpillar or cooldog" ??
Oops indeed, this probably isn't what you want. What is, then? IMHO, a regex that only replaces 'cat' conditionally (i.e., not part of a word), like so:
var caterpillar = 'the cat looks like a cat, not a caterpillar or coolcat'.replace(/(?:(^|[^a-z]))(([^a-z]*)(?=cat)cat)(?![a-z])/gi,"$1dog");
//return "the dog looks like a dog, not a caterpillar or coolcat"
My guess is, this meets your needs. It's not foolproof, of course, but it should be enough to get you started. I'd recommend reading some more on these pages. This'll prove useful in perfecting this expression to meet your specific needs.
RegExp (regular expression) object
Regular-Expressions.info
Here is an example of .replace used with a callback function. In this case, it dramatically simplifies the expression and provides even more flexibility, like replacing with correct capitalisation or replacing both cat and cats in one go:
'Two cats are not 1 Cat! They\'re just cool-cats, you caterpillar'
.replace(/(^|.\b)(cat)(s?\b.|$)/gi,function(all,char1,cat,char2)
{
// Check 1st, capitalize if required
var replacement = (cat.charAt(0) === 'C' ? 'D' : 'd') + 'og';
if (char1 === ' ' && char2 === 's')
{ // Replace plurals, too
cat = replacement + 's';
}
else
{ // Do not replace if dashes are matched
cat = char1 === '-' || char2 === '-' ? cat : replacement;
}
return char1 + cat + char2;//return replacement string
});
//returns:
//Two dogs are not 1 Dog! They're just cool-cats, you caterpillar
These are the most common and readable methods.
var str = "Test abc test test abc test test test abc test test abc"
Method 1:
str = str.replace(/abc/g, "replaced text");
Method 2:
str = str.split("abc").join("replaced text");
Method 3:
str = str.replace(new RegExp("abc", "g"), "replaced text");
Method 4:
while(str.includes("abc")){
str = str.replace("abc", "replaced text");
}
Output:
console.log(str);
// Test replaced text test test replaced text test test test replaced text test test replaced text
Match against a global regular expression:
anotherString = someString.replace(/cat/g, 'dog');
For replacing a single time, use:
var res = str.replace('abc', "");
For replacing multiple times, use:
var res = str.replace(/abc/g, "");
str = str.replace(/abc/g, '');
Or try the replaceAll method, as recommended in this answer:
str = str.replaceAll('abc', '');
or:
var search = 'abc';
str = str.replaceAll(search, '');
EDIT: Clarification about replaceAll availability
The replaceAll method is added to String's prototype. This means it will be available for all string objects/literals.
Example:
var output = "test this".replaceAll('this', 'that'); // output is 'test that'.
output = output.replaceAll('that', 'this'); // output is 'test this'
Using RegExp in JavaScript could do the job for you. Just do something like the code below, and don't forget the /g at the end, which stands for global:
var str ="Test abc test test abc test test test abc test test abc";
str = str.replace(/abc/g, '');
If you think of reuse, create a function to do that for you, but it's not recommended as it's only one line function. But again, if you heavily use this, you can write something like this:
String.prototype.replaceAll = String.prototype.replaceAll || function(string, replaced) {
return this.replace(new RegExp(string, 'g'), replaced);
};
And simply use it in your code over and over like below:
var str ="Test abc test test abc test test test abc test test abc";
str = str.replaceAll('abc', '');
But as I mentioned earlier, it won't make a huge difference in terms of lines written or performance. Caching the function may give slightly faster performance on long strings, and it is also good DRY practice if you want to reuse the code.
Say you want to replace all the 'abc' with 'x':
let some_str = 'abc def def lom abc abc def'.split('abc').join('x')
console.log(some_str) //x def def lom x x def
I was trying to think about something more simple than modifying the string prototype.
Use a regular expression:
str.replace(/abc/g, '');
Performance
Today (27.12.2019) I performed tests on macOS v10.13.6 (High Sierra) for the chosen solutions.
Conclusions
The str.replace(/abc/g, ''); (C) is a good cross-browser fast solution for all strings.
Solutions based on split-join (A,B) or replace (C,D) are fast
Solutions based on while (E,F,G,H) are slow: usually about 4 times slower for small strings and about 3000 times (!) slower for long strings
The recurrence solutions (RA,RB) are slow and do not work for long strings
I also created my own solution. It currently looks like the shortest one that does what the question asks:
str.split`abc`.join``
str = "Test abc test test abc test test test abc test test abc";
str = str.split`abc`.join``
console.log(str);
Details
The tests were performed on Chrome 79.0, Safari 13.0.4 and Firefox 71.0 (64 bit). The tests RA and RB use recursion.
Short string - 55 characters
You can run the tests on your machine HERE. Results for Chrome:
Long string: 275 000 characters
The recursive solutions RA and RB give
RangeError: Maximum call stack size exceeded
For 1M characters they even break Chrome.
I tried to run the tests with 1M characters for the other solutions, but E, F, G and H take so long that the browser asks me to stop the script, so I shrank the test string to 275K characters. You can run the tests on your machine HERE. Results for Chrome
Code used in tests
var t="Test abc test test abc test test test abc test test abc"; // .repeat(5000)
var log = (version,result) => console.log(`${version}: ${result}`);
function A(str) {
return str.split('abc').join('');
}
function B(str) {
return str.split`abc`.join``; // my proposition
}
function C(str) {
return str.replace(/abc/g, '');
}
function D(str) {
return str.replace(new RegExp("abc", "g"), '');
}
function E(str) {
while (str.indexOf('abc') !== -1) { str = str.replace('abc', ''); }
return str;
}
function F(str) {
while (str.indexOf('abc') !== -1) { str = str.replace(/abc/, ''); }
return str;
}
function G(str) {
while(str.includes("abc")) { str = str.replace('abc', ''); }
return str;
}
// src: https://stackoverflow.com/a/56989553/860099
function H(str)
{
let i = -1
let find = 'abc';
let newToken = '';
if (!str)
{
if ((str == null) && (find == null)) return newToken;
return str;
}
while ((
i = str.indexOf(
find, i >= 0 ? i + newToken.length : 0
)) !== -1
)
{
str = str.substring(0, i) +
newToken +
str.substring(i + find.length);
}
return str;
}
// src: https://stackoverflow.com/a/22870785/860099
function RA(string, prevstring) {
var omit = 'abc';
var place = '';
if (prevstring && string === prevstring)
return string;
prevstring = string.replace(omit, place);
return RA(prevstring, string)
}
// src: https://stackoverflow.com/a/26107132/860099
function RB(str) {
var find = 'abc';
var replace = '';
var i = str.indexOf(find);
if (i > -1){
str = str.replace(find, replace);
i = i + replace.length;
var st2 = str.substring(i);
if(st2.indexOf(find) > -1){
str = str.substring(0,i) + RB(st2, find, replace);
}
}
return str;
}
log('A ', A(t));
log('B ', B(t));
log('C ', C(t));
log('D ', D(t));
log('E ', E(t));
log('F ', F(t));
log('G ', G(t));
log('H ', H(t));
log('RA', RA(t)); // uses recursion
log('RB', RB(t)); // uses recursion
This snippet only presents the code used in the tests. It does not run the tests itself!
Replacing single quotes:
function JavaScriptEncode(text){
text = text.replace(/'/g,'&apos;')
// More encode here if required
return text;
}
Using
str = str.replace(new RegExp("abc", 'g'), "");
worked better for me than the previous answers. new RegExp("abc", 'g') creates a regular expression that matches all occurrences ('g' flag) of the text ("abc"). The second argument is the replacement, in your case an empty string ("").
str is the string, and we have to reassign it, because replace(...) only returns the result and does not modify the original. In some cases you might want to use that.
This is the fastest version that doesn't use regular expressions.
Revised jsperf
replaceAll = function(string, omit, place, prevstring) {
if (prevstring && string === prevstring)
return string;
prevstring = string.replace(omit, place);
return replaceAll(prevstring, omit, place, string)
}
It is almost twice as fast as the split and join method.
As pointed out in a comment here, this will not work if your place variable contains omit, as in: replaceAll("string", "s", "ss"), because there will always be another occurrence to replace.
There is another jsperf with variants on my recursive replace that go even faster (http://jsperf.com/replace-all-vs-split-join/12)!
Update July 27th 2017: It looks like RegExp now has the fastest performance in the recently released Chrome 59.
Loop until the number of occurrences drops to 0, like this:
function replaceAll(find, replace, str) {
while (str.indexOf(find) > -1) {
str = str.replace(find, replace);
}
return str;
}
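Note that looping like this is not equivalent to a single-pass "replace all" when the replacement can create new matches. The sketch below compares the two on a made-up input; replaceInLoop is the same loop as above under a different name.

```javascript
// Same loop as above: keeps replacing until no occurrence is left.
function replaceInLoop(find, replace, str) {
  while (str.indexOf(find) > -1) {
    str = str.replace(find, replace);
  }
  return str;
}

// The loop keeps collapsing newly formed "aa" pairs.
const looped = replaceInLoop('aa', 'a', 'aaaaaa');

// A single pass replaces each non-overlapping "aa" exactly once.
const singlePass = 'aaaaaa'.split('aa').join('a');

console.log(looped);
console.log(singlePass);
```

If the replacement itself contains the search text, the loop never terminates, so it is only suitable when that cannot happen.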
If what you want to find is already in a string, and you don't have a regex escaper handy, you can use join/split:
function replaceMulti(haystack, needle, replacement)
{
return haystack.split(needle).join(replacement);
}
someString = 'the cat looks like a cat';
console.log(replaceMulti(someString, 'cat', 'dog'));
function replaceAll(str, find, replace) {
var i = str.indexOf(find);
if (i > -1){
str = str.replace(find, replace);
i = i + replace.length;
var st2 = str.substring(i);
if(st2.indexOf(find) > -1){
str = str.substring(0,i) + replaceAll(st2, find, replace);
}
}
return str;
}
I like this method (it looks a little cleaner):
text = text.replace(new RegExp("cat","g"), "dog");
String.prototype.replaceAll - ECMAScript 2021
The new String.prototype.replaceAll() method returns a new string with all matches of a pattern replaced by a replacement. The pattern can be either a string or a RegExp, and the replacement can be either a string or a function to be called for each match.
const message = 'dog barks meow meow';
const messageFormatted = message.replaceAll('meow', 'woof')
console.log(messageFormatted);
Of course in 2021 the right answer is:
String.prototype.replaceAll()
console.log(
'Change this and this for me'.replaceAll('this','that') // Normal case
);
console.log(
'aaaaaa'.replaceAll('aa','a') // Challenged case
);
Use it if you don't want to deal with replace() + RegExp.
But what if the browser is from before 2020?
In that case we need a polyfill (code that makes older browsers support new features), which I think will remain necessary for a few more years.
I could not find a completely correct method in the other answers, so I suggest this function, defined as a polyfill.
My suggested options for replaceAll polyfill:
replaceAll polyfill (with global-flag error) (more principled version)
if (!String.prototype.replaceAll) { // Check if the native function not exist
Object.defineProperty(String.prototype, 'replaceAll', { // Define replaceAll as a prototype for (Mother/Any) String
configurable: true, writable: true, enumerable: false, // Editable & non-enumerable property (As it should be)
value: function(search, replace) { // Set the function by closest input names (For good info in consoles)
return this.replace( // Using native String.prototype.replace()
Object.prototype.toString.call(search) === '[object RegExp]' // IsRegExp?
? search.global // Is the RegEx global?
? search // So pass it
: function(){throw new TypeError('replaceAll called with a non-global RegExp argument')}() // If not throw an error
: RegExp(String(search).replace(/[.^$*+?()[{|\\]/g, "\\$&"), "g"), // Replace all reserved characters with '\' then make a global 'g' RegExp
replace); // passing second argument
}
});
}
replaceAll polyfill (With handling global-flag missing by itself) (my first preference) - Why?
if (!String.prototype.replaceAll) { // Check if the native function not exist
Object.defineProperty(String.prototype, 'replaceAll', { // Define replaceAll as a prototype for (Mother/Any) String
configurable: true, writable: true, enumerable: false, // Editable & non-enumerable property (As it should be)
value: function(search, replace) { // Set the function by closest input names (For good info in consoles)
return this.replace( // Using native String.prototype.replace()
Object.prototype.toString.call(search) === '[object RegExp]' // IsRegExp?
? search.global // Is the RegEx global?
? search // So pass it
: RegExp(search.source, /\/([a-z]*)$/.exec(search.toString())[1] + 'g') // If not, make a global clone from the RegEx
: RegExp(String(search).replace(/[.^$*+?()[{|\\]/g, "\\$&"), "g"), // Replace all reserved characters with '\' then make a global 'g' RegExp
replace); // passing second argument
}
});
}
Minified (my first preference):
if(!String.prototype.replaceAll){Object.defineProperty(String.prototype,'replaceAll',{configurable:!0,writable:!0,enumerable:!1,value:function(search,replace){return this.replace(Object.prototype.toString.call(search)==='[object RegExp]'?search.global?search:RegExp(search.source,/\/([a-z]*)$/.exec(search.toString())[1]+'g'):RegExp(String(search).replace(/[.^$*+?()[{|\\]/g,"\\$&"),"g"),replace)}})}
Try it:
if(!String.prototype.replaceAll){Object.defineProperty(String.prototype,'replaceAll',{configurable:!0,writable:!0,enumerable:!1,value:function(search,replace){return this.replace(Object.prototype.toString.call(search)==='[object RegExp]'?search.global?search:RegExp(search.source,/\/([a-z]*)$/.exec(search.toString())[1]+'g'):RegExp(String(search).replace(/[.^$*+?()[{|\\]/g,"\\$&"),"g"),replace)}})}
console.log(
'Change this and this for me'.replaceAll('this','that')
); // Change that and that for me
console.log(
'aaaaaa'.replaceAll('aa','a')
); // aaa
console.log(
'{} (*) (*) (RegEx) (*) (\*) (\\*) [reserved characters]'.replaceAll('(*)','X')
); // {} X X (RegEx) X X (\*) [reserved characters]
console.log(
'How (replace) (XX) with $1?'.replaceAll(/(xx)/gi,'$$1')
); // How (replace) ($1) with $1?
console.log(
'Here is some numbers 1234567890 1000000 123123.'.replaceAll(/\d+/g,'***')
); // Here is some numbers *** *** ***.
console.log(
'Remove numbers under 233: 236 229 711 200 5'.replaceAll(/\d+/g, function(m) {
return parseFloat(m) < 233 ? '' : m
})
); // Remove numbers under 233: 236 711
console.log(
'null'.replaceAll(null,'x')
); // x
// The difference between my first preference and the original:
// In browsers released after 2020, the native method throws an error here (this may change in the future)
// console.log(
// 'xyz ABC abc ABC abc xyz'.replaceAll(/abc/i,'')
// );
// Browsers < 2020:
// xyz xyz
// Browsers > 2020
// TypeError: String.prototype.replaceAll called with a non-global RegExp
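The difference noted in the comments above can be observed directly. This sketch assumes an environment with a native replaceAll (for example a recent browser or Node.js 15+); the variable names are made up.

```javascript
// Native replaceAll rejects a RegExp without the global flag.
let nonGlobalOutcome;
try {
  nonGlobalOutcome = 'xyz ABC abc'.replaceAll(/abc/i, '');
} catch (err) {
  nonGlobalOutcome = err instanceof TypeError ? 'TypeError' : 'other error';
}

// With the g flag the same call succeeds.
const globalOutcome = 'xyz ABC abc'.replaceAll(/abc/gi, '');

console.log(nonGlobalOutcome);
console.log(JSON.stringify(globalOutcome));
```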
Browser support:
Internet Explorer 9 and later (tested on Internet Explorer 11).
All other browsers (after 2012).
The result is the same as the native replaceAll in case of the first argument input is:
null, undefined, Object, Function, Date, ... , RegExp, Number, String, ...
Ref: 22.1.3.19 String.prototype.replaceAll ( searchValue, replaceValue)
+ RegExp Syntax
Important note: As some professionals have mentioned, many of the recursive functions suggested in the answers will return the wrong result. (Try them with the challenged case from the snippet above.)
Tricky methods like .split('searchValue').join('replaceValue') or some well-managed functions may give the same result, but with much lower performance than native replaceAll() / polyfill replaceAll() / replace() + RegExp.
Other methods of polyfill assignment
Naive, but supports even older browsers (be better to avoid)
For example, we can support IE7+ too, by not using Object.defineProperty() and using my old naive assignment method:
if (!String.prototype.replaceAll) {
String.prototype.replaceAll = function(search, replace) { // <-- Naive method for assignment
// ... (Polyfill code Here)
}
}
And it should work well for basic uses on IE7+.
But as here #sebastian-simon explained about, that can make secondary problems in case of more advanced uses. E.g.:
for (var k in 'hi') console.log(k);
// 0
// 1
// replaceAll <-- ?
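The enumerability difference can be reproduced without touching replaceAll itself. The sketch below uses two throwaway method names, demoNaive and demoHidden, which are hypothetical and removed again afterwards.

```javascript
// Collect the keys that a for...in loop over a string visits.
function forInKeys(value) {
  const keys = [];
  for (const key in value) keys.push(key);
  return keys;
}

// Naive assignment creates an enumerable property, so for...in sees it.
String.prototype.demoNaive = function () {};
const keysWithNaive = forInKeys('hi');
delete String.prototype.demoNaive;

// defineProperty with enumerable: false keeps the method out of for...in.
Object.defineProperty(String.prototype, 'demoHidden', {
  configurable: true, writable: true, enumerable: false,
  value: function () {}
});
const keysWithHidden = forInKeys('hi');
delete String.prototype.demoHidden;

console.log(keysWithNaive);
console.log(keysWithHidden);
```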
Fully trustable, but heavy
In fact, my suggested option is a little optimistic. It trusts the environment (browser and Node.js), which is reasonable for environments from around 2012-2021. It is also a standard, well-known approach, so it does not require any special consideration.
But there can be even older browsers or unexpected problems, and polyfill libraries can cover more of these environment problems. So if we need the maximum possible support, we can use polyfill libraries like:
https://polyfill.io/
Specially for replaceAll:
<script src="https://polyfill.io/v3/polyfill.min.js?features=String.prototype.replaceAll"></script>
The simplest way to do this without using any regular expression is split and join, like the code here:
var str = "Test abc test test abc test test test abc test test abc";
console.log(str.split('abc').join(''));
var str = "ff ff f f a de def";
str = str.replace(/f/g,'');
alert(str);
http://jsfiddle.net/ANHR9/
while (str.indexOf('abc') !== -1)
{
str = str.replace('abc', '');
}
If the string contains a similar pattern like abccc, you can use this:
str.replace(/abc(\s|$)/g, "")
As of August 2020 there is a Stage 4 proposal to ECMAScript that adds the replaceAll method to String.
It's now supported in Chrome 85+, Edge 85+, Firefox 77+, Safari 13.1+.
The usage is the same as the replace method:
String.prototype.replaceAll(searchValue, replaceValue)
Here's an example usage:
'Test abc test test abc test.'.replaceAll('abc', 'foo'); // -> 'Test foo test test foo test.'
It's supported in most modern browsers, but there exist polyfills:
core-js
es-shims
It is supported in the V8 engine behind an experimental flag --harmony-string-replaceall.
Read more on the V8 website.
The previous answers are way too complicated. Just use the replace function like this:
str.replace(/your_regex_pattern/g, replacement_string);
Example:
var str = "Test abc test test abc test test test abc test test abc";
var res = str.replace(/abc/g, "");
console.log(res);
After several trials and many failures, I found that the function below seems to be the best all-rounder when it comes to browser compatibility and ease of use. This is the only working solution for older browsers that I found. (Yes, even though old browsers are discouraged and outdated, some legacy applications still make heavy use of OLE browsers, such as old Visual Basic 6 applications or Excel .xlsm macros with forms.)
Anyway, here's the simple function.
function replaceAll(str, match, replacement){
return str.split(match).join(replacement);
}
If you are trying to ensure that the string you are looking for won't exist even after the replacement, you need to use a loop.
For example:
var str = 'test aabcbc';
str = str.replace(/abc/g, '');
When complete, you will still have 'test abc'!
The simplest loop to solve this would be:
var str = 'test aabcbc';
while (str != str.replace(/abc/g, '')){
str = str.replace(/abc/g, ''); // assign the result back, otherwise the loop never ends
}
But that runs the replacement twice for each cycle. Perhaps (at risk of being voted down) that can be combined for a slightly more efficient but less readable form:
var str = 'test aabcbc';
while (str != (str = str.replace(/abc/g, ''))){}
// alert(str); alerts 'test '!
This can be particularly useful when looking for duplicate strings.
For example, if we have 'a,,,b' and we wish to remove all duplicate commas.
[In that case, one could do .replace(/,+/g,','), but at some point the regex gets complex and slow enough to loop instead.]

Javascript string.match refuses to return an array of more than one match

I have a string that I expect to be formatted like so:
{List:[Names:a,b,c][Ages:1,2,3]}
My query looks like this in javascript:
var str = "{List:[Names:a,b,c][Ages:1,2,3]}";
var result = str.match(/^\{List:\[Names:([a-zA-z,]*)\]\[Ages:([0-9,]*)\]\}$/g);
Note: I recognize that with this regex it would pass with something like "Ages:,,,", but I'm not worried about that at the moment.
I was expecting to get this back:
result[0] = "{List:[Names:a,b,c][Ages:1,2,3]}"
result[1] = "a,b,c"
result[2] = "1,2,3"
But no matter what I seem to do to the regular expression, it refuses to return an array of more than one match, I just get the full string back (because it passes, which is a start):
result = ["{List:[Names:a,b,c][Ages:1,2,3]}"]
I've looked through a bunch of questions on here already, as well as other 'intro' articles, and none of them seem to address something this basic. I'm sure it's something foolish that I've overlooked, but I truly have no idea what it is :(
So this is a difference in how the global flag is applied in Regular Expressions in JavaScript.
In .match, the global flag (/g at the end) makes it return an array of every substring where the regular expression matches, without any capture groups. Without that flag, .match returns only the first match, followed by its capture groups.
eg:
var str = "{List:[Names:a,b,c][Ages:1,2,3]}";
str += str;
// removed ^ and $ for demonstration purposes
var results = str.match(/\{List:\[Names:([a-zA-z,]*)\]\[Ages:([0-9,]*)\]\}/g)
console.log(results)
// ["{List:[Names:a,b,c][Ages:1,2,3]}", "{List:[Names:a,b,c][Ages:1,2,3]}"]
str = "{List:[Names:a,b,c][Ages:1,2,3]}{List:[Names:a,b,c][Ages:3,4,5]}";
results = str.match(/\{List:\[Names:([a-zA-z,]*)\]\[Ages:([0-9,]*)\]\}/g);
console.log(results)
//["{List:[Names:a,b,c][Ages:1,2,3]}", "{List:[Names:a,b,c][Ages:3,4,5]}"]
Now, if we remove that /g flag:
// leaving str as above
results = str.match(/\{List:\[Names:([a-zA-z,]*)\]\[Ages:([0-9,]*)\]\}/);
console.log(results)
//["{List:[Names:a,b,c][Ages:1,2,3]}", "a,b,c", "1,2,3"]
And as a note as to why regex.exec worked, that is because:
If the regular expression does not include the g flag, returns the same result as regexp.exec(string).
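If both are needed, every match and its groups, String.prototype.matchAll (ES2020) is one option. The sketch below is an illustration, not part of the original answer: the second record in the input is made up, and the character class is written as a-zA-Z on the assumption that only letters were intended.

```javascript
// Global pattern without ^ and $, so it can match several records.
const listPattern = /\{List:\[Names:([a-zA-Z,]*)\]\[Ages:([0-9,]*)\]\}/g;
const listInput = "{List:[Names:a,b,c][Ages:1,2,3]}{List:[Names:d,e][Ages:4,5]}";

// matchAll yields one match object per occurrence, each with its capture groups.
const parsedLists = Array.from(listInput.matchAll(listPattern), (m) => ({
  whole: m[0],
  names: m[1].split(","),
  ages: m[2].split(",").map(Number),
}));

console.log(parsedLists);
```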
You're looking for the form needle.exec(haystack)
From my console:
> haystack = "{List:[Names:a,b,c][Ages:1,2,3]}";
"{List:[Names:a,b,c][Ages:1,2,3]}"
> needle = /^\{List:\[Names:([a-zA-z,]*)\]\[Ages:([0-9,]*)\]\}$/g ;
/^\{List:\[Names:([a-zA-z,]*)\]\[Ages:([0-9,]*)\]\}$/g
> needle.exec(haystack);
["{List:[Names:a,b,c][Ages:1,2,3]}", "a,b,c", "1,2,3"]
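One thing to keep in mind with this form: a regular expression with the g flag is stateful, because exec resumes from its lastIndex property. The sketch below repeats the console session as a script to show this; the character class is written as a-zA-Z on the assumption that only letters were intended.

```javascript
const haystack = "{List:[Names:a,b,c][Ages:1,2,3]}";
const needle = /^\{List:\[Names:([a-zA-Z,]*)\]\[Ages:([0-9,]*)\]\}$/g;

// First call: matches and returns the capture groups.
const firstCall = needle.exec(haystack);
const indexAfterFirst = needle.lastIndex; // now points past the match

// Second call on the same regex: starts at lastIndex, finds nothing, resets to 0.
const secondCall = needle.exec(haystack);
const indexAfterSecond = needle.lastIndex;

console.log(firstCall.slice(1), indexAfterFirst);
console.log(secondCall, indexAfterSecond);
```

Dropping the g flag, or creating a fresh regex per call, avoids the surprise when only one match is expected.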

RegEx for match/replacing JavaScript comments (both multiline and inline)

I need to remove all JavaScript comments from a JavaScript source using the JavaScript RegExp object.
What I need is the pattern for the RegExp.
So far, I've found this:
compressed = compressed.replace(/\/\*.+?\*\/|\/\/.*(?=[\n\r])/g, '');
This pattern works OK for:
/* I'm a comment */
or for:
/*
* I'm a comment aswell
*/
But doesn't seem to work for the inline:
// I'm an inline comment
I'm not quite an expert in RegEx and its patterns, so I need help.
Also, I would like to have a RegEx pattern which would remove all those HTML-like comments.
<!-- HTML Comment //--> or <!-- HTML Comment -->
And also those conditional HTML comments, which can be found in various JavaScript sources.
Thanks.
NOTE: Regex is not a lexer or a parser. If you have some weird edge case where you need some oddly nested comments parsed out of a string, use a parser. For the other 98% of the time this regex should work.
I had pretty complex block comments going on with nested asterisks, slashes, etc. The regular expression at the following site worked like a charm:
http://upshots.org/javascript/javascript-regexp-to-remove-comments
(see below for original)
Some modifications have been made, but the integrity of the original regex has been preserved. In order to allow certain double-slash (//) sequences (such as URLs), you must use back reference $1 in your replacement value instead of an empty string. Here it is:
/\/\*[\s\S]*?\*\/|([^\\:]|^)\/\/.*$/gm
// JavaScript:
// source_string.replace(/\/\*[\s\S]*?\*\/|([^\\:]|^)\/\/.*$/gm, '$1');
// PHP:
// preg_replace("/\/\*[\s\S]*?\*\/|([^\\:]|^)\/\/.*$/m", "$1", $source_string);
DEMO: https://regex101.com/r/B8WkuX/1
FAILING USE CASES: There are a few edge cases where this regex fails. An ongoing list of those cases is documented in this public gist. Please update the gist if you can find other cases.
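A short run of the pattern on made-up input shows both the intended behavior and one failing case of the kind mentioned above. The sample source and variable names are assumptions for illustration only.

```javascript
const commentPattern = /\/\*[\s\S]*?\*\/|([^\\:]|^)\/\/.*$/gm;

const sampleSource = [
  'var url = "http://example.com"; // trailing comment',
  '/* block',
  '   comment */',
  'var x = 1;',
].join('\n');

// $1 puts back the character captured just before "//"; the URL survives
// because "//" preceded by ":" is not treated as a comment start.
const stripped = sampleSource.replace(commentPattern, '$1');

// Failing case: "//" inside a string literal is still treated as a comment.
const brokenCase = 'var s = "a // b";'.replace(commentPattern, '$1');

console.log(stripped);
console.log(brokenCase);
```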
...and if you also want to remove <!-- html comments --> use this:
/\/\*[\s\S]*?\*\/|([^\\:]|^)\/\/.*|<!--[\s\S]*?-->$/
(original - for historical reference only)
// DO NOT USE THIS - SEE ABOVE
/(\/\*([\s\S]*?)\*\/)|(\/\/(.*)$)/gm
try this,
(\/\*[\w\'\s\r\n\*]*\*\/)|(\/\/[\w\s\']*)|(\<![\-\-\s\w\>\/]*\>)
should work :)
I have been putting together an expression that needs to do something similar.
the finished product is:
/(?:((["'])(?:(?:\\\\)|\\\2|(?!\\\2)\\|(?!\2).|[\n\r])*\2)|(\/\*(?:(?!\*\/).|[\n\r])*\*\/)|(\/\/[^\n\r]*(?:[\n\r]+|$))|((?:=|:)\s*(?:\/(?:(?:(?!\\*\/).)|\\\\|\\\/|[^\\]\[(?:\\\\|\\\]|[^]])+\])+\/))|((?:\/(?:(?:(?!\\*\/).)|\\\\|\\\/|[^\\]\[(?:\\\\|\\\]|[^]])+\])+\/)[gimy]?\.(?:exec|test|match|search|replace|split)\()|(\.(?:exec|test|match|search|replace|split)\((?:\/(?:(?:(?!\\*\/).)|\\\\|\\\/|[^\\]\[(?:\\\\|\\\]|[^]])+\])+\/))|(<!--(?:(?!-->).)*-->))/g
Scary right?
To break it down, the first part matches anything within single or double quotation marks
This is necessary to avoid matching quoted strings
((["'])(?:(?:\\\\)|\\\2|(?!\\\2)\\|(?!\2).|[\n\r])*\2)
the second part matches multiline comments delimited by /* */
(\/\*(?:(?!\*\/).|[\n\r])*\*\/)
The third part matches single line comments starting anywhere in the line
(\/\/[^\n\r]*(?:[\n\r]+|$))
The fourth through sixth parts match anything within a regex literal
This relies on a preceding equals sign or the literal being before or after a regex call
((?:=|:)\s*(?:\/(?:(?:(?!\\*\/).)|\\\\|\\\/|[^\\]\[(?:\\\\|\\\]|[^]])+\])+\/))
((?:\/(?:(?:(?!\\*\/).)|\\\\|\\\/|[^\\]\[(?:\\\\|\\\]|[^]])+\])+\/)[gimy]?\.(?:exec|test|match|search|replace|split)\()
(\.(?:exec|test|match|search|replace|split)\((?:\/(?:(?:(?!\\*\/).)|\\\\|\\\/|[^\\]\[(?:\\\\|\\\]|[^]])+\])+\/))
and the seventh which I originally forgot removes the html comments
(<!--(?:(?!-->).)*-->)
I had an issue with my dev environment issuing errors for a regex that broke a line, so I used the following solution
var ADW_GLOBALS = new Object
ADW_GLOBALS = {
quotations : /((["'])(?:(?:\\\\)|\\\2|(?!\\\2)\\|(?!\2).|[\n\r])*\2)/,
multiline_comment : /(\/\*(?:(?!\*\/).|[\n\r])*\*\/)/,
single_line_comment : /(\/\/[^\n\r]*[\n\r]+)/,
regex_literal : /(?:\/(?:(?:(?!\\*\/).)|\\\\|\\\/|[^\\]\[(?:\\\\|\\\]|[^]])+\])+\/)/,
html_comments : /(<!--(?:(?!-->).)*-->)/,
regex_of_doom : ''
}
ADW_GLOBALS.regex_of_doom = new RegExp(
'(?:' + ADW_GLOBALS.quotations.source + '|' +
ADW_GLOBALS.multiline_comment.source + '|' +
ADW_GLOBALS.single_line_comment.source + '|' +
'((?:=|:)\\s*' + ADW_GLOBALS.regex_literal.source + ')|(' +
ADW_GLOBALS.regex_literal.source + '[gimy]?\\.(?:exec|test|match|search|replace|split)\\(' + ')|(' +
'\\.(?:exec|test|match|search|replace|split)\\(' + ADW_GLOBALS.regex_literal.source + ')|' +
ADW_GLOBALS.html_comments.source + ')' , 'g'
);
changed_text = code_to_test.replace(ADW_GLOBALS.regex_of_doom, function(match, $1, $2, $3, $4, $5, $6, $7, $8, offset, original){
if (typeof $1 != 'undefined') return $1;
if (typeof $5 != 'undefined') return $5;
if (typeof $6 != 'undefined') return $6;
if (typeof $7 != 'undefined') return $7;
return '';
});
This returns anything captured by the quoted string text and anything found in a regex literal intact but returns an empty string for all the comment captures.
I know this is excessive and rather difficult to maintain but it does appear to work for me so far.
This works for almost all cases:
var RE_BLOCKS = new RegExp([
/\/(\*)[^*]*\*+(?:[^*\/][^*]*\*+)*\//.source, // $1: multi-line comment
/\/(\/)[^\n]*$/.source, // $2 single-line comment
/"(?:[^"\\]*|\\[\S\s])*"|'(?:[^'\\]*|\\[\S\s])*'/.source, // - string, don't care about embedded eols
/(?:[$\w\)\]]|\+\+|--)\s*\/(?![*\/])/.source, // - division operator
/\/(?=[^*\/])[^[/\\]*(?:(?:\[(?:\\.|[^\]\\]*)*\]|\\.)[^[/\\]*)*?\/[gim]*/.source
].join('|'), // - regex
'gm' // note: global+multiline with replace() need test
);
// remove comments, keep other blocks
function stripComments(str) {
return str.replace(RE_BLOCKS, function (match, mlc, slc) {
return mlc ? ' ' : // multiline comment (replace with space)
slc ? '' : // single-line comment
match; // divisor, regex, or string, return as-is
});
}
The code is based on regexes from jspreproc, I wrote this tool for the riot compiler.
See http://github.com/aMarCruz/jspreproc
In plain simple JS regex, this:
my_string_or_obj.replace(/\/\*[\s\S]*?\*\/|([^:]|^)\/\/.*$/gm, '$1')
a bit simpler -
this works also for multiline - (<!--.*?-->)|(<!--[\w\W\n\s]+?-->)
Simple regex ONLY for multi-lines:
/\*((.|\n)(?!/))+\*/
The accepted solution does not capture all common use cases. See examples here: https://regex101.com/r/38dIQk/1.
The following regular expression should match JavaScript comments more reliably:
/(?:\/\*(?:[^\*]|\**[^\*\/])*\*+\/)|(?:\/\/[\S ]*)/g
For demonstration, visit the following link: https://regex101.com/r/z99Nq5/1/.
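A small sketch of how the expression above behaves with String.prototype.match. The sample line and variable names are invented for illustration. Note that it matches purely on text, so comment-like sequences inside string literals would still be matched.

```javascript
// Regex copied from the answer above.
const JS_COMMENT_RE = /(?:\/\*(?:[^\*]|\**[^\*\/])*\*+\/)|(?:\/\/[\S ]*)/g;

// Made-up input with one block comment and one line comment.
const snippet = '/** doc **/ var a = 1; // tail';

// With the g flag, match() returns every matched comment as an array.
const found = snippet.match(JS_COMMENT_RE);
console.log(found);
```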
This is late to be of much use to the original question, but maybe it will help someone.
Based on @Ryan Wheale's answer, I've found this to work as a comprehensive capture to ensure that matches exclude anything found inside a string literal.
/(?:\r\n|\n|^)(?:[^'"])*?(?:'(?:[^\r\n\\']|\\'|[\\]{2})*'|"(?:[^\r\n\\"]|\\"|[\\]{2})*")*?(?:[^'"])*?(\/\*(?:[\s\S]*?)\*\/|\/\/.*)/g
The last group (all others are discarded) is based on Ryan's answer. Example here.
This assumes code is well structured and valid javascript.
Note: this has not been tested on poorly structured code which may or may not be recoverable depending on the javascript engine's own heuristics.
Note: this should hold for valid javascript < ES6, however, ES6 allows multi-line string literals, in which case this regex will almost certainly break, though that case has not been tested.
However, it is still possible to match something that looks like a comment inside a regex literal (see comments/results in the Example above).
I use the above capture after replacing all regex literals using the following comprehensive capture extracted from es5-lexer here and here, as referenced in Mike Samuel's answer to this question:
/(?:(?:break|case|continue|delete|do|else|finally|in|instanceof|return|throw|try|typeof|void|[+]|-|[.]|[/]|,|[*])|[!%&(:;<=>?[^{|}~])?(\/(?![*/])(?:[^\\\[/\r\n\u2028\u2029]|\[(?:[^\]\\\r\n\u2028\u2029]|\\(?:[^\r\n\u2028\u2029ux]|u[0-9A-Fa-f]{4}|x[0-9A-Fa-f]{2}))+\]|\\(?:[^\r\n\u2028\u2029ux]|u[0-9A-Fa-f]{4}|x[0-9A-Fa-f]{2}))*\/[gim]*)/g
For completeness, see also this trivial caveat.
If you click on the link below you find a comment removal script written in regex.
These are 112 lines of code that work together; it also works with MooTools, Joomla, Drupal and other CMS websites.
Tested on 800,000 lines of code and comments; it works fine.
This one also selects multiple parenthetical like ( abc(/nn/('/xvx/'))"// testing line") and comments that are between colons and protect them.
23-01-2016..! This is the code with the comments in it.!!!!
Click Here
I was looking for a quick Regex solution too, but none of the answers provided work 100%. Each one ends up breaking the source code in some way, mostly due to comments detected inside string literals. E.g.
var string = "https://www.google.com/";
Becomes
var string = "https:
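The breakage can be reproduced with a deliberately naive pattern. This is a sketch: NAIVE_COMMENT_RE is a simplified stand-in for the kind of regex being criticised, not any specific answer's expression.

```javascript
// Simplified, context-unaware comment regex (an assumption for illustration).
const NAIVE_COMMENT_RE = /\/\*[\s\S]*?\*\/|\/\/.*$/gm;

const line = 'var string = "https://www.google.com/";';

// The "//" inside the string literal is treated as the start of a comment.
const broken = line.replace(NAIVE_COMMENT_RE, '');
console.log(broken);
```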
For the benefit of those coming in from google, I ended up writing a short function (in Javascript) that achieves what the Regex couldn't do. Modify for whatever language you are using to parse Javascript.
function removeCodeComments(code) {
var inQuoteChar = null;
var inBlockComment = false;
var inLineComment = false;
var inRegexLiteral = false;
var newCode = '';
for (var i=0; i<code.length; i++) {
if (!inQuoteChar && !inBlockComment && !inLineComment && !inRegexLiteral) {
if (code[i] === '"' || code[i] === "'" || code[i] === '`') {
inQuoteChar = code[i];
}
else if (code[i] === '/' && code[i+1] === '*') {
inBlockComment = true;
}
else if (code[i] === '/' && code[i+1] === '/') {
inLineComment = true;
}
else if (code[i] === '/' && code[i+1] !== '/') {
inRegexLiteral = true;
}
}
else {
if (inQuoteChar && ((code[i] === inQuoteChar && code[i-1] != '\\') || (code[i] === '\n' && inQuoteChar !== '`'))) {
inQuoteChar = null;
}
if (inRegexLiteral && ((code[i] === '/' && code[i-1] !== '\\') || code[i] === '\n')) {
inRegexLiteral = false;
}
if (inBlockComment && code[i-1] === '/' && code[i-2] === '*') {
inBlockComment = false;
}
if (inLineComment && code[i] === '\n') {
inLineComment = false;
}
}
if (!inBlockComment && !inLineComment) {
newCode += code[i];
}
}
return newCode;
}
2019:
All other answers are incomplete and full of shortcomings. I took the time to write a complete answer that works.
function stripComments(code){
const savedText = [];
return code
.replace(/(['"`]).*?\1/gm,function (match) {
var i = savedText.push(match);
return (i-1)+'###';
})
// remove // comments
.replace(/\/\/.*/gm,'')
// now extract all regex and save them
.replace(/\/[^*\n].*\//gm,function (match) {
var i = savedText.push(match);
return (i-1)+'###';
})
// remove /* */ comments
.replace(/\/\*[\s\S]*?\*\//gm,'')
// remove <!-- --> comments
.replace(/<!--[\s\S]*?-->/gm, '')
.replace(/\d+###/gm,function(match){
var i = Number.parseInt(match);
return savedText[i];
})
}
var cleancode = stripComments(stripComments.toString())
console.log(cleancode)
Other answers do not work on sample code like this:
// won't execute the creative code ("Can't execute code form a freed script"),
navigator.userAgent.match(/\b(MSIE |Trident.*?rv:|Edge\/)(\d+)/);
function stripComments(code){
const savedText = [];
return code
// extract strings and regex
.replace(/(['"`]).*?\1/gm,function (match) {
savedText.push(match);
return '###';
})
// remove // comments
.replace(/\/\/.*/gm,'')
// now extract all regex and save them
.replace(/\/[^*\n].*\//gm,function (match) {
savedText.push(match);
return '###';
})
// remove /* */ comments
.replace(/\/\*[\s\S]*?\*\//gm,'')
// remove <!-- --> comments
.replace(/<!--[\s\S]*?-->/gm, '')
/*replace \ with \\ so we not lost \b && \t*/
.replace(/###/gm,function(){
return savedText.shift();
})
}
var cleancode = stripComments(stripComments.toString())
console.log(cleancode)
for /**/ and //
/(?:(?:\/\*(?:[^*]|(?:\*+[^*\/]))*\*+\/)|(?:(?<!\:|\\\|\')\/\/.*))/gm
I wonder if this was a trick question given by
a professor to students. Why? Because it seems
to me it is IMPOSSIBLE to do this, with
Regular Expressions, in the general case.
Your (or whoever's code it is) can contain
valid JavaScript like this:
let a = "hello /* ";
let b = 123;
let c = "world */ ";
Now if you have a regexp which removes everything
between a pair of /* and */, it would break the code
above, it would remove the executable code in the
middle as well.
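To make the failure concrete, here is a sketch that applies a plain block-comment regex (chosen for illustration) to the three lines above:

```javascript
// The three statements from the example, joined into one source string.
const code = [
  'let a = "hello /* ";',
  'let b = 123;',
  'let c = "world */ ";'
].join('\n');

// A regex that removes everything between a pair of /* and */.
const stripped = code.replace(/\/\*[\s\S]*?\*\//g, '');

// Everything from the first string to the third is gone, including "let b".
console.log(stripped);
```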
If you try to devise a regexp that would not
remove comments which contain quotes then
you cannot remove such comments. That applies
to single-quote, double-quotes and back-quotes.
You can not remove (all) comments with Regular
Expressions in JavaScript, it seems to me,
maybe someone can point out a way how to do
it for the case above.
What you can do is build a small parser which
goes through the code character by character
and knows when it is inside a string and when
it is inside a comment, and when it is inside
a comment inside a string and so on.
I'm sure there are good open source JavaScript
parsers that can do this. Maybe some of the
packaging and minifying tools can do this for
you as well.
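A rough sketch of such a character-by-character scanner follows. The function name stripCommentsByScanning is made up, and the sketch deliberately ignores regex literals and ${...} interpolation inside template literals, so it is an illustration of the idea rather than a complete solution.

```javascript
// Hypothetical helper: copies strings verbatim and skips comments.
// Assumption: no regex literals and no nested template interpolation.
function stripCommentsByScanning(code) {
  let out = '';
  let i = 0;
  while (i < code.length) {
    const ch = code[i];
    const next = code[i + 1];
    if (ch === '"' || ch === "'" || ch === '`') {
      // Inside a string: find the closing quote, honouring backslash escapes.
      let j = i + 1;
      while (j < code.length && code[j] !== ch) {
        if (code[j] === '\\') j++; // skip the escaped character
        j++;
      }
      out += code.slice(i, j + 1);
      i = j + 1;
    } else if (ch === '/' && next === '*') {
      // Block comment: jump past the closing */ (or to the end if unclosed).
      const end = code.indexOf('*/', i + 2);
      i = end === -1 ? code.length : end + 2;
    } else if (ch === '/' && next === '/') {
      // Line comment: jump to the newline, which is kept in the output.
      const end = code.indexOf('\n', i);
      i = end === -1 ? code.length : end;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const scanned = stripCommentsByScanning(
  'let a = "hello /* ";\nlet b = 123; // note\nlet c = "world */ ";\n/* gone */'
);
console.log(scanned);
```

Because the scanner knows when it is inside a string, the /* and */ in the two string literals are left untouched and "let b = 123;" survives.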
For block comment:
https://regex101.com/r/aepSSj/1
Matches slash character (the \1) only if slash character is followed by asterisk.
(\/)(?=\*)
then the asterisk itself (the lookahead above checks for it but does not consume it)
(?:\*)
followed, lazily, by zero or more of either the first group's match (a slash) or any other character; the alternation itself is non-capturing, but the whole run is captured as a group.
((?:\1|[\s\S])*?)
followed by an asterisk and the first group's match (the closing slash)
(?:\*)\1
For block and/or inline comment:
https://regex101.com/r/aepSSj/2
where | means "or" and (?=\/\/(.*)) captures anything after any //
or https://regex101.com/r/aepSSj/3
to capture the third part too
all in: https://regex101.com/r/aepSSj/8
DEMO: https://onecompiler.com/javascript/3y825u3d5
const context = `
<html>
<script type="module">
/* I'm a comment */
/*
* I'm a comment aswell url="https://example.com/";
*/
var re = /\\/*not a comment!*/;
var m = /\\//.test("\"not a comment!\"");
var re = /"/; // " thiscommentishandledasascode!
const s1 = "multi String \\
\\"double quote\\" \\
// single commet in str \\
/* multiple lines commet in str \\
secend line */ \\
last line";
const s2 = 's2"s';
const url = "https://example.com/questions/5989315/";
let a = "hello /* ";
let b = 123;
let c = "world */ ";
//public static final String LETTERS_WORK_FOLDER = "/Letters/Generated/Work";
console.log(/*comment in
console.log*/ "!message at console.log");
function displayMsg( // the end comment
/*commet arg1*/ a, ...args) {
console.log("Hello World!", a, ...args)
}
<\/script>
<body>
<!-- HTML Comment //--> or <!-- HTML Comment -->
<!--
function displayMsg() {
alert("Hello World!")
}
//-->
</body>
</html>
`;
console.log("before:\n" + context);
console.log("<".repeat(100));
const save = {'txt':[], 'comment':[], 'regex': []};
const context2 =
context.replace(/(['"`]|\/[\*\/]{0,1}|<!\-\-)(?:(?=(?<=\/\*))[\s\S]*?\*\/|(?=(?<=\/\/)).*|(?=(?<=<!\-\-))[\s\S]*?\-\->|(?=(?<=[\s\=]\/)).+?(?<!\\)\/|(?=(?<=['"`]))[\s\S]*?(?<!\\)\1)/g,
function (m) {
const t = (m[0].match(/["'`]/) && 'txt') || (m.match(/^(\/\/|\/\*|<)/) && 'comment') || 'regex';
save[t].push(m);
return '${save.'+t+'['+(save[t].length - 1)+']}';
}).replace(/[\S\s]*/, function(m) {
console.log("watch:\n"+m);
console.log(">".repeat(100));
/*
##remove comment
save.comment = save.comment.map(_ => _.replace(/[\S\s]+/,""));
##replace comment
save.comment = save.comment.map(_ => _.replace(/console\.log/g, 'CONSOLE.LOG'));
##replace text
save.txt = save.txt.map(_ => _.replace(/console\.log/g, 'CONSOLE.LOG'));
##replace your code
m = m.replace(/console\.log/g, 'console.warn');
*/
// console.warn("##remove comment -> save.comment.fill('');");
save.comment.fill('');
return m;
}).replace(/\$\{save.(\w+)\[(\d+)\]\}/g, function(m, t, id) {
return save[t][id];
}).replace(/[\S\s]*/, function(m) {
console.log("result:", m);
// console.log("compare:", (context === m));
return m;
})
My English is not good, can someone help translate what I have written, I will be very grateful
Consider some problems
A.There may be strings in comments, or comments in strings, like
/*
const url="https://example.com/";
*/
const str = "i am s string and /*commet in string*/";
B. " or ' or ` in a string will be escaped with \
like
const str = "my name is \"john\"";
const str2 = 'i am "john\'s" friend';
Combining the multiple regex replaces above will cause some problems
Consider a regex that finds the opening part, which is one of
" ' ` // /* <!--
use the regex
(['"`]|\/[\*\/]|<!\-\-)
(['"`]|/[*/]|<!\-\-) result as \1
\1 is one of ' or " or ` or /* or // or <!--
use If-Then-Else Conditionals in Regular Expressions
https://www.regular-expressions.info/conditional.html
(?:(?=(?<=\/\*))[\s\S]*?\*\/|(?=(?<=\/\/)).*|(?=(?<=<!\-\-))[\s\S]*?\-\->|[^\1]*?(?<!\\)\1)
if (?=(?<=\/\*))[\s\S]*?\*\/
(?=(?<=\/\*)) uses the positive lookbehind (?<=\/\*) because the opener was /*
It is a multi-line comment, so it should run to the nearest following */
[\s\S]*?\*\/ matches the complete /*..\n..\n. */
elseif (?=(?<=\/\/)).*
(?=(?<=//)).* uses a positive lookbehind
(?<=\/\/) catches a // single-line comment
.* matches the complete // single-line comment
elseif (?=(?<=<!\-\-))[\s\S]*?\-\->
(?=(?<=<!--)) positive lookbehind (?<=<!\-\-) ,
[\s\S]*?\-\-> match complete
<!--..\n..\n. /*/*\-\->
else [^\1]*?(?<!\\)\1
Finally, the string case needs to be processed
using the regex [\s\S]*?\1
this may give the wrong result with "STR\" or 'STR"S\'
so after [\s\S]*? we can use a negative lookbehind:
adding (?<!\\) gives [\s\S]*?(?<!\\)\1, which skips escaped quotes
end
Based on above attempts and using UltraEdit , mostly Abhishek Simon, I found this to work for inline comments and handles all of the characters within the comment.
(\s\/\/|$\/\/)[\w\s\W\S.]*
This matches comments at the start of the line or with a space before //
//public static final String LETTERS_WORK_FOLDER =
"/Letters/Generated/Work";
but not
"http://schemas.us.com.au/hub/'>" +
so it is only not good for something like
if(x){f(x)}//where f is some function
it just needs to be
if(x){f(x)} //where f is function

Case insensitive string replacement in JavaScript?

I need to highlight, case insensitively, given keywords in a JavaScript string.
For example:
highlight("foobar Foo bar FOO", "foo") should return "<b>foo</b>bar <b>Foo</b> bar <b>FOO</b>"
I need the code to work for any keyword, and therefore using a hardcoded regular expression like /foo/i is not a sufficient solution.
What is the easiest way to do this?
(This an instance of a more general problem detailed in the title, but I feel that it's best to tackle with a concrete, useful example.)
You can use regular expressions if you prepare the search string. In PHP e.g. there is a function preg_quote, which replaces all regex-chars in a string with their escaped versions.
Here is such a function for javascript (source):
function preg_quote (str, delimiter) {
// discuss at: https://locutus.io/php/preg_quote/
// original by: booeyOH
// improved by: Ates Goral (https://magnetiq.com)
// improved by: Kevin van Zonneveld (https://kvz.io)
// improved by: Brett Zamir (https://brett-zamir.me)
// bugfixed by: Onno Marsman (https://twitter.com/onnomarsman)
// example 1: preg_quote("$40")
// returns 1: '\\$40'
// example 2: preg_quote("*RRRING* Hello?")
// returns 2: '\\*RRRING\\* Hello\\?'
// example 3: preg_quote("\\.+*?[^]$(){}=!<>|:")
// returns 3: '\\\\\\.\\+\\*\\?\\[\\^\\]\\$\\(\\)\\{\\}\\=\\!\\<\\>\\|\\:'
return (str + '')
.replace(new RegExp('[.\\\\+*?\\[\\^\\]$(){}=!<>|:\\' + (delimiter || '') + '-]', 'g'), '\\$&')
}
So you could do the following:
function highlight(str, search) {
return str.replace(new RegExp("(" + preg_quote(search) + ")", 'gi'), "<b>$1</b>");
}
function highlightWords( line, word )
{
var regex = new RegExp( '(' + word + ')', 'gi' );
return line.replace( regex, "<b>$1</b>" );
}
You can enhance the RegExp object with a function that does special character escaping for you:
RegExp.escape = function(str)
{
var specials = /[.*+?|()\[\]{}\\$^]/g; // .*+?|()[]{}\$^
return str.replace(specials, "\\$&");
}
Then you would be able to use what the others suggested without any worries:
function highlightWordsNoCase(line, word)
{
var regex = new RegExp("(" + RegExp.escape(word) + ")", "gi");
return line.replace(regex, "<b>$1</b>");
}
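Putting the escaping and the case-insensitive replace together, here is a self-contained sketch that avoids modifying the global RegExp object. The names escapeRegExp and highlightEscaped are illustrative only, and the character class is one common choice of characters to escape, not an authoritative list.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every case-insensitive occurrence of keyword in <b>...</b>.
function highlightEscaped(line, keyword) {
  const re = new RegExp('(' + escapeRegExp(keyword) + ')', 'gi');
  return line.replace(re, '<b>$1</b>');
}

console.log(highlightEscaped('foobar Foo bar FOO', 'foo'));
// A keyword with regex metacharacters is matched literally.
console.log(highlightEscaped('Is C++ like c++?', 'c++'));
```

Without the escaping step, a keyword such as c++ would be an invalid pattern and the RegExp constructor would throw.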
Regular expressions are fine as long as keywords are really words, you can just use a RegExp constructor instead of a literal to create one from a variable:
var re= new RegExp('('+word+')', 'gi');
return s.replace(re, '<b>$1</b>');
The difficulty arises if ‘keywords’ can have punctuation in, as punctuation tends to have special meaning in regexps. Unfortunately unlike most other languages/libraries with regexp support, there is no standard function to escape punctuation for regexps in JavaScript.
And you can't be totally sure exactly what characters need escaping because not every browser's implementation of regexp is guaranteed to be exactly the same. (In particular, newer browsers may add new functionality.) And backslash-escaping characters that are not special is not guaranteed to still work, although in practice it does.
So about the best you can do is one of:
attempting to catch each special character in common browser use today [add: see Sebastian's recipe]
backslash-escape all non-alphanumerics. care: \W will also match non-ASCII Unicode characters, which you don't really want.
just ensure that there are no non-alphanumerics in the keyword before searching
If you are using this to highlight words in HTML which already has markup in, though, you've got trouble. Your ‘word’ might appear in an element name or attribute value, in which case attempting to wrap a <b> around it will cause brokenness. In more complicated scenarios possibly even an HTML-injection to XSS security hole. If you have to cope with markup you will need a more complicated approach, splitting out ‘< ... >’ markup before attempting to process each stretch of text on its own.
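One way to sketch the "split out the markup first" idea is a single replace whose pattern matches either a whole tag or a run of text, so that only the text runs are highlighted. The function name highlightOutsideTags is hypothetical, and this is not an HTML parser: an attribute value containing > would still confuse it.

```javascript
// Hypothetical helper: highlight keyword only in text between tags.
function highlightOutsideTags(html, keyword) {
  // Escape regex metacharacters so the keyword is matched literally.
  const escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const re = new RegExp(escaped, 'gi');
  // Each chunk is either a complete <...> tag or a stretch of plain text.
  return html.replace(/<[^>]*>|[^<]+/g, function (chunk) {
    if (chunk.charAt(0) === '<') {
      return chunk; // markup: leave untouched
    }
    return chunk.replace(re, '<b>$&</b>'); // text: wrap each match
  });
}

const marked = highlightOutsideTags('<a class="bold">Bold move</a>', 'bold');
console.log(marked);
```

The word inside the class attribute is left alone, while the visible text is wrapped.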
What about something like this:
if(typeof String.prototype.highlight !== 'function') {
String.prototype.highlight = function(match, spanClass) {
var pattern = new RegExp( match, "gi" );
var replacement = "<span class='" + spanClass + "'>$&</span>";
return this.replace(pattern, replacement);
}
}
This could then be called like so:
var result = "The Quick Brown Fox Jumped Over The Lazy Brown Dog".highlight("brown","text-highlight");
For those poor with disregexia or regexophobia:
function replacei(str, sub, f){
let A = str.toLowerCase().split(sub.toLowerCase());
let B = [];
let x = 0;
for (let i = 0; i < A.length; i++) {
let n = A[i].length;
B.push(str.substr(x, n));
if (i < A.length-1)
B.push(f(str.substr(x + n, sub.length)));
x += n + sub.length;
}
return B.join('');
}
s = 'Foo and FOO (and foo) are all -- Foo.'
t = replacei(s, 'Foo', sub=>'<'+sub+'>')
console.log(t)
Output:
<Foo> and <FOO> (and <foo>) are all -- <Foo>.
Why not just create a new regex on each call to your function? You can use:
new RegExp([pat], [flags])
where [pat] is a string for the pattern, and [flags] are the flags.
