I want to replace a bad word with asterisks (***). However, when the bad word is contained in another word, I don't want to replace it.
for(var i = 0; i < forbidden.length; i++) {
if(textBoxValue.search(forbidden[i]) > -1) {
textBoxValue = textBoxValue.replace(forbidden[i], '');
}
}
For example, if the bad word is "are" and it appears inside another word like "aren't", I don't want it to become "***n't". I only want to replace the word when it stands by itself.
One option is to use a regular expression with a word boundary on each side, to ensure that a matched word is standalone:
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(new RegExp('\\b' + word + '\\b', 'g'), '');
});
For example:
let textBoxValue = 'bigwordfoo foo bar barbaz';
const forbidden = ['foo', 'bar'];
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(new RegExp('\\b' + word + '\\b', 'g'), '');
});
console.log(textBoxValue);
If you actually want to replace with asterisks, and not the empty string, use a replacer function instead:
let textBoxValue = 'bigwordfoo foo bar barbaz';
const forbidden = ['foo', 'bar'];
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(
new RegExp('\\b' + word + '\\b', 'g'),
word => '*'.repeat(word.length)
);
});
console.log(textBoxValue);
Of course, note that word restrictions are generally pretty easy to overcome by anyone who really wants to. Humans can almost always come up with ways to fool heuristics.
If any of the words to blacklist contain characters that are special in a regular expression, escape them first before passing them to new RegExp:
const escape = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
let textBoxValue = 'bigwordfoo foo ba$r ba$rbaz';
const forbidden = ['foo', 'ba$r'];
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(
new RegExp('\\b' + escape(word) + '\\b', 'g'),
word => '*'.repeat(word.length)
);
});
console.log(textBoxValue);
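One caveat with \b: it only matches between a word character and a non-word character, so a forbidden entry that itself begins or ends with a symbol (say "$foo") will never match with \b on that side. Below is a minimal sketch of one workaround using lookarounds instead of \b. It assumes an engine with lookbehind support (ES2018+), and the helper names escapeRegExp and maskWord are made up for illustration:

```javascript
// Hypothetical helpers, not part of the answer above.
const escapeRegExp = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

function maskWord(text, word) {
  // (?<!\w) and (?!\w) require that no word character touches the match,
  // which also works when the word itself starts or ends with a symbol.
  const re = new RegExp('(?<!\\w)' + escapeRegExp(word) + '(?!\\w)', 'g');
  return text.replace(re, m => '*'.repeat(m.length));
}

console.log(maskWord('pay $foo now, not x$foo', '$foo')); // pay **** now, not x$foo
console.log(maskWord("are aren't", 'are'));               // *** aren't
```

For plain alphanumeric words this behaves the same as the \b version, so it is only worth the extra syntax if the list can contain symbols.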
You can create a dynamic regex with all the forbidden words separated by a | to create an alternation. You can wrap this with word boundary (\b) to replace only full word matches.
For the following list of forbidden words, the dynamic regex ends up being
/\b(?:bad|nasty|dreadful)\b/g
The second parameter to replace can be a function that receives the matched word as its argument. You can use repeat to get * repeated as many times as the length of the word being replaced:
function replaceBadWords(textBoxValue, forbidden) {
const regex = new RegExp(`\\b(?:${forbidden.join('|')})\\b`, 'g')
return textBoxValue.replace(regex, m => "*".repeat(m.length))
}
const forbidden = ['bad', 'nasty', 'dreadful']
console.log(replaceBadWords('string with some nasty words in it', forbidden))
console.log(replaceBadWords("bad gets replaced with asterisks but badminton won't", forbidden))
If you're not yet using a library (or if you want to use one), you can check out the bad-words package.
First, it already ships with a list of bad words, so you don't need to come up with them yourself and worry about what you missed.
It supports placeholders:
var Filter = require('bad-words');
var customFilter = new Filter({ placeHolder: 'x'});
customFilter.clean("Don't be an ash0le"); // Don't be an xxxxxx
You can also add your own bad words to the list or remove existing ones:
var filter = new Filter();
// add to list
filter.addWords('some', 'bad', 'word');
// remove from list
filter.removeWords('hells', 'sadist');
It also supports multiple languages, provided you supply the correct regex.
I have a regex pattern that works fine in regex101.com: ~<a .*?">(*SKIP)(*FAIL)|\bword\b
I am trying to make it a Regexp so it can be used in the replace() function in JavaScript.
The line of JavaScript code is:
var regex = new RegExp("~<a.*?\">(*SKIP)(*FAIL)|\\b"+ word + "\\b", 'g');
Where word is the word I'm trying to match.
When I run it though, the console shows the following error:
Uncaught (in promise) SyntaxError: Invalid regular expression:
/~<a.*?">(*SKIP)(*FAIL)|word/: Nothing to repeat
Am I escaping characters wrong?
I tried backslash-escaping every special character I could find (?, *, < and so on) in my JavaScript code and it still spat out that error.
You can work around the missing (*SKIP)(*FAIL) support in JavaScript using capturing groups in the pattern and a bit of code logic.
Note the (*SKIP)(*FAIL) verb sequence is explained in my YT video called "Skipping matches in specific contexts (with SKIP & FAIL verbs)". You can also find a demo of JavaScript lookarounds for four different scenarios: extracting, replacing, removing and splitting.
Let's adjust the code for the current question. Let's assume word always consists of word characters (digits, letters or underscores).
Extracting: Capture the word into Group 1 and only extract Group 1 values:
const text = `foo foo foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`<a .*?">|\b(${word})\b`, 'gi');
console.log(Array.from(text.matchAll(regex), x=>x[1]).filter(Boolean)); // => ["foo", "foo"]
Removing: Capture the context you need to keep into Group 1 and replace with a backreference to this group:
const text = `foo foo foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
console.log(text.replace(regex, '$1')); // => "  foobar" (both standalone words removed, spaces kept)
Replacing: Capture the context you need to keep into Group 1 and when it is used, replace with Group 1 value, else, replace with what you need in a callback function/arrow function used as the replacement argument:
const text = `foo foo foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
console.log(text.replace(regex, (match, group1) => group1 || 'buz' ));
// => buz buz foobar
Splitting: This is the most intricate scenario and it requires a bit more coding:
const text = `foo foo foobar`;
const word = 'foo';
const regex = new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
let m, res = [], offset = 0;
while (m = regex.exec(text)) { // If there is a match and...
if (m[1] === undefined) { // if Group 1 is not matched
// put the substring to result array
res.push(text.substring(offset, m.index)) // Put the value to array
offset = m.index + m[0].length // Set the new chunk start position
}
}
if (offset < text.length) { // If there is any more text after offset
res.push(text.substr(offset)) // add it to the result array
}
console.log(res);
// => ["", " ", " foobar"]
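The sample text in these snippets contains no anchor element, so the "skip" alternative never fires. Here is a minimal sketch of the replacing case on a string that does contain one. The HTML snippet and the replaceOutsideAnchors name are assumptions for illustration, and word is still assumed to consist of word characters only:

```javascript
// Hypothetical wrapper around the "Replacing" approach shown above.
function replaceOutsideAnchors(text, word, replacement) {
  // Group 1 captures an opening <a ...> tag that ends in ">,
  // which is the context to keep unchanged.
  const regex = new RegExp(String.raw`(<a .*?">)|\b${word}\b`, 'gi');
  // If Group 1 matched, put it back as is; otherwise replace the word.
  return text.replace(regex, (match, group1) => group1 || replacement);
}

const html = 'foo <a href="/foo">foo</a> foobar';
console.log(replaceOutsideAnchors(html, 'foo', 'buz'));
// => buz <a href="/foo">buz</a> foobar
```

Note that only the opening tag is protected: the "foo" in the href survives, while the link text between the tags is still replaced, which mirrors what the original (*SKIP)(*FAIL) pattern does.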
Is it possible to have a dynamically generated string
var string= "a,b,c";
and make a regex out of it that accepts any of the string's words as a match?
regExApproved = new RegExp(string, 'ig');
So that we can search a data-attribute of an element and show it if it has any of the accepted matches in the attribute
$(this).attr("data-payment").match(regExApproved)
The problem I'm facing is that the regex takes the whole string as the only pattern to match.
I need the regex to break the string down and accept ANY word in it as a match.
Is this possible?
I do not want to use a for loop and build LOTS of different regexes, one for each word in the string, but maybe I have to?
I suggest using
var regExApproved = new RegExp(string.split(",").map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), 'i');
Then, to check if a regex matches a string, it makes more sense to use RegExp#test method:
regExApproved.test($(this).attr("data-payment"))
Notes:
.split(",") - splits into comma-separated chunks
.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')) (or .map(function(x) { return x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'); })) - escapes the alternatives
.join('|') creates the final alternation pattern.
Edit
It seems that your string is an array of values. In that case, use
var string= ['visa','mastercard'];
var regExApproved = new RegExp(string.join('|'), 'i');
// If the items must be matched as whole words:
// var regExApproved = new RegExp('\\b(?:' + string.join('|') + ')\\b', 'i');
// If the array items contain special chars:
// var regExApproved = new RegExp(string.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), 'i');
console.log(regExApproved.test("There is MasterCard here"));
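Putting the pieces together for the comma-separated case, here is a rough sketch. The buildAnyWordRegex name is made up, and trimming whitespace and dropping empty entries are my own additions, not something the question asked for:

```javascript
// Hypothetical helper: turns "a,b,c" into /\b(?:a|b|c)\b/i
function buildAnyWordRegex(csv) {
  const alternatives = csv
    .split(',')
    .map(s => s.trim())          // tolerate "visa, mastercard"
    .filter(Boolean)             // drop empty entries such as a trailing comma
    .map(s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')); // escape metacharacters
  // No 'g' flag: RegExp#test on a global regex keeps state in lastIndex.
  return new RegExp('\\b(?:' + alternatives.join('|') + ')\\b', 'i');
}

const approved = buildAnyWordRegex('visa, mastercard ,');
console.log(approved.test('There is MasterCard here')); // true
console.log(approved.test('Visage'));                   // false
```

With jQuery this would be used as approved.test($(this).attr("data-payment")).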
If your string is a list of comma-separated words, you can split it by comma then use a combination of Array methods to check if any word is matched inside the test string.
This is how your code should look:
string.split(",").map(s => new RegExp('\\b' + s + '\\b')).some(r => $(this).attr("data-payment").match(r))
Where we have:
string.split(",") to split the string by comma.
.map(s => new RegExp('\\b' + s + '\\b')) to return a relevant regex for each word in the string using word-boundary.
.some(r => $(this).attr("data-payment").match(r)) to check if our string matches any one of the created regexes.
Demo:
var string = "is,dog,car";
let str = "this is just a test";
if (string.split(",").map(s => new RegExp('\\b' + s + '\\b')).some(r => str.match(r))) {
console.log("Matched");
}
I have the following case that I am trying to solve.
Javascript Method that highlights keywords in a phrase.
vm.highlightKeywords = (phrase, keywords) => {
keywords = keywords.split(' ');
let highlightedFrase = phrase;
angular.forEach(keywords, keyword => {
highlightedFrase = highlightedFrase.replace(new RegExp(keyword + "(?![^<])*?>)(<\/[a-z]*>)", "gi"), function(match) {
return '<span class="highlighted-search-text">' + match + '</span>';
});
});
return $sce.trustAsHtml(highlightedFrase)
}
How can I write a regular expression that will match this case so that I can replace the substrings
keywords = 'temperature high'
phrase = 'The temperature is <span class="highlight">hig</span>h'
ReGex Case
https://regex101.com/r/V8o6gN/5
If I'm not mistaken, you basically want to find each word from your keywords variable in your string so you can wrap the matches in a span.
You'll want to first turn your keywords into a RegExp, then do a global match. Something like this:
const keywordsString = "cake pie cookies";
const keywords = keywordsString.split(/\s/);
// equivalent to: /(cake|pie|cookies)/g
const pattern = new RegExp(`(${keywords.join('|')})`, 'g');
const phrase = "I like cake, pie and cookies";
const result = phrase.replace(pattern, match => `<span>${match}</span>`);
console.log(result);
Basically, you want a pattern where your keywords are pipe (|) separated and wrapped in parentheses (()). Then you just want to do a global search (g flag) so you match all of them.
With the global flag, there is no need to do a loop. You can get them all in one shot.
Based on @samanime's answer.
I remove duplicates, trim spaces and highlight longer words first.
The only remaining problem is matches that span an element boundary, for example ["elem", "lemen"] in "The element".
const tokens = [...new Set(state.highlight.split(' '))]
.map(s => s.trim())
.filter(s => s.length)
.sort((a,b) => b.length - a.length);
const pattern = new RegExp(`(${tokens.join('|')})`, 'ig');
const highlighted = this.state.value.replace(pattern, match => `<b>${match}</b>`);
console.log(highlighted);
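Neither snippet escapes the tokens, so a search term like "c++" would throw when passed to new RegExp. Here is a self-contained sketch of the same idea with escaping added. The highlight function name and its parameters are placeholders rather than part of the answers above:

```javascript
// Hypothetical standalone version of the highlighting logic.
function highlight(text, query) {
  const tokens = [...new Set(query.split(/\s+/))]   // split on whitespace, dedupe
    .filter(s => s.length)                          // drop empty strings
    .sort((a, b) => b.length - a.length)            // longer tokens first
    .map(s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')); // escape metacharacters
  if (tokens.length === 0) return text; // an empty alternation would match everywhere
  const pattern = new RegExp(`(${tokens.join('|')})`, 'ig');
  return text.replace(pattern, match => `<b>${match}</b>`);
}

console.log(highlight('C++ and C', 'c++ c'));
// => <b>C++</b> and <b>C</b>
```

Sorting longer tokens first matters because alternation tries the alternatives left to right, so "c" listed before "c++" would win and leave the "++" unhighlighted.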
I want to add a (variable) tag to values with regex, the pattern works fine with PHP but I have troubles implementing it into JavaScript.
The pattern is (value is the variable):
/(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is
I escaped the backslashes:
var str = $("#div").html();
var regex = "/(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b/is";
$("#div").html(str.replace(regex, "" + value + ""));
But this does not seem to be right. I logged the pattern and it's exactly what it should be.
Any ideas?
To create the regex from a string, you have to use JavaScript's RegExp object.
If you also want to match/replace more than one time, then you must add the g (global match) flag. Here's an example:
var stringToGoIntoTheRegex = "abc";
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
In the general case, escape the string before using it as a regex:
Not every string is a valid regex, though: there are some special characters, like ( or [. To work around this, escape the string before turning it into a regex. A utility function for that is in the sample below:
function escapeRegExp(stringToGoIntoTheRegex) {
return stringToGoIntoTheRegex.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}
var stringToGoIntoTheRegex = escapeRegExp("abc"); // this is the only change from above
var regex = new RegExp("#" + stringToGoIntoTheRegex + "#", "g");
// at this point, the line above is the same as: var regex = /#abc#/g;
var input = "Hello this is #abc# some #abc# stuff.";
var output = input.replace(regex, "!!");
alert(output); // Hello this is !! some !! stuff.
Note: the regex in the question uses the s modifier, which did not exist in JavaScript at the time of the question. JavaScript does support an s (dotAll) flag today.
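To illustrate what the s flag changes, here is a tiny sketch, assuming an ES2018+ engine (the variable names are arbitrary):

```javascript
// Without the s flag, "." does not match line terminators.
const withoutS = /a.b/;
const withS = /a.b/s;

console.log(withoutS.test("a\nb")); // false
console.log(withS.test("a\nb"));    // true

// The flag can also be passed to the RegExp constructor.
const viaConstructor = new RegExp("a.b", "is");
console.log(viaConstructor.dotAll); // true
```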
If you are trying to use a variable value in the expression, you must use the RegExp "constructor".
var regex = "(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b";
new RegExp(regex, "is")
I found I had to double slash the \b to get it working. For example to remove "1x" words from a string using a variable, I needed to use:
str = "1x";
var regex = new RegExp("\\b"+str+"\\b","g"); // same as inv.replace(/\b1x\b/g, "")
inv=inv.replace(regex, "");
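The reason for the double backslash, as a short sketch (the sample input is made up): inside an ordinary string literal "\b" is the backspace character, so the regex engine never sees a word-boundary token unless the backslash itself is escaped or String.raw is used.

```javascript
// In a string literal, "\b" is the backspace character (U+0008),
// while "\\b" is a backslash followed by the letter b.
console.log("\b".length, "\\b".length); // 1 2

const word = "1x";
const broken = new RegExp("\b" + word + "\b", "g");    // looks for backspace characters
const working = new RegExp("\\b" + word + "\\b", "g"); // word boundaries
const raw = new RegExp(String.raw`\b${word}\b`, "g");  // same pattern via String.raw

const sample = "2 1x 31x";
console.log(sample.replace(broken, ""));  // "2 1x 31x" (nothing matched)
console.log(sample.replace(working, "")); // "2  31x"
console.log(raw.source === working.source); // true
```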
You don't need the " to define a regular expression so just:
var regex = /(?!(?:[^<]+>|[^>]+<\/a>))\b(value)\b/is; // this is valid syntax
If value is a variable and you want a dynamic regular expression then you can't use this notation; use the alternative notation.
String.replace also accepts strings as input, so you can do "fox".replace("fox", "bear");
Alternative:
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(value)\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(" + value + ")\\b", "is");
var regex = new RegExp("(?!(?:[^<]+>|[^>]+<\\/a>))\\b(.*?)\\b", "is");
Keep in mind that if value contains regular expression metacharacters like (, [ and ? you will need to escape them. Also note that the string passed to new RegExp must not include the surrounding / delimiters, and backslashes inside it must be doubled.
I found this thread useful - so I thought I would add the answer to my own problem.
I wanted to edit a database configuration file (datastax cassandra) from a node application in javascript and for one of the settings in the file I needed to match on a string and then replace the line following it.
This was my solution.
const fs = require('fs');
const dse_cassandra_yaml = '/etc/dse/cassandra/cassandra.yaml';
// a) find the searchString and grab all text on the following line to it
// b) replace all next line text with a newString supplied to function
// note - leaves searchString text untouched
function replaceStringNextLine(file, searchString, newString) {
fs.readFile(file, 'utf-8', function(err, data){
if (err) throw err;
// need to use double escape '\\' when putting regex in strings !
var re = "\\s+(\\-\\s(.*)?)(?:\\s|$)";
var myRegExp = new RegExp(searchString + re, "g");
var match = myRegExp.exec(data);
var replaceThis = match[1];
var writeString = data.replace(replaceThis, newString);
fs.writeFile(file, writeString, 'utf-8', function (err) {
if (err) throw err;
console.log(file + ' updated');
});
});
}
const searchString = "data_file_directories:";
const newString = "- /mnt/cassandra/data";
replaceStringNextLine(dse_cassandra_yaml, searchString, newString );
After running, it will change the existing data directory setting to the new one:
config file before:
data_file_directories:
- /var/lib/cassandra/data
config file after:
data_file_directories:
- /mnt/cassandra/data
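For anyone who wants to try the regex part without touching the file system, here is a rough sketch that works on a string. The replaceNextLine name and the sample YAML are made up for illustration. It also differs from the code above in two ways: it escapes searchString, and it replaces the matched line in place rather than replacing the first occurrence of the captured text anywhere in the file.

```javascript
// Hypothetical pure-string variant of the approach above.
function replaceNextLine(data, searchString, newString) {
  const escaped = searchString.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
  // Group 1 keeps the key line plus the whitespace leading up to the list item;
  // "-\s.*" then consumes the rest of that line ("." stops at the newline).
  const re = new RegExp('(' + escaped + '\\s+)-\\s.*');
  // A function replacement avoids "$" in newString being read as a group reference.
  return data.replace(re, (match, keep) => keep + newString);
}

const yaml = 'data_file_directories:\n    - /var/lib/cassandra/data\ncommitlog_directory: /x\n';
console.log(replaceNextLine(yaml, 'data_file_directories:', '- /mnt/cassandra/data'));
```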
Much easier way: use template literals.
var variable = 'foo'
var expression = `.*${variable}.*`
var re = new RegExp(expression, 'g')
re.test('fdjklsffoodjkslfd') // true
re.test('fdjklsfdjkslfd') // false
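One thing to watch for when combining the g flag with test: a global regex remembers where its last match ended (lastIndex), so calling test repeatedly on the same regex object can give surprising results. A small sketch with made-up variable names:

```javascript
const variable = 'foo';

const globalRe = new RegExp(variable, 'g');
console.log(globalRe.test('foo bar')); // true, lastIndex is now 3
console.log(globalRe.test('foo bar')); // false, the search resumed at index 3
console.log(globalRe.test('foo bar')); // true again, lastIndex was reset to 0

// Without the g flag, test always starts from the beginning.
const plainRe = new RegExp(variable);
console.log(plainRe.test('foo bar')); // true
console.log(plainRe.test('foo bar')); // true
```

If the regex is only used for yes/no checks, leaving out the g flag sidesteps the issue.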
Using string variable(s) content as part of a more complex composed regex expression (es6|ts)
This example will replace all urls using my-domain.com to my-other-domain (both are variables).
You can build dynamic regexes by combining string values and other regex fragments within a raw string template. Using String.raw stops JavaScript from processing backslash escapes in the template's literal text, so you can write the pattern as you would in a regex literal.
// Strings with some data
const domainStr = 'my-domain.com'
const newDomain = 'my-other-domain.com'
// Make sure your string is regex friendly
// This escapes each dot as '\.'.
const regexUrl = /\./gm;
const substr = `\\\.`;
const domain = domainStr.replace(regexUrl, substr);
// domain is a regex friendly string: 'my-domain\.com'
console.log('Regex expression for domain', domain)
// HERE!!! You can assemble a complex regex using string pieces.
const re = new RegExp( String.raw `([\'|\"]https:\/\/)(${domain})(\S+[\'|\"])`, 'gm');
// now I'll use the regex expression groups to replace the domain
const domainSubst = `$1${newDomain}$3`;
// const page contains all the html text
const result = page.replace(re, domainSubst);
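Since page is not defined in the snippet, here is a runnable sketch of the same idea on a small sample string. The swapDomain and escapeRegExp names and the sample HTML are assumptions. The sketch escapes all regex metacharacters rather than only dots, and uses a lazy \S*? so the match stops at the first closing quote.

```javascript
// Hypothetical helpers for illustration.
const escapeRegExp = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

function swapDomain(page, oldDomain, newDomain) {
  // Group 1: opening quote plus scheme; group 2: rest of the URL up to the closing quote.
  const re = new RegExp(
    String.raw`(['"]https:\/\/)${escapeRegExp(oldDomain)}(\S*?['"])`,
    'g'
  );
  return page.replace(re, (match, before, after) => before + newDomain + after);
}

const samplePage =
  '<a href="https://my-domain.com/docs">docs</a> <a href="https://my-domainxcom/">x</a>';
console.log(swapDomain(samplePage, 'my-domain.com', 'my-other-domain.com'));
// => <a href="https://my-other-domain.com/docs">docs</a> <a href="https://my-domainxcom/">x</a>
```

The second link is left alone because the escaped dot no longer matches an arbitrary character.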
note: Don't forget to use regex101.com to create, test and export REGEX code.
var string = "Hi welcome to stack overflow"
var toSearch = "stack"
//case insensitive search
// search returns the index of the first match, or -1 if there is none (index 0 is a valid match)
var result = string.search(new RegExp(toSearch, "i")) >= 0 ? 'Matched' : 'notMatched'
https://jsfiddle.net/9f0mb6Lz/
Hope this helps