Regex (or not) replace all matches with dynamic variables - javascript

Let's say that I have the following string taken from user input:
The ##firstvar## went to the ##secondvar## with the ##thirdvar##.
Where the values "firstvar" "secondvar" and "thirdvar" all came from user input as well, so they aren't known before runtime.
Is there a way to replace all the matches between sets of "##" with a corresponding cached variable?
Say for example I have these values cached:
cache[firstvar] = "dog"
cache[secondvar] = "river"
cache[thirdvar] = "cat"
I want the final output string to be:
The dog went to the river with the cat.
I've tried regex replace but can't figure it out when the replacements are dynamic like this.

You can replace them by using a function as second argument in String.prototype.replace().
const cache = { firstvar: "dog", secondvar: "river", thirdvar: "cat" },
text = "The ##firstvar## went to the ##secondvar## with the ##thirdvar##.",
regex = /##(.*?)##/g;
console.log( text.replace(regex, (_match, group1) => cache[group1]) );
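One thing the one-liner above does not cover is a placeholder with no cached value: cache[group1] is then undefined, and replace() inserts the text "undefined". A minimal sketch of one way to handle that, assuming unknown placeholders should simply be left untouched (fillPlaceholders and cachedValues are hypothetical names, not from the answer):

```javascript
// Hypothetical helper: replaces ##key## with cache[key], leaving unknown keys as-is.
function fillPlaceholders(text, cache) {
  return text.replace(/##(.*?)##/g, (match, key) =>
    // Only own properties count, so a key like "toString" is not
    // accidentally resolved from Object.prototype.
    Object.prototype.hasOwnProperty.call(cache, key) ? String(cache[key]) : match
  );
}

// "thirdvar" is deliberately missing from this example cache.
const cachedValues = { firstvar: "dog", secondvar: "river" };
console.log(
  fillPlaceholders("The ##firstvar## went to the ##secondvar## with the ##thirdvar##.", cachedValues)
);
// The dog went to the river with the ##thirdvar##.
```

Returning the original match for unknown keys is only one option; returning an empty string or throwing would work the same way inside the replacer function.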

Related

Replace text but not if contain specific characters?

In JavaScript, I am using the below code to replace text that matches a certain string. The replacement wraps the string like this: "A(hello)". It works great but if there are two strings that are the same, for example: "Hello hi Hello", only the first one will get marked and if I am trying twice, it will get marked double, like this: "A(A(Hello)) Hi Hello".
A solution to this could be to not replace a word if it contains "A(" or is between "A(" and ")"; both would work.
Any idea how it can be achieved?
Note: I cant use replaceAll because if there is already a word that is replaced and a new word is added, then the first one will be overwritten. Therefore I need a solution like above. For example,If I have a string saying "Hello hi", and I mark Hello, it will say "A(Hello) hi", but if I then add Hello again to the text and replace it, it will look like this: A(A(Hello)) hi A(Hello).
Here is what I got so far:
let text = "Hello hi Hello!"
let selection = "Hello"
let A = `A(${selection})`
let addWoman = text.replace(selection, A)
You can use a negative lookbehind assertion in your pattern that fails the match if A( comes right before the full word Hello:
(?<!A\()\bHello\b
And replace it with A($&)
Code:
let text = "Hello hi Hello!";
let selection = "Hello";
let A = `A(${selection})`;
let re = new RegExp(`(?<!A\\()\\b${selection}\\b`, "g");
let addWoman = text.replace(re, A);
console.log(addWoman);
console.log(addWoman.replace(re, A));
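The pattern above interpolates selection into the regex unescaped, so a selection containing regex metacharacters (a dot, parentheses, and so on) would change the pattern. A hedged sketch of the same lookbehind idea with the selection escaped first; escapeRegExp and markWord are hypothetical names, and the sketch assumes the selection starts and ends with a word character so that \b still applies:

```javascript
// Hypothetical helper: escapes characters that are special in a regex.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wraps every unmarked whole-word occurrence in A(...).
function markWord(text, selection) {
  const re = new RegExp(`(?<!A\\()\\b${escapeRegExp(selection)}\\b`, "g");
  // A function replacer avoids "$" in the selection being read as a
  // special replacement pattern.
  return text.replace(re, (match) => `A(${match})`);
}

const markedOnce = markWord("Hello hi Hello!", "Hello");
console.log(markedOnce);                    // A(Hello) hi A(Hello)!
console.log(markWord(markedOnce, "Hello")); // A(Hello) hi A(Hello)!
```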
A solution to this could be to not replace a word if it contains "A(" or is between "A(" and ")"; both would work.
To avoid re-matching selection inside an A(...) string, match A(...) first and capture it into a group. If the group matched, the text is kept as is; otherwise the match is the word of your choice, and it gets wrapped:
let text = "Hello hi Hello!"
let selection = "Hello"
let A = `A(${selection})`
const rx = new RegExp(String.raw`(A\([^()]*\))|${selection.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')}`, 'g')
let addWoman = text.replace(rx, (x,y) => y || A)
console.log(addWoman);
// Replacing the second time does not modify the string:
console.log(addWoman.replace(rx, (x,y) => y || A))
The regex will look like /(A\([^()]*\))|Hello/g, it matches
(A\([^()]*\)) - Group 1: A and then ( followed with zero or more chars other than ( and ) and then a ) char
| - or
Hello - a Hello string.

is there a way for the content.replace to sort of split them into more words than these?

const filter = ["bad1", "bad2"];
client.on("message", message => {
var content = message.content;
var stringToCheck = content.replace(/\s+/g, '').toLowerCase();
for (var i = 0; i < filter.length; i++) {
if (content.includes(filter[i])){
message.delete();
break
}
}
});
So my code above is a Discord bot that deletes a message when someone writes "bad1" or "bad2"
(plus some more filtered bad words that I'm going to add), and luckily there are no errors whatsoever.
But right now the bot only deletes these words when they are written in lowercase, without spaces or special characters in between.
I think I have found a solution, but I can't seem to put it into my code. I tried different ways, but it either deleted only lowercase words or didn't react at all, and I got errors like "cannot read property of undefined".
var badWords = [
'bannedWord1',
'bannedWord2',
'bannedWord3',
'bannedWord4'
];
bot.on('message', message => {
var words = message.content.toLowerCase().trim().match(/\w+|\s+|[^\s\w]+/g);
var containsBadWord = words.some(word => {
return badWords.includes(word);
});
This is what I am looking at: the var words line, specifically (/\w+|\s+|[^\s\w]+/g).
Is there any way to implement that in my const filter code (above), or is there a different approach?
Thanks in advance.
Well, I'm not sure what you're trying to do with .match(/\w+|\s+|[^\s\w]+/g). That's some unnecessary regex just to get an array of words and spaces. And it won't even work if someone were to split their bad word into something like "t h i s".
If you want your filter to be case insensitive and account for spaces/special characters, a better solution would probably require more than one regex, and separate checks for the split letters and the normal bad word check. And you need to make sure your split letters check is accurate, otherwise something like "wash it" might be considered a bad word despite the space between the words.
A Solution
So here's a possible solution. Note that it is just a solution, and is far from the only solution. I'm just going to use hard-coded string examples instead of message.content, to allow this to be in a working snippet:
//Our array of bad words
var badWords = [
'bannedWord1',
'bannedWord2',
'bannedWord3',
'bannedWord4'
];
//A function that tests if a given string contains a bad word
function testProfanity(string) {
//Removes all non-letter, non-digit, and non-space chars
var normalString = string.replace(/[^a-zA-Z0-9 ]/g, "");
//Replaces all non-letter, non-digit chars with spaces
var spacerString = string.replace(/[^a-zA-Z0-9]/g, " ");
//Checks if a condition is true for at least one element in badWords
return badWords.some(swear => {
//Removes any non-letter, non-digit chars from the bad word (for normal)
var filtered = swear.replace(/\W/g, "");
//Splits the bad word into a 's p a c e d' word (for spaced)
var spaced = filtered.split("").join(" ");
//Two different regexes for normal and spaced bad word checks
var checks = {
spaced: new RegExp(`\\b${spaced}\\b`, "gi"),
normal: new RegExp(`\\b${filtered}\\b`, "gi")
};
//If the normal or spaced checks are true in the string, return true
//so that '.some()' will return true for satisfying the condition
return spacerString.match(checks.spaced) || normalString.match(checks.normal);
});
}
var result;
//Includes one banned word; expected result: true
var test1 = "I am a bannedWord1";
result = testProfanity(test1);
console.log(result);
//Includes one banned word; expected result: true
var test2 = "I am a b a N_N e d w o r d 2";
result = testProfanity(test2);
console.log(result);
//Includes one banned word; expected result: true
var test3 = "A bann_eD%word4, I am";
result = testProfanity(test3);
console.log(result);
//Includes no banned words; expected result: false
var test4 = "No banned words here";
result = testProfanity(test4);
console.log(result);
//This is a tricky one. 'bannedWord2' is technically present in this string,
//but is 'bannedWord22' really the same? This prevents something like
//"wash it" from being labeled a bad word; expected result: false
var test5 = "Banned word 22 isn't technically on the list of bad words...";
result = testProfanity(test5);
console.log(result);
I've commented each line thoroughly, such that you understand what I am doing in each line. And here it is again, without the comments or testing parts:
var badWords = [
'bannedWord1',
'bannedWord2',
'bannedWord3',
'bannedWord4'
];
function testProfanity(string) {
var normalString = string.replace(/[^a-zA-Z0-9 ]/g, "");
var spacerString = string.replace(/[^a-zA-Z0-9]/g, " ");
return badWords.some(swear => {
var filtered = swear.replace(/\W/g, "");
var spaced = filtered.split("").join(" ");
var checks = {
spaced: new RegExp(`\\b${spaced}\\b`, "gi"),
normal: new RegExp(`\\b${filtered}\\b`, "gi")
};
return spacerString.match(checks.spaced) || normalString.match(checks.normal);
});
}
Explanation
As you can see, this filter is able to deal with all sorts of punctuation, capitalization, and even single spaces/symbols in between the letters of a bad word. However, note that in order to avoid the "wash it" scenario I described (potentially resulting in the unintentional deletion of a clean message), I made it so that something like "bannedWord22" would not be treated the same as "bannedWord2". If you want it to do the opposite (therefore treating "bannedWord22" the same as "bannedWord2"), you must remove both of the \\b phrases in the normal check's regex.
I will also explain the regex, such that you fully understand what is going on here:
[^a-zA-Z0-9 ] means "select any character not in the ranges of a-z, A-Z, 0-9, or space" (meaning all characters not in those specified ranges will be replaced with an empty string, essentially removing them from the string).
\W means "select any character that is not a word character", where "word character" refers to the characters in ranges a-z, A-Z, 0-9, and underscore.
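\b means "word boundary": a zero-width position between a word character and a non-word character, which is where a word starts or stops. That covers positions next to spaces or punctuation, as well as the start and end of the string when a word character sits there. In a string or template literal passed to new RegExp(), \b is written with an additional \ (to become \\b), because a lone \b inside a JavaScript string is the backspace escape sequence rather than the regex token.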
The flags g and i used in both of the regex checks indicate "global" and "case-insensitive", respectively.
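The points above about \b and the doubled backslash can be checked with a short sketch. The word "cat" and the variable names are made up for illustration:

```javascript
// The same pattern written as a regex literal and as a string for new RegExp().
const fromLiteral = /\bcat\b/i;
const fromString = new RegExp("\\bcat\\b", "i");
// With a single backslash, the string contains backspace characters instead.
const backspaceByMistake = new RegExp("\bcat\b", "i");

console.log(fromLiteral.test("a Cat sat"));             // true
console.log(fromLiteral.test("concatenate"));           // false
console.log(fromString.source === fromLiteral.source);  // true
console.log(backspaceByMistake.test("a cat sat"));      // false
```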
Of course, to get this working with your discord bot, all you have to do in your message handler is something like this (and be sure to replace badWords with your filter variable in testProfanity()):
if (testProfanity(message.content)) return message.delete();
If you want to learn more about regex, or if you want to mess around with it and/or test it out, this is a great resource for doing so.

javascript replace tokens using string split

Hi I have a text something like
"Welcome back ##Firstname ##Lastname. You Last accessed on ##Date"
My objective is to replace these tokens with actual values.
So what i did was
var str = "Welcome back ##Firstname ##Lastname. You Last accessed on ##Date";
var data = str.split('#');
My idea was - once i do this, my data will have an array of values something like
["Welcome back", "#FirstName" , "#LastName", "You Last accessed on" , "#Date"]
Once I have this, I can easily replace the tokens, because I will know which ones are properties and which ones are static strings. But fool I am, since JS has other ideas.
It instead split it as:
["Welcome back ", "", "Firstname ", "", "Lastname. You Last accessed on ", "", "Date"]
What am i doing wrong? or what is the best way to replace tokens in a string?
I looked here. Did not like the approach much. Not a fan of curly brackets. would like to do it the "#" way - Since it will be easy for Content authors
Another regex option, split on /#(#\w+)[^\w#]+/, captures the name part while throwing off the first #, assuming the name identifiers are always made up of word characters:
var str = "Welcome back ##Firstname ##Lastname. You Last accessed on ##Date;"
var data = str.split(/#(#\w+)[^\w#]+/);
console.log(data.filter(s => s !== ""));
You can split on "#" characters that are followed by another "#" character:
var str = "Welcome back ##Firstname ##Lastname. You Last accessed on ##Date";
var res = str.split(/#(?=#)/);
console.log(res);
You can replace each '##' with some separator character followed by '#', like ',#', and then split on that separator.
var str = "Welcome back ##Firstname ##Lastname. You Last accessed on ##Date";
var data = str.replace(/##/g, ',#').split(',');
console.log(data);
As your delimiters end with a space, you may split by space:
.split(" ")
And then iterate and replace all words beginning with ##
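var replaceBy={
lastname:"Doe",
name:"John"
}
//Example input (an assumption; the original snippet left input undefined)
var input = "Welcome back ##name ##lastname";
var result= input.split(" ").map(function(word){
if(word[0]=="#" && word[1]=="#"){
return replaceBy[word.slice(2)] || "error";
}
return word;
}).join(" ");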
However, it might be easier to surround your identifiers with delimiters, e.g.:
Hi ##lastname##!
So you can do
.split("##")
And every second element is automatically an identifier.
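A minimal sketch of that idea, assuming every identifier is wrapped in a balanced pair of ## delimiters (fillTemplate is a hypothetical name, and leaving unknown identifiers in place is a choice made for this example):

```javascript
// Hypothetical helper: after split("##"), odd-indexed parts are identifiers
// and even-indexed parts are plain text.
function fillTemplate(template, values) {
  return template
    .split("##")
    .map((part, index) => {
      if (index % 2 === 0) return part; // static text
      return Object.prototype.hasOwnProperty.call(values, part)
        ? String(values[part])
        : "##" + part + "##"; // unknown identifier: put it back unchanged
    })
    .join("");
}

console.log(fillTemplate("Hi ##name## ##lastname##!", { name: "John", lastname: "Doe" }));
// Hi John Doe!
```

An unbalanced template (a stray ## without its partner) would shift which parts count as identifiers, so this only holds when authors always close their delimiters.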
Do this:
data = str.replace("##Firstname", "Robert").replace("##Lastname","Polson").replace('##Date','yesterday') ;
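The chained calls above need one replace() per token. A hedged sketch of a single pass over the string that keeps the "##Name" style from the question, assuming token names consist of word characters only (replaceTokens and the sample values are hypothetical):

```javascript
// Hypothetical helper: replaces each ##Name with values[Name] in one pass.
function replaceTokens(text, values) {
  return text.replace(/##(\w+)/g, (match, name) =>
    // Unknown tokens are left as they were.
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match
  );
}

const welcome = "Welcome back ##Firstname ##Lastname. You Last accessed on ##Date";
console.log(replaceTokens(welcome, { Firstname: "Robert", Lastname: "Polson", Date: "yesterday" }));
// Welcome back Robert Polson. You Last accessed on yesterday
```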

Replace words in text

I'm programming a part of a web application in which I replace words in a text. I used the replace function, but it also replaced text that I did not want changed (example below). Now I have implemented a function that splits the text into words, but it does not work when I want to replace two contiguous words in the text.
The first option:
var str = "iRobot Roomba balbalblablbalbla";
str.replace(/robot/gi, 'Robota');
output -> iRobota Roomba ........(fail !)
Second code:
var patterns = [
{
match: 'robot',
replacement: 'Robota'
},{
match: 'ipad',
replacement: 'tablet'
},
......... more
];
var temp = str.split(' ');
var newStr = temp.map(function(el) {
patterns.forEach(function(item) {
if( el.search( new RegExp( '^'+item.match+'$', 'gi') ) > -1 ) {
el = item.replacement;
return el;
}
});
return el;
});
return newStr.join(' ');
This last code does not replace a two-word text, because each check only looks at a single word. I have been searching the Internet for a solution and I have not found anything similar.
The only idea I have is to split the text to check (item.match) and, if it has more than one element, create a temporary variable and check the contiguous elements, but I guess that affects performance and I do not know if there is a better and easier option.
Can anyone think of a better option?
Thanks !
As I understand, you only want to match whole words and not sub-strings.
The solution would be to add word boundaries to your regex :
str.replace(/\brobot\b/gi, 'Robota');
This will only match whole "robot" words.
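The word-boundary idea also extends to the patterns array from the question, including entries made of two words. A minimal sketch, assuming every entry starts and ends with a word character; replaceWholeWords and escapeRegExp are hypothetical helpers, and the "vacuum cleaner" entry is invented for the example:

```javascript
const patterns = [
  { match: "robot", replacement: "Robota" },
  { match: "ipad", replacement: "tablet" },
  { match: "vacuum cleaner", replacement: "hoover" } // made-up two-word entry
];

// Hypothetical helper: escapes characters that are special in a regex.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: one case-insensitive pass over all whole-word patterns.
function replaceWholeWords(text, patternList) {
  const lookup = new Map(patternList.map(p => [p.match.toLowerCase(), p.replacement]));
  const alternation = patternList
    .map(p => escapeRegExp(p.match))
    .sort((a, b) => b.length - a.length) // try longer phrases first
    .join("|");
  const re = new RegExp(`\\b(?:${alternation})\\b`, "gi");
  return text.replace(re, matched => lookup.get(matched.toLowerCase()));
}

console.log(replaceWholeWords("iRobot robot iPad vacuum cleaner", patterns));
// iRobot Robota tablet hoover
```

Building one alternation means the text is scanned once rather than once per pattern, which sidesteps the split-and-compare loop from the question.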

Dynamically match multiple words in a string to another one using regExp

We all know that regExp can match two strings using a common word in both strings as criterion for the match. This code will match "I like apples" to "apples" because they both have "apples" in common.
var str1 = "apples";
var test1 = "I like apples";
var reg1 = new RegExp(str1, "i");
if (test1.match(reg1)) {
console.log(str1 + " is matched");
}
But what if instead of matching the strings using a single common word (or a common part of the strings), I need to match the two strings using multiple common words which may be separated by other words ? Let me show you an example :
test2 = "I like juicy red fruits" should match str2 = "juicy fruits" because test2 contains all the key words of the second str2 key (see object below), even though it has other words in between.
The tricky part is that I can't know exactly what one of the strings will be, so I have to match them dynamically. The string I don't know is the value of a user input field, and there are many possibilities. This value is to be matched to the strings stored in this object :
var str2 = {
"red apples": "fruits",
"juicy fruits": "price : $10"
};
So whatever the user types in the input field, it must be a match if and only if it contains all the words of one of the object properties. So in this example, "juicy red and ripe fruits" should match the second object property, because it contains all of its keywords.
Edit : My goal is to output the value associated to the strings I'm matching. In my example if the first key is matched, 'fruits' should be output. If 'juicy fruits' is matched, it should output 'price : $10'. Getting the strings associated to the object keys is the reason why the user has to search for them using the input.
Is it possible to do this with regExp using pure javascript ?
Here is what I (poorly) tried to do : https://jsfiddle.net/Hal_9100/fzhr0t9q/1/
For the situation you're describing, you don't even need regular expressions. If you split the search string on spaces, you can check that every one of the words to match is contained within the array of search words.
function matchesAllWords(searchWords, inputString) {
var wordsToMatch = inputString.toLowerCase().split(' ');
return wordsToMatch.every(
word => searchWords.indexOf(word) >= 0);
}
In the snippet below, typing in the input causes a recalculation of the searchWords. The matching li elements are then given the .match class to highlight them.
function updateClasses(e) {
var searchWords = e.target.value.toLowerCase().split(' ');
listItems.forEach(listItem => listItem.classList.remove('match'));
listItems.filter(
listItem =>
matchesAllWords(searchWords, listItem.innerText))
.forEach(
matchingListItem =>
matchingListItem.classList.add('match'));
}
function matchesAllWords(searchWords, inputString) {
var wordsToMatch = inputString.toLowerCase().split(' ');
return wordsToMatch.every(
word => searchWords.indexOf(word) >= 0);
}
function searchProperties(e) {
var searchWords = e.target.value.toLowerCase().split(' ');
for (var property in propertiesToSearch) {
if (matchesAllWords(searchWords, property)) {
console.log(property, propertiesToSearch[property]);
}
}
}
var propertiesToSearch = {
"red apples": 1,
"juicy fruit": 2
};
listItems = [].slice.call(
document.getElementById('matches').querySelectorAll('li')
);
document.getElementById('search').addEventListener('keyup', updateClasses);
document.getElementById('search').addEventListener('keyup', searchProperties);
.match {
color: green;
}
<label for="search">
Search:
</label>
<input type="text" name="search" id="search" />
<ul id="matches">
<li>red apples</li>
<li>juicy fruits</li>
</ul>
Update To use this kind of implementation to search for a property, use a for .. in loop like below. Again, see the snippet for this working in context.
function searchProperties(e) {
var searchWords = e.target.value.toLowerCase().split(' ');
for (var property in propertiesToSearch) {
if (matchesAllWords(searchWords, property)) {
console.log(property, propertiesToSearch[property]);
}
}
}
var propertiesToSearch = {
"red apples": 1,
"juicy fruit": 2
};
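For the goal stated in the question (outputting the value associated with a matched key), the same every-word check can be sketched without any DOM code. This assumes words are separated by whitespace and compared case-insensitively; findValues is a hypothetical name, and the object reuses the str2 example from the question:

```javascript
// Hypothetical helper: returns the values whose keys have all their words
// present in the user's input.
function findValues(userInput, table) {
  const typedWords = new Set(userInput.toLowerCase().split(/\s+/).filter(Boolean));
  return Object.keys(table)
    .filter(key => key.toLowerCase().split(/\s+/).every(word => typedWords.has(word)))
    .map(key => table[key]);
}

const str2 = {
  "red apples": "fruits",
  "juicy fruits": "price : $10"
};

console.log(findValues("juicy red and ripe fruits", str2)); // [ 'price : $10' ]
console.log(findValues("I like red apples", str2));         // [ 'fruits' ]
```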
I think you might benefit from transforming your data structure from an object literal to an array like so:
const wordGroupings = [
'red apples',
'juicy fruits'
];
I wrote a processString function that accepts some word-groupings and a string as inputs and returns only those word-groupings in which every word occurs in the input string.
In other words let's imagine the test string is:
const testString = 'I like big, frozen whales.';
And let's further imagine that your wordGroupings array looked something like this:
const wordGroupings = [
'whales frozen',
'big frozen whales toucan',
'big frozen',
'sweaty dance club',
'frozen whales big'
]
The output of calling processString(wordGroupings, testString) would be:
[
'whales frozen',
'big frozen',
'frozen whales big'
]
If this is what you're looking for, here is the implementation of processString(..):
function processString(wordGroupings, testString) {
var regexes = wordGroupings.map(words => ({
origString: words,
regex: new RegExp(words.replace(/\s+/g, '|'), 'g'),
expCount: words.split(/\s+/g).length
})
);
var filtered = regexes.filter(({regex, expCount}) =>
(testString.match(regex) || []).length === expCount
);
return filtered.map(dataObj => dataObj.origString);
}
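One caveat with comparing the number of matches to the number of words: a word repeated in the input inflates the count, so "big big" would count as two matches for the grouping "big whales". A hedged alternative sketch that checks each word individually against the set of words in the input; processStringByWords is a hypothetical name, and unlike the function above it compares case-insensitively and only matches whole words:

```javascript
// Hypothetical variant: keeps a grouping only if every one of its words
// appears as a whole word in the test string.
function processStringByWords(wordGroupings, testString) {
  const presentWords = new Set(testString.toLowerCase().match(/\w+/g) || []);
  return wordGroupings.filter(words =>
    words.toLowerCase().split(/\s+/).every(word => presentWords.has(word))
  );
}

const groupings = [
  "whales frozen",
  "big frozen whales toucan",
  "big frozen",
  "sweaty dance club",
  "frozen whales big"
];
console.log(processStringByWords(groupings, "I like big, frozen whales."));
// [ 'whales frozen', 'big frozen', 'frozen whales big' ]
```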
Hope this helps!
