JS - Get original value of string replace using regex - javascript

We have a string:
var dynamicString = "This isn't so dynamic, but it will be in real life.";
User types in some input:
var userInput = "REAL";
I want to match on this input, and wrap it with a span to highlight it:
var result = " ... but it will be in <span class='highlight'>real</span> life.";
So I use some RegExp magic to do that:
// Escapes user input,
var searchString = userInput.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
// Now we make a regex that matches *all* instances
// and (most important point) is case-insensitive.
var searchRegex = new RegExp(searchString , 'ig');
// Now we highlight the matches on the dynamic string:
dynamicString = dynamicString.replace(searchRegex, '<span class="highlight">' + userInput + '</span>');
This is all great, except here is the result:
console.log(dynamicString);
// -> " ... but it will be in <span class='highlight'>REAL</span> life.";
I replaced the content with the user's input, which means the matched text now takes on whatever casing the user typed.
How do I wrap all matches with the span shown above, while maintaining the original value of the matches?
To be clear, the ideal result would be:
// user inputs 'REAL',
// We get:
console.log(dynamicString);
// -> " ... but it will be in <span class='highlight'>real</span> life.";

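You'd use a regex capturing group and reference it in the replacement string to insert the matched text (reusing the escaped searchString from the question so special characters in the input don't break the pattern):
var searchRegex = new RegExp('(' + searchString + ')', 'ig');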
dynamicString = dynamicString.replace(searchRegex, '<span class="highlight">$1</span>');

You can do it without a capturing group too: $& in the replacement string stands for the whole match.
dynamicString = dynamicString.replace(new RegExp(userInput, 'ig'), '<span class="highlight">$&</span>');
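Both snippets above depend on what ends up in the pattern: unescaped input such as c++ or ( would change or break the regex. Here is a minimal sketch that combines the escaping from the question with the $& placeholder. The helper names escapeRegExp and highlightMatches are made up for illustration.

```javascript
// Hypothetical helpers, not part of the original answers.
function escapeRegExp(s) {
  // Escape every character that has a special meaning in a regex.
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

function highlightMatches(text, userInput) {
  if (!userInput) return text; // an empty pattern would match at every position
  const re = new RegExp(escapeRegExp(userInput), 'ig');
  // $& inserts the text that was actually matched, so the original case is kept.
  return text.replace(re, '<span class="highlight">$&</span>');
}

console.log(highlightMatches('but it will be in real life.', 'REAL'));
// -> but it will be in <span class="highlight">real</span> life.
```

Because the replacement uses the matched text rather than the typed text, the casing of the original string survives.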

Related

Highlighting lines that contain a phrase using regex

I'm highlighting lines that contain a certain phrase using regex.
My current highlight function will read the whole text and place every instance of the phrase within a highlight span.
const START = "<span name='highlight' style='background-color: yellow;'>";
const END = "</span>"
function highlight(text, toReplace) {
let reg = new RegExp(toReplace, 'ig');
return text.replace(reg, START + toReplace + END);
}
I want to expand my regex so that, for each phrase, it highlights from the preceding <br> to the following <br>.
highlight("This<br>is some text to<br>highlight.", "text");
Current output:
This<br>is some <span name="highlight" style="background-color:yellow;">text</span> to<br>highlight.
Wanted output:
This<br><span name="highlight" style="background-color:yellow;">is some text to</span><br>highlight.
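You may want to match all chars other than < and > before and after the text, and it is advisable to escape the literal text you pass to the RegExp constructor. The (<br/?>) alternative matches the break tags themselves and puts them back unchanged, so a search term such as b cannot match inside a tag; every other match is wrapped as a whole: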
const START = "<span name='highlight' style='background-color: yellow;'>";
const END = "</span>"
function highlight(text, toReplace) {
let reg = new RegExp("(<br/?>)|[^<>]*" + toReplace.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&') + "[^<>]*", 'ig');
return text.replace(reg, function ($0,$1) { return $1 ? $1 : START + $0 + END; });
}
console.log(highlight("This<br>is some text to<br>highlight.", "text"));
console.log(highlight("This<br>is a bunch of<br>text", "b"));
For the first call the regex will look like /(<br\/?>)|[^<>]*text[^<>]*/gi. It either matches a <br> tag (captured in group 1 and returned as is), or it matches 0 or more chars other than < and >, then text in a case insensitive way, and then again 0 or more chars other than < and >. In the second case the callback puts the whole matched value into the highlighting tags.
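An alternative that avoids one large regex, sketched here under the assumption that line breaks are always written as <br> or <br/>: split the text on the break tags, wrap each segment that contains the phrase, and join the pieces back together. This is a different technique from the answer above, and highlightLines is a hypothetical name.

```javascript
const START = "<span name='highlight' style='background-color: yellow;'>";
const END = "</span>";

// Hypothetical helper: highlights every <br>-delimited segment containing the phrase.
function highlightLines(text, phrase) {
  if (!phrase) return text; // nothing to search for
  const needle = phrase.toLowerCase();
  return text
    // The capturing group keeps the <br> separators in the split result.
    .split(/(<br\s*\/?>)/i)
    .map(part =>
      // Skip the separators themselves, wrap segments that contain the phrase.
      !/^<br\s*\/?>$/i.test(part) && part.toLowerCase().includes(needle)
        ? START + part + END
        : part
    )
    .join('');
}

console.log(highlightLines("This<br>is some text to<br>highlight.", "text"));
// -> This<br><span name='highlight' style='background-color: yellow;'>is some text to</span><br>highlight.
```

No regex escaping is needed here because the phrase is compared with includes rather than interpolated into a pattern.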
My guess is that this simple expression,
(<br>)(.*?)(\1)
might work here.
const regex = /(<br>)(.*?)(\1)/gs;
const str = `This<br>is some text to<br>highlight. This<br>is some text to<br>highlight. This<br>is some text to<br>highlight.
This<br>is some
text to<br>highlight. This<br>is some text to<br>highlight. This<br>is some text to<br>highlight.`;
const subst = `$1<span name='highlight' style='background-color: yellow;'>$2</span>$3`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log(result);

How can I ban a word

I want to replace a bad word with asterisks (***). However, when the bad word is contained in another word, I don't want to replace it.
for(var i = 0; i < forbidden.length; i++) {
if(textBoxValue.search(forbidden[i]) > -1) {
textBoxValue = textBoxValue.replace(forbidden[i], '');
}
}
For example, if the bad word is "are" and it appears inside another word like "aren't", I don't want it to appear as "***n't". I only want to replace the word if it stands by itself.
One option is to use a regular expression with a word boundary on each side, to ensure that a matched word is standalone:
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(new RegExp('\\b' + word + '\\b', 'g'), '');
});
For example:
let textBoxValue = 'bigwordfoo foo bar barbaz';
const forbidden = ['foo', 'bar'];
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(new RegExp('\\b' + word + '\\b', 'g'), '');
});
console.log(textBoxValue);
If you actually want to replace with asterisks, and not the empty string, use a replacer function instead:
let textBoxValue = 'bigwordfoo foo bar barbaz';
const forbidden = ['foo', 'bar'];
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(
new RegExp('\\b' + word + '\\b', 'g'),
word => '*'.repeat(word.length)
);
});
console.log(textBoxValue);
Of course, note that word restrictions are generally pretty easy to overcome by anyone who really wants to. Humans can almost always come up with ways to fool heuristics.
If any of the words to blacklist contain special characters in a regular expression, escape them first before passing to new RegExp:
const escape = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
let textBoxValue = 'bigwordfoo foo ba$r ba$rbaz';
const forbidden = ['foo', 'ba$r'];
forbidden.forEach((word) => {
textBoxValue = textBoxValue.replace(
new RegExp('\\b' + escape(word) + '\\b', 'g'),
word => '*'.repeat(word.length)
);
});
console.log(textBoxValue);
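One caveat with \b: it only matches between a word character (letter, digit or underscore) and a non-word character, so a forbidden term that starts or ends with punctuation, such as c++, never matches when it is wrapped in \b. A possible workaround, assuming an engine with lookbehind support (ES2018 and later), is to assert that no word character is adjacent instead. The names escapeRe and censor are hypothetical.

```javascript
// Hypothetical helpers for illustration.
const escapeRe = s => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

function censor(text, forbidden) {
  return forbidden.reduce((acc, word) => {
    // (?<!\w) and (?!\w): no word character directly before or after the term.
    const re = new RegExp('(?<!\\w)' + escapeRe(word) + '(?!\\w)', 'g');
    return acc.replace(re, m => '*'.repeat(m.length));
  }, text);
}

console.log(censor('c++ is fun, foo too, but food is fine', ['c++', 'foo']));
// -> *** is fun, *** too, but food is fine
```

For terms made only of word characters this behaves the same as \b on both sides.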
You can create a dynamic regex with all the forbidden words separated by a | to create an alternation. You can wrap this with word boundary (\b) to replace only full word matches.
For the following list of forbidden words, the dynamic regex ends up being
/\b(?:bad|nasty|dreadful)\b/g
The second parameter to replace, gets the matched word as a parameter. You can use repeat to get * repeated the same number of times as the length of the word to be replaced
function replaceBadWords(textBoxValue, forbidden) {
const regex = new RegExp(`\\b(?:${forbidden.join('|')})\\b`, 'g')
return textBoxValue.replace(regex, m => "*".repeat(m.length))
}
const forbidden = ['bad', 'nasty', 'dreadful']
console.log(replaceBadWords('string with some nasty words in it', forbidden))
console.log(replaceBadWords("bad gets replaced with asterisks but badminton won't", forbidden))
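The function above interpolates the words into the pattern as they are and matches case-sensitively. Here is a variant sketch that escapes each word before joining and adds the i flag. Whether case-insensitive matching is wanted is an assumption, and replaceBadWordsSafe is a hypothetical name.

```javascript
// Hypothetical variant of replaceBadWords with escaping and case-insensitive matching.
function replaceBadWordsSafe(text, forbidden) {
  if (forbidden.length === 0) return text; // nothing to replace
  // Escape regex metacharacters in every word before building the alternation.
  const escaped = forbidden.map(w => w.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'));
  const regex = new RegExp('\\b(?:' + escaped.join('|') + ')\\b', 'gi');
  return text.replace(regex, m => '*'.repeat(m.length));
}

console.log(replaceBadWordsSafe('Nasty words, but badminton stays', ['bad', 'nasty']));
// -> ***** words, but badminton stays
```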
If you're not yet using a library (or if you want to use one), you can check out the bad-words package.
First, it already ships with a list of bad words, so you don't need to compile one yourself and wonder what you missed.
It supports custom placeholders:
var Filter = require('bad-words');
var customFilter = new Filter({ placeHolder: 'x'});
customFilter.clean("Don't be an ash0le"); //Don't be an xxxxxx
You can also add your own bad words or remove existing ones:
var filter = new Filter();
// add to list
filter.addWords('some', 'bad', 'word');
// remove from list
filter.removeWords('hells', 'sadist');
There is also multilingual support if you supply the correct regex.

Regex match on string only, not substrings

I'm adding strings to a textarea when values in a table are clicked. It has to be possible to select and deselect values in the table, and they will add/remove themselves from the textarea. The textarea has to be a string, and the added values can't be wrapped in any other characters.
The values that are being added could potentially contain any characters, and one value may be a substring of another. Here are some examples: HOLE 1, HOLE 11, HOLE 17, HOLE (1), cutaway, cut, away, cut-away, Commentator (SMITH, John), (GOAL), GOAL
Once a value has been appended to the textarea, and it's clicked again to deselect it, I'm searching for the value and removing it like so:
var regex = new RegExp("(?![ .,]|^)?(" + mySelectedText + ")(?=[., ]|$)", 'g');
var newDescriptionText = myTextAreaText.replace(regex, '');
The regex matches correctly for strings/substrings of text, e.g. cutaway and away, but it won't work for anything beginning with a bracket, e.g. (GOAL). Adding the word boundary \b to the start of the expression makes the regex match strings that start with a bracket, but then it won't work for strings/substrings containing the same text.
Is there a way to achieve this using regex? Or some other method?
Here's a working CodePen example of the adding/removing from table.
You can use word boundaries (\b) to avoid the issue when you deselect away and have cutaway in the list. Just add \\b on both sides of the cell text in the regex:
regex = new RegExp("(?![ .,]|^)?(\\b" + cellText + "\\b)(?=[., ]|$)", 'g');
Here's the code I changed to make it work:
removeFromDescription = function(cell) {
cell.classList.remove(activeClass);
// Remove from the active cells array
var itemIndex = tempAnnotation.activeCells.indexOf(cell.textContent);
tempAnnotation.activeCells.splice(itemIndex, 1);
// Do the regex find/replace
var annotationBoxText = annotation.value,
cellText = regexEscape(cell.textContent), // Escape any funky characters from the string
regex = new RegExp("(^| )" + cellText + "( |$)", 'g');
var newDescription = annotationBoxText.replace(regex, ' ');
setAnnotationBoxValue(newDescription);
console.info('cellText: ', cellText);
console.info('annotationBoxText:', annotationBoxText);
console.info('newDescription: ', newDescription);
};
regexEscape = function(s) {
return s.replace(/([-\/\\^$*+?.()|[\]{}])/g, `\\$&`);
};
setAnnotationBoxValue = function(newValue) {
annotation.value = newValue;
};
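The pasted code builds the pattern as "(^| )" + cellText + "( |$)", which consumes the surrounding spaces as part of each match. The same idea can be sketched with lookarounds, assuming the values in the textarea are separated by single spaces and the engine supports lookbehind (ES2018 and later). removeValue is a hypothetical helper; regexEscape mirrors the one above.

```javascript
function regexEscape(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Hypothetical helper: removes a value only when it stands alone between spaces.
function removeValue(text, value) {
  // (?<![^ ]) = preceded by a space or the start; (?![^ ]) = followed by a space or the end.
  const re = new RegExp('(?<![^ ])' + regexEscape(value) + '(?![^ ])', 'g');
  // Remove the value, then tidy up the leftover whitespace.
  return text.replace(re, '').replace(/ {2,}/g, ' ').trim();
}

const box = 'HOLE 1 HOLE 11 (GOAL) GOAL';
console.log(removeValue(box, 'HOLE 1')); // -> HOLE 11 (GOAL) GOAL
console.log(removeValue(box, 'GOAL'));   // -> HOLE 1 HOLE 11 (GOAL)
console.log(removeValue(box, '(GOAL)')); // -> HOLE 1 HOLE 11 GOAL
```

Because the lookarounds test for spaces rather than word characters, values that begin or end with a bracket are handled the same way as plain words.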

Regular Expression to match compound words using only the first word

I am trying to create a regular expression in JS which will match the occurrences of box and return the full compound word.
Using the string:
the box which is contained within a box-wrap has a box-button
I would like to get:
[box, box-wrap, box-button]
Is it possible to match these words using only the string box?
This is what I have tried so far but it does not return the results I desire.
http://jsfiddle.net/w860xdme/
var str ='the box which is contained within a box-wrap has a box-button';
var regex = new RegExp('([\w-]*box[\w-]*)', 'g');
document.getElementById('output').innerHTML=str.match(regex);
Try this way:
([\w-]*box[\w-]*)
Requested by comments, here is a working example in javascript:
function my_search(word, sentence) {
var pattern = new RegExp("([\\w-]*" + word + "[\\w-]*)", "gi");
sentence.replace(pattern, function(match) {
document.write(match + "<br>"); // here you can do whatever you want with the match
return match;
});
};
var phrase = "the box which is contained within a box-wrap " +
"has a box-button. it is inbox...";
my_search("box", phrase);
Hope it helps.
I'll just throw this out there:
(box[\w-]*)+
You can use this regex in JS:
var w = "box"
var re = new RegExp("\\b" + w + "\\S*");
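To collect every compound word into an array, as the question asks, the pattern needs the g flag and can be passed to match. Here is a minimal sketch, assuming compounds consist of word characters and hyphens only. findCompounds is a hypothetical name.

```javascript
// Hypothetical helper: returns every word that starts with `word`,
// including hyphenated compounds such as box-wrap.
function findCompounds(word, sentence) {
  const escaped = word.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
  // \b keeps "inbox" out; [\w-]* extends the match over the rest of the compound.
  const re = new RegExp('\\b' + escaped + '[\\w-]*', 'g');
  return sentence.match(re) || []; // match returns null when nothing is found
}

console.log(findCompounds('box', 'the box which is contained within a box-wrap has a box-button'));
// -> [ 'box', 'box-wrap', 'box-button' ]
```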
This should work, note the 'W' is upper case.
http://www.w3schools.com/jsref/jsref_obj_regexp.asp
\Wbox\W
It looks like you're wanting to use the match with a regex. Match is a string method that will take a regex as an argument and return an array containing matches.
var str = "your string that contains all of the words you're looking for";
var regex = /you(\S)*(?=\s)/g;
var returnedArray = str.match(regex);
//console.log(returnedArray) returns ['your', 'you\'re']

Replace *variable* text but keep case

I need to replace some text in a javascript variable without losing case, ie.
"MAT" ==> "MATch string in db" instead of "Mat" ==> "match string in db"
Right now I'm doing this (doesn't keep the original case):
var user_str = $("#user-input").val();
var db_match = db_match.replace(new RegExp(user_str, "ig") , '<span class="u b match">' + user_str + '</span>');
I found cases where people needed to replace static text, but in my case I need to replace variable content instead.
Obviously this doesn't work either:
var user_str = $("#user-input").val();
db_match = db_match.replace(/(user_str)/ig, "<span class=u b match>$1</span>");
Any idea how to do this?
Use a group and build the regular expression with the RegExp constructor:
var str = "test";
var text = "Test this";
var re = new RegExp("(" + str + ")","gi");
console.log(text.replace(re, "<span>$1</span>")) //"<span>Test</span> this"
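A replacer function is another option. It receives the text that was actually matched, so the original case is kept without relying on $1. The sketch below also escapes the user input before building the pattern, which the snippet above does not do. wrapMatches is a hypothetical name.

```javascript
// Hypothetical helper: wraps every case-insensitive match of userStr in a span.
function wrapMatches(text, userStr) {
  if (!userStr) return text; // an empty pattern would match at every position
  const escaped = userStr.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
  const re = new RegExp(escaped, 'gi');
  // `match` is the text as it appears in `text`, not as the user typed it.
  return text.replace(re, match => '<span class="u b match">' + match + '</span>');
}

console.log(wrapMatches('Match string in db', 'MAT'));
// -> <span class="u b match">Mat</span>ch string in db
```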
