I have a forum and I would like to automatically parse some of the major links. For example, if a user makes a post like this:
You should visit StackOverflow. I found it on Wikipedia.
it would automatically parse it like this:
You should visit <a href="http://stackoverflow.com">StackOverflow</a>. I found it on <a href="http://wikipedia.com">Wikipedia</a>.
Is this even doable using JavaScript only?
Thanks for assistance. :-)
To keep the code clean and extensible, create a library of word => link pairs, then iterate over it and do the replacements inside your code.
Here is a fiddle demo doing that http://jsfiddle.net/MjV84/
$(function() {
var text = $('#foo').text(),
library = {
stackoverflow: 'http://stackoverflow.com',
wikipedia: 'http://wikipedia.com'
},
name;
for (name in library) {
text = text.replace(new RegExp(name, 'gi'), function(word) {
return '<a href="' + library[name] + '">' + word + '</a>';
});
};
$('#foo ').html(text);
});
If you're pre-processing the text, you can use the replace function with a callback and a regular expression using an alternation:
var str = "You should visit StackOverflow. I found it on Wikipedia.";
str = str.replace(/StackOverflow|Wikipedia|etc/gi, function(m) {
var href;
switch (m.toLowerCase()) {
case "stackoverflow":
href = "http://stackoverflow.com";
break;
case "wikipedia":
href = "http://en.wikipedia.org";
break;
// ...and so on...
}
return '<a href="' + href + '">' + m + '</a>';
});
YMMD points out that the above requires defining each keyword twice, which is true. When I've had to do this with a large number of keywords, I've done it by having an object with the keywords as keys, the href values as values, and built the expression dynamically:
// ==== Setup code you presumably do once
// The substitutions -- make sure the keys are in lower case
var substitutions = {
"stackoverflow": "http://stackoverflow.com",
"wikipedia": "http://en.wikipedia.org",
// ...and so on...
};
// Build the regex. Here I've used `Object.keys` which is an ES5 feature
// but you can use an ES5 shim (since it's something a shim can provide).
// Note that if your keywords include any special regular expression
// characters, you'll have to loop through the keys manually and escape
// those.
var subrex = new RegExp(Object.keys(substitutions).join("|"), "gi");
// ==== Where you're using it
var str = "You should visit StackOverflow. I found it on Wikipedia.";
str = str.replace(subrex, function(m) {
return '<a href="' + substitutions[m.toLowerCase()] + '">' + m + '</a>';
});
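If the keywords may contain regular expression metacharacters (for instance a keyword like "node.js"), each key needs escaping before it is joined into the alternation. Here is a minimal sketch of that step. The helper names escapeRegExp and linkify and the sample keyword table are made up for this example and are not part of the answer above:

```javascript
// Hypothetical keyword table; keys must be lower case.
var substitutions = {
    "stackoverflow": "http://stackoverflow.com",
    "node.js": "https://nodejs.org"
};

// Escape characters that are special inside a regular expression.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build the alternation once, longest keys first, so that a longer keyword
// wins over a shorter one that happens to be its prefix.
var keys = Object.keys(substitutions).sort(function (a, b) {
    return b.length - a.length;
});
var subrex = new RegExp(keys.map(escapeRegExp).join("|"), "gi");

function linkify(str) {
    return str.replace(subrex, function (m) {
        return '<a href="' + substitutions[m.toLowerCase()] + '">' + m + '</a>';
    });
}

console.log(linkify("Ask on StackOverflow about Node.js."));
```

Without the escaping, the "." in "node.js" would match any character, so a string such as "nodeXjs" would be turned into a link as well.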
Yes, use String.replace(regex, replaceString) to do that.
Here is an example:
var text = "You should visit StackOverflow. I found it on Wikipedia.";
var newText=text.replace(/stackoverflow/gi,
"<a href='http://www.stackoverflow.com/'>StackOverflow</a>");
The g stands for global, so it will replace all instances, and the i means case-insensitive search.
In case you are replacing common words, like "dictionary" to link to dictionary.com it would be better if you only replaced it if your users added a special tag, for example:
"You should visit StackOverflow. I found it on Wikipedia."
shouldn't be replaced with links unless it was written like this:
"You should visit &StackOverflow. I found it on Wikipedia."
Then your method would just need to add the special symbol.
Also, I would have the data in an array like this:
var linkArray = [ ["StackOverflow", "http://www.stackoverflow.com/", "Description"],
["Wikipedia", "http://wikipedia.org/", "Free encyclopedia"] ];
Then create a loop to find and replace the instances:
function addLinks(textInput) {
for (var i=0; i<linkArray.length; i++) {
textInput = addLink(textInput, linkArray[i]);
}
return textInput;
}
function addLink(textInput, link) {
var replaceString = "<a href=\"" + link[1] + "\" title=\""
+ link[2] + "\">"
+ link[0] + "</a>";
return textInput.replace(new RegExp("&"+link[0], "gi"), replaceString);
}
All the previous answers using the i modifier on the regular expression fail if the target
string contains variants of the substitution strings differing by case. This is because the
target string substring does not match the substitutions attribute name.
This version solves this by capturing each of the substitution strings and searching the arguments array for the found string.
function substitute (str) { 'use strict';
var substitutions = {
"Stack Overflow": "http://stackoverflow.com",
"Wikipedia": "http://en.wikipedia.org",
// ...and so on...
},
skeys = Object.keys (substitutions);
// build regexp which will capture each match separately
return str.replace (new RegExp ('(' + skeys.join(")|(") + ')', "gi"), function (m0) {
// Now scan the arguments array (omitting the last two arguments which
// are the match index and the source string)
for (var ai, i = arguments.length - 2; --i;) {
// The index of the argument (less 1) corresponds to the index in skeys of
// the name in the substitutions
if ((ai = arguments[i])) {
return '<a href="' + substitutions[skeys[i - 1]] + '">' + ai + '</a>';
}
}
return m0;
});
}
var str = "You should visit stack overflow. I found it on Wikipedia.";
// check in console log that links are correctly built.
console.log (substitute (str));
document.write (substitute (str));
See the jsfiddle : http://jsfiddle.net/NmGGN/
I'm trying to do something that would be similar to turning a url slug-like variable into text that could be used for a title.
So, I have a variable for example that is like this:
var thisID = 'athlete-profile';
function myFunc(thisID) {
// i need to use thisID as the id and href in a loop that generates a string of <li><a>'s
function makeTitle(thisID) {
// convert thisID to text so for this example it would return 'Athlete Profile'
return 'Athlete Profile';
}
for () {
var str = '<li id="'+thisID+'"><a href="'+thisID+'">'+makeTitle(thisID)+'</a></li>';
}
// make sense?
}
I'd like to not use a regex to do this if possible somehow, but I don't think there's a way to do it without one. So any one who knows how to do this type of thing let me know, it would be a great help.
Thanks
I would advise you to use a regular expression. But if you really don't want to, the solution below works for simple cases. Feel free to modify it as you see fit.
function makeTitle(slug) {
var words = slug.split('-');
for (var i = 0; i < words.length; i++) {
var word = words[i];
words[i] = word.charAt(0).toUpperCase() + word.slice(1);
}
return words.join(' ');
}
console.log(
makeTitle("athlete-profile")
)
function titleize(slug) {
var words = slug.split("-");
return words.map(function(word) {
return word.charAt(0).toUpperCase() + word.substring(1).toLowerCase();
}).join(' ');
}
console.log(titleize("athlete-profile"))
It works pretty simply:
It splits the string by - into words.
It maps each word into title case.
It joins the resulting words with spaces.
Do it in one line:
'athlete-profile'.split("-").join(" ").replace(/\w\S*/g, function(txt){return txt.charAt(0).toUpperCase() + txt.substr(1).toLowerCase()})
Output: Athlete Profile
The makeTitle() part of your question can be implemented something like this:
function makeTitle(thisID) {
return thisID.replace(/-/g, " ").replace(/\b[a-z]/g, function() {
return arguments[0].toUpperCase();
});
}
console.log(makeTitle("athlete-profile"))
The first .replace() changes all hyphens to spaces, and then the second .replace() takes any lower-case letter that follows a word boundary and makes it upper-case.
(For more information see the MDN doco for .replace().)
As far as doing it without using regular expressions, I'm not sure why you'd specifically want to avoid them, especially when the required expressions are pretty simple in this case (especially if you do the hyphen to space and first letter capitalisation in two steps as shown above). But there are endless ways to do this without regex using various combinations of JavaScript's string manipulation methods.
Do it like this
let someString = 'im a string';
console.log(someString.replace(/-/g, ' ')
.replace(/\w\S*/g, function (txt) {
return txt.charAt(0).toUpperCase() + txt.substr(1).toLowerCase();
})
)
Output: Im A String
Short and great way:
const slugToText = (slug) => {
return slug.toLowerCase().replace(/-/g,' ')
}
Much Simplified answer
we can use String.prototype.replaceAll method to easily achieve this
function convertSlugToString(slug) {
return slug.replaceAll("-", " ");
}
In case you want to make sure the output is all lowercase, you can do the following:
function convertSlugToString(slug) {
return slug.toLowerCase().replaceAll("-", " ");
}
Additional info:
String.prototype.replaceAll() is an ES2021 feature with good browser support (93.64% global coverage at the time of writing).
If you want to support IE, refer to the other answers.
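For environments without replaceAll, split and join give the same result without a regular expression. A small sketch (the function name slugToWords is made up here):

```javascript
function slugToWords(slug) {
    // Splitting on "-" and joining with " " replaces every hyphen.
    return slug.toLowerCase().split("-").join(" ");
}

console.log(slugToWords("Athlete-Profile")); // athlete profile
```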
I am allowing my users to wrap words with "*", "/", "_", and "-" as a shorthand way to indicate they'd like to bold, italicize, underline, or strikethrough their text. Unfortunately, when the page is filled with text using this markup, I'm seeing a noticeable (borderline acceptable) slow down.
Here's the JavaScript I wrote to handle this task. Can you please provide feedback on how I could speed things up?
function handleContentFormatting(content) {
content = handleLineBreaks(content);
var bold_object = {'regex': /\*(.|\n)+?\*/i, 'open': '<b>', 'close': '</b>'};
var italic_object = {'regex': /\/(?!\D>|>)(.|\n)+?\//i, 'open': '<i>', 'close': '</i>'};
var underline_object = {'regex': /\_(.|\n)+?\_/i, 'open': '<u>', 'close': '</u>'};
var strikethrough_object = {'regex': /\-(.|\n)+?\-/i, 'open': '<del>', 'close': '</del>'};
var format_objects = [bold_object, italic_object, underline_object, strikethrough_object];
for( obj in format_objects ) {
content = handleTextFormatIndicators(content, format_objects[obj]);
}
return content;
}
//#param obj --- an object with 3 properties:
// 1.) the regex to search with
// 2.) the opening HTML tag that will replace the opening format indicator
// 3.) the closing HTML tag that will replace the closing format indicator
function handleTextFormatIndicators(content, obj) {
while(content.search(obj.regex) > -1) {
var matches = content.match(obj.regex);
if( matches && matches.length > 0) {
var new_segment = obj.open + matches[0].slice(1,matches[0].length-1) + obj.close;
content = content.replace(matches[0],new_segment);
}
}
return content;
}
Add the g flag to your regexes (/ig instead of /i) and remove the while loop.
Replace your for(obj in format_objects) loop with a normal for loop, because format_objects is an array.
Update
Okay, I took the time to write an even faster and simplified solution, based on your code:
function handleContentFormatting(content) {
content = handleLineBreaks(content);
var bold_object = {'regex': /\*([^*]+)\*/ig, 'replace': '<b>$1</b>'},
italic_object = {'regex': /\/(?!\D>|>)([^\/]+)\//ig, 'replace': '<i>$1</i>'},
underline_object = {'regex': /\_([^_]+)\_/ig, 'replace': '<u>$1</u>'},
strikethrough_object = {'regex': /\-([^-]+)\-/ig, 'replace': '<del>$1</del>'};
var format_objects = [bold_object, italic_object, underline_object, strikethrough_object],
i = 0, foObjSize = format_objects.length;
for( i; i < foObjSize; i++ ) {
content = handleTextFormatIndicators(content, format_objects[i]);
}
return content;
}
//#param obj --- an object with 2 properties:
// 1.) the regex to search with
// 2.) the replace string
function handleTextFormatIndicators(content, obj) {
return content.replace(obj.regex, obj.replace);
}
This will work with nested and/or not nested formatting boundaries. You can omit the function handleTextFormatIndicators altogether if you want to, and do the replacements inline inside handleContentFormatting.
Your code is forcing the browser to do a whole lot of repeated, wasted work. The approach you should be taking is this:
Concoct a regex that combines all of your "target" regexes with another that matches a leading string of characters that are not your special meta-characters.
Change the loop so that it does the following:
Grab the next match from the source string. That match, due to the way you changed your regex, will be a string of non-meta characters followed by your matched portion.
Append the non-meta characters and the replacement for the target portion onto a separate array of strings.
At the end of that process, the separate accumulator array can be joined and used to replace the content.
As to how to combine the regular expressions, well, it's not very pretty in JavaScript but it looks like this. First, you need a regex for a string of zero or more "uninteresting" characters. That should be the first capturing group in the regex. Next should be the alternates for the target strings you're looking for. Thus the general form is:
var tokenizer = /(uninteresting pattern)?(?:(target 1)|(target 2)|(target 3)| ... )?/;
When you match that against the source string, you'll get back a result array that will contain the following:
result[0] - entire chunk of string (not used)
result[1] - run of uninteresting characters
result[2] - either an instance of target type 1, or null
result[3] - either an instance of target type 2, or null
...
Thus you'll know which kind of replacement target you saw by checking which of the target regexes are non empty. (Note that in your case the targets can conceivably overlap; if you intend for that to work, then you'll have to approach this as a full-blown parsing problem I suspect.)
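The single-pass idea described above might be sketched roughly as follows. This is only an illustration under some assumptions: the names tokenizer, tags and formatOnce are made up, the four markers are the ones from the question, and nested or overlapping spans are not handled.

```javascript
// Group 1 is the run of ordinary characters; groups 2-5 hold the contents
// of a *bold*, /italic/, _underline_ or -strike- span, whichever matched.
var tokenizer = /([^*\/_-]*)(?:\*([^*]+)\*|\/([^\/]+)\/|_([^_]+)_|-([^-]+)-)?/g;
var tags = [null, null, "b", "i", "u", "del"]; // tag name per capture group index

function formatOnce(text) {
    var out = [], m, i;
    tokenizer.lastIndex = 0;
    while (tokenizer.lastIndex < text.length && (m = tokenizer.exec(text))) {
        if (m[0] === "") {
            // A marker with no closing partner: copy it through and move on,
            // otherwise the empty match would repeat forever.
            out.push(text.charAt(m.index));
            tokenizer.lastIndex = m.index + 1;
            continue;
        }
        out.push(m[1]);
        for (i = 2; i < m.length; i++) {
            if (m[i] !== undefined) {
                out.push("<" + tags[i] + ">" + m[i] + "</" + tags[i] + ">");
            }
        }
    }
    // Join the accumulated pieces once at the end.
    return out.join("");
}

console.log(formatOnce("a *b* c-d /e/")); // a <b>b</b> c-d <i>e</i>
```

Each character of the input is visited once, and the pieces are collected in an array instead of rebuilding the whole string for every replacement.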
You can do things like:
function formatText(text){
return text.replace(
/\*([^*]*)\*|\/([^\/]*)\/|_([^_]*)_|-([^-]*)-/gi,
function(m, tb, ti, tu, ts){
if(typeof(tb) != 'undefined')
return '<b>' + formatText(tb) + '</b>';
if(typeof(ti) != 'undefined')
return '<i>' + formatText(ti) + '</i>';
if(typeof(tu) != 'undefined')
return '<u>' + formatText(tu) + '</u>';
if(typeof(ts) != 'undefined')
return '<del>' + formatText(ts) + '</del>';
return 'ERR('+m+')';
}
);
}
This will work fine on nested tags, but will not with overlapping tags, which are invalid anyway.
Example at http://jsfiddle.net/m5Rju/
I have a paragraph that's broken up into an array, split at the periods. I'd like to perform a regex on index[i], replacing its contents with one instance of each letter that index[i]'s string value has.
So; index[i]:"This is a sentence" would return --> index[i]:"thisaenc"
I read this thread. But i'm not sure if that's what i'm looking for.
Not sure how to do this in regex, but here's a very simple function to do it without using regex:
function charsInString(input) {
var output='';
for(var pos=0; pos<input.length; pos++) {
var ch = input.charAt(pos).toLowerCase();
if (output.indexOf(ch) == -1 && ch != ' ') { output += ch; }
}
return output;
}
alert(charsInString('This is a sentence'));
As I'm pretty sure what you need cannot be achieved using a single regular expression, I offer a more general solution:
// collapseSentences(ary) will collapse each sentence in ary
// into a string containing its constituent chars
// #param {Array} the array of strings to collapse
// #return {Array} the collapsed sentences
function collapseSentences(ary){
var result=[];
ary.forEach(function(line){
var tmp={};
line.toLowerCase().split('').forEach(function(c){
if(c >= 'a' && c <= 'z') {
tmp[c] = true; // record the letter; only the key matters
}
});
result.push(Object.keys(tmp).join(''));
});
return result;
}
which should do what you want except that the order of characters in each sentence cannot be guaranteed to be preserved, though in most cases it is.
Given:
var index=['This is a sentence','This is a test','this is another test'],
result=collapseSentences(index);
result contains:
["thisaenc","thisae", "thisanoer"]
(\w)(?!.*?\1)
This yields a match for each of the right characters, but as if you were reading right-to-left instead.
This finds a word character, then uses a negative lookahead to require that the character just matched does not appear again later in the string.
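Written with a negative lookahead, (\w)(?!.*?\1), the pattern can be tried with String.prototype.match and the g flag. A small sketch; note that it keeps the last occurrence of each letter, not the first, so the order differs from the "thisaenc" asked for in the question:

```javascript
var sentence = "This is a sentence".toLowerCase();

// (\w) captures a word character; (?!.*?\1) rejects it if the same
// character occurs again anywhere later in the string.
var lastOccurrences = sentence.match(/(\w)(?!.*?\1)/g).join("");

console.log(lastOccurrences); // hiastnce
```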
Nevermind, i managed:
justC = "";
if (color[i+1].match(/A/g)) {justC += " L_A";}
if (color[i+1].match(/B/g)) {justC += " L_B";}
if (color[i+1].match(/C/g)) {justC += " L_C";}
if (color[i+1].match(/D/g)) {justC += " L_D";}
if (color[i+1].match(/E/g)) {justC += " L_E";}
else {color[i+1] = "L_F";}
It's not exactly what my question may have led you to believe I wanted, but the output of this is what I was after, for use in a class: <span class="L_A L_C L_E"></span>
How about:
var re = /(.)((.*?)\1)/g;
var str = 'This is a sentence';
var x = str.toLowerCase();
x = x.replace(/ /g, '');
while(x.match(re)) {
x=x.replace(re, '$1$3');
}
I don't think this can be done in one fell regex swoop. You are going to need to use a loop.
While my example was not written in your language of choice, it doesn't seem to use any regex features not present in javascript.
perl -e '$foo="This is a sentence"; while ($foo =~ s/((.).*?)\2/$1/ig) { print "<$1><$2><$foo>\n"; } print "$foo\n";'
Producing:
This aenc
Is it possible to do something like this?
var pattern = /some regex segment/ + /* comment here */
/another segment/;
Or do I have to use new RegExp() syntax and concatenate a string? I'd prefer to use the literal as the code is both more self-evident and concise.
Here is how to create a regular expression without using the regular expression literal syntax. This lets you do arbitrary string manipulation before it becomes a regular expression object:
var segment_part = "some bit of the regexp";
var pattern = new RegExp("some regex segment" + /*comment here */
segment_part + /* that was defined just now */
"another segment");
If you have two regular expression literals, you can in fact concatenate them using this technique:
var regex1 = /foo/g;
var regex2 = /bar/y;
var flags = (regex1.flags + regex2.flags).split("").sort().join("").replace(/(.)(?=.*\1)/g, "");
var regex3 = new RegExp(regex1.source + regex2.source, flags);
// regex3 is now /foobar/gy
It's just more wordy than having regex1 and regex2 be literal strings instead of literal regular expressions.
Just randomly concatenating regular expressions objects can have some adverse side effects. Use the RegExp.source instead:
var r1 = /abc/g;
var r2 = /def/;
var r3 = new RegExp(r1.source + r2.source,
(r1.global ? 'g' : '')
+ (r1.ignoreCase ? 'i' : '') +
(r1.multiline ? 'm' : ''));
console.log(r3);
var m = 'test that abcdef and abcdef has a match?'.match(r3);
console.log(m);
// m should contain 2 matches
This will also give you the ability to retain the regular expression flags from a previous RegExp using the standard RegExp flags.
I don't quite agree with the "eval" option.
var xxx = /abcd/;
var yyy = /efgh/;
var zzz = new RegExp(eval(xxx)+eval(yyy));
will give "//abcd//efgh//" which is not the intended result.
Using source like
var zzz = new RegExp(xxx.source+yyy.source);
will give "/abcdefgh/" and that is correct.
Logically there is no need to EVALUATE, you know your EXPRESSION. You just need its SOURCE, that is, how it is written, not necessarily its value. As for the flags, you just need to use the optional argument of RegExp.
In my situation, I do run into the issue of ^ and $ being used in several expressions I am trying to concatenate together! Those expressions are grammar filters used across the program. Now I want to use some of them together to handle the case of PREPOSITIONS.
I may have to "slice" the sources to remove the starting and ending ^( and/or )$ :)
Cheers, Alex.
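The idea of slicing the ^ and $ off each source before joining could be sketched like this. The helper names stripAnchors and joinAnchored and the two sample patterns are made up for illustration, and the sketch assumes each source has at most one leading ^ and one trailing unescaped $:

```javascript
// Remove a leading ^ and a trailing unescaped $ from a regex source.
function stripAnchors(re) {
    return re.source.replace(/^\^/, "").replace(/(^|[^\\])\$$/, "$1");
}

// Wrap each part in a non-capturing group so alternations inside one part
// do not leak into its neighbours, then anchor the combined pattern once.
function joinAnchored(parts) {
    var body = parts.map(function (re) {
        return "(?:" + stripAnchors(re) + ")";
    }).join("");
    return new RegExp("^" + body + "$");
}

var digits = /^\d+$/, word = /^[a-z]+$/;
var combined = joinAnchored([digits, word]);

console.log(combined.source);        // ^(?:\d+)(?:[a-z]+)$
console.log(combined.test("42abc")); // true
console.log(combined.test("abc42")); // false
```

Flags are dropped here for brevity; they can be passed as the second argument of RegExp as described above.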
Problem If the regexp contains back-matching groups like \1.
var r = /(a|b)\1/ // Matches aa, bb but nothing else.
var p = /(c|d)\1/ // Matches cc, dd but nothing else.
Then just concatenating the sources will not work. Indeed, the combination of the two is:
var rp = /(a|b)\1(c|d)\1/
rp.test("aadd") // Returns false
The solution:
First we count the number of matching groups in the first regex. Then, for each back-matching token in the second, we increment it by that number of groups.
function concatenate(r1, r2) {
var count = function(r, str) {
// match() returns null when nothing matches, so fall back to an empty array
return (str.match(r) || []).length;
}
var numberGroups = /([^\\]|^)(?=\((?!\?:))/g; // Home-made regexp to count groups.
var offset = count(numberGroups, r1.source);
var escapedMatch = /[\\](?:(\d+)|.)/g; // Home-made regexp for escaped literals, greedy on numbers.
var r2newSource = r2.source.replace(escapedMatch, function(match, number) { return number?"\\"+(number-0+offset):match; });
return new RegExp(r1.source+r2newSource,
(r1.global ? 'g' : '')
+ (r1.ignoreCase ? 'i' : '')
+ (r1.multiline ? 'm' : ''));
}
Test:
var rp = concatenate(r, p) // returns /(a|b)\1(c|d)\2/
rp.test("aadd") // Returns true
Providing that:
you know what you do in your regexp;
you have many regex pieces to form a pattern and they will use same flag;
you find it more readable to separate your small pattern chunks into an array;
you also want to be able to comment each part for next dev or yourself later;
you prefer to visually simplify your regex like /this/g rather than new RegExp('this', 'g');
it's ok for you to assemble the regex in an extra step rather than having it in one piece from the start;
Then you may like to write this way:
var regexParts =
[
/\b(\d+|null)\b/,// Some comments.
/\b(true|false)\b/,
/\b(new|getElementsBy(?:Tag|Class|)Name|arguments|getElementById|if|else|do|null|return|case|default|function|typeof|undefined|instanceof|this|document|window|while|for|switch|in|break|continue|length|var|(?:clear|set)(?:Timeout|Interval))(?=\W)/,
/(\$|jQuery)/,
/many more patterns/
],
regexString = regexParts.map(function(x){return x.source}).join('|'),
regexPattern = new RegExp(regexString, 'g');
you can then do something like:
string.replace(regexPattern, function()
{
var m = arguments,
Class = '';
switch(true)
{
// Numbers and 'null'.
case (Boolean)(m[1]):
m = m[1];
Class = 'number';
break;
// True or False.
case (Boolean)(m[2]):
m = m[2];
Class = 'bool';
break;
// Keywords.
case (Boolean)(m[3]):
m = m[3];
Class = 'keyword';
break;
// $ or 'jQuery'.
case (Boolean)(m[4]):
m = m[4];
Class = 'dollar';
break;
// More cases...
}
return '<span class="' + Class + '">' + m + '</span>';
})
In my particular case (a code-mirror-like editor), it is much easier to perform one big regex than a chain of replaces like the following. Each time I wrap an expression in an HTML tag, the next pattern becomes harder to target without affecting the tag itself (and lookbehind, which would help here, was not supported in JavaScript when this was written; it was added in ES2018):
.replace(/(\b\d+|null\b)/g, '<span class="number">$1</span>')
.replace(/(\btrue|false\b)/g, '<span class="bool">$1</span>')
.replace(/\b(new|getElementsBy(?:Tag|Class|)Name|arguments|getElementById|if|else|do|null|return|case|default|function|typeof|undefined|instanceof|this|document|window|while|for|switch|in|break|continue|var|(?:clear|set)(?:Timeout|Interval))(?=\W)/g, '<span class="keyword">$1</span>')
.replace(/\$/g, '<span class="dollar">$</span>')
.replace(/([\[\](){}.:;,+\-?=])/g, '<span class="ponctuation">$1</span>')
It would be preferable to use the literal syntax as often as possible. It's shorter, more legible, and you do not need to escape quotes or double-escape backslashes. From "Javascript Patterns", Stoyan Stefanov 2010.
But using new may be the only way to concatenate.
I would avoid eval. It's not safe.
You could do something like:
function concatRegex(...segments) {
return new RegExp(segments.join(''));
}
The segments would be strings (rather than regex literals) passed in as separate arguments.
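Since the segments are strings, backslashes would normally need doubling. One way around that, shown here only as a sketch, is to write the segments with String.raw; the date-like pattern and the name datePattern are made up for the example:

```javascript
function concatRegex(...segments) {
    return new RegExp(segments.join(''));
}

// String.raw keeps backslashes as written, so \d does not need doubling.
const datePattern = concatRegex(
    String.raw`(\d{4})`, // year
    "-",
    String.raw`(\d{2})`  // month
);

console.log(datePattern.source);          // (\d{4})-(\d{2})
console.log(datePattern.test("2024-05")); // true
```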
You can concat regex source from both the literal and RegExp class:
var xxx = new RegExp(/abcd/);
var zzz = new RegExp(xxx.source + /efgh/.source);
Use the constructor with two parameters. The flags must be passed as the second argument, not appended to the pattern string:
var re_final = new RegExp("\\" + ".", "g"); // pattern and flags as separate arguments
console.log("...finally".replace(re_final, "!") + "\n" + re_final +
" works as expected..."); // prints "!!!finally", then "/\./g works as expected..."
// meanwhile
re_final = new RegExp("\\" + "." + "g"); // "g" becomes part of the pattern: /\.g/
console.log("... finally".replace(re_final, "!")); // "... finally" (nothing replaced)
console.log(re_final, "does not work!"); // /\.g/ does not work!
No, the literal way is not supported. You'll have to use RegExp.
The easiest way, to me, would be to concatenate the sources, e.g.:
a = /\d+/
b = /\w+/
c = new RegExp(a.source + b.source)
the c value will result in:
/\d+\w+/
I prefer to use eval('your expression') because it does not add the / on each end that new RegExp does.
I need to highlight, case insensitively, given keywords in a JavaScript string.
For example:
highlight("foobar Foo bar FOO", "foo") should return "<b>foo</b>bar <b>Foo</b> bar <b>FOO</b>"
I need the code to work for any keyword, and therefore using a hardcoded regular expression like /foo/i is not a sufficient solution.
What is the easiest way to do this?
(This is an instance of a more general problem detailed in the title, but I feel that it's best to tackle with a concrete, useful example.)
You can use regular expressions if you prepare the search string. In PHP e.g. there is a function preg_quote, which replaces all regex-chars in a string with their escaped versions.
Here is such a function for JavaScript (its source is noted in the header comment):
function preg_quote (str, delimiter) {
// discuss at: https://locutus.io/php/preg_quote/
// original by: booeyOH
// improved by: Ates Goral (https://magnetiq.com)
// improved by: Kevin van Zonneveld (https://kvz.io)
// improved by: Brett Zamir (https://brett-zamir.me)
// bugfixed by: Onno Marsman (https://twitter.com/onnomarsman)
// example 1: preg_quote("$40")
// returns 1: '\\$40'
// example 2: preg_quote("*RRRING* Hello?")
// returns 2: '\\*RRRING\\* Hello\\?'
// example 3: preg_quote("\\.+*?[^]$(){}=!<>|:")
// returns 3: '\\\\\\.\\+\\*\\?\\[\\^\\]\\$\\(\\)\\{\\}\\=\\!\\<\\>\\|\\:'
return (str + '')
.replace(new RegExp('[.\\\\+*?\\[\\^\\]$(){}=!<>|:\\' + (delimiter || '') + '-]', 'g'), '\\$&')
}
So you could do the following:
function highlight(str, search) {
return str.replace(new RegExp("(" + preg_quote(search) + ")", 'gi'), "<b>$1</b>");
}
function highlightWords( line, word )
{
var regex = new RegExp( '(' + word + ')', 'gi' );
return line.replace( regex, "<b>$1</b>" );
}
You can enhance the RegExp object with a function that does special character escaping for you:
RegExp.escape = function(str)
{
var specials = /[.*+?|()\[\]{}\\$^]/g; // .*+?|()[]{}\$^
return str.replace(specials, "\\$&");
}
Then you would be able to use what the others suggested without any worries:
function highlightWordsNoCase(line, word)
{
var regex = new RegExp("(" + RegExp.escape(word) + ")", "gi");
return line.replace(regex, "<b>$1</b>");
}
Regular expressions are fine as long as keywords are really words, you can just use a RegExp constructor instead of a literal to create one from a variable:
var re= new RegExp('('+word+')', 'gi');
return s.replace(re, '<b>$1</b>');
The difficulty arises if ‘keywords’ can have punctuation in, as punctuation tends to have special meaning in regexps. Unfortunately unlike most other languages/libraries with regexp support, there is no standard function to escape punctuation for regexps in JavaScript.
And you can't be totally sure exactly what characters need escaping because not every browser's implementation of regexp is guaranteed to be exactly the same. (In particular, newer browsers may add new functionality.) And backslash-escaping characters that are not special is not guaranteed to still work, although in practice it does.
So about the best you can do is one of:
attempting to catch each special character in common browser use today [add: see Sebastian's recipe]
backslash-escape all non-alphanumerics. care: \W will also match non-ASCII Unicode characters, which you don't really want.
just ensure that there are no non-alphanumerics in the keyword before searching
If you are using this to highlight words in HTML which already has markup in it, though, you've got trouble. Your ‘word’ might appear in an element name or attribute value, in which case attempting to wrap a <b> around it will cause brokenness. In more complicated scenarios it could even open an HTML-injection (XSS) security hole. If you have to cope with markup you will need a more complicated approach, splitting out ‘<...>’ markup before attempting to process each stretch of text on its own.
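A rough sketch of the "split out the markup first" approach follows. The names escapeRegExp and highlightInHtml are made up, and the tag pattern is deliberately naive: it assumes well-formed tags with no ">" inside attribute values, and it does not special-case comments or script blocks.

```javascript
// Escape characters that are special inside a regular expression.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function highlightInHtml(html, keyword) {
    var re = new RegExp(escapeRegExp(keyword), "gi");
    // split() with a capturing group keeps the tags in the result array,
    // so odd indexes are tags and even indexes are the text between them.
    return html.split(/(<[^>]*>)/).map(function (chunk, i) {
        return i % 2 ? chunk : chunk.replace(re, "<b>$&</b>");
    }).join("");
}

console.log(highlightInHtml('<a href="/foo">Foo bar</a> foo', "foo"));
// <a href="/foo"><b>Foo</b> bar</a> <b>foo</b>
```

The "foo" inside the href attribute is left alone because that chunk is a tag and is passed through unchanged.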
What about something like this:
if(typeof String.prototype.highlight !== 'function') {
String.prototype.highlight = function(match, spanClass) {
var pattern = new RegExp( match, "gi" ),
replacement = "<span class='" + spanClass + "'>$&</span>";
return this.replace(pattern, replacement);
}
}
This could then be called like so:
var result = "The Quick Brown Fox Jumped Over The Lazy Brown Dog".highlight("brown","text-highlight");
For those poor with disregexia or regexophobia:
function replacei(str, sub, f){
let A = str.toLowerCase().split(sub.toLowerCase());
let B = [];
let x = 0;
for (let i = 0; i < A.length; i++) {
let n = A[i].length;
B.push(str.substr(x, n));
if (i < A.length-1)
B.push(f(str.substr(x + n, sub.length)));
x += n + sub.length;
}
return B.join('');
}
s = 'Foo and FOO (and foo) are all -- Foo.'
t = replacei(s, 'Foo', sub=>'<'+sub+'>')
console.log(t)
Output:
<Foo> and <FOO> (and <foo>) are all -- <Foo>.
Why not just create a new regex on each call to your function? You can use:
new RegExp(pat, flags)
where pat is a string for the pattern, and flags is a string of flags.