JavaScript - Replacing multiple parts of a string in one go

I want to replace multiple parts of a string with different things. I have a series of URLs that contain these strings that need to change, they all follow the same pattern.
e.g.
'spanish-beginners-course'
'italian-beginners-course'
'spanish-italian-beginners-course'
I just want the result to be the languages e.g. spanish, italian, spanish italian
I have tried this as a test but it returns 'spanish undefined undefined'
const pageName = 'spanish-beginners-course'
const chars = { '-beginners': '', '-course': '', '-': ' ' }
const language = pageName.replace(/-|beginners|course/g, m => chars[m])

This is happening because your REGEX match finds beginners, but in your chars object there is no key called beginners - it's called -beginners. Same for course/-course.
const pageName = 'spanish-beginners-course'
const chars = { '-beginners': '', '-course': '', '-': ' ' }
const language = pageName.replace(/-|beginners|course/g, m => chars[m])
In any case your object is unnecessary, and so is the regex (as @Alastair points out), since you're removing a static, unchanging substring.
const language = pageName.replace('-beginners-course', '');
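If the goal is the exact output from the question ('spanish italian' for the third URL), removing the suffix is only half the job: the remaining hyphens still need to become spaces. A minimal sketch, assuming every page name ends in -beginners-course (extractLanguages is a made-up name for illustration, not something from the question):

```javascript
// Hypothetical helper, not from the question: strips the fixed suffix,
// then turns any remaining hyphens into spaces.
function extractLanguages(pageName) {
  return pageName
    .replace(/-beginners-course$/, '') // drop the static suffix at the end
    .replace(/-/g, ' ');               // 'spanish-italian' -> 'spanish italian'
}

console.log(extractLanguages('spanish-beginners-course'));         // spanish
console.log(extractLanguages('spanish-italian-beginners-course')); // spanish italian
```

A page name that does not end in the suffix keeps its words and only has its hyphens replaced, so check the input first if that matters.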

Your case is very simple, you can split and take first.
let pageName = "spanish-beginners-course";
let language = pageName.split(/-/)[0];
console.log(language); // spanish
pageName = "italian-beginners-course";
language = pageName.split(/-/)[0];
console.log(language); //italian

You get undefined because the pattern -|beginners|course is an alternation that matches either -, beginners or course.
You use the match to look up the value in the object, but the object's keys are -beginners and -course, so chars[m] is undefined for beginners and for course.
If you want to do the replacement, and there has to be at least 1 word before the suffix, you could capture that leading part in a group, refer to it as $1 in the replacement, and match what you want to remove after it.
(\w+(?:-\w+)*)-beginners-course\b
const pageName = 'spanish-beginners-course';
const language = pageName.replace(/(\w+(?:-\w+)*)-beginners-course\b/g, "$1");
console.log(language)
[
'spanish-beginners-course',
'italian-beginners-course',
'spanish-italian-beginners-course',
'donotremove!-beginners-course'
]
.forEach(s => console.log(s.replace(/(\w+(?:-\w+)*)-beginners-course\b/g, "$1")));

Related

Replace substring with its exact length of another character

I am trying to replace a substring within a string with an exact number of other characters.
Example:
Input: Hello There, General Kenobie!
Output: xxxxx There, xxxxxxx Kenobie!
I can get this to work if I replace it with a preset string:
const text = "'Hello' There, 'General' Kenobie!"
const pattern = /(?:'([^']*)')|(?:"([^"]*)")/g;
console.log(text.replace(pattern, "xxx"));
Output: xxx There, xxx Kenobie!
What am I missing?
Thanks!
You are using a hard-coded string of 'xxx' as your replacement string. So, that's what you are seeing... the string(s) replaced with 'xxx'.
The .replace() method actually supports a function as the replacement, instead of a string, so that's what you need here.
Docs: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#Specifying_a_function_as_a_parameter
const text = "'Hello' There, 'General' Kenobie!"
const pattern = /(?:'([^']*)')|(?:"([^"]*)")/g;
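// Group 1 is set for '...' matches and group 2 for "..." matches, so use whichever is defined.
const newText = text.replace(pattern, (match, single, double) => 'x'.repeat((single ?? double).length));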
console.log(newText);
You can always loop through the matches and replace each separately.
let text = "'Hello' There, 'General' Kenobie!"
const pattern = /(?:'([^']*)')|(?:"([^"]*)")/g;
let array1;
while ((array1 = pattern.exec(text)) !== null) {
const wrap = array1[0][0]; // the quote character that opens (and closes) the match
text = text.replace(array1[0],wrap + "x".repeat(array1[0].length-2) + wrap);
}
console.log(text)

How to do multiple replace with a regex where the term to replace is stored in a variable?

I have a string which looks like below
givenText = "#0How #0much #0sales #0in #0batu #0where #0discount #0on #0sales?"
I have a dictionary which looks like below
termAssignment = {"sales": "#4", "batu": "#2"}
Now I want to replace the terms in the string which matches the keys of the dictionary. But as you can see, the two terms in the string that matches the keys are sales and batu. But I also want to replace the #n that is associated with the terms.
So basically the end result should be
"#0How #0much #4sales #0in #2batu #0where #0discount #0on #4sales?"
So if I do something like this
for(word in termAssignment) {
if(givenText.includes(word)) {
replaceTerm = "#0" + word
givenText = givenText.replace(/replaceTerm/g, termAssignment[word] + word)
}
}
But it doesn't do anything, since I need to apply a global replace and the string to replace is held in a variable.
How can I associate regex with a variable and do replace?
You can join the keys by |, pass to new RegExp, with #\d+ on the left, then use a replacer function. Make sure the part after the #n is in a lookahead and capturing group, then look up the matched capture group (the key)'s value on the object, and replace with that value:
const givenText = "#0How #0much #0sales #0in #0batu #0where #0discount #0on #0sales?";
const termAssignment = {"sales": "#4", "batu": "#2"};
const pattern = new RegExp(
`#\\d+(?=(${Object.keys(termAssignment).join('|')}))`,
'g'
);
const result = givenText.replace(pattern, (_, key) => termAssignment[key]);
console.log(result);
For this case, the generated pattern will be:
#\d+(?=(sales|batu))
It matches #, followed by one or more digits, then captures the word that follows for use in the replacement. In the replacement function, the whole match (e.g. #0), held in the _ variable, is not used, but the second argument is the content of the first capturing group, so it'll be sales or batu. So all you need to do is look up that property on the object and return it, so that the #n part is replaced with the appropriate number.
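One caveat when building a pattern from object keys: a key that contains regex metacharacters (say c++) would be interpreted as pattern syntax. A minimal sketch of the same approach with the keys escaped first; escapeRegExp and reassignTerms are hypothetical helper names, not part of the answer above:

```javascript
// Hypothetical helper: escapes characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: replaces the #n prefix of every term found in `assignment`.
function reassignTerms(text, assignment) {
  const keys = Object.keys(assignment).map(escapeRegExp);
  // #, digits, then a lookahead that captures the key without consuming it
  const pattern = new RegExp(`#\\d+(?=(${keys.join('|')}))`, 'g');
  return text.replace(pattern, (_, key) => assignment[key]);
}

const givenText = "#0How #0much #0sales #0in #0batu #0where #0discount #0on #0sales?";
console.log(reassignTerms(givenText, { sales: "#4", batu: "#2" }));
// #0How #0much #4sales #0in #2batu #0where #0discount #0on #4sales?
```

Like the original pattern, this matches on prefixes, so a key such as sales would also match a term like salesman. It also assumes the assignment object has at least one key.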
Alternatively you can use split and map, without the need for a RegExp. Something like:
const givenText = "#0How #0much #0sales #0in #0batu #0where #0discount #0on #0sales?";
const terms = Object.entries({"sales": "#4", "batu": "#2"});
const parsed = givenText.split(` `)
.map(v => {
const term = terms.find(t => v.includes(t[0]));
return term ? `${term[1]}${v.slice(2)}` : v;
})
.join(` `);
console.log(parsed);
// can be simplified to
const parsed2 = givenText.split(` `)
.map(v => `${(terms.find(t => v.includes(t[0])) || [``,`#0`])[1]}${v.slice(2)}`)
.join(` `);
console.log(parsed2);

How to parse JSON-like file with regex

I have this structure of my input data, it is just like JSON but not containing strings. I only need to parse few information from these data
{ .appVersion = "1230"; DisplayStrings = ( A ); customParameters = ( { name = Axes;.......(continues)}'''
The code looks like this. It matches, but the match runs on until the last semicolon in the data. I have tried all the non-greedy tips and tricks I could find, without success.
const regex = /.appVersion = (".*"?);/
const found = data.match(regex)
console.log(found)
How can I access value saved under .appVersion variable, please?
You need to escape the . before appVersion since it is a special character in Regex and you can use \d instead of .* to match only digits. If you want just the number to be captured, without the quotes you can take them out of the parentheses.
const regex = /\.appVersion = "(\d+)";/
const found = data.match(regex)
const appVersion = found[1];
const string = '{ .appVersion = "1230"; DisplayStrings = (...(continues)';
const appVersion = string.match(/\.appVersion\s*=\s*"([^"]+)"/)[1];
If that's what you need...
I'm not sure where the format you're trying to parse comes from, but consider asking (making) your data provider return json string, so you could easily invoke JSON.parse() which works in both node and browser environments.
You can try the following:
var data='{ .appVersion = "1230"; DisplayStrings = ( A ); customParameters = ( { name = Axes;.......(continues)}';
const regex = /\.appVersion = [^;]*/ // dot escaped so it matches a literal "."; regex test: https://regex101.com/r/urX53f/1
const found = data.match(regex);
var trim = found.toString().replace(/"/g,''); // remove the "" if necessary
console.log(found.toString());
console.log(trim);
Your regex is looking for . which is "any character" in a regex. Escape it with a backslash:
/\.appVersion = ("\d+");/
Don't use .* to capture the value. It's greedy.
You can use something like "[^"]* instead: match a quote, then any character except a quote, as many times as possible.
Try:
const regex = /\.appVersion = "([^"]*)";/
Note that the first dot should also be escaped, and the spaces have to be exactly as in your example.
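Putting the points from these answers together (escaped dot, negated character class instead of .*, tolerant whitespace), here is a small sketch that also guards against the key being absent, since match() returns null in that case. readAppVersion is a hypothetical name, and the sample data is the truncated snippet from the question:

```javascript
// Hypothetical helper: returns the quoted value of .appVersion, or null if it is not found.
function readAppVersion(data) {
  // \. matches a literal dot; [^"]* stops at the first closing quote
  const match = data.match(/\.appVersion\s*=\s*"([^"]*)"\s*;/);
  return match ? match[1] : null;
}

const data = '{ .appVersion = "1230"; DisplayStrings = ( A ); customParameters = ( { name = Axes;.......(continues)}';
console.log(readAppVersion(data));  // 1230
console.log(readAppVersion('{ }')); // null
```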

Matching an exact word from a string

I need a way to match a word against a string and not get false positives. Let me give an example of what I mean:
"/thing" should match the string "/a/thing"
"/thing" should match the string "/a/thing/that/is/here"
"/thing" should NOT match the string "/a/thing_foo"
Basically, it should match if the exact characters are there in the first string and the second, but not if there are run-ons in the second (such as an underscore like in thing_foo).
Right now, I'm doing this, which is not working.
let found = b.includes(a); // true
Hopefully my question is clear enough. Thanks for the help!
Boy, did this turn into a classic XY Problem.
If I had to guess, you want to know if a path contains a particular segment.
In that case, split the string on a positive lookahead for '/' and use Array.prototype.includes()
const paths = ["/a/thing", "/a/thing/that/is/here", "/a/thing_foo"]
const search = '/thing'
paths.forEach(path => {
const segments = path.split(/(?=\/)/)
console.log('segments', segments)
console.info(path, ':', segments.includes(search))
})
Using the positive lookahead expression /(?=\/)/ allows us to split the string on / whilst maintaining the / prefix in each segment.
Alternatively, if you're still super keen in using a straight regex solution, you'll want something like this
const paths = ["/a/thing", "/a/thing/that/is/here", "/a/thing_foo", "/a/thing-that/is/here"]
const search = '/thing'
const rx = new RegExp(search + '\\b') // note the escaped backslash
paths.forEach(path => {
console.info(path, ':', rx.test(path))
})
Note that this will return false positives if the search string is followed by a hyphen or tilde as those are considered to be word boundaries. You would need a more complex pattern and I think the first solution handles these cases better.
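If you do want a regex that avoids the hyphen/tilde issue, one option is to replace \b with a lookahead that only accepts another slash or the end of the string. This is a sketch rather than a drop-in fix; hasSegment and escapeRegExp are hypothetical names:

```javascript
// Hypothetical helper: escapes characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: true when `segment` (e.g. '/thing') appears in `path`
// and is followed by either another '/' or the end of the string.
function hasSegment(path, segment) {
  const rx = new RegExp(escapeRegExp(segment) + '(?=/|$)');
  return rx.test(path);
}

["/a/thing", "/a/thing/that/is/here", "/a/thing_foo", "/a/thing-that/is/here"]
  .forEach(path => console.info(path, ':', hasSegment(path, '/thing')));
```

With the four sample paths, only the first two report true.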
I'd recommend using regular expressions...
e.g. The following regular expression /\/thing$/ - matches anything that ends with /thing.
console.log(/\/thing$/.test('/a/thing')) // true
console.log(/\/thing$/.test('/a/thing_foo')) // false
Update: To use a variable...
var search = '/thing'
console.log(new RegExp(search + '$').test('/a/thing')) // true
console.log(new RegExp(search + '$').test('/a/thing_foo')) // false
Simply with following regex you can do it
var a = "/a/thing";
var b = "/a/thing/that/is/here";
var c = "/a/thing_foo";
var pattern = new RegExp(/(?:(thing)(([^_])|$))/); // (?: ) is the non-capturing group syntax
pattern.test(a) // true
pattern.test(b) // true
pattern.test(c) // false

How can I concatenate regex literals in JavaScript?

Is it possible to do something like this?
var pattern = /some regex segment/ + /* comment here */
/another segment/;
Or do I have to use new RegExp() syntax and concatenate a string? I'd prefer to use the literal as the code is both more self-evident and concise.
Here is how to create a regular expression without using the regular expression literal syntax. This lets you do arbitrary string manipulation before it becomes a regular expression object:
var segment_part = "some bit of the regexp";
var pattern = new RegExp("some regex segment" + /*comment here */
segment_part + /* that was defined just now */
"another segment");
If you have two regular expression literals, you can in fact concatenate them using this technique:
var regex1 = /foo/g;
var regex2 = /bar/y;
var flags = (regex1.flags + regex2.flags).split("").sort().join("").replace(/(.)(?=.*\1)/g, "");
var regex3 = new RegExp(regex1.source + regex2.source, flags);
// regex3 is now /foobar/gy
It's just more wordy than just having expression one and two being literal strings instead of literal regular expressions.
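The same idea generalises to any number of regex objects. A minimal sketch, assuming you simply want the sources joined in order and the union of all flags (joinRegexSources is a hypothetical name):

```javascript
// Hypothetical helper: concatenates the sources of several RegExp objects
// and combines their flags, dropping duplicates.
function joinRegexSources(...parts) {
  const source = parts.map(p => p.source).join('');
  const flags = [...new Set(parts.flatMap(p => [...p.flags]))].join('');
  return new RegExp(source, flags);
}

const combined = joinRegexSources(/foo/g, /bar/y);
console.log(combined.source, combined.flags); // foobar gy
```

Whether taking the union of the flags is the right policy depends on the patterns involved: a flag such as i changes how the other segments match too.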
Just randomly concatenating regular expressions objects can have some adverse side effects. Use the RegExp.source instead:
var r1 = /abc/g;
var r2 = /def/;
var r3 = new RegExp(r1.source + r2.source,
(r1.global ? 'g' : '')
+ (r1.ignoreCase ? 'i' : '') +
(r1.multiline ? 'm' : ''));
console.log(r3);
var m = 'test that abcdef and abcdef has a match?'.match(r3);
console.log(m);
// m should contain 2 matches
This will also give you the ability to retain the regular expression flags from a previous RegExp using the standard RegExp flags.
I don't quite agree with the "eval" option.
var xxx = /abcd/;
var yyy = /efgh/;
var zzz = new RegExp(eval(xxx)+eval(yyy));
will give "//abcd//efgh//" which is not the intended result.
Using source like
var zzz = new RegExp(xxx.source+yyy.source);
will give "/abcdefgh/" and that is correct.
Logically there is no need to EVALUATE: you know your EXPRESSION. You just need its SOURCE, that is, how it is written, not necessarily its value. As for the flags, you just need to use the optional second argument of RegExp.
In my situation, I do run into the issue of ^ and $ being used in several of the expressions I am trying to concatenate! Those expressions are grammar filters used across the program. Now I want to use some of them together to handle the case of PREPOSITIONS.
I may have to "slice" the sources to remove the starting and ending ^( and/or )$ :)
Cheers, Alex.
Problem If the regexp contains back-matching groups like \1.
var r = /(a|b)\1/ // Matches aa, bb but nothing else.
var p = /(c|d)\1/ // Matches cc, dd but nothing else.
Then just concatenating the sources will not work. Indeed, the combination of the two is:
var rp = /(a|b)\1(c|d)\1/
rp.test("aadd") // Returns false
The solution:
First we count the number of matching groups in the first regex, Then for each back-matching token in the second, we increment it by the number of matching groups.
function concatenate(r1, r2) {
var count = function(r, str) {
return (str.match(r) || []).length; // match() returns null when r1 has no groups
}
var numberGroups = /([^\\]|^)(?=\((?!\?:))/g; // Home-made regexp to count groups.
var offset = count(numberGroups, r1.source);
var escapedMatch = /[\\](?:(\d+)|.)/g; // Home-made regexp for escaped literals, greedy on numbers.
var r2newSource = r2.source.replace(escapedMatch, function(match, number) { return number?"\\"+(number-0+offset):match; });
return new RegExp(r1.source+r2newSource,
(r1.global ? 'g' : '')
+ (r1.ignoreCase ? 'i' : '')
+ (r1.multiline ? 'm' : ''));
}
Test:
var rp = concatenate(r, p) // returns /(a|b)\1(c|d)\2/
rp.test("aadd") // Returns true
Providing that:
you know what you do in your regexp;
you have many regex pieces to form a pattern and they will use same flag;
you find it more readable to separate your small pattern chunks into an array;
you also want to be able to comment each part for next dev or yourself later;
you prefer to visually simplify your regex like /this/g rather than new RegExp('this', 'g');
it's ok for you to assemble the regex in an extra step rather than having it in one piece from the start;
Then you may like to write this way:
var regexParts =
[
/\b(\d+|null)\b/,// Some comments.
/\b(true|false)\b/,
/\b(new|getElementsBy(?:Tag|Class|)Name|arguments|getElementById|if|else|do|null|return|case|default|function|typeof|undefined|instanceof|this|document|window|while|for|switch|in|break|continue|length|var|(?:clear|set)(?:Timeout|Interval))(?=\W)/,
/(\$|jQuery)/,
/many more patterns/
],
regexString = regexParts.map(function(x){return x.source}).join('|'),
regexPattern = new RegExp(regexString, 'g');
you can then do something like:
string.replace(regexPattern, function()
{
var m = arguments,
Class = '';
switch(true)
{
// Numbers and 'null'.
case (Boolean)(m[1]):
m = m[1];
Class = 'number';
break;
// True or False.
case (Boolean)(m[2]):
m = m[2];
Class = 'bool';
break;
// Keywords.
case (Boolean)(m[3]):
m = m[3];
Class = 'keyword';
break;
// $ or 'jQuery'.
case (Boolean)(m[4]):
m = m[4];
Class = 'dollar';
break;
// More cases...
}
return '<span class="' + Class + '">' + m + '</span>';
})
In my particular case (a code-mirror-like editor), it is much easier to run one big regex than a chain of replaces like the following: each time I wrap an expression in an HTML tag, the next pattern becomes harder to target without affecting the tag itself (and lookbehind, which would have helped, was not supported in JavaScript when this was written):
.replace(/(\b\d+|null\b)/g, '<span class="number">$1</span>')
.replace(/(\btrue|false\b)/g, '<span class="bool">$1</span>')
.replace(/\b(new|getElementsBy(?:Tag|Class|)Name|arguments|getElementById|if|else|do|null|return|case|default|function|typeof|undefined|instanceof|this|document|window|while|for|switch|in|break|continue|var|(?:clear|set)(?:Timeout|Interval))(?=\W)/g, '<span class="keyword">$1</span>')
.replace(/\$/g, '<span class="dollar">$</span>')
.replace(/([\[\](){}.:;,+\-?=])/g, '<span class="ponctuation">$1</span>')
It would be preferable to use the literal syntax as often as possible. It's shorter, more legible, and you do not need to escape quotes or double-escape backslashes. From "JavaScript Patterns", Stoyan Stefanov, 2010.
But using new RegExp may be the only way to concatenate.
I would avoid eval. It's not safe.
You could do something like:
function concatRegex(...segments) {
return new RegExp(segments.join(''));
}
The segments would be strings (rather than regex literals) passed in as separate arguments.
You can concat regex source from both the literal and RegExp class:
var xxx = new RegExp(/abcd/);
var zzz = new RegExp(xxx.source + /efgh/.source);
Use the constructor with 2 params and keep the flags out of the pattern string:
var re_final = new RegExp("\\" + ".", "g"); // constructor can have 2 params!
console.log("...finally".replace(re_final, "!") + "\n" + re_final +
" works as expected..."); // !!!finally, then /\./g works as expected...
// meanwhile
re_final = new RegExp("\\" + "." + "g"); // "g" becomes part of the pattern: /\.g/
console.log("... finally".replace(re_final, "!")); // ... finally (unchanged)
console.log(re_final, "does not work!"); // /\.g/ does not work!
No, the literal way is not supported. You'll have to use RegExp.
The easiest way, to me, would be to concatenate the sources, e.g.:
a = /\d+/
b = /\w+/
c = new RegExp(a.source + b.source)
the c value will result in:
/\d+\w+/
I prefer to use eval('your expression') because it does not add the /on each end/ that ='new RegExp' does.
