JavaScript regex refactoring - javascript

I'm performing this on a string:
var poo = poo
.replace(/[%][<]/g, "'<")
.replace(/[>][%]/g, ">'")
.replace(/[%]\s*[+]/g, "'+")
.replace(/[+]\s*[%]/g, "+'");
Given the similarity of these statements, can these regexes be combined somehow?

No, I don't think so. At least, I suspect for any transformation involving fewer replaces I can come up with a string that your original and the proposed alternative treat differently. However, it may be that the text you're working with wouldn't trigger the differences, and so for practical purposes a shorter transformation would work as well. Depends on the text.
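As a concrete illustration of that point, here is one input on which the original chain and a single-pass alternation disagree. The merged pattern is only one hypothetical way to combine the four, and the names chained and singlePass are made up for this sketch. The chain rescans its own output, so the + emitted by the third replace can be consumed again by the fourth; a single pass never revisits text it has already replaced.

```javascript
// The original chain of four replaces.
function chained(s) {
  return s
    .replace(/[%][<]/g, "'<")
    .replace(/[>][%]/g, ">'")
    .replace(/[%]\s*[+]/g, "'+")
    .replace(/[+]\s*[%]/g, "+'");
}

// Hypothetical merged version: one pass over an alternation of the same patterns.
function singlePass(s) {
  return s.replace(/%<|>%|%\s*\+|\+\s*%/g, function (match) {
    return match.replace('%', "'").replace(/\s/g, '');
  });
}

console.log(chained('%+%'));    // prints: '+'
console.log(singlePass('%+%')); // prints: '+%
```

Whether such inputs ever occur depends on the text being processed.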

You can simplify it a little bit. You don't need all the character class (bracket) syntax:
poo
.replace(/%</g, "'<")
.replace(/>%/g, ">'")
.replace(/%\s*\+/g, "'+")
.replace(/\+\s*%/g, "+'");

Since in either case, the replacement only turns % into ' and removes spaces:
var poo = 'some annoying %< string >% with some % + text + %';
poo = poo.replace(/%<|>%|%\s*\+|\+\s*%/g, function(match) {
return match.replace('%', '\'').replace(/\s/g,'');
});
// "some annoying '< string >' with some '+ text +'"
Although that's not much simpler...

Using lookahead assertions and capturing:
var poo = poo.replace(/%(?=<)|(>)%|%\s*(?=\+)|(\+)\s*%/g, "$1$2'");
Using capturing alone:
var poo = poo.replace(/(>)%|(\+)\s*%|%(<)|%\s*(\+)/g, "$1$2'$3$4");
With lookbehind assertions (added to JavaScript in ES2018; older engines reject them as a syntax error):
var poo = poo.replace(/%(?=<)|(?<=>)%|%\s*(?=\+)|(?<=\+)\s*%/g, "'");
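A quick check, on the sample string used earlier, that the lookahead and capturing variants give the same result as the four chained replaces. The variable names are illustrative, and agreement on one sample does not prove equivalence on every input.

```javascript
var sample = 'some annoying %< string >% with some % + text + %';

// Reference result: the four chained replaces.
var chained = sample
  .replace(/%</g, "'<")
  .replace(/>%/g, ">'")
  .replace(/%\s*\+/g, "'+")
  .replace(/\+\s*%/g, "+'");

// Lookahead keeps a trailing < or + out of the match;
// groups 1 and 2 re-emit a leading > or +.
var withLookahead = sample.replace(/%(?=<)|(>)%|%\s*(?=\+)|(\+)\s*%/g, "$1$2'");

// Capturing only: groups that did not take part in a match expand to ''.
var withCapturing = sample.replace(/(>)%|(\+)\s*%|%(<)|%\s*(\+)/g, "$1$2'$3$4");

console.log(chained);
// prints: some annoying '< string >' with some '+ text +'
console.log(chained === withLookahead && chained === withCapturing); // prints: true
```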

Related

Regexp (recursive?) to match nested pattern alternative in Javascript? [duplicate]

Example string: $${a},{s$${d}$$}$$
I'd like to match $${d}$$ first and replace it with some text so the string would become $${a},{sd}$$, then $${a},{sd}$$ will be matched.
Annoyingly, Javascript does not provide the PCRE recursive parameter (?R), so it is far from easy to deal with the nested issue. It can be done however.
I won't reproduce code, but if you check out Steve Levithan's blog, he has some good articles on the subject. He should do, he is probably the leading authority on RegExp in JS. He wrote XRegExp, which replaces most of the PCRE bits that are missing, there is even a Match Recursive plugin!
I wrote this myself:
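String.prototype.replacerec = function (pattern, what) {
    var newstr = this.replace(pattern, what);
    // Stop once a pass changes nothing (`this` is a String object, so compare as primitives).
    if (newstr === String(this))
        return newstr;
    // Otherwise recurse on the result.
    return newstr.replacerec(pattern, what);
};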
Usage:
"My text".replacerec(/pattern/g,"what");
P.S.: As suggested by @lededje, when using this function in production it's good to add a limiting counter to avoid a stack overflow.
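A minimal sketch of the limiting counter mentioned above, written as a loop instead of recursion so the call stack cannot grow. The function name replaceUntilStable and the default of 100 passes are assumptions made for this example, not part of the original answer.

```javascript
// Re-apply a replacement until the string stops changing or the pass limit is hit.
function replaceUntilStable(text, pattern, replacement, maxPasses) {
  var limit = maxPasses === undefined ? 100 : maxPasses; // assumed default
  var current = text;
  for (var i = 0; i < limit; i++) {
    var next = current.replace(pattern, replacement);
    if (next === current) {
      return current; // nothing left to replace
    }
    current = next;
  }
  return current; // limit reached: return what we have so far
}

// Innermost $${...}$$ pairs are unwrapped first, then the outer one.
var innermost = /\$\$\{([^$]*)\}\$\$/g;
console.log(replaceUntilStable('$${a},{s$${d}$$}$$', innermost, '$1'));    // prints: a},{sd
console.log(replaceUntilStable('$${a},{s$${d}$$}$$', innermost, '$1', 1)); // prints: $${a},{sd}$$
```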
Since you want to do this recursively, you are probably best off doing multiple matches using a loop.
Regex itself is not well suited for recursive-anything.
You can try \$\${([^\$]*)}\$\$. The [^\$] means the captured group must not contain a $, so only an innermost pair matches:
var re = new RegExp(/\$\${([^\$]*)}\$\$/, 'g'),
original = '$${a},{s$${d}$$}$$',
result = original.replace(re, "$1");
console.log('original: ' + original)
console.log('result: ' + result);
var content = "your string content";
var found = true;
while (found) {
found = false;
content = content.replace(/regex/, () => { found = true; return "new value"; });
}
In general, regexps are not well suited to that kind of problem. It's better to use a state machine.
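One way to read the state machine suggestion is a hand-written scanner that tracks nesting depth. The sketch below is an assumption about what that could look like, not code from the answers: stripNestedDelimiters is a made-up name, and it simply drops every balanced $${ and }$$ instead of substituting new text.

```javascript
// Walk the string once, tracking how many $${ openers are currently unclosed.
function stripNestedDelimiters(text) {
  var OPEN = '$${';
  var CLOSE = '}$$';
  var depth = 0;
  var out = '';
  var i = 0;
  while (i < text.length) {
    if (text.startsWith(OPEN, i)) {
      depth++;               // entering a nested section
      i += OPEN.length;
    } else if (depth > 0 && text.startsWith(CLOSE, i)) {
      depth--;               // leaving the current section
      i += CLOSE.length;
    } else {
      out += text[i];        // ordinary character, or a closer with no opener
      i++;
    }
  }
  return out;                // note: an opener that is never closed is still dropped
}

console.log(stripNestedDelimiters('$${a},{s$${d}$$}$$')); // prints: a},{sd
```

Unlike repeated regex passes, this reads the input once regardless of nesting depth.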

Get sub-string within non-standard pattern

I want to get a sub-string in the cases below:
TruckScaleEntry_DriverId_1535
^------^
EntryCreateForm_TruckScaleEntry_DriverId_1535_SelectWidget_Code
^------^
In the examples above I want DriverId (it isn't always DriverId; it may change as well), but I never know which pattern I'm dealing with. I got it to work with two regexes and two methods (match and replace) together, but I want to know if there is a better, simpler way to achieve it.
What I got is:
console.log("TruckScaleEntry_DriverId_1535".match(/(?:.*TruckScaleEntry_)[^_]*/)[0].replace(/^.*_/, ''));
console.log("EntryCreateForm_TruckScaleEntry_DriverId_1535_SelectWidget_Code".match(/(?:.*TruckScaleEntry_)[^_]*/)[0].replace(/^.*_/, ''));
Yeah it is ugly. Can I do something clearer with just one regex and one method?
You could use this regular expression with replace:
var s = 'EntryCreateForm_TruckScaleEntry_DriverId_1535_SelectWidget_Code';
res = s.replace(/^.*?TruckScaleEntry_(.+?)_.*$/, '$1');
document.write(res + '<br>');
// Alternative
res = s.match(/TruckScaleEntry_(.+?)_/)[1];
document.write(res + '<br>');
Results of capturing groups are stored in match objects, which can be accessed by match[1] etc.
JS Demo
var str = 'EntryCreateForm_TruckScaleEntry_DriverId_1535_SelectWidget_Code';
var regex = /TruckScaleEntry_([^_]+)/;
document.writeln(str.match(regex)[1]);
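Both answers index into the match result directly, which throws when the marker is absent. A small sketch that wraps the same [^_]+ idea in a helper and returns null instead; fieldAfter and its parameter names are hypothetical, and the marker is escaped in case it ever contains regex metacharacters.

```javascript
// Return the underscore-delimited field that follows `marker`, or null if not found.
function fieldAfter(marker, text) {
  // Escape regex metacharacters so the marker is matched literally.
  var escaped = marker.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var match = new RegExp(escaped + '_([^_]+)').exec(text);
  return match ? match[1] : null;
}

console.log(fieldAfter('TruckScaleEntry', 'TruckScaleEntry_DriverId_1535'));
// prints: DriverId
console.log(fieldAfter('TruckScaleEntry',
  'EntryCreateForm_TruckScaleEntry_DriverId_1535_SelectWidget_Code'));
// prints: DriverId
console.log(fieldAfter('TruckScaleEntry', 'SomethingElse_1535'));
// prints: null
```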

JSLint - Bad Escaping for a Regex with variables

I've seen some posts about JSLint "bad escapement" warnings, but I just wanted to see if I'm doing this regex correctly. (Note: I'm a dabbler programmer.)
I have a function (below) that attempts to parse out a variable from its name in a long message. The regex is working well, but should I change something in response to the JSLint warning?
A very simplified version of msg could look like this essentially:
VariableName1 = Value1
VariableName2 = Value2
VariableName3 = Value3
The actual msg has different unstructured data above and below. I had to use a strange regex: even though a simpler one worked on all the testing websites, it didn't work within the server application we are using, so this is the only way I could get it to work. The regular expression incorporates a variable.
Here is the parsing function I'm using:
function parseValue(msg, strValueName) {
var myRegexp = new RegExp(strValueName + ' = ([A-Z3][a-zA-Z\. 3]+)[\\n\\r]+', 'gm');
log('parseValue', 'myRegexp = ' + myRegexp.toString());
var match = myRegexp.exec(msg);
log('parseValue', 'returning match = ' + match[1] );
return match[1];
}
There is probably something much simpler that a 'real' programmer can come up with pretty easily. Any help would be appreciated.
Thanks.
The problem that JSLint didn't like was the '.' character in the character class as pointed out by 'Explosion Pills'.
When I removed the '.' all was good.
Thanks.
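For reference, a sketch of the function with the unneeded escape removed. Inside a string literal '\.' is just '.', which is what the warning is about, and a dot needs no escaping inside a character class anyway. The escapeRegExp helper, the null return on a failed match, and the sample message are additions made for this example, not part of the original code.

```javascript
// Escape regex metacharacters so a variable name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function parseValue(msg, strValueName) {
  // '.' is literal inside [...]; '\\n' and '\\r' reach the regex as \n and \r.
  var myRegexp = new RegExp(
    escapeRegExp(strValueName) + ' = ([A-Z3][a-zA-Z. 3]+)[\\n\\r]+', 'm');
  var match = myRegexp.exec(msg);
  return match ? match[1] : null; // avoid a TypeError when nothing matches
}

// Made-up sample message for illustration.
var msg = 'Status = Ready\nOwner = J. Smith\n';
console.log(parseValue(msg, 'Owner'));   // prints: J. Smith
console.log(parseValue(msg, 'Missing')); // prints: null
```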

jquery / javascript: regex to replace instances of an html tag

I'm trying to take some parsed XML data, search it for instances of the <font> tag, and replace that tag (and anything that may be inside the font tag) with a simple <span> tag.
This is how I've been doing my regexes:
var emailReg = /^([\w-\.]+@([\w-]+\.)+[\w-]{2,4})?$/; //Test against valid email
console.log('regex: ' + emailReg.test(testString));
and so I figured the font regex would be something like this:
var fontReg = /'<'+font+'[^><]*>|<.'+font+'[^><]*>','g'/;
console.log('regex: ' + fontReg.test(testString));
but that isn't working. Anyone know a way to do this? Or what I might be doing wrong in the code above?
I think namuol's answer will serve you better than any RegExp-based solution, but I also think the RegExp deserves some explanation.
JavaScript doesn't allow for interpolation of variable values in RegExp literals.
The quotations become literal character matches and the addition operators become 1-or-more quantifiers. So, your current regex becomes capable of matching these:
# left of the pipe `|`
'<'font'>
'<''''fontttt'>
# right of the pipe `|`
<#'font'>','g'
<#''''fontttttt'>','g'
But, it will not match these:
<font>
</font>
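These claims can be checked directly by testing the literal from the question against the strings listed above (brokenReg is just a name chosen for this check):

```javascript
// The literal exactly as written in the question: quotes, plus signs and ','g' are all part of the pattern.
var brokenReg = /'<'+font+'[^><]*>|<.'+font+'[^><]*>','g'/;

console.log(brokenReg.test("'<'font'>"));        // prints: true
console.log(brokenReg.test("'<''''fontttt'>"));  // prints: true
console.log(brokenReg.test("<#'font'>','g'"));   // prints: true
console.log(brokenReg.test("<font>"));           // prints: false
console.log(brokenReg.test("</font>"));          // prints: false
```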
To inject a variable value into a RegExp, you'll need to use the constructor and string concat:
var fontReg = new RegExp('<' + font + '[^><]*>|<.' + font + '[^><]*>', 'g');
On the other hand, if you meant for literal font, then you just needed:
var fontReg = /<font[^><]*>|<.font[^><]*>/g;
Also, each of those can be shortened by using .?, allowing the halves to be combined:
var fontReg = new RegExp('<.?' + font + '[^><]*>', 'g');
var fontReg = /<.?font[^><]*>/g;
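The patterns above only find the tags. A hedged sketch of the replacement step follows, assuming the goal is to turn font tags into attribute-free span tags. fontToSpan is a hypothetical name, and the pattern here captures an optional slash instead of using .? so that closing tags stay closing tags. Regex-based tag rewriting is fragile on real-world markup, so treat this as an illustration only.

```javascript
// Swap <font ...> and </font> for <span> and </span>, dropping any attributes.
function fontToSpan(html) {
  // $1 is "/" for a closing tag and empty for an opening tag;
  // \b stops the pattern from matching longer tag names that start with "font".
  return html.replace(/<(\/?)font\b[^><]*>/gi, '<$1span>');
}

console.log(fontToSpan('<p><font color="red">Hi</font></p>'));
// prints: <p><span>Hi</span></p>
```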
If I understand your problem correctly, this should replace all font tags with simple span tags using jQuery:
$('font').replaceWith(function () {
return $('<span>').append($(this).contents());
});
Here's a working fiddle: http://jsfiddle.net/RhLmk/2/

Regular expression to replace spaces with dashes

I'm trying to work out what regular expression I would need to take a value and replace spaces with a dash (in Javascript)?
So say if I had North America, it would return me North-America ?
Can I do something like var foo = bar.replace(' ', '-') ?
It's better to use:
var string = "find this and find that".replace(/find/g, "found");
to replace all occurrences.
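The reason is that a string pattern passed to replace only affects the first occurrence, while a regex with the g flag affects every occurrence. A short comparison (the variable names are illustrative):

```javascript
var place = 'New South Wales';

// String pattern: only the first space is replaced.
var firstOnly = place.replace(' ', '-');

// Regex with the g flag: every space is replaced.
var allSpaces = place.replace(/ /g, '-');

// / +/g treats a run of spaces as one match, so it yields a single dash.
var collapsed = 'North   America'.replace(/ +/g, '-');

console.log(firstOnly); // prints: New-South Wales
console.log(allSpaces); // prints: New-South-Wales
console.log(collapsed); // prints: North-America
```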
The best source of information for regular expressions in various languages that I've found is Regular-Expressions.info (and I linked directly to the Javascript section there).
As for your particular question, yes, you can do something like that. Did you try it?
var before = 'North America';
var after = before.replace(/ +/g, '-')
alert('"' + before + '" becomes "' + after + '"');
Use the site I showed you to analyze the regex above. Note how it replaces one or more spaces with a single hyphen, as you requested.
For most regular expressions, you can check your work by trying them in a regular expression tester.
