Match pattern except under one condition Regex - javascript

I'm trying to match a pattern with regex, except when the pattern is escaped.
Test text:
This is AT\&T&reg; is really cool Regex
You can see that with my \& I'm manually escaping, and therefore I do not want the regex to match there.
Regex:
const str = 'This is AT\\&T&reg; is really cool Regex'; // backslash doubled so it survives the string literal
str.replace(/\&(.*?)\;/g, '<sup>&$1;</sup>');
Expected output:
This is AT&T<sup>&reg;</sup> is really cool Regex
It's hard to explain, I guess: the regex looks for text that starts with & and ends with ;. However, if the & is preceded by a \ (as in \&), it should not match there and should instead look for the next \&(.*?)\;

You can use negative lookbehind
This regex works fine with the example
/(?<!\\)\&(.*?)\;/g
Edit 1
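As a workaround in JS you can use [^\\], which matches any single character except a backslash. The overall regex is /[^\\]\&(.*?)\;/g. It works for your example, but note that it also consumes the character before the & (so capture it if you are replacing) and cannot match an & at the very start of the string.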
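As a rough sketch of that workaround (the helper name wrapEntities is made up for illustration): the version below captures the character before the & and puts it back in the replacement, and adds ^ so that an entity at the very start of the string can still match. One caveat of this approach is that an entity immediately following another one (e.g. &a;&b;) is skipped, because its preceding character was already consumed. On engines that support lookbehind (ES2018 and later), /(?<!\\)&(.*?);/g avoids that.

```javascript
// Hypothetical helper: wraps &...; in <sup> unless the & is preceded by a backslash.
function wrapEntities(text) {
  // $1 = start of string or one non-backslash character, $2 = text between & and ;
  return text.replace(/(^|[^\\])&(.*?);/g, '$1<sup>&$2;</sup>');
}

console.log(wrapEntities('This is AT\\&T&reg; is really cool Regex'));
// This is AT\&T<sup>&reg;</sup> is really cool Regex
```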

Since JavaScript has no support for lookbehind assertions, it is possible to add some custom substitution logic to achieve the desired results. I've updated the test string with examples of different kinds of HTML entities for test purposes:
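const str = '&T;his is AT\\&T&reg; is & really &12345; &xAB05; \\&cool; Regex';
console.log(str.replace(/&([a-z]+|[0-9]{1,5}|x[0-9a-f]{1,4});/ig, function (m0, m1, index, str) {
  // Leave the entity untouched when the character just before it is a backslash.
  // charAt(index - 1) returns '' when the match starts at index 0.
  return (str.charAt(index - 1) !== '\\') ? '<sup>' + m0 + '</sup>' : m0;
}));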

Related

RegEx - Match Character only when it's not preceded or followed by same character

How would I match the quotation marks around "text" in the string below, but not those around "TEST TEXT", using RegEx? I want only the quotation marks that stand by themselves. I tried a negative lookahead (for a second quote) but it still captured the second of the two quotes around TEST TEXT.
This is some "text". This is also some ""TEST TEXT""
Be aware that I need this to scale so sometimes it would be right in the middle of a string so something like this:
/(\s|\w)(\")(?!")/g (using $2...)
Would work in this example but not if the string was:
This is some^"text".This is also some ""TEST TEXT""
I just need quotation marks by themselves.
EDIT
FYI, this needs to be Javascript RegEx so lookbehind would not be an option for me for this one.
Since you have not tagged any particular flavor of regex, I am taking the liberty of using lookbehind as well. You can use:
(?<!")"(?!")[^"]*"
Update: For working with Javascript you can use this regex:
/""[^"]*""|(")([^"]*)(")/
And use captured group # 1 for your text.
I'm not sure if I really understood well your needs. I'll post this answer to check if it helps you but I can delete it if it doesn't.
So, is this what you want using this regex:
"\w+?"
By the way, if you just want to get the content within "..." you can use this regex:
"(\w+?)"
You can't do this with a pure JavaScript regexp. I am going to eat my words now however, as you can use the following solution using callback parameters:
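var subject = 'This is some "text". This is also some ""TEST TEXT""';
var regex = /""+|(")/g;
var replaced = subject.replace(regex, function ($0, $1) {
  // $1 is set only for a lone quote; runs of two or more quotes match the first alternative.
  if ($1 == "\"") return "-"; // What to replace to?
  else return $0;
});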
"This is some -text-. This is also some ""TEST TEXT"""
If you're needing the regex to split the string, then you can use the above to replace matches to something distinctive, then split by them:
var regex = /""+|(")/g;
var replaced = subject.replace(regex, function ($0, $1) {
  if ($1 == "\"") return "☺";
  else return $0;
});
var splits = replaced.split("☺");
["This is some ", "text", ". This is also some ""TEST TEXT"""]
Reference: http://www.rexegg.com/regex-best-trick.html

Javascript regexp that matches '|' not preceded by '\' (lookbehind alternative)

I'm trying to split the string name\|dial_num|032\|0095\\|\\0099|\9925 by the delimiter |, skipping any escaped \|.
I found a solution at this link: Javascript regexp that matches '.' not preceded by '\' (lookbehind alternative), but it skips \\| too.
The right result must be: [name\|dial_num,032\|0095\\,\\0099,\9925].
The rule is that when | is preceded by an even number of backslashes (\\|, \\\\|, etc.), it is still a valid delimiter, but when it is preceded by an odd number (\|, \\\|, etc.), it isn't.
Any help will be appreciated.
The usual workaround is to use match instead of split:
> s = "name\\|dial_num|032\\|0095\\\\|\\\\0099|\\9925"
"name\|dial_num|032\|0095\\|\\0099|\9925"
> s.match(/(\\.|[^|])+/g)
["name\|dial_num", "032\|0095\\", "\\0099", "\9925"]
As a side note, even if JS did support lookbehinds, they wouldn't be a solution, because (?<!\\)\| would also incorrectly skip \\|.
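To make the side note concrete, here is a small comparison, assuming an engine that supports lookbehind (ES2018 and later). The variable names are illustrative only. The match-based pattern consumes each backslash together with the character it escapes, while the lookbehind inspects only one preceding character and therefore misreads the | after an escaped backslash.

```javascript
const s = 'name\\|dial_num|032\\|0095\\\\|\\\\0099|\\9925';

// Consume either a backslash plus the character it escapes, or any non-pipe character.
const byMatch = s.match(/(?:\\.|[^|])+/g);

// The lookbehind sees only the single character before |,
// so the | after an escaped backslash (\\|) is wrongly treated as escaped.
const byLookbehind = s.split(/(?<!\\)\|/);

console.log(byMatch.length);      // 4
console.log(byLookbehind.length); // 3
```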
I challenged myself to use the String replace method.
I got the right result using regex101.com (a popular online tester for PCRE, JavaScript and Python regular expression engines)
// input : name\|dial_num|032\|0095\\|\\0099|\9925
// regex : ([^\\|](?:\\\\)*)\| with global flag
// replacement : $1,
// output: name\|dial_num,032\|0095\\,\\0099,\9925 <= seems okay, right?
Test ..
var str = 'name\\|dial_num|032\\|0095\\\\|\\\\0099|\\9925';
str = str.replace(/([^\\|](?:\\\\)*)\|/g,'$1,');
console.log(str);
// > name\|dial_num,032\|0095\\,\\0099,\9925

javascript regexp "subword" replace

I have a phrase like
"everything is changing around me, wonderfull thing+, tthingxx"
and I want to modify every word that ends in "thing", or that has at most one more character after "thing", like "+" or "h" or "x"...
something like
string = 'everything is changing around me, wonderful thing+, tthingxx'
regex = new RegExp('thing(\\+|[g-z])$','g');
string = string.replace(regex, '<b>thing$1</b>');
What I want: every<b>thing</b> is changing around me, wonderful <b>thing+</b>, tthingxx
The result of my regexp? Nothing matches. If I remove the $, every occurrence of "thing" followed by at least one more character is matched:
everything is changing around me, wonderful <b>thing+</b>, t<b>thingx</b>x
I tried everything but, first, I can't understand technical English very well and, second, I didn't find the answer anywhere.
What do I have to do? Thanks in advance.
the solution I found was using this regular expression
/thing([+g-z]){0,1}\b/g
or with the RegExp (I need it because I have to pass a variable):
myvar = 'thing';
regex = new RegExp(myvar + "([+g-z]){0,1}\\b" , "g");
I was missing the escape \ when building the regular expression the second way. But this isn't enough: the + ends up outside the <b> tag and I don't really know why!
The solution that works as I want is the one by @Qtax:
/thing([+g-z])?(?!\w)/g
Thanks to the community!
To solve the issue with + not matching when using \b you could use (?!\w) instead of \b there, like:
thing[+g-z]?(?!\w)
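A short comparison may help show why the + lands outside the tag with \b. This is only a sketch using the sample sentence from the question; the variable names are illustrative. Because "+" and "," are both non-word characters, there is no word boundary between them, so the \b version backtracks and matches just "thing".

```javascript
const text = 'everything is changing around me, wonderful thing+, tthingxx';

// With \b: no word boundary exists between "+" and ",", so the engine
// backtracks and matches only "thing", leaving "+" outside the tag.
const withBoundary = text.replace(/thing[+g-z]?\b/g, '<b>$&</b>');

// With (?!\w): the match may end on "+", as long as no word character follows.
const withLookahead = text.replace(/thing[+g-z]?(?!\w)/g, '<b>$&</b>');

console.log(withBoundary);
// every<b>thing</b> is changing around me, wonderful <b>thing</b>+, tthingxx
console.log(withLookahead);
// every<b>thing</b> is changing around me, wonderful <b>thing+</b>, tthingxx
```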
Use boundary in your regex
\b\w+thing(\+|[g-z])?\b
If I understand what you want, then:
string = 'everything is changing around me, wonderful thing+, tthingxx';
string = string.replace(/thing(\b|[+g-z]$)/g, '<b>thing$1</b>');
...which results in:
every<b>thing</b> is changing around me, wonderful <b>thing</b>+, tthingxx
\b is a word boundary, so what the regular expression says is anywhere it finds "thing" followed by a word boundary or + or g-z at the end of the string, do the replacement.

JavaScript RegEx Match Failing

I am having issues matching a string using regex in JavaScript. I am trying to get everything up to the word "at". I am using the following, and while it doesn't return any errors, it doesn't seem to do anything either.
var str = "Team A at Team B";
var matches = str.match(/(.*?)(?=at|$)/);
I tried multiple regex patterns before coming across this SO post, Regex to capture everything before first optional string, but it doesn't return what I want.
Remove the ? at your first capturing group, and |$ from your second, and add ^ to mark beginning of string:
str.match(/^(.*)(?=at)/)
Alternatively (I personally find below easier to read, but your call):
str.substr(0, str.search(/\bat\b/))
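The two suggestions behave differently on other inputs: ^(.*)(?=at) is greedy, so it runs up to the last "at" in the string (including "at" inside a word), while the search-based version stops at the first standalone word "at". A minimal sketch of the second approach follows. The helper name beforeAt is hypothetical, and returning the whole string when "at" is absent is an assumption about the desired behavior.

```javascript
// Hypothetical helper: text before the first standalone word "at", trimmed.
function beforeAt(str) {
  const index = str.search(/\bat\b/); // -1 when "at" does not occur as a word
  return index === -1 ? str : str.slice(0, index).trim();
}

console.log(beforeAt('Team A at Team B')); // Team A
console.log(beforeAt('Cats at Bats'));     // Cats

// The greedy pattern runs to the last "at", here the one inside "Bats".
console.log('Cats at Bats'.match(/^(.*)(?=at)/)[1]); // Cats at B
```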

Split string in JavaScript using a regular expression

I'm trying to write a regex for use in javascript.
var script = "function onclick() {loadArea('areaog_og_group_og_consumedservice', '\x26roleOrd\x3d1');}";
var match = new RegExp("'[^']*(\\.[^']*)*'").exec(script);
I would like match to contain two elements:
match[0] == "'areaog_og_group_og_consumedservice'";
match[1] == "'\x26roleOrd\x3d1'";
This regex matches correctly when testing it at gskinner.com/RegExr/ but it does not work in my JavaScript. This issue can be replicated by testing it here: http://www.regextester.com/.
I need the solution to work with Internet Explorer 6 and above.
Can any regex gurus help?
Judging by your regex, it looks like you're trying to match a single-quoted string that may contain escaped quotes. The correct form of that regex is:
'[^'\\]*(?:\\.[^'\\]*)*'
(If you don't need to allow for escaped quotes, /'[^']*'/ is all you need.) You also have to set the g flag if you want to get both strings. Here's the regex in its regex-literal form:
/'[^'\\]*(?:\\.[^'\\]*)*'/g
If you use the RegExp constructor instead of a regex literal, you have to double-escape the backslashes: once for the string literal and once for the regex. You also have to pass the flags (g, i, m) as a separate parameter:
var rgx = new RegExp("'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'", "g");
var result;
while ((result = rgx.exec(script)) !== null) {
  print(result[0]); // print() as found in JS shells; use console.log or alert elsewhere
}
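As an illustrative check of the double-escaping rule, the sketch below builds the same pattern both ways and collects every match. The helper name collect is made up for this example. Note that inside the double-quoted script string, \x26 and \x3d are already turned into & and = before the regex ever sees them.

```javascript
var script = "function onclick() {loadArea('areaog_og_group_og_consumedservice', '\x26roleOrd\x3d1');}";

// Same pattern written two ways: the constructor string needs each backslash doubled again.
var literal = /'[^'\\]*(?:\\.[^'\\]*)*'/g;
var built = new RegExp("'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'", "g");

// Hypothetical helper: gather every match of a global regex.
function collect(rgx, text) {
  var found = [];
  var m;
  rgx.lastIndex = 0; // restart in case the regex object was used before
  while ((m = rgx.exec(text)) !== null) {
    found.push(m[0]);
  }
  return found;
}

console.log(literal.source === built.source); // true
console.log(collect(built, script)); // two matches: both quoted arguments
```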
The regex you're looking for is .*?('[^']*')\s*,\s*('[^']*'). The catch here is that, as usual, match[0] is the entire matched text (this is very normal) so it's not particularly useful to you. match[1] and match[2] are the two matches you're looking for.
var script = "function onclick() {loadArea('areaog_og_group_og_consumedservice', '\x26roleOrd\x3d1');}";
var parameters = /.*?('[^']*')\s*,\s*('[^']*')/.exec(script);
alert("you've done: loadArea("+parameters[1]+", "+parameters[2]+");");
The only issue I have with this is that it's somewhat inflexible. You might want to spend a little time to match function calls with 2 or 3 parameters?
EDIT
In response to your request, here is the regex to match 1,2,3,...,n parameters. If you notice, I used a non-capturing group (the (?: ) part) to find many instances of the comma followed by the second parameter.
/.*?('[^']*')(?:\s*,\s*('[^']*'))*/
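One caveat worth checking before relying on this: in JavaScript, a capturing group inside a repeated group keeps only its last iteration, so the pattern above exposes the first and the last parameter but not the ones in between. The sketch below uses a made-up call string with three parameters and shows a global match as one possible way to collect them all.

```javascript
// Made-up sample call with three single-quoted parameters.
var call = "loadArea('a', 'b', 'c');";

// The repeated group overwrites its capture on every iteration.
var m = /.*?('[^']*')(?:\s*,\s*('[^']*'))*/.exec(call);
console.log(m[1], m[2]); // 'a' 'c'

// A global match returns every quoted parameter instead.
var params = call.match(/'[^']*'/g);
console.log(params.length); // 3
```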
Maybe this:
'([^']*)'\s*,\s*'([^']*)'
