Javascript Regex: negative lookbehind - javascript

I am trying to replace, in a formula, all floating-point numbers that are missing the leading zero. E.g.:
"4+.5" should become: "4+0.5"
Now I read look behinds are not supported in JavaScript, so how could I achieve that? The following code also replaces, when a digit is preceding:
var regex = /(\.\d*)/,
formula1 = '4+1.5',
formula2 = '4+.5';
console.log(formula1.replace(regex, '0$1')); //4+10.5
console.log(formula2.replace(regex, '0$1')); //4+0.5

Try this regex (\D)(\.\d*)
var regex = /(\D)(\.\d*)/,
formula1 = '4+1.5',
formula2 = '4+.5';
console.log(formula1.replace(regex, '$10$2'));
console.log(formula2.replace(regex, '$10$2'));

You may use
s = s.replace(/\B\.\d/g, '0$&')
See the regex demo.
Details
\B\. - matches a . that is either at the start of the string or is not preceded with a word char (letter, digit or _)
\d - a digit.
The 0$& replacement string is adding a 0 right in front of the whole match ($&).
JS demo:
var s = "4+1.5\n4+.5";
console.log(s.replace(/\B\.\d/g, '0$&'));
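To see where \B lets the pattern match, here is a small sketch. The sample inputs are made up for illustration. \B holds at the start of the string before a non-word char and between two non-word chars, so a dot that directly follows a digit or a letter is left alone.

```javascript
// Sample inputs are illustrative assumptions, not taken from the question.
const samples = ['.5', '(.5)', '4+.5', '4+1.5', 'x.5'];

// Prepend "0" to every dot+digit that is not preceded by a word character.
const results = samples.map((s) => s.replace(/\B\.\d/g, '0$&'));

console.log(results);
// '.5', '(.5)' and '4+.5' gain a leading zero; '4+1.5' and 'x.5' stay unchanged
```

Note that 'x.5' is not touched, because x is a word char. Whether that is desirable depends on what the formulas may contain.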
Another idea is to use an alternation group that matches either the start of the string or a non-digit char, capture it, and then use a backreference in the replacement:
var s = ".4+1.5\n4+.5";
console.log(s.replace(/(^|\D)(\.\d)/g, '$10$2'));
The pattern will match
(^|\D) - Group 1 (referred to with $1 from the replacement pattern): start of string (^) or any non-digit char
(\.\d) - Group 2 (referred to with $2 from the replacement pattern): a . and then a digit
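As a sketch, the capture-group approach can be wrapped in a helper that uses a replacer callback instead of the '$10$2' string, which avoids any ambiguity about how $10 is parsed. The second function assumes a runtime with ES2018 lookbehind support (current browsers and Node.js), where the emulation is no longer needed. Both function names are hypothetical.

```javascript
// Hypothetical helper names; neither function appears in the answers above.

// Capture-group emulation: keep whatever preceded the dot via a callback.
function addLeadingZero(formula) {
  return formula.replace(
    /(^|\D)(\.\d)/g,
    (match, before, decimal) => before + '0' + decimal
  );
}

// ES2018+ alternative, assuming the runtime supports lookbehind assertions:
// a dot that is not preceded by a digit and is followed by a digit.
function addLeadingZeroLookbehind(formula) {
  return formula.replace(/(?<!\d)\.(?=\d)/g, '0.');
}

console.log(addLeadingZero('4+.5'));             // 4+0.5
console.log(addLeadingZero('4+1.5'));            // 4+1.5
console.log(addLeadingZeroLookbehind('.5*.25')); // 0.5*0.25
```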

Related

Regular expression capture with optional trailing underscore and number

I'm trying to find a regular expression that will match the base string without the optional trailing number (_123). e.g.:
lorem_ipsum_test1_123 -> capture lorem_ipsum_test1
lorem_ipsum_test2 -> capture lorem_ipsum_test2
I tried using the following expression, but it would only work when there is a trailing _number.
/(.+)(?>_[0-9]+)/
/(.+)(?>_[0-9]+)?/
Similarly, adding the ? (zero or one) quantifier only worked when there is no trailing _number; otherwise, the trailing _number would just be part of the first capture.
Any suggestions?
You may use the following expression:
^(?:[^_]+_)+(?!\d+$)[^_]+
^ Anchor beginning of string.
(?:[^_]+_)+ Repeated non capturing group. Negated character set for anything other than a _, followed by a _.
(?!\d+$) Negative lookahead for digits at the end of the string.
[^_]+ Negated character set for anything other than a _.
Regex demo here.
Please note that the \n in the character sets in the Regex demo are only for demonstration purposes, and should by all means be removed when using as a pattern in Javascript.
Javascript demo:
var myString = "lorem_ipsum_test1_123";
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);
var myString = "lorem_ipsum_test2"
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);
You might match any character and use a negative lookahead that asserts that what follows is not an underscore, one or more digits and the end of the string:
^(?:(?!_\d+$).)*
Explanation
^ Assert start of the string
(?: Non capturing group
(?! Negative lookahead to assert what is on the right side is not
_\d+$ Match an underscore, one or more digits and assert end of the string
) Close negative lookahead
. Match any character
)* Close non capturing group and repeat zero or more times
Regex demo
const strings = [
"lorem_ipsum_test1_123",
"lorem_ipsum_test2"
];
let pattern = /^(?:(?!_\d+$).)*/;
strings.forEach((s) => {
console.log(s + " ==> " + s.match(pattern)[0]);
});
You are asking for
/^(.*?)(?:_\d+)?$/
See the regex demo. The point here is that the first dot pattern must be non-greedy and the _\d+ should be wrapped with an optional non-capturing group and the whole pattern (especially the end) must be enclosed with anchors.
Details
^ - start of string
(.*?) - Capturing group 1: any zero or more chars other than line break chars, as few as possible due to the non-greedy ("lazy") quantifier *?
(?:_\d+)? - an optional non-capturing group matching 1 or 0 occurrences of _ and then 1+ digits
$ - end of string.
However, it seems easier to use a mere replacing approach,
s = s.replace(/_\d+$/, '')
If the string ends with _ and 1+ digits, the substring will get removed, else, the string will not change.
See this regex demo.
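A minimal sketch of both approaches from this answer, wrapped in helpers so they can be compared side by side. The function names are hypothetical, and the fallback to the original string when nothing matches is an assumption (the dot does not match line breaks, so a multi-line input would not match the first pattern).

```javascript
// Hypothetical helper names, for illustration only.

// Matching approach: lazy capture plus an optional trailing _digits group.
function baseNameByMatch(s) {
  const m = /^(.*?)(?:_\d+)?$/.exec(s);
  // Assumption: return the input unchanged if the pattern does not match.
  return m ? m[1] : s;
}

// Replacing approach: drop a trailing _digits suffix if there is one.
function baseNameByReplace(s) {
  return s.replace(/_\d+$/, '');
}

console.log(baseNameByMatch('lorem_ipsum_test1_123'));   // lorem_ipsum_test1
console.log(baseNameByMatch('lorem_ipsum_test2'));       // lorem_ipsum_test2
console.log(baseNameByReplace('lorem_ipsum_test1_123')); // lorem_ipsum_test1
```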
Try to check if the string contains the trailing number. If it does you get only the other part. Otherwise you get the whole string.
var str = "lorem_ipsum_test1_123"
if(/_[0-9]+$/.test(str)) {
console.log(str.match(/(.+)(?=_[0-9]+)/g))
} else {
console.log(str)
}
Or, a lot more concise:
str = str.replace(/_[0-9]+$/g, "")

regex intermediate

I have a string that is formatted like so: "SomeText ($4.56)"
I am using the following regex \${1}[0-9]+\.*[0-9]* to pull out currency data but it includes the $ sign.
I am at an intermediate level and understand it as follows:
\$ is looking for the dollar sign
[0-9]+ is one or more digits after it
\.* is finding zero or more decimal points
[0-9]* is finding zero or more digits after the decimal
I want the dollar sign removed though. How do I do this?
Thanks in advance.
That's very simple... Just use groups!
It will look like this:
\$([0-9]+\.*[0-9]*)
And group 1 will return the currency without the $.
You're doing well, but removing a substring in JS with a regex requires calling the .replace() method. Regex:
\$(\d+(?:\.\d+)?)\b
Breakdown:
\$ Match a literal dollar sign
( Start of capturing group one
\d+(?:\.\d+)? Match a sequence of digits with optional decimals
) End of capturing group
\b Match a word boundary
JS code:
var str = "SomeText ($4.56)";
console.log(str.replace(/\$(\d+(?:\.\d+)?)\b/g, '$1'));
Surround the part you want in parentheses to create a capture group.
\$(\d+\.\d{2})
Demo
var regex = /\$(\d+\.\d{2})/;
var string = "SomeText ($4.56)";
var matches = regex.exec(string);
console.log(matches); // full match followed by the capture group
console.log(matches[1]); // first capture group
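If the text may contain more than one amount, the capture group can be combined with String.prototype.matchAll (ES2020). The following is only a sketch: the helper name and the second sample string are assumptions, and converting with Number is one possible choice for turning the captured text into a numeric value.

```javascript
// Hypothetical helper: collects every dollar amount in the text as a number.
function extractAmounts(text) {
  // Group 1 holds the digits, with an optional decimal part, without the $.
  const pattern = /\$(\d+(?:\.\d+)?)/g;
  return Array.from(text.matchAll(pattern), (m) => Number(m[1]));
}

console.log(extractAmounts('SomeText ($4.56)'));         // [ 4.56 ]
console.log(extractAmounts('from $3 up to $10.5 each')); // [ 3, 10.5 ]
```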

How do I convert a PHP regex with a lookbehind to Javascript?

Javascript doesn't support lookbehinds in regexes. How do I convert the following PHP regex to Javascript?
regPattern="(?<!\\)\\x"
Here is the test case (in Node.js):
var str = '{"key":"abc \\x123 \xe2\x80\x93 xyz"}'
var newStr = str.replace(/regPattern/g, '\\u')
console.log(newStr); // output: '{"key":"abc \\x123 \ue2\u80\u93 xyz"}'
\\x123 doesn't match because it contains \\x, but \x matches.
Try this:
var newStr = str.replace(/([^\\]|^)\\x/g, '$1\\u');
In other words, match the ^ (start of string) or any non-\ character, followed by \x, capturing the first character in capture group 1.
Then replace the whole 3-character matched group with capture group 1, followed by \u.
For example, in abc?\x, the string ?\x will be matched, and capture group 1 will be ?. So we replace the match (?\x) with $1\u, which evaluates to ?\u. So abc?\x -> abc?\u.
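Here is a sketch that runs the capture-group pattern next to a real lookbehind, assuming a runtime with ES2018 lookbehind support. The helper names are hypothetical. String.raw is used so that the backslashes in the sample stay literal; in an ordinary string literal, \xe2 would already be turned into a character before the regex ever sees it.

```javascript
// Hypothetical helper names, for illustration only.

// Emulation: capture the preceding non-backslash char (or start of string).
function replaceWithCapture(str) {
  return str.replace(/([^\\]|^)\\x/g, '$1\\u');
}

// ES2018+ lookbehind, equivalent to the PHP pattern (?<!\\)\\x.
function replaceWithLookbehind(str) {
  return str.replace(/(?<!\\)\\x/g, '\\u');
}

// String.raw keeps every backslash exactly as typed.
const input = String.raw`{"key":"abc \\x123 \xe2\x80\x93 xyz"}`;
console.log(replaceWithCapture(input));
console.log(replaceWithLookbehind(input));

// Two escapes back to back: the emulation needs a char to consume before
// each \x, so it misses the second one, while the lookbehind does not.
const adjacent = '\\x\\x';
console.log(replaceWithCapture(adjacent));    // \u\x
console.log(replaceWithLookbehind(adjacent)); // \u\u
```

For the sample from the question both versions give the same result. They only differ when two \x sequences are directly adjacent.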

javascript regex to check if first and last character are similar?

Is there any simple way to check if first and last character of a string are the same or not, only with regex?
I know you can check with charAt
var firstChar = str.charAt(0);
var lastChar = str.charAt(str.length - 1);
console.log(firstChar === lastChar);
I'm not asking for this : Regular Expression to match first and last character
You can use a regex with a capturing group and a backreference to it to assert that the starting and ending characters are the same, by capturing the first character. To test for a match, use the RegExp#test method.
var regex = /^(.).*\1$/;
console.log(
regex.test('abcdsa')
)
console.log(
regex.test('abcdsaasaw')
)
Regex explanation here :
^ asserts position at start of the string
1st Capturing Group (.)
.* matches any character (except newline) - between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\1 matches the same text as most recently matched by the 1st capturing group
$ asserts position at the end of the string
The . doesn't match newline characters; to include newlines, update the regex:
var regex = /^([\s\S])[\s\S]*\1$/;
console.log(
regex.test(`abcd
sa`)
)
console.log(
regex.test(`ab
c
dsaasaw`)
)
Refer : How to use JavaScript regex over multiple lines?
Regex explanation here :
[.....] - matches a single character present in the list
\s - matches any whitespace character (equal to [\r\n\t\f\v ])
\S - matches any non-whitespace character (equal to [^\r\n\t\f\v ])
Finally, [\s\S] matches any character.
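On runtimes with the ES2018 s (dotAll) flag, the dot itself can match line breaks, so the [\s\S] workaround is optional. A minimal sketch follows; the helper name is hypothetical, and treating a one-character string as "same first and last" is an assumption that the patterns above do not make.

```javascript
// Hypothetical helper; counting a one-character string as a match is an assumption.
function sameFirstAndLast(str) {
  // s (dotAll) lets . match line breaks; the optional group allows length 1.
  return /^(.)(?:.*\1)?$/s.test(str);
}

console.log(sameFirstAndLast('abcdsa'));     // true
console.log(sameFirstAndLast('abcdsaasaw')); // false
console.log(sameFirstAndLast('abcd\nsa'));   // true
console.log(sameFirstAndLast('a'));          // true
```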
You can try it
const rg = /^([\w\W]+)[\w\W]*\1$/;
console.log(
rg.test(`abcda`)
)
console.log(
rg.test(`aebcdae`)
)
console.log(
rg.test(`aebcdac`)
)
var rg = /^([ab])[ab]*\1$|^[ab]$/;
console.log(rg.test('aabbaa'))
console.log(rg.test('a'))
console.log(rg.test('b'))
console.log(rg.test('bab'))
console.log(rg.test('baba'))
This makes sure that the string contains no characters other than a and b and that it starts and ends with the same character.
It will also match single characters because they too start and end with same character.

translating RegEx syntax working in php and python to JS

I have this RegEx syntax: "(?<=[a-z])-(?=[a-z])"
It captures a dash between 2 lowercase letters. In example below the second dash is captured:
Krynica-Zdrój, ul. Uzdro-jowa
Unfortunately I can't use <= in JS.
My ultimate goal is to remove the hyphen with RegEx replace.
It seems to me you need to remove the hyphen in between lowercase letters.
Use
var s = "Krynica-Zdrój, ul. Uzdro-jowa";
var res = s.replace(/([a-z])-(?=[a-z])/g, "$1");
console.log(res);
Note that the lookbehind is turned into a simple capturing group, while the lookahead can stay as it is: because it does not consume the following letter, the pattern can deal with overlapping matches, such as chains of single lowercase letters joined by hyphens.
Details:
([a-z]) - Group 1 capturing a lowercase ASCII letter
- - a hyphen
(?=[a-z]) - that is followed with a lowercase ASCII letter that is not added to the result
/g - a global modifier, search for all occurrences of the pattern
"$1" - the replacement pattern containing just the backreference to the value stored in Group 1 buffer.
VBA sample code:
Sub RemoveExtraHyphens()
Dim s As String
Dim reg As New regexp
reg.pattern = "([a-z])-(?=[a-z])"
reg.Global = True
s = "Krynica-Zdroj, ul. Uzdro-jowa"
Debug.Print reg.Replace(s, "$1")
End Sub
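To illustrate why the trailing letter is checked with a lookahead rather than consumed, here is a short JavaScript sketch. The helper names and the "a-b-c" input are made up for illustration. When the second letter is consumed, it is no longer available as the leading letter of the next match.

```javascript
// Hypothetical helpers comparing a lookahead with a consuming second group.

// The lookahead checks the next letter without consuming it.
function removeInnerHyphens(s) {
  return s.replace(/([a-z])-(?=[a-z])/g, '$1');
}

// Consuming variant: the second letter becomes part of the match.
function removeInnerHyphensConsuming(s) {
  return s.replace(/([a-z])-([a-z])/g, '$1$2');
}

console.log(removeInnerHyphens('Krynica-Zdrój, ul. Uzdro-jowa')); // Krynica-Zdrój, ul. Uzdrojowa
console.log(removeInnerHyphens('a-b-c'));          // abc
console.log(removeInnerHyphensConsuming('a-b-c')); // ab-c
```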
