regex intermediate - javascript

I have a string that is formatted like so: "SomeText ($4.56)"
I am using the following regex \${1}[0-9]+\.*[0-9]* to pull out currency data, but the match includes the $ sign.
I am at an intermediate level and understand the pattern as follows:
\${1} looks for exactly one dollar sign
[0-9]+ is one or more digits after it
\.* is zero or more decimal points
[0-9]* is zero or more digits after the decimal point
I want the dollar sign removed though. How do I do this?
Thanks in advance.

That's very simple... Just use groups!
It will look like this:
\$([0-9]+\.*[0-9]*)
And group 1 will return the currency without the $.

You're on the right track, but removing a substring with a regex in JS requires calling the .replace() method. Regex:
\$(\d+(?:\.\d+)?)\b
Breakdown:
\$ Match a literal dollar sign
( Start of capturing group one
\d+(?:\.\d+)? Match a sequence of digits with optional decimals
) End of capturing group one
\b Match a word boundary
JS code:
var str = "SomeText ($4.56)";
console.log(str.replace(/\$(\d+(?:\.\d+)?)\b/g, '$1'));

Surround the stuff you want in brackets to create a capture group.
\$(\d+\.\d{2})
Demo
var regex = /\$(\d+\.\d{2})/;
var string = "SomeText ($4.56)";
var matches = regex.exec(string);
console.log(matches); // match array: full match first, then the capture groups
console.log(matches[1]); // first capture group
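Note that exec returns null when nothing matches, so reading matches[1] would throw on a string without a price. A minimal sketch of a null-safe variant follows; the helper name extractAmount is hypothetical and not part of the answers above.

```javascript
// Hypothetical helper (the name is an assumption, not from the answers above).
// Returns the amount after the first "$" as a string, or null if none is found.
function extractAmount(text) {
  // Group 1 captures the digits, with an optional fractional part.
  const match = /\$(\d+(?:\.\d+)?)/.exec(text);
  return match ? match[1] : null;
}

console.log(extractAmount("SomeText ($4.56)")); // 4.56
console.log(extractAmount("SomeText (free)"));  // null
```

The result is still a string; pass it to parseFloat if a number is needed.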

Related

Regex for network service port definitions

My colleague and I are trying to build a regex (JavaScript) to validate an input field for a specific format.
The field should be a comma-separated list of port declarations and could look like this:
TCP/53,UDP/53,TCP/10-20,UDP/20-30
We tried this regex:
/^[TCP/\d+,|UDP/\d+,|TCP/\d+\-\d+,|UDP/\d+\-\d+,]*[TCP/\d+|UDP/\d+|TCP/\d+\-\d+|UDP/\d+\-\d+]$/g
The regex matches, but it also matches other strings, like this one:
TCP/53UDP53,TCP/10-20UDP20-30
Thanks for any guidance!
You don't need all those alternations, and [ ] defines a character class, so it cannot be used for grouping like that. You can also make the - and digits part optional using a non-capturing group (?:...)?
To match that string format:
^(?:TCP|UDP)\/\d+(?:-\d+)?(?:,(?:TCP|UDP)\/\d+(?:-\d+)?)*$
The pattern matches:
^ Start of string
(?:TCP|UDP) Match one of the alternatives
\/\d+(?:-\d+)? Match / 1+ digits and optionally - and 1+ digits
(?: Non capture group to repeat as a whole part
,(?:TCP|UDP)\/\d+(?:-\d+)? Match a , and repeat the same pattern
)* Close non capture group and optionally repeat (If there should be at least 1 comma, change the * to +)
$ End of string
Regex demo
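As a quick check of the pattern above, here is a small sketch that runs it against the two strings from the question plus one extra case (the trailing-comma sample is an added assumption, not from the question).

```javascript
// Pattern copied from the answer above.
const portList = /^(?:TCP|UDP)\/\d+(?:-\d+)?(?:,(?:TCP|UDP)\/\d+(?:-\d+)?)*$/;

const samples = [
  "TCP/53,UDP/53,TCP/10-20,UDP/20-30", // valid list from the question
  "TCP/53UDP53,TCP/10-20UDP20-30",     // invalid string from the question
  "TCP/53,",                           // trailing comma (added sample)
];

// No g flag, so test() keeps no state between calls.
const results = samples.map((s) => portList.test(s));
console.log(results); // [ true, false, false ]
```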
Alternative: split up the string, use Array.filter and a relatively simple RegExp for testing.
const valid = `TCP/53,UDP/53,TCP/10-20,UDP/20-30`;
const invalid = `TCP/53UDP53,TCP/10-20UDP20-30`;
console.log(`${valid} ok? ${checkInp(valid)}`);
console.log(`${invalid} ok? ${checkInp(invalid)}`);
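function checkInp(str) {
  // Keep only the well-formed entries; if any entry was dropped,
  // the re-joined string is shorter than the input.
  return str.split(`,`)
    .filter(v => /^(TCP|UDP)\/\d+(?:-\d+)?$/.test(v)) // ? allows at most one -digits range
    .join(`,`)
    .length === str.length;
}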
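The same split-and-test idea can be written with Array.prototype.every, which stops at the first bad entry and avoids the length comparison. This is only a sketch; the function name isValidPortList is hypothetical.

```javascript
// Hypothetical helper name. Each entry must be TCP/<port> or UDP/<port>,
// optionally followed by -<port> for a range.
function isValidPortList(str) {
  return str
    .split(",")
    .every((entry) => /^(?:TCP|UDP)\/\d+(?:-\d+)?$/.test(entry));
}

console.log(isValidPortList("TCP/53,UDP/53,TCP/10-20,UDP/20-30")); // true
console.log(isValidPortList("TCP/53UDP53,TCP/10-20UDP20-30"));     // false
```

An empty input splits into a single empty entry, which fails the test, so empty strings are rejected.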

Regular expression capture with optional trailing underscore and number

I'm trying to find a regular expression that will match the base string without the optional trailing number (_123). e.g.:
lorem_ipsum_test1_123 -> capture lorem_ipsum_test1
lorem_ipsum_test2 -> capture lorem_ipsum_test2
I tried using the following expression, but it would only work when there is a trailing _number.
/(.+)(?>_[0-9]+)/
/(.+)(?>_[0-9]+)?/
Similarly, adding the ? (zero or one) quantifier only worked when there is no trailing _number; otherwise, the trailing _number would just be part of the first capture.
Any suggestions?
You may use the following expression:
^(?:[^_]+_)+(?!\d+$)[^_]+
^ Anchor beginning of string.
(?:[^_]+_)+ Repeated non capturing group. Negated character set for anything other than a _, followed by a _.
(?!\d+$) Negative lookahead for digits at the end of the string.
[^_]+ Negated character set for anything other than a _.
Regex demo here.
Please note that the \n in the character sets in the Regex demo are only for demonstration purposes, and should by all means be removed when using as a pattern in Javascript.
Javascript demo:
var myString = "lorem_ipsum_test1_123";
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);
var myString = "lorem_ipsum_test2"
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);
You might match any character and use a negative lookahead that asserts that what follows is not an underscore, one or more digits and the end of the string:
^(?:(?!_\d+$).)*
Explanation
^ Assert start of the string
(?: Non capturing group
(?! Negative lookahead to assert what is on the right side is not
_\d+$ Match an underscore, one or more digits and assert end of the string
) Close negative lookahead
. Match any character except a line break
)* Close non capturing group and repeat zero or more times
Regex demo
const strings = [
"lorem_ipsum_test1_123",
"lorem_ipsum_test2"
];
let pattern = /^(?:(?!_\d+$).)*/;
strings.forEach((s) => {
console.log(s + " ==> " + s.match(pattern)[0]);
});
You are asking for
/^(.*?)(?:_\d+)?$/
See the regex demo. The point here is that the first dot pattern must be non-greedy and the _\d+ should be wrapped with an optional non-capturing group and the whole pattern (especially the end) must be enclosed with anchors.
Details
^ - start of string
(.*?) - Capturing group 1: any zero or more chars other than line break chars, as few as possible due to the non-greedy ("lazy") quantifier *?
(?:_\d+)? - an optional non-capturing group matching 1 or 0 occurrences of _ and then 1+ digits
$ - end of string.
However, it seems easier to use a mere replacing approach,
s = s.replace(/_\d+$/, '')
If the string ends with _ and 1+ digits, the substring will get removed, else, the string will not change.
See this regex demo.
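To compare the two approaches from this answer side by side, here is a minimal sketch using the strings from the question. The function names baseByCapture and baseByReplace are hypothetical.

```javascript
// Capture approach: group 1 holds everything before an optional _digits suffix.
function baseByCapture(s) {
  return s.match(/^(.*?)(?:_\d+)?$/)[1];
}

// Replace approach: strip a trailing _digits suffix if one is present.
function baseByReplace(s) {
  return s.replace(/_\d+$/, "");
}

for (const s of ["lorem_ipsum_test1_123", "lorem_ipsum_test2"]) {
  console.log(baseByCapture(s), baseByReplace(s));
}
```

Both give the same result for single-line input. For strings containing line breaks the capture version can fail to match, because . does not match line break characters.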
Try to check if the string contains the trailing number. If it does you get only the other part. Otherwise you get the whole string.
var str = "lorem_ipsum_test1_123"
if(/_[0-9]+$/.test(str)) {
console.log(str.match(/(.+)(?=_[0-9]+)/g))
} else {
console.log(str)
}
Or, a lot more concise:
str = str.replace(/_[0-9]+$/g, "")

Javascript Regex: negative lookbehind

I am trying to replace in a formula all floating numbers that miss the preceding zero. Eg:
"4+.5" should become: "4+0.5"
Now I read look behinds are not supported in JavaScript, so how could I achieve that? The following code also replaces, when a digit is preceding:
var regex = /(\.\d*)/,
formula1 = '4+1.5',
formula2 = '4+.5';
console.log(formula1.replace(regex, '0$1')); //4+10.5
console.log(formula2.replace(regex, '0$1')); //4+0.5
Try this regex (\D)(\.\d*)
var regex = /(\D)(\.\d*)/,
formula1 = '4+1.5',
formula2 = '4+.5';
console.log(formula1.replace(regex, '$10$2'));
console.log(formula2.replace(regex, '$10$2'));
You may use
s = s.replace(/\B\.\d/g, '0$&')
See the regex demo.
Details
\B\. - matches a . that is either at the start of the string or is not preceded with a word char (letter, digit or _)
\d - a digit.
The 0$& replacement string is adding a 0 right in front of the whole match ($&).
JS demo:
var s = "4+1.5\n4+.5";
console.log(s.replace(/\B\.\d/g, '0$&'));
Another idea is by using an alternation group that matches either the start of the string or a non-digit char, capturing it and then using a backreference:
var s = ".4+1.5\n4+.5";
console.log(s.replace(/(^|\D)(\.\d)/g, '$10$2'));
The pattern will match
(^|\D) - Group 1 (referred to with $1 from the replacement pattern): start of string (^) or any non-digit char
(\.\d) - Group 2 (referred to with $2 from the replacement pattern): a . and then a digit
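Lookbehind assertions were added to JavaScript in ES2018, so on engines that support them (an assumption about your runtime) a negative lookbehind can express the original intent directly. A minimal sketch, with the hypothetical function name addLeadingZero:

```javascript
// Hypothetical helper; requires an engine with ES2018 lookbehind support.
function addLeadingZero(formula) {
  // (?<!\d) asserts that the dot is not preceded by a digit.
  // $& in the replacement stands for the whole match.
  return formula.replace(/(?<!\d)\.\d/g, "0$&");
}

console.log(addLeadingZero("4+1.5")); // 4+1.5
console.log(addLeadingZero("4+.5"));  // 4+0.5
console.log(addLeadingZero(".4+.5")); // 0.4+0.5
```

Unlike the \B version, this also rewrites a dot that follows a letter or underscore, since only a preceding digit blocks the match.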

Regex keeps finding character I want matched along with previous character

I have the following regex in JavaScript for a split operation, since I can't use a negative lookbehind to find any delimiter , in a string that is not preceded by one or more escape characters \.
[^\\],
The regex works fine for finding the commas that are not preceded by \, but it also matches the character that precedes the comma and thus splits the string incorrectly.
For example if I had the string
hello\,there,are
The result would be that e, matches my regex and not just ,. Making the split string array read
[hello\,ther] [are]
Why does the regex I am using keep matching the comma and the preceding character instead of only the comma?
You cannot use split here because you'd need a lookbehind, which JS regex did not support before ES2018. Use a match with an appropriate regex, like the one below:
/(?:[^\\,]|\\.)+/g
See the regex demo.
The pattern matches 1 or more (+) sequences of any char other than , and \ ([^\\,]) or (|) any escaped character (excluding linebreak chars) with \\.
JS demo:
var regex = /(?:[^\\,]|\\.)+/g;
var str = "hello\\,there,are";
var res = str.match(regex);
console.log(res);
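To illustrate why the original pattern swallows a character: [^\\] is not an assertion, it consumes one character, and split discards everything the pattern matched. The sketch below shows that, followed by a lookbehind-based split for engines with ES2018 support (an assumption about your runtime). The variable names are hypothetical.

```javascript
const input = "hello\\,there,are"; // the text hello\,there,are

// [^\\] consumes one character, so the match is two characters long: "e,"
const consumed = input.match(/[^\\],/)[0];
console.log(consumed);

// split drops the whole match, so the "e" is lost along with the comma.
const naive = input.split(/[^\\],/);
console.log(naive); // two parts: hello\,ther and are

// A negative lookbehind checks the previous character without consuming it.
const fixed = input.split(/(?<!\\),/);
console.log(fixed); // two parts: hello\,there and are
```

The lookbehind version only checks a single preceding backslash, so it would treat a comma after an escaped backslash (\\,) as escaped too. The match-based pattern above handles that case.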

Regex for digits and a plus sign

The conditions of the regex are as follows:
Starts with either digits or a '+' sign and ends with digits.
This is going to be used to validate a certain type of number. What I got so far is:
/^\d*|\+\d*$/
This regex seems to match any string though. What would a regex that matches my conditions look like?
The regex will be used in a JavaScript function.
I think you want something like this,
^(?:[+\d].*\d|\d)$
^ Asserts that we are at the start.
[+\d] Matches a plus symbol or a digit.
.* Matches any character zero or more times.
\d Matches a digit.
| OR
\d A single digit.
$ Asserts that we are at the end.
Use this if you also want to match a line that consists of a single plus or a single digit.
^[+\d](?:.*\d)?$
DEMO
You need to use the anchors ^ and $ on both sides of your regex and make the first part (+ or digit) optional.
You can use this regex:
^([+\d].*)?\d$
RegEx Demo
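The question's pattern matches everything because alternation has the lowest precedence: it reads as ^\d* OR \+\d*$, and the first branch matches the empty string at the start of any input. The sketch below shows this and compares the answers. The strict variant is an added assumption about the intended format (only digits after an optional leading +), not something stated in the question.

```javascript
// Parses as (^\d*) | (\+\d*$); the first branch matches "" at position 0.
const original = /^\d*|\+\d*$/;
console.log(original.test("abc")); // true

const answerA = /^(?:[+\d].*\d|\d)$/;
const answerB = /^([+\d].*)?\d$/;
// Assumption: only digits are allowed after an optional leading plus.
const strict = /^\+?\d+$/;

for (const s of ["+123", "5", "+", "abc", "+1a2"]) {
  console.log(s, answerA.test(s), answerB.test(s), strict.test(s));
}
```

Because both answers use .* in the middle, they accept any characters between the first and last one, such as +1a2. The strict variant rejects that input.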
