Conditional(?) regex for numbers

I was trying to create a regex that matches numbers whether or not they are wrapped in brackets, for example:
(1.000.000,00) //match
(1.000,00) //match
(100,00) //match
(10) //match
(1) //match
(2.000.000,00 //don't match
(2.000,00 //don't match
(200,00 //don't match
(20 //don't match
(2 //don't match
3.000.000,00) //don't match
3.000,00) //don't match
300,00) //don't match
30) //don't match
3) //don't match
4.000.000,00 //should match
4.000,00 //should match
400,00 //should match
40 //should match
4 //should match
I need to match only numbers (in brackets or not), but only if they have both brackets or neither.
At the moment this is what I came up with: \((\d+[\.,]?)+\d*\). It matches the "match" cases and rejects the "don't match" cases, but it should also match the "should match" cases.
I've added the javascript tag because I'm using this regex in JS, and not all regex tokens work in the JS regex constructor.
I'm also posting a regex101 link.

If supported, you can use a negative lookbehind to match either with or without parentheses:
\(\d+(?:[.,]\d+)*\)|(?<!\S)\d+(?:[.,]\d+)*(?!\S)
\( Match (
\d+(?:[.,]\d+)* Match 1+ digits and optionally repeat matching . or , and again 1+ digits
\) Match )
| Or
(?<!\S) Negative lookbehind, assert a whitespace boundary to the left
\d+(?:[.,]\d+)* Match 1+ digits and optionally repeat matching . or , and again 1+ digits
(?!\S) Negative lookahead, assert a whitespace boundary to the right
Regex demo
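To see how the two alternatives behave on the question's data, here is a minimal sketch. It assumes each candidate is tested as its own string and that the runtime supports lookbehind (ES2018 or later). The names numberRegex, samples and accepted are only illustrative.

```javascript
// Assumes an engine with lookbehind support (ES2018+).
const numberRegex = /\(\d+(?:[.,]\d+)*\)|(?<!\S)\d+(?:[.,]\d+)*(?!\S)/;

// An illustrative selection of the sample values from the question.
const samples = ["(1.000,00)", "(10)", "(2.000,00", "(2", "3.000,00)", "3)", "4.000,00", "4"];

// Keep only the strings in which the pattern finds a match.
const accepted = samples.filter((s) => numberRegex.test(s));

console.log(accepted); // [ '(1.000,00)', '(10)', '4.000,00', '4' ]
```

The half-open values are rejected because the first alternative needs both parentheses, while the second one refuses any non-whitespace character directly before or after the number.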
Another option could be matching optional parentheses at both sides, and only keeping the strings that have either both an opening and a closing parenthesis, or neither.
const regex = /\(?\d+(?:[.,]?\d+)*\)?/;
const strings = ["(1.000.000,00)", "(1.000,00)", "(100,00)", "(10)", "(1)", "(2.000.000,00", "(2.000,00", "(200,00", "(20", "(2", "3.000.000,00)", "3.000,00)", "300,00)", "30)", "3)", "4.000.000,00", "4.000,00", "400,00", "40", "4"];
strings.forEach(s => {
  const m = s.match(regex);
  const firstChar = s.charAt(0);
  const lastChar = s.charAt(s.length - 1);
  // Require a match AND either no parentheses at all or a complete pair.
  // The inner grouping matters because && binds tighter than ||.
  if (
    m &&
    (
      (firstChar !== '(' && lastChar !== ')') ||
      (firstChar === '(' && lastChar === ')')
    )
  ) {
    console.log(s);
  }
});

If you don't want to repeat the part matching numbers (which in this case is short, so maybe an exception to the DRY rule is warranted), you can reach for \(?((\d+[\.,]?)+\d*)\)?(?<=\(\1\)|(^|[^(\d.,])\1(?=($|[^\d.,)]))).
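A different way to avoid writing the number part twice is to build the pattern from a shared fragment. The sketch below assumes each string holds exactly one candidate. It is not the lookbehind regex above but a plain anchored alternation, and the names NUM, balanced and isBalancedNumber are illustrative.

```javascript
// Shared fragment for the number itself: digits, optionally followed by
// more digit groups separated by "." or ",".
const NUM = String.raw`\d+(?:[.,]\d+)*`;

// Either "(number)" or "number", and nothing else in the string.
const balanced = new RegExp(String.raw`^(?:\(${NUM}\)|${NUM})$`);

function isBalancedNumber(s) {
  return balanced.test(s);
}

console.log(isBalancedNumber("(1.000.000,00)")); // true
console.log(isBalancedNumber("(2.000,00"));      // false
console.log(isBalancedNumber("300,00)"));        // false
console.log(isBalancedNumber("40"));             // true
```

Because the fragment lives in one place, tightening the number format later (for example, requiring groups of three digits) would only need one change.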

Edit: this is broken.
It will match numbers like (7 as it only matches the number and ignores the parentheses in that case.
Kept here for future reference.
It's usually easier to do regex in multiple passes, but here goes:
/(\((\d+[\.,]?)+\d*\))|(\d+[\.,]?\d*)/gm
You can test it on https://regex101.com/.
As you can see, the combined regex becomes even more unreadable, which is why processing in multiple passes is usually the better choice.
I took your regex and just split it into two regexes: one that requires the parentheses and one that doesn't, then combined them with the or operator.
Note that this regex will allow things like "123.3,5.7" as one number, and the capturing groups will be nasty.

Related

Regex contains only numbers with optional ||/&& and number

examples where the regex should return true: 1&&2, 1||2, 1&&2||3, 1
examples where the regex should return false: 1||, 1&&, &&2
My regex is:
[0-9]+([\\|\\|\\&&][0-9])*
but it returns true if the input is 1&&&2.
Where is my mistake?

Note that [\|\|\&&] matches a single | or & char, not || or && sequences of chars. Also, the [0-9] without a quantifier matches only one digit. Without anchors, you may match a string partially inside a longer string.
You may use
^[0-9]+(?:(?:\|\||&&)[0-9]+)*$
Actually, to match anywhere inside a string, keep on using the pattern without anchors:
[0-9]+(?:(?:\|\||&&)[0-9]+)*
See the regex demo
Details
^ - start of string
[0-9]+ - 1+ digits
(?:(?:\|\||&&)[0-9]+)* - 0 or more repetitions of
(?:\|\||&&) - || or && sequence of characters
[0-9]+ - 1+ digits
$ - end of string.
JS demo:
const reg = /^[0-9]+(?:(?:\|\||&&)[0-9]+)*$/;
console.log( reg.test('1||2') ); // => true
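To check the anchored pattern against every example from the question, including the problematic 1&&&2, a small sketch like the following can help. The names pattern, cases and results are illustrative.

```javascript
const pattern = /^[0-9]+(?:(?:\|\||&&)[0-9]+)*$/;

// Inputs taken from the question: the first four should pass, the rest should fail.
const cases = ["1&&2", "1||2", "1&&2||3", "1", "1||", "1&&", "&&2", "1&&&2"];

// Map each input to the result of testing it against the pattern.
const results = Object.fromEntries(cases.map((c) => [c, pattern.test(c)]));

console.log(results);
```

The input 1&&&2 fails because after the && sequence the pattern requires a digit, and the third & is neither a digit nor the start of a complete operator.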

How to match string with hyphens in JavaScript?

I want to match certain parts of a URL that has the following form:
http://example.com/somepath/this-is-composed-of-hypens-3144/someotherpath
More precisely, I want to match only the part that consists of hyphen-separated words and ends in numbers. So, I want to extract the part this-is-composed-of-hypens-3144 in the above URL. I have something like this:
const re = /[a-z]*-[a-z]*-[0-9]*/gis;
const match = re.exec(url);
return (match && match.length) ? match[0] : null;
However, this works only if there are 2 hyphens, whereas the number of hyphens in my case can be arbitrary. How can I make my regex work for an arbitrary number of hyphens?

You may use
/\/([a-z]+(?:-[a-z]+)*-[0-9]+)(?:\/|$)/i
See the regex demo
Details
\/ - a / char
([a-z]+(?:-[a-z]+)*-[0-9]+) - Capturing group 1:
[a-z]+ - 1+ ASCII letters
(?:-[a-z]+)* - 0+ occurrences of - followed by 1+ ASCII letters
-[0-9]+ - a - followed by 1+ digits
(?:\/|$) - either / or end of string.
If there can be any word chars, not just ASCII letters, you may replace each [a-z] with \w.
var s = "http://example.com/somepath/this-is-composed-of-hypens-3144/someotherpath";
var m = s.match(/\/([a-z]+(?:-[a-z]+)*-[0-9]+)(?:\/|$)/i);
if (m) {
console.log(m[1]);
}
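The same pattern can be wrapped in a small function to try it on URLs with different numbers of hyphens. This is only a sketch: extractSlug is a hypothetical helper name, and the second and third URLs are made-up test inputs.

```javascript
// Hypothetical helper: returns the hyphenated path segment that ends in digits,
// or null when no segment has that shape.
function extractSlug(url) {
  const m = url.match(/\/([a-z]+(?:-[a-z]+)*-[0-9]+)(?:\/|$)/i);
  return m ? m[1] : null;
}

console.log(extractSlug("http://example.com/somepath/this-is-composed-of-hypens-3144/someotherpath"));
// this-is-composed-of-hypens-3144
console.log(extractSlug("http://example.com/somepath/foo-42")); // foo-42
console.log(extractSlug("http://example.com/somepath/plain/other")); // null
```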

Regular expression capture with optional trailing underscore and number

I'm trying to find a regular expression that will match the base string without the optional trailing number (_123). e.g.:
lorem_ipsum_test1_123 -> capture lorem_ipsum_test1
lorem_ipsum_test2 -> capture lorem_ipsum_test2
I tried using the following expression, but it would only work when there is a trailing _number.
/(.+)(?>_[0-9]+)/
/(.+)(?>_[0-9]+)?/
Similarly, adding the ? (zero or one) quantifier only worked when there is no trailing _number; otherwise, the trailing _number would just be part of the first capture.
Any suggestions?

You may use the following expression:
^(?:[^_]+_)+(?!\d+$)[^_]+
^ Anchor beginning of string.
(?:[^_]+_)+ Repeated non capturing group. Negated character set for anything other than a _, followed by a _.
(?!\d+$) Negative lookahead for digits at the end of the string.
[^_]+ Negated character set for anything other than a _.
Regex demo here.
Please note that the \n in the character sets in the Regex demo are only for demonstration purposes, and should by all means be removed when using as a pattern in Javascript.
Javascript demo:
var myString = "lorem_ipsum_test1_123";
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);
var myString = "lorem_ipsum_test2"
var myRegexp = /^(?:[^_]+_)+(?!\d+$)[^_]+/g;
var match = myRegexp.exec(myString);
console.log(match[0]);

You might match any character and use a negative lookahead that asserts that what follows is not an underscore, one or more digits and the end of the string:
^(?:(?!_\d+$).)*
Explanation
^ Assert start of the string
(?: Non capturing group
(?! Negative lookahead, assert that what is directly to the right is not
_\d+$ An underscore, one or more digits and the end of the string
) Close the negative lookahead
. Match any character
)* Close the non capturing group and repeat zero or more times
Regex demo
const strings = [
"lorem_ipsum_test1_123",
"lorem_ipsum_test2"
];
let pattern = /^(?:(?!_\d+$).)*/;
strings.forEach((s) => {
console.log(s + " ==> " + s.match(pattern)[0]);
});

You are asking for
/^(.*?)(?:_\d+)?$/
See the regex demo. The point here is that the first dot pattern must be non-greedy and the _\d+ should be wrapped with an optional non-capturing group and the whole pattern (especially the end) must be enclosed with anchors.
Details
^ - start of string
(.*?) - Capturing group 1: any zero or more chars other than line break chars, as few as possible due to the non-greedy ("lazy") quantifier *?
(?:_\d+)? - an optional non-capturing group matching 1 or 0 occurrences of _ and then 1+ digits
$ - end of string.
However, it seems easier to use a mere replacing approach,
s = s.replace(/_\d+$/, '')
If the string ends with _ and 1+ digits, the substring will get removed, else, the string will not change.
See this regex demo.
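Both suggestions in this answer can be compared side by side with a short sketch. The helper name stripTrailingNumber and the variable names are illustrative, and the fallback to the original string for inputs the pattern cannot match (such as strings containing line breaks) is an assumption of this example.

```javascript
// Capture approach: group 1 holds everything before an optional trailing _digits.
function stripTrailingNumber(s) {
  const m = s.match(/^(.*?)(?:_\d+)?$/);
  return m ? m[1] : s; // fall back to the input if the pattern does not match
}

const inputs = ["lorem_ipsum_test1_123", "lorem_ipsum_test2"];

const viaCapture = inputs.map(stripTrailingNumber);
// Replace approach: remove a trailing _digits suffix when present.
const viaReplace = inputs.map((s) => s.replace(/_\d+$/, ""));

console.log(viaCapture); // [ 'lorem_ipsum_test1', 'lorem_ipsum_test2' ]
console.log(viaReplace); // [ 'lorem_ipsum_test1', 'lorem_ipsum_test2' ]
```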

Try to check if the string contains the trailing number. If it does you get only the other part. Otherwise you get the whole string.
var str = "lorem_ipsum_test1_123"
if(/_[0-9]+$/.test(str)) {
console.log(str.match(/(.+)(?=_[0-9]+)/g))
} else {
console.log(str)
}
Or, a lot more concise:
str = str.replace(/_[0-9]+$/g, "")

javascript regex to check if first and last character are similar?

Is there any simple way to check if first and last character of a string are the same or not, only with regex?
I know you can check with charAt
var firstChar = str.charAt(0);
var lastChar = str.charAt(str.length - 1);
console.log(firstChar === lastChar);
I'm not asking for this : Regular Expression to match first and last character

You can use a regex with a capturing group and a backreference to it to assert that the starting and ending characters are the same, by capturing the first character. To test for a match, use the RegExp#test method.
var regex = /^(.).*\1$/;
console.log(
regex.test('abcdsa')
)
console.log(
regex.test('abcdsaasaw')
)
Regex explanation here :
^ asserts position at start of the string
1st Capturing Group (.)
.* matches any character (except newline) - between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\1 matches the same text as most recently matched by the 1st capturing group
$ asserts position at the end of the string
The . doesn't match newline characters; to include newlines, update the regex:
var regex = /^([\s\S])[\s\S]*\1$/;
console.log(
regex.test(`abcd
sa`)
)
console.log(
regex.test(`ab
c
dsaasaw`)
)
Refer : How to use JavaScript regex over multiple lines?
Regex explanation here :
[.....] - matches a single character from the set
\s - matches any whitespace character (equal to [\r\n\t\f\v ])
\S - matches any non-whitespace character (equal to [^\r\n\t\f\v ])
Finally, [\s\S] matches any character.
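As an alternative to [\s\S], engines that support the s (dotAll) flag from ES2018 let the . match line breaks as well. The sketch below is a variation, not the answer's own pattern: it assumes such an engine, the name sameEnds is illustrative, and treating a one-character string as a match is a design choice of this example.

```javascript
// Assumes an engine that supports the "s" (dotAll) flag, ES2018+.
// The optional group lets a one-character string count as a match too.
const sameEnds = /^(.)(?:.*\1)?$/s;

console.log(sameEnds.test("abcd\nsa"));   // true
console.log(sameEnds.test("abcdsaasaw")); // false
console.log(sameEnds.test("a"));          // true
```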

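You can try this. Note that because the group is ([\w\W]+) rather than ([\w\W]), it can capture more than one character, so the pattern tests whether the string ends with the same substring it starts with; for example, it also accepts abab.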
const rg = /^([\w\W]+)[\w\W]*\1$/;
console.log(
rg.test(`abcda`)
)
console.log(
rg.test(`aebcdae`)
)
console.log(
rg.test(`aebcdac`)
)

var rg = /^([ab])[ab]*\1$|^[ab]$/;
console.log(rg.test('aabbaa'))
console.log(rg.test('a'))
console.log(rg.test('b'))
console.log(rg.test('bab'))
console.log(rg.test('baba'))
This makes sure that the string contains only the characters a and b and that it starts and ends with the same one.
It will also match single characters because they too start and end with same character.

Regex needed to split a string by "."

I am in need for a regex in Javascript. I have a string:
'*window.some1.some\.2.(a.b + ")" ? cc\.c : d.n [a.b, cc\.c]).some\.3.(this.o.p ? ".mike." [ff\.]).some5'
I want to split this string by periods such that I get an array:
[
'*window',
'some1',
'some\.2', //ignore the . because it's escaped
'(a.b ? cc\.c : d.n [a.b, cc\.c])', //ignore everything inside ()
'some\.3',
'(this.o.p ? ".mike." [ff\.])',
'some5'
]
What regex will do this?

var string = '*window.some1.some\\.2.(a.b + ")" ? cc\\.c : d.n [a.b, cc\\.c]).some\\.3.(this.o.p ? ".mike." [ff\\.]).some5';
var pattern = /(?:\((?:(['"])\)\1|[^)]+?)+\)+|\\\.|[^.]+?)+/g;
var result = string.match(pattern);
result = Array.apply(null, result); //Convert RegExp match to an Array
Fiddle: http://jsfiddle.net/66Zfh/3/
Explanation of the RegExp. Match a consecutive set of characters, satisfying:
/ Start of RegExp literal
(?: Create a group without reference (example: say, group A)
\( `(` character
(?: Create a group without reference (example: say, group B)
(['"]) ONE `'` OR `"`, group 1, referable through `\1` (inside RE)
\) `)` character
\1 The character as matched at group 1, either `'` or `"`
| OR
[^)]+? Any non-`)` character, at least once (see below)
)+ End of group (B). Let this group occur at least once
\)+ One or more `)` characters
| OR
\\\. `\.` (escaped backslash and dot, because they're special chars)
| OR
[^.]+? Any non-`.` character, at least once (see below)
)+ End of group (A). Let this group occur at least once
/g "End of RegExp, global flag"
/*Summary: Match everything which is not satisfying the split-by-dot
condition as specified by the OP*/
There's a difference between + and +?. A single plus attempts to match as many characters as possible, while +? matches only as many characters as are necessary for the RegExp to match. Example: on 123, \d+? matches 1 and \d+ matches 123.
The String.match method performs a global match because of the /g (global) flag. With the g flag, match returns an array consisting of all matched substrings.
When the g flag is omitted, only the first match will be selected. The array will then consist of the following elements:
Index 0: <Whole match>
Index 1: <Group 1>
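The two points above (greedy versus lazy quantifiers, and match with or without the g flag) can be checked with a short sketch. The sample string a12b34 and the variable names are made up for illustration.

```javascript
// Lazy vs. greedy: +? stops as early as possible, + takes as much as it can.
const lazy = "123".match(/\d+?/)[0];
const greedy = "123".match(/\d+/)[0];
console.log(lazy, greedy); // 1 123

// With the g flag, match returns every whole match and no group details.
const withG = "a12b34".match(/(\d)\d*/g);
console.log(withG); // [ '12', '34' ]

// Without the g flag, match returns the first match plus its capture groups.
const withoutG = "a12b34".match(/(\d)\d*/);
console.log(withoutG[0], withoutG[1]); // 12 1
```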

The regex below :
result = subject.match(/(?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))/g);
Can be used to acquire the desired results. Group 1 has the results since you want to omit the .
Use this :
var myregexp = /(?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))/g;
var match = myregexp.exec(subject);
while (match != null) {
for (var i = 0; i < match.length; i++) {
// matched text: match[i]
}
match = myregexp.exec(subject);
}
Explanation :
// (?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))
//
// Match the regular expression below «(?:(\(.*?[^'"]\)|.*?[^\\])(?:\.|$))»
// Match the regular expression below and capture its match into backreference number 1 «(\(.*?[^'"]\)|.*?[^\\])»
// Match either the regular expression below (attempting the next alternative only if this one fails) «\(.*?[^'"]\)»
// Match the character “(” literally «\(»
// Match any single character that is not a line break character «.*?»
// Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match a single character NOT present in the list “'"” «[^'"]»
// Match the character “)” literally «\)»
// Or match regular expression number 2 below (the entire group fails if this one fails to match) «.*?[^\\]»
// Match any single character that is not a line break character «.*?»
// Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
// Match any character that is NOT a “\” character «[^\\]»
// Match the regular expression below «(?:\.|$)»
// Match either the regular expression below (attempting the next alternative only if this one fails) «\.»
// Match the character “.” literally «\.»
// Or match regular expression number 2 below (the entire group fails if this one fails to match) «$»
// Assert position at the end of the string (or before the line break at the end of the string, if any) «$»

It is notoriously difficult to use a regex to do balanced parentheses matching, especially in JavaScript.
You would be way better off creating your own parser. Here's a clever way to do this that still utilizes the strengths of regexes:
Create a Regex that matches and captures any "pattern of interest" - /(?:(\\.)|([\(\[\{])|([\)\]\}])|(\.))/g
Use string.replace(pattern, function (...)), and in the function, keep a count of opening braces and closing braces.
Add the matching text to a buffer.
If the split character is found and the opening and closing braces are balanced, add the buffer to your results array.
This solution will take a bit of work, and requires knowledge of closures, and you should probably see the documentation of string.replace, but I think it is a great way to solve your problem!
Update:
After noticing the number of questions related to this one, I decided to take on the above challenge.
Here is the live code to use a Regex to split a string.
This code has the following features:
Uses a Regex pattern to find the splits
Only splits if there are balanced parenthesis
Only splits if there are balanced quotes
Allows escaping of parenthesis, quotes, and splits using \
This code will work perfectly for your example.
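The approach described above can be sketched roughly as follows. This is not the answerer's linked code: it is a minimal illustration that walks the tokens with a RegExp exec loop instead of a string.replace callback, adds quotes to the token pattern, and uses the hypothetical name splitOutsideGroups. It assumes brackets and quotes are reasonably well formed.

```javascript
// Hypothetical helper: split on "." only when outside brackets and quotes,
// treating a backslash as an escape for the next character.
function splitOutsideGroups(input) {
  // Tokens of interest: escaped char, quote, opening bracket, closing bracket, dot.
  const token = /(\\.)|(["'])|([(\[{])|([)\]}])|(\.)/g;
  const parts = [];
  let buffer = "";
  let depth = 0;    // number of currently open brackets
  let quote = null; // the quote character we are inside, if any
  let last = 0;     // index just past the previous token
  let m;
  while ((m = token.exec(input)) !== null) {
    buffer += input.slice(last, m.index); // plain text between tokens
    last = token.lastIndex;
    const [text, escaped, q, open, close, dot] = m;
    if (escaped) {
      buffer += text; // escaped characters are always literal
    } else if (q) {
      if (quote === null) quote = q;
      else if (quote === q) quote = null;
      buffer += text;
    } else if (quote !== null) {
      buffer += text; // brackets and dots inside quotes are literal
    } else if (open) {
      depth++;
      buffer += text;
    } else if (close) {
      depth = Math.max(0, depth - 1);
      buffer += text;
    } else if (dot && depth === 0) {
      parts.push(buffer); // balanced split point: flush the buffer
      buffer = "";
    } else {
      buffer += text; // a dot inside brackets
    }
  }
  buffer += input.slice(last);
  parts.push(buffer);
  return parts;
}

const sample = '*window.some1.some\\.2.(a.b + ")" ? cc\\.c : d.n [a.b, cc\\.c]).some\\.3.(this.o.p ? ".mike." [ff\\.]).some5';
console.log(splitOutsideGroups(sample));
```

On the question's string this yields seven parts, with the escaped dots and the parenthesized expressions kept intact. Unlike the asker's expected output, the + ")" inside the first group is preserved rather than dropped.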

You don't need a complicated regex for this:
var s = '*window.some1.some\.2.(a.b + ")" ? cc\.c : d.n [a.b, cc\.c]).some\.3.(this.o.p ? ".mike." [ff\.]).some5';
console.log(s.match(/(?:\([^\)]+\)|.*?\.)/g));
output:
["*window.", "some1.", "some.", "2.", "(a.b + ")", "" ? cc.", "c : d.", "n [a.", "b, cc.", "c]).", "some.", "3.", "(this.o.p ? ".mike." [ff.])", "."]

So, I was working with this, and now I see that @FailedDev is rather not a failure, since that was pretty nice. :)
Anyhow, here's my solution. I'll just post the regex only.
((\(.*?((?<!")\)(?!")))|((\\\.)|([^.]))+)
Sadly this won't work in your case however, since I'm using negative lookbehind, which I don't think is supported by javascript regex engine. It should work as intended in other engines however, as can be confirmed here: http://gskinner.com/RegExr/. Replace with $1\n.
