how to reduce complexity in regex?


I have a regex that finds all kinds of money amounts denoted in dollars, like $290, USD240, $234.45, 234.5$, 234.6usd:
(\$)[0-9]+\.?([0-9]*)|usd+[0-9]+\.?([0-9]*)|[0-9]+\.?[0-9]*usd|[0-9]+\.?[0-9]*(\$)
This seems to work, but how can I reduce the complexity of my regex?

It is possible to make the regex a bit shorter by collapsing the currency indicators:
You can say USD OR $ amount instead of USD amount OR $ amount. This results in the following regex:
((\$|usd)[0-9]+\.?([0-9]*))|([0-9]+\.?[0-9]*(\$|usd))
I'm not sure if you'll find this less complex, but at least it's easier to read because it's shorter.
The character set [0-9] can also be replaced by \d -- the character class which matches any digit -- making the regex even shorter.
Doing this, the regex will look as follows:
((\$|usd)\d+\.?\d*)|(\d+\.?\d*(\$|usd))
Update:
According to @Toto, this regex would perform better with non-capturing groups (I also removed the unnecessary capture group, as pointed out by @Simon MᶜKenzie):
(?:\$|usd)\d+\.?\d*|\d+\.?\d*(?:\$|usd)
As @Gangnus pointed out, amounts like $.0 are not matched by this regex. I updated the regex to fix this:
((\$|usd)((\d+\.?\d*)|(\.\d+)))|(((\d+\.?\d*)|(\.\d+))(\$|usd))
Note that I changed \d+\.?\d* into ((\d+\.?\d*)|(\.\d+)): It now either matches one or more digits, optionally followed by a dot, followed by zero or more digits; OR a dot followed by one or more digits.
Without unnecessary capturing groups and using non-capturing groups:
(?:\$|usd)(?:\d+\.?\d*|\.\d+)|(?:\d+\.?\d*|\.\d+)(?:\$|usd)
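As a rough check of the final pattern above, here is a minimal sketch that runs it against amounts like the ones in the question. The g and i flags and the names dollarAmount and findAmounts are assumptions of this example, not part of the answer; the i flag is added because the question mixes USD and usd.

```javascript
// Final pattern from the answer; the g and i flags are assumptions added here.
const dollarAmount = /(?:\$|usd)(?:\d+\.?\d*|\.\d+)|(?:\d+\.?\d*|\.\d+)(?:\$|usd)/gi;

// Hypothetical helper: returns every dollar amount found in a piece of text.
function findAmounts(text) {
  return text.match(dollarAmount) || [];
}

const found = findAmounts("Paid $290, USD240 and $.5, then 234.5$ and 234.6usd; 42 is no amount.");
console.log(found); // [ '$290', 'USD240', '$.5', '234.5$', '234.6usd' ]
```

A bare number such as 42 is not reported, because every alternative requires a currency indicator on one side.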

Try this
^(?:\$|usd)?(?:\d+\.?\d*)(?:\$|usd)?$

By reducing the complexity you are also reducing the correctness. The following regex works correctly, although it does not handle both letter cases (that can be managed with a flag). None of the other current answers here has a correct subpattern for the decimal number.
^\s*(?:(?:(?:-?(?:usd|\$)|(?:usd|\$)-)(?:(?:0|[1-9]\d*)?(?:\.\d+)?(?<=\d)))|(?:-?(?:(?:0|[1-9]\d*)?(?:\.\d+)?(?<=\d))(?:usd|\$)))\s*$
Look here at the test results.
First make the regex correct, and only after that try to shorten it.

Related

javascript regex : possible to have a range in the quantifier? [duplicate]

I've written a regular expression that matches any number of letters with any number of single spaces between the letters. I would like that regular expression to also enforce a minimum and maximum number of characters, but I'm not sure how to do that (or if it's possible).
My regular expression is:
[A-Za-z](\s?[A-Za-z])+
I realized it was only matching two sets of letters surrounding a single space, so I modified it slightly to fix that. The original question is still the same though.
Is there a way to enforce a minimum of three characters and a maximum of 30?

Yes
Just as + means one or more, you can use {3,30} to match between 3 and 30 repetitions.
For example, [a-z]{3,30} matches between 3 and 30 lowercase letters.
From the documentation of the Pattern class:
X{n,m} X, at least n but not more than m times
In your case, matching 3-30 letters, each followed by a whitespace character, could be accomplished with:
([a-zA-Z]\s){3,30}
That pattern requires trailing whitespace. If you don't want that, you can use the following (2-29 times letter plus whitespace, then a letter):
([a-zA-Z]\s){2,29}[a-zA-Z]
If you'd like whitespaces to count as characters you need to divide that number by 2 to get
([a-zA-Z]\s){1,14}[a-zA-Z]
You can add \s? to that last one if the trailing whitespace is optional. These were all tested on RegexPlanet.
If you'd like the entire string to be between 3 and 30 characters, you can use a lookahead: add (?=^.{3,30}$) at the beginning of the RegExp and remove the other size limitations.
All that said, in all honesty I'd probably just test the String's .length property. It's more readable.
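The .length suggestion could look like the following sketch. The names lettersWithSingleSpaces and isValidName, and the choice to count spaces toward the 3-30 limit, are assumptions of this example.

```javascript
// Letters separated by at most one whitespace character; no length limit in the regex itself.
const lettersWithSingleSpaces = /^[A-Za-z](?:\s?[A-Za-z])*$/;

// Hypothetical helper: the length check is done in plain code (spaces count toward the total).
function isValidName(value) {
  return value.length >= 3 && value.length <= 30 && lettersWithSingleSpaces.test(value);
}

console.log(isValidName("abc"));    // true
console.log(isValidName("a b"));    // true
console.log(isValidName("ab"));     // false: too short
console.log(isValidName("ab  cd")); // false: two consecutive spaces
```

Keeping the length limit out of the pattern leaves the regex responsible for the character rules only.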

This is what you are looking for
^[a-zA-Z](\s?[a-zA-Z]){2,29}$
^ is the start of string
$ is the end of string
(\s?[a-zA-Z]){2,29} matches (\s?[a-zA-Z]) 2 to 29 times.

Actually Benjamin's answer will lead to the complete solution to the OP's question.
Using lookaheads it is possible to restrict the total number of characters AND restrict the match to a set combination of letters and (optional) single spaces.
The regex that solves the entire problem would become
(?=^.{3,30}$)^([A-Za-z][\s]?)+$
This will match AAA and A A, and it will fail to match AA  A (with two spaces), since there are two consecutive spaces.
I tested this at http://regexpal.com/ and it does the trick.
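A quick sketch of how this combined pattern behaves in JavaScript; the variable name and the sample strings are illustrative assumptions. Note that, as written, the pattern also accepts a single trailing whitespace character.

```javascript
// The lookahead limits the total length to 3-30 characters;
// the rest allows letters, each optionally followed by one whitespace character.
const limited = /(?=^.{3,30}$)^([A-Za-z][\s]?)+$/;

console.log(limited.test("AAA"));   // true
console.log(limited.test("A A"));   // true
console.log(limited.test("AA  A")); // false: two consecutive spaces
console.log(limited.test("AA"));    // false: shorter than three characters
```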

You should use
[a-zA-Z ]{20}
[For allowed characters]{for limiting of the number of characters}

JavaScript regular expression match amount

I'm trying to write a regular expression to match amounts. In my case, the amount should either be a positive integer or, if a decimal point is used, the point must be followed by one or two digits. So basically, the following are valid amounts:
34000
345.5
876.45
What I wrote was this: /[0-9]+(\.[0-9]{1,2}){0,1}/
My thinking was that by using parentheses like so: (\.[0-9]{1,2}), I would be able to bundle the whole "decimal point plus one or two digits" part. But it isn't happening. Among other problems, this regex is allowing stuff like 245. and 345.567 to slip through. :(
Help, please!

Your regular expression is good, but you need to match the beginning and end of the string. Otherwise, your regex can match only a portion of the string and still (correctly) return a match. To match the beginning of the string, use ^; for the end, use $.
Update: as Avinash has noted, you can replace {0,1} with ?. JS supports \d for digits, so the regex can be simplified further.
Finally, since you are only testing against a regex, you can use a non-capturing group, (?:...) instead of (...), which avoids storing the captured text and may be slightly faster.
original:
/[0-9]+(\.[0-9]{1,2}){0,1}/.test('345.567')
Fixed, and faster ;)
/^\d+(?:\.\d{1,2})?$/.test('345.567')

Javascript regex: is there anyway to write a regex which gives true if backreference is NOT matched

so here is my problem: I'm checking an input of 2 years with a hyphen. Like:
2001-2015
To test this, I use the simple regex
/^([0-9]{4})-([0-9]{4})$/
I know groups aren't needed, and (19|20)[0-9]{2}, is a closer match to the basic year exp, but bear with me.
Now, if my requirement was to match the two years only if they are the same, I could have used a backreference like:
/^([0-9]{4})-\1$/
which matches 2000-2000 but not 2000-2014
My actual requirement is exactly the opposite. I want it to match if the years are different but not if they're the same. That is, 2000-2014 should match; 2000-2000 should not.
Negating the boolean result is not an option for me. I need this for a huge regex which is supposed to match a whole lot of different date formats. This is just a part of it.
Is there any way to achieve this?

You can use a negative lookahead to achieve this:
^([0-9]{4})-(?!\1)[0-9]{4}$
Demo
This is almost the same pattern, except it inserts a condition check using the backreference.
(?!\1) will fail if \1 matches at its position.
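A small sketch of the pattern in use; the variable name differentYears and the sample inputs are illustrative assumptions.

```javascript
// Matches two four-digit years separated by a hyphen, but only if they differ.
const differentYears = /^([0-9]{4})-(?!\1)[0-9]{4}$/;

console.log(differentYears.test("2000-2014")); // true
console.log(differentYears.test("2000-2000")); // false: (?!\1) rejects the repeated year
console.log(differentYears.test("2000-201"));  // false: second year is not four digits
```

Because the lookahead consumes no characters, the [0-9]{4} that follows it still has to match the second year on its own.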

You can use negative lookahead:
\b(\d{4})-(?!\1)\d{4}\b
RegEx Demo

Use Negative Lookahead.
Like this :
^([0-9]{4})-(?!\1)[0-9]{4}$
It does work on your example.
Explanation : (?!\1) Assert that it is impossible to match the regex \1. Then you just put your 4 digits requirement.

Allow only a single point in decimal numbers

How can I modify this regular expression to allow numbers with just one point?
/[^0-9\.]/g
It currently allows:
0
0.13
0.13.1 (this should not be allowable)

Your regex doesn't match what you say it matches. You have used a negated character class, and without any quantifier. Currently it matches any single character that is neither a digit nor a dot.
For your requirement, you can use this regex:
/^\d+(\.\d+)?$/

Make the match a positive one:
/^\d*(\.\d+)?$/
Any number of digits, optionally followed by a point and at least one digit. But it’s not worth it to keep a negative match.
If you want to disallow an empty string (which the original regular expression wouldn’t do), you could do this:
/^(?=.)\d*(\.\d+)?$/
But you could also just check for an empty string, which looks better anyways.
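The last suggestion, checking for the empty string separately, could be sketched as follows. The names decimal and isDecimal are assumptions of this example.

```javascript
// Any number of digits, optionally followed by one point and at least one digit.
const decimal = /^\d*(\.\d+)?$/;

// Hypothetical helper: rejects the empty string in code instead of using a lookahead.
function isDecimal(value) {
  return value !== "" && decimal.test(value);
}

console.log(isDecimal("0"));      // true
console.log(isDecimal("0.13"));   // true
console.log(isDecimal(".5"));     // true
console.log(isDecimal("0.13.1")); // false: second point
console.log(isDecimal(""));       // false
```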

I guess this should do /^(\d*)\.{0,1}(\d){0,1}$/ OR /^(\d*)\.?(\d){0,1}$/
(\d*) Represents number of digits before decimal.
\. followed by {0,1} OR ? makes sure that there is at most one dot.
(\d){0,1} Allows only one digit after decimal.

You can try the following regex (note the escaped dot; an unescaped . would match any character): ^[-+]?\d*\.?\d*$

Try,
(value.match(/^\d+([.]\d{0,1})?$/))

Try the following:
/^(\d*)(\.\d*)?$/g

RegEx string for three letters and two numbers with pre- and post- spaces

Two quick questions:
What would be a RegEx string for three letters and two numbers with space before and after them (i.e. " LET 12 ")?
Would you happen to know any good RegEx resources/tools?

For a good resource, try this website and the program RegexBuddy. You may even be able to figure out the answer to your question yourself using these sites.
To start you off you want something like this:
/^[a-zA-Z]{3}\s+[0-9]{2}$/
But the exact details depend on your requirements. It's probably a better idea that you learn how to use regular expressions yourself and then write the regular expression instead of just copying the answers here. The small details make a big difference. Examples:
What is a "letter"? Just A-Z or also foreign letters? What about lower case?
What is a "number"? Just 0-9 or also foreign numerals? Only integers? Only positive integers? Can there be leading zeros?
Should there be a single space between the letters and numbers? Or any amount of any whitespace? Even none?
Do you want to search for this string in a larger text? Or match a line exactly?
etc.
The answers to these questions will change the regular expression. It would be much faster for you in the long run to learn how to create the regular expression than to completely specify your requirements and wait for other people to reply.
I forgot to mention that there will be a space before and after. How do I include that?
Again you need to consider the questions:
Do you mean just one space or any amount of spaces? Possibly not always a space but only sometimes?
Do you mean literally a space character or any whitespace characters?
My guess is:
/^\s+[a-zA-Z]{3}\s+[0-9]{2}\s+$/
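A minimal sketch of that guess in use, assuming one or more whitespace characters are required before, between, and after the two groups; the variable name is illustrative.

```javascript
// Whitespace, three letters, whitespace, two digits, whitespace.
const spacedCode = /^\s+[a-zA-Z]{3}\s+[0-9]{2}\s+$/;

console.log(spacedCode.test(" LET 12 ")); // true
console.log(spacedCode.test("LET 12"));   // false: no surrounding whitespace
console.log(spacedCode.test(" LE7 12 ")); // false: digit among the letters
```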

/[a-z]{3} [0-9]{2}/i will match 3 letters followed by a space, and then 2 digits. [a-z] is a character class containing the letters a through z, and the {3} means that you want exactly 3 members of that class. The space character matches a literal space (alternatively, you could use \s, which is a "shorthand" character class that matches any whitespace character). The i at the end is a pattern modifier specifying that your pattern is case-insensitive.
If you want the entire string to only be that, you need to anchor it with ^ and $:
/^[a-z]{3} [0-9]{2}$/i
Regular expression resources:
http://www.regular-expressions.info - great tutorial with a lot of information
http://rexv.org/ - online regular expression tester that supports a variety of engines.

^([A-Za-z]{3}) ([0-9]{2})$ assuming one space between the letters/numbers, as in your example. This will capture the letters and numbers separately.
I use http://gskinner.com/RegExr/ - it allows you to build a regex and test it with your own text.

As you can probably tell from the wide variety of answers, RegEx is a complex subject with a wide variety of opinions and preferences, and often more than one way of doing things. Here's my preferred solution.
^[a-zA-Z]{3}\s*\d{2}$
I used [a-zA-Z] instead of \w because \w also matches digits and underscores.
The \s* is to allow zero or more whitespace characters.
I try to use shorthand character classes wherever possible, which is why I went with \d.
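The difference between \w and [a-zA-Z] can be seen in a short sketch; the variable names and the sample string a_1 are illustrative assumptions.

```javascript
// \w matches letters, digits, and the underscore, so it accepts more than letters.
const threeWordChars = /^\w{3}$/;
const threeLetters = /^[a-zA-Z]{3}$/;

console.log(threeWordChars.test("a_1")); // true
console.log(threeLetters.test("a_1"));   // false
console.log(threeLetters.test("LET"));   // true
```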

\w{3}\s{1}\d{2}
And I like this site.
EDIT:[a-zA-Z]{3}\s{1}\d{2} - The \w supports numeric characters too.

Try this regular expression:
[^"\r\n]{3,}
