What is the best way to write a regular expression in javascript? - javascript

Good day!
I don't know regular expressions very well, but I tried to compose one. I need a regular expression that matches input like the following:
The user enters a value in a text field, anywhere from 00x00 to 12x99. It must contain only the "x" sign as a separator, and the first pair of digits (the one before "x") must not exceed 12.
I tried a record like this:
/^(00|01|02|03|04|05|06|07|08|09|10|11|12)x([0-9]{2,2})$/
and it works for me, but the expression is too long; I'm sure there's something shorter. Any help would be appreciated!

You can shorten the expression quite a bit.
^(0\d|1[0-2])x\d{2}$
First, you can remove the parentheses around the entire expression; they are not required if you want a full match.
Then you can replace every [0-9] block with the \d token.
Then the quantifier can be simplified: if you want a strict quantity, {2,2} becomes {2}.
The first part is a bit more tricky. You can actually separate the match in 2 parts. You need to match every number from 00 to 09, and every number from 10 to 12.
So this is exactly what we are going to do.
First the match from 00 to 09, the first digit doesn't change, so that's easy. The second digit is a full range from 0 to 9, so we use \d as previously mentioned. That gives us 0\d.
The second half has the same fixed first digit, 1. Again that's easy. Then it's actually a shortened range from 0 to 2. That gives us 1[0-2].
It could be one or the other, so we wrap both parts in a group and join them with the | (or) token.
And that's it, we combine everything and get the expression above!
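As a quick sanity check, the shortened pattern can be tried against a few inputs. This is only a minimal sketch; the helper name isValidPair is made up for the illustration and is not part of the answer.

```javascript
// Shortened pattern from the answer: 00-12, a literal "x", then two digits.
const pairPattern = /^(0\d|1[0-2])x\d{2}$/;

// Hypothetical helper name, used only for this illustration.
function isValidPair(value) {
  return pairPattern.test(value);
}

console.log(isValidPair("00x00")); // true
console.log(isValidPair("12x99")); // true
console.log(isValidPair("13x00")); // false (first pair exceeds 12)
console.log(isValidPair("5x10"));  // false (first pair needs two digits)
```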

Related

Get integer number using regex in javascript

I am trying to write a regex to get only integer numbers e.g, 23, 234, 45, etc, and not select numbers with decimal points.
For Context :
I need it in a larger regex that I am writing to convert mixed fraction latex input
For example:
5\frac{7}{8}
But it should not select latex such as:
3.5\frac{7}{8}
The regex That I have so far is:
(^(.*)(?!(\.))(.*))\\frac{([^{}]+(?:{(?:[^{}]+)}|))}{([^{}]+(?:{(?:[^{}]+)}|))}
But it is for integer and decimal numbers alike. Need to change the regex for group1.
Maybe this will do it for you:
(?<!\d\.)\b(\d+)\\frac{([^{}]+(?:{(?:[^{}]+)}|))}{([^{}]+(?:{(?:[^{}]+)}|))}
It captures an integer before \frac, unless it's preceded by a digit and a full stop.
(?<!\d\.) This ensures the number isn't preceded by a digit followed by a full stop, i.e. the integer part of a floating-point number.
\b Must be at the start of a number (to make sure we don't match the tail end of a multi-digit number).
(\d+) Captures the integer number.
\\frac Matches the string "\frac".
The rest is the same as your original expression.
See it here at regex101.
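A small sketch of how the lookbehind version behaves, assuming a JavaScript engine that supports lookbehind assertions. The braces are escaped here for readability, the nested-brace part is written with ? instead of an empty alternative, and the variable names are illustrative only.

```javascript
// Lookbehind variant of the pattern; requires an engine with (?<!...) support.
const mixedFraction =
  /(?<!\d\.)\b(\d+)\\frac\{([^{}]+(?:\{[^{}]+\})?)\}\{([^{}]+(?:\{[^{}]+\})?)\}/;

// Integer followed by \frac: groups are whole part, numerator, denominator.
const accepted = "5\\frac{7}{8}".match(mixedFraction);
console.log(accepted && accepted.slice(1)); // [ '5', '7', '8' ]

// Decimal followed by \frac: the "5" is preceded by "3.", so nothing matches.
const rejected = "3.5\\frac{7}{8}".match(mixedFraction);
console.log(rejected); // null
```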
Edit
Since there obviously are people out there, however unbelievable, that still haven't moved to a real browser ;) - the answer has to change to:
It depends on the syntax of latex, whether you can do it or not.
(And since I don't know that, I shouldn't have said anything in the first place ;)
The problem is that, without lookbehinds, you can't do it without matching characters outside the expression you're interested in. In your regexr-example you clearly show that you want to be able to match expressions not only at the beginning of strings, but also in the middle of them. Thus we need to be able to tell that the part before our match isn't the integer part of a decimal number. Now, doing this with matching isn't a problem. E.g.
(?:^|[^\d.])(\d+)\\frac{...
like Wiktor suggested in comments, will match if the expression is on the start of the line (^), or is preceded by something that isn't a decimal point or a digit ([^\d.]). That should do it. (Here at regex101.)
Well, as pointed out earlier, it depends on the syntax of latex. If two expressions can be directly adjacent, without any operators or stuff between them, you can't (as far as I can tell) do it with JS regex. Consider 1\frac{0}{1}2\frac{2}{3} (which I have no idea if it's syntactically correct). The first expression 1\frac{0}{1} is a piece of cake. But after that has been matched, we'd need to match a character before the second expression to verify that it doesn't start with a decimal number, and since the first expression already ate those characters, we can't. The test (?:^|[^\d.]) has to consume one character in front of the 2 in 2\frac{2}{3}, but the only candidate (the closing brace of the first expression) already belongs to the previous match, and the 2 itself is a digit, so the test fails and the second expression is never matched.
If, however, an expression always starts the string tested, or is preceded by an operator, or such, then it should be possible using
(?:^|[^\d.])(\d+)\\frac{([^{}]+(?:{(?:[^{}]+)})?)}{([^{}]+(?:{(?:[^{}]+)})?)}
Here at regex101.
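The adjacency limitation described above can be seen with a short sketch. The braces are escaped for readability, the g flag is added so that matchAll can be used, and the helper name integerParts is hypothetical.

```javascript
// Variant without lookbehind: consumes one character in front of the number.
const noLookbehind =
  /(?:^|[^\d.])(\d+)\\frac\{([^{}]+(?:\{[^{}]+\})?)\}\{([^{}]+(?:\{[^{}]+\})?)\}/g;

// Hypothetical helper: collects the integer part of every match.
function integerParts(latex) {
  return Array.from(latex.matchAll(noLookbehind), (m) => m[1]);
}

// Separated by an operator: both expressions are found.
console.log(integerParts("1\\frac{0}{1}+2\\frac{2}{3}")); // [ '1', '2' ]

// Directly adjacent: the second expression is missed.
console.log(integerParts("1\\frac{0}{1}2\\frac{2}{3}")); // [ '1' ]

// Decimal whole part: nothing is matched.
console.log(integerParts("3.5\\frac{7}{8}")); // []
```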
(Sorry for my rambling)

Use RegExp to restrict input data

I'm trying to force the user to enter a password with an exact number of characters.
The password has to look like this:
8 characters in total:
1 (exactly) uppercase letter,
3 (exactly) digits,
and lowercase letters (4 in that case).
The regex I have: (?=.*[a-z])(?=.*[A-Z]{1})(?=.*[0-9]{3})
The problem is that it still accepts more than 3 digits, more than 1 uppercase letter, and more than 8 characters.
Examples :
5p23qPsb -> OK
I9Opdi90 -> NOT OK
h7y1Rdw6 -> OK
IUD8954r -> NOT OK
An idea or some help ?
Thanks (I'm french so sorry for my english..)
You are definitely on the right track with this. You are stacking lookaheads, but lookaheads don't actually consume anything. They only check that a match could be completed from the current position. So you'll need to add something like a .+ for the expression to actually match on. Here is a breakdown of how you can validate your string.
Let's start with the entire string. We want it to be exactly 8 characters, so we can do something like this:
^(?=.{8}$)
Then, we want an uppercase letter. This lookahead checks that at least one is present (on its own it does not limit the count to exactly one):
(?=.*[A-Z])
Next, we want to require digits. This lookahead checks that at least one digit is present (it does not enforce exactly three):
(?=.*[0-9])
Finally, we want to require 4 lowercase letters that may or may not be next to each other. Here's a lookahead that checks for at least four:
(?=.*[a-z].*[a-z].*[a-z].*[a-z])
Now that we've got all of the pieces, we can set our actual expression to match any character, one or more times (.+). The previous lookaheads will make sure it fits the requirements. Piecing everything together, we end up with an expression like this:
^(?=.{8}$)(?=.*[a-z].*[a-z].*[a-z].*[a-z])(?=.*[A-Z])(?=.*[0-9]).+
Here is a demo
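The lookaheads above only guarantee minimum counts. If the counts really have to be exact, one possible variant (a sketch that is not part of the answer, and that assumes only ASCII letters and digits are allowed) counts the uppercase letters and digits across the whole string instead:

```javascript
// Assumption: the password may contain only ASCII letters and digits.
// (?=[^A-Z]*[A-Z][^A-Z]*$)  exactly one uppercase letter in the whole string
// (?=(?:\D*\d){3}\D*$)      exactly three digits in the whole string
// [A-Za-z\d]{8}             exactly eight characters, so four are lowercase
const exactPassword = /^(?=[^A-Z]*[A-Z][^A-Z]*$)(?=(?:\D*\d){3}\D*$)[A-Za-z\d]{8}$/;

console.log(exactPassword.test("5p23qPsb")); // true
console.log(exactPassword.test("I9Opdi90")); // false (two uppercase letters)
console.log(exactPassword.test("h7y1Rdw6")); // true
console.log(exactPassword.test("IUD8954r")); // false (three uppercase, four digits)
```

Because the total length is fixed at eight and the other two counts are exact, the number of lowercase letters does not need its own lookahead.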
/(?=.*?[A-Z]{1})(?=.*?[a-z]{4})(?=.*?[0-9]{3})/.test("123Aacds");

Javascript RegEx - Enforcing two max-lengths

It's been a while that I am juggling around this. Hope you can give me
some pointers.
All I want to achieve is that the string contains EXACTLY 4 '-' and 10 digits, in any given order.
I created this regex : ^(-\d-){10}$
It does enforce max-length of 10 on digits but I am not getting a way to implement max-length of 4 for '-'
Thanks
Ok, here's a pattern:
^(?=(?:\d*?-){4}\d*$)(?=(?:-*?\d){10}-*$).{14}$
Demo
Explanation:
The main part is ^.{14}$ which simply checks there are 14 characters in the string.
Then, there are two lookaheads at the start:
(?=(?:\d*?-){4}\d*$)
(?=(?:-*?\d){10}-*$)
The first one checks the hyphens, and the second one checks the digits and makes sure the count is correct. Both match the entire input string and are very similar, so let's just take a look at the first one.
(?:\d*?-){4} matches any number of digits (or none) followed by a hyphen, four times. After this match, we know there are four hyphens. (I used an ungreedy quantifier (*?) just to prevent useless backtracking, as an optimization)
\d*$ just makes sure the rest of the string is only made of digits.
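To see the two lookaheads working together, here is a small check against a few sample strings. The sample inputs and the constant name are chosen only for illustration.

```javascript
// Pattern from the answer: exactly 4 hyphens and 10 digits, in any order.
const fourHyphensTenDigits = /^(?=(?:\d*?-){4}\d*$)(?=(?:-*?\d){10}-*$).{14}$/;

console.log(fourHyphensTenDigits.test("12-34-56-78-90"));  // true
console.log(fourHyphensTenDigits.test("----1234567890"));  // true
console.log(fourHyphensTenDigits.test("12-34-56-78-9"));   // false (only 9 digits)
console.log(fourHyphensTenDigits.test("1234567890-----")); // false (5 hyphens)
console.log(fourHyphensTenDigits.test("12-34-56-78-9a"));  // false (a letter)
```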

Javascript Regex for Javascript Regex and Digits

The title might seem a bit recursive, and indeed it is.
I am working on a Javascript which can highlight/color Javascript code displayed in HTML. Thus, in the Internet Browser, comments will be turned green, definitions (for, if, while, etc.) will be turned a dark blue and italic, numbers will be red, and so on for other elements. However, the coloring is not all that important.
I am trying to figure out two different regular expressions which have started to cause a minor headache.
1. Finding a regular expression using a regular expression
I want to find regular expressions within the script-tags of HTML using a Javascript, such as:
match(/findthis/i);
, where the regex part of course is "/findthis/i".
The rules are as follows:
1. Finding multiple occurrences (/g) is not important.
2. It must be on the same line (not /m).
3. Case-insensitive (/i).
4. If a backward slash (ignore character) is followed directly by a forward slash, "/", the forward slash is part of the expression - not an escape character. E.g.: /itdoesntstop\/untilnow:/
5. Two forward slashes right next to each other (//) is: (A) At the beginning: Not a regex; it's a comment. (B) Later on: First slash is the end of the regex and the second slash is nothing but a character.
6. Regex continues until the line breaks or end of input (\n|$), or the escape character (second forward slash which complies with rule 4) is encountered. However, also as long as only alphabetic characters are encountered, following the second forward slash, they are considered part of the regex. E.g.: /aregex/allthisispartoftheregex
So far what I've got is this:
'\\/(?:[^\\/\\\\]|\\/\\*)*\\/([a-zA-Z]*)?'
However, it isn't consistent. Any suggestions?
2. Find digits (alphanumeric, floating) using a regular expression
Finding digits on their own is simple. However, finding floating numbers (with multiple periods) and letters including underscore is more of a challenge.
All of the below are considered numbers (a new number starts after each space):
3 3.1 3.1.4 3a 3.A 3.a1 3_.1
The rules:
Finding multiple occurrences (/g) is not important.
It must be on the same line (not /m).
Case-insensitive (/i).
A number must begin with a digit. However, the number can be preceded or followed by a non-word (\W) character. E.g.: "=9.9;" where "9.9" is the actual number. "a9" is not a number. A period before the number, ".9", is not considered part of the number and thus the actual number is "9".
Allowed characters: [a-zA-Z0-9_.]
What I've got:
'(^|\\W)\\d([a-zA-Z0-9_.]*?)(?=([^a-zA-Z0-9_.]|$))'
It doesn't work quite the way I want it.
For the first part, I think you are quite close. Here is what I would use (as a regex literal, to avoid all the double escapes):
/\/(?:[^\/\\\n\r]|\\.)+\/([a-z]*)/i
I don't know what you intended with your second alternative after the character class. But here the second alternative is used to consume backslashes and anything that follows them. The last part is important, so that you can recognize the regex ending in something like this: /backslash\\/. And the ? at the end of your regex was redundant. Otherwise this should be fine.
Test it here.
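A brief sketch of how this pattern behaves on the examples from the question. The helper name findRegexLiteral is hypothetical, and the sketch does not try to tell a regex apart from a division operator.

```javascript
// Pattern from the answer: a slash, then either ordinary characters or an
// escaped character (backslash plus anything), a closing slash, then flags.
const regexLiteral = /\/(?:[^\/\\\n\r]|\\.)+\/([a-z]*)/i;

// Hypothetical helper: returns the literal and its flags, or null.
function findRegexLiteral(line) {
  const m = line.match(regexLiteral);
  return m ? { literal: m[0], flags: m[1] } : null;
}

console.log(findRegexLiteral("match(/findthis/i);").literal);        // /findthis/i
console.log(findRegexLiteral("/itdoesntstop\\/untilnow:/").literal); // /itdoesntstop\/untilnow:/
console.log(findRegexLiteral("/backslash\\\\/").literal);            // /backslash\\/
console.log(findRegexLiteral("// just a comment"));                  // null
```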
Your second regex is just fine for your specification. There are a few redundant elements though. The main thing you might want to do is capture everything but the possible first character:
/(?:^|\W)(\d[\w.]*)/i
Now the actual number (without the first character) will be in capturing group 1. Note that I removed the ungreediness and the lookahead, because greediness alone does exactly the same.
Test it here.
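The examples from the question can be run against this pattern as follows. The helper name firstNumber is an assumption made for this sketch.

```javascript
// Pattern from the answer: the number itself ends up in capturing group 1.
const numberPattern = /(?:^|\W)(\d[\w.]*)/i;

// Hypothetical helper: returns the first number found, or null.
function firstNumber(text) {
  const m = text.match(numberPattern);
  return m ? m[1] : null;
}

console.log(firstNumber("=9.9;"));        // 9.9
console.log(firstNumber("a9"));           // null
console.log(firstNumber(".9"));           // 9
console.log(firstNumber("x = 3_.1 + 2")); // 3_.1
```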

UK bank sort code javascript regular expression

I'm trying to create a regular expression in javascript for a UK bank sort code so that the user can input 6 digits, or 6 digits with a hyphen between pairs. For example "123456" or "12-34-56". Also not all of the digits can be 0.
So far I've got /(?!0{2}(-?0{2}){2})(\d{2}(-\d{2}){2})|(\d{6})/ and this jsFiddle to test.
This is my first regular expression so I'm not sure I'm doing it right. The test for 6 0-digits should fail and I thought the -? optional hyphen in the lookahead would cause it to treat it the same as 6 0-digits with hyphens, but it isn't.
I'd appreciate some help and any criticism if I'm doing it completely incorrectly!
Just to answer your question, you can validate user input with:
/^(?!(?:0{6}|00-00-00))(?:\d{6}|\d\d-\d\d-\d\d)$/.test(inputString)
It will strictly match only input in the form XX-XX-XX or XXXXXX where X are digits, and will exclude 00-00-00, 000000 along with any other cases (e.g. XX-XXXX or XXXX-XX).
However, in my opinion, as stated in other comments, it is still better to force the user to either always enter the hyphens or never enter them. Being extra strict when dealing with anything related to money saves (unknown) troubles later.
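For illustration, the single-regex check could be wrapped and exercised like this. The function name isSortCode is hypothetical.

```javascript
// Pattern from the answer: XXXXXX or XX-XX-XX, but not all zeros.
const sortCodePattern = /^(?!(?:0{6}|00-00-00))(?:\d{6}|\d\d-\d\d-\d\d)$/;

// Hypothetical wrapper around the test call.
function isSortCode(inputString) {
  return sortCodePattern.test(inputString);
}

console.log(isSortCode("123456"));   // true
console.log(isSortCode("12-34-56")); // true
console.log(isSortCode("000000"));   // false
console.log(isSortCode("00-00-00")); // false
console.log(isSortCode("12-3456"));  // false (mixed form is rejected)
```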
Since any of the digits can be zero, but not all at once, you should treat the one case where they are all zero as a single, special case.
You are checking for two digits (\d{2}), then an optional hyphen (-?), then another two digits (\d{2}) and another optional hyphen (-?), before another two digits (\d{2}).
Putting this together gives \d{2}-?\d{2}-?\d{2}, but you can simplify this further:
(\d{2}-?){2}\d{2}
You then use a check like the following to match the format but not 000000 or 00-00-00:
if (/(\d{2}-?){2}\d{2}/.test(string) && !/(00-?){2}00/.test(string)) {
  // then it's a valid code, you could also use (0{2}-?){2}0{2} to check zeros
}
You may wish to add the string anchors ^ (start) and $ (end) to check the entire string.
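With the anchors added, the two-step check might look like the sketch below. The function and constant names are made up for the example. Note that, because each hyphen is optional on its own, this variant also accepts mixed forms such as 12-3456.

```javascript
// Format check and all-zeros check, both anchored to the whole string.
const formatPattern = /^(\d{2}-?){2}\d{2}$/;
const allZerosPattern = /^(00-?){2}00$/;

// Hypothetical helper combining the two tests.
function looksLikeSortCode(value) {
  return formatPattern.test(value) && !allZerosPattern.test(value);
}

console.log(looksLikeSortCode("12-34-56")); // true
console.log(looksLikeSortCode("123456"));   // true
console.log(looksLikeSortCode("12-3456"));  // true (mixed form is accepted here)
console.log(looksLikeSortCode("00-00-00")); // false
console.log(looksLikeSortCode("1234567"));  // false
```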
