This is from an exercise on FCC beta, and I cannot understand how the following code matches two consecutive numbers. \D* means zero or more non-digit characters and \d means a digit, so how does this add up to two numbers in a regexp?
let checkPass = /(?=\w{5,})(?=\D*\d)/;
This does not match two numbers. It doesn't really match anything except an empty string, as both parts are lookaheads and there is nothing preceding them to consume characters.
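To see what "matches only an empty string" means in practice, here is a minimal sketch run against the pattern from the question. The sample inputs "abcd2" and "abcde" are assumptions chosen for illustration.

```javascript
// Both lookaheads are zero-width: they inspect what follows the current
// position but consume no characters themselves.
const checkPass = /(?=\w{5,})(?=\D*\d)/;

const m = checkPass.exec("abcd2");
// The match succeeds at index 0, but the matched text is empty.
console.log(m.index);     // 0
console.log(m[0].length); // 0

// With no digit at all, the second lookahead fails at every position.
const noDigit = checkPass.test("abcde");
console.log(noDigit);     // false
```

So test() can still return true even though nothing is actually captured.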
If you want to match two digits, you can do something like this:
(\d)(\d)
Or if you really want to use a positive lookahead with the (?=\D*\d) section, you will have to do something like this:
\d(?=\D*\d)
This will match any digit that is followed by zero or more non-digits and then another digit. A few examples (matched digits marked with ^):
2 hhebuehi3
^
245673
^^^^^
2v jugn45
^      ^
To also capture the second digit, you will have to put parentheses around both digits, i.e.:
(\d)(?=\D*(\d))
Here it is in action.
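A short sketch of the two patterns above, using the same three sample strings. The helper name matchedPositions is hypothetical and only there to print where each match starts.

```javascript
// Digits that are followed (after zero or more non-digits) by another digit.
const followedByDigit = /\d(?=\D*\d)/g;

function matchedPositions(str) {
  // Returns the index of every digit the pattern matches.
  return [...str.matchAll(followedByDigit)].map((m) => m.index);
}

console.log(matchedPositions("2 hhebuehi3")); // [ 0 ]
console.log(matchedPositions("245673"));      // [ 0, 1, 2, 3, 4 ]
console.log(matchedPositions("2v jugn45"));   // [ 0, 7 ]

// With a capture group inside the lookahead, the second digit becomes
// available too, even though it is not part of the matched text.
const pair = "2 hhebuehi3".match(/(\d)(?=\D*(\d))/);
console.log(pair[0], pair[1], pair[2]); // 2 2 3
```

Note that the last digit of each string is never matched, because no further digit follows it.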
In order to do what your original example wants, ie:
number
5+ \w characters
a non-number character
a number
... you will need to precede your original example with a \d character. This means that your lookaheads will actually match something which isn't just an empty string:
\d(?=\w{5,})(?=\D*\d)
IMPORTANT EDIT
After playing around a bit more with a JavaScript online console, I have worked out the problem with your original Regex.
This matches at any position that is followed by 5 or more word characters and that has at least 1 digit somewhere after it. The digit requirement is satisfied by a string with two digits, but also by one with 1 digit, 3 digits, 12 digits, etc. To require two consecutive digits, you should specify the number of digits you want in the second lookahead:
let regex = /(?=\w{5,})(?=\D*\d{2})/;
let string1 = "abcd2";
let regex1 = /(?=\w{5,})(?=\D*\d)/;
console.log("string 1 & regex 1: " + regex1.test(string1));
let regex2 = /(?=\w{5,})(?=\D*\d{2})/;
console.log("string 1 & regex 2: " + regex2.test(string1));
let string2 = "abcd23";
console.log("string 2 & regex 2: " + regex2.test(string2));
My original answer was about Regex in a vacuum and I glossed over the fact that you were using Regex in conjunction with JavaScript, where test() only checks whether a match exists anywhere in the string. I still don't know why your original example was supposed to match two numbers, but I hope this is a bit more helpful.
(?= Positive lookahead
\w{5,} matches any word character (equal to [a-zA-Z0-9_])
{5,} matches between 5 and unlimited times
\D* matches any character that's not a digit (equal to [^0-9])
* matches between zero and unlimited times
\d matches a digit (equal to [0-9])
This expression is global - so tries to match all
You can always check your expression using regex101
I'm attempting to string match 5-digit coupon codes spread throughout a HTML web page. For example, 53232, 21032, 40021 etc... I can handle the simpler case of any string of 5 digits with [0-9]{5}, though this also matches 6, 7, 8... n digit numbers. Can someone please suggest how I would modify this regular expression to match only 5 digit numbers?
>>> import re
>>> s="four digits 1234 five digits 56789 six digits 012345"
>>> re.findall(r"\D(\d{5})\D", s)
['56789']
If the numbers can occur at the very beginning or the very end, it's easier to pad the string than to mess with special cases:
>>> re.findall(r"\D(\d{5})\D", " "+s+" ")
Instead of padding the string to handle the start and end of the string, as in John La Rooy's answer, one can use a negative lookbehind and a negative lookahead to handle both cases with a single regular expression:
>>> import re
>>> s = "88888 999999 3333 aaa 12345 hfsjkq 98765"
>>> re.findall(r"(?<!\d)\d{5}(?!\d)", s)
['88888', '12345', '98765']
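For readers working in JavaScript rather than Python, the same lookaround pattern can be sketched as follows. This assumes an engine with lookbehind support (ES2018 or later); the variable names are illustrative.

```javascript
const sample = "88888 999999 3333 aaa 12345 hfsjkq 98765";

// (?<!\d) : not preceded by a digit; (?!\d) : not followed by a digit.
// Both are zero-width, so no padding of the input is needed.
const fiveDigit = /(?<!\d)\d{5}(?!\d)/g;

const codes = sample.match(fiveDigit);
console.log(codes); // [ '88888', '12345', '98765' ]
```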
full string: ^[0-9]{5}$
within a string: [^0-9][0-9]{5}[^0-9]
Note: There is a problem with using \D: it matches, and therefore consumes, a character that is not a digit. Use \b instead.
\b is important here because it matches at a word boundary, i.e. only at the beginning or end of a word, without consuming any character.
import re
input = "four digits 1234 five digits 56789 six digits 01234,56789,01234"
re.findall(r"\b\d{5}\b", input)
result : ['56789', '01234', '56789', '01234']
but if one uses
re.findall(r"\D(\d{5})\D", input)
output : ['56789', '01234']
\D is unable to handle numbers separated by a single comma (or any other single character): the separator consumed after one number is no longer available to match in front of the next one. It also misses a number at the very start or end of the string.
More documentation: https://docs.python.org/2/library/re.html
More Clarification on usage of \D vs \b:
This example uses \D but it doesn't capture all the five digits number.
This example uses \b while capturing all five digits number.
Cheers
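The difference between \D and \b described above can be reproduced with a small sketch. It is written in JavaScript rather than Python, using the same input string; the last line adds a caveat about \b that is an observation of mine, not part of the answer.

```javascript
const text = "four digits 1234 five digits 56789 six digits 01234,56789,01234";

// \D consumes the separator, so comma-separated neighbours and a number
// at the very end of the string are missed.
const withNonDigit = [...text.matchAll(/\D(\d{5})\D/g)].map((m) => m[1]);

// \b is zero-width, so the comma stays available for the next match.
const withBoundary = text.match(/\b\d{5}\b/g);

console.log(withNonDigit); // [ '56789', '01234' ]
console.log(withBoundary); // [ '56789', '01234', '56789', '01234' ]

// Caveat: letters and "_" are word characters, so \b sees no boundary
// between them and a digit.
const afterUnderscore = /\b\d{5}\b/.test("code_12345");
console.log(afterUnderscore); // false
```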
A very simple way would be to match all groups of digits, like with r'\d+', and then skip every match that isn't five characters long when you process the results.
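A minimal sketch of that filter-afterwards approach, shown in JavaScript; the function name fiveDigitRuns is hypothetical.

```javascript
function fiveDigitRuns(str) {
  // Grab every maximal run of digits, then keep only runs of length 5.
  const runs = str.match(/\d+/g) || []; // match() returns null when nothing matches
  return runs.filter((run) => run.length === 5);
}

console.log(fiveDigitRuns("four digits 1234 five digits 56789 six digits 012345"));
// [ '56789' ]
```

Because \d+ is greedy, each run is matched whole, so a 6-digit number never contributes a 5-digit piece.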
You probably want to match a non-digit before and after your string of 5 digits, like [^0-9]([0-9]{5})[^0-9]. Then you can capture the inner group (the actual string you want).
You could try
\D\d{5}\D
or maybe
\b\d{5}\b
I'm not sure how Python treats line endings and whitespace there, though.
I believe ^\d{5}$ would not work for you, as you likely want to get numbers that are somewhere within other text.
I use a simpler expression:
re.findall(r"\d{5}", mystring)
It searches for 5 consecutive digits. But you have to be sure the string contains no other run of 5 or more digits, since those would be matched too.
I have strings like
XXX-1234
XXXX-1234
XX - 4321
ABCDE - 4321
AB -5677
So there will be letters at the beginning. then there will be hyphen. and then 4 digits. Number of letters may vary but number of digits are same = 4
Now I need to match the first 2 positions from the digits. So I tried a long process.
temp_digit=mystring;
temp_digit=temp_digit.replace(/ /g,'');
temp_digit=temp_digit.split("-");
if(temp_digit[1].substring(0,2)=='12') {}
Now is there any way to do this more efficiently using regex / pattern matching? Something like string.match(regexp). I'm not good with regex patterns. How can I find the first two digits of the 4 digits in the above strings? Also it would be great it the solution can match digits without hyphens like XXX 1234 But this is optional.
Try a regular expression that finds at least one letter [a-zA-Z]+, followed by some space if necessary \s*, followed by a hyphen -, followed by some more space if necessary \s*. It then captures the first two digits (\d{2}) after that prefix:
[a-zA-Z]+\s*-\s*(\d{2})
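Applied to the sample strings from the question, the pattern could be used roughly like this. The helper firstTwoDigits and the hyphen-optional variant are assumptions added for illustration, not part of the answer.

```javascript
// Pattern from the answer: letters, optional spaces, a hyphen, optional
// spaces, then the first two digits in capture group 1.
const withHyphen = /[a-zA-Z]+\s*-\s*(\d{2})/;

// Assumption: making the hyphen optional covers inputs such as "XXX 1234".
const hyphenOptional = /[a-zA-Z]+\s*-?\s*(\d{2})/;

function firstTwoDigits(str, pattern = withHyphen) {
  const m = str.match(pattern);
  return m ? m[1] : null; // null when the string does not fit the pattern
}

console.log(firstTwoDigits("XXX-1234"));                 // 12
console.log(firstTwoDigits("XX - 4321"));                // 43
console.log(firstTwoDigits("AB -5677"));                 // 56
console.log(firstTwoDigits("XXX 1234"));                 // null
console.log(firstTwoDigits("XXX 1234", hyphenOptional)); // 12
```

The result can then be compared directly, e.g. firstTwoDigits(mystring) === "12".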
may vary but number of digits are same = 4
Now I need to match the first 2 positions from the digits.
Also it would be great it the solution can match digits without hyphens like XXX 1234 But this is optional.
Do you really need to check it starts with letters? How about matching ANY 4 digit number, and capturing only the first 2 digits?
Regex
/\b(\d{2})\d{2}\b/
Matches:
\b a word boundary
(\d{2}) 2 digits, captured in group 1, and assigned to match[1].
\d{2} 2 more digits (not captured).
\b a word boundary
Code
var regex = /\b(\d{2})\d{2}\b/;
var str = 'ABCDE 4321';
var result = str.match(regex)[1];
document.body.innerText += result;
If there are always 4 digits at the end, you can simply slice it:
str.trim().slice(-4,-2);
here's a jsfiddle with the example strings:
https://jsfiddle.net/mckinleymedia/6suffmmm/
I need a regular expression for:
-[n digits]x[n digits]
I already tried this:
var s = "path/path/name-799x1024.jpg";
s.replace(/\d/g, "");
But this gets only the digits.
Here is a small jsfiddle: http://jsfiddle.net/aq6dp49n/
The outcome I try to get is:
path/path/name.jpg
How do I add the - and the small x between the two digits?
The regular expression that would match that is /-\d+x\d+/. Hence:
s.replace(/-\d+x\d+/, "")
Should work.
Here's what the regex means: the first - tells it that it should look for a - character. Then you have \d+ which means "one or more of \d", where \d is short-hand for the character class [0-9], i.e., all digits. After that you have x, which means it will look for the character x, and finally you have \d+ again, which is the same as before.
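As a quick check of that explanation, here is a small sketch. The function names are hypothetical, and the second variant rests on the assumption that the size suffix always sits directly before the file extension.

```javascript
function stripSize(path) {
  // Removes the first "-<digits>x<digits>" occurrence in the string.
  return path.replace(/-\d+x\d+/, "");
}

function stripSizeBeforeExtension(path) {
  // Assumption: the size suffix sits directly before the file extension.
  // The lookahead requires ".ext" and end of string right after the suffix.
  return path.replace(/-\d+x\d+(?=\.[^.]+$)/, "");
}

console.log(stripSize("path/path/name-799x1024.jpg"));
// path/path/name.jpg
console.log(stripSize("path/path/name-10x12-799x1024.jpg"));
// path/path/name-799x1024.jpg
console.log(stripSizeBeforeExtension("path/path/name-10x12-799x1024.jpg"));
// path/path/name-10x12.jpg
```

Without the g flag, replace() only touches the first occurrence, which matters when the name itself contains a size-like part.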
To match
-[n digits]x[n digits]
You would want
match(/-[0-9]{n}x[0-9]{n}\b/)
Though if you want an arbitrary (one or more) number of digits, use + in place of {n}. In the case of your example, you'd want 3 and 4 for your values of n.
Here's a step-by-step explanation of what this does:
/-[0-9]{3}x[0-9]{4}\b/
- matches the character - literally
[0-9]{3} match a single character present in the list below
    Quantifier: {3} Exactly 3 times
    0-9 a single character in the range between 0 and 9
x matches the character x literally (case sensitive)
[0-9]{4} match a single character present in the list below
    Quantifier: {4} Exactly 4 times
    0-9 a single character in the range between 0 and 9
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
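Since {n} is only a placeholder, one way to plug in concrete digit counts is to build the pattern at runtime. This is a sketch under that assumption; sizePattern is a hypothetical helper name.

```javascript
function sizePattern(widthDigits, heightDigits) {
  // Builds /-[0-9]{w}x[0-9]{h}\b/ for the given digit counts.
  // The backslash of \b has to be doubled inside the template string.
  return new RegExp(`-[0-9]{${widthDigits}}x[0-9]{${heightDigits}}\\b`);
}

const file = "path/path/name-799x1024.jpg";

const found = file.match(sizePattern(3, 4));
console.log(found[0]); // -799x1024

// With the wrong digit count, \b prevents a partial match of "1024".
const wrongCount = file.match(sizePattern(3, 3));
console.log(wrongCount); // null
```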
To remove the last size-like part of a string, this should do:
"path/path/name-799x1024.jpg".replace(/(.*)-[0-9]+x[0-9]+/, "$1");
// "path/path/name.jpg"
"path/path/name-10x12-799x1024.jpg".replace(/(.*)-[0-9]+x[0-9]+/, "$1");
// "path/path/name-10x12.jpg"
This takes advantage of the fact that regexps are greedy, so the (.*) absorbs (and saves) as much preceding text as possible before finding the next match.
(I prefer to use [0-9] in place of \d because it's more explicit. In some regex engines \d also matches non-Latin numerals; in JavaScript \d is equivalent to [0-9], so in this case it doesn't matter.)
I need some help to improve a regex!
In JavaScript I have a regular expression which looks for pairs of numbers in a filename
var nums = str.match(/[\d]{1,}[\d]{1,}/gi);
This will match
DV_Banner_1200x627.jpg
DV_Banner_1200y627.jpg
DV_Banner_1200 x 627.jpg
DV_Banner_1200 x627.jpg
DV_Banner_1200 627.jpg
with (1200,627)
I have tried to improve the regex, just in case there are more than two pairs of numbers, to look for the following
number(1 digit or more) + whitespace(1 or more) + x (zero or once) + whitespace(1 or more) + number(1 digit or more)
Which should fail on the second example (using a 'y' instead of an 'x'), which I thought would be:
[\d]{1,}[\s]?[x]?[\s]?[\d]{1,}
but it grabs all the digits in
DV_Banner_1200 x 627 01.jpg
with (1200,627,01) whereas I only want the first two numbers. I've written the code to deal only with the first two, but I was wondering where I was going wrong. Only a level 17 regex wizard can save me now! Thanks
I used \d+\s?x?\s?\d+ as my regex (same thing just replacing + for {1,} and removing the unnecessary []). You can see the outcome of it here.
The reason it's matching the 01 is because of all the ? quantifiers. It matches the first \d+ (1 digit: 0), then 0 of \s, 0 of x, and 0 of \s, followed by the second \d+ (another 1 digit: 1).
The regex
(\d+)(?:\s?x\s?|\s)(\d+)
should do the trick. Test it here
(?:...) is a non-capture group. So it allows alternation while not assigning a back reference to it. This part matches the characters in between the two numbers (either has an x or a <space>).
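Run against the filenames from the question, the pattern behaves as sketched below. The helper parseDimensions is a hypothetical name used for illustration.

```javascript
const dimensions = /(\d+)(?:\s?x\s?|\s)(\d+)/;

function parseDimensions(filename) {
  // Without the g flag, match() returns only the first match and its groups.
  const m = filename.match(dimensions);
  return m ? [Number(m[1]), Number(m[2])] : null;
}

console.log(parseDimensions("DV_Banner_1200x627.jpg"));      // [ 1200, 627 ]
console.log(parseDimensions("DV_Banner_1200y627.jpg"));      // null
console.log(parseDimensions("DV_Banner_1200 x 627.jpg"));    // [ 1200, 627 ]
console.log(parseDimensions("DV_Banner_1200 x627.jpg"));     // [ 1200, 627 ]
console.log(parseDimensions("DV_Banner_1200 627.jpg"));      // [ 1200, 627 ]
console.log(parseDimensions("DV_Banner_1200 x 627 01.jpg")); // [ 1200, 627 ]
```

The 'y' variant fails because the separator between the two numbers is now mandatory: either an x with optional spaces, or a single whitespace character.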
Just try with following regex:
(\d+)(?:(?: ?x ?)| )(\d+)
demo
You say you want "one or more" whitespace characters on either side of the "x", but you have used the ? quantifier which means "zero or one". Thus, because you've also marked the "x" as optional, it will match any two-or-more digit number: Your first [\d]{1,} will match against 0 then your second one will match on 1.
Note that you do not need to enclose a single atom in a character class: [\d] can be more simply written as \d. Also {1,}, meaning "one or more", is more easily written as +.
As you want "one or more" whitespace character on either side of the "x", I would go with:
\d+(?:(?:\s+x\s+)|\s+)\d+
Note that (?: ... ) is a "non-capture group", so these bits won't form part of your match array. However, I don't think you want "one or more" whitespace character, as that won't match your first example. Instead, try this:
\d+(?:(?:\s*x\s*)|\s+)\d+
Where the * quantifier means "zero-or-more".
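The effect of the optional separator can be seen by running both patterns globally over the problem filename. This is a sketch; the variable names are illustrative.

```javascript
// Every part of the separator is optional, so any 2+ digit run matches.
const loose = /\d+\s?x?\s?\d+/g;

// The separator is mandatory: an x with optional spaces, or whitespace.
const strict = /\d+(?:(?:\s*x\s*)|\s+)\d+/g;

const filename = "DV_Banner_1200 x 627 01.jpg";

const looseMatches = filename.match(loose);
const strictMatches = filename.match(strict);

console.log(looseMatches);  // [ '1200 x 627', '01' ]
console.log(strictMatches); // [ '1200 x 627' ]
```

With the strict pattern, "01" on its own can no longer be split into two one-digit numbers.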
As the subject indicates, I am in need of a JavaScript Regular expression X characters long, that accepts alphanumeric characters, but not the underscore character, and also accepts periods, but not at beginning or end. Periods cannot be consecutive either.
I have been able to almost get to where I want to be searching and reading other people's questions and the answers here on Stack Overflow (such as here).
However, in my case, I need a string that has to be exactly X characters long (say 6), and can contain letters and numbers (case insensitive) and may also include periods.
Said periods cannot be consecutive and also, cannot start, or end the string.
Jd.1.4 is valid, but Jdf1.4f is not (7 characters).
/^(?:[a-z\d]+(?:\.(?!$))?)+$/i
is what I have been able to construct using examples from others, but I cannot get it to only accept strings that match the set length.
/^((?:[a-z\d]+(?:\.(?!$))?)+){6}$/i
works in that it now accepts nothing less than 6 characters, but it also happily accepts anything longer as well...
I am obviously missing something, but I do not know what it is.
Can anyone help?
This should work:
/^(?!.*?\.\.)[a-z\d][a-z\d.]{4}[a-z\d]$/i
Explanation:
^ // matches the beginning of the string
(?!.*?\.\.) // negative lookahead, only matches if there are no
// consecutive periods (.)
[a-z\d] // matches one letter (a-z) or digit
[a-z\d.]{4} // matches exactly 4 letters, digits or periods
[a-z\d] // matches one letter (a-z) or digit
$ // matches the end of the string
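Because the question asks for "X characters", the length can also be made a parameter. The following is a sketch built on the pattern above; makeValidator is a hypothetical helper, and it assumes a length of at least 2 so that the first and last character classes both fit.

```javascript
function makeValidator(length) {
  if (length < 2) throw new RangeError("length must be at least 2");
  const middle = length - 2; // characters between the first and the last one
  // Backslashes are doubled because the pattern is built from a string.
  return new RegExp(
    `^(?!.*?\\.\\.)[a-z\\d][a-z\\d.]{${middle}}[a-z\\d]$`,
    "i"
  );
}

const isValid6 = makeValidator(6);

console.log(isValid6.test("Jd.1.4"));  // true
console.log(isValid6.test("Jdf1.4f")); // false (7 characters)
console.log(isValid6.test("J..1.4"));  // false (consecutive periods)
console.log(isValid6.test(".d.1.4"));  // false (leading period)
console.log(isValid6.test("Jd_1.4"));  // false (underscore)
```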
Another way to do that:
/(?=.{6}$)^[a-z\d]+(?:\.[a-z\d]+)*$/i
explanation:
(?=.{6}$) this lookahead imposes the number of characters before the end of the string
^[a-z\d]+ 1 or more alphanumeric characters at the beginning of the string
(?:\.[a-z\d]+)* 0 or more groups containing a dot followed by 1 or more alphanumerics
$ end of the string
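Both patterns proposed in this thread can be compared side by side on a handful of inputs. The sample strings other than Jd.1.4 and Jdf1.4f are assumptions chosen to cover each rule.

```javascript
// Fixed-width character classes plus a "no two periods in a row" lookahead.
const byClass = /^(?!.*?\.\.)[a-z\d][a-z\d.]{4}[a-z\d]$/i;

// Length lookahead plus "alphanumerics separated by single periods".
const byLookahead = /(?=.{6}$)^[a-z\d]+(?:\.[a-z\d]+)*$/i;

const samples = [
  "Jd.1.4", "abcdef", "a.b.cd",  // expected valid
  "Jdf1.4f", "ab..ef", ".bcdef", // too long, double period, leading period
  "abcde.", "ab_def",            // trailing period, underscore
];

const results = samples.map((s) => [s, byClass.test(s), byLookahead.test(s)]);

for (const [s, a, b] of results) {
  console.log(s, a, b); // the two patterns agree on every sample here
}
```

On these samples the first three strings pass both patterns and the remaining five fail both.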