Regular Expressions JS - javascript

Can someone tell me why the code does not work without \D here? I need to use lookaheads in the pwRegex to match passwords that are greater than 5 characters long, do not begin with numbers, and have two consecutive digits.
let sampleWord = "abc123";
var pwRegex = /^\D(?=\w{5})(?=\w*\d{2})/;
let result = pwRegex.test(sampleWord); //true
Thanks!

In regex, \d matches any digit character, and \D matches any character that is not a digit character. ^ means the start of the string, so ^\D means the starting character is not a digit.
... do not begin with numbers,...
The \D is for it to not begin with numbers.
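To see what \D contributes, it may help to compare the pattern with and without it. In the sketch below, `withoutNonDigit` and the extra sample strings are illustrative and not part of the question. Note that \D also consumes one character, so dropping it loosens the length check from "more than 5" to "at least 5" as well.

```javascript
// Pattern from the question: one non-digit, then two lookaheads.
const pwRegex = /^\D(?=\w{5})(?=\w*\d{2})/;
// Hypothetical variant with \D removed, for comparison only.
const withoutNonDigit = /^(?=\w{5})(?=\w*\d{2})/;

const samples = ["abc123", "1abc23", "abc12", "abcdef"];
for (const s of samples) {
  console.log(s, pwRegex.test(s), withoutNonDigit.test(s));
}
// abc123 true true
// 1abc23 false true   (starts with a digit)
// abc12 false true    (only 5 characters long)
// abcdef false false  (no two consecutive digits)
```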

Related

Regex remove all leading and trailing special characters?

Let's say I have the following string in javascript:
&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&
I want to remove all the leading and trailing special characters (anything which is not alphanumeric or alphabet in another language) from all the words.
So the string should look like
a.b.c a.b.c a.b.c a.b.c a.b&.c a.b.&&dc ê.b..c
Notice how the special characters in between the alphanumeric is left behind. The last ê is also left behind.
This regex should do what you want. It looks for
start of line, or some spaces (^| +) captured in group 1
some number of symbol characters [!-\/:-@\[-`\{-~]*
a minimal number of non-space characters ([^ ]*?) captured in group 2
some number of symbol characters [!-\/:-@\[-`\{-~]*
followed by a space or end-of-line (using a positive lookahead) (?=\s|$)
Matches are replaced with just groups 1 and 2 (the spacing and the characters between the symbols).
let str = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&';
str = str.replace(/(^| +)[!-\/:-@\[-`\{-~]*([^ ]*?)[!-\/:-@\[-`\{-~]*(?=\s|$)/gi, '$1$2');
console.log(str);
Note that if you want to preserve a string of punctuation characters on their own (e.g. as in Apple & Sauce), you should change the second capture group to insist on there being one or more non-space characters (([^ ]+?)) instead of none and add a lookahead after the initial match of punctuation characters to assert that the next character is not punctuation:
let str = 'Apple &&& Sauce; -This + !That!';
str = str.replace(/(^| +)[!-\/:-@\[-`\{-~]*(?![!-\/:-@\[-`\{-~])([^ ]+?)[!-\/:-@\[-`\{-~]*(?=\s|$)/gi, '$1$2');
console.log(str);
a-zA-Z\u00C0-\u017F is used to capture all valid characters, including diacritics.
The following is a single regular expression to capture each individual word. The logic is that it will look for the first valid character as the beginning of the capture group, and then the last sequence of invalid characters before a space character or string terminator as the end of the capture group.
const myRegEx = /[^a-zA-Z\u00C0-\u017F]*([a-zA-Z\u00C0-\u017F].*?[a-zA-Z\u00C0-\u017F]*)[^a-zA-Z\u00C0-\u017F]*?(\s|$)/g;
let myString = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&'.replace(myRegEx, '$1$2');
console.log(myString);
Something like this might help:
const string = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&';
const result = string.split(' ').map(s => /^[^a-zA-Z0-9ê]*([\w\W]*?)[^a-zA-Z0-9ê]*$/g.exec(s)[1]).join(' ');
console.log(result);
Note that this is not one single regex, but uses JS help code.
Rough explanation: We first split the string into an array of substrings, divided by spaces. We then transform each substring by stripping the leading and trailing special characters. We do this by matching the special characters with [^a-zA-Z0-9ê]*; because of the leading ^ inside the brackets, the class matches every character except those listed, i.e. all special characters. Between these two classes we capture all relevant characters with ([\w\W]*?). \w matches word characters and \W matches non-word characters, so [\w\W] matches any character. By appending ? after the *, we make the quantifier lazy, so that the group stops as soon as the following class, which matches the trailing special characters, can match through to the end of the substring. We also start the regex with ^ and end it with $ so that it has to match the entire substring (they anchor the match to the start and the end of the string, respectively). With .exec(s)[1] we then execute the regex on the substring and return the first capturing group from our transform function. Note that this is an empty string if a substring contains no relevant characters. At the end we join the substrings with spaces.
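The answers above hard-code Latin ranges (or ê itself). Where Unicode property escapes are available (the u flag, ES2018), a variant along the following lines may generalize to letters in other scripts. This is only a sketch and not one of the original answers; `stripEdges` is a hypothetical helper name.

```javascript
// Hypothetical helper: strip leading/trailing characters that are
// neither letters (\p{L}) nor numbers (\p{N}) from each space-separated word.
function stripEdges(text) {
  const edges = /^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu;
  return text
    .split(' ')
    .map(word => word.replace(edges, ''))
    .join(' ');
}

const input = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&';
console.log(stripEdges(input));
// a.b.c a.b.c a.b.c a.b.c a.b&.c a.b.&&dc ê.b..c
```

Characters between the first and last letter or digit of a word are left untouched, which matches the behavior asked for in the question.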

How does the following code mean two consecutive numbers?

This is from an exercise on FCC beta, and I cannot understand how the following code means two consecutive numbers, seeing as \D* means zero or more non-digits and \d means a digit. How does this add up to two numbers in a regexp?
let checkPass = /(?=\w{5,})(?=\D*\d)/;
This does not match two numbers. In fact it never consumes any characters: the pattern consists only of lookaheads, so a successful match is always an empty string.
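The zero-width nature of a lookahead-only pattern can be checked directly. A small sketch, using "abcd2" as an arbitrary sample input:

```javascript
const checkPass = /(?=\w{5,})(?=\D*\d)/;
const m = "abcd2".match(checkPass);

console.log(m !== null);   // true: both lookaheads succeed at index 0
console.log(m[0].length);  // 0: nothing was consumed
console.log(m.index);      // 0
```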
If you want to match two digits, you can do something like this:
(\d)(\d)
Or if you really want to use a positive lookahead with the (?=\D*\d) section, you will have to do something like this:
\d(?=\D*\d)
This will match any digit that is followed by zero or more non-digits and then another digit. A few examples (matched digits marked with ^):
2 hhebuehi3
^
245673
^^^^^
2v jugn45
^      ^
To also capture the second digit, you will have to put parentheses (capture groups) around both digits, i.e.:
(\d)(?=\D*(\d))
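The patterns above can be tried on the same sample strings. In this sketch the variable names `followedByDigit` and `pair` are illustrative only:

```javascript
// Every digit that has another digit somewhere after it,
// with only non-digits in between.
const followedByDigit = /\d(?=\D*\d)/g;

console.log("2 hhebuehi3".match(followedByDigit)); // [ '2' ]
console.log("245673".match(followedByDigit));      // [ '2', '4', '5', '6', '7' ]
console.log("2v jugn45".match(followedByDigit));   // [ '2', '4' ]

// With capture groups, the digit seen by the lookahead is available too.
const pair = /(\d)(?=\D*(\d))/.exec("2v jugn45");
console.log(pair[1], pair[2]); // 2 4
```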
In order to do what your original example wants, i.e.:
number
5+ \w characters
a non-number character
a number
... you will need to precede your original example with a \d character. This means that your lookaheads will actually match something which isn't just an empty string:
\d(?=\w{5,})(?=\D*\d)
IMPORTANT EDIT
After playing around a bit more with a JavaScript online console, I have worked out the problem with your original regex.
Your original regex matches a string with 5 or more word characters that includes at least 1 digit. It can match a string with two digits, but also one with 1 digit, 3 digits, 12 digits, etc. In order to require two consecutive digits in a string of 5 or more characters, you should specify the number of digits you want in the second lookahead:
let regex = /(?=\w{5,})(?=\D*\d{2})/;
let string1 = "abcd2";
let regex1 = /(?=\w{5,})(?=\D*\d)/;
console.log("string 1 & regex 1: " + regex1.test(string1)); // true
let regex2 = /(?=\w{5,})(?=\D*\d{2})/;
console.log("string 1 & regex 2: " + regex2.test(string1)); // false
let string2 = "abcd23";
console.log("string 2 & regex 2: " + regex2.test(string2)); // true
My original answer was about regex in a vacuum, and I glossed over the fact that you were using regex in conjunction with JavaScript, which works a little differently when testing a regex against a string. I still don't know why your original regex was supposed to match two numbers, but I hope this is a bit more helpful.
(?=...) positive lookahead
\w matches any word character (equal to [a-zA-Z0-9_])
{5,} matches between 5 and unlimited times
\D matches any character that's not a digit (equal to [^0-9])
* matches between zero and unlimited times
\d matches a digit (equal to [0-9])
With the g (global) flag, the expression would try to find all matches; the expression in the question does not set it.
You can always check your expression using regex101

split javascript string using a regexp to separate numbers from other characters

So for example:
"10.cm" ...becomes... [10,".cm"] or ["10",".cm"]; either will do, as I can work with a string once it's split up.
I tried
"10.cm".split(/[0-9]/|/[abc]/)
but it seems that I don't have such a great understanding of regexps.
Thanks
You may tokenize the string into digits and non-digits with /\d+|\D+/g regex:
var s = "10.cm";
console.log(s.match(/\d+|\D+/g));
Details:
\d+ - matches 1 or more digits
| - or
\D+ - matches 1 or more characters other than digits.
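Since the question tried split, it may be worth noting that split keeps the separators in the result when the regex contains a capturing group. A minimal sketch (the helper name `splitNumber` and the second sample string are made up for illustration):

```javascript
function splitNumber(s) {
  // The capturing group makes split() include the digit runs in the
  // result; filter(Boolean) drops empty strings at the edges.
  return s.split(/(\d+)/).filter(Boolean);
}

console.log(splitNumber("10.cm"));     // [ '10', '.cm' ]
console.log(splitNumber("width10px")); // [ 'width', '10', 'px' ]
```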
/\W/ matches any non-word character. This includes spaces and punctuation, but not underscores. In this solution, /\W/ can be used with the split and join methods to separate numbers from other characters. Note that the separator itself (here the ".") is dropped.
let s = "10.cm";
console.log(s.split(/\W/).join(" "));
Output: 10 cm

How to match digit in middle of a string efficiently in javascript?

I have strings like
XXX-1234
XXXX-1234
XX - 4321
ABCDE - 4321
AB -5677
So there will be letters at the beginning. then there will be hyphen. and then 4 digits. Number of letters may vary but number of digits are same = 4
Now I need to match the first 2 positions from the digits. So I tried a long process.
temp_digit=mystring;
temp_digit=temp_digit.replace(/ /g,'');
temp_digit=temp_digit.split("-");
if(temp_digit[1].substring(0,2)=='12') {}
Now is there any process using regex / pattern matching so that I can do it in an efficient way? Something like string.match(regexp). I'm dumb in regex patterns. How can I find the first two digits from 4 digits from above strings? Also it would be great if the solution can match digits without hyphens like XXX 1234. But this is optional.
Try a regular expression that finds at least one letter [a-zA-Z]+, followed by optional whitespace \s*, followed by a hyphen -, followed by more optional whitespace \s*. It then captures the first two digits (\d{2}) that follow:
[a-zA-Z]+\s*-\s*(\d{2})
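Applied to the sample strings, this could look as follows. The hyphen is made optional here with -? so that "XXX 1234" also matches; that change is an assumption based on the optional requirement in the question, and the names `firstTwo`, `inputs` and `prefixes` are illustrative.

```javascript
// Hyphen made optional with -? (assumption, see the note above).
const firstTwo = /[a-zA-Z]+\s*-?\s*(\d{2})/;

const inputs = ["XXX-1234", "XX - 4321", "AB -5677", "XXX 1234"];
const prefixes = inputs.map(s => {
  const m = s.match(firstTwo);
  return m ? m[1] : null; // null when the string has no such pattern
});

console.log(prefixes); // [ '12', '43', '56', '12' ]
```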
may vary but number of digits are same = 4
Now I need to match the first 2 positions from the digits.
Also it would be great if the solution can match digits without hyphens like XXX 1234 But this is optional.
Do you really need to check it starts with letters? How about matching ANY 4 digit number, and capturing only the first 2 digits?
Regex
/\b(\d{2})\d{2}\b/
Matches:
\b a word boundary
(\d{2}) 2 digits, captured in group 1, and assigned to match[1].
\d{2} 2 more digits (not captured).
\b a word boundary
Code
var regex = /\b(\d{2})\d{2}\b/;
var str = 'ABCDE 4321';
var result = str.match(regex)[1];
document.body.innerText += result;
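One caveat: str.match(regex)[1] throws a TypeError when nothing matches, because match returns null in that case. A defensive variant might look like this (`firstTwoDigits` is a hypothetical name, and the extra inputs are illustrative):

```javascript
function firstTwoDigits(str) {
  // Same pattern as above: a standalone 4-digit number,
  // with the first two digits captured in group 1.
  const m = /\b(\d{2})\d{2}\b/.exec(str);
  return m === null ? null : m[1];
}

console.log(firstTwoDigits("ABCDE 4321")); // 43
console.log(firstTwoDigits("XXX-1234"));   // 12
console.log(firstTwoDigits("no digits"));  // null
```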
If there are always 4 digits at the end, you can simply slice it:
str.trim().slice(-4,-2);
here's a jsfiddle with the example strings:
https://jsfiddle.net/mckinleymedia/6suffmmm/

Regex with last match only

I am trying to write a regex to get the last digit.
My string: name[0][0].
My regex: str.match(/\d+/g)
It returns all matches. Can you help me make the regex return only the last match?
To get the last digit,
\d(?=\D*$)
To get the last number,
\d+(?=\D*$)
\d+ matches one or more digits: + repeats the previous token one or more times. (?=\D*$) is a positive lookahead assertion, which asserts that the match is followed by any number of non-digit characters and then the end of the string.
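A short sketch of both patterns in use. The sample string "name[12][34]" is chosen here for illustration, so that the last digit and the last number differ; the final lines show an alternative without a lookahead.

```javascript
const str = "name[12][34]";

console.log(str.match(/\d(?=\D*$)/)[0]);  // 4
console.log(str.match(/\d+(?=\D*$)/)[0]); // 34

// Alternative without a lookahead: collect all numbers, take the last one.
const all = str.match(/\d+/g) || [];
console.log(all[all.length - 1]); // 34
```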
