I am trying to parse the three components of a string (the last two are optional), which can appear in any of these combinations:
12345,abcd#ABCD -> $1=12345 $2=abcd $3=ABCD
12345,abcd -> $1=12345 $2=abcd $3=empty
12345#ABCD -> $1=12345 $2=empty $3=ABCD
12345 -> $1=12345 $2=empty $3=empty
Is it possible with a single regexp?
I have made several attempts. The complete string is no problem, but the forms with missing parameters escape me:
(.+),(.+)#(.+) // works when the string is complete
// but how do you express the optionality?
(.+),?(.+)#?(.+) // nope
(.*)[$,](.*)[$#](.*) // neither
(Another option would be to split the string into its components, which looks quite trivial, but I am curious about the regexp way.)
From your expected output it appears that you want empty groups in your matches while matching your inputs. You may use this regex:
/^(\d+),?([^#\n]*)#?(.*)$/g
Note that this regex will always return 3 captured groups in every match result.
RegEx Details:
^: Start
(\d+): Match 1+ digits and capture in group #1
,?: Match an optional comma
([^#\n]*): Match 0+ of any character that is not # or a newline and capture in group #2
#?: Match an optional #
(.*): Match 0+ any character and capture in group #3
$: End
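A quick way to check this pattern is to run it over the four sample inputs from the question. The snippet below is only a sketch: the helper name parseParts is made up for illustration, and the g flag is dropped because each string is tested on its own.

```javascript
// Pattern from the answer above, without the g flag (one string per call).
const partsRegex = /^(\d+),?([^#\n]*)#?(.*)$/;

// Hypothetical helper: returns [$1, $2, $3], or null when the string does not match.
function parseParts(str) {
  const m = str.match(partsRegex);
  return m ? m.slice(1) : null;
}

for (const s of ['12345,abcd#ABCD', '12345,abcd', '12345#ABCD', '12345']) {
  console.log(s, '->', parseParts(s));
}
```

Missing components come back as empty strings, because groups 2 and 3 always take part in the match.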
You can use
^([^,#]+)(?:,([^#]+))?(?:#(.+))?$
See the regex demo. (Newlines are added to the demo pattern because the test there runs against a single multiline string. In the real world, the strings to test won't contain newlines, so they are not in the pattern here.)
Details
^ - start of string
([^,#]+) - Group 1: one or more chars other than a comma and #
(?:,([^#]+))? - an optional non-capturing group matching 1 or 0 occurrences of a comma and then (capturing into Group 2) any one or more chars other than #
(?:#(.+))? - an optional non-capturing group matching 1 or 0 occurrences of a # char and then (capturing into Group 3) any one or more chars other than line break chars as many as possible
$ - end of string.
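As a rough sketch of how this second pattern behaves in JavaScript: optional groups that do not take part in the match are reported as undefined rather than as empty strings. The helper name parseOptional is hypothetical, and it normalizes those values to empty strings to mirror the output the question asks for.

```javascript
// Pattern from the answer above: the optional parts sit in optional non-capturing groups.
const optionalRegex = /^([^,#]+)(?:,([^#]+))?(?:#(.+))?$/;

// Hypothetical helper: returns [$1, $2, $3] with unmatched groups turned into ''.
function parseOptional(str) {
  const m = optionalRegex.exec(str);
  if (!m) return null;
  return [m[1], m[2] ?? '', m[3] ?? ''];
}

for (const s of ['12345,abcd#ABCD', '12345,abcd', '12345#ABCD', '12345']) {
  console.log(s, '->', parseOptional(s));
}
```

Because each separator is tied to a non-empty value, a dangling separator such as 12345, is rejected by this pattern.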
Related
I need a regular expression that matches the complete string with a zero/even number of backslashes anywhere in the string. If the string contains an odd number of backslashes, it should not match the complete string.
Example:
\\ -> match
\\\ -> does not match
test\\test -> match
test\\\test -> does not match
test\\test\ -> does not match
test\\test\\ -> match
and so on...
Note: We can assume any string of any length in place of 'test' in the above example
I am using the regular expression ^[^\\]*(\\\\)*[^\\]*$, but it does not match the backslashes after the second test.
For example:
test\\test (doesn't match anything after this)
Thanks for any help in advance.
You may use this regex:
^(?:(?:[^\\]*\\){2})*[^\\]*$
RegEx Breakdown:
^: Start
(?:: Start non-capture group #1
(?:: Start non-capture group #2
[^\\]*: Match 0 or more of any char except a \
\\: Match a \
){2}: End non-capture group #2. Repeat this group 2 times.
)*: End non-capture group #1. Repeat this group 0 or more times.
[^\\]*: Match 0 or more of any char except a \
$: End
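The breakdown above can be checked against the examples from the question with a short script. This is a sketch only; the helper bs is a made-up convenience that builds n literal backslashes, so the test strings do not depend on getting string escapes right.

```javascript
// Pattern from the answer above: backslashes are consumed two at a time,
// followed by a tail that contains no backslash.
const evenBackslashes = /^(?:(?:[^\\]*\\){2})*[^\\]*$/;

// Hypothetical helper: a string of n literal backslash characters.
const bs = (n) => '\\'.repeat(n);

const samples = [
  bs(2),                            // 2 backslashes
  bs(3),                            // 3 backslashes
  'test' + bs(2) + 'test',          // 2 in the middle
  'test' + bs(3) + 'test',          // 3 in the middle
  'test' + bs(2) + 'test' + bs(1),  // 2 + 1 = 3 in total
  'test' + bs(2) + 'test' + bs(2),  // 2 + 2 = 4 in total
];

for (const s of samples) {
  console.log(s, '->', evenBackslashes.test(s));
}
```

Strings with an even total (including zero) print true, and strings with an odd total print false.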
The current regular expression ^[^\\]*(\\\\)*[^\\]*$ can be interpreted as Any(\\)*Any, where Any means any run of characters other than a backslash.
The expected language is Any\\Any\\Any\\..., which can be obtained by wrapping the current regular expression in the Kleene closure operator, that is, (Any(\\)*Any)*.
The original regular expression after modification:
^([^\\]*(\\\\)*[^\\]*)*$
It can be further optimized as:
^((\\\\)*[^\\]*)*$
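One difference worth knowing, shown here as a small sketch: this pattern counts backslashes only in adjacent pairs, whereas the pattern ^(?:(?:[^\\]*\\){2})*[^\\]*$ from the other answer counts them anywhere in the string. The question's examples only use adjacent pairs, so which behaviour is wanted is an assumption you have to make. The variable names are made up for illustration.

```javascript
// Optimized pattern from this answer: pairs of adjacent backslashes only.
const pairsOnly = /^((\\\\)*[^\\]*)*$/;
// Pattern from the other answer: any even number of backslashes, adjacent or not.
const anyEvenCount = /^(?:(?:[^\\]*\\){2})*[^\\]*$/;

const adjacent = 'test\\\\test'; // test, 2 adjacent backslashes, test
const separated = 'a\\b\\c';     // a, backslash, b, backslash, c

console.log(pairsOnly.test(adjacent));     // true
console.log(anyEvenCount.test(adjacent));  // true
console.log(pairsOnly.test(separated));    // false
console.log(anyEvenCount.test(separated)); // true
```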
The first 3 characters need to be:
Exactly either ABC or ACD or BCD
Then followed by a hyphen -
Then followed by either a 5 or 8
Then any 4 digits
Examples:
ABC-56789 (True)
AAA-56789 (False)
I have tried this:
/^[^ABC$|^ACD$|^BCD$][*-][5|8][0-9]{4}$/
How about using this expression?
^(ABC|ACD|BCD)-[58]\d{4}$
[] denotes a character set, so [ABC] means any one of A, B or C, not the sequence ABC. For the same reason, | is a literal character inside [], so [5|8] would also accept |; use [58] instead.
A ^ at the start of [] negates the set, so the regex you used will not work as intended.
If you want to group tokens, use ().
You can also use \d (digit) instead of [0-9].
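To make the difference between a character set and a group concrete, here is a small sketch that runs the pattern from the question next to a grouped version. The variable names and the extra test strings x-56789 and x*|6789 are made up for illustration.

```javascript
// Pattern from the question: every [...] matches exactly ONE character,
// and the leading ^ inside the first set negates it.
const original = /^[^ABC$|^ACD$|^BCD$][*-][5|8][0-9]{4}$/;
// Grouped version: () for the three-letter prefix, a plain set for 5 or 8.
const corrected = /^(ABC|ACD|BCD)-[58]\d{4}$/;

console.log(original.test('ABC-56789'));  // false: A is excluded by the negated set
console.log(original.test('x-56789'));    // true: one char outside the set, then -
console.log(original.test('x*|6789'));    // true: * and | are literals inside []
console.log(corrected.test('ABC-56789')); // true
console.log(corrected.test('AAA-56789')); // false
```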
Use this regex:
const regex = /^(?:ABC|ACD|BCD)-[58][0-9]{4}$/;
[
'ABC-56789',
'AAA-56789'
].forEach(str => {
console.log(str, '==>', regex.test(str));
})
Output:
ABC-56789 ==> true
AAA-56789 ==> false
Explanation of regex:
^ -- anchor at beginning of string
(?:ABC|ACD|BCD) -- non-capture group with OR combinations
- -- literal dash
[58] -- a 5 or 8
[0-9]{4} -- four digits
$ -- anchor at end of string
Learn more about regex: https://twiki.org/cgi-bin/view/Codev/TWikiPresentation2018x10x14Regex
Use parentheses, not square brackets, to group alternation patterns:
^(ABC|ACD|BCD)-[58][0-9]{4}$
You have to change the regex to the following:
/^(ABC|ACD|BCD)-(5|8)[0-9]{4}$/
[] match single characters, but you want to match three characters in the beginning, so you have to use the () to create a capturing group.
Using a single alternation:
^(?:ABC|[AB]CD)-[58][0-9]{4}$
Explanation
^ Start of string
(?: Non capture group for the alternatives
ABC Match literally
| Or
[AB]CD Match either ACD or BCD
) Close the non capture group
- Match literally
[58] Match either 5 or 8
[0-9]{4} Match 4 digits 0-9
$ End of string
I'm trying to make a regular expression that only accepts:
Min 100 and at most 1000 characters; the characters ", ', <, > aren't allowed; two full stops one after another aren't allowed.
This is what I have for now:
^.{100,1000}$ → for 100 to 1000 characters
^[^"'<>]*$ → for the characters that aren't allowed
^([^._]|[.](?=[^.]|$)|_(?=[^_]|$))*$ → doesn't allow 2 consecutive dots
How do I combine these regexes into one? ._.
The part [^._] matches any character other than a dot or underscore, and the part [.](?=[^.]|$)|_(?=[^_]|$) matches a . or _ only when it is not followed by the same character (or when it is at the end of the string).
You could write the pattern using a single negative lookahead assertion that rules out __ or .. and combine it with the other two conditions:
^(?!.*([._])\1)[^"'<>\n]{100,1000}$
Explanation
^ Start of string
(?! Negative lookahead, assert that what is at the right is not
.*([._])\1 Match any chars, then capture either . or _ and match the same captured char right after it (meaning no occurrence of .. or __)
) Close lookahead
[^"'<>\n]{100,1000} Match 100-1000 times any character except the listed
$ End of string
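A short sketch that exercises the three conditions of the combined pattern. The test strings are made up with repeat() so that their lengths are easy to reason about; the names are hypothetical.

```javascript
// Combined pattern from the answer above.
const textRegex = /^(?!.*([._])\1)[^"'<>\n]{100,1000}$/;

const cases = {
  exactly100: 'a'.repeat(100),        // lower length bound
  tooShort: 'a'.repeat(99),           // one char short
  tooLong: 'a'.repeat(1001),          // one char too many
  singleDots: 'a.'.repeat(50),        // 100 chars, dots never adjacent
  doubleDot: 'a'.repeat(98) + '..',   // 100 chars, but contains ..
  forbidden: 'a'.repeat(99) + '<',    // 100 chars, but contains <
};

for (const [name, value] of Object.entries(cases)) {
  console.log(name, '->', textRegex.test(value));
}
```

Only exactly100 and singleDots pass. Note that the pattern also rejects two consecutive underscores, as in the question's own attempt.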
Regex demo (with the quantifier set to {10,100} for the demo)
I'm trying the RegEx below, which needs to require at least 2 characters before #:
^([a-zA-Z])[^.*-\s](?!.*[-_.#]{2})(?!.\.{2})[a-zA-Z0-9-_.]+#([\w-]+[\w]+(?:\.[a-z]{2,10}){1,2})$
like
NOT ALLOWED : aa.#co.kk.pp
NOT ALLOWED : aa..#co.kk.pp
NOT ALLOWED : a.a#co.kk.pp
SHOULD BE ALLOWED: aa#co.kk.pp
SHOULD BE ALLOWED: aaa#co.kk.pp
SHOULD BE ALLOWED: aa.s#co.kk.pp. (at least one char after special char and before #)
SHOULD BE ALLOWED: aa.ss#co.kk.pp
SHOULD BE ALLOWED: a#co.kk.pp
Before #, the only special characters allowed are . _ - and they must not be consecutive (like --) or appear at the beginning.
I also tried the RegEx below, but no luck:
^[a-zA-Z)]([^.*-\s])(?!.*[-_.#]{2}).(?!.\.{2})[\w.-]+#([\w-]+[\w]+(?:\.[a-z]{2,10}){1,2})$
I would suggest keeping things simple like this:
^([a-zA-Z][\w+-]+(?:\.\w+)?)#([\w-]+(?:\.[a-zA-Z]{2,10})+)$
This is by no means a comprehensive email validator regex, but it should meet your requirements.
Details:
^: Start
(: Start capture group #1
[a-zA-Z]: Match a letter
[\w+-]+: Match 1+ of word characters or - or +
(?:\.\w+)?: Match an optional part made of a dot followed by 1+ word characters
): End capture group #1
#: Match a #
(: Start capture group #2
[\w-]+: Match 1+ of word characters or -
(?:\.[a-zA-Z]{2,10})+: Match a dot followed by 2 to 10 letters. Repeat this group 1+ times
): End capture group #2
$: End
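Here is a sketch that runs this pattern over the sample inputs from the question; the variable names are made up for illustration.

```javascript
// Pattern from the answer above (the question uses # as the separator).
const addressRegex = /^([a-zA-Z][\w+-]+(?:\.\w+)?)#([\w-]+(?:\.[a-zA-Z]{2,10})+)$/;

const inputs = [
  'aa.#co.kk.pp',
  'aa..#co.kk.pp',
  'a.a#co.kk.pp',
  'aa#co.kk.pp',
  'aaa#co.kk.pp',
  'aa.s#co.kk.pp',
  'aa.ss#co.kk.pp',
  'a#co.kk.pp',
];

for (const s of inputs) {
  console.log(s, '->', addressRegex.test(s));
}
```

The first three inputs are rejected and the next four are accepted. The last one, a#co.kk.pp, is rejected as well: the question lists it as allowed, but the pattern enforces the stated rule of at least two characters before #.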
I want to match everything before the nth character (except the first character) and everything after it. So for the following string
/firstname/lastname/some/cool/name
I want to match
Group 1: firstname/lastname
Group 2: some/cool/name
With the following regex, I'm nearly there, but I can't find the correct regex to also correctly match the first group and ignore the first /:
([^\/]*\/){3}([^.]*)
Note that I always want to match the 3rd forward slash. Everything after that can be any character that is valid in a URL.
Your regex groups are not giving the proper result because ([^\/]*\/){3} repeats a capturing group, and each repetition overwrites the previously captured value.
You can use
^.([^/]+\/[^/]+)\/(.*)$
let str = `/firstname/lastname/some/cool/name`
let op = str.match(/^.([^/]+\/[^/]+)\/(.*)$/)
console.log(op)
This ignores the first /, then captures the first two words, then captures the rest of the string after the next /:
^(?:\/)([^\/]+\/[^\/]+)\/(.+)
The quantifier {3} repeats the capturing group 3 times, so the group ends up holding the value of the last iteration.
The first iteration will match /, the second firstname/ and the third (the last iteration) lastname/, which will be the value of the group.
The second group uses [^.]*, which matches 0+ characters other than a literal dot and does not take the structure of the data into account.
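The overwriting behaviour is easy to see by running the pattern from the question; this is a minimal sketch with made-up variable names.

```javascript
const path = '/firstname/lastname/some/cool/name';

// Pattern from the question: the capturing group itself is repeated 3 times.
const repeated = path.match(/([^\/]*\/){3}([^.]*)/);

console.log(repeated[1]); // lastname/  (only the last of the 3 iterations is kept)
console.log(repeated[2]); // some/cool/name
```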
If you want to match the full pattern, you could use:
^\/([^\/]+\/[^\/]+)\/([^\/]+(?:\/[^\/]+)+)$
Explanation
^ Start of string
( Capture group 1
[^\/]+\/[^\/]+ Match 1+ chars other than / using a negated character class, then a /, then again 1+ chars other than /
) Close group
\/ Match /
( Capture group 2
[^\/]+ Match 1+ times not /
(?:\/[^\/]+)+ Repeat 1+ times matching / and 1+ times not / to match the pattern of the rest of the string.
) Close group
$ End of string
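A small sketch of the full pattern in use. The helper name splitPath and the property names head and tail are hypothetical.

```javascript
// Full pattern from the answer above.
const fullPattern = /^\/([^\/]+\/[^\/]+)\/([^\/]+(?:\/[^\/]+)+)$/;

// Hypothetical helper: returns both groups, or null when the path does not fit.
function splitPath(p) {
  const m = fullPattern.exec(p);
  return m ? { head: m[1], tail: m[2] } : null;
}

console.log(splitPath('/firstname/lastname/some/cool/name'));
// { head: 'firstname/lastname', tail: 'some/cool/name' }

console.log(splitPath('/firstname/lastname/some'));
// null: group 2 needs at least two segments because of the final (?:...)+
```

If a single segment after the third slash should also be accepted, changing the last quantifier from + to * would be one way to allow it.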