What's wrong with the non capture group in my regular expression


I'm trying to write a regular expression that will match strings similar to the ones below:
Yu MSBE26
w AWAQBNL
I am using JavaScript and have come up with the following regular expression:
(.*?(?:[AWMS\d]{2})[AWMS\d]{2}[A-Z]{2}[\dA-Za-z]{1,3})
In words, I start my capture group off by matching everything until the [AWMS\d]{2} pattern is encountered, then I match the [AWMS\d]{2} pattern, the [A-Z]{2} that follows and finally the [\dA-Za-z]{1,3} to match the final two or three characters.
From what I have read, this should be working, but I'm not getting any matches.
For example when I use a regex tester I don't get any matches: Sample

Remove the second [AWMS\d]{2} - it looks like an accidental addition and is the reason your regex doesn't work:
(.*?(?:[AWMS\d]{2})[A-Z]{2}[\dA-Za-z]{1,3})
Edit: you don't even need the non capture group, the square brackets are enough:
(.*?[AWMS\d]{2}[A-Z]{2}[\dA-Za-z]{1,3})
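As a quick check, here is a minimal sketch that runs the original pattern and the simplified one against the two sample strings from the question. The variable names (`original`, `fixed`, `samples`, `results`) are only illustrative.

```javascript
// Original pattern from the question and the simplified pattern from the answer.
const original = /(.*?(?:[AWMS\d]{2})[AWMS\d]{2}[A-Z]{2}[\dA-Za-z]{1,3})/;
const fixed = /(.*?[AWMS\d]{2}[A-Z]{2}[\dA-Za-z]{1,3})/;

// The two sample strings given in the question.
const samples = ["Yu MSBE26", "w AWAQBNL"];

const results = samples.map((s) => {
  const m = s.match(fixed);
  return {
    input: s,
    originalMatches: original.test(s), // false: needs four chars from [AWMS\d] in a row
    fixedMatch: m ? m[1] : null,       // capture group 1 of the simplified pattern
  };
});

for (const r of results) console.log(r);
```

The original pattern fails because neither sample contains four consecutive characters from `[AWMS\d]`; with the duplicate removed, group 1 holds the whole sample string.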

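Your regex doesn't match your values because the pattern requires four consecutive characters from [AWMS\d] (two for each [AWMS\d]{2}), and neither string contains them.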
Your pattern is:
(.*?(?:[AWMS\d]{2})[AWMS\d]{2}[A-Z]{2}[\dA-Za-z]{1,3})
Yu MSBE26
     ^--- fails here
w AWAQBNL
     ^--- fails here
Btw, you can match your strings by dropping the duplicated [AWMS\d]{2} and the non-capturing group, like this:
(.*?[AWMS\d]{2}[A-Z]{2}[\dA-Za-z]{1,3})
Working demo

Related

Is it possible to replace a negative "look around" in a regular expression with something that doesn't use a look-behind?

I have a regex that matches a single character x but not xx.
(?<!x)x(?!x)
This works great; the problem is that the regex engine in Firefox doesn't support look-behinds (this bug).
Is there any way to rewrite this so it doesn't use a look-behind?
edit:
the solution in the duplicate link doesn't work in this case. If I replace the negative look-behind with
(?!^x)x(?!x)
It doesn't match correctly.

Here is a workaround:
(?:^|[^x])x(?!x)
Demo
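It matches an x that is preceded by the beginning of the line or by a non-x character. Note that the preceding character, if there is one, becomes part of the match.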

You could use non-capturing groups:
(?:[^x]|^)(x)(?:[^x]|$)
This searches for a single "x" in a string. Yes, the regex can match up to three characters, but the x itself is available as group 1 ($1).
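To illustrate the capture-group idea, here is a minimal sketch that combines a consumed "start or non-x" prefix with a capture group and a negative look-ahead, then recovers the position of each lone x. The helper name `findLoneX` and the test string are hypothetical, and `String.prototype.matchAll` assumes an ES2020 environment.

```javascript
// Find a lone "x" without a look-behind: consume the preceding non-x
// character (or the start of the string) and keep the x itself in group 1.
const loneX = /(?:^|[^x])(x)(?!x)/g;

// Returns the index of every x that has no x directly before or after it.
function findLoneX(text) {
  const positions = [];
  for (const m of text.matchAll(loneX)) {
    // m.index is the start of the whole match, which may include one
    // preceding character; the x is always the last character matched.
    positions.push(m.index + m[0].length - 1);
  }
  return positions;
}

console.log(findLoneX("x axb xx cxd x")); // [ 0, 3, 10, 13 ]
```

Because the look-ahead does not consume the following character, two lone x's separated by a single character (as in "xax") are both found.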

Regex: Match until first occurrence met

I am trying to match up to the first occurrence of &. Right now my pattern matches up to the last occurrence of &.
My regular expression is
(?!^)(http[^\\]+)\&
And I'm trying to match against this text:
https://www.google.com/url?rct3Dj&sa3Dt&url3Dhttp://business.itbusinessnet.com/article/WorldStage-Supports-Massive-4K-Video-Mapping-at-Adobe-MAX-with-Christie-Boxer-4K-Projectors---4820052&ct3Dga&cd3DCAEYACoTOTEwNTAyMzI0OTkyNzU0OTI0MjIaMTBmYTYxYzBmZDFlN2RlZjpjb206ZW46VVM&usg3DAFQjCNE6oIhIxR6qRMBmLkHOJTKLvamLFg
What I need is:
http://business.itbusinessnet.com/article/WorldStage-Supports-Massive-4K-Video-Mapping-at-Adobe-MAX-with-Christie-Boxer-4K-Projectors---4820052
Click for the codebase.

Use the non-greedy mode like this:
/(?!^)(http[^\\]+?)&/
//               ^
In non-greedy mode (or lazy mode) the match will be as short as possible.
If you want to get rid of the & then just wrap it in a lookahead group so it won't be in the match, like this:
/(?!^)(http[^\\]+?)(?=&)/
//                  ^^ ^
Or you could optimize the regular expression, as @apsillers suggested in the comment below, like this:
/(?!^)(http[^\\&]+)/
Note: & is not a special character, so you don't need to escape it.
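The difference between the three variants can be seen in a small sketch. The input below is a shortened, hypothetical stand-in for the URL in the question (same shape, made-up host and path), and the variable names are illustrative.

```javascript
// Hypothetical, shortened stand-in for the URL in the question.
const text =
  "https://www.google.com/url?rct3Dj&sa3Dt&url3Dhttp://example.com/article/some-title---4820052&ct3Dga&usg3DAFQ";

const greedy = /(?!^)(http[^\\]+)&/;   // original: runs to the LAST &
const lazy = /(?!^)(http[^\\]+?)&/;    // non-greedy: stops at the FIRST &
const negated = /(?!^)(http[^\\&]+)/;  // excludes & from the class instead

const greedyUrl = text.match(greedy)[1];
const lazyUrl = text.match(lazy)[1];
const negatedUrl = text.match(negated)[1];

console.log(greedyUrl);  // http://example.com/article/some-title---4820052&ct3Dga
console.log(lazyUrl);    // http://example.com/article/some-title---4820052
console.log(negatedUrl); // http://example.com/article/some-title---4820052
```

The negated-class version usually needs less backtracking than the lazy one, since the class itself stops at the first &.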

How to find which part/group of regular expression fails

I need to find which part of an expression fails. Let's say I have the expression ^-?(\\d*)(,\\d{1,3})*(?:[,]|([.]\\d{0,2}))?$ and I want to know whether it fails while matching the comma (,) or the decimal part. How can I find the unmatched group in a given regular expression?

Break it into smaller chunks and test that each part matches what you expect it to.
Also, as @Avinash Raj has mentioned, online regex checkers like regex101 are indispensable.
These tools highlight what has and hasn't been matched in a given set of data. This will show you where the regex is failing.
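One way to "break it into smaller chunks" programmatically is sketched below. It is an illustration rather than a general debugger: the split into `pieces`, their labels, and the helper `traceMatch` are assumptions, and the reported text per piece can be misleading for patterns where a later piece forces an earlier one to backtrack.

```javascript
// The expression from the question, split by hand into labelled pieces.
const pieces = [
  ["sign", "-?"],
  ["integer digits", "(\\d*)"],
  ["comma groups", "(,\\d{1,3})*"],
  ["trailing comma or decimals", "(?:[,]|([.]\\d{0,2}))?"],
  ["end of input", "$"],
];

// Grow the pattern one piece at a time and record what each step consumed.
function traceMatch(input) {
  const report = [];
  let pattern = "^";
  let consumed = 0;
  for (const [label, source] of pieces) {
    pattern += source;
    const m = new RegExp(pattern).exec(input);
    if (m === null) {
      // This piece is where the match breaks; show the unconsumed remainder.
      report.push({ label, ok: false, rest: input.slice(consumed) });
      break;
    }
    report.push({ label, ok: true, text: m[0].slice(consumed) });
    consumed = m[0].length;
  }
  return report;
}

console.log(traceMatch("-1,234.56"));
console.log(traceMatch("1,234.567"));
```

For "1,234.567" the last entry reports that "end of input" failed with "7" left over, which points at the decimal part allowing at most two digits.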

How to write a RegEx to check for a certain, specific number of characters?

I am trying to test a string for a state code, the regex I have is
^A[LKSZRAEP]|C[AOT]|D[EC]|F[LM]|G[AU]|HI|I[ADLN]|K[SY]|LA|M[ADEHINOPST]|N[CDEHJMVY]|O[HKR]|P[ARW]|RI|S[CD]|T[NX]|UT|V[AIT]|W[AIVY]$
The issue is, if I have something like "CTA12" as a test string, it will get a match of CT. How can I modify my regex to make it only match state codes that are not part of a larger string?

Your use of anchors with alternation is incorrect, ^AB|DC$ means "strings that start with AB or end with DC". To get the ^ and $ to both apply to each element of the alternation, you need to put the alternation in a group, for example ^(AB|DC)$.
Try changing your regex to the following:
^(A[LKSZRAEP]|C[AOT]|D[EC]|F[LM]|G[AU]|HI|I[ADLN]|K[SY]|LA|M[ADEHINOPST]|N[CDEHJMVY]|O[HKR]|P[ARW]|RI|S[CD]|T[NX]|UT|V[AIT]|W[AIVY])$
The alternative to using a group is to put the ^ and $ as a part of each element in the alternation, for example ^AB$|^DC$, but that would make your regex significantly longer so a group is the way to go.
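The effect of the group can be checked with a short sketch using the two patterns above; the variable names are illustrative.

```javascript
// Without a group, ^ binds only to the first alternative and $ only to the last.
const ungrouped =
  /^A[LKSZRAEP]|C[AOT]|D[EC]|F[LM]|G[AU]|HI|I[ADLN]|K[SY]|LA|M[ADEHINOPST]|N[CDEHJMVY]|O[HKR]|P[ARW]|RI|S[CD]|T[NX]|UT|V[AIT]|W[AIVY]$/;

// With a group, both anchors apply to every alternative.
const grouped =
  /^(A[LKSZRAEP]|C[AOT]|D[EC]|F[LM]|G[AU]|HI|I[ADLN]|K[SY]|LA|M[ADEHINOPST]|N[CDEHJMVY]|O[HKR]|P[ARW]|RI|S[CD]|T[NX]|UT|V[AIT]|W[AIVY])$/;

const partial = "CTA12".match(ungrouped);
console.log(partial[0]);            // CT (matched inside a longer string)
console.log(grouped.test("CTA12")); // false
console.log(grouped.test("CT"));    // true
console.log(grouped.test("TX"));    // true
```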

Alternation operator inside square brackets does not work

I'm creating a JavaScript regex to match queries in a search engine string. I am having a problem with alternation. I have the following regex:
.*baidu.com.*[/?].*wd{1}=
I want to be able to match strings that have the string 'word' or 'qw' in addition to 'wd', but everything I try is unsuccessful. I thought I would be able to do something like the following:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
but it does not seem to work.

Replace [wd|word|qw] with (wd|word|qw) or (?:wd|word|qw).
[] denotes character sets, () denotes logical groupings.
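A small sketch makes the difference visible, using only the tail of the pattern (the part before `=`); the test strings are made up for illustration.

```javascript
// Inside [], the characters w d | o r q are a set: exactly ONE of them matches.
const withClass = /[wd|word|qw]{1}=/;
// Inside (?:...), | separates whole alternatives.
const withGroup = /(?:wd|word|qw)=/;

const classMatch = "?word=x".match(withClass)[0];
const groupMatch = "?word=x".match(withGroup)[0];

console.log(classMatch);             // d=
console.log(groupMatch);             // word=
console.log(withClass.test("foo=")); // true, because "o" is in the set
console.log(withGroup.test("foo=")); // false
```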

Your expression:
.*baidu.com.*[/?].*[wd|word|qw]{1}=
does need a few changes, including [wd|word|qw] to (wd|word|qw) and getting rid of the redundant {1}, like so:
.*baidu.com.*[/?].*(wd|word|qw)=
But you also need to understand that the first part of your expression (.*baidu.com.*[/?].*) will match baidu.com hello what spelling/handle????????? or hbaidu-com/ or even something like lkas----jhdf lkja$##!3hdsfbaidugcomlaksjhdf.[($?lakshf, because the dot (.) matches any character except newlines... to match a literal dot, you have to escape it with a backslash (like \.)
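The unescaped dot can be demonstrated with a brief sketch. The URLs are hypothetical examples, and the forward slash inside the character class is escaped here only to keep the regex literal unambiguous.

```javascript
// Unescaped dot: "baidu.com" also matches "baidugcom", "baidu-com", etc.
const loose = /.*baidu.com.*[\/?].*(wd|word|qw)=/;
// Escaped dot: only a literal "." is accepted between "baidu" and "com".
const strict = /.*baidu\.com.*[\/?].*(wd|word|qw)=/;

const real = "http://www.baidu.com/s?wd=regex"; // hypothetical search URL
const fake = "hbaidugcom/x?wd=1";               // not a baidu.com URL at all

console.log(loose.test(real), loose.test(fake));   // true true
console.log(strict.test(real), strict.test(fake)); // true false
```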
There are several approaches you could take to match things in a URL, but we could help you more if you tell us what you are trying to do or accomplish - perhaps regex is not the best solution or (EDIT) only part of the best solution?
