Regex match url with params to specific pattern but not query string - javascript

My regex pattern:
const pattern = /^\/(test|foo|bar\/baz|en|ppp){1}/i;
const mat = pattern.exec(myURL);
I want to match:
www.mysite.com/bar/baz/myParam/...anything here
but not
www.mysite.com/bar/baz/?uid=100/..
myParam can be any string, with or without dashes. Anything else, such as a query string, may follow it, but a query string must not come immediately after baz.
Tried
/^\/(test|foo|bar\/baz\/[^/?]*|en|ppp){1}/i;
Nothing works.

This, I believe, is what you are asking for:
const myURL = "www.mysite.com/bar/baz/myParam/";
const myURL2 = "www.mysite.com/bar/baz/?uid=100";
const regex = /\/[^\?]\w+/gm;
console.log('with params', myURL.match(regex));
console.log('with queryParams', myURL2.match(regex))
You can test this and experiment further in Regex101, which also explains what each part of the regex does.
If that's not what you were asking for, there is a related question that solves this without regex.

For the 2 example strings, you might use
^[^\/]+\/bar\/baz\/[\w-]+\/.*$
If you want to use the alternations as well, it might look like
^[^\/]+\/(?:test|foo|bar)\/(?:baz|en|ppp)\/[\w-]+\/.*$
^ Start of string
[^\/]+ Match 1+ times any char except a /
\/ Match /
(?:test|foo|bar) Match 1 of the options
\/ Match /
(?:baz|en|ppp) Match 1 of the options
\/ Match /
[\w-]+ Match 1+ times a word char or -
\/ Match /
.* Match 0+ occurrences of any char except a newline
$ End of string
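As a quick sanity check, the first pattern can be run against the two URLs from the question with RegExp.prototype.test. This is only a sketch: the variable names and the third URL are made up for illustration.

```javascript
// Pattern from the answer above: host, then /bar/baz/, then a non-empty
// parameter made of word characters or dashes, then a slash and anything.
const pathPattern = /^[^\/]+\/bar\/baz\/[\w-]+\/.*$/;

// The first two inputs come from the question; the third is invented.
const samples = [
  "www.mysite.com/bar/baz/myParam/",
  "www.mysite.com/bar/baz/?uid=100/..",
  "www.mysite.com/bar/baz/my-param/page?uid=100",
];

const pathResults = samples.map((url) => pathPattern.test(url));
console.log(pathResults); // [ true, false, true ]
```

The query string directly after baz/ is rejected because [\w-]+ needs at least one word character or dash before the next slash.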

Using a negative lookahead or lookbehind will solve your problem. Two interpretations are possible, and the question does not say which one is intended:
?uid=100 is not allowed after the starting part /bar/baz, so www.mysite.com/test/bar/baz?uid=100 should be valid.
?uid=100 is not allowed anywhere in the string following /bar/baz, which means that www.mysite.com/test/bar/baz/?uid=100 is invalid as well.
Option 1
In short:
\/(test|foo|bar\/baz(?!\/?\?)|en|ppp)(\/[-\w?=]+)*\/?
Explanation of the important parts:
| # OR
bar # 'bar' followed by
\/ # '/' followed by
baz # 'baz'
(?! # (negative lookahead) so, **not** followed by
\/? # 0 or 1 times '/'
\? # '?'
) # END negative lookahead
and
( # START group
\/ # '/'
[-\w?=]+ # any word char, or '-','?','='
)* # END group, occurrence 0 or more times
\/? # optional '/'
You can make the lookahead even more specific with something like (?!\/?\?\w+=\w+) to make explicit that ?a=b is not allowed, but that's up to you.
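A small sketch of how the Option 1 pattern behaves, using the URLs mentioned in this answer and the question. The names option1Urls and option1Results are illustrative only.

```javascript
// Option 1 pattern from above: "bar/baz" must not be followed by "?" or "/?".
const option1 = /\/(test|foo|bar\/baz(?!\/?\?)|en|ppp)(\/[-\w?=]+)*\/?/;

const option1Urls = [
  "www.mysite.com/bar/baz/myParam/",      // parameter after baz
  "www.mysite.com/bar/baz/?uid=100",      // query string right after baz/
  "www.mysite.com/bar/baz?uid=100",       // query string right after baz
  "www.mysite.com/test/bar/baz?uid=100",  // starts with /test, so allowed
];

const option1Results = option1Urls.map((url) => option1.test(url));
console.log(option1Results); // [ true, false, false, true ]
```

Note that the pattern is not anchored, so test only reports whether some part of the URL matches.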
Option 2
To make explicit that ?a=b is not allowed anywhere, we can use a negative lookbehind. Let's first find a pattern that rejects ?a=b when bar/baz immediately precedes it.
Shorthand:
(?<!bar\/baz\/?)\?\w+=\w+
Explanation:
(?<! # Negative lookbehind: do **not** match preceding
bar\/baz # 'bar/baz'
\/? # optional '/'
)
\? # match '?'
\w+=\w+ # match e.g. 'a=b'
Let's make this part of the complete regex:
\/(test|foo|en|ppp|bar\/baz)(\/?((?<!bar\/baz\/?)\?\w+=\w+|[-\w]+))*\/?$
Explanation:
\/ # match '/'
(test|foo|en|ppp|bar\/baz) # start with 'test', 'foo', 'en', 'ppp', 'bar/baz'
(\/? # optional '/'
((?<!bar\/baz\/?)\?\w+=\w+ # match '?a=b', with negative lookbehind (see above)
| # OR
[-\w]+) # 1 or more word chars or '-'
)* # repeat 0 or more times
\/? # optional match for closing '/'
$ # end anchor
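The complete Option 2 pattern can be tried out as follows. This is a sketch that assumes a JavaScript engine with lookbehind support (added in ES2018); the third URL and the variable names are invented for illustration.

```javascript
// Option 2 pattern from above: a query string is rejected only when it
// directly follows "bar/baz" or "bar/baz/".
const option2 = /\/(test|foo|en|ppp|bar\/baz)(\/?((?<!bar\/baz\/?)\?\w+=\w+|[-\w]+))*\/?$/;

const option2Urls = [
  "www.mysite.com/bar/baz/myParam/",
  "www.mysite.com/bar/baz/?uid=100",
  "www.mysite.com/bar/baz/myParam/?uid=100",
];

const option2Results = option2Urls.map((url) => option2.test(url));
console.log(option2Results); // [ true, false, true ]
```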

Related

JavaScript RegEx - Minimum characters with Wildcard

I'm working on matching a wildcard search input. It's a name field.
Below are the conditions I need to match.
The user must enter at least 3 alphanumeric characters if they choose to do a wildcard search.
The user may or may not enter a wildcard at the start or end of the string, but it can be on either side.
Allow spaces between words.
I want to mention that I'm trimming the string before doing a match. This is what I have tried so far.
^[^\W_](\s?\w?)*$|^[^\W_]{3,}(\s?\w?)*\*$|^[\*][^\W_]{3,}(\s?\w?)*$
Below are some examples I tried:
someone xxx, someone xxx yyy - Passed
someone* xxx - Failed
someone , someone - Passed
This is the nearest match to what I want, but it fails for these test cases:
AB asf* -- fails, but this passes: ABC asf*
*AB asf -- fails, but this passes: *ABC asf
I know my pattern has a condition that says: start with at least 3 alphanumeric characters, then repeat spaces and alphanumeric characters.
That's what I need help with.
Thanks.
UPDATE 2: This pattern should do:
/^([a-zA-Z0-9]{3,}[^\n*]*\*?|\*[a-zA-Z0-9]{2,}[^\n*]*|[a-zA-Z0-9]{2}\*)$/gm
EXPLANATION:
^ # assert start of line
( # 1st capturing group starts
[a-zA-Z0-9]{3,} # match 3+ times alphanumeric characters
[^\n*]* # match 0 or more non-newline and non-star (*) characters
\*? # match 0 or one literal star (*) character;
| # OR
\* # match one literal star (*) character
[a-zA-Z0-9]{2,} # match 2+ times alphanumeric characters
[^\n*]* # match 0 or more non-newline and non-star (*) characters;
| # OR
[a-zA-Z0-9]{2} # match exactly 2 alphanumeric characters
\* # match one literal star (*) character
) # 1st capturing group ends
$ # assert end of line
Try this one:
^(?:[^\W_]+|\*[^\W_]{3,}|[^\W_]{3,}\*)(?:\s+(?:[^\W_]+|\*[^\W_]{3,}|[^\W_]{3,}\*))*$
NOTE: using [^\W_] instead of \w just as in your original regex.
However, I argue that this task cannot be solved in a clean way using a regex. Maybe a proper javascript function would be more readable.
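As a rough illustration of the function-based approach suggested above, here is a minimal sketch. The function name isValidSearch is hypothetical, and the rules encode one reading of the question: a wildcard may appear on one side only, and a wildcard search needs at least 3 alphanumeric characters in total. Adjust these assumptions if the real requirements differ.

```javascript
// Hypothetical validator; the rules are assumptions, not the asker's spec.
function isValidSearch(input) {
  const text = input.trim();
  const startsWithStar = text.startsWith("*");
  const endsWithStar = text.endsWith("*");

  // Assumption: a wildcard is allowed at the start or the end, not both.
  if (startsWithStar && endsWithStar) return false;

  // Strip the single leading or trailing wildcard, if there is one.
  let body = text;
  if (startsWithStar) body = text.slice(1);
  else if (endsWithStar) body = text.slice(0, -1);

  // Only letters and digits, with single spaces between words.
  if (!/^[a-z0-9]+(?: [a-z0-9]+)*$/i.test(body)) return false;

  // Without a wildcard there is no minimum length.
  if (!startsWithStar && !endsWithStar) return true;

  // Assumption: count alphanumeric characters across all words.
  return body.replace(/ /g, "").length >= 3;
}

console.log(isValidSearch("someone xxx"));  // true
console.log(isValidSearch("someone* xxx")); // false
console.log(isValidSearch("AB asf*"));      // true
console.log(isValidSearch("*AB asf"));      // true
console.log(isValidSearch("AB*"));          // false
```

Splitting the checks into named steps makes each rule easy to change independently, which is harder to do inside a single regex.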
If I understand the requirements correctly, this might work. It does in my tests.
^(?:\*[^\W_]{3,}(?:\s*[^\W_]\s*)*|(?:\s*[^\W_]\s*)*[^\W_]{3,}\*|(?:\s*[^\W_]\s*)+)$
Expanded
^ # BOS
(?: # One of either ---
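\* # Star at beginning
[^\W_]{3,} # 3 or more alphanumeric characters
(?: \s* [^\W_] \s* )* # Any number of alphanumeric characters, each optionally surrounded by whitespace
| # or,
(?: \s* [^\W_] \s* )* # Any number of alphanumeric characters, each optionally surrounded by whitespace
[^\W_]{3,} # 3 or more alphanumeric characters
\* # Star at end
| # or,
(?: \s* [^\W_] \s* )+ # One or more alphanumeric characters, each optionally surrounded by whitespace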
) # ---------
$ # EOS

Regex to accept only numbers and a specific char

In JavaScript I have
var regex = /^\d+$/;
which accepts only numbers. How can I change it to accept numbers and the character '-'?
You can use a character class for that:
var regex = /^[\d-]+$/;
However, this will also allow matches like ----. If you only want to allow inputs like 123-456-789 but not -123 or 123- or 123--456, then you can use something like
var regex = /^\d+(?:-\d+)*$/;
Explanation:
^ # Start of string.
\d+ # Match a number.
(?: # Start of a non-capturing group that matches...
- # a hyphen,
\d+ # followed by a number
)* # ...any number of times, including zero.
$ # End of string
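To see the difference between the two patterns, the following sketch runs both against the inputs mentioned above. The names loose, strict, and comparison are illustrative only.

```javascript
// Character-class version: any mix of digits and hyphens.
const loose = /^[\d-]+$/;
// Stricter version: digit groups separated by single hyphens.
const strict = /^\d+(?:-\d+)*$/;

const inputs = ["123", "123-456-789", "-123", "123-", "123--456", "----"];

// Record how each input fares under both patterns.
const comparison = inputs.map((input) => ({
  input,
  loose: loose.test(input),
  strict: strict.test(input),
}));

for (const row of comparison) {
  console.log(row.input, row.loose, row.strict);
}
// The loose pattern accepts all six inputs;
// the strict one accepts only "123" and "123-456-789".
```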

Javascript regex for matching twitter-like hashtags

I'd like some help on figuring out the JS regex to use to identify "hashtags", where they should match all of the following:
1. The usual twitter style hashtags: #foobar
2. Hashtags with text preceding: abc123#xyz456
3. Hashtags with space in them, which are denoted as: #[foo bar] (that is, the [] serves as delimiter for the hashtag)
For 1 and 2, I was using something of the following form:
var all_re =/\S*#\S+/gi;
I can't seem to figure out how to extend it to 3. I'm not good at regexps, some help please?
Thanks!
So it has to match either all non-space characters or any characters between (and including) [ and ]:
\S*#(?:\[[^\]]+\]|\S+)
Explanation:
\S* # any number of non-white space characters
# # matches #
(?: # start non-capturing group
\[ # matches [
[^\]]+ # any character but ], one or more
\] # matches ]
| # OR
\S+ # one or more non-white space characters
) # end non-capturing group
Reference: alternation, negated character classes.
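A short sketch of the pattern in use with String.prototype.match and the g flag. The sample sentence and variable names are made up for illustration.

```javascript
// Pattern from above, with the global flag so match() returns every hashtag.
const hashtagPattern = /\S*#(?:\[[^\]]+\]|\S+)/g;

// Invented sample text containing all three hashtag styles.
const sampleText = "Hello #foobar and abc123#xyz456 plus #[foo bar] end";

const tags = sampleText.match(hashtagPattern);
console.log(tags); // [ '#foobar', 'abc123#xyz456', '#[foo bar]' ]
```

The bracketed alternative is listed first so that #[foo bar] is tried as a whole before the \S+ branch can stop at the space.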
How about this?
var all_re =/(\S*#\[[^\]]+\])|(\S*#\S+)/gi;
I had a similar problem, but I only wanted a match when the whole string is a single hashtag. Hopefully someone else can make use of my solution.
This one matches "#myhashtag" but not "gfd#myhashtag" or "#myhashtag ".
/^#\S+$/
^ # Start of string
\S # Any char that is not a white space
+ # One or more of said char
$ # End of string
Simple as that.
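The three cases mentioned above can be checked like this; a minimal sketch with illustrative variable names.

```javascript
// Whole-string hashtag: "#" at the start, then only non-space characters.
const wholeHashtag = /^#\S+$/;

const hashtagChecks = [
  wholeHashtag.test("#myhashtag"),    // true
  wholeHashtag.test("gfd#myhashtag"), // false: text before the "#"
  wholeHashtag.test("#myhashtag "),   // false: trailing space
];
console.log(hashtagChecks); // [ true, false, false ]
```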

Regular expression negative match

I can't seem to figure out how to compose a regular expression (used in Javascript) that does the following:
Match all strings where the characters after the 4th character do not contain "GP".
Some example strings:
EDAR - match!
EDARGP - no match
EDARDTGPRI - no match
ECMRNL - match
I'd love some help here...
Use zero-width assertions:
if (subject.match(/^.{4}(?!.*GP)/)) {
// Successful match
}
Explanation:
^ # Assert position at the beginning of the string
. # Match any single character that is not a line break character
{4} # Exactly 4 times
(?! # Assert that it is impossible to match the regex below starting at this position (negative lookahead)
. # Match any single character that is not a line break character
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
GP # Match the characters “GP” literally
)
You can use what's called a negative lookahead assertion here. It looks into the string ahead of the location and matches only if the pattern contained is /not/ found. Here is an example regular expression:
/^.{4}(?!.*GP)/
This matches only if, after the first four characters, the string GP is not found.
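Running the pattern over the example strings from the question shows the expected results. This is just a sketch; the extra string "GPAB" is invented to show that GP inside the first four characters is ignored.

```javascript
// Four characters, then a negative lookahead rejecting any later "GP".
const noGpAfterFour = /^.{4}(?!.*GP)/;

const codes = ["EDAR", "EDARGP", "EDARDTGPRI", "ECMRNL", "GPAB"];

const codeResults = codes.map((code) => noGpAfterFour.test(code));
console.log(codeResults); // [ true, false, false, true, true ]
```

Note that strings shorter than four characters never match, because .{4} requires exactly four characters first.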
You could do something like this:
var str = "EDARDTGPRI";
var test = !(/GP/.test(str.slice(4)));
test will be true for strings that should match and false otherwise.

Replace spaces but not when between parentheses

I guess I can do this with multiple regexs fairly easily, but I want to replace all the spaces in a string, but not when those spaces are between parentheses.
For example:
Here is a string (that I want to) replace spaces in.
After the regex I want the string to be
Hereisastring(that I want to)replacespacesin.
Is there an easy way to do this with lookahead or lookbehind operators?
I'm a little confused about how they work, and not really sure they would work in this situation.
Try this:
replace(/\s+(?=[^()]*(\(|$))/g, '')
A quick explanation:
\s+ # one or more white-space chars
(?= # start positive look ahead
[^()]* # zero or more chars other than '(' and ')'
( # start group 1
\( # a '('
| # OR
$ # the end of input
) # end group 1
) # end positive look ahead
In plain English: it matches one or more white space chars if either a ( or the end-of-input can be seen ahead without encountering any parenthesis in between.
An online Ideone demo: http://ideone.com/jaljw
The above will not work if:
there are nested parentheses
parentheses can be escaped
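The following sketch applies the replacement to the sentence from the question, and to one invented nested input to illustrate the first limitation. The variable names are illustrative only.

```javascript
// Remove whitespace only when the next parenthesis ahead is "(" or when
// no parenthesis follows at all (end of input).
const outsideParens = /\s+(?=[^()]*(\(|$))/g;

const sentence = "Here is a string (that I want to) replace spaces in.";
const stripped = sentence.replace(outsideParens, "");
console.log(stripped); // Hereisastring(that I want to)replacespacesin.

// Nested parentheses: the space after "b" sits inside the outer pair,
// but it is still removed because a "(" follows it.
const nested = "a (b (c d) e) f";
const nestedStripped = nested.replace(outsideParens, "");
console.log(nestedStripped); // a(b(c d) e)f
```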
