Regex Pattern Matching for similar URL - javascript

I need to send a different response for different URLs, but the regexes I am using are not working.
The two regexes in question are
"/v1/users/[^/]+/permissions/domain/HTTP/"
(Eg: http://localhost:4544/v1/users/10feec20-afd9-46a0-a3fc-9b2f18c1d363/permissions/domain/HTTP)
and
"/v1/users/[^/]+/"
(Eg: http://localhost:4544/v1/users/10feec20-afd9-46a0-a3fc-9b2f18c1d363)
I am not able to figure out how to stop the regex from matching after "[^/]+/". Both patterns return the same result, as if the regex treats the two URLs as the same. The pattern matching happens in the mountebank mocking server using a matching predicate. Any help would be appreciated. Thanks.

The regular expression "/v1/users/[^/]+/" matches both URLs. You are asking it to match '/v1/users/' plus one or more characters other than '/', followed by a slash. That sequence occurs in both the longer URL and the short one (when the short one ends with a slash), which is why it matches.
A couple of options:
You can match the longer url and not the shorter one with:
"/v1/users/[^/]+/.+"
This matches http://localhost:4544/v1/users/10feec20-afd9-46a0-a3fc-9b2f18c1d363/permissions/domain/HTTP, but not http://localhost:4544/v1/users/10feec20-afd9-46a0-a3fc-9b2f18c1d363/
You could also match just the short one by anchoring the end:
"/v1/users/[^/]+/$"
This matches the short URL but not the long one.
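The behaviour of the three patterns above can be checked in plain JavaScript. This is only a sketch of the regex behaviour, not mountebank configuration; the variable names are made up for illustration, and the short URL is assumed to end with a slash, as in the answer.

```javascript
// Example URLs from the question; the short one carries a trailing slash.
const longUrl = "http://localhost:4544/v1/users/10feec20-afd9-46a0-a3fc-9b2f18c1d363/permissions/domain/HTTP";
const shortUrl = "http://localhost:4544/v1/users/10feec20-afd9-46a0-a3fc-9b2f18c1d363/";

// Hypothetical names for the three patterns discussed above.
const unanchored = new RegExp("/v1/users/[^/]+/");   // original pattern
const longOnly = new RegExp("/v1/users/[^/]+/.+");   // requires something after the slash
const shortOnly = new RegExp("/v1/users/[^/]+/$");   // requires the string to end after the slash

console.log(unanchored.test(longUrl), unanchored.test(shortUrl)); // true true
console.log(longOnly.test(longUrl), longOnly.test(shortUrl));     // true false
console.log(shortOnly.test(longUrl), shortOnly.test(shortUrl));   // false true
```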

Related

Regular Expression to find a pattern and replace just part of it

I want to know how can I use RegEx to find a pattern and replace just a part of it in JavaScript.
Let's say, for example, I want to replace some patterns like this -foo but just if it has a - after it, like -foo- but replace just the -foo.
Can someone please explain in details the RegEx construction to achieve it?
I did not find a detailed explanation of it here, just codes with a minimum explanation.
You need to use a positive look-ahead (?=-) that will check the existence of - after -foo but will not consume it:
var s = "-foo- -foo";
alert(s.replace(/-foo(?=-)/g, 'REPLACED'));
You can read more about look-aheads (and look-behinds, which were only added to JavaScript in ES2018 and are not supported by older JS regex engines) at regular-expressions.info.
The main idea is that the text is checked for presence or absence of some patterns defined in the look-around, and based on that either allow or fail the match. They can actually be used efficiently together with anchors, but this is not the case here.
Lookahead and lookbehind, collectively called "lookaround", are zero-length assertions... lookaround actually matches characters, but then gives up the match, returning only the result: match or no match... They do not consume characters in the string, but only assert whether a match is possible or not.
As the first poster said, you need to make use of a lookahead (?=) to check for one or more additional characters. In this situation, the character you need to look for is -, therefore your pattern would use a lookahead containing -, i.e. (?=-).
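To make the "does not consume" point concrete, here is a small sketch that compares the lookahead with a capture-group version. The capture-group variant is not from the answers above; it is shown only for contrast, and the variable names are illustrative.

```javascript
const input = "-foo- -foo";

// (?=-) asserts that a "-" follows but leaves it out of the match,
// so only "-foo" is replaced and the trailing "-" is kept.
const withLookahead = input.replace(/-foo(?=-)/g, "REPLACED");
console.log(withLookahead); // "REPLACED- -foo"

// Alternative without a lookahead: consume the "-" and put it back via $1.
const withCapture = input.replace(/-foo(-)/g, "REPLACED$1");
console.log(withCapture); // "REPLACED- -foo"
```

Both give the same result for this input. They can differ when matches share the separator, for example in "-foo-foo-", because the capture version consumes the "-" that the next match would need.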

RegEx "ignores" quantifier?

Basically I have the following string: http://www.-woejfewiofjewow
which is NOT allowed to be matched
My Regex: http://(www\.[^-])?[^-].*
(I used regexr.com to check it..)
The thing is, it doesn't use the first part of the regex (www\.[^-])? but the second part: [^-].*
I don't really know how to solve this problem, is there any possibility?
I am trying to search valid URLs (well in this case without .com) with the following format: http://www.test http://test
Hyphens at the beginning are not allowed (but http://www.test-test is allowed)
I am trying to find a solution without lookaheads
I think you actually need a negative lookahead assertion.
\bhttp:\/\/(?!www\.-)[^-].*
(?!www\.-) is a negative lookahead which asserts that the double forward slashes // must not be followed by www.-
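A quick way to check the negative lookahead is to run it against the URLs from the question. This is a minimal sketch; the helper name isAllowed and the extra sample "http://-test" are assumptions added for illustration.

```javascript
// Pattern from the answer above.
const noLeadingHyphen = /\bhttp:\/\/(?!www\.-)[^-].*/;

// Hypothetical helper: true when the pattern finds a match in the string.
function isAllowed(url) {
  return noLeadingHyphen.test(url);
}

console.log(isAllowed("http://www.test"));             // true
console.log(isAllowed("http://test"));                 // true
console.log(isAllowed("http://www.test-test"));        // true
console.log(isAllowed("http://www.-woejfewiofjewow")); // false
console.log(isAllowed("http://-test"));                // false
```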
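If you are trying to validate URLs, this regex would match a URL a bit better. It is anchored with ^ and $ so that the whole string has to match, and the character class is a-zA-Z0-9 (a range written as A-z would also admit characters such as [ and _):
^http:\/\/(?:www\.)?(?:[a-zA-Z0-9]+)\.(?:[a-z]){2,3}$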
these urls are allowed:
http://www.woejfewiofjewow.net
http://www.woejfewiofjewow.ly
this is not allowed:
http://www.woejfewiofjewow.neta
http://www.woejfewiofjewow.n
or even this
http://www.-woejfewiofjewow.net
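The allowed and rejected URLs listed above can be verified with a short script. This sketch assumes the pattern is anchored with ^ and $ and uses the character class a-zA-Z0-9; without the anchors, a URL such as the one ending in .neta would still produce a partial match. The name isValidUrl is illustrative.

```javascript
// Anchored variant of the validation pattern (an assumption, see the note above).
const urlPattern = /^http:\/\/(?:www\.)?[a-zA-Z0-9]+\.[a-z]{2,3}$/;

// Hypothetical helper wrapping the test.
function isValidUrl(url) {
  return urlPattern.test(url);
}

console.log(isValidUrl("http://www.woejfewiofjewow.net"));  // true
console.log(isValidUrl("http://www.woejfewiofjewow.ly"));   // true
console.log(isValidUrl("http://www.woejfewiofjewow.neta")); // false
console.log(isValidUrl("http://www.woejfewiofjewow.n"));    // false
console.log(isValidUrl("http://www.-woejfewiofjewow.net")); // false
```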

How to match a URL in this string?

I've seen various articles which show how to match a URL. But my situation is a bit different from the usual URL matching.
This was one such regex that didn't work for me
/https?:\/\/(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,4}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*)/
My requirement:
My requirement is that I've a string like this
userlist.2011.text_mediafire.com,
userlist.2011.text_http://www.mediafire.com",
userlist.2011.text_http://mediafire.com",
userlist.2011.text.www.mediafire.com
Now, I want to match mediafire.com along with (if they exist) "http://www." and "www.", so the constraint that I wish to set is that everything to the left of a TLD (in this case '.com') should be captured up to one of a list of special characters like '"_- etc.
I wasn't able to get any further than the basic /(.*)\.(com|net|org|info)/, which is clearly wrong.
Use the below regex and get the string you want from group index 1.
(?:http:\/\/)?(?:www\.)?([^'"_.-]*\.(?:com|net|org|info)\b)
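As a rough illustration of reading group index 1, the following sketch runs that pattern over the four sample strings from the question (with the trailing quotes and commas dropped). The function name extractHost and the returned object shape are assumptions for the example.

```javascript
// Pattern from the answer above; group 1 holds the bare host name.
const hostPattern = /(?:http:\/\/)?(?:www\.)?([^'"_.-]*\.(?:com|net|org|info)\b)/;

// Hypothetical helper: returns the full match and group 1, or null if nothing matches.
function extractHost(text) {
  const m = text.match(hostPattern);
  return m ? { full: m[0], host: m[1] } : null;
}

const samples = [
  "userlist.2011.text_mediafire.com",
  "userlist.2011.text_http://www.mediafire.com",
  "userlist.2011.text_http://mediafire.com",
  "userlist.2011.text.www.mediafire.com",
];

for (const s of samples) {
  console.log(extractHost(s));
}
// full is "mediafire.com", "http://www.mediafire.com", "http://mediafire.com"
// and "www.mediafire.com" respectively; host is "mediafire.com" every time.
```

If the optional "http://www." or "www." prefix is wanted as well, the full match (index 0) carries it, while group 1 stays the bare host.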
You need the '$' to match the end of string. If you care about capturing the entire string before the special character you will also need to match the beginning of the string '^'.
/^(.*)\.(([^\.]+)\.(com|net|org|info))$/

URL Pattern Matching issue, .+ matches all after

I am matching up stored URLs to the current URL and having a little bit of an issue - the regex works fine when being matched against the URL itself, but for some reason all sub-directories match too (when I want a direct match only of course).
Say the user stores www.facebook.com, this should match both http://www.facebook.com and https://www.facebook.com and it does
The problem is it is also matching sub-directories such as https://www.facebook.com/events/upcoming etc.
The regex for example:
/.+:\/\/www\.facebook\.com/
Matches the following:
https://www.facebook.com/events/upcoming
When it should just be matching
http://www.facebook.com/
https://www.facebook.com/
How can I fix this seemingly broken regex?
If you're being really specific about what you want to match, why not reflect that in your RegExp?
/^https?:\/\/(?:(?:www|m)\.)?facebook\.com\/?$/
http or https
www., m. or no subdomain
facebook.com
Edited to include an optional trailing slash.
Put an end marker $, like:
/.+:\/\/www\.facebook\.com\/$/
but really should have a start marker ^ too, like:
/^https?:\/\/www\.facebook\.com\/$/
also if you're matching the current domain, you may as well just match the location.host rather than location.href
Try adding a $ at the end of your regex. It's the symbol for end of string.
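The effect of the anchors suggested in these answers can be seen by comparing the original pattern with the anchored one. A minimal sketch, with illustrative variable names:

```javascript
// Original pattern from the question: no anchors, so any URL containing the host matches.
const loose = /.+:\/\/www\.facebook\.com/;

// Anchored pattern from the first answer: whole string must be the site root.
const exact = /^https?:\/\/(?:(?:www|m)\.)?facebook\.com\/?$/;

const subPage = "https://www.facebook.com/events/upcoming";

console.log(loose.test(subPage));                    // true  (the unwanted match)
console.log(exact.test(subPage));                    // false
console.log(exact.test("http://www.facebook.com/")); // true
console.log(exact.test("https://www.facebook.com")); // true
console.log(exact.test("https://m.facebook.com/"));  // true
```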

getting user and tweet ID from url using JavaScript regex

So I have tweet url for example https://twitter.com/ESPNFC/status/423771542627966976.
This URL on my website gets automatically parsed to
<a href="https://twitter.com/ESPNFC/status/423771542627966976">https://twitter.com/ESPNFC/status/423771542627966976</a>
I need to match this pattern and also get username and tweet ID.
I did it that way
/<a href="(http|https):\/\/twitter.com\/([^\/]*)\/status\/([^\/]*)">.+<\/a>/g. Everything works when I have 1 tweet per line, but if there are 2 or more tweets in one line, that regex matches both of them at same time and groups it as one, but I need to separate them.
Example:
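<a href="https://twitter.com/ESPNFC/status/423771542627966976">https://twitter.com/ESPNFC/status/423771542627966976</a>
<a href="https://twitter.com/ESPNFC/status/423771542627966976">https://twitter.com/ESPNFC/status/423771542627966976</a>
returns 2 matches, but
<a href="https://twitter.com/ESPNFC/status/423771542627966976">https://twitter.com/ESPNFC/status/423771542627966976</a><a href="https://twitter.com/ESPNFC/status/423771542627966976">https://twitter.com/ESPNFC/status/423771542627966976</a>
returns 1 match including both URLs. How can I separate them, or for example have everything after </a> treated as a new line?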
It's best to avoid parsing HTML with regex when possible. Having said that, the problem with your expression is the greedy .+, which will match as much as possible, so it runs on to the last </a> on the line. Instead you could use .+? to make it ungreedy (match as few characters as possible). Or you could restrict what . matches, for example use [^\s<>]+ instead of .+.
Also you probably want to change those [^\/]* to maybe [^\/"\s]* to make them more effective.
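A sketch of both suggestions combined, assuming the tweets are rendered as <a href="...">...</a> links with the URL as link text, as the regex in the question implies. The variable names and the use of String.prototype.matchAll (ES2020) are choices made for this example.

```javascript
// Two tweet links on one line, as in the failing case from the question.
const link =
  '<a href="https://twitter.com/ESPNFC/status/423771542627966976">' +
  'https://twitter.com/ESPNFC/status/423771542627966976</a>';
const html = link + link;

// Greedy pattern from the question: .+ runs to the last </a>.
const greedy = /<a href="(http|https):\/\/twitter.com\/([^\/]*)\/status\/([^\/]*)">.+<\/a>/g;

// Lazy .+? plus tighter character classes, as suggested above.
const lazy = /<a href="https?:\/\/twitter\.com\/([^\/"\s]*)\/status\/([^\/"\s]*)">.+?<\/a>/g;

const greedyCount = [...html.matchAll(greedy)].length;
const tweets = [...html.matchAll(lazy)].map((m) => ({ user: m[1], id: m[2] }));

console.log(greedyCount);   // 1
console.log(tweets.length); // 2
console.log(tweets[0]);     // { user: 'ESPNFC', id: '423771542627966976' }
```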
