Match Url path without query string - javascript

I would like to match a path in a URL while ignoring the query string.
The regex should allow an optional trailing slash before the query string.
Example urls that should give a valid match:
/path/?a=123&b=123
/path?a=123&b=123
So the string '/path' should match either of the above urls.
I have tried the following regex: (/path[^?]+).*
But this will only match urls like the first example above: /path/?a=123&b=123
Any idea how I would get it to match the second example, without the trailing slash, as well?
Regex is a requirement.

No need for regexp:
url.split("?")[0];
If you really need it, then try this:
\/path\?*.*
EDIT: Actually, a more precise regexp would be:
^(\/path)(\/?\?{0}|\/?\?{1}.*)$
because you want to match either /path or /path/ or /path?something or /path/?something and nothing else. Note that ? means "at most one" while \? means a question mark.
BTW: What kind of routing library does not handle query strings? I suggest using something else.
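As a quick check of the four cases listed above, here is a minimal sketch. The helper name matchesPath is hypothetical, and the pattern is a slightly shorter rewrite of the one in the answer (it should accept the same strings), not the answer's exact regex.

```javascript
// Hypothetical helper: true when `url` is exactly /path, with an optional
// trailing slash and an optional query string.
function matchesPath(url) {
  // \/?        optional trailing slash
  // (?:\?.*)?  optional "?" followed by anything
  return /^\/path\/?(?:\?.*)?$/.test(url);
}

console.log(matchesPath("/path/?a=123&b=123")); // true
console.log(matchesPath("/path?a=123&b=123"));  // true
console.log(matchesPath("/path"));              // true
console.log(matchesPath("/pathology?a=1"));     // false
```

The ^ and $ anchors are what reject longer paths such as /pathology.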

http://jsfiddle.net/bJcX3/
var re = /(\/?[^?]*?)\?.*/;
var p1 = "/path/to/something/?a=123&b=123";
var p2 = "/path/to/something/else?a=123&b=123";
var p1_matches = p1.match(re);
var p2_matches = p2.match(re);
document.write(p1_matches[1] + "<br>");
document.write(p2_matches[1] + "<br>");
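One caveat: the pattern above requires a ? in the input, so match returns null for a URL that has no query string, and reading index 1 of the result then throws. A minimal sketch of a variant that tolerates a missing query string follows; the helper name pathOnly is hypothetical.

```javascript
// Hypothetical helper: return the part of `url` before the first "?".
function pathOnly(url) {
  // [^?]* consumes everything up to (not including) the first "?".
  // It also matches when there is no "?" at all, so the result is never null.
  return url.match(/^[^?]*/)[0];
}

console.log(pathOnly("/path/to/something/?a=123&b=123"));     // /path/to/something/
console.log(pathOnly("/path/to/something/else?a=123&b=123")); // /path/to/something/else
console.log(pathOnly("/path"));                               // /path
```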

Related

Get substring between substring and first occurrence of another string

I have URL pathnames that look similar to this: /service-area/i-need-this/but-not-this/. The /service-area/ part never changes, and the rest of the path is dynamic.
I need to get the part of the URL saying i-need-this.
Here was my attempt:
location.pathname.match(new RegExp('/service-area/' + "(.*)" + '/'));
The goal was to get everything between /service-area/ and the next /, but it actually goes up to the last occurrence of /, not the first occurrence. So the output from this is i-need-this/but-not-this.
I'm not very good with regex. Is there a way it can be tweaked to get the desired result?
You need a lazy regex rather than a greedy one - so (.*?) instead of (.*). See also: What do 'lazy' and 'greedy' mean in the context of regular expressions?
You can do this without a regex too using replace and split:
var path = '/service-area/i-need-this/but-not-this/';
var res = path.replace('/service-area/', '').split('/')[0];
console.log(res);
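To make the greedy versus lazy difference concrete, here is a small sketch on the path from the question. The third variant, a negated character class, is an extra option that was not part of the answer.

```javascript
const path = "/service-area/i-need-this/but-not-this/";

// Greedy: (.*) grabs as much as it can, so it runs to the last "/".
const greedy = path.match(/\/service-area\/(.*)\//)[1];

// Lazy: (.*?) grabs as little as it can, so it stops at the first "/".
const lazy = path.match(/\/service-area\/(.*?)\//)[1];

// Negated class: ([^\/]+) cannot cross a "/" at all.
const segment = path.match(/\/service-area\/([^\/]+)/)[1];

console.log(greedy);  // i-need-this/but-not-this
console.log(lazy);    // i-need-this
console.log(segment); // i-need-this
```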

How to make regex match pattern from the beginning?

I need a little assistance with a regular expression.
I'm doing the following in JavaScript to "mask" all special URLs that may be composed using the following rules:
They may begin with something like 0> or 1223> or 1_23>
They may begin with a protocol, e.g. http:// or https://
They may also have a www. subdomain
So for instance, for https://www.example.com it should produce https://www. ...
So I came up with the following JS:
var url = "0>https://www.example.com/plugins/page.php?href=https://forum.example.com/topic/some_topic";
m = url.match(/\b((?:[\d_]+>)?.+\:\/\/(?:www.)?)/i);
if (m) {
url = m[1] + " ...";
}
console.log(url);
It works for most cases, except for the "repeating" URL in my example, in which case I get this:
0>https://www.example.com/plugins/page.php?href=https:// ...
when I was expecting:
0>https://www. ...
How do I make it pick the match from the beginning? I thought adding \b would do it...
Just make the .+ non-greedy, like this:
m = url.match(/\b((?:[\d_]+>)?.+?\:\/\/(?:www.)?)/i);
Note the ? after .+. It makes the quantifier lazy, so the regex matches only up to the first :// after the start of the match. Without the ?, .+ is greedy and consumes all the characters up to the last :// in the string.
Also, you don't have to escape :, but you do have to escape the . after www. So your regex becomes:
m = url.match(/\b((?:[\d_]+>)?.+?:\/\/(?:www\.)?)/i);
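A short sketch that wraps the corrected pattern in a function and runs it on the URL from the question. The function name maskUrl and the extra sample inputs are hypothetical.

```javascript
// Hypothetical helper built around the corrected pattern above.
function maskUrl(url) {
  const m = url.match(/\b((?:[\d_]+>)?.+?:\/\/(?:www\.)?)/i);
  // Leave the input untouched when it does not look like a URL.
  return m ? m[1] + " ..." : url;
}

console.log(maskUrl("0>https://www.example.com/plugins/page.php?href=https://forum.example.com/topic/some_topic"));
// 0>https://www. ...
console.log(maskUrl("1_23>http://example.com/page"));
// 1_23>http:// ...
```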

Regex for matching multiple forward slashes in URL

I need a regular expression for replacing multiple forward slashes in a URL with a single forward slash, excluding the ones following the colon.
E.g. http://link.com//whatever/// would become http://link.com/whatever/
I think this should work: /[^:](\/+)/ or /[^:](\/\/+)/ if you want only multiples.
It won't match a leading //, but it looks like you're not looking for that.
To replace:
"http://test//a/b//d".replace(/([^:]\/)\/+/g, "$1") // --> http://test/a/b/d
You have already accepted an answer, but to show a bit more about matching and controlling the matches, this might help you in the future:
var url = 'http://link.com//whatever///';
var set = url.match(/([^:]\/{2,3})/g) || []; // Match (NOT ":") followed by (2 OR 3 "/"); empty array if nothing matches
for (var i = 0; i < set.length; i++) {
    // Keep the character before the slashes and append a single "/"
    var replace_with = set[i].charAt(0) + '/';
    // Replace the first occurrence of this match
    url = url.replace(set[i], replace_with);
}
console.log(url);
Will output:
http://link.com/whatever/
Duplicates won't matter in your situation. If you have this string:
var url = 'http://link.com//om/om/om/om/om///';
Your set array will contain multiple m//. A bit redundant, as the loop will see that variable a few times. The nice thing is that String.replace() replaces nothing if it finds nothing, so no harm done.
What you could do is strip out the duplicates from set first, but that would almost require the same amount of resources as just letting the for-loop go over them.
Good luck!
result = subject.replace(/(?<!http:)\/*\//g, "/");
or (for http, https, ftp and ftps)
result = subject.replace(/(?<!(?:ht|f)tps?:)\/*\//g, "/");
The original accepted answer does a sufficient job at replacing, but not for matching. And the currently accepted answer matches the character before duplicate slashes, also not good for matching.
Using a negative lookbehind to exclude the protocol from the match (?<!:), and a curly bracket quantifier to match 2 to infinite slashes \/{2,} does the job to both match and replace.
(?<!:)\/{2,}
let str = 'https://test.example.com:8080//this/is//an/example///';
document.write('Original: ' + str + '<br><br>');
document.write('Matches: ' + str.match(/(?<!:)\/{2,}/g) + '<br><br>');
document.write('Replaced: ' + str.replace(/(?<!:)\/{2,}/g, '/'));
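Lookbehind assertions were only added to JavaScript in ES2018, so older engines may reject the pattern above with a syntax error. As a fallback for replacing (not for matching), here is a sketch that captures the preceding character instead. The function name collapseSlashes is hypothetical.

```javascript
// Hypothetical helper: collapse runs of "/" into one, except the "//"
// that directly follows a colon, without using lookbehind.
function collapseSlashes(url) {
  // Capture either the start of the string or one non-colon character,
  // then two or more slashes; put the captured part back plus one "/".
  return url.replace(/(^|[^:])\/{2,}/g, "$1/");
}

console.log(collapseSlashes("https://test.example.com:8080//this/is//an/example///"));
// https://test.example.com:8080/this/is/an/example/
console.log(collapseSlashes("http://link.com//whatever///"));
// http://link.com/whatever/
```

Because the character before the slashes is part of the match, this variant suits replace but gives less clean results with match than the lookbehind version.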

Finding image url via using Regex

Is there a working regex to find an image URL?
Example:
var reg = /^url\(|url\(".*"\)|\)$/;
var string = 'url("http://domain.com/randompath/random4509324041123213.jpg")';
var string2 = 'url(http://domain.com/randompath/random4509324041123213.jpg)';
console.log(string.match(reg));
console.log(string2.match(reg));
I tried, but failed, with this regex.
The input will look like the strings above. I just want the image URL between url(" ") or url( ).
The output I want is http://domain.com/randompath/random4509324041123213.jpg
http://jsbin.com/ahewaq/1/edit
I'd simply use this expression:
/url.*\("?([^")]+)/
This returns an array, where the first index (0) contains the entire match, the second will be the url itself, like so:
'url("http://domain.com/randompath/random4509324041123213.jpg")'.match(/url.*\("?([^")]+)/)[1];
//returns "http://domain.com/randompath/random4509324041123213.jpg"
//or without the quotes, same return, same expression
'url(http://domain.com/randompath/random4509324041123213.jpg)'.match(/url.*\("?([^")]+)/)[1];
If there is a chance that both single and double quotes are used, replace the optional " with the class ["'] and add ' to the negated class, in this case:
/url.*\(["']?([^"')]+)/
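A minimal sketch that exercises the quote-tolerant idea on all three input forms. The helper name extractCssUrl is hypothetical, and the pattern is a slightly tightened variant of the one above (no .* between url and the parenthesis), which is an assumption on my part rather than the answer's exact expression.

```javascript
// Hypothetical helper: pull the URL out of a CSS-style url(...) value.
function extractCssUrl(value) {
  // Optional single or double quote after "(", then capture everything
  // up to the next quote or closing parenthesis.
  const m = /url\(\s*["']?([^"')]+)/.exec(value);
  return m ? m[1].trim() : null;
}

console.log(extractCssUrl('url("http://domain.com/randompath/random4509324041123213.jpg")'));
console.log(extractCssUrl("url('http://domain.com/randompath/random4509324041123213.jpg')"));
console.log(extractCssUrl("url(http://domain.com/randompath/random4509324041123213.jpg)"));
// each line prints http://domain.com/randompath/random4509324041123213.jpg
```

Returning null for non-matching input avoids the exception that indexing a failed match would raise.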
Try this regexp:
var regex = /\burl\(\"?(.*?)\"?\)/;
var match = regex.exec(string);
console.log(match[1]);
The URL is captured in the first subgroup.
If the string will always be consistent, one option would be simply to remove the first 5 characters url(" and the last two "):
var string = 'url("http://domain.com/randompath/random4509324041123213.jpg")';
// Remove last two characters
string = string.substr(0, string.length - 2);
// Remove first five characters
string = string.substr(5, string.length);
Benefit of this approach: You can edit it yourself, without asking StackOverflow to do it for you. RegEx is great, but if you don't know it, peppering your code with it makes for a frustrating refactor.

Javascript URL Regex That Checks Regex with URL

I have this URL...
http://www.google.com/local/add/analytics?hl=en-US&gl=US
And I want to check these URLs to see if they match the above URL...
www.google.com/local/add*
www.google.com/local/add/*
http://www.google.com/local/add*
http://www.google.com/local/add/*
https://www.google.com/local/add*
https://www.google.com/local/add/*
As you can see, the input URLs are themselves patterns containing *. What regex can I use to check a URL against a list of such patterns? Currently I am doing this...
var isAllowed = (url.indexOf(newURL) === 0);
Which is definitely not efficient.
It's not the cleanest regex I've ever written, but I think it should work.
var url = "http://www.google.com/local/add/analytics?hl=en-US&gl=US";
var reg = /((https|http|)(\:\/\/|)www\.google\.com\/local\/add(\/|)).*/;
console.log(reg.test(url));
This will return true for all of these cases:
www.google.com/local/add*
www.google.com/local/add/*
http://www.google.com/local/add*
http://www.google.com/local/add/*
https://www.google.com/local/add*
https://www.google.com/local/add/*
It looks for (http or https or nothing), then (:// or nothing), then www.google.com/local/add, then (/ or nothing), then anything.
One false positive that I will leave for you to handle: it also returns true when the protocol is present without the ://, i.e. (http|https)www.google.com/local/add(/|)*
var reg = new RegExp("(https?://)?(www\\.)?google\\.com/local/add/?"),
URL = "http://www.google.com/local/add/analytics?hl=en-US&gl=US";
console.log(reg.test(URL));
I've used ? a lot. It means that whatever character or group precedes the question mark may or may not be present. The dots are escaped (written as \\. inside the string literal) so that they match a literal dot rather than any character.
https? means the s may or may not be there. (www\.)? means that the www. may be absent entirely. You hopefully get how it works now.
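To show why escaping the dots and anchoring the pattern can matter, here is a small comparison sketch. The variable names and the look-alike host are made up for illustration.

```javascript
// Unescaped dots: "." matches any character, so look-alike hosts pass.
const loose = new RegExp("(https?://)?(www.)?google.com/local/add/?");
// Escaped dots plus a leading ^: only the literal host at the start matches.
const strict = new RegExp("^(https?://)?(www\\.)?google\\.com/local/add/?");

const good = "http://www.google.com/local/add/analytics?hl=en-US&gl=US";
const lookalike = "http://www.googleXcom/local/add/";

console.log(loose.test(good));       // true
console.log(loose.test(lookalike));  // true, probably not intended
console.log(strict.test(good));      // true
console.log(strict.test(lookalike)); // false
```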
As far as I understand you, you want something like this:
Convert the input URL to a regex. E.g.:
var input = "http://www.google.com/local/add*";
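// Escape the dots first, then expand each * into .* (the other order would also escape the dot in .*)
var reg_url = input.replace(/\./g, "\\.").replace(/\*/g, ".*");
You might need to escape some more characters (any other regex metacharacter that can appear in a URL, such as ? or +).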
And check if it matches:
var url = "http://www.google.com/local/add/analytics?hl=en-US&gl=US";
var isAllowed = url.search(reg_url) >= 0;
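The conversion steps above can be sketched as one function that escapes every regex metacharacter except *, then expands the wildcard. The helper name wildcardToRegExp is hypothetical, and anchoring with ^ is an assumption that mirrors the indexOf(...) === 0 prefix check from the question.

```javascript
// Hypothetical helper: turn a pattern where "*" is a wildcard into a RegExp
// that must match from the start of the URL.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    // Escape every regex metacharacter except "*".
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    // Then expand each "*" into ".*".
    .replace(/\*/g, ".*");
  return new RegExp("^" + escaped);
}

const target = "http://www.google.com/local/add/analytics?hl=en-US&gl=US";
console.log(wildcardToRegExp("http://www.google.com/local/add*").test(target));  // true
console.log(wildcardToRegExp("https://www.google.com/local/add*").test(target)); // false
```

With the anchor in place, a pattern without a protocol such as www.google.com/local/add* does not match the http:// URL; drop the ^ if a match anywhere in the URL should count.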
