Removing last part of URL based on - javascript

I need to remove any occurrence of a product number that may occur in URLs, using JavaScript/jQuery.
URL looks like this:
http://www.mysite.com/section1/section2/section3/section4/01-012-15_1571884
The final part of the URL always starts with 2 digits followed by -, so I was thinking a regex might do the job? I need everything after the last / removed.
It must also work when the product occurs higher or lower in the hierarchy, i.e.: http://www.mysite.com/section1/section2/01-012-15_1571884
So far I have tried different solutions with location.pathname and splits, but I am stuck on how to handle differences in product hierarchy and handling the arrays.

DEMO
var x = "http://www.mysite.com/section1/section2/section3/section4/01-012-15_1571884";
console.log(x.substr(0,x.lastIndexOf('/')));

Use lastIndexOf to find the last occurrence of "/" and then remove the rest of the path using substr.

var url = 'http://www.mysite.com/section1/section2/section3/section4/01-012-15_1571884';
var parts = url.split('/');
parts.pop();
url = parts.join('/');
http://jsfiddle.net/YXe6L/

var a = 'http://www.mysite.com/section1/section2/01-012-15_1571884',
    // pass the regex straight to replace; if nothing matches, the string is returned unchanged
    result = a.replace(/\d{1,2}-\d{1,3}-\d{1,2}_\d+[^\d]*/g, '');
JSFiddle: http://jsfiddle.net/2TVBk/2/
This is a very nice online regex tester to test your regexes with: http://regexpal.com/

Here is an approach that will properly handle a situation where there is no product ID as you requested. http://jsfiddle.net/84GVe/
var url1 = "http://www.mysite.com/section1/section2/section3/section4/01-012-15_1571884";
var url2 = "http://www.mysite.com/section1/section2/section3/section4";
function removeID(url) {
    //look for a / followed by _, - or 0-9 characters,
    //and use $ to ensure it is the end of the string
    var reg = /\/[-\d_]+$/;
    if (reg.test(url)) {
        url = url.substr(0, url.lastIndexOf('/'));
    }
    return url;
}
console.log( removeID(url1) );
console.log( removeID(url2) );
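The answers above can also be folded into a single guarded replace. This is only a minimal sketch, assuming (as the question states) that the product segment always starts with two digits followed by a hyphen; the name stripProduct is made up for illustration.

```javascript
// Hypothetical helper: removes the last path segment only when it looks like
// a product number (two digits followed by "-"), as described in the question.
function stripProduct(url) {
  // \/\d{2}-   a slash, two digits and a hyphen
  // [^\/]*$    the rest of that segment, up to the end of the string
  return url.replace(/\/\d{2}-[^\/]*$/, '');
}

console.log(stripProduct('http://www.mysite.com/section1/section2/01-012-15_1571884'));
console.log(stripProduct('http://www.mysite.com/section1/section2/section3/section4'));
```

A URL without a product segment comes back unchanged, so the depth of the hierarchy does not matter. Query strings and fragments are not treated specially in this sketch.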

Related

RegEx for matching YouTube embed ID

I'm in non-modern JavaScript and I have a string defined as follows:
"//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0"
I want to pull out just the DmYK479EpQc but I don't know the length. I do know that I want what is after the / and before the ?
Are there a few simple lines of JavaScript that would solve this?
Use the URL object?
console.log(
(new URL("//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0", location.href)).pathname
.split('/')
.pop());
Why? Because I can likely make up a URL that defeats the regex (though for youtube it's probably unlikely)
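This expression might help you to do so, and it might be faster. The character class is written as [A-Za-z0-9_-] because [A-z] would also match the punctuation characters that sit between Z and a, and video IDs can contain - and _:
(d\/)([A-Za-z0-9_-]+)(\?)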
// [A-Za-z0-9_-] rather than [A-z0-9]: [A-z] also matches [ \ ] ^ _ ` and misses "-"
const regex = /(.*)(d\/)([A-Za-z0-9_-]+)(\?)(.*)/gm;
const str = `//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0`;
const subst = `$3`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
Performance Test
This JavaScript snippet shows the performance of that expression using a simple 1-million times for loop.
const repeat = 1000000;
const start = Date.now();
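for (var i = repeat; i > 0; i--) { // runs exactly `repeat` times
  const string = '//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0';
  // [A-Za-z0-9_-] rather than [A-z0-9], which also matches punctuation and misses "-"
  const regex = /(.*)(d\/)([A-Za-z0-9_-]+)(\?)(.*)/gm;
  var match = string.replace(regex, "$3");
}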
const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match 💚💚💚 ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test. 😳 ");
How about non-regex way
console.log("//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0".split('/').pop().split('?')[0]);
I'm not going to give a piece of code because this is a relatively simple algorithm, and easy to implement.
Please note that those links have this format (correct me if I'm wrong):
https:// or http://
www.youtube.com/
embed/
Video ID (DmYK479EpQc in this case)
?parameters (note that they ALWAYS start with the character ?)
You want the ID of the video, so you can split the string into those sections, and if you store those sections in an array you can be sure that the ID is at index 3 (the fourth element).
One example of what that array would look like:
['https://', 'www.youtube.com', 'embed', 'DmYK479EpQc', '?vq=hd720&rel=0']
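The splitting idea described above could be sketched roughly as follows. It is an illustration only, not the answerer's code: the function name getEmbedId is invented, and instead of relying on a fixed index it looks for the section that follows embed, so that it also copes with the protocol-relative URL from the question. It sticks to ES5 syntax because the question mentions non-modern JavaScript.

```javascript
// Hypothetical helper: returns the path section that follows "embed",
// or null when the URL does not have that shape.
function getEmbedId(url) {
  // Drop the query string first, since parameters always start with "?"
  var withoutQuery = url.split('?')[0];
  // Split the rest into sections on "/"
  var sections = withoutQuery.split('/');
  var embedIndex = sections.indexOf('embed');
  if (embedIndex === -1 || embedIndex + 1 >= sections.length) {
    return null;
  }
  return sections[embedIndex + 1];
}

console.log(getEmbedId('//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0')); // DmYK479EpQc
console.log(getEmbedId('https://www.youtube.com/embed/DmYK479EpQc'));          // DmYK479EpQc
```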
One option uses a regex replacement:
var url = "//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0";
var path = url.replace(/.*\/([^?]+).*/, "$1");
console.log(path);
The above regex pattern says to:
.* match and consume everything up to and
/ including the last path separator
([^?]+) then match and capture any number of non ? characters
.* then consume the rest of the input
Then, we just replace with the first capture group, which corresponds to the text after the final path separator, but before the start of the query string, should the URL have one.
You can use this regex:
(.*) match and consume everything up to
(d\/) the d/ at the end of embed/
([A-Za-z0-9_-]+) then match and capture the ID: any run of letters, digits, _ or -
(\?) then match the ? that starts the query string
(.*) then consume the rest of the input
const ytUrl = '//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0';
const regex = /(.*)(d\/)([A-Za-z0-9_-]+)(\?)(.*)/gm;
const position = '$3';
let result = ytUrl.replace(regex, position);
console.log('YouTube ID: ', result);
This regex splits the string into capture groups, and the YouTube ID is in the third group.
Another solution is to use split. This method splits a string into an array of substrings.
const ytUrl = '//www.youtube.com/embed/DmYK479EpQc?vq=hd720&rel=0';
let result = ytUrl.split('/').pop().split('?').shift()
console.log('YouTube ID: ', result);
In this sample, we split the URL using / as the separator. Then we take the last element of the array with the pop method, and finally we split again using ? as the separator and take the first element of the array with the shift method.

javascript regex for querystring with anchor

I am making a web app that has multiple 'pages', but it will all be loaded client side. Seeing as it is all technically on the same page, I will be using parameters after # to track the current page state while preventing postbacks. My problem is that I can't seem to select all the parameters with a single regex. The regex for split works when I use an online testing tool but does not work when I use it on my web page.
//Test data for url
//https://test.ca?hi&hey=3&test=oh+hi+mark#edit&e=1
var split = /([^&#=]+)=?([^&#]*)/g;
var url = window.location.href;
var match = split.exec(url);
//this outputs match with a length of three
//[0] = 'https://test.ca?hi'
//[1] = 'https://test.ca?hi'
//[2] = ''
I thought this should be a solved problem, but I can't seem to find an answer. Which I guess leads to another question: am I going about this the completely wrong way?
You are using the regex wrong. A single call to exec returns only one match, so you are just printing the first match; you need to call it in a loop to iterate through all the matches in the string and read the captured groups of each one.
Here is an example snippet:
var re = /([^&#=]+)=?([^&#]*)/g;
var str = 'https://test.ca?hi&hey=3&test=oh+hi+mark#edit&e=1';
var match;
while ((match = re.exec(str)) !== null) {
document.write(match[1] + "<br/>" + match[2] + "<br/><br/>");
}
Note that the first match is the "main" part of the URL. Subsequent matches are param-value pairs.
Try using window.location.hash instead. It will return the hash value (in your example url it would be #edit&e=1) and you can use string operations to do whatever you need to with that.
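Combining the two answers, the same regex can be run over just the hash portion and collected into an object. This is a minimal sketch, not a complete query-string parser: the function name parseHashParams is made up, the hash is passed in as a string (in a browser it would come from window.location.hash), and no URL decoding is attempted.

```javascript
// Hypothetical helper: turns a hash such as "#edit&e=1" into { edit: '', e: '1' }.
function parseHashParams(hash) {
  var params = {};
  var re = /([^&#=]+)=?([^&#]*)/g;
  var match;
  // With the g flag, each exec call resumes where the previous match ended
  while ((match = re.exec(hash)) !== null) {
    params[match[1]] = match[2]; // key, then value ('' when there is no "=")
  }
  return params;
}

console.log(parseHashParams('#edit&e=1'));
```

Because only the hash is passed in, the "main" part of the URL never shows up as a spurious first match.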

How to slice a String object until it reaches "/" mark?

I have a link "http://client.local/#/wiki/revision/new/1" and I'd like to slice off everything of it and leave only the id ("1" in the example). Now I know that can be done using JavaScript's slice() function, but it must slice it from the end until it reaches the / sign no matter how many letters the id contains.
How can I do that?
You can use a regex :
var id = url.match(/[^\/]*$/)[0];
or split :
var id = url.split('/').pop();
Using a regex allows for finer control. For example, if you want to fail when what follows the last / isn't made of digits, do
var m = url.match(/\/(\d+)$/)
if (m) {
var id = m[1];
...
} else {
// bad URL, let's handle the error
}
Yet another variant
var id = url.substring(url.lastIndexOf('/')+1)
You are looking for split and not slice
var id = "http://client.local/#/wiki/revision/new/1234".split('/').pop();
Given that javascript regular expressions are greedy by default, you can do:
var id = url.replace(/.*\//,'');
which will return everything after the last '/'.
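None of the variants above says what should happen when the link ends with a trailing slash; in that case they return an empty string. One possible way to handle it, sketched under the assumption that trailing slashes should simply be ignored (the name lastSegment is invented here):

```javascript
// Hypothetical helper: returns the text after the last "/",
// ignoring any slashes at the very end of the string.
function lastSegment(url) {
  var trimmed = url.replace(/\/+$/, ''); // drop trailing slashes, if any
  return trimmed.substring(trimmed.lastIndexOf('/') + 1);
}

console.log(lastSegment('http://client.local/#/wiki/revision/new/1'));     // 1
console.log(lastSegment('http://client.local/#/wiki/revision/new/1234/')); // 1234
```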

Finding image url via using Regex

Any working Regex to find image url ?
Example :
var reg = /^url\(|url\(".*"\)|\)$/;
var string = 'url("http://domain.com/randompath/random4509324041123213.jpg")';
var string2 = 'url(http://domain.com/randompath/random4509324041123213.jpg)';
console.log(string.match(reg));
console.log(string2.match(reg));
I tried but failed with this regex.
The pattern will look like this; I just want the image URL between url(" ") or url( ).
I just want to get output like http://domain.com/randompath/random4509324041123213.jpg
http://jsbin.com/ahewaq/1/edit
I'd simply use this expression:
/url.*\("?([^")]+)/
This returns an array, where the first index (0) contains the entire match, the second will be the url itself, like so:
'url("http://domain.com/randompath/random4509324041123213.jpg")'.match(/url.*\("?([^")]+)/)[1];
//returns "http://domain.com/randompath/random4509324041123213.jpg"
//or without the quotes, same return, same expression
'url(http://domain.com/randompath/random4509324041123213.jpg)'.match(/url.*\("?([^")]+)/)[1];
If there is a chance that either single or double quotes are used, replace "? with ["']? and add ' to the negated character class, in this case:
/url.*\(["']?([^"')]+)/
Try this regexp:
var regex = /\burl\(\"?(.*?)\"?\)/;
var match = regex.exec(string);
console.log(match[1]);
The URL is captured in the first subgroup.
If the string will always be consistent, one option would be simply to remove the first five characters url(" and the last two "):
var string = 'url("http://domain.com/randompath/random4509324041123213.jpg")';
// Remove last two characters
string = string.substr(0, string.length - 2);
// Remove first five characters
string = string.substr(5, string.length);
Here's a working fiddle.
Benefit of this approach: You can edit it yourself, without asking StackOverflow to do it for you. RegEx is great, but if you don't know it, peppering your code with it makes for a frustrating refactor.
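For comparison, the regex answers above can be wrapped in a small function that accepts double quotes, single quotes, or no quotes at all. This is a sketch rather than a full CSS parser; the name extractCssUrl is hypothetical, and it assumes the URL itself contains no closing parenthesis.

```javascript
// Hypothetical helper: pulls the address out of a CSS url(...) value.
function extractCssUrl(value) {
  // (["']?)  optional opening quote, remembered as group 1
  // (.*?)    the URL itself, matched lazily
  // \1       the same quote again (or nothing if there was none)
  var m = /url\(\s*(["']?)(.*?)\1\s*\)/.exec(value);
  return m ? m[2] : null;
}

console.log(extractCssUrl('url("http://domain.com/randompath/random4509324041123213.jpg")'));
console.log(extractCssUrl('url(http://domain.com/randompath/random4509324041123213.jpg)'));
```

Both calls print the bare address, and a string without a url(...) part yields null instead of throwing.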

Regex to find id in url

I have the following URL:
http://example.com/product/1/something/another-thing
Although it can also be:
http://test.example.com/product/1/something/another-thing
or
http://completelydifferentdomain.tdl/product/1/something/another-thing
And I want to get the number 1 (id) from the URL using Javascript.
The only thing that would always be the same is /product. But I have some other pages where there is also /product in the url just not at the start of the path.
What would the regex look like?
Use window.location.pathname to retrieve the current path (excluding the protocol and host).
Use the JavaScript string match method.
Use the regex /^\/product\/(\d+)/ to find a path which starts with /product/, then one or more digits (add i right at the end to support case insensitivity).
Come up with something like this:
var res = window.location.pathname.match(/^\/product\/(\d+)/);
if (res) { // match returns null when the path does not start with /product/<digits>
// use res[1] to get the id.
}
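The steps above can also be applied to an arbitrary URL string rather than the current page. A rough sketch, assuming an environment where the standard URL constructor is available (modern browsers and Node.js, not older browsers such as Internet Explorer); the function name getProductId is made up for illustration:

```javascript
// Hypothetical helper: returns the numeric id that follows /product/ at the
// start of the path, or null when the path has a different shape.
function getProductId(href) {
  var path = new URL(href).pathname; // e.g. "/product/1/something/another-thing"
  var m = /^\/product\/(\d+)(?:\/|$)/.exec(path);
  return m ? Number(m[1]) : null;
}

console.log(getProductId('http://example.com/product/1/something/another-thing'));      // 1
console.log(getProductId('http://test.example.com/product/1/something/another-thing')); // 1
console.log(getProductId('http://example.com/shop/product/1'));                         // null
```

Anchoring the pattern with ^ is what keeps pages that merely contain /product somewhere later in the path from matching.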
/\/product\/(\d+)/ and obtain $1.
As an alternative, you can do this without regex (though I admit regex is awfully nice here):
var url = "http://test.example.com//mypage/1/test/test//test";
// filter drops empty segments, which takes care of situations where there may be a // or /// instead of a /
var newurl = url.replace("http://","").split("/").filter(function (part) {
return part !== "";
});
alert(newurl[2]); //returns 1
I would like to suggest another option.
.match(/\/(\d+)+[\/]?/g)
This would return all matches of id's present.
Example:
var url = 'http://localhost:4000/#/trees/8/detail/3';
// with slashes
var ids = url.match(/\/(\d+)+[\/]?/g);
console.log(ids);
//without slashes
ids = url.match(/\/(\d+)+[\/]?/g).map(id => id.replace(/\//g, ''));
console.log(ids);
This way, the rest of your URL doesn't matter; it just retrieves all the parts that are numbers only.
To just get the first result you could remove the g modifier:
.match(/\/(\d+)+[\/]?/)
var url = 'http://localhost:4000/#/trees/8';
var id = url.match(/\/(\d+)+[\/]?/);
//With and without slashes
console.log(id);
The id without slashes would be in the second element because this is the first group found in the full match.
Hope this helps people.
Cheers!
