Manipulate URLs with regular expressions instead of the split approach - javascript

Basically, I have a URL that is structured like this:
http://www.my-site.com/topic1/topic2/topic3/topic4/topic5
and I want to do 2 things to it:
1. check the structure and validate it
2. replace both topic3 and topic4 with a new parameter called topic6
This is my solution for now, but I am interested in building something more optimized.
if ((url.match(/\//g) || []).length === 7) {
const arr = url.split('/');
const topic5 = arr.pop();
arr.pop();
arr.pop();
arr.push('topic6', topic5);
return arr.join('/');
}
As you can see, because I am not very good at regular expressions I have used another approach, which doesn't look that good. I check the number of / characters in the string; if there are 7, the structure of the URL is good and I should apply the next steps to it. After that, I pop the last part of the URL (topic5), remove the next two parts (topic4 and topic3), and push the new one along with topic5.
If you have a better approach, please let me know. I think it can be done in fewer lines of code but, as I said, I am not very familiar with regular expressions.

One way is to use replace with a pattern that matches the last three /-prefixed parts, captures only the last one in a capture group, and replaces the whole match with /topic6 followed by the captured group:
let url = "http://www.my-site.com/topic1/topic2/topic3/topic4/topic5"
let func = (url) => {
  // 7 slashes means the URL has the expected structure
  if ((url.match(/\//g) || []).length === 7) {
    // drop the two parts before the last one and put /topic6 in their place
    const arr = url.replace(/(?:\/[^\/]+){2}(\/[^\/]+)$/, (m, g) => `/topic6${g}`)
    return arr
  }
  return url
}
console.log(func(url))
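The slash count and the replacement could also be folded into one anchored pattern. The following is only a sketch under the assumption that the URL always has a scheme, a host, and exactly five path parts; the names topicPattern and swapTopics are made up for illustration.

```javascript
// Hypothetical pattern: scheme + host + two parts (kept), two parts (dropped), last part (kept).
const topicPattern = /^(https?:\/\/[^\/]+(?:\/[^\/]+){2})(?:\/[^\/]+){2}(\/[^\/]+)$/;

// Returns the rewritten URL, or the input unchanged when the structure does not match.
function swapTopics(url, replacement = "topic6") {
  return url.replace(topicPattern, (match, head, tail) => `${head}/${replacement}${tail}`);
}

console.log(swapTopics("http://www.my-site.com/topic1/topic2/topic3/topic4/topic5"));
// → http://www.my-site.com/topic1/topic2/topic6/topic5
```

Because the pattern is anchored at both ends, a URL with a different number of parts does not match and is returned as it came in, so no separate validation step is needed.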

Related

Clean way to get value from string in Javascript

I have this string https://pokeapi.co/api/v2/pokemon/6/
I would like to extract the value after pokemon/, in this case 6. This represents Pokémon IDs, which can range from 1 to N.
I know this is pretty trivial, but I was wondering whether there is a nicer, more future-proof solution. Here is my solution.
const foo= "https://pokeapi.co/api/v2/pokemon/6/"
const result = foo.split('/') //[ 'https:', '', 'pokeapi.co', 'api', 'v2', 'pokemon', '6', '' ]
const ids = result[6]
You can grab the value after the last / character like so:
const pokemonID = foo.substring(foo.lastIndexOf("/") + 1)
This uses String.lastIndexOf to get the index of the final slash character, and then String.substring with a single argument to take the part of the string after that last / character. We add 1 to the result of lastIndexOf to omit the slash itself.
For this to work, you need to drop the trailing slash (which doesn't do anything anyway) from your request URL.
This could be abstracted into a utility function to get the last value of any url, which is the biggest improvement over using a split and find by index approach.
However, beware, it will take whatever the value is after the last slash.
Using the string https://pokeapi.co/api/v2/pokemon/6/pokedex would return pokedex.
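A minimal sketch of such a utility function, assuming you would rather tolerate a trailing slash than require callers to remove it (the name lastSegment is hypothetical):

```javascript
// Hypothetical helper: returns whatever follows the last slash,
// after ignoring any trailing slashes.
function lastSegment(urlString) {
  const trimmed = urlString.replace(/\/+$/, "");
  return trimmed.substring(trimmed.lastIndexOf("/") + 1);
}

console.log(lastSegment("https://pokeapi.co/api/v2/pokemon/6/"));        // → 6
console.log(lastSegment("https://pokeapi.co/api/v2/pokemon/6/pokedex")); // → pokedex
```

The second call shows the caveat mentioned above: the helper has no idea which segment is the id, it only returns the last one.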
If you are using Angular, React, Vue, etc. with a built-in router, the framework will have specific APIs that can get the exact parameter you need regardless of the URL shape.
You should use the built-in URL API to do the splitting correctly for you:
const url = new URL("https://pokeapi.co/api/v2/pokemon/6/");
Then you can get the pathname and split that:
const path = url.pathname.split("/");
After you split it you can get the value 6 by accessing the 5th element here:
const url = new URL("https://pokeapi.co/api/v2/pokemon/6/");
const path = url.pathname.split("/");
console.log(path[4]);
you could also do something like:
url.split('pokemon/')[1].split('/')[0]
Here is what I would do
const result = new URL(url).pathname.split('/');
const id = result[4];
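I am not sure if this is better than yours:
const foo= "https://pokeapi.co/api/v2/pokemon/6/"
const result = foo.indexOf("pokemon/");
const id_index = result + 8 // "pokemon/" is 8 characters long
const id = foo.slice(id_index).split("/")[0]; // read up to the next slash so multi-digit ids also work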
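Most of the answers above depend on the id sitting at a fixed position. If the goal is future proofing, one option is to anchor on the pokemon/ segment itself and only accept digits. This is a sketch, not part of any answer; the function name pokemonIdFrom is hypothetical and it assumes ids are plain non-negative integers.

```javascript
// Hypothetical helper: finds /pokemon/<digits> in the path and returns the id
// as a number, or null when the URL has no such segment.
function pokemonIdFrom(urlString) {
  const { pathname } = new URL(urlString);
  const match = pathname.match(/\/pokemon\/(\d+)(?:\/|$)/);
  return match ? Number(match[1]) : null;
}

console.log(pokemonIdFrom("https://pokeapi.co/api/v2/pokemon/6/"));          // → 6
console.log(pokemonIdFrom("https://pokeapi.co/api/v2/pokemon/25/pokedex"));  // → 25
console.log(pokemonIdFrom("https://pokeapi.co/api/v2/berry/1/"));            // → null
```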

URL Parse Exercise (JavaScript)

So here is a description of the problem that I've been tasked to solve:
We need some logic that extracts the variable parts of a url into a hash. The keys
of the extract hash will be the "names" of the variable parts of a url, and the
values of the hash will be the values. We will be supplied with:
A url format string, which describes the format of a url. A url format string
can contain constant parts and variable parts, in any order, where "parts"
of a url are separated with "/". All variable parts begin with a colon. Here is
an example of such a url format string:
'/:version/api/:collection/:id'
A particular url instance that is guaranteed to have the format given by
the url format string. It may also contain url parameters. For example,
given the example url format string above, the url instance might be:
'/6/api/listings/3?sort=desc&limit=10'
Given this example url format string and url instance, the hash we want that
maps all the variable parts of the url instance to their values would look like this:
{
version: 6,
collection: 'listings',
id: 3,
sort: 'desc',
limit: 10
}
So I technically have a semi-working solution to this but, my questions are:
Am I understanding the task correctly? I'm not sure if I'm supposed to be dealing with two inputs (URL format string and URL instance) or if I'm just supposed to be working with one URL as a whole. (my solution takes two separate inputs)
In my solution, I keep reusing the split() method to chunk the array/s down and it feels a little repetitive. Is there a better way to do this?
If anyone can help me understand this challenge better and/or help me clean up my solution, it would be greatly appreciated!
Here is my JS:
const obj = {};
function parseUrl(str1, str2) {
const keyArr = [];
const valArr = [];
const splitStr1 = str1.split("/");
const splitStr2 = str2.split("?");
let val1 = splitStr2[0].split("/");
let val2 = splitStr2[1].split("&");
splitStr1.forEach((i) => {
keyArr.push(i);
});
val1.forEach((i) => {
valArr.push(i);
});
val2.forEach((i) => {
keyArr.push(i.split("=")[0]);
valArr.push(i.split("=")[1]);
});
for (let i = 0; i < keyArr.length; i++) {
if (keyArr[i] !== "" && valArr[i] !== "") {
obj[keyArr[i]] = valArr[i];
}
}
return obj;
};
console.log(parseUrl('/:version/api/:collection/:id', '/6/api/listings/3?sort=desc&limit=10'));
And here is a link to my codepen so you can see my output in the console:
https://codepen.io/TOOTCODER/pen/yLabpBo?editors=0012
Am I understanding the task correctly? I'm not sure if I'm supposed to
be dealing with two inputs (URL format string and URL instance) or if
I'm just supposed to be working with one URL as a whole. (my solution
takes two separate inputs)
Yes, your understanding of the problem seems correct to me. What this task seems to be asking you to do is implement a route parameter and query string parser. These often come up when you want to extract data from part of the URL on the server side (although you don't usually need to implement this logic yourself). Do keep in mind, though, that you only want to extract the path parts that have a : in front of them (currently you're retrieving values for all of them). For example, api in your answer should be excluded from the object (i.e. the hash).
In my solution, I keep reusing the split() method to chunk the array/s
down and it feels a little repetitive. Is there a better way to do
this?
The number of .split() methods that you have may seem like a lot, but each of them is serving its own purpose of extracting the data required. You can, however, change your code to make use of other array methods such as .map(), .filter() etc. to cut your code down a little. The below code also considers the case when no query string (ie: ?key=value) is provided:
function parseQuery(queryString) {
return queryString.split("&").map(qParam => qParam.split("="));
}
function parseUrl(str1, str2) {
const keys = str1.split("/")
.map((key, idx) => [key.replace(":", ""), idx, key.charAt(0) === ":"])
.filter(([,,keep]) => keep);
const [path, query = ""] = str2.split("?");
const pathParts = path.split("/");
const entries = keys.map(([key, idx]) => [key, pathParts[idx]]);
return Object.fromEntries(query ? [...entries, ...parseQuery(query)] : entries);
}
console.log(parseUrl('/:version/api/:collection/:id', '/6/api/listings/3?sort=desc&limit=10'));
It would be even better not to reinvent the wheel and instead make use of the URL constructor, which lets you extract the required information, such as the search parameters, more easily. This, however, requires that both strings are valid URLs:
function parseUrl(str1, str2) {
const {pathname, searchParams} = new URL(str2);
const keys = new URL(str1).pathname.split("/")
.map((key, idx) => [key.replace(":", ""), idx, key.startsWith(":")])
.filter(([,,keep]) => keep);
const pathParts = pathname.split("/");
const entries = keys.map(([key, idx]) => [key, pathParts[idx]]);
return Object.fromEntries([...entries, ...searchParams]);
}
console.log(parseUrl('https://www.example.com/:version/api/:collection/:id', 'https://www.example.com/6/api/listings/3?sort=desc&limit=10'));
Above, we still need to write our own custom logic to obtain the path parameters; however, we don't need to write any logic to extract the query string data, as this is done for us by URLSearchParams. We're also able to lower the number of .split() calls, as the URL constructor gives us an object with an already parsed URL. If you end up using a library (such as express), you will get the above functionality out of the box.
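One detail worth noting: both versions of parseUrl return every value as a string, while the expected hash in the question shows version, id and limit as numbers. If that matters for the exercise, a post-processing step could convert numeric-looking strings. This is only a sketch; coerceNumbers is a hypothetical helper and the rule "digits with an optional sign and fraction" is an assumption about what should count as a number.

```javascript
// Hypothetical helper: turns values that look like plain decimal numbers into numbers
// and leaves every other value untouched.
function coerceNumbers(params) {
  return Object.fromEntries(
    Object.entries(params).map(([key, value]) =>
      [key, /^-?\d+(?:\.\d+)?$/.test(value) ? Number(value) : value]
    )
  );
}

console.log(coerceNumbers({ version: "6", collection: "listings", id: "3", sort: "desc", limit: "10" }));
// → { version: 6, collection: 'listings', id: 3, sort: 'desc', limit: 10 }
```

It can be applied to the result of either implementation, for example coerceNumbers(parseUrl(format, instance)).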

Javascript regex parse complex url string

I need to parse a complex URL string to fetch specific values.
From the following URL string:
/api/rss/feeds?url=http://any-feed-url-a.com?filter=hot&format=rss&url=http://any-feed-url-b.com?filter=rising&format=rss
I need to extract this result in array format:
['http://any-feed-url-a.com?filter=hot&format=rss', 'http://any-feed-url-b.com?filter=rising&format=rss']
I already tried /url=([^&]+)/, but it doesn't capture all of the query parameters correctly. I would also like to omit the url= prefix.
RegExr link
Thanks in advance.
This regex works for me: url=([a-z:/.?=-]+&[a-z=]+)
Also, you can test this one: /http(s)?:\/\/([a-z-.?=&])+&/g
const string = '/api/rss/feeds?url=http://any-feed-url.com?filter=hot&format=rss&url=http://any-feed-url.com?filter=latest&format=rss'
const string2 = '/api/rss/feeds?url=http://any-feed-url.com?filter=hot&format=rss&next=parm&url=http://any-feed-url.com?filter=latest&format=rss'
const regex = /url=([a-z:/.?=-]+&[a-z=]+)/g;
const regex2 = /http(s)?:\/\/([a-z-.?=&])+&/g;
console.log(string.match(regex))
console.log(string2.match(regex2))
Have you tried using the split method instead of a regex?
const urlsArr = "/api/rss/feeds?url=http://any-feed-url-a.com?filter=hot&format=rss&url=http://any-feed-url-b.com?filter=rising&format=rss".split("url=");
urlsArr.shift(); // removing first item from array -> "/api/rss/feeds?"
console.log(urlsArr)
The split returns ["/api/rss/feeds?", "http://any-feed-url-a.com?filter=hot&format=rss&", "http://any-feed-url-b.com?filter=rising&format=rss"]; the shift() call then drops the first item of the array.
If possible, it's better to use something other than a regex; see Coding Horror: regular-expressions-now-you-have-two-problems
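Note that with the split approach every entry except the last keeps the & that separated it from the next url= parameter. A small sketch that trims it, under the assumption that a feed URL never legitimately ends in & (the variable names are illustrative):

```javascript
const rawFeedQuery = "/api/rss/feeds?url=http://any-feed-url-a.com?filter=hot&format=rss&url=http://any-feed-url-b.com?filter=rising&format=rss";

const feedList = rawFeedQuery
  .split("url=")                         // first item is the "/api/rss/feeds?" prefix
  .slice(1)                              // drop that prefix without mutating the array
  .map(part => part.replace(/&$/, ""));  // remove the separator left at the end

console.log(feedList);
// → [ 'http://any-feed-url-a.com?filter=hot&format=rss',
//     'http://any-feed-url-b.com?filter=rising&format=rss' ]
```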
You can use matchAll to find the URLs, then map capture group 1 of each match to an array.
str = '/api/rss/feeds?url=http://any-feed-url-a.com?filter=hot&format=rss&url=http://any-feed-url-b.com?filter=rising&format=rss'
arr = [...str.matchAll(/url=(.*?)(?=&url=|$)/g)].map(x => x[1])
console.log(arr)
But matchAll isn't supported by older browsers.
Looping over exec to fill an array also works.
str = '/api/rss/feeds?url=http://any-feed-url-a.com?filter=hot&format=rss&url=http://any-feed-url-b.com?filter=rising&format=rss'
re = /url=(.*?)(?=&url=|$)/g;
arr = [];
while (m = re.exec(str)) {
arr.push(m[1]);
}
console.log(arr)
If your input is better-formed in reality than shown in the question and you’re targeting a modern JavaScript environment, there’s URL/URLSearchParams:
const input = '/api/rss/feeds?url=http://any-feed-url-a.com?filter=hot%26format=rss&url=http://any-feed-url-b.com?filter=rising%26format=rss';
const url = new URL(input, 'http://example.com/');
console.log(url.searchParams.getAll('url'));
Notice how & has to be escaped as %26 for it to make sense.
Without this input in a standard form, it’s not clear which rules of URLs are still on the table.
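If you control the code that builds the request, the escaping does not have to be done by hand. As a sketch (variable names are illustrative, and http://example.com/ is just a placeholder base), URLSearchParams can encode each feed URL when the query is built, and the same API decodes it on the other side:

```javascript
const feedUrls = [
  "http://any-feed-url-a.com?filter=hot&format=rss",
  "http://any-feed-url-b.com?filter=rising&format=rss",
];

// append() percent-encodes reserved characters such as & and ? inside each value
const feedParams = new URLSearchParams();
for (const feedUrl of feedUrls) {
  feedParams.append("url", feedUrl);
}
const requestPath = `/api/rss/feeds?${feedParams}`;

// Parsing the result gives back the original, unescaped feed URLs
const roundTripped = new URL(requestPath, "http://example.com/").searchParams.getAll("url");
console.log(roundTripped);
```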

create one array after using map() twice

I may or may not get in 2 differently formatted bits of data.
They both need to be stripped of characters in different ways. Please excuse the variable names, I will make them better once I have this working.
const cut = flatten.map(obj => {
return obj.file.replace("0:/", "");
});
const removeDots = flatten.map(obj => {
return obj.file.replace("../../uploads/", "");
})
I then need to push the arrays into my mongo database.
let data;
for (const loop of cut) {
data = { name: loop };
product.images.push(data);
}
let moreData;
for (const looptwo of removeDots) {
moreData = {name: looptwo};
product.images.push(moreData);
}
I wanted to know if there is a way to either join them or use an if/else, because the result of this is that if I have 2 records, they end up duplicated and I get 4 records instead of 2. Also, 2 of the records are incorrectly formatted, i.e. the 0:/ is still present instead of being stripped away.
Ideally I would like to have a check: if 0:/ is present, remove it; if ../../uploads/ is present, remove it; if both are present, remove both. And then create an array from that to push.
You can do both replacements in the same map:
const processed = flatten.map(obj => {
return obj.file.replace("0:/", "").replace("../../uploads/", "");
});
Since you know the possible patterns, you can create a regex and use it to replace any occurrences.
const regex = /(0:\/|(\.\.\/)+uploads\/)/g
const processed = flatten.map(obj => obj.file.replace(regex, ''));
Note that a regex is a pattern-based approach, so it has pros and cons.
Pros:
It handles any depth of folder nesting, whereas the literal string ../../uploads/ only covers exactly two levels.
The transformation happens in one operation and the code looks clean.
Cons:
Regexes can be hard to understand and can reduce the readability of the code a bit. (Opinionated)
If you have a pattern like .../../uploads/bla, it will be reduced to .bla.
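The last drawback comes from the pattern matching anywhere in the string. One possible variation, sketched here with made-up sample data, anchors the pattern at the start of the file name. It assumes that when both prefixes are present, 0:/ comes first; that ordering is a guess and is not stated in the question.

```javascript
// Hypothetical sample input in the same shape as the flatten array from the question
const sampleFiles = [
  { file: "0:/photo-a.jpg" },
  { file: "../../uploads/photo-b.jpg" },
  { file: "0:/../../../uploads/photo-c.jpg" },
  { file: ".../../uploads/bla" },
];

// Optional "0:/" followed by an optional "../" chain ending in "uploads/", only at the start
const leadingPrefix = /^(?:0:\/)?(?:(?:\.\.\/)+uploads\/)?/;

const cleanedNames = sampleFiles.map(({ file }) => file.replace(leadingPrefix, ""));
console.log(cleanedNames);
// → [ 'photo-a.jpg', 'photo-b.jpg', 'photo-c.jpg', '.../../uploads/bla' ]

// One object per file, so each record is created exactly once
const imageDocs = cleanedNames.map(name => ({ name }));
```

With a single map over flatten, the resulting objects can be pushed in one go (for example product.images.push(...imageDocs)), which avoids the duplicated records described in the question.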
Since you also ask about a possible way of joining two arrays, I'll give you a couple of solutions (with and without joining).
You can either chain .replace on the elements of the array, or you can concat the two arrays in your solution. So, either:
const filtered = flatten.map(obj => {
return obj.file.replace('0:/', '').replace('../../uploads/', '');
});
Or (joining the arrays):
// your two .map calls go here
const joinedArray = cut.concat(removeDots);

Check Array Entries with Regex

I have an array with one or more entries. Each one is a string (a list of URLs of open tabs, obtained via the Firefox SDK). I want to check whether a specific URL is already open in one of the tabs (nothing special so far).
My problem is that the URL in the tab list can have four different forms. For example:
Url I want to find in the tablist:
https://cmsr-author.de/cf#/content/test/de.html
But the url can also look like this:
https://cmsr-author.de/content/test/de.html
https://cmsr-author.de/test/de.html
https://cmsr-author.de/cf#/test/de.html
Of course the last part of the URL (after /test/...) is always something different. If I can't find any of the four URLs in the tab list, I want to call some other action.
My solution so far is to build an if-chain:
if (res !== url1) {
if (res !== url2) {
if ...
But I thought there must be a more elegant way. Maybe via RegEx? I already have a capture to catch the first part (which stays the same, https://cmsr-author.ws...) with its four forms. But I don't know how to implement this properly.
var urls = ["https://cmsr-author.de/content/test/de.html","https://cmsr-author.de/test/de.html","https://cmsr-author.de/cf#/test/de.html"]
var filtered = urls.filter(function(url)
{
return url.indexOf("cf#") > -1 && url.endsWith("/test/de.html")
})
var contains = filtered.length > 0
console.log(contains)
If you want to use regex you can do this by using groups for the middle part, which is explained in detail here: http://www.regular-expressions.info/refcapture.html
Practically, your regex would look something like that:
https:\/\/cmsr-author\.de\/(content|...|...)\/de\.html
Where ... must be replaced by the middle parts of the url which differ.
Note that | is "or" used to provide multiple possibilities within the group. The character / and . must be escaped since they have special roles in regex.
I hope that helps!
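For the four forms listed in the question, the differing middle part boils down to an optional cf#/ followed by an optional content/. A sketch along those lines, with hypothetical helper names and the assumption that the varying page path (test/de.html here) is known when the check runs:

```javascript
// Escapes characters that have a special meaning in a regular expression
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds a pattern that accepts all four forms of one page path
function buildTabPattern(pagePath) {
  return new RegExp(
    "^https://cmsr-author\\.de/(?:cf#/)?(?:content/)?" + escapeRegExp(pagePath) + "$"
  );
}

const openTabs = [
  "https://cmsr-author.de/other/page.html",
  "https://cmsr-author.de/cf#/test/de.html",
];

const tabPattern = buildTabPattern("test/de.html");
const alreadyOpen = openTabs.some(tabUrl => tabPattern.test(tabUrl));
console.log(alreadyOpen); // → true
```

Array.prototype.some stops at the first matching tab, which replaces the nested if-chain with a single boolean.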
My English is not good and I do not fully understand what you mean. As I understand it, you need a regular expression that matches only the first form. If I am wrong, please let me know.
I hope that helps!
var reg = /^https:\/\/cmsr-author\.de\/cf#\/(?:\w+\/)+test\/de\.html$/i; // no g flag: test() on a global regex keeps state in lastIndex between calls
var str1 = "https://cmsr-author.de/cf#/content/test/de.html";
var str2 = "https://cmsr-author.de/content/test/de.html";
var str3 = "https://cmsr-author.de/test/de.html";
var str4 = "https://cmsr-author.de/cf#/test/de.html";
console.log(reg.test(str1));
console.log(reg.test(str2));
console.log(reg.test(str3));
console.log(reg.test(str4));
