JS regex to check url - javascript

This must have been asked a million times, but I can't find a solution that fits my needs.
I need a regex to check whether a string contains a URL and, if so, extract it. So far I have this:
var regexToken = /(((ftp|https?):\/\/)[\-\w#:%_\+.~#?,&\/\/=]+)|((mailto:)?[_.\w-]+@([\w][\w\-]+\.)+[a-zA-Z]{2,3})/g;
while( (matchArray = regexToken.exec( source )) !== null )
{
var result = matchArray[0];
}
return result;
This can retrieve :
http(s)|ftp://domain.com
http(s)|ftp://www.domain.com
http(s)|ftp://www.domain.com/with/path
But I need to modify it so that it also retrieves URLs that begin with just www:
www.domain.com/with/path
How can I do that? I'm a real beginner with regex...

Something like this may help:
/(((ftp|https?):\/\/|www\.)[continue from here]/
This lets a match start with ftp://, http://, https:// or www.
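To make the idea concrete, here is a minimal sketch of a complete pattern built on that prefix. The helper name findUrls and the simplified character class are my own assumptions, not the asker's final code, and the pattern only covers the URL half of the original regex (no mailto/email branch).

```javascript
// Hypothetical helper: returns every URL-like substring found in `source`.
// The character class is a simplified version of the one in the question.
const urlPattern = /(?:(?:ftp|https?):\/\/|www\.)[-\w#:%+.~?,&\/=]+/g;

function findUrls(source) {
  // String.prototype.match with a global regex returns all matches, or null.
  return source.match(urlPattern) || [];
}

console.log(findUrls('go to www.domain.com/with/path or ftp://domain.com now'));
```

Note that a trailing full stop right after a URL would be captured as part of it, because "." is in the character class.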

Try this to match URLs in different formats, for example:
var re = /^((ftp|https?):\/\/|www\.).*/gm;
var str = 'http://domain.com\nhttps://domain.com\nftp://domain.com\nftp://www.domain.com\nhttp://www.domain.com\nhttps://www.domain.com\nhttp://www.domain.com/with/path\nhttps://www.domain.com/with/path\nftp://www.domain.com/with/path\nwww.domain.com/with/path \n\n';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}

Related

Match Regex if not inside specific HTML tag

I would like to get special formatted strings ({string}) out of the HTML which are not inside a specific HTML tag.
For example I would like to match {test} and not <var>{test}</var>.
Therefore I am using the following regex: (excluding is done with ?!)
(?!<var>)\{\S+?\}(?!<\/var>)
So this works very well for texts with spaces, but if I have something like (where there is no space in-between):
<var>{name}</var>{username}
it matches two {}-strings: {name}</var>{username}
How can I just match {username} here?
Update:
If I need to do something like this
<var.*?<\/var>|(\{\S+?\})
How can I get the matched values, given that the index of the wanted match depends on its position in the string?
Examples:
Match 1:
"{username}<var>{name}</var>".match(/<var.*?<\/var>|(\{\S+?\})/g);
=> ["{username}", "<var>{name}</var>"]
Match 2:
"<var>{name}</var>{username}".match(/<var.*?<\/var>|(\{\S+?\})/g);
=> ["<var>{name}</var>", "{username}"]
Current Solution:
angular.forEach(html.match(regex), function (match) {
if(match.substring(0, 4) !== '<var') {
newAdded = match;
}
});
Is this really the 'best' solution for JavaScript?
Here is how you can achieve this using the following regex:
/<var.*?<\/var>|(\{\S+?\})/g;
var s = '<var>{name}</var>{username}<var>{newname}</var>{another_username}';
var log = [];
var m;
var regex = /<var.*?<\/var>|(\{\S+?\})/g;
while ((m = regex.exec(s)) !== null) {
if ( m[1] !== undefined) {
log.push(m[1]);
}
}
alert(log);
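The same "match the tag, but only capture what is outside it" idea can also be written with String.prototype.matchAll, if your environment supports it (ES2020). This is just a sketch; the function name placeholdersOutsideVar is made up for illustration.

```javascript
// Hypothetical helper: collects {placeholders} that are not inside <var>...</var>.
function placeholdersOutsideVar(html) {
  // First alternative consumes whole <var> elements without capturing anything;
  // second alternative captures a {placeholder} in group 1.
  const pattern = /<var.*?<\/var>|(\{\S+?\})/g;
  return Array.from(html.matchAll(pattern))
    .map(function (m) { return m[1]; })
    .filter(function (group) { return group !== undefined; });
}

console.log(placeholdersOutsideVar('<var>{name}</var>{username}<var>{newname}</var>{another_username}'));
```

Because group 1 is only set when the second alternative matched, there is no need to inspect the start of each match as in the forEach version from the question.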

javascript regex pattern for _water_glass

I need a javascript regex pattern to test a schema variable, so that it should have either of the following.
It can start with any characters followed by "_water_glass", with nothing after water_glass, e.g. "xxxx_water_glass"
or
It can be just "water_glass": characters before water_glass are not required, and nothing may follow water_glass.
Could anyone help me get the regex pattern, please?
Try this: /^(?:.*_)?water_glass$/
var re = /^(?:.*_)?water_glass$/mg;
var str = 'xxxx_water_glass\nwater_glass\nwater_glass_xxxx';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
DEMO https://regex101.com/r/gB9zL7/2
Here you are:
^(?:.+_|)water_glass$
Details:
^ - start of string
(?:.+_|) - an optional 1+ chars other than line break chars, as many as possible, up to the last _ including it
water_glass - a water_glass substring
$ - end of string.
See the demo code below:
var re = /^(?:.+_|)water_glass$/gm;
var str = 'xxxx_water_glass\nwater_glass';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
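Since the question is about testing a single variable rather than scanning multi-line text, a plain test() call without the g and m flags may be all you need. A small sketch, assuming the value arrives as one string; the function name isWaterGlassName is hypothetical.

```javascript
// No g flag, so the regex keeps no lastIndex state between calls.
const waterGlassPattern = /^(?:.+_)?water_glass$/;

// Hypothetical validator for the schema variable.
function isWaterGlassName(name) {
  return waterGlassPattern.test(name);
}

console.log(isWaterGlassName('xxxx_water_glass'));
console.log(isWaterGlassName('water_glass'));
console.log(isWaterGlassName('water_glass_xxxx'));
```

The first two calls should report a match and the third should not, because the $ anchor rejects anything after water_glass.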

selecting with regex content between two points

I always have a hard time with regex..
I'm trying to select the text between (taking into account what comes before and after)
'window.API=' and ';' //for window.API= '--API--';
and other cases like:
'window.img_cdn=' and ';' //for window.img_cdn= '--imgCDN--';
Any tips on which regex concepts I should use would be a great help!
If you want to capture the content between single quotes ('xx'), you can use a regex like this:
'(.*?)'
For the sample text:
window.API= '--API--';
window.img_cdn= '--imgCDN--';
You will capture:
MATCH 1
1. [13-20] `--API--`
MATCH 2
1. [40-50] `--imgCDN--`
The javascript code you can use is:
var re = /'(.*?)'/g;
var str = 'window.API= \'--API--\';\nwindow.img_cdn= \'--imgCDN--\';';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
On the other hand, if you specifically want to capture the content for only those entries, then you can use this regex:
window\.(?:API|img_cdn).*?'(.*?)'
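If you also want to know which property each value belongs to, a variation of that pattern can capture the property name as well. This is only a sketch: it assumes every assignment has the form window.name = '...'; and the function name extractWindowAssignments is invented for the example.

```javascript
// Hypothetical helper: maps each window.<name> to its single-quoted value.
function extractWindowAssignments(text) {
  // Group 1 is the property name, group 2 the text between the quotes.
  const pattern = /window\.(\w+)\s*=\s*'(.*?)';/g;
  const values = {};
  for (const m of text.matchAll(pattern)) {
    values[m[1]] = m[2];
  }
  return values;
}

const sample = "window.API= '--API--';\nwindow.img_cdn= '--imgCDN--';";
console.log(extractWindowAssignments(sample));
```

The lazy (.*?) stops at the first '; it meets, so a value that itself contains '; would be cut short.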
If you want to match any text between a <some string>= sign and a semicolon, here you go:
(?:[\w\.]+\s*=\s')(.+)(?:';)$
This pattern still matches the full string when the text contains an escaped apostrophe, e.g. //for window.img_cdn = '--imgCDN and \'semicolon\'--';
JavaScript code:
var re = /(?:[\w\.]+\s*=\s')(.+)(?:';)$/gm;
var str = '//for window.img_cdn= \'--imgCDN--\';\n//for window.img_cdn = \'--imgCDN and semicolon = ;;;--\';';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// view results
}
The required text is in the 1st captured group. In case there is a semicolon in the text you are looking for, you will correctly match it due to the $ anchor.

Matching the last word in an URL with JavaScript

The URL looks like this:
http://www.example.com/?lessoncontent=lesson-003-pinyin
I managed to get the last part with this:
var url = window.location.href.split("/").pop();
So now I got this:
?lessoncontent=lesson-003-pinyin
Not sure how to get the last part, though (pinyin). I want to be able to do if statements with URLs like this:
?lessoncontent=lesson-001-pinyin
?lessoncontent=lesson-003-pinyin
?lessoncontent=lesson-002-complete
?lessoncontent=lesson-003-complete
(Only taking into account the last word of the URL).
Example:
if (match === "pinyin") { /* do stuff */ }
if (match === "complete") { /* do stuff */ }
Just split on - and take the last element from it.
var match = window.location.href.split("-").pop();
if (match === "pinyin") {} // do stuff
if (match === "complete") {} // do stuff
We are splitting on - and then, taking the last element in it by popping it out of the Array.
I would try this:
(\w+)[&?\/]?$
which works for all sorts of URLs, whether or not there is a URL parameter. It captures the word characters that come before an optional trailing &, ?, or /.
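A quick sketch of how that pattern could be used; the helper name lastWord and the second sample URL are assumptions for illustration only.

```javascript
const lastWordPattern = /(\w+)[&?\/]?$/;

// Hypothetical helper: returns the trailing word of a URL, or null if there is none.
function lastWord(url) {
  const m = url.match(lastWordPattern);
  return m ? m[1] : null;
}

console.log(lastWord('http://www.example.com/?lessoncontent=lesson-003-pinyin'));
console.log(lastWord('http://www.example.com/lessons/complete/'));
```

Keep in mind that \w also matches digits and underscores, so a URL ending in lesson-003 would give you 003.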
[^-]*$
Try this. You can apply it directly to the link and get the answer in one step. See the demo:
https://regex101.com/r/wU7sQ0/16
var re = /[^-]*$/gm;
var str = 'http://www.example.com/?lessoncontent=lesson-003-pinyin';
var m;
while ((m = re.exec(str)) != null) {
if (m.index === re.lastIndex) {
re.lastIndex++;
}
// View your result using the m-variable.
// eg m[0] etc.
}
You may try RegExp:
var match = location.search.match(/(\w+)$/)[0];
if (match === "pinyin") { /* do stuff */ }
if (match === "complete") { /* do stuff */ }
Here, location.search returns only the query string, i.e. ?lessoncontent=lesson-001-pinyin. Then match(/(\w+)$/) returns an array whose first element is the word at the end of the string.
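If you would rather not depend on the whole URL ending with the word, you could read the lessoncontent parameter by name and split only its value. A minimal sketch using the standard URL and URLSearchParams APIs; the function name lastWordOfLesson is hypothetical, and in a browser you would pass window.location.href.

```javascript
// Hypothetical helper: returns the part after the last "-" in the lessoncontent parameter.
function lastWordOfLesson(href) {
  // searchParams.get returns null when the parameter is absent, hence the fallback.
  const value = new URL(href).searchParams.get('lessoncontent') || '';
  return value.split('-').pop();
}

const match = lastWordOfLesson('http://www.example.com/?lessoncontent=lesson-003-pinyin');
if (match === 'pinyin') { console.log('pinyin lesson'); }
if (match === 'complete') { console.log('complete lesson'); }
```

This keeps working if other parameters or a hash fragment are appended to the URL later.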

Javascript match part of url, if statement based on result

Here is an example of the URL I'm trying to match: http://store.mywebsite.com/folder-1/folder-2/item3423434.aspx
What I'm trying to match is http://store.mywebsite.com/folder-1, except that "folder-1" will always be a different value. I can't figure out how to write an if statement for this:
Example (pseudo-code)
if(url contains http://store.mywebsite.com/folder-1)
do this
else if (url contains http://store.mywebsite.com/folder-2)
do something else
etc
In the interest of keeping things very simple...
if(location.pathname.indexOf("folder-1") != -1)
{
//do things for "folder-1"
}
This might give you false positives if the value "folder-1" can appear in other parts of the string. If you have already made sure that is not the case, the example above should be sufficient.
I would split() the string and check an individual component of the url:
var str = "http://store.mywebsite.com/folder-1/folder-2/item3423434.aspx"
// split the string into an array of parts
var spl = str.split("/");
// spl is now [ "http:", "", "store.mywebsite.com", "folder-1", "folder-2", "item3423434.aspx" ]
if (spl[3] == "folder-1") {
// do something
} else if (spl[3] == "folder-2") {
// do something else
}
Using this method it's easy to check other parts of the URL too, without having to use a regular expression with sub-expression captures, e.g. matching the second directory in the path would be if (spl[4] == "folder-x").
Of course, you could also use indexOf(), which will return the position of a substring match within a string, but this method is not quite as dynamic and it's not very efficient/easy to read if there are going to be a lot of else conditions:
var str = "http://store.mywebsite.com/folder-1/folder-2/item3423434.aspx"
if (str.indexOf("http://store.mywebsite.com/folder-1") === 0) {
// do something
} else if (str.indexOf("http://store.mywebsite.com/folder-2") === 0) {
// do something
}
Assuming the base URL is fixed and the folder numbers can be very large, this code should work:
var url = 'http://store.mywebsite.com/folder-1/folder-2/item3423434.aspx'
, regex = /^http:\/\/store\.mywebsite\.com\/(folder-\d+)/
, match = url.match(regex);
if (match) {
if (match[1] == 'folder-1') {
// Do this
} else if (match[1] == 'folder-2') {
// Do something else
}
}
Just use URL parsing in JS; then you can match URLs against simple string conditions or against regular expressions.
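For instance, with the standard URL API you can pull out the first path segment and compare it directly. This is only a sketch; the function name firstFolder is made up, and in a browser you could pass window.location.href instead of a literal string.

```javascript
// Hypothetical helper: returns the first directory of the URL's path.
function firstFolder(href) {
  const url = new URL(href);
  // pathname is "/folder-1/folder-2/item3423434.aspx"; splitting on "/" leaves
  // an empty string at index 0, so the first folder is at index 1.
  return url.pathname.split('/')[1];
}

const folder = firstFolder('http://store.mywebsite.com/folder-1/folder-2/item3423434.aspx');
if (folder === 'folder-1') {
  console.log('do this');
} else if (folder === 'folder-2') {
  console.log('do something else');
}
```

Comparing a whole path segment avoids the false positives that a plain substring search can produce.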
