JavaScript regular expression for WebURL

I want a regular expression in JavaScript that can validate any web URL.
It should accept the formats below:
google.com/...
www.google.com/...
http://google.com/...
https://google.com/...
I have tried lots of regular expressions for that, but none of them looks perfect. Below are some of the regular expressions I tried:
/(http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
/[a-z]+:\/\/(([a-z0-9][a-z0-9-]+\.)*[a-z][a-z]+|(0x[0-9A-F]+)|[0-9.]+)\/.*/
/(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/
/^(((ht|f){1}((tp|tps):[/][/]){1}))[-a-zA-Z0-9@:%_\+.~#!?&//=]+$/
The regular expression should accept exactly three w characters in the www prefix: no more than three and no fewer than three.

Here are two patterns you can use to validate the URL:
^(https?|ftp|file)://.+$
^((https?|ftp)://|(www|ftp)\.)[a-z0-9-]+(\.[a-z0-9-]+)+([/?].*)?$
Try these...
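As a rough check, the second pattern can be written as a JavaScript regex literal (with the slashes escaped) and run against the four formats from the question. The variable names and sample paths below are illustrative assumptions, and the i flag is an addition so that uppercase input is accepted.

```javascript
// Second pattern from the answer above, written as a JavaScript literal.
// The "i" flag is an addition; the pattern itself lists only lowercase letters.
const prefixedUrlPattern =
  /^((https?|ftp):\/\/|(www|ftp)\.)[a-z0-9-]+(\.[a-z0-9-]+)+([\/?].*)?$/i;

// Sample inputs modelled on the formats in the question (paths are made up).
const prefixedSamples = [
  "google.com/search",
  "www.google.com/search",
  "http://google.com/search",
  "https://google.com/search",
];

for (const sample of prefixedSamples) {
  console.log(sample, prefixedUrlPattern.test(sample));
}
```

Note that this pattern requires either a scheme or a leading www. or ftp., so the bare google.com/... form is rejected by it.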

What do you think about:
((http|https)\://){0,1}(w{3}\.){0,1}[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}
Explanation:
((http|https)\://){0,1} for the optional protocol ("", "http://" or "https://" are all OK)
(w{3}\.){0,1} for the optional www. prefix
[a-zA-Z0-9\-\.]+ for the domain name
\.[a-zA-Z]{2,3} for the domain suffix, like com, uk, biz, tv, etc.
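A minimal sketch of the pieces above, assembled into an anchored JavaScript regex. The ^ and $ anchors, the escaped dot after www, and the optional trailing path are assumptions added here so the pattern can be used for validation rather than searching; the name webUrlPattern is made up.

```javascript
// Built from strings so each part can carry a comment.
const webUrlPattern = new RegExp(
  "^" +
    "(https?:\\/\\/)?" +   // optional protocol: "", "http://" or "https://"
    "(www\\.)?" +          // optional www prefix, with the dot escaped
    "[a-zA-Z0-9\\-.]+" +   // domain name
    "\\.[a-zA-Z]{2,3}" +   // domain suffix such as com, uk, biz, tv
    "(\\/.*)?" +           // optional path (assumption, not part of the answer)
    "$"
);

console.log(webUrlPattern.test("google.com/maps"));
console.log(webUrlPattern.test("https://www.google.com/maps"));
console.log(webUrlPattern.test("google"));
```

Because the suffix is limited to two or three letters, longer suffixes would need a wider range.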

This one would match all of your URLs:
(https?://)?(www\.)?([a-zA-Z0-9_%]*)\b\.[a-z]{2,4}(\.[a-z]{2})?((/[a-zA-Z0-9_%]*)+)?(\.[a-z]*)?
However, it's not possible to check for exactly three "w" characters this way, as there might be a subdomain. If you really need this check, I would use a second regex to test for it.
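A sketch of what such a second check might look like. The helper name hasValidWwwPrefix is hypothetical, and the rule it applies (a first host label made only of the letter w must be exactly www) is one possible reading of the requirement.

```javascript
// Hypothetical helper: rejects hosts whose first label consists only of
// the letter "w" but is not exactly "www" (e.g. "ww." or "wwww.").
function hasValidWwwPrefix(url) {
  // Strip an optional scheme so the check starts at the host.
  const host = url.replace(/^https?:\/\//i, "");
  // Capture a first label made up solely of "w" characters.
  const match = /^(w+)\./i.exec(host);
  // No such label means there is nothing to reject.
  return match === null || match[1].length === 3;
}

console.log(hasValidWwwPrefix("www.google.com"));
console.log(hasValidWwwPrefix("wwww.google.com"));
console.log(hasValidWwwPrefix("http://google.com"));
```

This would be run in addition to the main URL pattern, not instead of it.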

Related

Finding difficulty in correct regex for URL validation

I have to set some rules to reject invalid URLs in my project. I am using a regex for this.
My URL is "http ://some/resource/location".
This URL should not allow spaces at the beginning, in the middle, or at the end.
For example these spaces are invalid:
"https ://some/(space here in middle) resource/location"
"https ://some/resource/location (space in end)"
"(space in starting) https ://some/resource/location"
"https ://(space here) some/resource/location"
These scenarios are also invalid:
"httpshttp ://some/resource/location"
"https ://some/resource/location,https ://some/resource/location"
Currently I am using a regex
var regexp = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
This regex accepts all those invalid scenarios. I am unable to find the correct matching regex which will accept only if the url is valid. Can anyone help me out on this?
We need to cover a number of scenarios for URL validation. If you are particular about your given pattern, then the regex from the other answer looks good.
Or
If you want to take care of all the URL validation scenarios, please refer to "In search of the perfect URL validation regex".
/(ftp|http|https){1}:\/\/(?:.(?! ))+$/
is this regex OK ?
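One way to reject the invalid scenarios listed in the question is to anchor the pattern at both ends and forbid whitespace in the body. The sketch below also forbids commas in order to reject the comma-joined case; that is an assumption, since commas are otherwise legal in URLs. The name noSpaceUrlPattern is made up.

```javascript
// Anchored at both ends; the body may not contain whitespace or commas.
const noSpaceUrlPattern = /^(ftp|http|https):\/\/[^\s,]+$/;

const candidates = [
  "https://some/resource/location",                                 // valid
  "https://some/ resource/location",                                // space in the middle
  "https://some/resource/location ",                                // space at the end
  " https://some/resource/location",                                // space at the start
  "httpshttp://some/resource/location",                             // doubled scheme
  "https://some/resource/location,https://some/resource/location",  // two URLs joined
];

for (const candidate of candidates) {
  console.log(JSON.stringify(candidate), noSpaceUrlPattern.test(candidate));
}
```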
use this
^\?([\w-]+(=[\w-]*)?(&[\w-]+(=[\w-]*)?)*)?$
See live demo
This considers each "pair" as a key followed by an optional value (which may be blank), and has a first pair, followed by an optional & then another pair, and the whole expression (except for the leading ?) is optional. Doing it this way prevents matching ?&abc=def
Also note that hyphen doesn't need escaping when last in the character class, allowing a slight simplification.
You seem to want to allow hyphens anywhere in keys or values. If keys need to be hyphen free:
^\?(\w+(=[\w-]*)?(&\w+(=[\w-]*)?)*)?$
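To see the behaviour described above, the first pattern (the one allowing hyphens in keys) can be tried against a few query strings. The sample strings and the name queryPattern are assumptions made for illustration.

```javascript
// First pattern from the answer: hyphens allowed in both keys and values.
const queryPattern = /^\?([\w-]+(=[\w-]*)?(&[\w-]+(=[\w-]*)?)*)?$/;

const queries = ["?abc=def", "?abc=def&x=1", "?abc", "?abc=", "?", "?&abc=def", "abc=def"];

for (const query of queries) {
  console.log(query, queryPattern.test(query));
}
```

Only the last two samples are rejected: one starts with an empty pair, and the other lacks the leading question mark.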

URL RegExp WITHOUT http:// or www

I'm trying to construct URL RegExp. The base expression looks like:
/^(((http(?:s)?\:\/\/)|www\.)[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$/
It looks good to me, because it matches these:
http://gmail.com
http://www.gmail.com
www.gmail.com
But I would like to modify it to match this:
gmail.com
I will appreciate any help.
Just add a ? to make the http:// or www. prefix optional; then it will match gmail.com as well.
Use this:
^(((http(?:s)?\:\/\/)|www\.)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$
Or, if you want to match only gmail.com and not http://gmail.com, use this:
^([a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$
Please note that this will match any string that has dots and letters in it.
IMO you would be better off using a regex like this:
^(http:\/\/|www\.)?[\w\.]+\.(com|net|co\.cc|co\.in)$
You can modify it according to your needs.
Check out a demo here and play around with the regex:
http://regex101.com/r/tS4aB3
The easiest way is to treat 'www' as just another subdomain (because that's all it is).
So:
/^(((http(?:s)?\:\/\/))?([a-zA-Z0-9\-]+\.?)+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$/
Edit: as a side note, the tld (i.e. the ".com" part) is... quite complicated these days. There are a lot of them, and they may not fit easily in 2-6 chars.
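Given that, one option is to stop bounding the TLD length and only require two or more letters. The following is a minimal sketch in which the pattern, its name hostPattern, and the sample hosts are all assumptions; paths and query strings are deliberately left out to keep it short.

```javascript
// "www" is treated as just another label; the TLD has no upper length bound.
const hostPattern = /^(https?:\/\/)?([a-zA-Z0-9-]+\.)+[a-zA-Z]{2,}$/;

const hosts = [
  "gmail.com",
  "www.gmail.com",
  "http://www.gmail.com",
  "example.photography", // a TLD longer than six characters
  "gmail",               // no TLD at all
];

for (const host of hosts) {
  console.log(host, hostPattern.test(host));
}
```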

Regex for hostname

I have the following code (currHost is hostname):
if (currHost.match(/(alpha|beta|test|dev|load|local)\./))
but I need to add additional conditions such as
file:
.od*. (* is wildcard)
dev-wa.
.sq*. (* is wildcard)
.hbox.
Example MATCHING URLS:
file://c:blahblahblah
www.sqc.mydomain.com
www.sqa.mydomain.com
www.odd.mydomain.com
www.odp.mydomain.com
www.hbox.mydomain.com
dev-wa.mydomain.com
Example NOT MATCHING URLS:
www.sqcmydomain.com
www.sqamydomain.com
www.oddmydomain.com
www.odpmydomain.com
www.hboxx.mydomain.com
dev-waa.mydomain.com
not sure how to approach this?
Thanks!
For your file matches you can simply use
document.location.protocol == 'file:'
and the expression you're looking for is
currHost.match(/(alpha|beta|test|dev|load|local|\.od.*|dev-wa|\.sq.*|\.hbox)\./)
Just remember that this expression will also match www.od.mydomain.com and www.oddmydomain.com, since those are valid matches too. If you don't want this, you need to either specify a full expression (with the .com part) or specify the number of characters after the od/sq part. For example:
currHost.match(/(alpha|beta|test|dev|load|local|\.od.{1}|dev-wa|\.sq.{1}|\.hbox)\./)
for a one-character match.
If you specifically want to match those strings only at the beginning of the domain, with or without www., you can use
currHost.match(/^(www\.)?(alpha|beta|test|dev|load|local|od.+|dev-wa|sq.+|hbox)\./)
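The one-character variant can be checked against the example hostnames from the question. In this sketch the wildcard is written as [^.] (any single character except a dot), which is an assumption about what the question intends, and the names envHostPattern and isInternalHost are hypothetical. The protocol is passed in as a parameter so the function can run outside a browser.

```javascript
// One-character wildcard after "od" and "sq"; [^.] keeps it inside one label.
const envHostPattern =
  /(alpha|beta|test|dev|load|local|dev-wa|\.od[^.]|\.sq[^.]|\.hbox)\./;

// Hypothetical helper combining the file: check with the hostname check.
function isInternalHost(protocol, host) {
  return protocol === "file:" || envHostPattern.test(host);
}

const shouldMatch = ["www.sqc.mydomain.com", "www.odd.mydomain.com",
  "www.hbox.mydomain.com", "dev-wa.mydomain.com"];
const shouldNotMatch = ["www.sqcmydomain.com", "www.oddmydomain.com",
  "www.hboxx.mydomain.com", "dev-waa.mydomain.com"];

for (const host of shouldMatch.concat(shouldNotMatch)) {
  console.log(host, isInternalHost("http:", host));
}
```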
if (currHost.match(/(alpha|beta|test|dev|load|local|file:|dev-wa|od\*|hbox)?\./))
Just add them like above? Or do you mean something else? I've added a backslash in front of the asterisk, so it matches a literal * rather than acting as a wildcard, and a question mark after the group, which makes the group optional.

Regexp javascript - url match with localhost

I'm trying to find a simple regexp for URL validation, but I'm not very good with regexes.
Currently I have such regexp: (/^https?:\/\/\w/).test(url)
So it's allowing to validate urls as http://localhost:8080 etc.
What I want to do is NOT to validate urls if they have some long special characters at the end like: http://dodo....... or http://dododo&&&&&
Could you help me?
How about this?
/^http:\/\/\w+(\.\w+)*(:[0-9]+)?\/?(\/[.\w]*)*$/
Will match: http://domain.com:port/path or just http://domain or http://domain:port
/^http:\/\/\w+(\.\w+)*(:[0-9]+)?\/?$/
match URLs without path
Some explanations of regex blocks:
Domain: \w+(\.\w+)* to match text with dots: localhost or www.yahoo.com (it runs until the Port or Path section begins)
Port: (:[0-9]+)? to optionally match a number preceded by a colon: :8000 (there can be only one)
Path: \/?(\/[.\w]*)* to match any alphanumerics with slashes and dots: /user/images/0001.jpg (until the end of the line)
(The path is the interesting part. As written, it allows lone or adjacent dots, i.e. expressions such as /. or /./ or /.../ are possible. If you'd like dots in the path to behave as in the domain section, without leading, trailing, or adjacent dots, then use \/?(\/\w+(\.\w+)*)*, similar to the domain part.)
* UPDATED *
Also, if you would like to allow - characters (which are valid) in your URL, or any other characters, simply expand the character class used for matching URL text, i.e. \w+ should become [\-\w]+ and so on.
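Putting the blocks together with the hyphen extension applied gives something like the sketch below. The name simpleHttpPattern and the sample URLs are assumptions; the two rejected inputs are taken from the question.

```javascript
// Domain, optional port and optional path, with "-" added to each class.
const simpleHttpPattern =
  /^http:\/\/[-\w]+(\.[-\w]+)*(:[0-9]+)?\/?(\/[-.\w]*)*$/;

const urls = [
  "http://localhost:8080",
  "http://my-site.com/user/images/0001.jpg",
  "http://dodo.......",  // trailing dots, from the question
  "http://dododo&&&&&",  // trailing ampersands, from the question
];

for (const url of urls) {
  console.log(url, simpleHttpPattern.test(url));
}
```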
If you want to match ABCD, then you can leave out the start part.
For example, to match http://localhost:8080
just write
/(localhost)/
If you want to match a specific thing, then focus on the term you want to search for, not the start and end of the sentence.
Regular expressions are for searching for terms, unless we have a rigid rule for the whole string. :)
I hope this will do.
It depends on how complex you need the Regex to be. A simple way would be to just accept words (and the port/domain):
^https?:\/\/\w+(:[0-9]*)?(\.\w+)?$
Remember you need to use the + character to match one or more characters.
Of course, there are far better & more complicated solutions out there.
^https?:\/\/localhost:[0-9]{1,5}\/([-a-zA-Z0-9()@:%_\+.~#?&\/=]*)
match:
https://localhost:65535/file-upload-svc/files/app?query=abc#next
not match:
https://localhost:775535/file-upload-svc/files/app?query=abc#next
Explanation:
It can only be used for localhost.
It also limits the port number to at most five digits. A valid port must not exceed 65535, so you probably need to add additional logic for that.
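The additional logic could be as small as capturing the port and comparing it numerically. This is a sketch under a few assumptions: the pattern is anchored at the end, the path is made optional, ports below 1 are rejected, and the names localhostPattern and isValidLocalhostUrl are made up.

```javascript
// Captures the port so that its numeric range can be checked separately.
const localhostPattern =
  /^https?:\/\/localhost:([0-9]{1,5})(\/[-a-zA-Z0-9()@:%_+.~#?&\/=]*)?$/;

function isValidLocalhostUrl(url) {
  const match = localhostPattern.exec(url);
  if (match === null) {
    return false; // wrong shape, or more than five port digits
  }
  const port = Number(match[1]);
  return port >= 1 && port <= 65535; // range check the regex cannot express easily
}

console.log(isValidLocalhostUrl("https://localhost:65535/file-upload-svc/files/app?query=abc#next"));
console.log(isValidLocalhostUrl("https://localhost:70000/app"));
```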
You can use this. This will allow localhost and live domain as well.
^https?:\/\/\w+(\.\w+)*(:[0-9]+)?(\/.*)?$
I'm pretty late to the party, but these days you should consider validating your URL with the URL class. Avoid the headache of regex and rely on the standard:
let isValid;
try {
  new URL(endpoint); // Will throw if URL is invalid
  isValid = true;
} catch (err) {
  isValid = false;
}
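The URL constructor accepts any scheme, so if only http and https should pass, the parsed protocol can be checked as well. A small sketch wrapping the snippet above in a function; the name isHttpUrl is hypothetical.

```javascript
// Returns true only for strings that parse as URLs with an http(s) scheme.
function isHttpUrl(value) {
  let parsed;
  try {
    parsed = new URL(value); // throws a TypeError if the string cannot be parsed
  } catch (err) {
    return false;
  }
  // The protocol property includes the trailing colon.
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}

console.log(isHttpUrl("http://localhost:8080"));
console.log(isHttpUrl("ftp://example.com"));
console.log(isHttpUrl("not a url"));
```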
^https?:\/\/(localhost:([0-9]+.)+[a-zA-Z0-9]{1,6})?$
Will match the following cases :
http://localhost:3100/api
http://localhost:3100/1
http://localhost:3100/AP
http://localhost:310
Will NOT match the following cases :
http://localhost:3100/
http://localhost:
http://localhost
http://localhost:31

How to identify all URLs that contain a (domain) substring?

If I am correct, the following code will only match a URL that is exactly as presented.
However, what would it look like if you wanted to identify subdomains as well as urls that contain various different query strings - in other words, any address that contains this domain:
var url = /test.com/;
if (window.location.href.match(url)) {
  alert("match!");
}
If you want this regex to match only "test.com", you need to escape the ".", which means "any character" in regex syntax. The "/" characters also need escaping if they are meant to match literal slashes rather than delimit the regex.
Escaped: \/test\.com\/
Take a look here for more info.
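The effect of escaping the dot can be seen by comparing the two forms side by side. The variable names and sample URLs are made up for illustration.

```javascript
const looseDomain = /test.com/;   // "." matches any character
const strictDomain = /test\.com/; // "\." matches only a literal dot

console.log(looseDomain.test("http://testXcom.org"));   // the dot stands in for "X"
console.log(strictDomain.test("http://testXcom.org"));  // a literal dot is required
console.log(strictDomain.test("http://test.com/page"));
```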
No, your pattern will actually match on all strings containing test.com.
The regular expression /test.com/ says to match test[ANY CHARACTER]com anywhere in the string.
It's better to use example.com for example links, so I replaced test with example.
Some example matches could be
http://example.com
http://examplexcom.xyz
http://example!com.xyz
http://example.com?q=123
http://sub.example.com
http://fooexample.com
http://example.com/asdf/123
http://stackoverflow.com/?site=example.com
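If the goal is to accept the domain and its subdomains with any path or query string, but not look-alikes such as the ones in the list above, one alternative to a regex is to compare the parsed hostname. This is a sketch that swaps in the URL class for the regex; the helper name isOnDomain is hypothetical.

```javascript
// True when href is on the given domain or on one of its subdomains.
function isOnDomain(href, domain) {
  let host;
  try {
    host = new URL(href).hostname; // the parser lowercases the hostname
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  return host === domain || host.endsWith("." + domain);
}

const links = [
  "http://example.com?q=123",
  "http://sub.example.com/asdf/123",
  "http://fooexample.com",
  "http://examplexcom.xyz",
  "http://stackoverflow.com/?site=example.com",
];

for (const link of links) {
  console.log(link, isOnDomain(link, "example.com"));
}
```

In a browser, window.location.href would be passed as the first argument.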
I think you need to use /g. /g enables "global" matching. When using the replace() method, specify this modifier to replace all matches, rather than only the first one:
var url = /test.com/g;
If you want to test whether a URL is valid, this is the one I use. It is fairly complex, because it also takes care of numeric domains and a few other peculiarities:
var urlMatcher = /(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?/;
It takes care of parameters, anchors, etc. Don't ask me to explain the details, please.
