RegEx for subdomain with varying number - javascript

Using JavaScript RegExp, I'm trying to match URLs like the following:
http://sub.domain.com/
http://sub1.domain.com/
http://sub100.domain.com/
I tried the following regex, which isn't working. I'm not sure what I'm doing wrong:
^http\:\/\/sub\d*\.domain\.com\/$
EDIT: fixed copy & paste typo
Update: For some reason, document.location.href doesn't match the regex, even though the examples below (also on regex101.com) work as expected. My workaround for now is to match any subdomain.
Any help is much appreciated!

I don't understand how you can compare a grape with a mango.
Here is the corrected regex:
/^http\:\/\/subs\d*\.app\.clicktale\.com\/$/.test("http://subs14.app.clicktale.com/");
Run this command in your console right now.
You will get true. If you use search instead, you will get 0, because the match starts at index 0.

Well the sub-domains you showed have the token "sub", but your REGEX is looking for "subs".
Also, no need to escape colons.
You don't say whether you wish to test for a match or actually capture the sub-domain. I'll assume the latter:
var match = "http://foo.bar.com".match(/https?:\/\/(([^.]+)\.)?/);
alert(match[2]); //"foo"
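As a quick check, here is a minimal sketch that runs the question's pattern (without the unneeded colon escape) against the three sample URLs and a few near misses. The extra test strings are made up for illustration. Because the pattern is anchored at both ends, an https scheme or a trailing path makes it fail, which may or may not be what happens with document.location.href in the question.

```javascript
// Pattern from the question; the colon does not need escaping.
const subPattern = /^http:\/\/sub\d*\.domain\.com\/$/;

const samples = [
  "http://sub.domain.com/",        // from the question
  "http://sub1.domain.com/",       // from the question
  "http://sub100.domain.com/",     // from the question
  "http://subs14.domain.com/",     // hypothetical: "subs" instead of "sub"
  "https://sub1.domain.com/",      // hypothetical: https scheme
  "http://sub1.domain.com/page",   // hypothetical: trailing path
];

// test() returns true or false for each URL.
const results = samples.map((url) => subPattern.test(url));
console.log(results); // [true, true, true, false, false, false]
```

If the scheme or the path may vary, the pattern would need https? and a looser ending instead of \/$.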

Related

Regex for - 'A,B','C'

I have written this regex -
([\s]*'[A-Za-z0-9_: ]*[\,]*[\s]*[A-Za-z0-9_: ]*\'[\s]*)[\,]*
But it does not handle the input 'A,B' 'C' correctly: the comma between the items is missing, yet it is still a perfect match.
Can anyone please help?
After giving this more thought, I think what you want is something more like this:
^(?<item>\'[a-zA-Z0-9,\s]+\')(\s*,(?&item))*\s*$
You're using an asterisk which will match zero instances. Try using + instead for the characters you want one or more of.
Please provide other examples that you'd expect to match. For this specific case, the following would match, but is very rigid and specific:
\'+[a-zA-Z]+\,\s*[a-zA-Z]+\'+\,\s*\'+[a-zA-Z]+\'+
Edit:
This is more in line with what I think you want:
^(\'[a-zA-Z]+(\,+\s*[a-zA-Z]+)*\'\s*\,*)*$
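The (?&item) subroutine call in the first pattern above is PCRE syntax and is not available in JavaScript's RegExp. Assuming JavaScript is the target (the question does not say), a rough equivalent can be sketched by building the pattern from a repeated item string. The names item and listPattern are made up for this example, and the allowed character set is taken from the question's own regex.

```javascript
// One quoted item: letters, digits, underscore, colon, space or comma
// between single quotes (at least one character).
const item = "'[A-Za-z0-9_: ,]+'";

// Items must be separated by a comma, with optional whitespace around it.
const listPattern = new RegExp(
  "^\\s*" + item + "(?:\\s*,\\s*" + item + ")*\\s*$"
);

console.log(listPattern.test("'A,B','C'"));   // true
console.log(listPattern.test("'A,B', 'C'"));  // true
console.log(listPattern.test("'A,B' 'C'"));   // false: comma between items is missing
console.log(listPattern.test("'A,B','C',"));  // false: trailing comma
```

Making the separator mandatory inside the repeated group is what rejects the input with the missing comma.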

My regex to find a pattern in an img src attribute fails when the searched pattern starts with a t

I built a regex to find any href or src attribute value in a html string that does not start with 'http'.
My solution seems to work in most cases, except when the attribute value starts with a 't'. I don't understand why. Can someone explain why this happens?
examples (in javascript):
//this gives the expected match
'<img href="somename.jpg">'.match(/(?:href|src)\=\"([^(http)][^(\")]*)\"/);
//this does NOT give the expected match
'<img href="thisname.jpg">'.match(/(?:href|src)\=\"([^(http)][^(\")]*)\"/);
Here is the regex I am using:
/(?:href|src)\=\"([^(http)][^(\")]*)\"/
It might be that [^(http)] excludes all occurrences of h, t and p.
Check whether "psomename.jpg" fails as well.
[^(http)] is your problem: as a character class, it simply means "not (, not h, not t, not p and not )".
I assume you were thinking of (?!http), a negative lookahead group, to exclude every value that starts with the literal http.
This should suffice (short and simple):
(?:href|src)="(?!http:\/\/).*\"
That is, if you only want to exclude values starting with http and do not need to check whether the rest is a valid URL.
You're looking for a lookahead assertion:
/(?:href|src)="(?!https?:\/\/)[^"]+"/
This is a negative lookahead. In this situation, it matches the attribute only if the value is not immediately followed, at its start, by http:// (or https://). A simpler example is a(?!b), which is an a not followed by b. A negative lookbehind, (?<!string), checks what comes before the current position instead; it was only added to JavaScript in ES2018, so older engines do not support it.
https://www.regular-expressions.info/lookaround.html
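To make the difference concrete, here is a small sketch comparing the character class from the question with the lookahead version. The helper name relativeValue and the example.com URL are made up for illustration, and a capture group was added around the value so it can be read back.

```javascript
// [^(http)] is a character class: it rejects any ONE of the characters ( h t p ).
console.log(/^[^(http)]/.test("thisname.jpg")); // false: starts with "t"
console.log(/^[^(http)]/.test("somename.jpg")); // true:  "s" is not in the class

// Negative lookahead: the value must not start with http:// or https://.
const relativeAttr = /(?:href|src)="((?!https?:\/\/)[^"]+)"/;

// Hypothetical helper: returns the attribute value, or null if nothing matches.
function relativeValue(html) {
  const m = html.match(relativeAttr);
  return m ? m[1] : null;
}

console.log(relativeValue('<img href="thisname.jpg">'));                // "thisname.jpg"
console.log(relativeValue('<img href="somename.jpg">'));                // "somename.jpg"
console.log(relativeValue('<img src="http://example.com/a.jpg">'));     // null
console.log(relativeValue('<img src="https://example.com/a.jpg">'));    // null
```

The lookahead consumes no characters, so [^"]+ still captures the whole value when the check passes.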
Try this:
'<img href="thisname.jpg">'.match(/(?:href|src)\=\"([^(http)]?[^(\")]*)\"/);

Javascript substring check using indexOf or search on a date string with forward slash /

I am surprised not to find any post about this, so I must be missing something very trivial. I have a small JavaScript function to check if a string matches an object's properties. Simple stuff, right? It works easily with all strings except those that contain a forward slash.
"‎04‎/‎08‎/‎2015‎".indexOf('4') // returns 2 :good
"‎04‎/‎08‎/‎2015‎".indexOf('4/') // returns -1 :why?
The same issue appears to be with .search() function as well. I encountered this issue while working on date strings.
Please note that I don't want to use regex based solution for performance reasons. Thanks for your help in advance!
Your string has invisible Unicode characters in it. The "left-to-right mark" (hex 200E) appears around the two slash characters as well as at the beginning and the end of the string.
If you type the code in on your browser console instead of cutting and pasting, you'll see that it works as expected.
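The effect can be reproduced without copy and paste by writing the left-to-right mark as an explicit \u200E escape. This is a sketch: the variable names are made up, and the split/join cleanup is just one regex-free way to strip the marks, chosen because the question asks to avoid regular expressions.

```javascript
const LRM = "\u200E"; // left-to-right mark, invisible when printed

// Rebuild the date string as it appears in the question, marks included.
const pasted = LRM + "04" + LRM + "/" + LRM + "08" + LRM + "/" + LRM + "2015" + LRM;

console.log(pasted.length);        // 16, not 10: six invisible characters
console.log(pasted.indexOf("4"));  // 2
console.log(pasted.indexOf("4/")); // -1: a mark sits between "4" and "/"

// Strip the marks without a regex, then search again.
const cleaned = pasted.split(LRM).join("");
console.log(cleaned);               // "04/08/2015"
console.log(cleaned.indexOf("4/")); // 1
```

Date strings produced by locale-aware formatting in some browsers can carry such marks, so cleaning the string before searching is a reasonable precaution.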

javascript regexp to match path depth

Been struggling for the last hour to try and get this regexp to work but cannot seem to crack it.
It must be a regexp and I cannot use split etc as it is part of a bigger regexp that searches for numerous other strings using .test().
(public\/css.*[!\/]?)
public/css/somefile.css
public/css/somepath/somefile.css
public/css/somepath/anotherpath/somefile.css
Here I am trying to look for path starting with public/css followed by any character except for another forward slash.
so "public/css/somefile.css" should match but the other 2 should not.
A better solution may be to somehow specify the number of levels to match after the prefix using something like
(public\/css\/{1,2}.*)
but I can't seem to figure that out either, some help with this would be appreciated.
edit
No idea why this question has been marked down twice, I have clearly stated the requirement with sample code and test cases and also attempted to solve the issue, why is it being marked down ?
You can use this regex:
/^(public\/css\/[^\/]*?)$/gm
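^: Start of the line (because of the m flag)
[^\/]: Any character except /
*?: Zero or more of the preceding token, as few as possible (lazy)
$: End of the line (because of the m flag)
g: Global flag
m: Multi-line flag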
Something like this?
/public\/css\/[^\/]+$/
This will match
public/css/[Any characters except for /]$
$ is matching the end of the string in regex.
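Here is a short sketch that checks the idea against the three paths from the question. The leading ^ anchor is an addition here, so that the prefix must sit at the start of the string; drop it if the path can appear after other text.

```javascript
// After "public/css/" allow only characters that are not "/", up to the end.
const oneLevel = /^public\/css\/[^\/]+$/;

const paths = [
  "public/css/somefile.css",
  "public/css/somepath/somefile.css",
  "public/css/somepath/anotherpath/somefile.css",
];

const matches = paths.map((p) => oneLevel.test(p));
console.log(matches); // [true, false, false]
```

Since the pattern has no g flag, it can be used with .test() repeatedly without lastIndex side effects.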

URL RegExp WITHOUT http:// or www

I'm trying to construct URL RegExp. The base expression looks like:
/^(((http(?:s)?\:\/\/)|www\.)[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$/
It looks good to me, because it matches these:
http://gmail.com
http://www.gmail.com
www.gmail.com
But I would like to modify it to match this as well:
gmail.com
I will appreciate any help.
Just add a ? after the group that matches http:// or www. to make the prefix optional; then it will match gmail.com as well.
Use this:
^(((http(?:s)?\:\/\/)|www\.)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$
or if you want to match only gmail.com and not http://gmail.com in that case use this :
^([a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$
Please note that this will match any string that has dots and letters in it.
IMO you would be better off using a regex like this:
^(http:\/\/|www\.)?[\w\.]+\.(com|net|co\.cc|co\.in)$
You can modify it according to your needs.
Check out a demo here and play around with the regex:
http://regex101.com/r/tS4aB3
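As a sanity check, this sketch runs the first pattern from this answer (the original expression with ? added after the prefix group) against the URLs from the question. The name urlPattern and the bare "gmail" test string are made up for illustration.

```javascript
// The question's expression with "?" added after the (http(s)://|www.) group.
const urlPattern = /^(((http(?:s)?\:\/\/)|www\.)?[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$/;

const urls = [
  "http://gmail.com",
  "http://www.gmail.com",
  "www.gmail.com",
  "gmail.com",
  "gmail", // hypothetical: no dot and no TLD, should be rejected
];

const urlResults = urls.map((u) => urlPattern.test(u));
console.log(urlResults); // [true, true, true, true, false]
```

As the answer warns, the optional prefix means almost any dotted word now passes, so this is a loose filter rather than real URL validation.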
The easiest way is to treat 'www' as just another subdomain (because that's all it is).
So:
/^(((http(?:s)?\:\/\/))?([a-zA-Z0-9\-]+\.?)+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+((\.[a-zA-Z]{2,4})?)(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$/
Edit: as a side note, the tld (i.e. the ".com" part) is... quite complicated these days. There are a lot of them, and they may not fit easily in 2-6 chars.
