I use this regex to parse URLs:
/^(((http|https):\/\/)+[www.])?+\s*\S+\s*+(.com|.es|.net|.org|.co)$/ig
It works perfectly on https://regex101.com/r/bX5oM4/1
But in my console I keep getting this error:
SyntaxError: Invalid regular expression: /^(((http|https):\/\/)+[www\.])?+\s*\S+\s*+(\.com|\.es|\.net|\.org|\.co)$/: Nothing to repeat
I tried escaping the + but it doesn't work. I'm kind of new to regex, so it could be anything.
Here is your fixed regex:
^(?:https?:\/\/www\.)?[a-zA-Z0-9]\S+(\.(?:com|es|net|org|co))$
Or, to match the strings inside larger strings:
\b(?:https?:\/\/www\.)?[a-zA-Z0-9]\S+(?:\.(?:com|es|net|org|co))\b
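In JavaScript, you cannot apply a + quantifier to a ? or * quantifier: possessive quantifiers such as ?+ and *+ are not supported, which is what triggers the "Nothing to repeat" error.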
Also, note that [www.] matches 1 character, either w or . since it is a character class. You must have meant a group, and thus you need round brackets, not square ones.
I removed unnecessary groups, regrouped them a bit and escaped the dots. Note that unescaped dot matches any character but a newline.
So, the regex:
^ - Asserts the position at the start of the string
(?:https?:\/\/www\.)? - Optionally matches http or https then //www. literally
[a-zA-Z0-9]\S+ - 1 alphanumeric character and then 1 or more non-whitespace characters
(\.(?:com|es|net|org|co)) - Matches a dot and then any of the alternatives in the round brackets
$ - Asserts end of string
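To check these points outside regex101, here is a minimal sketch in JavaScript. The sample inputs and variable names are made up for illustration and are not part of the question.

```javascript
// Fixed pattern from the answer above.
const urlPattern = /^(?:https?:\/\/www\.)?[a-zA-Z0-9]\S+(\.(?:com|es|net|org|co))$/;

// Hypothetical sample inputs, not taken from the question.
const samples = ["https://www.example.com", "example.org", "example.io", "-bad.com"];
const results = samples.map((s) => urlPattern.test(s));
console.log(results); // true, true, false, false

// [www.] is a character class: it matches exactly one character, w or a dot.
const classMatchesOneChar = /^[www.]$/.test("w") && !/^[www.]$/.test("www.");
console.log(classMatchesOneChar); // true

// A + placed directly after ? is rejected when the pattern is compiled.
let possessiveError = null;
try {
  new RegExp("a?+");
} catch (e) {
  possessiveError = e;
}
console.log(possessiveError instanceof SyntaxError); // true
```

Building the faulty pattern with new RegExp inside try/catch keeps the script itself parseable, which a regex literal containing ?+ would not.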
Try this (update!)
^((http|https):\/\/)?([\w]+[.-]?)+\.(com|es|net|org|co|uk|de)$
instead of
/^(((http|https):\/\/)+[www.])?+\s*\S+\s*+(.com|.es|.net|.org|.co)$/ig
You had an extra + behind a ? and another one behind a *. And several other things were not quite OK, as stribizhev pointed out quite rightly!
This regex only looks for a limited range of TLDs (e.g. French pages ending in .fr would not pass). The [www.] was syntactically wrong and also superfluous, as any domain name can have subdomains (expressed by ([\w]+[.-]?)+) and 'www.' is just one of the possible ones.
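A quick way to see the effect of the TLD list and of the subdomain group is to run the pattern against a few inputs. This is only a sketch; the host names below are invented for the test.

```javascript
// Pattern from this answer, case-insensitive.
const domainPattern = /^((http|https):\/\/)?([\w]+[.-]?)+\.(com|es|net|org|co|uk|de)$/i;

// Hypothetical inputs: nested subdomains, a hyphenated name, and an unlisted TLD.
const checks = {
  "https://sub.example.co.uk": domainPattern.test("https://sub.example.co.uk"),
  "my-site.de": domainPattern.test("my-site.de"),
  "example.fr": domainPattern.test("example.fr"),
};
console.log(checks);
```

Only the .fr entry is rejected, because fr is not in the alternation at the end of the pattern.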
I have a regex that I ended up using from one of the answers here on SO.
Basically, my regex must validate an IPv4 address with a mask.
So I ended up using the regex below:
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/([1-9]|1[0-9]|2[0-9]|3[0-2]|(((128|192|224|240|248|252|254)\.0\.0\.0)|(255\.(0|128|192|224|240|248|252|254)\.0\.0)|(255\.255\.(0|128|192|224|240|248|252|254)\.0)|(255\.255\.255\.(0|128|192|224|240|248|252|254))))
Now my challenge is to not allow 0 as the last octet of the IP, i.e.
192.168.6.10/mask is valid but 192.168.6.0/mask is invalid
So I modified the above regex to something like this:
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[1][0-9][0-9]|[1-9][0-9]|[1-9]?)/([1-9]|1[0-9]|2[0-9]|3[0-2]|(((128|192|224|240|248|252|254)\.0\.0\.0)|(255\.(0|128|192|224|240|248|252|254)\.0\.0)|(255\.255\.(0|128|192|224|240|248|252|254)\.0)|(255\.255\.255\.(0|128|192|224|240|248|252|254))))
but 192.168.6.0 is always valid when testing with Angular Validators.pattern.
Any idea where I'm going wrong?
EDIT
List of IPs and their validity:
192.168.6.6/24 is valid
192.168.6.24/24 is valid
192.168.6.0/24 is invalid
192.168.6.0/255.255.255.0 is invalid
You want to avoid matching any IP with the last octet set to 0.
You may use
ipAddress : FormControl = new FormControl('' , Validators.pattern(/^(?!(?:\d+\.){3}0(?:\/|$))(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\/(?:[1-9]|1[0-9]|2[0-9]|3[0-2]|(?:(?:128|192|224|240|248|252|254)\.0\.0\.0|255\.(?:0|128|192|224|240|248|252|254)\.0\.0|255\.255\.(?:0|128|192|224|240|248|252|254)\.0|255\.255\.255\.(?:0|128|192|224|240|248|252|254)))$/));
The main addition is the lookahead after ^ that is executed once at the start of a string. The (?!(?:\d+\.){3}0(?:\/|$)) pattern is a negative lookahead that fails the match if, immediately to the right of the current location (string start), there are:
(?:\d+\.){3} - three repetitions of 1+ digits and a dot
0 - a zero
(?:\/|$) - / or (|) end of string ($).
Notice I defined the pattern using a regex literal notation (/regex/) and I had to add ^ (string start) and $ (string end) anchors since the regex was no longer anchored by default. Also, to escape special chars in a regex literal notation, you only need one backslash, not two.
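The behaviour of the lookahead can be checked without Angular by running the same regex literal against the addresses listed in the question. This is a sketch only; the variable names are hypothetical and the last input is an extra case added for illustration.

```javascript
// Same pattern as passed to Validators.pattern above, as a plain regex literal.
const ipWithMask = /^(?!(?:\d+\.){3}0(?:\/|$))(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\/(?:[1-9]|1[0-9]|2[0-9]|3[0-2]|(?:(?:128|192|224|240|248|252|254)\.0\.0\.0|255\.(?:0|128|192|224|240|248|252|254)\.0\.0|255\.255\.(?:0|128|192|224|240|248|252|254)\.0|255\.255\.255\.(?:0|128|192|224|240|248|252|254)))$/;

const verdicts = [
  "192.168.6.6/24",             // valid
  "192.168.6.24/24",            // valid
  "192.168.6.0/24",             // invalid: last octet is 0
  "192.168.6.0/255.255.255.0",  // invalid: last octet is 0
  "192.168.6.10/255.255.255.0", // valid (extra case, not from the question)
].map((ip) => ipWithMask.test(ip));

console.log(verdicts); // true, true, false, false, true
```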
Suppose that the last part cannot be written as 000 or 00, but only as 0. Then you can use a regex such as
^(?:(?:2(?:5[0-5]|[0-4]\d)|1?\d?\d)\.){3}(?:(?:2(?:5[0-5]|[0-4]\d)|1?\d\d|[1-9]))$
The difference between the first three groups and the last one is that a one-digit value in the last group must be from 1 to 9.
You can try this pattern:
^(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.(?:[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$
For the last octet, the check is
(?:[1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])
which accepts 1 to 255 and rejects a lone 0.
One possible approach here is simple, and just involves adding a negative lookbehind, (?<!\.0), right after the last octet of the address and before the /. It asserts that .0 is not the immediately preceding term in the IP address. Applying this to your correctly working pattern from the comments above, we get:
^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?<!\.0)\/
([1-9]|1[0-9]|2[0-9]|3[0-2]|(((128|192|224|240|248|252|254)\.0\.0\.0)|
(255\.(0|128|192|224|240|248|252|254)\.0\.0)|
(255\.255\.(0|128|192|224|240|248|252|254)\.0)|
(255\.255\.255\.(0|128|192|224|240|248|252|254))))$
The downside is that your JavaScript engine may not support negative lookbehind syntax yet.
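Since lookbehind support depends on the engine, one option is to probe for it at runtime and fall back to a lookahead. The sketch below is an assumption-laden illustration: both function names are hypothetical, and the check is deliberately reduced to the address part only (no mask) to keep it short.

```javascript
// Returns true when the current engine can compile a negative lookbehind.
// The pattern is built from a string so that older engines can still parse this file.
function supportsLookbehind() {
  try {
    new RegExp("(?<!\\.0)x");
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical simplified check: reject address strings whose last octet is a lone 0.
function lastOctetIsNotZero(address) {
  if (supportsLookbehind()) {
    return new RegExp("^[\\d.]+(?<!\\.0)$").test(address);
  }
  // Fallback without lookbehind: a negative lookahead anchored at the start.
  return /^(?![\d.]*\.0$)[\d.]+$/.test(address);
}

console.log(lastOctetIsNotZero("192.168.6.0"));  // false
console.log(lastOctetIsNotZero("192.168.6.10")); // true
```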
I am facing an issue with a regular expression while trying to block any string of whitelisted characters that has a minus (-) at the beginning.
^(?!-.*$).([a-zA-Z0-9-:#\\,()\\/\\.]+)$
It blocks a minus (-) in the first position and allows it anywhere else in the character sequence, but the regex does not work if the passed string is a single character, e.g. A or 9.
Please help me out with this or give me a good regex to do the task.
Your pattern requires at least 2 chars in the input string because there is a dot after the first lookahead and then a character class follows that has + after it (that is, at least 1 occurrence must be present in the string).
So, you need to remove the dot. Also, you do not need to escape any special char inside a character class. Besides, to avoid matching strings starting with -, a mere (?!-) will suffice; there is no need to add .*$ there. You may use
^(?!-)[a-zA-Z0-9:#,()/.-]+$
Remember to escape / if the pattern is used in a regex literal in JavaScript; there is no need to escape it in constructor notation or in a Java regex pattern.
Details
^ - start of a string
(?!-) - cannot start with -
[a-zA-Z0-9:#,()/.-]+ - 1 or more ASCII letters, digits and special chars defined in the character class (:, #, ,, (, ), /, ., -)
$ - end of string.
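As a rough check of the corrected pattern, the sketch below runs it in JavaScript, where the / has to be escaped inside the regex literal. The test strings are invented for illustration.

```javascript
// Corrected pattern; / is escaped only because this is a regex literal.
const allowedInput = /^(?!-)[a-zA-Z0-9:#,()\/.-]+$/;

// Hypothetical inputs: single characters, a hyphen in the middle,
// a leading hyphen, and a character outside the whitelist (space).
const inputs = ["A", "9", "a-b", "-a", "a b"];
const accepted = inputs.map((s) => allowedInput.test(s));

console.log(accepted); // true, true, true, false, false
```

Single-character strings now pass because nothing but the character class consumes input; the lookahead itself matches no characters.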
If I understand correctly, and you don't want a minus at the beginning, does ^[^-].* work as a regex for you? Java's "matches" would return false if the string starts with a minus.
The String class has a method that does exactly what you are asking for: startsWith(). You could use it like this (read it as "if the given String doesn't start with -, do something; otherwise run the else part, which can contain some code or be left empty if nothing should happen"):
if (!yourString.startsWith("-")) {
    doSomething();
} else {
    doNothingOrProvideAnyInformationAboutWrongInput();
}
I think that it can help you.
^(?!-).*[a-zA-Z0-9-:#\\,()\/\\.]+$
I saw the other posts but none of them helped me...
So, I tried to match URLs in a string in JavaScript with a regex. It works perfectly on regex101 but fails in JavaScript.
var matches = feed.content.match(
'/((http|https|ftp):\/\/([a-zA-Z0-9\.\-\_\%]+\/?){1}([a-zA-Z0-9\.\-\_]+\/?)*(\?[a-zA-Z0-9\.\-\_\%\+\=\&\:]*)*)/ig'
);
And Firebug returns:
SyntaxError: invalid quantifier
Can you please help me?
As pointed out in the comments, you should remove the single quotes enclosing the regex. As well as that, I would propose making a few changes to the expression itself:
((https?|ftp):\/\/([\w.%-]+\/?)([\w.-]+\/?)*(\?[\w.%+=&:-]*)*)
The ? after the s means that it is optional, so http and https will both match. \w is the word character class, so that covers A-Za-z0-9_ much more concisely. There's no need to escape all the symbols, but a useful trick is to put the - at the end of the character class, so that it isn't interpreted as a range between two characters. The {1} isn't necessary as that's the default behaviour.
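Here is a small sketch of the simplified expression in use, written as a regex literal with the i and g flags from the question. The text being searched is a made-up stand-in for feed.content.

```javascript
// Simplified pattern from this answer, as a regex literal.
const urlRegex = /((https?|ftp):\/\/([\w.%-]+\/?)([\w.-]+\/?)*(\?[\w.%+=&:-]*)*)/ig;

// Hypothetical feed content.
const feedText = "Visit http://example.com/a/b?x=1&y=2 and ftp://files.example.org/pub now";

// With the g flag, match() returns every full match and ignores the capture groups.
const found = feedText.match(urlRegex);
console.log(found);
// "http://example.com/a/b?x=1&y=2" and "ftp://files.example.org/pub"
```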
You're passing the regex as a string - just get rid of the outer quotes.
var matches = feed.content.match(
/((http|https|ftp):\/\/([a-zA-Z0-9\.\-\_\%]+\/?){1}([a-zA-Z0-9\.\-\_]+\/?)*(\?[a-zA-Z0-9\.\-\_\%\+\=\&\:]*)*)/ig
);
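To see why the quotes matter, the sketch below compares a string argument with a regex literal. When match() receives a string, it compiles it with new RegExp(string), so the slashes and the trailing flags become pattern text, and backslash escapes such as \? are consumed by the string literal before the regex engine sees them. The sample text and patterns are shortened, hypothetical versions of the ones in the question.

```javascript
const sampleText = "see http://example.com/path?x=1 for details"; // hypothetical input

// String argument: "/" and "ig" are treated as literal pattern text,
// so nothing in the sample matches.
const fromString = sampleText.match("/(http|https):/ig");
console.log(fromString); // null

// Inside a string literal "\?" collapses to "?", so "(\?" reaches the regex
// engine as "(?" and the pattern fails to compile.
let stringError = null;
try {
  sampleText.match('(\?[a-z]*)');
} catch (e) {
  stringError = e;
}
console.log(stringError instanceof SyntaxError); // true

// Regex literal: delimiters, escapes and flags are parsed as intended.
const fromLiteral = sampleText.match(/(http|https|ftp):\/\/\S+/ig);
console.log(fromLiteral); // one match: "http://example.com/path?x=1"
```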
I have the following regular expression in a validation rule:
^[a-zA-Z0-9',!;?~>+&\"\-@#%*.\s]{1,1000}$
However, I can enter ====== which I believe should not be allowed.
My thought is that somehow the - could cause trouble if not properly escaped or something, but this is way over my head.
The regex you've shown us, with the - escaped, does not accept ===. But if the - is not escaped, === will be accepted.
A - inside a regex is special and is used as range operator if it's not escaped and is surrounded by characters which participate as min and max in the range:
[a-z] matches any lowercase character.
[-az] matches either a - or a or z.
[az-] matches either a - or a or z.
[a\-z] matches either a - or a or z.
[a-c-d-f] matches a or b or c or - or d or e or f. The first and last - act as range operator but the one in the middle is treated literally.
In your case, with the - unescaped, the = falls within the range "-@ and hence gets matched.
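The rules listed above can be verified with a few short character classes. The last pair is an illustrative class, not the full pattern from the question, and it assumes that the unescaped hyphen sits between " and @, so the range runs from 0x22 to 0x40 and contains = (0x3D).

```javascript
// [a-c-d-f]: two ranges (a-c and d-f) plus a literal hyphen in the middle.
const mixed = /^[a-c-d-f]$/;
console.log(mixed.test("-"), mixed.test("e"), mixed.test("g")); // true true false

// A hyphen placed first (or last, or escaped) is literal: only -, a and z match.
const literalFirst = /^[-az]$/;
console.log(literalFirst.test("-"), literalFirst.test("b")); // true false

// Illustrative classes for the = problem (assumed range: " to @).
const escapedHyphen = /^[&"\-@#]+$/;  // only & " - @ # are allowed
const unescapedHyphen = /^[&"-@#]+$/; // & plus everything from " to @, plus #

console.log(escapedHyphen.test("==="));   // false
console.log(unescapedHyphen.test("===")); // true
```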
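Outside a character class, an unescaped . matches any character, and there you would want \. instead. Inside [...], as in your pattern, the dot is already literal, so it is not what lets = through.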
The - will be interpreted as a range indicator. You need to put it either first or last within the [] brackets if you want to match a literal -.
Your regex works fine for me, but if I remove the escaping of the - it matches =. I suspect that is what is happening on your side.
We have been using the following js/regex to find and replace all non-alphanumeric characters apart from - and +
outputString = outputString.replace(/[^\w|^\+|^-]*/g, "");
However it doesn't work entirely - it doesn't replace the ^ and | characters. I can't help but wonder if this is something to do with the ^ and | being used as meta-characters in the regex itself.
I've tried switching to use [\W|^+|^-], but that replaces the - and +. I thought that possibly a lookahead assertion may be the answer, but I'm not very sure how to implement them.
Has anyone got an idea how to accomplish this?
Character classes do not do alternation, hence why the | is literal, and the ^ must be at the start of the class to take effect (otherwise it's treated literally.)
Use this:
[^\w+-]+
(Also, if - is not first or last, it needs to be escaped as \- inside a character class, so be careful if more characters might be added to the exception list.)
You could also do it with a negative lookahead like this:
(?![+-])\W
Note: You do not want a * or + after that \W, since the lookahead only applies to the immediately following character (and the g flag makes the replace repeat until done).
Also note that \w and \W consider _ as a word character. If that's not desired then to replace that you can use (?![+-])[\W_] (or use explicit ranges in the first expressions).
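The difference between the original class and the suggested expressions can be seen side by side in the sketch below. The input string is a made-up example that contains every character under discussion.

```javascript
const input = "a^b|c+d-e_f g!"; // hypothetical input

// Original attempt: | and ^ inside the class are literals, so they are kept.
const original = input.replace(/[^\w|^\+|^-]*/g, "");
console.log(original); // "a^b|c+d-e_fg"

// Negated class: strip everything except word characters, + and -.
const viaClass = input.replace(/[^\w+-]+/g, "");
console.log(viaClass); // "abc+d-e_fg"

// Lookahead variant: any non-word character that is not + or -.
const viaLookahead = input.replace(/(?![+-])\W/g, "");
console.log(viaLookahead); // "abc+d-e_fg"

// Variant that also removes the underscore.
const withoutUnderscore = input.replace(/(?![+-])[\W_]/g, "");
console.log(withoutUnderscore); // "abc+d-efg"
```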