I'm creating a regex for URL validation. I have managed to validate the URL as I need, but I have one more requirement: after the domain name (https://asasas.com/), special characters should not be allowed to appear consecutively. How can I restrict that?
My regex
Part 1 : (https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})
Part 2: [/]+[a-zA-Z0-9~##$!^%*;'&()<>_+=[]{}|\,.?: -]+.(?:jpg$|gif$|png$|jpeg$)/`
Part 1 validates the URL, and Part 2 validates that the string ends with .JPG, .PNG, .JPEG or .GIF.
My requirement is that it should not allow the following:
1) https://ssas.com//////////////////////////sds.png
2) https://ssas.com/sds//##$%/j^&&*///.png
Success case:
Each special character must be adjacent to a letter or number rather than to another special character:
1) https://ssas.com/sds/sdsd/s#df^ggasa/dadsa.png
Instead of validating the whole URL with a regex, you could first validate the URL using the URL constructor.
Then you could check that the protocol starts with http, and use a pattern to check that the string after the origin ends with one of the allowed extensions and does not contain two consecutive characters that you consider special.
For a case-insensitive match, you can use the /i flag.
The pattern consists of two (non-consuming) assertions:
^(?=.*\.(?:jpe?g|png|gif)$)(?!.*[/~##$!^%*;'&()<>_+=\[\]{}|\,.?:-][/~##$!^%*;'&()<>_+=\[\]{}|\,.?:-])
In parts
^ Start of string
(?= Positive lookahead, assert what is on the right is
.*\.(?:jpe?g|png|gif)$ Match any of the listed extensions at the end of the string ($)
) Close lookahead
(?! Negative lookahead, assert what is on the right is not
.* Match any char except a newline 0+ times
[/~##$!^%*;'&()<>_+=\[\]{}|\,.?:-] Match any of the listed
[/~##$!^%*;'&()<>_+=\[\]{}|\,.?:-] Same as above
) Close lookahead
[
  "https://ssas.com//////////////////////////sds.png",
  "https://ssas.com/sds//##$%/j^&&*///.png",
  "https://ssas.com/sds/sdsd/s#df^ggasa/dadsa.png"
].forEach(s => {
  let url = new URL(s);
  let secondPart = url.href.replace(url.origin, '');
  let pattern = /^(?=.*\.(?:jpe?g|png|gif)$)(?!.*[/~##$!^%*;'&()<>_+=\[\]{}|,.?:-][/~##$!^%*;'&()<>_+=\[\]{}|,.?:-])/i;
  if (
    url.protocol.startsWith("http") &&
    pattern.test(secondPart)) {
    console.log(s);
  }
})
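One thing the snippet above does not handle is input that new URL cannot parse: the constructor throws a TypeError in that case. Below is a minimal sketch of a reusable check. The helper name isAllowedImageUrl is hypothetical, and two details are my own choices rather than part of the answer: the repeated character class is collapsed into a {2} quantifier, and the part after the origin is rebuilt from pathname, search and hash.

```javascript
// Two assertions: the string ends with an allowed extension, and it contains
// no two consecutive characters from the "special" set.
const IMAGE_PATH = /^(?=.*\.(?:jpe?g|png|gif)$)(?!.*[\/~#$!^%*;'&()<>_+=\[\]{}|,.?:-]{2})/i;

// Hypothetical helper name, used for illustration only.
function isAllowedImageUrl(input) {
  let url;
  try {
    url = new URL(input); // throws a TypeError on unparsable input
  } catch (err) {
    return false;
  }
  // Everything after the origin: path, query string and fragment.
  const rest = url.pathname + url.search + url.hash;
  return url.protocol.startsWith("http") && IMAGE_PATH.test(rest);
}

console.log(isAllowedImageUrl("https://ssas.com/sds/sdsd/s#df^ggasa/dadsa.png")); // true
console.log(isAllowedImageUrl("https://ssas.com//////////////////////////sds.png")); // false
console.log(isAllowedImageUrl("not a url")); // false
```

Returning false instead of throwing keeps the helper usable directly as a form-validation predicate.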
I am trying to validate an AWS ARN for a Connect instance, but I am stuck on creating the correct regex.
Below is the string that I want to validate.
arn:aws:connect:us-west-2:123456789011:instance/0533yu22-d4cb-410a-81da-6c9hjhjucec4b9
I want to create a regex that checks the following format:
arn:aws:connect:<region_name>:<12 digit account id>:instance/<an alphanumeric instance id>
Can someone please help?
I tried the following:
^arn:aws:connect:\S+:\d+:instance\/\S+\/queue\/\S+$
There is no /queue/ substring in your example string, and \S+ matches any non-whitespace character, so it first overshoots and then has to backtrack to match the rest of the pattern.
You might update your pattern to ^arn:aws:connect:\S+:\d+:instance\/\S+$ but that will be less precise than the checks you listed.
A bit more precise pattern could be:
^arn:aws:connect:\w+(?:-\w+)+:\d{12}:instance\/[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+$
^ Start of string
arn:aws:connect: Match literally
\w+(?:-\w+)+: Match 1+ word characters and repeat matching - and 1+ word characters and then match :
\d{12}: Match 12 digits and :
instance\/ Match instance/
[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+ Match 1+ alphanumerics, then repeat 1+ times: - and 1+ alphanumerics
$ End of string
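A short sketch of how the more precise pattern could be used in JavaScript. The helper name isConnectInstanceArn is hypothetical; the regex is the one broken down above.

```javascript
// Pattern from the breakdown above.
const CONNECT_INSTANCE_ARN =
  /^arn:aws:connect:\w+(?:-\w+)+:\d{12}:instance\/[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+$/;

// Hypothetical helper name, used for illustration only.
function isConnectInstanceArn(arn) {
  return CONNECT_INSTANCE_ARN.test(arn);
}

const example =
  "arn:aws:connect:us-west-2:123456789011:instance/0533yu22-d4cb-410a-81da-6c9hjhjucec4b9";
console.log(isConnectInstanceArn(example)); // true
console.log(isConnectInstanceArn(example + "/queue/abc")); // false, "/" is not allowed in the instance id
```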
You need some capture groups to facilitate this. Here I've also used named capture groups for ease of understanding.
const string = "arn:aws:connect:us-west-2:123456789011:instance/0533yu22-d4cb-410a-81da-6c9hjhjucec4b9";
// Regex broken down into parts
const parts = [
  '^arn:aws:connect:', // anchored at the start of the string
  '(?<region_name>[^:]+?)', // group 1
  ':',
  '(?<account_id>\\d{12})', // group 2
  ':instance\\/',
  '(?<instance_id>[A-Za-z0-9\\-]+?)', // group 3 ([A-z] would also match [ \ ] ^ _ `)
  '$'
];
// Joined parts into regex expression
const regex = new RegExp(parts.join(''));
// Execute query and assign group values to variables
const { region_name, account_id, instance_id } = regex.exec(string).groups;
console.log("region_name:", region_name);
console.log("account_id:", account_id);
console.log("instance_id:", instance_id);
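Note that regex.exec returns null when the string does not match, so destructuring .groups directly throws on invalid input. A minimal sketch of a guarded version follows; the function name parseConnectArn is hypothetical, and the regex is written as a single anchored literal for brevity.

```javascript
// Named capture groups for the three variable parts of the ARN.
const ARN_PARTS =
  /^arn:aws:connect:(?<region_name>[^:]+):(?<account_id>\d{12}):instance\/(?<instance_id>[A-Za-z0-9-]+)$/;

// Hypothetical helper: returns the named groups, or null when the input does not match.
function parseConnectArn(arn) {
  const match = ARN_PARTS.exec(arn);
  return match ? { ...match.groups } : null;
}

console.log(parseConnectArn(
  "arn:aws:connect:us-west-2:123456789011:instance/0533yu22-d4cb-410a-81da-6c9hjhjucec4b9"
));
console.log(parseConnectArn("not an arn")); // null
```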
My colleague and I are trying to build a regex (JavaScript) to validate an input field for a specific format.
The field should be a comma-separated list of port declarations and could look like this:
TCP/53,UDP/53,TCP/10-20,UDP/20-30
We tried this regex:
/^[TCP/\d+,|UDP/\d+,|TCP/\d+\-\d+,|UDP/\d+\-\d+,]*[TCP/\d+|UDP/\d+|TCP/\d+\-\d+|UDP/\d+\-\d+]$/g
The regex matches, but it also matches other strings, such as this one:
TCP/53UDP53,TCP/10-20UDP20-30
Thanks for any guidance!
You don't need all those alternations, and square brackets [ ] denote a character class; they are not used for grouping like that. You can also make the - and digits part optional using a group: (?:...)?
To match that string format:
^(?:TCP|UDP)\/\d+(?:-\d+)?(?:,(?:TCP|UDP)\/\d+(?:-\d+)?)*$
The pattern matches:
^ Start of string
(?:TCP|UDP) Match one of the alternatives
\/\d+(?:-\d+)? Match / 1+ digits and optionally - and 1+ digits
(?: Non capture group to repeat as a whole part
,(?:TCP|UDP)\/\d+(?:-\d+)? Match a , and repeat the same pattern
)* Close non capture group and optionally repeat (If there should be at least 1 comma, change the * to +)
$ End of string
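A quick sketch of the pattern in use (the helper name isPortList is hypothetical). The g flag from the question is left out on purpose: a global regex keeps state in lastIndex, so repeated test calls on the same regex object can give alternating results.

```javascript
// Pattern from the breakdown above, without the g flag.
const PORT_LIST = /^(?:TCP|UDP)\/\d+(?:-\d+)?(?:,(?:TCP|UDP)\/\d+(?:-\d+)?)*$/;

// Hypothetical helper name, used for illustration only.
function isPortList(value) {
  return PORT_LIST.test(value);
}

console.log(isPortList("TCP/53,UDP/53,TCP/10-20,UDP/20-30")); // true
console.log(isPortList("TCP/53UDP53,TCP/10-20UDP20-30")); // false
```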
Alternative: split up the string, use Array.filter and a relatively simple RegExp for testing.
const valid = `TCP/53,UDP/53,TCP/10-20,UDP/20-30`;
const invalid = `TCP/53UDP53,TCP/10-20UDP20-30`;
console.log(`${valid} ok? ${checkInp(valid)}`);
console.log(`${invalid} ok? ${checkInp(invalid)}`);
function checkInp(str) {
  return str.split(`,`)
    // (?:-\d+)? allows at most one "-<digits>" range suffix per entry
    .filter(v => /^(TCP|UDP)\/\d+(?:-\d+)?$/.test(v))
    .join(`,`)
    .length === str.length;
}
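The same idea can be expressed with Array.prototype.every, which stops at the first invalid entry and avoids the join/length comparison. This is a sketch only; the function name checkPorts is hypothetical.

```javascript
// One entry: protocol, slash, port, optional "-port" range.
const PORT_ENTRY = /^(?:TCP|UDP)\/\d+(?:-\d+)?$/;

// Hypothetical helper name, used for illustration only.
function checkPorts(str) {
  return str.split(",").every((entry) => PORT_ENTRY.test(entry));
}

console.log(checkPorts("TCP/53,UDP/53,TCP/10-20,UDP/20-30")); // true
console.log(checkPorts("TCP/53UDP53,TCP/10-20UDP20-30")); // false
```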
In the URLs
https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e--wedding-vendors-wedding-receptions.jpg
https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e.jpg
I'm trying to capture 5b026cdb06921e7ca5f7a24aff46512e in both of these strings. The string will always appear after the last slash, it will be a random assortment of letters and numbers, it may or may not have --randomtext appended, and it will have .jpg at the end.
I currently have ([^\/]+)$ to extract any string after the last slash, but would like to know how to capture everything before .jpg and --randomtext (if present). I will be using this in JavaScript.
If what comes after the last forward slash is a random assortment of letters and numbers (a-z0-9), one option is to use a capturing group.
^.*\/([a-z0-9]+).*\.jpg$
In parts
^ Start of string
.*\/ Match until including the last /
([a-z0-9]+) Capture in group 1 matching 1+ chars a-z or digits 0-9
.* Match any char except a newline 0+ times
\.jpg Match .jpg
$ End of string
const regex = /^.*\/([a-z0-9]+).*\.jpg$/;
["https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e--wedding-vendors-wedding-receptions.jpg",
"https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e.jpg"
].forEach(s => console.log(s.match(regex)[1]));
You can split on / and take the last part, then strip everything from -- onward (or the trailing .jpg) by replacing it with an empty string.
let arr = [
  "https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e--wedding-vendors-wedding-receptions.jpg",
  "https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e.jpg"
]
let getText = (url) => {
  return url.split('/').pop().replace(/(--.*|\.jpg)$/g, '')
}
arr.forEach(url => console.log(getText(url)))
If -- can occur more than once, then instead of replacing you can simply match the leading ID with match(/^[a-z0-9]+/g) and take the first element of the matched array.
Use:
([^\/]*?)(?:--.*)?\.jpg$
and your desired match will be in $1
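A minimal sketch of that pattern in JavaScript, where the first capture group is read from index 1 of the match result. The helper name extractImageId is hypothetical.

```javascript
// Lazily capture the file name up to an optional "--suffix", then require .jpg at the end.
const IMAGE_ID = /([^\/]*?)(?:--.*)?\.jpg$/;

// Hypothetical helper: returns the captured id, or null when the URL does not end in .jpg.
function extractImageId(url) {
  const match = IMAGE_ID.exec(url);
  return match ? match[1] : null;
}

console.log(extractImageId(
  "https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e--wedding-vendors-wedding-receptions.jpg"
)); // 5b026cdb06921e7ca5f7a24aff46512e
console.log(extractImageId(
  "https://image/4x/c1/abc/5b026cdb06921e7ca5f7a24aff46512e.jpg"
)); // 5b026cdb06921e7ca5f7a24aff46512e
```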
https://regex101.com/r/gZ9kSi/1
I'm trying to create a regex using javascript that will allow names like abc-def but will not allow abc-
(hyphen is also the only nonalpha character allowed)
The name has to be a minimum of 2 characters. I started with
^[a-zA-Z-]{2,}$, but it's not good enough so I'm trying something like this
^([A-Za-z]{2,})+(-[A-Za-z]+)*$.
It can have more than one - in a name but it should never start or finish with -.
It's allowing names like xx-x but not names like x-x. I'd like to achieve that x-x is also accepted but not x-.
Thanks!
Option 1
This option matches strings that begin and end with a letter and ensures that two - are never consecutive, so a string like a--a is invalid. To allow that case, see Option 2.
^[a-z]+(?:-?[a-z]+)+$
^ Assert position at the start of the line
[a-z]+ Match any lowercase ASCII letter one or more times (with i flag this also matches uppercase variants)
(?:-?[a-z]+)+ Match the following one or more times
-? Optionally match -
[a-z]+ Match any ASCII letter (with i flag)
$ Assert position at the end of the line
var a = [
  "aa", "a-a", "a-a-a", "aa-aa-aa", "aa-a", // valid
  "aa-a-", "a", "a-", "-a", "a--a" // invalid
]
var r = /^[a-z]+(?:-?[a-z]+)+$/i
a.forEach(function(s) {
  console.log(`${s}: ${r.test(s)}`)
})
Option 2
If you want to match strings like a--a then you can instead use the following regex:
^[a-z]+[a-z-]*[a-z]+$
var a = [
  "aa", "a-a", "a-a-a", "aa-aa-aa", "aa-a", "a--a", // valid
  "aa-a-", "a", "a-", "-a" // invalid
]
var r = /^[a-z]+[a-z-]*[a-z]+$/i
a.forEach(function(s) {
  console.log(`${s}: ${r.test(s)}`)
})
You can use a negative lookahead:
/(?!.*-$)^[a-z][a-z-]+$/i
Breakdown:
// Negative lookahead so that it can't end with a -
(?!.*-$)
// The actual string must begin with a letter a-z
[a-z]
// Any following characters can be a-z or -, there must be at least 1 of these
[a-z-]+
let regex = /(?!.*-$)^[a-z][a-z-]+$/i;
let test = [
  'xx-x',
  'x-x',
  'x-x-x',
  'x-',
  'x-x-x-',
  '-x',
  'x'
];
test.forEach(string => {
  console.log(string, ':', regex.test(string));
});
The problem is that the first group, [A-Za-z]{2,}, requires two or more letters. You need to change it to accept one or more letters:
^[A-Za-z]+((-[A-Za-z]{1,})+)?$
Edit: solved some commented issues
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-dfe'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('x-d'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-'); // Logs false
Edit 2: Edited to accept characters only
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('abc'); // Logs true
Use this if you also want to accept strings such as A---A:
^(?!-|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/4UYd9l/4/
If you don't want to accept strings such as A---A, use this:
^(?!-|.*[-]{2,}.*|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/qH4Q0q/4/
Both patterns only accept strings of at least two characters from [A-Za-z-], and the negative lookahead (?!-|.*-$) ensures that the string neither starts nor ends with -.
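To see where the two lookahead patterns differ, here is a small comparison sketch. The constant names are hypothetical; the regexes are the two given above.

```javascript
// First pattern: repeated hyphens in the middle are allowed.
const ALLOW_REPEATED = /^(?!-|.*-$)[A-Za-z-]{2,}$/;
// Second pattern: additionally rejects two or more hyphens in a row.
const NO_REPEATED = /^(?!-|.*[-]{2,}.*|.*-$)[A-Za-z-]{2,}$/;

for (const name of ["A-A", "A---A", "-A", "A-", "A"]) {
  console.log(name, ALLOW_REPEATED.test(name), NO_REPEATED.test(name));
}
// A-A true true
// A---A true false
// -A false false
// A- false false
// A false false
```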
Try this /([a-zA-Z]{1,}-[a-zA-Z]{1,})/g
I suggest the following:
^[a-zA-Z][a-zA-Z-]*[a-zA-Z]$
It validates:
that the matched string is composed of at least two characters (the first and last character classes are each matched exactly once)
that the first and the last characters aren't dashes (the first and last character classes do not include -)
that the string can contain dashes and be longer than 2 characters (the second character class includes dashes and will consume as many characters as needed, dashes included).
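The three points above can be checked against the cases from the question with a short sketch (the helper name isValidName is hypothetical):

```javascript
// A letter, then any mix of letters and dashes, then a final letter.
const NAME = /^[a-zA-Z][a-zA-Z-]*[a-zA-Z]$/;

// Hypothetical helper name, used for illustration only.
function isValidName(name) {
  return NAME.test(name);
}

console.log(isValidName("abc-def")); // true
console.log(isValidName("x-x")); // true
console.log(isValidName("abc-")); // false
console.log(isValidName("x")); // false, fewer than two characters
```

Note that this pattern accepts consecutive dashes such as a--a, since the middle class puts no restriction on repetition.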
^(?=[A-Za-z](?:-|[A-Za-z]))(?:(?:-|^)[A-Za-z]+)+$
The lookahead asserts that
the first character is a letter (a-z or A-Z)
the second is a letter or a hyphen
If the lookahead succeeds, the rest of the pattern
looks for groups of one or more letters, each prefixed by a hyphen or the start of the string, all the way to the end of the string.
You can also use the i flag to make it case-insensitive.
I need a regex to identify custom parameters.
So if I have the url config path:
'/myapp/users/:userId'
it would match [':userId']
or
'/myapp/user/:username/profile/:profileId'
I need to return [':username',':profileId']
So far I have :(.*)/? but it selects everything after the initial found parameter
http://www.regextester.com/?fam=97974
I'm weak on regex; can anyone help, please?
The :(.*)/? pattern matches the first :, then grabs the whole line greedily with .* and does not have to do anything else but return the match since /? matches an empty string (/? matches 1 or 0 / symbols).
You may use a negated character class [^\/]+:
:([^\/]+)
Details:
: - a colon
([^\/]+) - 1+ chars other than /
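To get the array the question asks for, with the leading colon included, the same character class can be used with the g flag and String.prototype.match. This is a sketch only; the function name extractParams is hypothetical.

```javascript
// Hypothetical helper: returns every ":name" segment, or an empty array when there are none.
function extractParams(path) {
  // With the g flag, match returns the full matches (or null if nothing matched).
  return path.match(/:[^\/]+/g) || [];
}

console.log(extractParams("/myapp/users/:userId")); // [ ':userId' ]
console.log(extractParams("/myapp/user/:username/profile/:profileId")); // [ ':username', ':profileId' ]
```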
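Try this pattern: /:(.*?)($|\/)/g
The first alternative, $, asserts position at the end of the string
The second alternative, \/, matches a literal / character
.*? matches any character (except for line terminators) as few times as possible
The parameter name is captured in group 1, without the leading colon.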
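With this pattern the name sits in the first capture group, so one way to collect all names is String.prototype.matchAll, which requires the g flag. A minimal sketch, with a hypothetical function name paramNames:

```javascript
// Hypothetical helper: collects group 1 (the name without the colon) from every match.
function paramNames(path) {
  return Array.from(path.matchAll(/:(.*?)($|\/)/g), (m) => m[1]);
}

console.log(paramNames("/myapp/users/:userId")); // [ 'userId' ]
console.log(paramNames("/myapp/user/:username/profile/:profileId")); // [ 'username', 'profileId' ]
```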