catastrophic backtracking error validation string - javascript

^([a-zA-Z0-9]+[._-]?)+[a-zA-Z0-9]+$
I used the regex above to validate an input string against the following rules:
Besides letters and digits, only hyphen ('-'), period ('.') and underscore ('_') are allowed
The name must not start or end with a hyphen ('-'), period ('.') or underscore ('_')
The name must not contain spaces
Two consecutive special characters (from the set ._-) are not allowed
I run the validation in JavaScript.
But when the name ends with a special character, the browser hangs instead of returning false.
var regex = new RegExp("^([a-zA-Z0-9]+[._-]?)+[a-zA-Z0-9]+$");
if (regex.test($('#txtBox1').val())) { /* success */ }

Don't make those special delimiters optional in your repeated group:
^([a-zA-Z0-9]+[._-])*[a-zA-Z0-9]+$
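# (the ? after [._-] is removed and the group's + becomes *)
That still matches the same strings, but because the delimiter is now required in every repetition of the group, the engine can no longer backtrack through alternative ways of splitting a run of alphanumerics.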
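A quick way to check this is to run the rewritten pattern against the question's rules and against a long input that ends in a special character, which is the case that hung the browser. This is a minimal sketch; the names `safeName`, `samples` and `longInput` are chosen here for illustration only.

```javascript
// Rewritten pattern: every repetition of the group must end in exactly one
// delimiter, so a run of alphanumerics can only be split one way.
const safeName = /^([a-zA-Z0-9]+[._-])*[a-zA-Z0-9]+$/;

const samples = ["abc", "a.b_c-d", "abc.", ".abc", "a..b", "a b"];
for (const s of samples) {
  console.log(JSON.stringify(s), safeName.test(s));
}

// The problematic shape: many alphanumerics followed by a trailing delimiter.
const longInput = "a".repeat(5000) + "-";
console.log(safeName.test(longInput)); // false
```

Only the first two samples are accepted. The original pattern is deliberately not run on `longInput` here, since that is the call that hangs.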

Try this as well
var isValid = !str.match(/[^\w.-]/)
&& !str.split(/[._-]/).filter( s => s.length == 0 ).length;
Explanation
str.match(/[^\w.-]/) checks whether any character is neither alphanumeric, underscore, dot, nor hyphen; the result is negated because such a character makes the input invalid.
str.split(/[._-]/) splits the input on the three characters [._-]; the filter then looks for empty strings. If one of these characters is at the beginning, at the end, or directly next to another one, the resulting array contains an empty string.
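The same two checks can be wrapped in a helper that returns true for a valid name. This is a sketch, not the answerer's code: the function name `isValidName` is hypothetical, and `every` is swapped in for the `filter(...).length` test.

```javascript
// Hypothetical helper combining the two checks described above.
function isValidName(str) {
  // Reject any character outside letters, digits, underscore, dot and hyphen.
  if (/[^\w.-]/.test(str)) return false;
  // An empty piece means a delimiter at the start, at the end, or doubled.
  return str.split(/[._-]/).every(part => part.length > 0);
}

console.log(isValidName("a.b_c-d")); // true
console.log(isValidName("a..b"));    // false
console.log(isValidName("abc-"));    // false
console.log(isValidName("a b"));     // false
```

Because there is no nested quantifier, this approach does a fixed amount of work per character.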

Related

regex - don't allow name to finish with hyphen

I'm trying to create a regex using javascript that will allow names like abc-def but will not allow abc-
(hyphen is also the only nonalpha character allowed)
The name has to be a minimum of 2 characters. I started with
^[a-zA-Z-]{2,}$, but it's not good enough so I'm trying something like this
^([A-Za-z]{2,})+(-[A-Za-z]+)*$.
It can have more than one - in a name but it should never start or finish with -.
It's allowing names like xx-x but not names like x-x. I'd like to achieve that x-x is also accepted but not x-.
Thanks!
Option 1
This option matches strings that begin and end with a letter and ensures that two - are never consecutive, so a string like a--a is invalid. To allow that case, see Option 2.
^[a-z]+(?:-?[a-z]+)+$
^ Assert position at the start of the line
[a-z]+ Match any lowercase ASCII letter one or more times (with i flag this also matches uppercase variants)
(?:-?[a-z]+)+ Match the following one or more times
-? Optionally match -
[a-z]+ Match any ASCII letter (with i flag)
$ Assert position at the end of the line
var a = [
"aa","a-a","a-a-a","aa-aa-aa","aa-a", // valid
"aa-a-","a","a-","-a","a--a" // invalid
]
var r = /^[a-z]+(?:-?[a-z]+)+$/i
a.forEach(function(s) {
console.log(`${s}: ${r.test(s)}`)
})
Option 2
If you want to match strings like a--a then you can instead use the following regex:
^[a-z]+[a-z-]*[a-z]+$
var a = [
"aa","a-a","a-a-a","aa-aa-aa","aa-a","a--a", // valid
"aa-a-","a","a-","-a" // invalid
]
var r = /^[a-z]+[a-z-]*[a-z]+$/i
a.forEach(function(s) {
console.log(`${s}: ${r.test(s)}`)
})
You can use a negative lookahead:
/(?!.*-$)^[a-z][a-z-]+$/i
Breakdown:
// Negative lookahead so that it can't end with a -
(?!.*-$)
// The actual string must begin with a letter a-z
[a-z]
// Any following characters can be a-z or -, there must be at least 1 of these
[a-z-]+
let regex = /(?!.*-$)^[a-z][a-z-]+$/i;
let test = [
'xx-x',
'x-x',
'x-x-x',
'x-',
'x-x-x-',
'-x',
'x'
];
test.forEach(string => {
console.log(string, ':', regex.test(string));
});
The problem is that the first group, ([A-Za-z]{2,})+, requires two or more letters. Change it to accept one or more:
^[A-Za-z]+((-[A-Za-z]{1,})+)?$
Edit: solved some commented issues
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-dfe'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('x-d'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-'); // Logs false
Edit 2: Edited to accept characters only
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('abc'); // Logs true
Use this if you also want to accept strings such as A---A:
^(?!-|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/4UYd9l/4/
If you don't want to accept strings such as A---A, use this:
^(?!-|.*[-]{2,}.*|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/qH4Q0q/4/
Both accept only strings of at least two characters from [A-Za-z-] that, thanks to the negative lookahead (?!-|.*-$), neither start nor end with -.
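The difference between the two lookahead patterns shows up only on inputs with repeated hyphens. A small sketch, with `allowRuns`, `noRuns` and the sample list being names picked here for illustration:

```javascript
// Accepts runs of hyphens in the middle.
const allowRuns = /^(?!-|.*-$)[A-Za-z-]{2,}$/;
// Additionally rejects two or more hyphens in a row.
const noRuns = /^(?!-|.*[-]{2,}.*|.*-$)[A-Za-z-]{2,}$/;

const names = ["x-x", "xx", "A---A", "x-", "-x", "x"];
for (const n of names) {
  console.log(n, allowRuns.test(n), noRuns.test(n));
}
// x-x   true  true
// xx    true  true
// A---A true  false
// x-    false false
// -x    false false
// x     false false
```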
Try this /([a-zA-Z]{1,}-[a-zA-Z]{1,})/g
I suggest the following:
^[a-zA-Z][a-zA-Z-]*[a-zA-Z]$
It validates:
that the matched string is composed of at least two characters (the first and last character classes are each matched exactly once)
that the first and the last characters aren't dashes (the first and last character classes do not include -)
that the string can contain dashes and be longer than 2 characters (the second character class includes dashes and will consume as many characters as needed, dashes included).
^(?=[A-Za-z](?:-|[A-Za-z]))(?:(?:-|^)[A-Za-z]+)+$
Asserts that
the first character is a-z
the second is a-z or hyphen
If this matches
looks for groups of one or more letters prefixed by a hyphen or start of string, all the way to end of string.
You can also use the i flag to make it case-insensitive.

Write a regex for usernames

I want a Regex for my mongoose schema to test if a username contains only letters, numbers and underscore, dash or dot. What I got so far is
/[a-zA-Z0-9-_.]/
but somehow it lets pass everything.
Your regex is set to match a string if it contains ANY of the contained characters, but it doesn't make sure that the string is composed entirely of those characters.
For example, /[a-zA-Z0-9-_.]/.test("a&") returns true, because the string contains the letter a, regardless of the fact that it also includes &.
To make sure all characters are one of your desired characters, use a regex that matches the beginning of the string ^, then your desired characters followed by a quantifier + (a plus means one or more of the previous set, a * would mean zero or more), then end of string $. So:
const reg = /^[a-zA-Z0-9-_.]+$/
console.log(reg.test("")) // false
console.log(reg.test("I-am_valid.")) // true
console.log(reg.test("I-am_not&")) // false
Try like this with start(^) and end($),
^[a-zA-Z0-9-_.]+$
See demo : https://regex101.com/r/6v0nNT/3
/^([a-zA-Z0-9]|[-_\.])*$/
This regex should work.
^ matches at the beginning of the string. $ matches at the end of the string. This means it checks for the entire string.
The * allows it to match any number of characters, including none, so the empty string also passes; use + instead if the username must not be empty. The quantifier is what lets the pattern cover the entire username.
Now the parentheses are required because a | (or) is used here. The first alternative is what you already had: uppercase and lowercase letters and digits. The second character class holds the other characters. Outside a character class the . is a reserved character meaning any character and must be escaped with a backslash; inside a class, as here, it is already literal, so the backslash is harmless but optional.
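The practical difference between the * form and the + form from the earlier answer is the empty string. A minimal sketch comparing the two; the constant names `star` and `plus` are made up for this example:

```javascript
// The alternation-based pattern with *, as posted.
const star = /^([a-zA-Z0-9]|[-_\.])*$/;
// A single character class with +, hyphen placed last so it is literal.
const plus = /^[a-zA-Z0-9._-]+$/;

console.log(star.test(""));            // true
console.log(plus.test(""));            // false
console.log(star.test("I-am_valid.")); // true
console.log(plus.test("I-am_valid.")); // true
console.log(star.test("I-am_not&"));   // false
console.log(plus.test("I-am_not&"));   // false
```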

JavaScript - making my regular expression work

I have these 2 expressions:
1: [^a-zA-Z0-9]
2: [^a-zA-Z]
The first one must be used whenever my string starts with data- and the second one if it doesn't. However, I need this built-in into my regular expression (so using .slice(0, 5) == "data-" is no option for this situation).
Is it possible to do this inlined (so by just having to use 1 regular expression)? Or do I first have to validate (if string starts with data-) and then use the correct expression?
Some examples:
data-attribute#!#!19 => data-attribute19
data-attribute17 => data-attribute17
attribute19 => attribute
attribute1#!#!##183 => attribute
You can do something a bit like this:
/^(data-[a-zA-Z0-9]+).+?(\d*)$|^([a-zA-Z]+).+$/
Which will match what you want, and then return the results inside either one or two capture groups (depending on which option it matches).
Breaking it Down
Going from left to right:
The ^ character means "beginning of line" - in this case, the beginning of a single string.
The parentheses () indicate a capture group - some substring that you want to capture and output separately from your main match string.
data- indicates the literal string "data-", with the hyphen at the end.
[a-zA-Z0-9]+ is a character class, repeated one or more times.
.+? is one or more of any character, matched lazily - meaning it takes as few characters as possible, leaving as much as it can for the next token.
\d* means zero or more digits (equivalent to [0-9]*).
The $ character means "match the end of the line" (again, in this case, the end of your string).
The | character means "alternate" - basically, it will match either the pattern on the left or the pattern on the right, enabling this single regex to match either of your two strings.
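To see which capture groups take part in a match, the pattern can be run against the examples from the question. This is a sketch only; `clean` is a hypothetical helper that joins the groups, not part of the answer.

```javascript
const pattern = /^(data-[a-zA-Z0-9]+).+?(\d*)$|^([a-zA-Z]+).+$/;

// Hypothetical helper: joins whichever capture groups took part in the match.
function clean(str) {
  const m = pattern.exec(str);
  if (m === null) return str; // no match: leave the input untouched
  // Groups 1 and 2 belong to the "data-" branch, group 3 to the other branch.
  return m[1] !== undefined ? m[1] + m[2] : m[3];
}

console.log(clean("data-attribute#!#!19")); // data-attribute19
console.log(clean("attribute19"));          // attribute
console.log(clean("attribute1#!#!##183"));  // attribute
```

Note that .+? and .+ each have to consume at least one character. For an input with nothing to strip, the preceding group therefore gives up its last character: `clean("data-attribute17")` yields data-attribute1, and `clean("attribute")` yields attribut.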
str = str.replace(/[#!]/g, '')
/^data-(.+)$/.test(str) // true or false
This should do the trick.
First we remove every special character (you can add your own).
[abc] is a character class, which tells JavaScript to match any one of the characters between the square brackets.
Then we test whether the result starts with data-.
^ and $ match the beginning and end of the input.
() captures the characters inside it. You can access what was captured with RegExp.$1 through RegExp.$9 (a legacy feature; the array returned by match() or exec() holds the same groups).
. means any character except line terminators.
+ is a quantifier for one or more times. It is the same as {1,}.
Now you just have to test the input against it. If it matches, the attribute starts with data-.

Regex: string up to 20char long, without specific characters

I am trying to make regexp for validating string not containing
^ ; , & . < > | and having 1-20 characters. Any other Unicode characters are valid (asian letters for example).
How to do it?
You can use the following:
^[^^;,&.<>|]{1,20}$
Explanation:
^ assert starting of the string
[^ start of negated character class ([^ ])
^;,&.<>| all the characters you don't want to match
] close the negated character class
{1,20} range of matches
$ assert ending of the string
It will match any character other than specified characters within range of 1-20.
Your regex \w[^;,&.<>|]{1,20} contains \w that might not match all Unicode letters (I guess your regex flavor does not match Unicode letters with \w). Anyway, the \w only matches 1 character in your pattern.
Also, you say you need to exclude ^ but it is missing in your pattern.
When you want to validate length, you also must use ^/$ anchors to mark the beginning and end of a string.
To create a pattern for some range that does not match specific characters, you need a negated character class with anchors around it, and the length is set with limiting quantifiers:
^[^^;,&.<>|]{1,20}$
Or (this version makes sure we only match at the beginning and end of the string, never a line):
\A[^^;,&.<>|]{1,20}\z
Note that inside a character class, most special characters do not require escaping (only a few do, none in your case). Even the ^ caret is literal there as long as it is not the first character of the class.
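The question does not name a regex flavor. Assuming JavaScript, \A and \z are not available, but ^ and $ already anchor to the whole input as long as the m flag is not set. The sketch below also assumes that "1-20 characters" means code points, which is why the u flag is used; the name `field` is made up for the example.

```javascript
// The u flag makes {1,20} count code points rather than UTF-16 code units,
// so characters outside the Basic Multilingual Plane count once each.
const field = /^[^^;,&.<>|]{1,20}$/u;

console.log(field.test("日本語のテキスト"));        // true
console.log(field.test("a.b"));                     // false (contains .)
console.log(field.test(""));                        // false (too short)
console.log(field.test("x".repeat(21)));            // false (too long)
console.log(field.test("\u{1F600}".repeat(20)));    // true (20 code points)
```

Without the u flag the last input would be rejected, because each emoji occupies two UTF-16 code units.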

Regular Expression for the given format - start and end with alphanumeric

I need to validate building name in the below format (length is 1-50)
my regex for alphanumeric and specified characters check
/^[a-zA-Z0-9\s\)\(\]\[\._-&]+$/
It shows an invalid expression error, but when I exclude & it works fine.
/^[a-zA-Z0-9\s\)\(\]\[\._-]+$/
Actual Format
Building name should use Letters, Numbers, Underscore_, Hyphen-, Period., Square-brackets[], Parentheses() ,Ampersand &
It should not start and end with any special characters continuously
Valid:
Empire3 State&Building[A]
7Empire.State-Building(A)
Empire(State)_Building[A]12
Invalid:
##$#$#building))
().-building2
$buildingseven[0]&.
I am struggling with the second rule: how do I check for consecutive allowed special characters at the start and end? Any help is very much appreciated.
Escape the - character:
/^(?!\W.+\W$)[a-zA-Z0-9\s\)\(\]\[\._\-&]+$/
In a character class, the - character signifies a range of characters (e.g. 1-9). In your class, _-& is read as a range from _ to &, and because the character code of & (38) is lower than that of _ (95), the range is out of order and the expression fails to parse.
Also, to check that no special characters are at the beginning or end, use \W (a character other than a letter, digit or underscore) in a lookahead to check that both the start and the end are not "special characters". If you count an underscore as a special character, use [^A-Za-z0-9] instead of \W.
var validBuildingName = /^(?!\W.+\W$)[a-zA-Z0-9\s\)\(\]\[\._\-&]+$/;
validBuildingName.test('(example)'); // false
validBuildingName.test('(example'); // true
validBuildingName.test('example)'); // true
validBuildingName.test('example'); // true
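The lookahead above rejects a name only when both its first and its last character are special. The question's examples suggest a different reading: a name is invalid when it starts or ends with two or more special characters in a row. The sketch below implements that reading, which is an assumption rather than something the question confirms, and adds the 1-50 length limit; `buildingName` is a name chosen for this example.

```javascript
// Assumed rule: no run of 2+ non-alphanumeric characters at the start or end.
const buildingName =
  /^(?![^a-zA-Z0-9]{2})(?!.*[^a-zA-Z0-9]{2}$)[a-zA-Z0-9\s()\[\]._\-&]{1,50}$/;

const valid = [
  "Empire3 State&Building[A]",
  "7Empire.State-Building(A)",
  "Empire(State)_Building[A]12",
];
const invalid = ["##$#$#building))", "().-building2", "$buildingseven[0]&."];

for (const name of valid) console.log(name, buildingName.test(name));   // all true
for (const name of invalid) console.log(name, buildingName.test(name)); // all false
```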
