Need a regex with some validations for address field - javascript

Allowed characters are [a-zA-Z0-9- /.#,] but
A blank must precede the pound sign.
There must not be a blank immediately before or after a dash.
Address must not begin with #, -, or /.
Address must not end with #, -, or /.
A slash must be surrounded in numerics.
Triple alphas are not allowed immediately following a numeric.
No single characters in the address field with the exception of N, S, E and W
Can anyone suggest how to do this? Any help is greatly appreciated. Thanks in advance.

In regexes, it's usually easier to think of what's allowed, rather than what's forbidden. For instance:
Blank must precede the pound sign is better rephrased as "blank followed by pound sign is allowed", so one of the components will be: ( #)
Must not start/end with #, - or / is something like ^[a-zA-Z0-9]...[a-zA-Z0-9]$ (with a more generous pattern in the middle).
If you're trying to validate addresses, as in postal addresses, consider whether this is a useful thing to do; there are many surprising quirks with addresses. What problem are you trying to solve? Is it worth rejecting some valid addresses in order to solve that problem?
As others have commented, post what you've already tried and what didn't work.
Use one of the interactive regex tools like regex101 to help, especially if you have examples of texts that should and shouldn't match.

I have assumed a "blank" is a space and have not implemented requirement #7 because I do not understand what it means. Once #7 has been clarified I will attempt to amend the following regular expression.
^(?!#|.*[^ ]#)(?!.*(?: -|- ))(?!.*(?:\D\/|\/\D))(?!.*\d[A-Za-z]{3})(?![/-])[a-zA-Z0-9- /.#,]*(?<![#/-])$
Notice that the regex contains 5 negative lookaheads ((?!...)), three of which begin by consuming zero or more characters other than line terminators ((?!.*...)), one negative lookbehind at the end ((?<![#/-])) and no capture groups. The negative lookaheads implement the following assertions (in order):
the pound sign, '#', must be preceded by a space;
a hyphen cannot be preceded or followed by a space;
forward slashes must be preceded and followed by a digit;
digits may not be followed by 3 letters; and
the string may not begin with '/' or '-'.
Note that the first of these requirements ensures the string does not begin with a pound sign.
Javascript's regex engine performs the following operations.
^ : match beginning of string
(?! : begin negative lookahead
# : match '#'
| : or
.*[^ ]# : match 0+ chars then char other than a space then '#'
) : end negative lookahead
(?! : begin negative lookahead
.* : match 0+ chars other than newlines
(?: -|- ) : match ' -' or '- '
) : end negative lookahead
(?! : begin negative lookahead
.* : match 0+ chars other than newlines
(?: : begin a non-capture group
\D\/ : match a non-digit followed by '/'
| : or
\/\D : match '/' followed by a non-digit
) : end non-capture group
) : end negative lookahead
(?! : begin negative lookahead
.* : match 0+ chars other than newlines
\d[A-Za-z]{3} : match a digit, then 3 letters
) : end negative lookahead
(?! : begin negative lookahead
[/-] : match '/' or '-'
) : end negative lookahead
[a-zA-Z0-9- /.#,]* : match 0+ chars in char class
(?<![#/-]) : negative lookbehind: assert the preceding char is not '#', '/' or '-'
$ : match end of string
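To check the pattern against each rule, here is a minimal sketch. The sample addresses are made up for illustration (they are not from the question), and the rule about single characters is still not covered.

```javascript
// The pattern from the answer, copied verbatim.
const addressRe = /^(?!#|.*[^ ]#)(?!.*(?: -|- ))(?!.*(?:\D\/|\/\D))(?!.*\d[A-Za-z]{3})(?![/-])[a-zA-Z0-9- /.#,]*(?<![#/-])$/;

// Hypothetical sample inputs, roughly one per rule.
const addressSamples = [
  ["12 Main St #4", true],    // '#' preceded by a space
  ["12 Main St#4", false],    // '#' not preceded by a space
  ["12-14 Main St", true],    // dash without surrounding spaces
  ["12 - 14 Main St", false], // space next to a dash
  ["1/2 Main St", true],      // slash between digits
  ["1/a Main St", false],     // slash followed by a non-digit
  ["123abc St", false],       // three letters right after a digit
  ["#4 Main St", false],      // begins with '#'
  ["-12 Main St", false],     // begins with '-'
  ["Main St-", false],        // ends with '-'
];

for (const [input, expected] of addressSamples) {
  const actual = addressRe.test(input);
  console.log(JSON.stringify(input), actual, actual === expected ? "ok" : "UNEXPECTED");
}
```

Note that the lookbehind needs an engine with ES2018 support, and that the pattern as written also accepts the empty string.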

Related

Regex - Allow one or two alphabet and its can be at anywhere in string

I want regex which can fulfill below requirement:
Between 6 and 10 total characters
At least 1 but not more than 2 of the characters need to be alpha
The alpha characters can be anywhere in the string
We have tried this but not working as expected : (^[A-Z]{1,2}[0-9]{5,8}$)|(^[A-Z]{1}[0-9]{4,8}[A-Z]{1}$)|(^[0-9]{4,8}[A-Z]{1,2}$)|([^A-Z]{3}[0-9]{6,9})
Can anyone please help me to figure it out?
Thanks
You can assert the length of the string to be 6-10 char.
Then match at least a single char [A-Z] between optional digits, and optionally match a second char [A-Z] between optional digits.
^(?=[A-Z\d]{6,10}$)\d*[A-Z](?:\d*[A-Z])?\d*$
^ Start of string
(?=[A-Z\d]{6,10}$) Positive lookahead to assert 6-10 occurrences of A-Z or a digit
\d*[A-Z] Match optional digits and then match the first [A-Z]
(?:\d*[A-Z])? Optionally match optional digits and the second [A-Z]
\d* Match optional digits
$ End of string
See a regex demo.
One option is to use the following regular expression:
^(?=.*[a-z])(?!(?:.*[a-z]){3})[a-z\d]{6,10}$
with the case-indifferent flag i set.
This expression reads, "Match the beginning of the string, assert the string contains at least one letter, assert the string does not contain three or more letters and assert the string contains 6-10 characters, all being letters or digits".
The various parts of the expression have the following functions.
^ # match the beginning of the string
(?= # begin a positive lookahead
.*[a-z] # match zero or more characters and then a letter
) # end positive lookahead
(?! # begin a negative lookahead
(?: # begin a non-capture group
.*[a-z] # match zero or more characters and then a letter
){3} # end non-capture group and execute it 3 times
) # end negative lookahead
[a-z\d]{6,10} # match 6-10 letters or digits
$ # match end of string
Note that neither of the lookaheads advances the string pointer maintained by the regex engine from the beginning of the string.
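A quick way to compare the two patterns from the answers above is to run them over a few inputs. This is only a sketch; the sample strings are invented for illustration.

```javascript
// First answer: length lookahead plus an explicit "one or two letters" body.
const explicitRe = /^(?=[A-Z\d]{6,10}$)\d*[A-Z](?:\d*[A-Z])?\d*$/;
// Second answer: two lookaheads plus a length-limited character class.
const lookaheadRe = /^(?=.*[a-z])(?!(?:.*[a-z]){3})[a-z\d]{6,10}$/i;

// Hypothetical inputs; uppercase only, so both patterns should agree.
const alphaSamples = [
  ["A12345", true],       // one letter, 6 chars
  ["12A45B", true],       // two letters anywhere
  ["AB1234", true],       // two leading letters
  ["123456", false],      // no letter
  ["ABC123", false],      // three letters
  ["A1234", false],       // too short
  ["A1234567890", false], // too long (11 chars)
];

for (const [input, expected] of alphaSamples) {
  console.log(input, explicitRe.test(input), lookaheadRe.test(input), expected);
}

// The patterns differ on lowercase input: only the second has the i flag.
console.log(explicitRe.test("a12345"), lookaheadRe.test("a12345"));
```

If lowercase letters should be accepted by the first pattern as well, adding the i flag to it is one option.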

regex for simple arithmetic expression

I've read other stackoverflow posts about a simple arithmetic expression regex, but none of them is working with my issue:
I need to validate this kind of expression: "12+5.6-3.51-1.06",
I tried
const mathre = /(\d+(.)?\d*)([+-])?(\d+(.)?\d*)*/;
console.log("12+5.6-3.51-1.06".match(mathre));
but the result is '12+5', and I can't figure out why.
You only get 12+5 as a match because there is no /g global flag, but even with the global flag enabled you would get partial matches, as there are no anchors ^ and $ in the pattern to validate the whole string.
The [+-] is only matched once, which should be repeated to match it multiple times.
Currently the pattern will match 1+2+3 but it will also match 1a1+2b2 as the dot is not escaped and can match any character (use \. to match it literally).
For starting with digits and optional decimal parts and repeating 1 or more times a + or -:
^\d+(?:\.\d+)?(?:[-+]\d+(?:\.\d+)?)+$
Regex demo
If the values can start with optional plus and minus and can also be decimals without leading digits:
^[+-]?\d*\.?\d+(?:[-+][+-]?\d*\.?\d+)+$
^ Start of string
[+-]? Optional + or -
\d*\.?\d+ Match 0+ digits with optional . and 1+ digits
(?: Non capture group
[-+] Match a + or -
[+-]?\d*\.?\d+ Match an optional + or -, 0+ digits, an optional . and 1+ digits
)+ Close the noncapture group and repeat 1+ times to match at least a single + or -
$ End of string
Regex demo
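The points above can be checked with a short sketch. The variable names and the extra sample strings are made up for illustration; the patterns are the ones from the question and this answer.

```javascript
// Pattern from the question: unanchored, unescaped dot.
const originalRe = /(\d+(.)?\d*)([+-])?(\d+(.)?\d*)*/;
// Anchored pattern for unsigned numbers.
const unsignedRe = /^\d+(?:\.\d+)?(?:[-+]\d+(?:\.\d+)?)+$/;
// Anchored pattern allowing signs and decimals without leading digits.
const signedRe = /^[+-]?\d*\.?\d+(?:[-+][+-]?\d*\.?\d+)+$/;

const expression = "12+5.6-3.51-1.06";

// The unescaped dot swallows the '+', so the first match stops early.
console.log(expression.match(originalRe)[0]); // "12+5"
// The unescaped dot also lets letters through.
console.log("1a1+2b2".match(originalRe)[0]);  // "1a1+2b2"

console.log(unsignedRe.test(expression), signedRe.test(expression)); // true true
console.log(unsignedRe.test("1a1+2b2"), signedRe.test("1a1+2b2"));   // false false
console.log(unsignedRe.test("-.5+2"), signedRe.test("-.5+2"));       // false true
console.log(unsignedRe.test("12"), signedRe.test("12"));             // false false
```

Both anchored patterns require at least one operator, so a lone number such as "12" is rejected.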
You could try this solution with a PCRE-compatible regex engine:
^(?:(-?\d+(?:[\.,]{1}\d)?)[+-]?)*(?1)$
^ Start of string
(?: Non-capture group ng1
(-?\d+(?:[\.,]{1}\d)?) Capture group 1: a number with an optional leading "-" and an optional "." or "," followed by a digit; matches 1 or 1.1 or 1,1
[+-]? Optional "+" or "-"
)* Group ng1 may repeat 0 or more times
(?1) Subroutine reference to group 1: the pattern must end with a number
$ End of string
As JavaScript does not support subroutine references such as (?1), you can spell the group out in full instead:
/^(?:(-?\d+(?:[\.,]{1}\d)?)[+-]?)*(-?\d+(?:[\.,]{1}\d)?)$/gm
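A small sketch of how the JavaScript version behaves. The sample inputs are invented, and the g and m flags are dropped in the first constant as a deliberate assumption: with the g flag, test() remembers lastIndex between calls, which is easy to trip over when validating.

```javascript
// Same pattern as above, without the g and m flags.
const numberListRe = /^(?:(-?\d+(?:[\.,]{1}\d)?)[+-]?)*(-?\d+(?:[\.,]{1}\d)?)$/;

console.log(numberListRe.test("12+5.6-3.51-1.06")); // true
console.log(numberListRe.test("1,5+2"));            // true
console.log(numberListRe.test("12+"));              // false: must end with a number
console.log(numberListRe.test("12"));               // true: the operator is optional

// With the original /gm flags, a second test() on the same string resumes
// from lastIndex (the end of the previous match) and therefore fails.
const numberListGm = /^(?:(-?\d+(?:[\.,]{1}\d)?)[+-]?)*(-?\d+(?:[\.,]{1}\d)?)$/gm;
const firstCall = numberListGm.test("12+5.6");
const secondCall = numberListGm.test("12+5.6");
console.log(firstCall, secondCall); // true false
```

Because the operator between numbers is optional, this pattern is more lenient than the anchored ones in the previous answer.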

RegEx for email validation where atleast 2 char required before #

I'm trying the RegEx below, which needs at least 2 characters before the #:
^([a-zA-Z])[^.*-\s](?!.*[-_.#]{2})(?!.\.{2})[a-zA-Z0-9-_.]+#([\w-]+[\w]+(?:\.[a-z]{2,10}){1,2})$
like
NOT ALLOWED : aa.#co.kk.pp
NOT ALLOWED : aa..#co.kk.pp
NOT ALLOWED : a.a#co.kk.pp
SHOULD ALLOWED: aa#co.kk.pp
SHOULD ALLOWED: aaa#co.kk.pp
SHOULD ALLOWED: aa.s#co.kk.pp. (atleast one char after special char and before #)
SHOULD ALLOWED: aa.ss#co.kk.pp
SHOULD ALLOWED: a#co.kk.pp
Before the #, the only special characters allowed are . _ - and they must not appear consecutively (like --) or at the beginning.
I also tried the RegEx below, but no luck:
^[a-zA-Z)]([^.*-\s])(?!.*[-_.#]{2}).(?!.\.{2})[\w.-]+#([\w-]+[\w]+(?:\.[a-z]{2,10}){1,2})$
I would suggest keeping things simple like this:
^([a-zA-Z][\w+-]+(?:\.\w+)?)#([\w-]+(?:\.[a-zA-Z]{2,10})+)$
RegEx Demo
It is by no means a comprehensive email validator regex, but it should meet your requirements.
Details:
^: Start
(: Start capture group #1
[a-zA-Z]: Match a letter
[\w+-]+: Match 1+ of word characters or - or +
(?:\.\w+)?: Match an optional part: a dot followed by 1+ word characters
): End capture group #1
#: Match a #
(: Start capture group #2
[\w-]+: Match 1+ of word characters or -
(?:\.[a-zA-Z]{2,10})+: Match a dot followed by 2 to 10 letters. Repeat this group 1+ times
): End capture group #2
$: End
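Here is a minimal sketch that runs the suggested pattern over the examples from the question, keeping # as the separator exactly as written there. The variable names are made up for illustration.

```javascript
// Suggested pattern, copied verbatim (with # as the separator).
const emailLikeRe = /^([a-zA-Z][\w+-]+(?:\.\w+)?)#([\w-]+(?:\.[a-zA-Z]{2,10})+)$/;

// Examples taken from the question.
const emailSamples = [
  ["aa.#co.kk.pp", false],
  ["aa..#co.kk.pp", false],
  ["a.a#co.kk.pp", false],
  ["aa#co.kk.pp", true],
  ["aaa#co.kk.pp", true],
  ["aa.s#co.kk.pp", true],
  ["aa.ss#co.kk.pp", true],
];

for (const [input, expected] of emailSamples) {
  console.log(input, emailLikeRe.test(input), expected);
}

// The two capture groups hold the part before and after the separator.
const parts = emailLikeRe.exec("aa.ss#co.kk.pp");
console.log(parts[1], parts[2]); // "aa.ss" "co.kk.pp"

// A single character before the separator is rejected, in line with the
// "at least 2 characters" rule (the question lists this case as allowed).
console.log(emailLikeRe.test("a#co.kk.pp")); // false
```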

Capture between pattern of digits

I'm stuck trying to capture a structure like this:
1:1 wefeff qwefejä qwefjk
dfjdf 10:2 jdskjdksdjö
12:1 qwe qwe: qwertyå
I would want to match everything between the digits, followed by a colon, followed by another set of digits. So the expected output would be:
match 1 = 1:1 wefeff qwefejä qwefjk dfjdf
match 2 = 10:2 jdskjdksdjö
match 3 = 12:1 qwe qwe: qwertyå
Here's what I have tried:
\d+\:\d+.+
But that fails if there are word characters spanning two lines.
I'm using a javascript based regex engine.
You may use a regex based on a tempered greedy token:
/\d+:\d+(?:(?!\d+:\d)[\s\S])*/g
The \d+:\d+ part will match one or more digits, a colon, one or more digits and (?:(?!\d+:\d)[\s\S])* will match any char, zero or more occurrences, that do not start a sequence of one or more digits followed with a colon and a digit. See this regex demo.
As the tempered greedy token is a resource consuming construct, you can unroll it into a more efficient pattern like
/\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g
See another regex demo.
Now, the (?:(?!\d+:\d)[\s\S])* part is turned into a pattern that matches strings linearly:
\D* - 0+ non-digit symbols
(?: - start of a non-capturing group matching zero or more sequences of:
\d - a digit that is...
(?!\d*:\d) - not followed with 0+ digits, : and a digit
\D* - 0+ non-digit symbols
)* - end of the non-capturing group.
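Both patterns can be tried on the sample text from the question. In this sketch, tidy is a hypothetical helper that collapses line breaks and repeated spaces so the results line up with the expected output; it is not part of either regex.

```javascript
const sampleText = [
  "1:1 wefeff qwefejä qwefjk",
  "dfjdf 10:2 jdskjdksdjö",
  "12:1 qwe qwe: qwertyå",
].join("\n");

// Tempered greedy token version.
const temperedRe = /\d+:\d+(?:(?!\d+:\d)[\s\S])*/g;
// Unrolled version.
const unrolledRe = /\d+:\d+\D*(?:\d(?!\d*:\d)\D*)*/g;

// Hypothetical helper: normalise whitespace in each raw match.
const tidy = (s) => s.replace(/\s+/g, " ").trim();

const temperedMatches = (sampleText.match(temperedRe) || []).map(tidy);
const unrolledMatches = (sampleText.match(unrolledRe) || []).map(tidy);

console.log(temperedMatches);
console.log(unrolledMatches);
// Each prints:
// [ '1:1 wefeff qwefejä qwefjk dfjdf', '10:2 jdskjdksdjö', '12:1 qwe qwe: qwertyå' ]
```

The raw matches include the line breaks and the trailing whitespace before the next digits:digits marker, which is why they are normalised here.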
You can include ñ/Ñ or leave them out, but this should work:
\d+?:\d+? [a-zñA-ZÑ ]*
Edited:
If you want to include line breaks, you can add \n or \r to the set:
\d+?:\d+? [a-zñA-ZÑ\n ]*
\d+?:\d+? [a-zñA-ZÑ\r ]*
Give it a try! Also tested on https://regex101.com/
For more chars:
^[a-zA-Z0-9!##\$%\^\&*)(+=._-]+$

Is there a better way to write a regex that does not match on leading and trailing spaces along with a character limit?

The regex I have is...
^[A-z0-9]*[A-z0-9\s]{0,20}[A-z0-9]*$
The ultimate goal of this regex is not to allow leading and trailing spaces, while limiting the characters that are entered to 20, which the above regex doesn't do a good job at.
I found some questions similar to this, and the closest one would be How to validate a user name with regex?, but it did not limit the number of chars. It did solve the problem of leading and trailing spaces.
I also saw a way using negation and another negative lookahead, but that didn't work out so well for me.
Is there a better way to write the regex above with the 20 character limit? The repeat of the allowed characters is pretty ugly especially when the list of the allowed characters are large and specific.
Update:
I like this one even better. We use a negative lookahead to make sure there isn't ^\s (whitespace at the beginning of the string) or \s$ (whitespace at the end of the string), and then match 1 alphanumeric or whitespace character. We repeat this 1-20 times.
/^(?:(?!^\s|\s$)[a-z0-9\s]){1,20}$/i
^ (?# beginning of string)
(?: (?# non-capture group for repetition)
(?! (?# begin negative lookahead)
^\s (?# whitespace at beginning of string)
| (?# OR)
\s$ (?# whitespace at end of string)
) (?# end negative lookahead)
[a-z0-9\s] (?# match one alphanumeric/whitespace character)
){1,20} (?# repeat this process 1-20 times)
$ (?# end of string)
Initial:
I use a negative lookahead at the beginning of the string ((?!...)) to make sure that we don't start off with whitespace. Then we check for 0-19 alphanumeric (case-insensitive thanks to the i modifier) or whitespace characters. Finally, we make sure we end with a pure alphanumeric character (no whitespace), since lookbehinds were not available in JavaScript when this was written (they were added in ES2018).
/^(?!\s)[a-z0-9\s]{0,19}[a-z0-9]$/i
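To see that the two patterns in this answer agree, here is a short sketch with invented test strings.

```javascript
// "Update" version: a lookahead before every character.
const perCharRe = /^(?:(?!^\s|\s$)[a-z0-9\s]){1,20}$/i;
// "Initial" version: lookahead at the start, alphanumeric at the end.
const endsAlnumRe = /^(?!\s)[a-z0-9\s]{0,19}[a-z0-9]$/i;

// Hypothetical inputs.
const spaceSamples = [
  ["hello world", true],
  ["a", true],
  ["a b", true],
  [" hello", false],        // leading space
  ["hello ", false],        // trailing space
  ["", false],              // empty
  ["a".repeat(20), true],   // exactly 20 characters
  ["a".repeat(21), false],  // 21 characters
  ["hello_world", false],   // underscore is not in the allowed set
];

for (const [input, expected] of spaceSamples) {
  console.log(JSON.stringify(input), perCharRe.test(input), endsAlnumRe.test(input), expected);
}
```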
Hmm, if you need to exclude the single character text, I would go with:
^[A-z0-9][A-z0-9\s]{0,18}[A-z0-9]$
If a single character is also acceptable:
^[A-z0-9](?:[A-z0-9\s]{0,18}[A-z0-9])?$
I think your regex limits the input to 22 characters, not 20.
Are you aware that character range [A-z] includes characters [\]^_`?
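The extra characters can be listed with a few lines of code. This sketch simply walks the code points from A to z and keeps whatever [A-z] matches that is not a letter.

```javascript
// Collect every character matched by [A-z] that is not an ASCII letter.
const extraChars = [];
for (let code = "A".charCodeAt(0); code <= "z".charCodeAt(0); code++) {
  const ch = String.fromCharCode(code);
  if (/^[A-z]$/.test(ch) && !/^[A-Za-z]$/.test(ch)) {
    extraChars.push(ch);
  }
}
console.log(extraChars.join("")); // [\]^_`
```

Writing the class as [A-Za-z], or [a-z] with the i flag, avoids these six characters.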
I think I'd do something like this:
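const MAX_INPUT_LENGTH = 20; // limit taken from the question
// Trim both ends and collapse every run of whitespace to a single space.
input = input.trim().replace(/\s+/g, ' ');
if (input.length > MAX_INPUT_LENGTH ||
    !/^[a-z ]+$/i.test(input)) { // add 0-9 to the class to allow digits too
  // raise exception?
}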
\S matches a non-whitespace character. Therefore this should match what you're looking for:
^\S.{0,18}\S$
That is, a non-space character \S, followed by up to 18 of any type of character . (space or not), and finally a non-space character.
The only limitation of the above regex is that the value must be at least 2 characters. If you need to allow 1 character, you can use:
^\S(.{0,18}\S)?$
If you're looking to validate a user name (as you implied but didn't explicitly state) you're probably looking to allow only numbers, letters, and underscores. In that case, ^\w{1,20}$ will suffice.
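A minimal sketch of how the three patterns in this answer behave. The sample strings are invented; note that . accepts any character except a line terminator, so the first two patterns do not restrict the character set.

```javascript
const atLeastTwoRe = /^\S.{0,18}\S$/;   // 2-20 chars, no space at either end
const oneOrMoreRe = /^\S(.{0,18}\S)?$/; // 1-20 chars, no space at either end
const wordOnlyRe = /^\w{1,20}$/;        // letters, digits, underscore only

console.log(atLeastTwoRe.test("ab"));             // true
console.log(atLeastTwoRe.test("a"));              // false: needs two characters
console.log(atLeastTwoRe.test("a b"));            // true
console.log(atLeastTwoRe.test(" ab"));            // false: leading space
console.log(atLeastTwoRe.test("ab "));            // false: trailing space
console.log(atLeastTwoRe.test("x".repeat(20)));   // true
console.log(atLeastTwoRe.test("x".repeat(21)));   // false

console.log(oneOrMoreRe.test("a"));               // true
console.log(oneOrMoreRe.test(""));                // false

console.log(wordOnlyRe.test("user_name1"));       // true
console.log(wordOnlyRe.test("user name"));        // false: space not allowed
```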
use this pattern ^(?!\s).{0,20}(?<!\s)$
^(?!\s) start of line does not see a space
.{0,20} followed by 0 to 20 characters
(?<!\s)$ ends with a character that is not a space
Demo
or this pattern ^(\S.{0,18}\S)?$
Demo
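As a rough check of these two patterns (the inputs are made up), note that both accept the empty string, and that the lookbehind needs an engine with ES2018 support.

```javascript
const noEdgeSpaceRe = /^(?!\s).{0,20}(?<!\s)$/; // lookahead + lookbehind
const emptyOrTwoPlusRe = /^(\S.{0,18}\S)?$/;    // empty, or 2-20 chars

console.log(noEdgeSpaceRe.test(""));             // true: empty input passes
console.log(noEdgeSpaceRe.test("a"));            // true
console.log(noEdgeSpaceRe.test("a b"));          // true
console.log(noEdgeSpaceRe.test(" a"));           // false: leading space
console.log(noEdgeSpaceRe.test("a "));           // false: trailing space
console.log(noEdgeSpaceRe.test("x".repeat(20))); // true
console.log(noEdgeSpaceRe.test("x".repeat(21))); // false

console.log(emptyOrTwoPlusRe.test(""));          // true
console.log(emptyOrTwoPlusRe.test("a"));         // false: a single char is rejected
console.log(emptyOrTwoPlusRe.test("ab"));        // true
```

If empty input should be rejected, changing {0,20} to {1,20} in the first pattern is one option.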
