Regular expression for Phone Numbers with different lengths - javascript

I searched on Google for phone number regex validations but haven't been able to make it work based on my requirements.
Basically, I have three separate sets of rules for the prefix:
For 10 digit numbers I need to make sure the first 3 are numbers starting from 2-9.
For 11 digit numbers I need to make sure the first 4 are numbers starting from 1-9.
For anything greater than 12 digits I need to make sure the first 7 are numbers from 0-9.
After that I can allow letters like 1888GOSUPER or something like that (this would fall under the second condition)
This is what I have so far but I am not certain if I have covered everything:
var reg10 = /^[2-9]{3}[a-z0-9]+$/i;
var reg11 = /^[1-9]{4}[a-z0-9]+$/i;
var reg12plus = /^[0-9]{7}[a-z0-9]+$/i;

This can be handled by one regex (including your check for length, as suggested by others). Probably can be done more succinctly than this, but I feel this is more readable in the context of your 3 specifically separate prefix requirements:
^(?:[2-9]{3}[a-z0-9]{7})$|^(?:[1-9]{4}[a-z0-9]{7})$|^(?:[0-9]{7}[a-z0-9]{5,})$
Basically combines your three separate cases via "alternation" |
This can be "normalised" slightly, without "breaking" the clarity of intent, by grouping the entire expression and then surrounding with start/end anchors (rather than repeating these in each option, as above). Although this results in a similar length rule overall, by the time we add our additional non-capturing group:
^(?:(?:[2-9]{3}[a-z0-9]{7})|(?:[1-9]{4}[a-z0-9]{7})|(?:[0-9]{7}[a-z0-9]{5,}))$
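A minimal sketch of how the grouped pattern might be used in JavaScript. The names `phoneRe` and `isValidPhone` are hypothetical, the inner non-capturing groups are folded into one, and the `i` flag is assumed (as in the question's own patterns) so that letters such as GOSUPER are accepted:

```javascript
// Combined pattern: one alternative per length rule, anchored once.
// The "i" flag lets [a-z0-9] also match upper-case letters.
var phoneRe = /^(?:[2-9]{3}[a-z0-9]{7}|[1-9]{4}[a-z0-9]{7}|[0-9]{7}[a-z0-9]{5,})$/i;

// Hypothetical helper: true when the value satisfies one of the three rules.
function isValidPhone(value) {
  return phoneRe.test(value);
}

console.log(isValidPhone('1888GOSUPER'));  // 11 characters, 4-digit prefix
console.log(isValidPhone('1234567890'));   // 10 characters, but starts with 1
```

Because the length is encoded in each alternative, no separate length check is needed before calling the helper.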

Related

Javascript - how to use regex process the following complicated string

I have the following string that will occur repeatedly in a larger string:
[SM_g]word[SM_h].[SM_l] "
Notice in this string after the phrase "[SM_g]word[SM_h]" there are three components:
A period (.) This could also be a comma (,)
[SM_l]
"
Zero to all three of these components will always appear after "[SM_g]word[SM_h]". However, they can also appear in any order after "[SM_g]word[SM_h]". For example, the string could also be:
[SM_g]word[SM_h][SM_l]"
or
[SM_g]word[SM_h]"[SM_l].
or
[SM_g]word[SM_h]".
or
[SM_g]word[SM_h][SM_l].
or
[SM_g]word[SM_h].
or simply just
[SM_g]word[SM_h]
These are just some of the examples. The point is that there are three different components (more if you consider the period can also be a comma) that can appear after "[SM_g]word[SM_h]", where these three components can be in any order and sometimes one, two, or all three of the components will be missing.
Not only that, sometimes there will be up to one space between " and the previous component (or [SM_g]word[SM_h]).
For example:
[SM_g]word[SM_h] ".
or
[SM_g]word[SM_h][SM_l] ".
etc. etc.
I am trying to process this string by moving each of the three components inside of the core string (and preserving the space, in case there is a space between " and the previous component or [SM_g]word[SM_h]).
For example, [SM_g]word[SM_h].[SM_l]" would turn into
[SM_g]word.[SM_l]"[SM_h]
or
[SM_g]word[SM_h]"[SM_l]. would turn into
[SM_g]word"[SM_l].[SM_h]
or, to simulate having a space before "
[SM_g]word[SM_h] ".
would turn into
[SM_g]word ".[SM_h]
and so on.
I've tried several combinations of regex expressions, and none of them have worked.
Does anyone have advice?
You need to put each component in an alternation inside a group that may repeat up to 3 times:
(\[SM_g]word)(\[SM_h])((?:[.,]|\[SM_l]| ?"){0,3})
You may replace word with .*? if it is not a constant or a specific keyword.
Then use this replacement string:
$1$3$2
// Group 1: opening tag plus word, group 2: [SM_h], group 3: trailing components.
// [.,] accepts either a period or a comma, as the question requires.
var re = /(\[SM_g]word)(\[SM_h])((?:[.,]|\[SM_l]| ?"){0,3})/g;
var str = `[SM_g]word[SM_h][SM_l] ".`;
console.log(str.replace(re, `$1$3$2`));
This seems applicable to your task, in other words, changing the position of a sub-string.
(\[SM_g])([^[]*)(\[SM_h])((?=([,\.])|(\[SM_l])|( ?&\\?quot;)).*)?
In the demo, each sub-string is captured in its own capture group for your post-processing.
[SM_g] is captured in group 1, word in group 2, [SM_h] in group 3, the whole trailing part in group 4, [,\.] in group 5, [SM_l] in group 6, and " ?&\\?quot;" in group 7.
Thus, groups 1 to 3 are the core part, group 4 is the trailing part (useful for checking whether a trailing part exists), and groups 5 to 7 are sub-parts of group 4 for your post-processing.
You can therefore reorder the matched string however you want by replacing it with the captured groups, as follows.
\1\2\7\3 or $1$2$7$3, etc.
For replacing in JavaScript, please refer to this post: JS Regex, how to replace the captured groups only?
However, the regex above is not sufficiently precise, because it allows any repetition of the sub-parts of the trailing string, for example \1\2\3\5\5\5\5 or \1\2\3\6\7\7\7\7\5\5\5.
To avoid this, the regex needs a condition that accepts only the possible combinations of the sub-parts of the trailing string. Please refer to this example for the possible combinations in order: https://regex101.com/r/6aM4Pv/1/
A regex that allows only the possible combinations is more complicated, so I have left the simplified regex above to help you understand the idea. Thank you :-)
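As a rough sketch of one way to avoid repeated components without a much longer regex, the check can be moved into a replace callback. The helper name `moveClosingTag` is hypothetical, the quote is assumed to be a literal " as in the question's examples, and leaving a match untouched when a component repeats is an assumption about the desired behavior:

```javascript
// Hypothetical helper: moves [SM_h] behind the trailing components.
function moveClosingTag(text) {
  // Group 1: opening tag plus word, group 2: closing tag,
  // group 3: up to three trailing components.
  var re = /(\[SM_g\][^\[]*)(\[SM_h\])((?:[.,]|\[SM_l\]| ?"){0,3})/g;
  return text.replace(re, function (whole, head, closing, tail) {
    // Split the tail back into its individual components.
    var parts = tail.match(/[.,]|\[SM_l\]| ?"/g) || [];
    // Classify each component so duplicates of one kind can be detected.
    var kinds = parts.map(function (p) {
      if (p === '[SM_l]') return 'tag';
      return p.indexOf('"') !== -1 ? 'quote' : 'punct';
    });
    var unique = kinds.filter(function (k, i) {
      return kinds.indexOf(k) === i;
    });
    // Assumption: when a kind repeats, leave the match unchanged.
    if (unique.length !== kinds.length) return whole;
    return head + tail + closing;
  });
}

console.log(moveClosingTag('[SM_g]word[SM_h].[SM_l]"'));
```

The callback keeps the pattern itself short and puts the "each component at most once" rule in ordinary code, where it is easier to adjust.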

Matching varying length in JS Regex

Let me explain my query with an example:
I am capturing the page name from a web site. Due to the design, the page name can be of varying length:
It can be
Data1|Data2|Data3
Data1|Data2|Data3|Data4
Data1|Data2
I need to write a regex that matches in all the above scenarios. I have the one below, shared by a previous user:
/(.*?)\|(.*?)\|(.*?)\|(.*)/gm;
The above works well when the string always has four groups, even when one in between is blank. But if I have just two values, the regex fails. Can anyone please guide me?
Not sure what you meant there, but does this help? It will only accept alphanumeric values and spaces:
/([a-zA-Z 0-9]{1,}\|){1,}[a-zA-Z 0-9]{1,}/g
This will expect at least two data fields, and at most 4 fields:
/(?:([^|]*)\|){1,3}([^|]*)/gm;
If you also want to allow only one field (no pipe):
/(?:([^|]*)\|){0,3}([^|]*)/gm;
{n,m} means the group is allowed to repeat n through m times.
Notice that I used [^|]* instead of .*?, so each field matches anything but the pipe |. I also used non-capturing groups (?:) around the parts that include the pipes, so the pipes themselves are not captured. Note that in JavaScript a capturing group inside a repeated group keeps only its last repetition, so $1 holds the last field before the final pipe rather than every field.
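To read each field individually, a sketch like the following may help. It shows two options: a plain split on the pipe, and a pattern with one explicit group per field. The names `parsePageName` and `pageNameRe` are hypothetical, and the 2-to-4 field limit is taken from the examples in the question:

```javascript
// Option 1: split on the pipe and check the field count (2 to 4 assumed).
function parsePageName(pageName) {
  var fields = pageName.split('|');
  return fields.length >= 2 && fields.length <= 4 ? fields : null;
}

// Option 2: one capture group per field; the third and fourth are optional.
var pageNameRe = /^([^|]*)\|([^|]*)(?:\|([^|]*))?(?:\|([^|]*))?$/;

console.log(parsePageName('Data1|Data2|Data3'));
console.log(pageNameRe.exec('Data1|Data2'));
```

With option 2, missing fields come back as undefined in the match array, so the caller can tell how many fields were present.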

Regex Comma Separated Phone Number

I am trying to generate a regex which would match the following sequence-
+91123456789,+41123456789,+21123456789.... and so on, there is no limit of phone numbers.
Basically, the goal is to validate the phone numbers that users add. There can be multiple phone numbers, and they need to be separated by commas. I am already removing any spaces that users add, so there is no need to worry about that.
I am not good with regex and have created the following one, but it doesn't match the phone numbers above, meaning the whole string of phone numbers does not match:
^\+?\d{1,4}?[-.\s]?\(?\d{1,3}?\)?[-.\s]?\d{1,4}[-.\s]?\d{1,4}[-.\s]?\d{1,9},\+?\d{1,4}?[-.\s]?\(?\d{1,3}?\)?[-.\s]?\d{1,4}[-.\s]?\d{1,4}[-.\s]?\d{1,9}$
I need to validate the user input using javascript or jquery.
A valid phone number should have a country code like +91 or +21. The country code can be one or two digits, followed by 7 to 9 digits.
If anyone could help, it would be highly appreciated; I have spent a lot of time on this one.
To validate the whole string, handling multiple values separated by commas, just add a group with the * quantifier:
^\+\d{8,11}(,\+\d{8,11})*$
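A short sketch of how this pattern might be applied in JavaScript. The helper names `isValidPhoneList` and `findInvalidNumbers` are hypothetical; the second helper is an optional extra for telling the user which entries failed, and it assumes whitespace has already been stripped as described in the question:

```javascript
// Whole-list check: one number, then any count of ",number" repetitions.
var phoneListRe = /^\+\d{8,11}(?:,\+\d{8,11})*$/;

// Single-number check used to report individual failures.
var singlePhoneRe = /^\+\d{8,11}$/;

function isValidPhoneList(input) {
  return phoneListRe.test(input);
}

// Returns the entries that do not look like "+" followed by 8 to 11 digits.
function findInvalidNumbers(input) {
  return input.split(',').filter(function (entry) {
    return !singlePhoneRe.test(entry);
  });
}

console.log(isValidPhoneList('+91123456789,+41123456789,+21123456789'));
console.log(findInvalidNumbers('+91123456789,12345,+21123456789'));
```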
If I understand the requirements correctly, the following regex should work
\+\d{9,11}
However, you can separate the country code out, in case you need to allow for (+44)xxxxxxxxx:
\+\d{2}\d{7,9}
If the requirement is to allow a one-digit country code as well, adjust the regex to the following:
\+\d{1,2}\d{7,10} //I think to 10, not sure on their numbers
You can update the ranges as you see fit :)
Demo: https://regex101.com/r/rJ4wM7/1

basic search ranking with regex in javascript

Currently I am using the below for search.
I assume each and every term the user types must appear at least once in the article.
I use the match method with regex
^(?=.*one)(?=.*two)(?=.*three).*$
with g, i, and m
At the moment I use matches.length to count the number of matches, but the behavior is not as expected.
example:
"one two three. one two three"
would give me 2 matches, but it should really be 6.
If I do something like
(one|two|three)
then I do get 6 matches, but if I have the data:
"one two. one two"
I get 4 matches, when in reality I want it to be 0, since not every word appears at least once.
I could do the first regex to check if there's at least one "match". If there is, I would subsequently use the second regex to count the real number of matches, but this would make my program run much slower than it already is. Doing this regex against 2500 json articles takes anywhere from 60 to 120 seconds as it is.
Any ideas on how to make this faster or better? Change the regex? Use search or indexOf instead of matches?
note:
I'm using lawnchair db for local persistence and jquery. I package the code for phonegap and as a chrome packaged app.
var input = '...';
var match = [];
if (input.match(/^(?=.*\bone\b)(?=.*\btwo\b)(?=.*\bthree\b)/i)) {
match = input.match(/\b(one|two|three)\b/ig);
}
Test this code here.
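The two-step idea above can also be sketched for an arbitrary list of terms. This version counts each term separately and stops at the first missing term, which may avoid unnecessary work on articles that cannot match. The names `escapeRegExp` and `scoreArticle` are hypothetical, and whole-word matching with \b is an assumption carried over from the snippet above:

```javascript
// Escape regex metacharacters so a search term is matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns the total number of whole-word hits, or 0 if any term is missing.
function scoreArticle(text, terms) {
  var total = 0;
  for (var i = 0; i < terms.length; i++) {
    var re = new RegExp('\\b' + escapeRegExp(terms[i]) + '\\b', 'gi');
    var found = text.match(re);
    if (!found) return 0; // a required term is absent, so stop early
    total += found.length;
  }
  return total;
}

console.log(scoreArticle('one two three. one two three', ['one', 'two', 'three']));
console.log(scoreArticle('one two. one two', ['one', 'two', 'three']));
```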

Need regex to match unformatted phone number syntax

I need a regex for Javascript that will match a phone number stripped of all characters except numbers and 'x' (for extension). Here are some example formats:
12223334444
2223334444
2223334444x5555
You are guaranteed to always have a minimum of 10 numerical digits, as the leading '1' and extension are optional. There is also no limit on the number of numerical digits that may appear after the 'x'. I want the numbers to be split into the following backreferences:
(1)(222)(333)(4444)x(5555)
The parentheses above demonstrate how I want the number to be split up into backreferences. The first set of parentheses would be assigned to backreference $1, for example.
So far, here is what I've come up with for a regex. Keep in mind that I'm not really that great with regex, and regexlib.com hasn't really helped me out in this department.
(\d{3})(\d{3})(\d{4})
The above regex handles the 2nd case in my list of example test cases in my first code snippet above. However, this regex needs to be modified to handle both the optional '1' and extension. Any help on this? Thanks!
Regex option seems perfectly fine to me.
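var subject = '2223334444';
// Normalize to a leading 1 plus 10 digits, keeping any extension.
var result = subject.replace(/^1?(\d{3})(\d{3})(\d{4})(x\d+)?$/mg, "1$1$2$3$4");
alert(result);
// Anchor both ends so trailing garbage is rejected too.
if (!/^\d{11}(?:x\d+)?$/.test(result)) {
  alert('The phone number came out invalid. Perhaps it was entered incorrectly');
}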
This will say 12223334444 when there is no extension
I expect you want to tweak this out some, let me know how it should be.
If I were you, I would not go with a regular expression for this; it would cause more headaches than it solved. I would:
Split the phone number on the "x" and store the last part as the extension.
See how long the initial part is: 10 or 11 digits.
If it's 11 digits, check that the first is a 1, slice it off, and then continue with the 10-digit process.
If it's 10 digits, split it 3-3-4 into area code, exchange, and number.
Validate the area code and exchange code according to the rules of the NANP.
This will validate your phone number, be much easier, and make it possible to enforce rules like "no X11 area codes" or "no X11 exchange codes" more easily. You'd have to do this anyway, and it's probably easier to use plain string manipulation to split the number into substrings.
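The steps above can be sketched roughly as follows. The function name `parsePhone` and the returned field names are hypothetical, and the NANP checks are deliberately simplified to two assumptions: area code and exchange start with 2-9, and neither may end in 11:

```javascript
// Hypothetical helper: returns the parts of the number, or null if invalid.
function parsePhone(raw) {
  var pieces = raw.split('x');
  if (pieces.length > 2) return null;
  var digits = pieces[0];
  var hasExtension = pieces.length === 2;
  var extension = hasExtension ? pieces[1] : '';

  // Both parts must consist of digits only.
  if (!/^\d+$/.test(digits)) return null;
  if (hasExtension && !/^\d+$/.test(extension)) return null;

  // 11 digits: require and strip the leading 1.
  if (digits.length === 11) {
    if (digits.charAt(0) !== '1') return null;
    digits = digits.slice(1);
  }
  if (digits.length !== 10) return null;

  // Split 3-3-4.
  var areaCode = digits.slice(0, 3);
  var exchange = digits.slice(3, 6);
  var number = digits.slice(6);

  // Simplified NANP rules (assumption): first digit 2-9, no X11 codes.
  var nanpCode = /^[2-9]\d{2}$/;
  if (!nanpCode.test(areaCode) || !nanpCode.test(exchange)) return null;
  if (/11$/.test(areaCode) || /11$/.test(exchange)) return null;

  return { areaCode: areaCode, exchange: exchange, number: number, extension: extension };
}

console.log(parsePhone('2223334444x5555'));
```

Returning null for every failure keeps the caller simple; a real validator would probably report which rule failed.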
I did a bit more testing and here's a solution I've found. I haven't found a case where this breaks yet, but if someone sees something wrong with it please let me know:
(1)?(\d{3})(\d{3})(\d{4})(?:x(\d+))?
Update:
I've revised the regex above to handle some more edge cases. This new version will fail completely if something unexpected is present.
(^1|^)(\d{3})(\d{3})(\d{4})($|(?:x(\d+))$)
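A brief sketch of how the revised pattern might be used to pull out the parts. The helper name `splitPhone` and the field names are hypothetical; note that the extension digits land in group 6, because group 5 wraps the whole optional tail:

```javascript
var phonePartsRe = /(^1|^)(\d{3})(\d{3})(\d{4})($|(?:x(\d+))$)/;

// Hypothetical helper: returns the captured parts, or null when nothing matches.
function splitPhone(input) {
  var m = phonePartsRe.exec(input);
  if (!m) return null;
  return {
    leadingOne: m[1],       // "1" or an empty string
    areaCode: m[2],
    exchange: m[3],
    number: m[4],
    extension: m[6] || ''   // group 6 is undefined without an extension
  };
}

console.log(splitPhone('2223334444x5555'));
console.log(splitPhone('222333444'));
```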
My regex is:
/\+?[0-9\-\ \(\)]{10,22}/g
