Regular expression that matches 5 (exactly) comma separated currency values - javascript

I need to match 5 occurrences of comma separated currency values.
I have this regex that does the job, but I don't think it's a great way to do it.
^(\$[0-9]{1,3}(?:[,.]?[0-9]{3})*(?:\.[0-9]{2})?,\s?){4}(\$[0-9]{1,3}(?:[,.]?[0-9]{3})*(?:\.[0-9]{2})?)$
P.S. I had to split the expression into two parts: one matching 4 comma-terminated occurrences and one matching the final value, so that a trailing comma is rejected (I don't think that's the way to do it)
Some of the valid matching inputs could be,
$200,000,$525,$60000,$120,000,$65,456 (space between currency values is optional)
$200,000, $525, $60000,$120,000, $65,456
Some of the invalid input values,
$200,000,$525,$60000,$120,000,$65,456, (Trailing comma)
$200,000,,$525,$60000.$120,000,$65,456,, etc
Any pointers would be greatly appreciated.
Edit: The solution I am looking at is a pure reg ex solution (better than what I have written above), so that I can fire validations as soon as erroneous inputs are entered by the user.

Update
If you want to validate the prices while matching, you could use the following, which allows:
Including both dot and comma for formatting prices
Max one space character between prices
^\$\d+([,.]\d{3})*( ?, ?\$\d+([,.]\d{3})*){4}$
Breakdown:
^ Match start of input string (or line if m flag is set)
\$\d+ Match a $ that precedes one or more digits
( Start of grouping (#1)
[,.]\d{3} Match a period or comma that precedes 3 digits
)* End of grouping (#1), match zero or more times
( Start of grouping (#2)
?, ? Match a comma surrounded by optional spaces (one space at either side)
\$\d+ Match a $ that precedes one or more digits
([,.]\d{3})* Match a period or comma that precedes 3 digits (thousand separator), match zero or more times
){4} End of grouping (#2), repeat exactly 4 times
$ End of input string (or line if m flag is set)
JS code:
var re = /^\$\d+([,.]\d{3})*( ?, ?\$\d+([,.]\d{3})*){4}$/g;
var prices = ['$200,000,$525,$60000,$120,000,$65,456',
'$200,000, $525, $60000,$120,000, $65,456',
'$200,000,$525,$60000,$120,000,$65,456, ',
'$200,000,,$525,$60000.$120,000,$65,456,,'];
prices.forEach(function(s) {
console.log(s + " => " + Boolean(s.match(re)))
})
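Since the price fragment appears twice in the pattern above, it could also be written once and assembled with the RegExp constructor. This is only a minimal sketch: makePriceListRegex is a hypothetical helper name, and the price fragment is the one from this answer with the groups made non-capturing.

```javascript
// Hypothetical helper: builds a validator for exactly `count` comma-separated prices.
function makePriceListRegex(count) {
  // Price fragment from the answer above: $ + digits + optional ,/. groups of 3 digits.
  const price = String.raw`\$\d+(?:[,.]\d{3})*`;
  // First price, then (count - 1) repetitions of "optional space, comma, optional space, price".
  return new RegExp(`^${price}(?: ?, ?${price}){${count - 1}}$`);
}

const fivePrices = makePriceListRegex(5);

console.log(fivePrices.test('$200,000,$525,$60000,$120,000,$65,456'));    // true
console.log(fivePrices.test('$200,000, $525, $60000,$120,000, $65,456')); // true
console.log(fivePrices.test('$200,000,$525,$60000,$120,000,$65,456,'));   // false (trailing comma)
```

The regex is created without the g flag, so test() keeps no state between calls.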

This regex is a simpler version of what you're trying to achieve:
^(?:\$\d{1,3}(?:,?\d{3})*[,.] ?){4}\$\d{1,3}(?:,?\d{3})*$
-------------------------------
The underlined part matches 4 "prices" as you've defined, followed by a dot/comma and an optional space.
The rest matches the last "price".
Please let me know if something is unclear
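A quick way to see what this pattern accepts is to run it against the samples from the question. The sketch below only wraps the regex from this answer in a hypothetical isFivePrices function; the last sample is an extra input (not from the question) that shows the [,.] separator also lets a dot sit between two prices.

```javascript
// The pattern from the answer above, unchanged.
const simplerPattern = /^(?:\$\d{1,3}(?:,?\d{3})*[,.] ?){4}\$\d{1,3}(?:,?\d{3})*$/;

// Hypothetical wrapper, named here only for illustration.
function isFivePrices(input) {
  return simplerPattern.test(input);
}

console.log(isFivePrices('$200,000,$525,$60000,$120,000,$65,456'));      // true
console.log(isFivePrices('$200,000, $525, $60000,$120,000, $65,456'));   // true
console.log(isFivePrices('$200,000,$525,$60000,$120,000,$65,456,'));     // false
console.log(isFivePrices('$200,000,,$525,$60000.$120,000,$65,456,,'));   // false
console.log(isFivePrices('$1.$2.$3.$4.$5'));                             // true (dot used as separator)
```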

The most prevalent character to base the pattern on is \$ (escaped), whether it is the first character of the string or preceded by a comma (optionally followed by whitespace), that is done using (?:^|,)\s*. After that you want any number of digits, which is \d+, optionally followed by a comma which is immediately followed by digits again; ,\d+.
Combining these, you'd get; /(?:^|,)\s*(\$\d+(?:,\d+)?)/g
const pattern = /(?:^|,|\.)\s*(\$\d+(?:,\d+)?)/g;
const test = [
'$200,000,$525,$60000,$120,000,$65,456',
'$200,000, $525, $60000,$120,000, $65,456',
'$200,000,$525,$60000,$120,000,$65,456,',
'$200,000,,$525,$60000.$120,000,$65,456,,',
];
const matches = test.reduce((carry, string) => {
let match = null;
while (match = pattern.exec(string)) {
carry.push(match[1]);
}
return carry;
}, []);
console.log(matches);
Added the extra examples from the modified question, including the . which now appeared as separator ($200,000,,$525,$60000.$120,000,$65,456,,) and modified the pattern in the example to account for this.
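Note that this pattern extracts prices rather than validating the whole string. As a rough sketch (extractPrices is a hypothetical helper name), matchAll() can collect capture group 1 of every match without the manual exec() loop:

```javascript
// Same pattern as in the example above.
const pricePattern = /(?:^|,|\.)\s*(\$\d+(?:,\d+)?)/g;

// Hypothetical helper: returns the captured price of every match.
function extractPrices(input) {
  return Array.from(input.matchAll(pricePattern), (m) => m[1]);
}

console.log(extractPrices('$200,000, $525, $60000,$120,000, $65,456'));
// [ '$200,000', '$525', '$60000', '$120,000', '$65,456' ]

// Counting matches is not a validity check: the trailing-comma input also yields 5 prices.
console.log(extractPrices('$200,000,$525,$60000,$120,000,$65,456,').length); // 5
```

If the goal is to reject bad input, an anchored pattern is still needed; extraction only tells you which prices were found.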

Related

Regex to detect dates separated by newlines

I'm trying to validate text that's in the format of dates separated by newlines.
The date format needs to be in the form of MM-DD-YYYY.
So a sample could be
MM-DD-YYYY\n
MM-DD-YYYY\n
MM-DD-YYYY
Where there could be an infinite amount of dates entered that are separated by newlines
I've tried /^(\d{2})-(\d{2})-(\d{4})\s+$/ but that doesn't seem to fully work.
Note: I want this to allow for any leading, trailing whitespace and empty newlines as well.
Basically,
A space character
A carriage return character
A newline character
I'm not partial to using regexes. If another way is simpler, more desirable, or more efficient, then I'd gladly switch to that. Thanks!
To validate a string with multiple date-like strings in it with or without leading/trailing whitespace, allowing empty/blank lines, you may use
A method to split the text into lines and use .every() to test each line against a simple pattern:
text.split("\n").every(x => /^\s*(?:\d{2}-\d{2}-\d{4}\s*)?$/.test(x))
NOTE: This will validate a blank input!
Details
^ - start of string
\s* - 0+ whitespaces
(?: - starts a non-capturing group
\d{2}-\d{2}-\d{4} - two digits, -, two digits, - and four digits
\s* - 0+ whitespaces
)? - end of the group, repeat 1 or 0 times (it is optional)
$ - end of string.
A single regex for the multiline string
/^\s*\d{2}-\d{2}-\d{4}(?:[^\S\n]*\n\s*\d{2}-\d{2}-\d{4})*\s*$/.test(text)
See the regex demo. This will not validate blank input.
This regex is long, but it is still efficient since backtracking is minimal. In the [^\S\n]*\n\s* part, the first [^\S\n]* matches any whitespace except a line feed, then \n matches a newline (hence, no backtracking here), and then \s* matches 0+ whitespace characters (again, \n is not quantified, so there is no backtracking into the preceding pattern if \s* fails). The (?:[^\S\n]*\n\s*\d{2}-\d{2}-\d{4})* part is a *-quantified non-capturing group that matches 0 or more occurrences of the pattern sequence.
JS demos:
var matching_text = "\n01-01-2020\n 01-01-2020\n01-01-2020 \n\n\n 01-01-2020 \n";
var non_matching_text = "\n01-01-2020\n 01-01-2020\n01-01-2020 \n\n\n 01-01-2020 \n01201-01-20202020";
var regex_1 = /^\s*(?:\d{2}-\d{2}-\d{4}\s*)?$/;
var regex_2 = /^\s*\d{2}-\d{2}-\d{4}(?:[^\S\n]*\n\s*\d{2}-\d{2}-\d{4})*\s*$/;
// Test Solution 1:
console.log(matching_text.split("\n").every(x => regex_1.test(x))); // => true
console.log(non_matching_text.split("\n").every(x => regex_1.test(x))); // => false
// Test Solution 2:
console.log(regex_2.test(matching_text)); // => true
console.log(regex_2.test(non_matching_text)); // => false
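Both patterns above only check the shape of each line, so a value such as 13-45-2020 would pass. If real calendar dates are needed, one possible sketch is to combine the per-line regex with a round trip through Date. The helper names isRealDate and allDatesValid are assumptions made for this example, not part of the answer.

```javascript
// Hypothetical helper: checks the MM-DD-YYYY shape and that the date exists.
function isRealDate(line) {
  const m = /^\s*(\d{2})-(\d{2})-(\d{4})\s*$/.exec(line);
  if (!m) return false;
  const month = Number(m[1]);
  const day = Number(m[2]);
  const year = Number(m[3]);
  // Date rolls invalid values over (02-30 becomes March 1), so compare the parts back.
  // Caveat: Date.UTC maps years 0-99 to 1900-1999, so such years are rejected here.
  const d = new Date(Date.UTC(year, month - 1, day));
  return d.getUTCFullYear() === year &&
         d.getUTCMonth() === month - 1 &&
         d.getUTCDate() === day;
}

// Hypothetical helper: blank lines are skipped, every other line must be a real date.
function allDatesValid(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .every(isRealDate);
}

console.log(allDatesValid("\n01-01-2020\n 02-29-2020 \n\n")); // true
console.log(allDatesValid("01-01-2020\n02-30-2020"));         // false
```

Like Solution 1, this accepts blank input, because every() on an empty list returns true.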
You can use something like the code below to get all dates that satisfy the regular expression. The parentheses () are only around the date part (\d{2}-\d{2}-\d{4}), so that is what ends up in capture group 1. Because the global flag g is set, match() would return only the full matches (including the surrounding whitespace), so matchAll() is used here to read group 1 of every match.
Edit: added support for a leading and trailing whitespace.
Edit 2: added ^ and $ so the regex doesn't allow for more than 2 digits in day and more than 4 digits in year.
Run and test:
let regex = /\s*(\d{2}-\d{2}-\d{4})\s*/g;
let dates = " 12-02-2020 \n 09-10-2020\n 03-03-2020 ";
console.log( Array.from(dates.matchAll(regex), (m) => m[1]) );
EDIT: In order to validate the string of dates you could use the regex.test() method like this:
let regex = /^\s*\d{2}-\d{2}-\d{4}\s*$/;
let dateString = " 12-02-2020 \n 09-10-2020\n 03-03-2020 ";
var dates = dateString.split('\n');
// every() stops at the first failing line; a return inside forEach() would not exit datesValid
var datesValid = () => dates.every((el) => regex.test(el));
console.log( datesValid() );

Formatting a phone number in specific way?

This is not a duplicate, the linked thread does not explain how to achieve this.
I'm looking to get a phone number in a specific format.
+xx (x) xxx xxx xxxx
Country code.
Space.
Zero in brackets.
Space.
3 digits.
Space.
3 digits.
Space.
4 digits.
The user could type anything in (but should always be a +61 number). So far I have tried the below.
Removing spaces and non numeric characters.
If starting with a zero, remove.
If starting with 610, remove.
If starting with 61, remove.
Re-add the country code in the specific format and format the rest of the phone number in a 3,3,4 format.
My question is: is there a way to simplify the below to perhaps one expression?
value = value.replace(/\D/g,'');
value = value.startsWith(0) ? value.substring(1) : value;
value = value.startsWith('610') ? value.substring(3) : value;
value = value.startsWith('61') ? value.substring(2) : value;
value = '+61 (0) ' + value.replace(/\d{3,4}?(?=...)/g, '$& ');
To expand on and explain @splash58's comment: they propose using two regular expressions to do the full replacement you want. The first (/\D|(0+|610|61)/gi) will remove all unwanted characters within the string. The second (/(\d{3})(\d{3})(\d{4})/gi) will take the remaining digits and capture the desired groupings so you can format them as desired. I highly suggest looking at the regex101 links they provided, as that site explains, in its right-hand panel, how and why a given expression matches what it does.
Short version:
/\D|(0+|610|61)/gi will match any NON-digit character OR a string of 0s, "610" or "61". Replace this with nothing to remove
/(\d{3})(\d{3})(\d{4})/gi will match a string of 10 digits and capture groups, that's what the parentheses are, of 3 digits, 3 digits and 4 digits. These can be referenced in the replacement as identifiers $1, $2 and $3 according to their position.
Putting it all together:
// look in a string and return formatted phone number only
function phone(str) {
str = str.replace(/\D|(0+|610|61)/gi, '');
str = str.replace(/(\d{3})(\d{3})(\d{4})/gi, '+61 (0) $1 $2 $3');
return str;
}
console.log(phone('xgsh6101231231234vvajx'));
console.log(phone('+6101231231234'));
I would also recommend first searching the entire input string for a series of digits or whitespace so that you end up with fewer false positives. This can be done with a regular expression like /[\d\s]+/.
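One thing to watch with the first expression is that it is not anchored, so it also removes any 0 or 61 that appears inside the number itself. A possible variation, sketched below under the assumption that the local number has 10 digits as in the examples above, strips the prefix only at the start of the digit string. formatPhone is a hypothetical name used for this sketch.

```javascript
// Hypothetical helper: returns the formatted number, or null if the digits don't fit.
function formatPhone(input) {
  // Keep digits only.
  const digits = input.replace(/\D/g, '');
  // Anchored with ^ so zeros or "61" inside the number itself are left alone:
  // leading zeros, then an optional 61 or 610 prefix.
  const local = digits.replace(/^0*(?:610?)?/, '');
  const m = /^(\d{3})(\d{3})(\d{4})$/.exec(local);
  return m ? `+61 (0) ${m[1]} ${m[2]} ${m[3]}` : null;
}

console.log(formatPhone('+6101231231234'));         // +61 (0) 123 123 1234
console.log(formatPhone('xgsh6101231231234vvajx')); // +61 (0) 123 123 1234
console.log(formatPhone('0123 610 0061'));          // +61 (0) 123 610 0061
console.log(formatPhone('12345'));                  // null
```

A local number that itself begins with 61 would still lose those digits, so this remains a sketch rather than a complete parser.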
You might match the number using:
^.*?\+?0*610?(\d{3})(\d{3})(\d{4})(?!\d).*$
And replace with:
+61 (0) $1 $2 $3
Explanation
^ Assert the start of the string
.*? Match 0+ characters non greedy
\+? Match an optional plus sign
0*610? Match 0+ times a zero, 61 with optional zero
(\d{3})(\d{3})(\d{4}) match 3 groups with 3, 3, and 4 digits
(?!\d) Negative lookahead to assert what follows is not a digit
.* Match 0+ characters
$ Assert the end of the string
const strings = [
"xgsh6101231231234vvajx",
"xgsh06101231231234vvajx",
"xgsh000006101231231234vvajx",
"+6101231231234",
"xgsh61012312312345vvajx",
"xgsh5101231231234vvajx",
"xgsh00000101231231234vvajx",
"xgsh6143545626455345601231231234vvajx"
];
let pattern = /^.*?\+?0*610?(\d{3})(\d{3})(\d{4})(?!\d).*$/;
strings.forEach((s) => {
console.log(s.replace(pattern, "+61 (0) $1 $2 $3"));
});

regex - don't allow name to finish with hyphen

I'm trying to create a regex using javascript that will allow names like abc-def but will not allow abc-
(hyphen is also the only nonalpha character allowed)
The name has to be a minimum of 2 characters. I started with
^[a-zA-Z-]{2,}$, but it's not good enough so I'm trying something like this
^([A-Za-z]{2,})+(-[A-Za-z]+)*$.
It can have more than one - in a name but it should never start or finish with -.
It's allowing names like xx-x but not names like x-x. I'd like to achieve that x-x is also accepted but not x-.
Thanks!
Option 1
This option matches strings that begin and end with a letter and ensures two - are not consecutive, so a string like a--a is invalid. To allow this case, see Option 2.
^[a-z]+(?:-?[a-z]+)+$
^ Assert position at the start of the line
[a-z]+ Match any lowercase ASCII letter one or more times (with i flag this also matches uppercase variants)
(?:-?[a-z]+)+ Match the following one or more times
-? Optionally match -
[a-z]+ Match any ASCII letter one or more times (with i flag)
$ Assert position at the end of the line
var a = [
"aa","a-a","a-a-a","aa-aa-aa","aa-a", // valid
"aa-a-","a","a-","-a","a--a" // invalid
]
var r = /^[a-z]+(?:-?[a-z]+)+$/i
a.forEach(function(s) {
console.log(`${s}: ${r.test(s)}`)
})
Option 2
If you want to match strings like a--a then you can instead use the following regex:
^[a-z]+[a-z-]*[a-z]+$
var a = [
"aa","a-a","a-a-a","aa-aa-aa","aa-a","a--a", // valid
"aa-a-","a","a-","-a" // invalid
]
var r = /^[a-z]+[a-z-]*[a-z]+$/i
a.forEach(function(s) {
console.log(`${s}: ${r.test(s)}`)
})
You can use a negative lookahead:
/(?!.*-$)^[a-z][a-z-]+$/i
Breakdown:
// Negative lookahead so that it can't end with a -
(?!.*-$)
// The actual string must begin with a letter a-z
[a-z]
// Any following characters can be a-z or -, there must be at least 1 of these
[a-z-]+
let regex = /(?!.*-$)^[a-z][a-z-]+$/i;
let test = [
'xx-x',
'x-x',
'x-x-x',
'x-',
'x-x-x-',
'-x',
'x'
];
test.forEach(string => {
console.log(string, ':', regex.test(string));
});
The problem is that the first group, ([A-Za-z]{2,})+, requires 2 or more letters. You will need to modify it to accept one or more letters:
^[A-Za-z]+((-[A-Za-z]{1,})+)?$
Edit: solved some commented issues
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-dfe'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('x-d'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-'); // Logs false
Edit 2: Edited to accept characters only
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('abc'); // Logs true
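As written, this pattern also accepts a single letter, while the question asks for a minimum of 2 characters. One way to add that rule, offered here only as a sketch, is a length lookahead in front of the same structure. The name namePattern and the isValidName wrapper are assumptions for this example.

```javascript
// (?=.{2,}$) requires at least 2 characters in total;
// the rest is letters, with single hyphens allowed only between letter runs.
const namePattern = /^(?=.{2,}$)[A-Za-z]+(?:-[A-Za-z]+)*$/;

// Hypothetical wrapper, named only for illustration.
function isValidName(name) {
  return namePattern.test(name);
}

console.log(isValidName('x-x'));  // true
console.log(isValidName('xx-x')); // true
console.log(isValidName('ab'));   // true
console.log(isValidName('x'));    // false (too short)
console.log(isValidName('x-'));   // false (ends with a hyphen)
console.log(isValidName('-x'));   // false (starts with a hyphen)
console.log(isValidName('a--a')); // false (consecutive hyphens)
```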
Use this if you want to accept such as A---A as well :
^(?!-|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/4UYd9l/4/
If you don't want to accept such as A---A do this:
^(?!-|.*[-]{2,}.*|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/qH4Q0q/4/
Both accept only strings of two or more characters from [A-Za-z-], and the negative lookahead (?!-|.*-$) rejects any string that starts or ends with -.
Try this /([a-zA-Z]{1,}-[a-zA-Z]{1,})/g
I suggest the following :
^[a-zA-Z][a-zA-Z-]*[a-zA-Z]$
It validates :
that the matched string is at least composed of two characters (the first and last character classes are matched exactly once)
that the first and the last characters aren't dashes (the first and last character classes do not include -)
that the string can contain dashes and be longer than 2 characters (the second character class includes dashes and will consume as many characters as needed, dashes included).
^(?=[A-Za-z](?:-|[A-Za-z]))(?:(?:-|^)[A-Za-z]+)+$
Asserts that
the first character is a-z
the second is a-z or hyphen
If this matches
looks for groups of one or more letters prefixed by a hyphen or start of string, all the way to end of string.
You can also use the i flag to make it case-insensitive.

Javascript regex to get substring, excluding a pattern?

I am still a beginner :)
I need to get a substring ignoring the last section inside [] (including the brackets []), i.e. ignore the [something inside] section in the end.
Note - There could be other single occurrences of [ in the string, and they should appear in the result.
Example
Input of the form -
1 checked arranged [1678]
Desired output -
1 checked arranged
I tried with this
var item = "1 checked arranged [1678]";
var parsed = item.match(/([a-zA-Z0-9\s]+)([(\[d+\])]+)$/);
|<-section 1 ->|<-section 2->|
alert(parsed);
I tried to mean the following -
section 1 - multiple occurrences of words (containing literals and nos.) followed by spaces
section 2 - ignore the pattern [something] in the end.
But I am getting 1678],1678,] and I am not sure which way it is going.
Thanks
OK here is the problem in your expression
([a-zA-Z0-9\s]+)([(\[d+\])]+)$
The Problem is only in the last part
([(\[d+\])]+)$
^ ^
here you are creating a character class,
which you don't want because everything inside will be matched literally.
((\[d+\])+)$
^ ^^
here you create a capturing group and repeat this at least once ==> not needed
(\[d+\])$
^
here you want to match digits but forgot to escape the d (\d)
That brings us to
([a-zA-Z0-9\s]+)(\[\d+\])$
See it here on Regexr, the complete string is matched, the section 1 in capturing group 1 and section 2 in group 2.
When you now replace the whole thing with the content of group 1 you are done.
You could do this
var s = "1 checked arranged [1678]";
var a = s.indexOf('[');
var b = s.substring(0,a);
alert(b);
http://jsfiddle.net/jasongennaro/ZQe6Y/1/
This s.indexOf('['); checks for where the first [ appears in the string.
This s.substring(0,a); chops the string, from the beginning to the first [.
Of course, this assumes the string is always in a similar format
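Because the question mentions that other single [ characters may appear earlier in the string, a variation on this idea is to look for the last [ instead of the first. The following is a minimal sketch; stripTrailingBracket is a hypothetical helper name, and it assumes the part to drop is always at the very end of the string.

```javascript
// Hypothetical helper: removes one trailing "[...]" block, if there is one.
function stripTrailingBracket(s) {
  const open = s.lastIndexOf('[');
  // Only cut when a "[" exists and the string really ends with "]".
  if (open === -1 || !s.trimEnd().endsWith(']')) return s;
  // Drop everything from the last "[" on, plus the whitespace before it.
  return s.substring(0, open).trimEnd();
}

console.log(stripTrailingBracket('1 checked arranged [1678]')); // "1 checked arranged"
console.log(stripTrailingBracket('a [b c [1678]'));             // "a [b c"
console.log(stripTrailingBracket('no brackets here'));          // "no brackets here"
```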
var item = '1 check arranged [1678]',
matches = item.match(/(.*)(?=\[\d+\])/);
alert(matches[1]);
The regular expression I used makes use of a positive lookahead to exclude the undesired portion of the string. The bracketed number must be a part of the string for the match to succeed, but it will not be returned in the results.
Here you can find how to delete stuff inside square brackets. This will leave you with the rest. :)
Regex: delete contents of square brackets
try this if you only want to get rid of that [] in the end
var parsed = item.replace(/\s*\[[^\]]*\]$/,"")
var item = "1 checked arranged [1678]";
var parsed = item.replace(/\s\[.*/,"");
alert(parsed);
That work as desired?
Use escaped brackets and non-capturing parentheses:
var item = "1 checked arranged [1678]";
var parsed = item.match(/([\w\s]+)(?:\s+\[\d+\])$/);
alert(parsed[1]); //"1 checked arranged"
Explanation of regex:
([\w\s]+) //Match alphanumeric characters and spaces
(?: //Start of non-capturing parentheses
\s+ //Match the whitespace before the bracket, keeping it out of the captured group
\[ //Bracket literal
\d+ //One or more digits
\] //Bracket literal
) //End of non-capturing parentheses
$ //End of string

validating variable in javascript

Hi, I have a field in PHP that will be validated in JavaScript using a regex, e.g. for emails:
var emailRegex = /^[\w-\.]+#([\w-]+\.)+[\w-]{2,4}$/;
What i'm after is a validation check which will look for the
first letter as a capital Q
then the next letters can be numbers only
then followed by a .
then two numbers only
and then an optional letter
e.g. Q100.11 or Q100.11a
I must admit i look at the above email validation check and i have no clue how it works but it does ;)
many thanks for any help on this
Steve
The ^ marks the beginning of the string, $ matches the end of the string. In other words, the whole string should exactly match this regular expression.
[\w-\.]+: I think you wanted to match letters, digits, dots and - only. In that case, the - should be escaped (\-): [\w\-\.]+. The plus sign makes it match one or more times.
#: a literal # match
([\w-]+\.)+ letters, digits and - are allowed one or more times, with a dot after it (between the parentheses). This may occur several times (at least once).
[\w-]{2,4}: this should match the TLD, like com, net or org. Because a TLD can only contain letters, it should be replaced by [a-z]{2,4}. This means: lowercase letters may occur two to four times. Note that a TLD can be longer than 4 characters.
A regular expression that follows these rules:
a capital Q (Q)
followed by one or more occurrences of digits (\d+)
a literal dot (\.)
two digits (\d{2})
one optional letter ([a-z]?)
Result:
var regex = /^Q\d+\.\d{2}[a-z]?$/;
If you need to match strings case-insensitively, add the i (case-insensitive) modifier:
var regex = /^Q\d+\.\d{2}[a-z]?$/i;
Validating a string using a regexp can be done in several ways, one of them:
if (regex.test(str)) {
// success
} else {
// no match
}
var emailRegex = /^Q\d+\.\d{2}[a-zA-Z]?#([\w-]+\.)+[a-zA-Z]+$/;
var str = "Q100.11#test.com";
alert(emailRegex.test(str));
var regex = /^Q[0-9]+\.[0-9]{2}[a-z]?$/;
+ means one or more
the period must be escaped - \.
[0-9]{2} means 2 digits, same as \d{2}
[a-z]? means 0 or 1 letter
You can check your regex at http://regexpal.com/
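To see how the anchored pattern behaves, it can be run over a few sample values. This sketch reuses the regex from this answer; the isValidCode wrapper and the sample list are assumptions added for illustration.

```javascript
// The anchored pattern from the answer above.
const codePattern = /^Q[0-9]+\.[0-9]{2}[a-z]?$/;

// Hypothetical wrapper, named only for illustration.
function isValidCode(value) {
  return codePattern.test(value);
}

const samples = ['Q100.11', 'Q100.11a', 'q100.11', 'Q100.1', 'Q.11', 'Q100.11ab', 'Q100.11A'];
samples.forEach((s) => console.log(s, isValidCode(s)));
// Q100.11 true
// Q100.11a true
// q100.11 false   (lowercase q)
// Q100.1 false    (only one digit after the dot)
// Q.11 false      (no digits after Q)
// Q100.11ab false (two trailing letters)
// Q100.11A false  ([a-z] is lowercase only without the i flag)
```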
