Regex to detect dates separated by newlines - javascript

I'm trying to validate text that's in the format of dates separated by newlines.
The date format needs to be in the form of MM-DD-YYYY.
So a sample could be
MM-DD-YYYY\n
MM-DD-YYYY\n
MM-DD-YYYY
Where there could be an infinite amount of dates entered that are separated by newlines
I've tried /^(\d{2})-(\d{2})-(\d{4})\s+$/ but that doesn't seem to fully work.
Note: I want this to allow for any leading, trailing whitespace and empty newlines as well.
Basically,
A space character
A carriage return character
A newline character
I'm not partial to using regexes. If another way is simpler, preferable, or more efficient, then I'd gladly switch to that. Thanks!

To validate a string with multiple date-like strings in it with or without leading/trailing whitespace, allowing empty/blank lines, you may use
A method to split the text into lines and use .every() to test each line against a simple pattern:
text.split("\n").every(x => /^\s*(?:\d{2}-\d{2}-\d{4}\s*)?$/.test(x))
NOTE: This will validate a blank input!
Details
^ - start of string
\s* - 0+ whitespaces
(?: - starts a non-capturing group
\d{2}-\d{2}-\d{4} - two digits, -, two digits, - and four digits
\s* - 0+ whitespaces
)? - end of the group, repeat 1 or 0 times (it is optional)
$ - end of string.
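As a minimal sketch, the split-and-test approach above can be wrapped in a small function. The names isDateList and LINE_PATTERN are made up for illustration; the pattern is the one explained above. Note that it only checks the digit layout, not whether the month or day is a real calendar value.

```javascript
// Hypothetical helper name; the pattern is the per-line regex from the answer.
const LINE_PATTERN = /^\s*(?:\d{2}-\d{2}-\d{4}\s*)?$/;

function isDateList(text) {
  // A "\r" left over from "\r\n" line endings counts as whitespace,
  // so the \s* parts of the pattern absorb it.
  return text.split("\n").every(line => LINE_PATTERN.test(line));
}

console.log(isDateList("01-01-2020\r\n 02-15-2021 \r\n\r\n")); // true
console.log(isDateList("01-01-2020\n2020-01-01"));             // false
console.log(isDateList(""));                                   // true (blank input passes)
```

If blank input should be rejected, one option is to additionally require that at least one line contains a date.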
A single regex for the multiline string
/^\s*\d{2}-\d{2}-\d{4}(?:[^\S\n]*\n\s*\d{2}-\d{2}-\d{4})*\s*$/.test(text)
See the regex demo. This will not validate blank input.
This regex is long, but it is still efficient since backtracking is minimal. In the [^\S\n]*\n\s* part, the first [^\S\n]* matches any whitespace except a line feed, then \n matches a newline (hence, no backtracking here), and then \s* matches 0+ whitespace (again, \n is not quantified, so there is no backtracking into the pattern if \s* fails). The (?:[^\S\n]*\n\s*\d{2}-\d{2}-\d{4})* part is a *-quantified non-capturing group that matches 0 or more occurrences of the enclosed pattern sequence.
JS demos:
var matching_text = "\n01-01-2020\n 01-01-2020\n01-01-2020 \n\n\n 01-01-2020 \n";
var non_matching_text = "\n01-01-2020\n 01-01-2020\n01-01-2020 \n\n\n 01-01-2020 \n01201-01-20202020";
var regex_1 = /^\s*(?:\d{2}-\d{2}-\d{4}\s*)?$/;
var regex_2 = /^\s*\d{2}-\d{2}-\d{4}(?:[^\S\n]*\n\s*\d{2}-\d{2}-\d{4})*\s*$/;
// Test Solution 1:
console.log(matching_text.split("\n").every(x => regex_1.test(x))); // => true
console.log(non_matching_text.split("\n").every(x => regex_1.test(x))); // => false
// Test Solution 2:
console.log(regex_2.test(matching_text)); // => true
console.log(regex_2.test(non_matching_text)); // => false

You can use something like below to get all matches that satisfy the regular expression. Notice the parentheses () are only around the date part (\d{2}-\d{2}-\d{4}) so that is what you will end up capturing. Since the global flag g is also set on the regex, this will return all occurrences of the parenthesized expression.
Edit: added support for a leading and trailing whitespace.
Edit 2: added ^ and $ so the regex doesn't allow for more than 2 digits in day and more than 4 digits in year.
Run and test:
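// \s (not [\\s], which matches a backslash or the letter "s") consumes surrounding whitespace
let regex = /\s*(\d{2}-\d{2}-\d{4})\s*/g;
let dates = " 12-02-2020 \n 09-10-2020\n 03-03-2020 ";
// matchAll exposes capture groups; m[1] is the date without the whitespace
console.log( Array.from(dates.matchAll(regex), m => m[1]) );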
EDIT: In order to validate the string of dates you could use the regex.test() method like this:
let regex = /^\s*\d{2}-\d{2}-\d{4}\s*$/;
let dateString = " 12-02-2020 \n 09-10-2020\n 03-03-2020 ";
var dates = dateString.split('\n');
// every() stops at the first failing line; a return inside forEach() would be ignored
var datesValid = () => dates.every((el) => regex.test(el));
console.log( datesValid() );

Related

What is the regex to match alphanumeric 6 character words, separated by space or comma

I am newbie in RegEx and trying to design a RegEx which could match the String like below:
pattern 1 separated by comma and a space: KEPM39, JEMGH5, HEPM21 ... (repeat)
pattern 2 separated only by a space: KEPM39 JEMGH5 HEPM21 ... (repeat)
pattern 3 separated only by a comma: KEPM39,JEMGH5,HEPM21 ... (repeat)
this is my concept: "^[a-zA-Z0-9]{6,}[,\s]+$" but it seems wrong.
#I want to validate the whole string, and I use javascript & html to validate user input. (textarea)
#duplicate change to repeat to be more suitable.
function validate(){
var term = "JEPM34, KEPM11 ";
var re = new RegExp("^[a-zA-Z0-9]{6,}[,\s]+$");
if (re.test(term)) {
return true
} else {
return false
}
}
Thank you in advance!
A very loose way to validate could be:
^[A-Z\d]{6}(?:[ ,]+[A-Z\d]{6})*$
See the online demo. By loose, I mean that [ ,]+ does not check that every delimiter in your string is the same. Therefore even "KEPM39, JEMGH5 HEPM21, HEGD44 ZZZZZZ" would be valid.
If you want consistent delimiters, and there can be trailing spaces (as there is in your example data) you can use a capture group with a backreference \1 to keep consistent delimiters and match optional spaces at the end.
Note that you can also use \s but that could also match a newline.
Using test will return a boolean, so you don't have to return true or false explicitly; you can return the result of test directly.
^[A-Z\d]{6}(?:(, ?| )(?:[A-Z\d]{6}\1)*[A-Z\d]{6} *)?$
The pattern matches:
^ Start of string
[A-Z\d]{6} Match 6 occurrences of a char A-Z or a digit
(?: Non capture group to match as a whole
(, ?| ) Capture group 1, match either a comma and optional space, or a space to be used as a backreference
(?:[A-Z\d]{6}\1)* Optionally repeat any of the listed followed by a backreference \1 to group 1 which will match the same delimiter
[A-Z\d]{6} * Match any of the listed and optional spaces at the end
)? Close the group and make it optional to also match an instance without delimiters
$ End of string
Regex demo
const regex = /^[A-Z\d]{6}(?:(, ?| )(?:[A-Z\d]{6}\1)*[A-Z\d]{6} *)?$/;
const validate = term => regex.test(term);
[
"KEPM39, JEMGH5, HEPM21",
"KEPM39 JEMGH5 HEPM21",
"KEPM39,JEMGH5,HEPM21",
"JEPM34, KEPM11 ",
"JEPM34, KEPM11",
"JEPM34",
"KEPM39, JEMGH5 HEPM21, HEGD44 ZZZZZZ",
"KEPM39, JEMGH5 HEPM21"
].forEach(s =>
console.log(`${s} ==> ${validate(s)}`)
);
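The pattern above accepts only uppercase letters, while the character class in the question also allowed lowercase. Assuming lowercase input should be accepted too, a minimal sketch is to add the i flag; the names anyCase and validateAnyCase are made up for illustration.

```javascript
// Same pattern as above, with the i flag so a-z is accepted as well (an assumption
// based on the [a-zA-Z0-9] class in the question).
const anyCase = /^[A-Z\d]{6}(?:(, ?| )(?:[A-Z\d]{6}\1)*[A-Z\d]{6} *)?$/i;
const validateAnyCase = term => anyCase.test(term);

console.log(validateAnyCase("kepm39, jemgh5"));         // true
console.log(validateAnyCase("Kepm39 Jemgh5 Hepm21"));   // true
console.log(validateAnyCase("KEPM39, JEMGH5 HEPM21"));  // false (mixed delimiters)
```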

Regex remove all leading and trailing special characters?

Let's say I have the following string in javascript:
&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&
I want to remove all the leading and trailing special characters (anything which is not alphanumeric or alphabet in another language) from all the words.
So the string should look like
a.b.c a.b.c a.b.c a.b.c a.b&.c a.b.&&dc ê.b..c
Notice how the special characters in between the alphanumeric is left behind. The last ê is also left behind.
This regex should do what you want. It looks for
start of line, or some spaces (^| +) captured in group 1
some number of symbol characters [!-\/:-@\[-`\{-~]*
a minimal number of non-space characters ([^ ]*?) captured in group 2
some number of symbol characters [!-\/:-@\[-`\{-~]*
followed by a space or end-of-line (using a positive lookahead) (?=\s|$)
Matches are replaced with just groups 1 and 2 (the spacing and the characters between the symbols).
let str = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&';
str = str.replace(/(^| +)[!-\/:-@\[-`\{-~]*([^ ]*?)[!-\/:-@\[-`\{-~]*(?=\s|$)/gi, '$1$2');
console.log(str);
Note that if you want to preserve a string of punctuation characters on their own (e.g. as in Apple & Sauce), you should change the second capture group to insist on there being one or more non-space characters (([^ ]+?)) instead of none and add a lookahead after the initial match of punctuation characters to assert that the next character is not punctuation:
let str = 'Apple &&& Sauce; -This + !That!';
str = str.replace(/(^| +)[!-\/:-@\[-`\{-~]*(?![!-\/:-@\[-`\{-~])([^ ]+?)[!-\/:-@\[-`\{-~]*(?=\s|$)/gi, '$1$2');
console.log(str);
a-zA-Z\u00C0-\u017F is used to capture all valid characters, including diacritics.
The following is a single regular expression to capture each individual word. The logic is that it will look for the first valid character as the beginning of the capture group, and then the last sequence of invalid characters before a space character or string terminator as the end of the capture group.
const myRegEx = /[^a-zA-Z\u00C0-\u017F]*([a-zA-Z\u00C0-\u017F].*?[a-zA-Z\u00C0-\u017F]*)[^a-zA-Z\u00C0-\u017F]*?(\s|$)/g;
let myString = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&'.replace(myRegEx, '$1$2');
console.log(myString);
Something like this might help:
const string = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&';
const result = string.split(' ').map(s => /^[^a-zA-Z0-9ê]*([\w\W]*?)[^a-zA-Z0-9ê]*$/g.exec(s)[1]).join(' ');
console.log(result);
Note that this is not one single regex, but uses JS help code.
Rough explanation: We first split the string into an array of strings, divided by spaces. We then transform each of the substrings by stripping the leading and trailing special characters. We match the special characters with [^a-zA-Z0-9ê]*; because of the leading ^ inside the brackets, the class matches every character except those listed, so all special characters. Between these two classes we capture all relevant characters with ([\w\W]*?). \w matches word characters and \W matches non-word characters, so [\w\W] matches any character. By appending ? after *, we make the quantifier lazy, so the group stops as soon as the following class, which matches the trailing special characters, can match through to the end. We also start the regex with ^ and end it with $ so that it covers the entire substring (they anchor the match to the start and the end of the string, respectively). With .exec(s)[1] we then execute the regex on the substring and return the first capturing group in our transform function. Note that this is an empty string if a substring contains no such characters. At the end we join the substrings with spaces.
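Since the question asks about letters from other languages as well, here is a minimal sketch of the same split-and-strip idea using Unicode property escapes (\p{L} for letters, \p{N} for numbers, with the u flag) instead of listing characters such as ê by hand. The name trimSpecials is made up for illustration, and the sketch assumes words are separated by single spaces.

```javascript
// Hypothetical helper: strips non-letter, non-digit characters from both ends of each word.
function trimSpecials(text) {
  // Either a run of "special" characters at the start, or one that reaches the end.
  const edges = /^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu;
  return text
    .split(' ')
    .map(word => word.replace(edges, ''))
    .join(' ');
}

const input = '&a.b.c. &a.b.c& .&a.b.c.&. *;a.b.c&*. a.b&.c& .&a.b.&&dc.& &ê.b..c&';
console.log(trimSpecials(input));
// a.b.c a.b.c a.b.c a.b.c a.b&.c a.b.&&dc ê.b..c
```

A word made up only of special characters becomes an empty string here, which leaves two adjacent spaces in the result.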

Regular expression that matches 5 (exactly) comma separated currency values

I need to match 5 occurrences of comma separated currency values.
I do have this regex that does the job, but I don't think it's a great way to do it.
^(\$[0-9]{1,3}(?:[,.]?[0-9]{3})*(?:\.[0-9]{2})?,\s?){4}(\$[0-9]{1,3}(?:[,.]?[0-9]{3})*(?:\.[0-9]{2})?)$
P.S. I had to split the expression into matching, 4 comma separated occurrences and 1 to sniff out trailing comma (I don't think that's the way to do it)
Some of the valid matching inputs could be,
$200,000,$525,$60000,$120,000,$65,456 (space between currency values is optional)
$200,000, $525, $60000,$120,000, $65,456
Some of the invalid input values,
$200,000,$525,$60000,$120,000,$65,456, (Trailing comma)
$200,000,,$525,$60000.$120,000,$65,456,, etc
Any pointers would be greatly appreciated.
Edit: The solution I am looking at is a pure reg ex solution (better than what I have written above), so that I can fire validations as soon as erroneous inputs are entered by the user.
Update
If you want to match while validating prices you could do this which follows:
Including both dot and comma for formatting prices
Max one space character between prices
^\$\d+([,.]\d{3})*( ?, ?\$\d+([,.]\d{3})*){4}$
Live demo
Breakdown:
^ Match start of input string (or line if m flag is set)
\$\d+ Match a $ that precedes a number of digits
( Start of grouping (#1)
[,.]\d{3} Match a period or comma that precedes 3 digits
)* End of grouping (#1), repeat zero or more times
( Start of grouping (#2)
?, ? Match a comma with an optional single space on either side (in the pattern, the first ? follows a literal space)
\$\d+ Match a $ that precedes a number of digits
([,.]\d{3})* Match a period or comma that precedes 3 digits (thousands separator), repeat zero or more times
){4} End of grouping (#2), repeat exactly 4 times
$ End of input string (or line if m flag is set)
JS code:
var re = /^\$\d+([,.]\d{3})*( ?, ?\$\d+([,.]\d{3})*){4}$/g;
var prices = ['$200,000,$525,$60000,$120,000,$65,456',
'$200,000, $525, $60000,$120,000, $65,456',
'$200,000,$525,$60000,$120,000,$65,456, ',
'$200,000,,$525,$60000.$120,000,$65,456,,'];
prices.forEach(function(s) {
console.log(s + " => " + Boolean(s.match(re)))
})
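To avoid writing the price sub-pattern twice, one option is to build the regex from a string. This is only a sketch: PRICE and priceListPattern are made-up names, and the price rule is the one from the pattern above (a $, digits, then optional comma or dot groups of three digits).

```javascript
// One price: "$", digits, then optional groups of a comma or dot plus 3 digits.
const PRICE = String.raw`\$\d+(?:[,.]\d{3})*`;

// Hypothetical helper: builds a validator for exactly `count` comma-separated prices.
function priceListPattern(count) {
  return new RegExp(`^${PRICE}(?: ?, ?${PRICE}){${count - 1}}$`);
}

const fivePrices = priceListPattern(5);
console.log(fivePrices.test("$200,000,$525,$60000,$120,000,$65,456"));    // true
console.log(fivePrices.test("$200,000, $525, $60000,$120,000, $65,456")); // true
console.log(fivePrices.test("$200,000,$525,$60000,$120,000,$65,456,"));   // false
```

String.raw keeps the backslashes intact, so the sub-pattern reads the same as it would inside a regex literal.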
This regex is a simpler version of what you're trying to achieve:
^(?:\$\d{1,3}(?:,?\d{3})*[,.] ?){4}\$\d{1,3}(?:,?\d{3})*$
-------------------------------
The underlined part matches 4 "prices" as you've defined, followed by a dot/comma and an optional space.
The rest matches the last "price".
Please let me know if something is unclear
The most prevalent character to base the pattern on is \$ (escaped), whether it is the first character of the string or preceded by a comma (optionally followed by whitespace), that is done using (?:^|,)\s*. After that you want any number of digits, which is \d+, optionally followed by a comma which is immediately followed by digits again; ,\d+.
Combining these, you'd get; /(?:^|,)\s*(\$\d+(?:,\d+)?)/g
const pattern = /(?:^|,|\.)\s*(\$\d+(?:,\d+)?)/g;
const test = [
'$200,000,$525,$60000,$120,000,$65,456',
'$200,000, $525, $60000,$120,000, $65,456',
'$200,000,$525,$60000,$120,000,$65,456,',
'$200,000,,$525,$60000.$120,000,$65,456,,',
];
const matches = test.reduce((carry, string) => {
let match = null;
while (match = pattern.exec(string)) {
carry.push(match[1]);
}
return carry;
}, []);
console.log(matches);
Added the extra examples from the modified question, including the . which now appeared as separator ($200,000,,$525,$60000.$120,000,$65,456,,) and modified the pattern in the example to account for this.

Regex for special characters in between in JavaScript

I am trying to create regex which check if string has any special characters only in between. So I am checking for following cases:
"BX_#PO" -- Invalid
"40-66-7" -- Invalid
"_BXTP" -- Valid
"abc123?" -- Valid
"BXTP#" -- Valid
"PO#GO_" -- Invalid
I am trying the code below, but it checks for special characters anywhere in the string, not only in between.
const hasSpecialCharacters = (str) => {
return !/[~`!#$%\^&*+=\-\[\]\\';,/{}|\\":<>\?]/g.test(str);
}
Try this regex:
^[\s\S][a-zA-Z0-9]*[\s\S]$
Click for Demo
Explanation:
^ - asserts the start of the string
[\s\S] - matches any character
[a-zA-Z0-9]* - matching 0+ occurrences of only letters and digits in the middle. Thus, not allowing special characters in the middle
[\s\S] - matches any character
$ - asserts the end of the string
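As a quick check, the pattern explained above can be run against the six cases from the question. The names edgeOnly and isValid are made up for illustration. One caveat of this pattern: it needs at least two characters, so a one-character string is rejected.

```javascript
// Any first character, only letters/digits in the middle, any last character.
const edgeOnly = /^[\s\S][a-zA-Z0-9]*[\s\S]$/;
const isValid = str => edgeOnly.test(str);

["BX_#PO", "40-66-7", "_BXTP", "abc123?", "BXTP#", "PO#GO_"].forEach(s =>
  console.log(`${s} -- ${isValid(s) ? "Valid" : "Invalid"}`)
);
```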
Just like T.J's answer, but going for the readable way around:
const hasSpecialCharacters = (str) => {
return !/[a-zA-Z0-9][^a-zA-Z0-9]+[a-zA-Z0-9]/g.test(str)
}

javascript regex to check if first and last character are similar?

Is there any simple way to check if first and last character of a string are the same or not, only with regex?
I know you can check with charAt
var firstChar = str.charAt(0);
var lastChar = str.charAt(str.length - 1);
console.log(firstChar === lastChar);
I'm not asking for this : Regular Expression to match first and last character
You can use a regex with a capturing group and its backreference to assert that the starting and ending characters are the same, by capturing the first character. To test the regex match, use the RegExp#test method.
var regex = /^(.).*\1$/;
console.log(
regex.test('abcdsa')
)
console.log(
regex.test('abcdsaasaw')
)
Regex explanation here :
^ asserts position at start of the string
1st Capturing Group (.)
.* matches any character (except newline) - between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\1 matches the same text as most recently matched by the 1st capturing group
$ asserts position at the end of the string
The . doesn't match newline characters; to include newlines, update the regex as follows.
var regex = /^([\s\S])[\s\S]*\1$/;
console.log(
regex.test(`abcd
sa`)
)
console.log(
regex.test(`ab
c
dsaasaw`)
)
Refer : How to use JavaScript regex over multiple lines?
Regex explanation here :
[.....] - Match a single character present
\s - matches any whitespace character (equal to [\r\n\t\f\v ])
\S - matches any non-whitespace character (equal to [^\r\n\t\f\v ])
Finally, [\s\S] matches any character.
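As an alternative sketch, the s (dotAll) flag makes . match newlines too, so the original pattern can stay close to its first form. The variant below also makes the tail optional so that a one-character string counts as starting and ending with the same character; that behaviour is an assumption, and the name sameEnds is made up for illustration.

```javascript
// s flag: "." also matches line terminators.
// (?:.*\1)? is optional, so a single character is accepted on its own.
const sameEnds = /^(.)(?:.*\1)?$/s;

console.log(sameEnds.test("abcda"));     // true
console.log(sameEnds.test("a"));         // true
console.log(sameEnds.test("ab\ncd\na")); // true
console.log(sameEnds.test("abcdb"));     // false
```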
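You can try this. Note that because of the + quantifier, group 1 may capture more than one character, so this checks that the string starts and ends with the same substring: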
const rg = /^([\w\W]+)[\w\W]*\1$/;
console.log(
rg.test(`abcda`)
)
console.log(
rg.test(`aebcdae`)
)
console.log(
rg.test(`aebcdac`)
)
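// [ab] instead of [a|b]: inside a character class, | is a literal character.
// * instead of + so that two-character strings such as "aa" also pass.
var rg = /^([ab])([ab]*)\1$|^[ab]$/;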
console.log(rg.test('aabbaa'))
console.log(rg.test('a'))
console.log(rg.test('b'))
console.log(rg.test('bab'))
console.log(rg.test('baba'))
This will make sure that characters are none other than a and b which have the same start and end.
It will also match single characters because they too start and end with same character.
