Parsing array syntax using regex in javascript - javascript

I found the answer for this here but it's in php.
I would like to match an array like [123, "hehe", "lala"] but only if the array syntax is correct.
I made this regex /\["?.+"?(?:,"?.+"?)*\]/.
The problem is that if the input is [123, "hehe, "lala"], the regex matches even though the syntax is incorrect.
How can I make it only match if the array syntax is correct?
My problem is making the second " required when the first " is matched.
Edit: I'm trying to do this only with strings and numbers inside the array.

You can try this regex: /\[((\d+|"([^"]|\\")*?")\s*,?\s*)*(?<!,)\]/
Each item should be either:
"([^"]|\\")*?": a string that starts and ends with " and contains anything but ". If a " appears inside the string, it must be escaped (\").
\d+: a number
After each item comes:
\s*,?\s*: an optional comma with any number of spaces before or after it.
Before the closing bracket there must not be a comma: (?<!,)
Demo: https://regex101.com/r/jRAQUc/1
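As a quick check, the pattern above can be run against a few inputs. This is only a sketch: the names pattern, samples and results, as well as the sample inputs, are illustrative, and the lookbehind (?<!,) assumes an engine with ES2018 lookbehind support.

```javascript
// Pattern copied from the answer above; (?<!,) needs ES2018 lookbehind support.
const pattern = /\[((\d+|"([^"]|\\")*?")\s*,?\s*)*(?<!,)\]/;

// Illustrative inputs (assumptions, not taken from the linked demo).
const samples = [
  '[123, "hehe", "lala"]', // well-formed
  '[123, "hehe, "lala"]',  // closing quote missing after hehe
  '[1, 2,]',               // comma directly before the closing bracket
  '[1 2]',                 // no comma between the two items
];

const results = samples.map((text) => pattern.test(text));

samples.forEach((text, i) => {
  console.log(text, "=>", results[i]);
});
```

Because the comma is written as ,? it is optional, so an input such as [1 2] is accepted as well. The pattern is also unanchored, so it can match an array embedded in a longer string.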

You must have two (or more) separate expressions (using the | operator) in order to do that.
So it would be something like this:
/\[\s*("[^"]*"|[0-9]+)(\s*,\s*("[^"]*"|[0-9]+))*\s*\]/
(You may also want to use ^ at the start and $ at the end to make sure nothing else appears before/after the array: /^...snip...$/ to match the string from start to finish.)
If you need floating point numbers with exponents, add a period and the 'e' character: [0-9.eE]+ (which is why I did not use \d+, since that allows only digits). Note that [0-9.eE]+ also accepts invalid numbers such as 1.2.3; making sure a number is valid is much more complicated (sign, exponent with or without sign, digits only before or after the decimal point...).
You could also support single quoted strings. That too is a separate expression: '[^']*'.
You may want to allow spaces before and after the brackets too (start: /^\s*\[... and end: ...\]\s*$/).
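Finally, if you really want to support JavaScript strings, you would need to add support for the backslash. Something like this: ("([^"\\]|\\.)*"). The backslash is excluded from the first alternative so that an escaped quote (\") cannot end the string.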
Note
Your .+ expression also matches " and , and, without the ^ and $, input like the following matches your expression just fine:
This Array ["test", 123, true, "this"] Here
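Putting these suggestions together, a minimal sketch might look like the following. The names item, arrayPattern and isSimpleArray are hypothetical. The sketch assumes only double-quoted strings (with backslash escapes) and unsigned integers, and it additionally allows an empty array, which the pattern above does not.

```javascript
// One item: a double-quoted string (backslash escapes allowed) or an unsigned integer.
const item = String.raw`(?:"(?:[^"\\]|\\.)*"|[0-9]+)`;

// Anchored with ^ and $ so nothing may appear before or after the array.
// The optional group around the items is an assumption that lets [] match too.
const arrayPattern = new RegExp(
  String.raw`^\s*\[\s*(?:${item}(?:\s*,\s*${item})*)?\s*\]\s*$`
);

// Hypothetical helper: true only if the whole text is a well-formed simple array.
function isSimpleArray(text) {
  return arrayPattern.test(text);
}

console.log(isSimpleArray('[123, "hehe", "lala"]'));
console.log(isSimpleArray('[123, "hehe, "lala"]'));
console.log(isSimpleArray('This Array ["test", 123, true, "this"] Here'));
```

Building the pattern from a reusable item string keeps the "first item, then comma-separated items" structure readable, and extending item (for example with '[^']*' for single-quoted strings) changes both places at once.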

Related

How to use 'or' operator in regex properly?

I am trying out JavaScript's regular expressions.
I understand that '|' is used for or-ing two regular expressions.
I created a regex /^a*|b*$/, and I want it to detect any string that contains only the character 'a' or only the character 'b'.
But when I try /^a*|b*$/.test('c'), it produces true.
What am I misunderstanding about the '|' operator?
Here's my code:
let reg = /^a*|b*$/;
< undefined
reg.test('c');
< true
| has very low precedence. ^a*|b*$ matches
either ^a*
or b*$
i.e. either a string beginning with 0 or more 'a's or a string ending with 0 or more 'b's. (Because matching 0 'a's is allowed by the regex, any string will match (because every string has a beginning).)
To properly anchor the match on both sides, you need
/^(?:a*|b*)$/
(the (?: ) construct is a non-capturing group).
You could also use
/^a*$|^b*$/
instead.
Note that both of these regexes will only match strings like aa, bbbbbb, etc., but not aba. If you want to allow the use of mixed a/b characters in a string, you need something like
/^(?:a|b)*$/
The OR in your example will split the expression in these alternatives:
^a* and b*$.
You can use groups to delimit the alternatives, something like:
/^(a*|b*)$/
This will match empty strings, strings that contain only a characters, and strings that contain only b characters.
If you're looking to match strings that contain both a and b characters, you can use something like:
/^[ab]*$/
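The difference between these patterns can be checked with a short sketch. The variable names and the sample strings are illustrative, not part of the original question.

```javascript
const unanchored = /^a*|b*$/;     // parsed as (^a*) OR (b*$)
const sameLetter = /^(?:a*|b*)$/; // whole string is only a's or only b's
const anyMix = /^[ab]*$/;         // whole string is any mix of a's and b's

for (const text of ["", "aa", "bbb", "aba", "c"]) {
  console.log(
    JSON.stringify(text),
    unanchored.test(text),
    sameLetter.test(text),
    anyMix.test(text)
  );
}
```

The first pattern reports true for every input because ^a* can match zero characters at the start of any string.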

Regex pattern in Javascript

I want to match a string pattern which has first 4 characters, then the "|" symbol, then 4 characters, then the "|" symbol again and then a minimum of 7 characters.
For example, "test|test|test123" should be matched.
I tried RegExp("^([a-za-z0-9-|](4)[a-za-z0-9-|](5)[a-za-z0-9-|](3)+)$") for this, but it didn't match my test case.
test|test|test1234
Ramesh, does this do what you want?
^[a-zA-Z0-9-]{4}\|[a-zA-Z0-9-]{4}\|[a-zA-Z0-9-]{7,}$
You can try it at https://regex101.com/r/jilO6O/1
For example, the following will be matched:
test|test|test123
a1-0|b100|c10-200
a100|b100|c100200
But the following will not:
a10|b100|c100200
a100|b1002|c100200
a100|b100|c10020
Tips on modifying your original code.
You have "a-za-z" where you probably intended "a-zA-Z", to allow either upper or lower case.
To specify the number of characters to be exactly 4, use "{4}". You were nearly there with your round brackets, but they need to be curly, to specify a count.
To specify a range of number of characters, use "{lowerLimit,upperLimit}". Leaving the upper limit blank allows unlimited repeats.
We need to escape the "|" character because it has the special meaning of "alternate", in regular expressions, i.e. "a|b" matches either "a" or "b". By writing it as "\|" the regex interpreter knows we want to match the "|" character itself.
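As a sketch, the pattern can be checked against the examples listed above. The variable names are hypothetical. Because the question used the RegExp constructor with a string, the sketch also shows that form, where each backslash has to be doubled inside the string literal.

```javascript
// Regex literal form.
const literal = /^[a-zA-Z0-9-]{4}\|[a-zA-Z0-9-]{4}\|[a-zA-Z0-9-]{7,}$/;

// Constructor form: "\\|" in the string becomes \| in the pattern.
const constructed = new RegExp(
  "^[a-zA-Z0-9-]{4}\\|[a-zA-Z0-9-]{4}\\|[a-zA-Z0-9-]{7,}$"
);

const shouldMatch = ["test|test|test123", "a1-0|b100|c10-200", "a100|b100|c100200"];
const shouldFail = ["a10|b100|c100200", "a100|b1002|c100200", "a100|b100|c10020"];

for (const text of [...shouldMatch, ...shouldFail]) {
  console.log(text, literal.test(text), constructed.test(text));
}
```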

Regex - must contain number and must not contain special character

I want to check by regex if:
String contains number
String does not contain special characters (!<>?=+#{}_$%)
Now it looks like:
^[^!<>?=+#{}_$%]+$
How should I edit this regex to check if there is number anywhere in the string (it must contain it)?
You can add [0-9]+ or \d+ to your regex, like this:
^[^!<>?=+#{}_$%]*[0-9]+[^!<>?=+#{}_$%]*$
or
^[^!<>?=+#{}_$%]*\d+[^!<>?=+#{}_$%]*$
For the difference between [0-9] and \d in other regex flavors, see here (in JavaScript the two are equivalent).
Just look ahead for the digit:
const re = /^(?=.*\d)[^!<>?=+#{}_$%]+$/;
console.log(re.test('bob'));  // false: no digit
console.log(re.test('bob1')); // true
console.log(re.test('bob#')); // false: no digit, and # is not allowed
The (?=.*\d) part is the lookahead for a single digit somewhere in the input.
You only needed to add the number check, is that right? You can do it like so:
/^(?=.*\d)[^!<>?=+#{}_$%]+$/
We do a lookahead (like peeking at the following characters without moving where we are in the string) to check to see if there is at least one number anywhere in the string. Then we do our normal check to see if none of the characters are those symbols, moving through the string as we go.
Just as a note: If you want to match newlines (a.k.a. line breaks), then you can change the dot . into [\W\w]. This matches any character whatsoever. You can do this in a number of ways, but they're all pretty much as clunky as each other, so it's up to you.
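The newline behaviour can be seen in a small sketch. The variable names and the sample input are illustrative. The third variant uses the s (dotAll) flag, which is an alternative not mentioned above and assumes an ES2018-capable engine.

```javascript
const dotVersion = /^(?=.*\d)[^!<>?=+#{}_$%]+$/;          // . stops at a line break
const anyCharVersion = /^(?=[\W\w]*\d)[^!<>?=+#{}_$%]+$/; // [\W\w] matches any character
const dotAllVersion = /^(?=.*\d)[^!<>?=+#{}_$%]+$/s;      // s flag: . also matches line breaks

// The only digit comes after a line break.
const multiline = "bob\n1";

console.log(dotVersion.test(multiline));
console.log(anyCharVersion.test(multiline));
console.log(dotAllVersion.test(multiline));
```

With the plain dot, the lookahead cannot see past the line break, so the digit on the second line is never found.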

JavaScript regular expression match amount

I'm trying to write a regular expression to match amounts. In my case, what I need is that either the amount should be a positive integer or if the decimal is used, it must be followed by one or two integers. So basically, the following are valid amounts:
34000
345.5
876.45
What I wrote was this: /[0-9]+(\.[0-9]{1,2}){0,1}/
My thinking was that by using parenthesis like so: (\.[0-9]{1,2}), I would be able to bundle the whole "decimal plus one or two integers" part. But it isn't happening. Among other problems, this regex is allowing stuff like 245. and 345.567 to slip through. :(
Help, please!
Your regular expression is good, but you need to match the beginning and end of the string. Otherwise, your regex can match only a portion of the string and still (correctly) return a match. To match the beginning of the string, use ^, for the end, use $.
Update: as Avinash has noted, you can replace {0,1} with ?. JS supports \d for digits, so the regex can be further simplified.
Finally, since you are only testing against a regex, you can use a non-capturing group ((?:...) instead of (...)), which avoids the work of storing the captured text.
original:
/[0-9]+(\.[0-9]{1,2}){0,1}/.test('345.567')
Fixed, and faster ;)
/^\d+(?:\.\d{1,2})?$/.test('345.567')
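A short sketch comparing the two patterns on the amounts from the question (the variable names are illustrative):

```javascript
const original = /[0-9]+(\.[0-9]{1,2}){0,1}/; // unanchored: a partial match is enough
const anchored = /^\d+(?:\.\d{1,2})?$/;       // must match the whole string

for (const amount of ["34000", "345.5", "876.45", "245.", "345.567"]) {
  console.log(amount, original.test(amount), anchored.test(amount));
}
```

The unanchored pattern accepts 245. and 345.567 because it is satisfied by the 245 and 345.56 prefixes; the anchored one rejects both.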

What is this "/\,$/"?

I tried to search for /\,$/ online, but couldn't find anything.
I have:
coords = coords.replace(/\,$/, "");
I'm guessing it returns an index into the coords string. What should I search for online to learn more?
/\,$/ finds the comma character (,) at the end of a string (denoted by the $) and replaces it with empty (""). You sometimes see this in regex code aiming to clean up excerpts of text.
It's a regular expression to remove a trailing comma.
That thing is a Regular Expression, also known as regex or regexp. It is a way to "match" strings using some rules. If you want to learn how to use it in JavaScript, read the Mozilla Developer Network page about RegExp.
By the way, regular expressions are also available in most languages and in some tools. It is a very useful thing to learn.
That's a regular expression that finds a comma at the end of a string. That code removes the comma.
The slashes (/ ... /) delimit a JavaScript regular expression literal, used to match a pattern within a string.
\,$ is the pattern
In this case \, translates to ,. A backslash is used to escape special characters, but here it is not necessary because a comma has no special meaning. An example where it would be necessary is removing trailing periods. If you tried to do that with /.$/, the period would have a different meaning: it is a wildcard that matches [almost] any character (all except line terminators). So to match a literal "." (period character) you have to escape the wildcard (/\.$/).
When $ is placed at the end of the pattern, it means only look at the end of the string. This means that you can't mistakenly find a comma anywhere in the middle of the string (e.g., not after help in help, me,), only at the end (trailing). It can also speed up the regular expression search. If you wanted to match characters only at the beginning of the string, you would start the pattern with a caret (^); for instance, /^,/ would find a comma at the start of a string if one existed.
It's also important to note that you're only removing one comma, whereas if you use the plus (+) after the comma, you'd be replacing one or more: /,+$/.
Without the +; trailing commas,, becomes trailing commas,
With the +; no trailing comma,, becomes no trailing comma
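The cases described above can be tried out with a small sketch; removeOne and removeAll are hypothetical helper names, and the sample strings are illustrative.

```javascript
// Remove a single trailing comma (the backslash in /\,$/ is optional).
const removeOne = (text) => text.replace(/,$/, "");

// Remove one or more trailing commas.
const removeAll = (text) => text.replace(/,+$/, "");

console.log(removeOne("trailing commas,,"));
console.log(removeAll("no trailing comma,,"));
console.log(removeOne("help, me,"));

// An unescaped dot removes whatever the last character is;
// an escaped dot removes only a literal trailing period.
console.log("abc".replace(/.$/, ""));
console.log("abc".replace(/\.$/, ""));
```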
