Regex for property paths

I am trying to match property-path syntax with a JavaScript regex. Is there a reliable way to do this? I need to match a string like the following:
someobject.somekey.somechildkey.somegrandchildkey
I don't need the path members; I just need to know whether a string contains a path. For example, given a string like this:
This is some long string that contains a property.path.syntax, and I need to test it.

Try this:
/\b(?:\S+?\.)+\S+\b/g
Demo
This is bounded by two word boundaries, which should work in most cases (a word boundary sits between a word character and a non-word character). Between them we lazily repeat one or more non-whitespace characters followed by a . (which needs to be escaped). We use \S for non-whitespace because, as #TJCrowder said, property names can contain many characters. There always has to be another run of non-whitespace characters after the last period.
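To see how this pattern behaves, here is a minimal sketch run against the sentence from the question. The helper name findPathLike is hypothetical and not part of the answer. Because \S accepts any non-whitespace character, the pattern also picks up things such as decimal numbers.

```javascript
// Pattern from the answer above; the g flag lets match() return every hit.
const loosePath = /\b(?:\S+?\.)+\S+\b/g;

// Hypothetical helper: returns all path-like substrings, or [] if there are none.
function findPathLike(text) {
  return text.match(loosePath) || [];
}

const sentence =
  "This is some long string that contains a property.path.syntax, and I need to test it.";

console.log(findPathLike(sentence));               // [ 'property.path.syntax' ]
console.log(findPathLike("Version 3.14 shipped")); // [ '3.14' ] (\S is permissive)
console.log(findPathLike("no path here"));         // []
```

If only a yes/no answer is needed, checking findPathLike(text).length > 0 is enough.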

Working within the limits you've identified in the comments:
/(?:[a-zA-Z_$]+[\w$]*)(?:\.[a-zA-Z_$]+[\w$]*)+/g
Live Copy with details (The g flag is there in case you need to do this repeatedly.)
That says:
Anything starting with a-z, A-Z, _, or $ (emphasizing again this is an incomplete list)
...followed by any number of those plus digits
Followed by one or more non-capturing groups of the same thing, but starting with a .
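As a quick illustration of the three points above, the following sketch runs the pattern on two sample strings. The variable names are made up for this example.

```javascript
// Pattern from the answer above (ASCII letters, _, $ and digits only).
const identPath = /(?:[a-zA-Z_$]+[\w$]*)(?:\.[a-zA-Z_$]+[\w$]*)+/g;

const simple = "see someobject.somekey.somechildkey for details".match(identPath);
console.log(simple); // [ 'someobject.somekey.somechildkey' ]

// The pattern is unanchored, so it finds path-like pieces inside a larger token.
const pieces = "blah one.that.1should.not blah".match(identPath);
console.log(pieces); // [ 'one.that', 'should.not' ]
```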
Or if you need it not to match one.that and should.not in:
blah one.that.1should.not blah
Then:
/(?:\s|^)((?:[a-zA-Z_$]+[\w$]*)(?:\.[a-zA-Z_$]+[\w$]*)+)(?:\s|$)/g
Live Copy
That says the same thing as the one earlier, plus:
Requires whitespace or beginning-of-input at the start ((?:\s|^)) and whitespace or end-of-input at the end ((?:\s|$)).
Uses a capture group so you can get just the property path without the optional whitespace on either side of it
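A short sketch of how the capture group might be read back, assuming an environment with String.prototype.matchAll. The helper name extractPaths is hypothetical.

```javascript
// Whitespace-delimited pattern from the answer above.
const delimitedPath =
  /(?:\s|^)((?:[a-zA-Z_$]+[\w$]*)(?:\.[a-zA-Z_$]+[\w$]*)+)(?:\s|$)/g;

// Hypothetical helper: collect capture group 1 from every match.
function extractPaths(text) {
  return Array.from(text.matchAll(delimitedPath), (m) => m[1]);
}

console.log(extractPaths("see foo.bar.baz here"));           // [ 'foo.bar.baz' ]
console.log(extractPaths("blah one.that.1should.not blah")); // []
// The trailing \s is consumed by the match, so a path that directly
// follows another one (separated by a single space) is skipped.
console.log(extractPaths("a.b c.d"));                        // [ 'a.b' ]
```

The last call shows a limitation worth knowing about: since the delimiter is part of the match, two paths separated by one space share it, and only the first is reported.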
Just to recap, the list of valid JavaScript identifier characters is very large, much larger than \w (which is [a-zA-Z0-9_]). It's not like some languages that only allow those characters. All sorts of characters that are normal to large numbers of people are allowed, such as ç, ö, ñ (and Arabic, and Japanese, and Chinese, and ...). And there are basically no limits on property names (e.g., if you express them as strings), only on property name literals. More: http://ecma-international.org/ecma-262/5.1/#sec-7.6
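For engines that support Unicode property escapes (the u flag with \p{...}), a broader pattern can be sketched as below. This is a different technique from the ASCII-only patterns above, not something the answer itself proposes, and it is still an approximation: it ignores \uXXXX escapes inside identifiers and does not exclude reserved words. The names ident and unicodePath are made up for the example.

```javascript
// Assumption: the engine supports Unicode property escapes (\p{...} with the u flag).
// One identifier: ID_Start, $ or _ first, then ID_Continue, $, ZWNJ or ZWJ.
const ident = "[\\p{ID_Start}$_][\\p{ID_Continue}$\\u200C\\u200D]*";

// Two or more identifiers joined by dots.
const unicodePath = new RegExp(`${ident}(?:\\.${ident})+`, "u");

console.log(unicodePath.test("niño.año"));     // true
console.log(unicodePath.test("façade.größe")); // true
console.log(unicodePath.test("plain"));        // false
console.log(unicodePath.test("1abc.2def"));    // false
```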

var expr = /[a-zA-Z_]([a-zA-Z0-9_]*\.[a-zA-Z_][a-zA-Z0-9_]*)+/i;
expr.test("your.test.case");
The above regexp:
doesn't match .test
doesn't match test.
doesn't match test
doesn't match 0test, because it cannot be a JavaScript identifier (you cannot start the name of a variable with a digit); note that the pattern is unanchored, so in 0test.foo it would still match test.foo
EDIT: as suggested by Paulchenkiller, and also considering that the i at the end stands for "case insensitive", you can also use the following shorter form:
var expr = /[a-z_](\w*\.[a-z_]\w*)+/i;
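The cases listed above can be checked with a small loop. This sketch compares the long and short forms on the same inputs; the sample list, including 0test.foo, is chosen for illustration.

```javascript
const longForm = /[a-zA-Z_]([a-zA-Z0-9_]*\.[a-zA-Z_][a-zA-Z0-9_]*)+/i;
const shortForm = /[a-z_](\w*\.[a-z_]\w*)+/i;

const samples = ["your.test.case", ".test", "test.", "test", "0test", "0test.foo"];

for (const s of samples) {
  // Prints the input followed by the result of each form.
  console.log(JSON.stringify(s), longForm.test(s), shortForm.test(s));
}
// Only "your.test.case" and "0test.foo" give true; in the latter the
// unanchored pattern matches the substring "test.foo".
```

Wrapping the pattern in ^ and $ would be one way to reject inputs such as 0test.foo when the whole string must be a path.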

Related

Write a regex for usernames

I want a Regex for my mongoose schema to test if a username contains only letters, numbers and underscore, dash or dot. What I got so far is
/[a-zA-Z0-9-_.]/
but somehow it lets everything pass.

Your regex is set to match a string if it contains ANY of the contained characters, but it doesn't make sure that the string is composed entirely of those characters.
For example, /[a-zA-Z0-9-_.]/.test("a&") returns true, because the string contains the letter a, regardless of the fact that it also includes &.
To make sure all characters are one of your desired characters, use a regex that matches the beginning of the string ^, then your desired characters followed by a quantifier + (a plus means one or more of the previous set, a * would mean zero or more), then end of string $. So:
const reg = /^[a-zA-Z0-9-_.]+$/
console.log(reg.test("")) // false
console.log(reg.test("I-am_valid.")) // true
console.log(reg.test("I-am_not&")) // false

Try it like this, with start (^) and end ($) anchors:
^[a-zA-Z0-9-_.]+$
See demo : https://regex101.com/r/6v0nNT/3

/^([a-zA-Z0-9]|[-_\.])*$/
This regex should work.
^ matches at the beginning of the string. $ matches at the end of the string. Together they mean the pattern is checked against the entire string.
The * allows it to match any number of characters, which is required to match the entire username. Note that * also accepts an empty string; use + instead if at least one character is required.
The parentheses are required because a | (or) is used here. The first character class is the one you already had, covering uppercase and lowercase letters and digits. The second character class covers the other characters. Outside a character class, . is a special character that matches any character and must be escaped with a backslash; inside a character class it is already literal, so the backslash here is optional.
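A small sketch of the difference between * and + here. The plus variant is an assumed alternative written for this example, not part of the answer.

```javascript
const star = /^([a-zA-Z0-9]|[-_\.])*$/; // pattern from the answer above
const plus = /^[a-zA-Z0-9_.-]+$/;        // assumed variant: at least one character

console.log(star.test(""));            // true  (empty username accepted)
console.log(plus.test(""));            // false
console.log(star.test("I-am_valid.")); // true
console.log(plus.test("I-am_valid.")); // true
console.log(star.test("I-am_not&"));   // false
console.log(plus.test("I-am_not&"));   // false

// Inside a character class, an escaped and an unescaped dot behave the same.
console.log(/[\.]/.test("a"), /[.]/.test("a")); // false false
```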

Why does the "." get not caught in the regex?

I want a regular expression that checks whether a given email address is correct. First, it checks that a specific provider is present, in this case (#test.de); this is not the problem. However, the email names that are allowed must consist only of letters or dots, so .#test.de is valid. This specific case does not get accepted, though. My regex looks like the following:
[A-Za-z\.]{1,}\b#test\.de\b
It works fine for all other cases, but if only a "." is in front of the #, it does not match.
Any pointers on what I am doing wrong?

The first word boundary \b in your pattern requires a word character immediately before the #. A dot cannot appear there, so the match fails.
You need to remove the word boundary; use
[A-Za-z.]+#test\.de\b
Note you do not need to escape a dot inside a character class, it already denotes a literal dot.
If you still want to match "whole" words after removing \b, you might use lookbehinds (if the regex engine supports them):
(?<!\w)[A-Za-z.]+#test\.de\b
or to only match after whitespace/start of string:
(?<!\S)[A-Za-z.]+#test\.de\b
Or just use a word boundary if the name starts with a letter, and a non-word boundary if it starts with a dot:
(?:\b[A-Za-z]|\B\.)[A-Za-z.]*#test\.de\b
See this demo
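The variants above can be compared side by side. This is a minimal sketch in JavaScript, assuming an engine with lookbehind support; the variable names are invented for the example and the patterns are copied as they appear above.

```javascript
const original = /[A-Za-z\.]{1,}\b#test\.de\b/;
const noBoundary = /[A-Za-z.]+#test\.de\b/;
const afterSpace = /(?<!\S)[A-Za-z.]+#test\.de\b/; // needs lookbehind support
const mixed = /(?:\b[A-Za-z]|\B\.)[A-Za-z.]*#test\.de\b/;

console.log(original.test("name#test.de"));   // true
console.log(original.test(".#test.de"));      // false: no word char before #
console.log(noBoundary.test(".#test.de"));    // true
console.log(noBoundary.test("1.#test.de"));   // true: matches the ".#test.de" part
console.log(afterSpace.test("1.#test.de"));   // false: the name must follow whitespace or the start
console.log(afterSpace.test(".#test.de"));    // true
console.log(mixed.test(".#test.de"));         // true
```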

Regex - must contain number and must not contain special character

I want to check by regex if:
String contains number
String does not contain special characters (!<>?=+#{}_$%)
Now it looks like:
^[^!<>?=+#{}_$%]+$
How should I edit this regex to check if there is number anywhere in the string (it must contain it)?

You can add [0-9]+ or \d+ to your regex, like this:
^[^!<>?=+#{}_$%]*[0-9]+[^!<>?=+#{}_$%]*$
or
^[^!<>?=+#{}_$%]*\d+[^!<>?=+#{}_$%]*$
Whether [0-9] and \d differ depends on the regex engine (in JavaScript they are equivalent); see here.

Just look ahead for the digit:
var re = /^(?=.*\d)[^!<>?=+#{}_$%]+$/;
console.log(re.test('bob'));  // false: no digit
console.log(re.test('bob1')); // true
console.log(re.test('bob#')); // false: no digit, and # is excluded
The (?=.*\d) part is the lookahead for a single digit somewhere in the input.

You only needed to add the number check, is that right? You can do it like so:
/^(?=.*\d)[^!<>?=+#{}_$%]+$/
We do a lookahead (like peeking at the following characters without moving where we are in the string) to check to see if there is at least one number anywhere in the string. Then we do our normal check to see if none of the characters are those symbols, moving through the string as we go.
Just as a note: If you want to match newlines (a.k.a. line breaks), then you can change the dot . into [\W\w]. This matches any character whatsoever. You can do this in a number of ways, but they're all pretty much as clunky as each other, so it's up to you.
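To make the newline remark concrete, the following sketch compares the lookahead form, the explicit-digit form from the earlier answer, and the [\W\w] variant. The variable names are made up for this example.

```javascript
const withLookahead = /^(?=.*\d)[^!<>?=+#{}_$%]+$/;
const explicitDigit = /^[^!<>?=+#{}_$%]*\d+[^!<>?=+#{}_$%]*$/;
const multiline = /^(?=[\W\w]*\d)[^!<>?=+#{}_$%]+$/;

for (const s of ["bob", "bob1", "a1b2", "bob1#"]) {
  // Both forms agree on single-line input: false, true, true, false.
  console.log(s, withLookahead.test(s), explicitDigit.test(s));
}

// The digit sits after a line break: . does not cross it, [\W\w] does.
console.log(withLookahead.test("bob\n1")); // false
console.log(multiline.test("bob\n1"));     // true
```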

Matching variable-term equations

I am trying to develop a regular expression to match the following equations:
(Price+10%+100+200)
(Price+20%+200)
(Price+30%)
(Price+100)
(Price-10%-100-200)
(Price-20%-200)
(Price-30%)
(Price-100)
My regex so far is...
/([(])+([P])+([r])+([i])+([c])+([e])+([+]|[-]){1}([\d])+([+]|[-])?([\d])+([%])?([)])/g
..., but it only matches the following equations:
(Price+100+10%)
(Price+100+100)
(Price+200)
(Price-100-10%)
(Price-100-100)
(Price-200)
Can someone help me understand how to make my pattern match the full set of equations provided?
Note: Parentheses and 'Price' are musts in the equations that the pattern must match.

Try this, which matches all the input strings provided in the question:
/\(Price([+-]\d+%?){1,3}\)/g
You can test it in a regex fiddle.
Things to note:
Only use parentheses where you want to group. Parentheses around single-possibility, fixed-quantity matches (e.g. ([P])) provide no value.
Use character classes (opened with [ and closed with ]) for multiple characters that can match at a position in the pattern (e.g. [+-]). Single-possibility character classes (e.g. [P]) similarly provide no value.
Yes, character classes (generally) implicitly escape regex special characters within them (e.g. ( in [(] vs. equivalent \( outside a character class), but to just escape regex special characters (i.e. to match them literally), you are better off not using a character class and just escaping them (e.g. \() – unless multiple characters should match at a position in the pattern (per the previous point to note).
The quantifier {1} is (almost) always useless: drop it.
The quantifier + means "one or more" as you probably know. However, in a series of cases where you used it (i.e. ([(])+([P])+([r])+([i])+([c])+([e])+), it would match many values that I doubt you expect (e.g. ((((((PPPrriiiicccceeeeee): basically, don't overuse it. Stop to consider whether you really want to match one or more of the character (class) or group to which + applies in the pattern.
To match a literal string without any regex special characters like Price, just use the literal string at the appropriate position in the pattern – e.g. Price in \(Price.
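The points above can be checked against the equations from the question. In this sketch the variable names and the extra test strings (a four-term equation and a string with repeated characters) are invented for illustration.

```javascript
const simplified = /\(Price([+-]\d+%?){1,3}\)/;
const original =
  /([(])+([P])+([r])+([i])+([c])+([e])+([+]|[-]){1}([\d])+([+]|[-])?([\d])+([%])?([)])/;

const equations = [
  "(Price+10%+100+200)", "(Price+20%+200)", "(Price+30%)", "(Price+100)",
  "(Price-10%-100-200)", "(Price-20%-200)", "(Price-30%)", "(Price-100)",
];

console.log(equations.every((e) => simplified.test(e))); // true
console.log(simplified.test("(Price+1+2+3+4)"));         // false: more than three terms

// Overusing + lets the original accept repeated characters.
console.log(original.test("((PPriiice+10+5)"));          // true
console.log(simplified.test("((PPriiice+10+5)"));        // false
```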

/\(Price[+-](\d)+(%)?([+-]\d+%?)?([+-]\d+%?)?\)/g
works on http://www.regexr.com/

/^[(Price]+\d+\d+([%]|[)])&/i
try at your own risk!

Regex to extract path tree from window.pathname

Say I access JavaScript's window.location.pathname and get /you/are/here.
Is it possible to construct a regular expression that incrementally matches each part of the path starting from the beginning? In other words, my_regex.exec(window.location.pathname) would return an array of matches like this:
["/you", "/you/are", "/you/are/here", index: 0, input: "/you/are/here"]

No, a regular expression alone will not do it. You should match "/[a-zA-Z0-9]+" (or something that captures the identifiers) and then create the strings by looping over the matches.
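A minimal sketch of that match-then-loop approach. The function name pathPrefixes is hypothetical, and the path is passed in as a plain string so the snippet also runs outside a browser.

```javascript
// Hypothetical helper: "/you/are/here" -> ["/you", "/you/are", "/you/are/here"].
function pathPrefixes(pathname) {
  // Each match is one segment including its leading slash.
  const segments = pathname.match(/\/[a-zA-Z0-9]+/g) || [];
  const prefixes = [];
  let current = "";
  for (const segment of segments) {
    current += segment;     // grow the path one segment at a time
    prefixes.push(current);
  }
  return prefixes;
}

console.log(pathPrefixes("/you/are/here"));
// [ '/you', '/you/are', '/you/are/here' ]
```

In a browser, the argument would be location.pathname.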

You should be able to run it like this:
location.pathname.match(/\w+/g)
That should return an array with all whole words. Of course, a path can also contain spaces and periods, as well as % in the case of URL encoding (underscores are already covered by \w). So to cover those as well:
location.pathname.match(/[\w_\s.%]+/g)
The brackets create a character class: any one of the characters between the brackets matches at that position.
Inside the class we have \w for word characters (A-Za-z0-9_), followed by an underscore (_), which is redundant because \w already includes it; any type of whitespace (\s); a period (.); and finally a percent sign (%).
After the character class we add + to say we want to find at least one, but as many as possible.
The g flag at the end makes the match global, so match returns an array with all hits.
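Here is how the two patterns differ on a sample path. The string samplePath is a hypothetical stand-in for location.pathname so that the snippet runs outside a browser.

```javascript
const samplePath = "/my%20docs/file.txt"; // hypothetical stand-in for location.pathname

const plainWords = "/you/are/here".match(/\w+/g);
console.log(plainWords);   // [ 'you', 'are', 'here' ]

// \w alone splits on % and on the period.
const splitWords = samplePath.match(/\w+/g);
console.log(splitWords);   // [ 'my', '20docs', 'file', 'txt' ]

// The wider character class keeps each segment in one piece.
const fullSegments = samplePath.match(/[\w_\s.%]+/g);
console.log(fullSegments); // [ 'my%20docs', 'file.txt' ]
```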
