I would like to ignore 'regularexpression' but also ignore space and Uppercase so if someone types 'regular expression' or 'reguLarExpression' it will still match and ignore. Can you please help.
^(?!(regularexpression)$)[a-zA-Z](?:[ ()'.\-a-zA-Z]*[a-zA-Z()])
I use this code in parsley.js:
data-parsley-pattern="^(?!(regularexpression)$)[a-zA-Z](?:[ ()'.\-a-zA-Z]*[a-zA-Z()])"
I have a set of words that I want to ignore but they don't have spaces or lower case. So, I need to cover the variations like in the example above.
Judging by your pattern, you want the whole string to only contain letters, spaces, (, ), ', . and - and should start with a letter and end with a letter or parentheses. Beside that, you are trying to negate the match if the string contains regular expression, regularexpression, RegULar ExpressioN, etc.
In parsley.js, you may use both string and regex literal patterns, i.e. data-parsley-pattern="\d+" = data-parsley-pattern="/^\d+$/". Note that string patterns are anchored by the framework automatically, while with the regex literal notation you need to add the anchors to make sure the whole string matches the regex.
As JavaScript regex does not support inline modifiers, you need to use the *regex literal notation with / as delimitiers.
The data-parsley-pattern will look like
data-parsley-pattern="/^(?!.*regular\s*expression)[a-zA-Z](?:[ ()'.a-zA-Z-]*[a-zA-Z()])?$/i"
See the regex demo. Note the /.../i: the i is the case insensitive flag here.
To add more exceptions, keep on adding (?!.*my\s*new\s*phrase), or use an alternation inside a single lookahead, (?!.*(?:regular\s*expression|my\s*new\s*phrase)). Also, use word boundaries if you need to match these phrases as whole words, e.g. (?!.*\b(?:regular\s*expression|my\s*new\s*phrase)\b).
Pattern details
^ - start of string
(?!.*regular\s*expression) - no match if there is regular + 0 or more whitespaces and then expression after any 0+ chars other than line break chars as many as possible
[a-zA-Z] - an ASCII letter
(?:[ ()'.a-zA-Z-]*[a-zA-Z()])? - an optional sequence of
[ ()'.a-zA-Z-]* - 0+ ASCII letters, space, (, ), ', . or -
[a-zA-Z()] - an ASCII letter or ( or )
$ - end of string.
JS demo:
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script src="http://cdn.jsdelivr.net/parsleyjs/2.0.0-rc5/parsley.js"></script>
<form id="parsley" data-parsley-validate>
<input type="text" name="the_name" id="the_id" data-parsley-pattern="/^(?!.*regular\s*expression)[a-zA-Z](?:[ ()'.a-zA-Z-]*[a-zA-Z()])?$/i" required>
<input type="submit" />
</form>
Unfortunately I do not know how to combine these together but someone will know but here are the Regex Patterns I would use.
Whitespace and Character Casing
For the word Regular Expression.
^[A-Za-z.\s_-]+$
(?:regularexpression|regular expression)
It's a bit difficult to guess, maybe this expression might be closer to what you have in mind:
(?i)^(?!(regular\s*expression)$)[a-z](?:[ ()'.a-z-]*[a-z()])$
If you wish to explore/simplify/modify the expression, it's been
explained on the top right panel of
regex101.com. If you'd like, you
can also watch in this
link, how it would match
against some sample inputs.
Related
Simple regex question. I have a string on the following format:
this is a [sample] string with [some] special words. [another one]
What is the regular expression to extract the words within the square brackets, ie.
sample
some
another one
Note: In my use case, brackets cannot be nested.
You can use the following regex globally:
\[(.*?)\]
Explanation:
\[ : [ is a meta char and needs to be escaped if you want to match it literally.
(.*?) : match everything in a non-greedy way and capture it.
\] : ] is a meta char and needs to be escaped if you want to match it literally.
(?<=\[).+?(?=\])
Will capture content without brackets
(?<=\[) - positive lookbehind for [
.*? - non greedy match for the content
(?=\]) - positive lookahead for ]
EDIT: for nested brackets the below regex should work:
(\[(?:\[??[^\[]*?\]))
This should work out ok:
\[([^]]+)\]
Can brackets be nested?
If not: \[([^]]+)\] matches one item, including square brackets. Backreference \1 will contain the item to be match. If your regex flavor supports lookaround, use
(?<=\[)[^]]+(?=\])
This will only match the item inside brackets.
To match a substring between the first [ and last ], you may use
\[.*\] # Including open/close brackets
\[(.*)\] # Excluding open/close brackets (using a capturing group)
(?<=\[).*(?=\]) # Excluding open/close brackets (using lookarounds)
See a regex demo and a regex demo #2.
Use the following expressions to match strings between the closest square brackets:
Including the brackets:
\[[^][]*] - PCRE, Python re/regex, .NET, Golang, POSIX (grep, sed, bash)
\[[^\][]*] - ECMAScript (JavaScript, C++ std::regex, VBA RegExp)
\[[^\]\[]*] - Java, ICU regex
\[[^\]\[]*\] - Onigmo (Ruby, requires escaping of brackets everywhere)
Excluding the brackets:
(?<=\[)[^][]*(?=]) - PCRE, Python re/regex, .NET (C#, etc.), JGSoft Software
\[([^][]*)] - Bash, Golang - capture the contents between the square brackets with a pair of unescaped parentheses, also see below
\[([^\][]*)] - JavaScript, C++ std::regex, VBA RegExp
(?<=\[)[^\]\[]*(?=]) - Java regex, ICU (R stringr)
(?<=\[)[^\]\[]*(?=\]) - Onigmo (Ruby, requires escaping of brackets everywhere)
NOTE: * matches 0 or more characters, use + to match 1 or more to avoid empty string matches in the resulting list/array.
Whenever both lookaround support is available, the above solutions rely on them to exclude the leading/trailing open/close bracket. Otherwise, rely on capturing groups (links to most common solutions in some languages have been provided).
If you need to match nested parentheses, you may see the solutions in the Regular expression to match balanced parentheses thread and replace the round brackets with the square ones to get the necessary functionality. You should use capturing groups to access the contents with open/close bracket excluded:
\[((?:[^][]++|(?R))*)] - PHP PCRE
\[((?>[^][]+|(?<o>)\[|(?<-o>]))*)] - .NET demo
\[(?:[^\]\[]++|(\g<0>))*\] - Onigmo (Ruby) demo
If you do not want to include the brackets in the match, here's the regex: (?<=\[).*?(?=\])
Let's break it down
The . matches any character except for line terminators. The ?= is a positive lookahead. A positive lookahead finds a string when a certain string comes after it. The ?<= is a positive lookbehind. A positive lookbehind finds a string when a certain string precedes it. To quote this,
Look ahead positive (?=)
Find expression A where expression B follows:
A(?=B)
Look behind positive (?<=)
Find expression A where expression B
precedes:
(?<=B)A
The Alternative
If your regex engine does not support lookaheads and lookbehinds, then you can use the regex \[(.*?)\] to capture the innards of the brackets in a group and then you can manipulate the group as necessary.
How does this regex work?
The parentheses capture the characters in a group. The .*? gets all of the characters between the brackets (except for line terminators, unless you have the s flag enabled) in a way that is not greedy.
Just in case, you might have had unbalanced brackets, you can likely design some expression with recursion similar to,
\[(([^\]\[]+)|(?R))*+\]
which of course, it would relate to the language or RegEx engine that you might be using.
RegEx Demo 1
Other than that,
\[([^\]\[\r\n]*)\]
RegEx Demo 2
or,
(?<=\[)[^\]\[\r\n]*(?=\])
RegEx Demo 3
are good options to explore.
If you wish to simplify/modify/explore the expression, it's been explained on the top right panel of regex101.com. If you'd like, you can also watch in this link, how it would match against some sample inputs.
RegEx Circuit
jex.im visualizes regular expressions:
Test
const regex = /\[([^\]\[\r\n]*)\]/gm;
const str = `This is a [sample] string with [some] special words. [another one]
This is a [sample string with [some special words. [another one
This is a [sample[sample]] string with [[some][some]] special words. [[another one]]`;
let m;
while ((m = regex.exec(str)) !== null) {
// This is necessary to avoid infinite loops with zero-width matches
if (m.index === regex.lastIndex) {
regex.lastIndex++;
}
// The result can be accessed through the `m`-variable.
m.forEach((match, groupIndex) => {
console.log(`Found match, group ${groupIndex}: ${match}`);
});
}
Source
Regular expression to match balanced parentheses
(?<=\[).*?(?=\]) works good as per explanation given above. Here's a Python example:
import re
str = "Pagination.go('formPagination_bottom',2,'Page',true,'1',null,'2013')"
re.search('(?<=\[).*?(?=\])', str).group()
"'formPagination_bottom',2,'Page',true,'1',null,'2013'"
The #Tim Pietzcker's answer here
(?<=\[)[^]]+(?=\])
is almost the one I've been looking for. But there is one issue that some legacy browsers can fail on positive lookbehind.
So I had to made my day by myself :). I manged to write this:
/([^[]+(?=]))/g
Maybe it will help someone.
console.log("this is a [sample] string with [some] special words. [another one]".match(/([^[]+(?=]))/g));
if you want fillter only small alphabet letter between square bracket a-z
(\[[a-z]*\])
if you want small and caps letter a-zA-Z
(\[[a-zA-Z]*\])
if you want small caps and number letter a-zA-Z0-9
(\[[a-zA-Z0-9]*\])
if you want everything between square bracket
if you want text , number and symbols
(\[.*\])
This code will extract the content between square brackets and parentheses
(?:(?<=\().+?(?=\))|(?<=\[).+?(?=\]))
(?: non capturing group
(?<=\().+?(?=\)) positive lookbehind and lookahead to extract the text between parentheses
| or
(?<=\[).+?(?=\]) positive lookbehind and lookahead to extract the text between square brackets
In R, try:
x <- 'foo[bar]baz'
str_replace(x, ".*?\\[(.*?)\\].*", "\\1")
[1] "bar"
([[][a-z \s]+[]])
Above should work given the following explaination
characters within square brackets[] defines characte class which means pattern should match atleast one charcater mentioned within square brackets
\s specifies a space
+ means atleast one of the character mentioned previously to +.
I needed including newlines and including the brackets
\[[\s\S]+\]
If someone wants to match and select a string containing one or more dots inside square brackets like "[fu.bar]" use the following:
(?<=\[)(\w+\.\w+.*?)(?=\])
Regex Tester
I have done something like this
but its not working
can anyone please correct following regex.
/^[a-zA-Z.\s]+$/
You can use
/^[a-zA-Z]*$/
Change the * to + if you don't want to allow empty matches.
References:
Character classes ([...]), Anchors (^ and $), Repetition (+, *)
The / are just delimiters, it denotes the start and the end of the regex. One use of this is now you can use modifiers on it.
If you want to get only alphabets, remove . from regex. This will match all the alphabets and spaces.
/^[a-zA-Z\s]+$/
I'll also recommend you to use instead of \s
/^[a-zA-Z ]+$/
so that, other space characters(tabs, etc.) will not matched.
Thanks for taking a look.
My goal is to come up with a regexp that will match input that contains no digits, whitespace or the symbols !#£$%^&*()+= or any other symbol I may choose.
I am however struggling to grasp precisely how regular expressions work.
I started out with the simple pattern /\D/, which from my understanding will match the first non-digit character it can find. This would match the string 'James' which is correct but also 'James1' which I don't want.
So, my understanding is that if I want to ensure that a pattern is not found anywhere in a given string, I use the ^ and $ characters, as in /^\D$/. Now because this will only match a single character that is not a digit, I needed to use + to specify that 1 or more digits should not be founds in the entire string, giving me the expression /^\D+$/. Brilliant, it no longer matches 'James1'.
Question 1
Is my reasoning up to this point correct?
The next requirement was to ensure no whitespace is in the given string. \s will match a single whitespace and [^\s] will match the first non-whitespace character. So, from my understanding I just had to add this to what I have already to match strings that contain no digits and no whitespace. Again, because [^\s] will only match a single non-white space character, I used + to match one or more whitespace characters, giving the new regexp of /^\D+[^\s]+$/.
This is where I got lost, as the expression now matches 'James1' or even 'James Smith25'. What? Massively confused at this point.
Question 2
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Question 3
How would I go about writing the regular expression I'm trying to solve?
While I am keen to solve the problem I am more interested in figuring where my understanding of regular expressions is lacking, so any explanations would be helpful.
Not quite; ^ and $ are actually "anchors" - they mean "start" and "end", it's actually a little more complicated, but you can consider them to mean the start and end of a line for now - look up the various modifiers on regular expressions if you're interested in learning more about this. Unfortunately ^ has an overloaded meaning; if used inside square brackets it means "not", which is the meaning you are already acquainted with. It's very important that you understand the difference between these two meanings and that the definition in your head actually applies only to character range matching!
Contributing further to your confusion is that \d means "a numerical digit" and \D means "not a numerical digit". Similarly \s means "a whitespace (space/tab/newline/etc.) character" and \S means "not a whitespace character."
It's worth noting that \d is effectively a shortcut for [0-9] (note that - has a special meaning inside square brackets), and \D is a shortcut for [^0-9].
The reason it's matching strings that contain spaces is that you've asked for "1+ non-numerical digits followed by 1+ non-space characters" - so it'll match lots of strings! I think that perhaps you don't understand that regular expressions match bits of strings, you're not adding constraints as you go, but rather building up bots of matchers that will match bits of corresponding strings.
/^[^\d\s!#£$%^&*()+=]+$/ is the answer you're looking for - I'd look at it like this:
i. [] - match a range of characters
ii. []+ - match one or more of that range of characters
iii. [^\d\s]+ - match one or more characters that do not match \d (numerical digit) or \s (whitespace)
iv. [^\d\s!#£$%^&*()+=]+ - here's a bunch of other characters I don't want you to match
v. ^[^\d\s!#£$%^&*()+=]+$ - now there are anchors applied, so this matcher has to apply to the whole line otherwise it fails to match
A useful website to explore regexs is http://regexr.com/3b9h7 - which I supply with my suggested solution as an example. Edit: Pruthvi Raj's link to debuggerx is awesome!
Is my reasoning up to this point correct?
Almost. /\D/ matches any character other than a digit, but not just the first one (if you use g option).
and [^\s] will match the first non-whitespace character
Almost, [^\s] will match any non-whitespace character, not just the first one (if you use g option).
/^\D+[^\s]+$/ matching strings that contain spaces?
Yes, it does, because \D matches a space (space is not a digit).
Why is /^\D+[^\s]+$/ matching strings that contain spaces?
Because \D+ in /^\D+[^\s]+$/can match spaces.
Conclusion:
Use
^[^\d\s!#£$%^&*()+=]+$
It will match strings that have no digits and spaces, and the symbols you do not allow.
Mind that to match a literal -, ] or [ with a character class, you either need to escape them, or use at the start or end of the expression. To play it safe, escape them.
Just insert every character you don't want to include in a negated character class as follows:
^[^\s\d!#£$%^&*()+=]*$
DEMO
Debuggex Demo
^ - start of the string
[^...] - matches one character that is not in `...`
\s - matches a whitespace (space, newline,tab)
\d - matches a digit from 0 to 9
* - a quantifier that repeats immediately preceeding element by 0 or more times
so the regex matches any string that has
1. string that has a beginning
2. containing 0 or more number of characters that is not whitesapce, digit, and all the symbols included in the character class ( In this example !#£$%^&*()+=) i.e., characters that are not included in the character class `[...]`
3.that has ending
NOTE:
If the symbols you don't want it to have also includes - , a hyphen, don't put it in between some other characters because it is a metacharacter in character class, put it at last of character class
I am going through some legacy code and I came across this regular express:
var REGEX_STRING_REGEXP = /^\/(.+)\/([a-z]*)$/;
I am slightly confused as to what this regular expression signifies.
I have so far concluded the following:
Begin with /
Then any character (numeric, alphabetic, symbols, spaces)
then a forward slash
End with alphabetic characters
Can someone advice?
You can use a tool like Regexper to visualise your regular expressions. If we pass your regular expression into Regexper, we'll be given the following visualisation:
Direct link to Regexper result.
regex: /^/(.+)/([a-z]*)$/
^ : anchor regex to start of line
(.+) : 1 or more instances of word characters, non-word characters, or digits
([a-z]*) : 0 or more instances of any single lowercase character a-z
$ : anchor regex to end of line
In summary, your regular expression is looking to match strings where it is the first forwardslash, then 1 or more instances of word characters, non-word characters, or digits followed, then another forwardslash, then 0 or more instances of any single lowercase character a-z. Lastly, since both (.+) and ([a-z]*) are surrounded in parenthesis, they will capture whatever matches when you use them to perform regular expression operations.
I would suggest going to rubular, placing the regex ^/(.+)/([a-z]*)$ in the top field and playing with example strings in the test string box to better understand what strings will fit within that regex. (/string/something for example will work with your regular expression).
I'm trying to write a regular that will check for numbers, spaces, parentheses, + and -
this is what I have so far:
/\d|\s|\-|\)|\(|\+/g
but im getting this error: unmatched ) in regular expression
any suggestions will help.
Thanks
Use a character class:
/[\d\s()+-]/g
This matches a single character if it's a digit \d, whitespace \s, literal (, literal ), literal + or literal -. Putting - last in a character class is an easy way to make it a literal -; otherwise it may become a range definition metacharacter (e.g. [A-Z]).
Generally speaking, instead of matching one character at a time as alternates (e.g. a|e|i|o|u), it's much more readable to use a character class instead (e.g. [aeiou]). It's more concise, more readable, and it naturally groups the characters together, so you can do e.g. [aeiou]+ to match a sequence of vowels.
References
regular-expressions.info/Character Class
Caveat
Beginners sometimes mistake character class to match [a|e|i|o|u], or worse, [this|that]. This is wrong. A character class by itself matches one and exactly one character from the input.
Related questions
Regex: why doesn’t [01-12] range work as expected?
Here is an awesome Online Regular Expression Editor / Tester! Here is your [\d\s()+-] there.
/^[\d\s\(\)\-]+$/
This expression matches only digits, parentheses, white spaces, and minus signs.
example:
888-111-2222
888 111 2222
8881112222
(888)111-2222
...
You need to escape your parenthesis, because parenthesis are used as special syntax in regular expressions:
instead of '(':
\(
instead of ')':
\)
Also, this won't work with '+' for the same reason:
\+
Edit: you may want to use a character class instead of the 'or' notation with '|' because it is more readable:
[\s\d()+-]
Try this:
[\d\s-+()]