I want to replace mm units with cm units in my code. Since there are a large number of such replacements, I am using a regexp.
I came up with this expression:
(?!a-zA-Z)mm(?!a-zA-Z)
But it still matches words like summa, gamma and dummy.
How do I write the regexp correctly?
Use character classes and change the first (?!...) lookahead into a lookbehind:
(?<![a-zA-Z])mm(?![a-zA-Z])
^^^^^^^^^^^^^ ^^^^^^^^^^^
See the regex demo
The pattern matches:
(?<![a-zA-Z]) - a negative lookbehind that fails the match if there is an ASCII letter immediately to the left of the current location
mm - a literal substring
(?![a-zA-Z]) - a negative lookahead that fails the match if there is an ASCII letter immediately to the right of the current location
NOTE: If you need to make your pattern Unicode-aware, replace [a-zA-Z] with [^\W\d_] (and use re.U flag if you are using Python 2.x).
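As a minimal sketch of how the lookaround pattern could be applied, assuming the replacement is done in JavaScript on an engine with lookbehind support (ES2018 or later). The helper name `replaceMmWithCm` and the sample string are made up for illustration, and the helper only swaps the unit text, it does not rescale the numbers:

```javascript
// Hypothetical helper: swaps the unit text only; it does not rescale the number.
function replaceMmWithCm(source) {
  // "mm" with no ASCII letter directly before or after it
  const unitPattern = /(?<![a-zA-Z])mm(?![a-zA-Z])/g;
  return source.replace(unitPattern, "cm");
}

const sample = "width: 5mm; margin: 10 mm; // summa gamma dummy";
console.log(replaceMmWithCm(sample));
// width: 5cm; margin: 10 cm; // summa gamma dummy
```

The words summa, gamma and dummy are left alone because each mm in them has a letter on at least one side.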
There's no need to use lookaheads and lookbehinds, so if you wish to simplify your pattern you can try something like this:
\d+\s?(mm)\b
This does assume that your millimetre symbol will always follow a number, with an optional space in between, which I think is a reasonable assumption in this case.
The \b checks for a word boundary to make sure the mm is not part of a word such as dummy etc.
Demo here
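For a replacement, the number has to survive, so here is a rough JavaScript sketch of a variant that captures the digits (and the optional space) instead of the unit. The name `swapUnit` and the test string are assumptions for illustration only:

```javascript
// Variant of \d+\s?(mm)\b that captures the number (plus the optional space)
// instead of the unit, so the replacement can put it back with "$1".
const numberThenMm = /(\d+\s?)mm\b/g;

function swapUnit(text) {
  return text.replace(numberThenMm, "$1cm");
}

console.log(swapUnit("5mm, 10 mm, dummy, 3mmx"));
// 5cm, 10 cm, dummy, 3mmx
```

The last item stays untouched because \b does not match between mm and a following letter.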
Related
I would like to ignore 'regularexpression', but also ignore spaces and uppercase, so that if someone types 'regular expression' or 'reguLarExpression' it will still be matched and ignored. Can you please help?
^(?!(regularexpression)$)[a-zA-Z](?:[ ()'.\-a-zA-Z]*[a-zA-Z()])
I use this code in parsley.js:
data-parsley-pattern="^(?!(regularexpression)$)[a-zA-Z](?:[ ()'.\-a-zA-Z]*[a-zA-Z()])"
I have a set of words that I want to ignore but they don't have spaces or lower case. So, I need to cover the variations like in the example above.
Judging by your pattern, you want the whole string to only contain letters, spaces, (, ), ', . and -, to start with a letter, and to end with a letter or a parenthesis. Besides that, you are trying to negate the match if the string contains regular expression, regularexpression, RegULar ExpressioN, etc.
In parsley.js, you may use both string and regex literal patterns, i.e. data-parsley-pattern="\d+" = data-parsley-pattern="/^\d+$/". Note that string patterns are anchored by the framework automatically, while with the regex literal notation you need to add the anchors to make sure the whole string matches the regex.
As JavaScript regex does not support inline modifiers, you need to use the regex literal notation with / as delimiters.
The data-parsley-pattern will look like
data-parsley-pattern="/^(?!.*regular\s*expression)[a-zA-Z](?:[ ()'.a-zA-Z-]*[a-zA-Z()])?$/i"
See the regex demo. Note the /.../i: the i is the case insensitive flag here.
To add more exceptions, keep on adding (?!.*my\s*new\s*phrase), or use an alternation inside a single lookahead, (?!.*(?:regular\s*expression|my\s*new\s*phrase)). Also, use word boundaries if you need to match these phrases as whole words, e.g. (?!.*\b(?:regular\s*expression|my\s*new\s*phrase)\b).
Pattern details
^ - start of string
(?!.*regular\s*expression) - no match if there is regular + 0 or more whitespaces and then expression after any 0+ chars other than line break chars as many as possible
[a-zA-Z] - an ASCII letter
(?:[ ()'.a-zA-Z-]*[a-zA-Z()])? - an optional sequence of
[ ()'.a-zA-Z-]* - 0+ ASCII letters, space, (, ), ', . or -
[a-zA-Z()] - an ASCII letter or ( or )
$ - end of string.
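If the list of phrases grows, the alternation inside the lookahead could be generated rather than typed by hand. The following is only a sketch in plain JavaScript (outside parsley.js); the list `bannedPhrases`, the helper `phraseToPattern` and the sample inputs are hypothetical:

```javascript
// Hypothetical list of phrases to reject; only the first comes from the question.
const bannedPhrases = ["regular expression", "my new phrase"];

// Turn "regular expression" into "regular\s*expression" so that any amount
// of whitespace (including none) between the words is covered.
function phraseToPattern(phrase) {
  return phrase
    .trim()
    .split(/\s+/)
    .map((word) => word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")) // escape metacharacters
    .join("\\s*");
}

const banned = bannedPhrases.map(phraseToPattern).join("|");
const namePattern = new RegExp(
  "^(?!.*(?:" + banned + "))[a-zA-Z](?:[ ()'.a-zA-Z-]*[a-zA-Z()])?$",
  "i" // case insensitive, like /.../i in the literal notation
);

for (const input of ["John Smith", "reguLarExpression", "regular   expression", "My New Phrase"]) {
  console.log(JSON.stringify(input), namePattern.test(input));
}
// "John Smith" true
// "reguLarExpression" false
// "regular   expression" false
// "My New Phrase" false
```

The resulting `namePattern.source` wrapped in /.../i should be usable as the data-parsley-pattern value, though that part is not tested here.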
JS demo:
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script src="http://cdn.jsdelivr.net/parsleyjs/2.0.0-rc5/parsley.js"></script>
<form id="parsley" data-parsley-validate>
<input type="text" name="the_name" id="the_id" data-parsley-pattern="/^(?!.*regular\s*expression)[a-zA-Z](?:[ ()'.a-zA-Z-]*[a-zA-Z()])?$/i" required>
<input type="submit" />
</form>
Unfortunately, I do not know how to combine these, though someone else may; here are the regex patterns I would use.
Whitespace and Character Casing
For the word Regular Expression.
^[A-Za-z.\s_-]+$
(?:regularexpression|regular expression)
It's a bit difficult to guess, maybe this expression might be closer to what you have in mind:
(?i)^(?!(regular\s*expression)$)[a-z](?:[ ()'.a-z-]*[a-z()])$
If you wish to explore/simplify/modify the expression, it's been explained on the top right panel of regex101.com. If you'd like, you can also watch in this link how it would match against some sample inputs.
I'm attempting to match the first 3 letters that could be a-z followed by a specific character.
For testing I'm using a regex online tester.
I thought this should work (without success):
^[a-z]{0,3}$[z]
My test string is abcz.
Hope you can tell me what I'm doing wrong.
If you need to match a whole string abcz, use
/^[a-z]{0,3}z$/
^^
or - if the 3 letters are compulsory:
/^[a-z]{3}z$/
See the regex demo.
The $[z] in your pattern attempts to match a z after the end of string anchor, which makes the regex always fail.
Details:
^ - string start
[a-z]{0,3} - 0 to 3 lowercase ASCII letters (to require 3 letters, remove 0,)
z - a z
$ - end of string anchor.
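A quick way to see the difference is to run the three patterns side by side. This is just an illustrative JavaScript check; the variable names and the extra test strings z and abcdz are mine:

```javascript
const broken = /^[a-z]{0,3}$[z]/;    // "$" before "z": nothing can follow the end of the string
const upToThree = /^[a-z]{0,3}z$/;   // 0 to 3 lowercase letters, then z
const exactlyThree = /^[a-z]{3}z$/;  // exactly 3 lowercase letters, then z

for (const s of ["abcz", "z", "abcdz"]) {
  console.log(s, broken.test(s), upToThree.test(s), exactlyThree.test(s));
}
// abcz false true true
// z false true false
// abcdz false false false
```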
You've got the end of line identifier too early:
/^[a-z]{0,3}[z]$/m
You can see a working version here
You can do away with the [] around z. Square brackets are used to define a range or list of characters to match - as you're matching only one they're not needed here.
/^[a-z]{0,3}z$/m
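The m flag used above only matters when the input has several lines. A small JavaScript sketch, with a made-up two-line input, of what it changes and of [z] being equivalent to z:

```javascript
const text = "abcz\n1234"; // hypothetical two-line input

const wholeString = /^[a-z]{0,3}z$/; // ^ and $ anchor to the whole input
const perLine = /^[a-z]{0,3}z$/m;    // with m, ^ and $ anchor to each line

console.log(wholeString.test(text)); // false
console.log(perLine.test(text));     // true

// The brackets around a single character make no difference
console.log(/^[a-z]{0,3}[z]$/m.test("abcz") === perLine.test("abcz")); // true
```

For a single-line test string such as abcz, the pattern behaves the same with or without m.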
I want to match all words that start with a dollar sign, but not those that start with a backslash and a dollar sign.
I have already tried a few regexes:
(?:(?!\\)\$\w+)
\\(\\?\$\w+)\b
String
$10<i class="">$i01d</i>\$id
Expected result
*$10*
*$i01d*
but not this
*$id*
After finding all the expected matches, I want to replace them using my object.
One option is to eliminate escape sequences first, and then match the cleaned-up string:
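// String.raw keeps the backslash in "\$id" as a literal character
const s = String.raw`$10<i class="">$i01d</i>\$id`;
// Drop every backslash-escaped character, then collect the remaining $words
const found = s.replace(/\\./g, '').match(/\$\w+/g);
console.log(found); // ["$10", "$i01d"]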
The big problem here is that you need a negative lookbehind; however, JavaScript did not support it before ES2018. It's possible to emulate it crudely, but I will offer an alternative which, while not great, will work in engines without lookbehind:
var input = '$10<i class="">$i01d</i>\\$id';
var regex = /\b\w+\b\$(?!\\)/g;
//sample implementation of a string reversal function. There are better implementations out there
function reverseString(string) {
  return string.split("").reverse().join("");
}
var reverseInput = reverseString(input);
var matches = reverseInput
  .match(regex)
  .map(reverseString);
console.log(matches);
It is not elegant but it will do the job. Here is how it works:
JavaScript does support a lookahead expression ((?=)) and a negative lookahead ((?!)). Since this is the reverse of a negative lookbehind, you can reverse the string and reverse the regex, which will match exactly what you want. Since all the matches are going to be in reverse, you also need to reverse them back to the original.
It is not elegant, as I said, since it does a lot of string manipulations but it does produce exactly what you want.
See this in action on Regex101
Regex explanation
Normally, "match x as long as it's not preceded by y" is expressed as (?<!y)x, so in your case, the regex will be
/(?<!\\)\$\b\w+\b/g
demonstration (not JavaScript)
where
(?<!\\) //do not match a preceding "\"
\$ //match literal "$"
\b //word boundary
\w+ //one or more word characters
\b //second word boundary, hence making the match a word
When the input is reversed, all the tokens must be reversed as well in order to match. Furthermore, the negative lookbehind gets inverted into a negative lookahead of the form x(?!y), so the new regular expression is
/\b\w+\b\$(?!\\)/g;
This is more difficult than it appears at first blush. How like Regular Expressions!
If you have look-behind available, you can try:
/(?<!\\)\$\w+/g
This is NOT available in JS engines that predate ES2018. Alternatively, you could specify a boundary that you know exists and use a capture group like:
/\s(\$\w+)/g
Unfortunately, you cannot rely on word boundaries via \b because there's no such boundary before '\'.
Also, this is a cool site for testing your regex expressions. And this explains the word boundary anchor.
If you're using a language that supports negative lookbehind assertions you can use something like this.
(?<!\\)\$\w+
I think this is the cleanest approach, but unfortunately it's not supported by all languages.
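For what it's worth, here is a sketch of the lookbehind version in JavaScript, assuming an engine that implements ES2018 lookbehind (current Node.js and browsers do). The lookup object `values` is hypothetical, since the question mentions replacing from an object but never shows it:

```javascript
// Assumes lookbehind support (ES2018 or later).
const input = String.raw`$10<i class="">$i01d</i>\$id`;
const unescapedDollarWord = /(?<!\\)\$\w+/g;

const matches = input.match(unescapedDollarWord);
console.log(matches); // ["$10", "$i01d"]

// Hypothetical lookup object; tokens without an entry are left as they are.
const values = { "$10": "ten", "$i01d": "ID" };
const replaced = input.replace(unescapedDollarWord, (token) => values[token] ?? token);
console.log(replaced); // ten<i class="">ID</i>\$id
```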
This is a hackier implementation that may work as well.
(?:(^\$\w+)|[^\\](\$\w+))
This matches either
A literal $ at the beginning of a line followed by multiple word characters. Or...
A literal $ that is preceded by any character except a backslash.
Here is a working example.
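Since the wanted text lands in group 1 or group 2 depending on which branch matched, the groups have to be read back explicitly. A rough JavaScript sketch, assuming `String.prototype.matchAll` is available (the variable names are illustrative):

```javascript
const input = String.raw`$10<i class="">$i01d</i>\$id`;
const pattern = /(?:(^\$\w+)|[^\\](\$\w+))/g;

// Group 1 is filled when the match is at the very start, group 2 otherwise.
const tokens = Array.from(input.matchAll(pattern), (m) => m[1] ?? m[2]);
console.log(tokens); // ["$10", "$i01d"]
```

One limitation of this approach: the second branch consumes the character before the $, so in back-to-back tokens such as $a$b the second one is not found.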
I am using javascript and trying to match mathematical functions such as f(2) or g(1200) but not match things like sqrt(2)
Right now I have
/[a-z]\([\d]+\)/
and that highlights these on regexr
g(2)
f(2)
f(x)
f(2) g(3)
sqrt(2)
everything is perfect except I don't want that part of sqrt to be selected. I want to lookbehind for letters but javascript doesn't support that, any idea on how I can work around this? Thanks :)
Just use a word boundary:
\b[a-z]\(\d+\)
See the regex demo
Since the JS regex engine did not support lookbehind before ES2018, a logical way out is to use a word boundary assertion here: if the letter matched with [a-z] is not preceded by a word character (one from the [a-zA-Z0-9_] range), it is OK to match it. That is where \b comes in very handy.
Also, you do not have to place a lone shorthand character class inside a character class: [\d] matches the same characters as \d.
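A short JavaScript check of the word boundary version against the strings from the question, put on one line here for convenience (the g flag and the variable names are my additions):

```javascript
const singleLetterCall = /\b[a-z]\(\d+\)/g;
const text = "g(2) f(2) f(x) sqrt(2) g(1200)";

console.log(text.match(singleLetterCall)); // ["g(2)", "f(2)", "g(1200)"]
```

The t(2) inside sqrt(2) is skipped because there is no word boundary between r and t.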
I am trying to use regexp to match some specific key words.
For the code below, I'd like to match only the IFs on the first and second lines, which have no prefix or postfix. The regexp I am using now is \b(IF|ELSE)\b, and it gives me all the IFs back.
IF A > B THEN STOP
IF B < C THEN STOP
LOL.IF
IF.LOL
IF.ELSE
Thanks for any help in advance.
And I am using http://regexr.com/ for test.
Need to work with JS.
I'm guessing this is what you're looking for, assuming you've added the m flag for multiline:
(?:^|\s)(IF|ELSE)(?:$|\s)
It's comprised of three groups:
(?:^|\s) - Matches either the beginning of the line, or a single whitespace character
(IF|ELSE) - Matches one of your keywords
(?:$|\s) - Matches either the end of the line, or a single whitespace character.
Regexr
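Here is roughly how that could be run in JavaScript with the g and m flags, reading the keyword from the capture group so the surrounding whitespace is not included. The variable names are illustrative:

```javascript
const code = [
  "IF A > B THEN STOP",
  "IF B < C THEN STOP",
  "LOL.IF",
  "IF.LOL",
  "IF.ELSE",
].join("\n");

// g: find every match, m: let ^ and $ match at line breaks
const keyword = /(?:^|\s)(IF|ELSE)(?:$|\s)/gm;

// Group 1 holds the keyword without the surrounding whitespace
const keywords = Array.from(code.matchAll(keyword), (m) => m[1]);
console.log(keywords); // ["IF", "IF"]
```

Because the whitespace is consumed as part of each match, two keywords separated by a single space (IF IF) would only yield the first one.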
You can do it with lookarounds (lookahead + lookbehind). This is what you really want, as it explicitly matches what you are searching for. You don't want to check for other characters like string start or whitespace around the match, but to match exactly "IF or ELSE not surrounded by dots".
/(?<!\.)(IF|ELSE)(?!\.)/g
Explanation:
use the g flag to find all occurrences
(?<!X)Y is a negative lookbehind which matches a Y not preceded by an X
Y(?!X) is a negative lookahead which matches a Y not followed by an X
working example: https://regex101.com/r/oS2dZ6/1
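In JavaScript this only runs on engines with lookbehind support (ES2018 or later), so treat the following as a sketch under that assumption. The second pattern with \b is my own stricter variant, not part of the answer above:

```javascript
// Assumes lookbehind support (ES2018 or later).
const code = [
  "IF A > B THEN STOP",
  "IF B < C THEN STOP",
  "LOL.IF",
  "IF.LOL",
  "IF.ELSE",
].join("\n");

const notDotted = /(?<!\.)(IF|ELSE)(?!\.)/g;
console.log(code.match(notDotted)); // ["IF", "IF"]

// Hypothetical stricter variant: also require word boundaries,
// otherwise the IF inside a longer word is picked up too.
const notDottedWord = /(?<!\.)\b(IF|ELSE)\b(?!\.)/g;
console.log("GIFT IF".match(notDotted));     // ["IF", "IF"]
console.log("GIFT IF".match(notDottedWord)); // ["IF"]
```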
PS: if you don't have to write the regex for JS, better use a tool which supports the PCRE flavor, like regex101.com