I want to match only a dollar symbol without a backslash immediately before, as demonstrated below:
$not\$yes $
^.........^
So far, I have [^\\]\$, but this doesn't match any dollar that begins a line. The dollar could be the first symbol in the document, so matching a newline would not work. How do I match this? Is the regex I have so far even right?
You could use an alternation with the ^ anchor in order to match the $ character literally if it is the first character in the string or if it follows a character that is not a backslash.
/(?:^|[^\\])\$/
Explanation:
(?: - Start of a non-capturing group that is used to group the alternation.
^|[^\\] - Alternation that matches either the start of the string (the ^ anchor) or any character other than a backslash
) - Close the non-capturing group that was used to group ^|[^\\]
\$ - The $ character literally
In other words, the ^ anchor will match the start of the string; while [^\\] will match anything but a backslash. The pipe | acts as an "or" operator that will match the start of the string or anything but a backslash (i.e., ^|[^\\]).
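So in the string you provided, the first and last $ characters would be matched. Note that when the [^\\] branch is taken, the character before the $ is part of the match as well.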
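To see what the alternation actually captures, here is a minimal JavaScript sketch run against the sample string from the question. The variable names are illustrative only and are not part of the original answer.

```javascript
// Sample string from the question: $not\$yes $
const altInput = '$not\\$yes $';

// Start of string OR a non-backslash character, then a literal $.
const altPattern = /(?:^|[^\\])\$/g;

// Collect the matched text and its position for every match.
const altMatches = [...altInput.matchAll(altPattern)].map(m => ({
  text: m[0],
  index: m.index,
}));

console.log(altMatches);
// [ { text: '$', index: 0 }, { text: ' $', index: 9 } ]
```

The second match starts at index 9 because the space before the final $ is consumed by [^\\].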
Use a negative lookbehind assertion:
(?<!\\)\$
In Action: https://regex101.com/r/dA8aA1/1
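As a rough illustration in JavaScript, assuming an engine with lookbehind support (ES2018 or later), the lookbehind version matches only the $ itself. The variable names here are made up for the example.

```javascript
// Sample string from the question: $not\$yes $
const lbInput = '$not\\$yes $';

// (?<!\\) asserts that the previous character is not a backslash
// without consuming it, so only the $ itself is matched.
const lbPattern = /(?<!\\)\$/g;

const lbIndexes = [...lbInput.matchAll(lbPattern)].map(m => m.index);
console.log(lbIndexes); // [ 0, 10 ]

// Because nothing else is consumed, a replacement touches only the $.
const lbReplaced = lbInput.replace(lbPattern, '#');
console.log(lbReplaced); // #not\$yes #
```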
Related
I need a regular expression that matches the complete string with a zero/even number of backslashes anywhere in the string. If the string contains an odd number of backslashes, it should not match the complete string.
Example:
\\ -> match
\\\ -> does not match
test\\test -> match
test\\\test -> does not match
test\\test\ -> does not match
test\\test\\ -> match
and so on...
Note: We can assume any string of any length in place of 'test' in the above example
I am using this ^[^\\]*(\\\\)*[^\\]*$ regular expression, but it does not match the backslashes after the second test.
For example:
test\\test(doesn't match anything after this)
Thanks for any help in advance.
You may use this regex:
^(?:(?:[^\\]*\\){2})*[^\\]*$
RegEx Breakdown:
^: Start
(?:: Start non-capture group #1
(?:: Start non-capture group #2
[^\\]*: Match 0 or more of any char except a \
\\: Match a \
){2}: End non-capture group #2. Repeat this group 2 times.
)*: End non-capture group #1. Repeat this group 0 or more times.
[^\\]*: Match 0 or more of any char except a \
$: End
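The breakdown above can be checked against the examples from the question with a short JavaScript sketch. Backslashes are doubled inside the string literals, and the variable names are only for illustration.

```javascript
// Pairs of "anything, then a backslash", repeated, then a backslash-free tail.
const evenBackslashes = /^(?:(?:[^\\]*\\){2})*[^\\]*$/;

// [input, expected] pairs; the comment shows the actual string contents.
const backslashCases = [
  ['\\\\', true],               // \\
  ['\\\\\\', false],            // \\\
  ['test\\\\test', true],       // test\\test
  ['test\\\\\\test', false],    // test\\\test
  ['test\\\\test\\', false],    // test\\test\
  ['test\\\\test\\\\', true],   // test\\test\\
];

const backslashResults = backslashCases.map(([input]) => evenBackslashes.test(input));
console.log(backslashResults);
// [ true, false, true, false, false, true ]
```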
The current regular expression ^[^\\]*(\\\\)*[^\\]*$ can be read as Any(\\)*Any, where Any means any run of characters other than a backslash.
The expected language is Any\\Any\\Any\\..., which can be obtained by wrapping the current regular expression in a Kleene closure (the * operator). That is, (Any(\\)*Any)*
The original regular expression after modification:
^([^\\]*(\\\\)*[^\\]*)*$
It can be further optimized as:
^((\\\\)*[^\\]*)*$
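One thing that may be worth checking, offered as an observation rather than a correction: this form accepts backslashes only in adjacent pairs. A string such as a\b\c, which has two backslashes in total but not next to each other, is rejected by it and accepted by a pattern that counts backslashes across the whole string. Whether that matters depends on how "even number of backslashes" is meant. A small JavaScript sketch with illustrative names:

```javascript
// Pattern from this answer: backslashes must come in adjacent pairs.
const pairedOnly = /^((\\\\)*[^\\]*)*$/;
// Counting pattern: an even number of backslashes anywhere in the string.
const totalCount = /^(?:(?:[^\\]*\\){2})*[^\\]*$/;

const pairSamples = [
  'test\\\\test\\\\', // test\\test\\  (4 backslashes, in pairs)
  'test\\\\\\test',   // test\\\test   (3 backslashes)
  'a\\b\\c',          // a\b\c         (2 backslashes, not adjacent)
];

const pairComparison = pairSamples.map(s => [pairedOnly.test(s), totalCount.test(s)]);
console.log(pairComparison);
// [ [ true, true ], [ false, false ], [ false, true ] ]
```

Nested quantifiers like the ones in the paired pattern can also backtrack heavily on long non-matching input, so it is worth testing with realistic string lengths.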
I am writing a function to find attributes value from given string and given attribute name.
The input strings look like those below:
sip:+19999999999#trunkgroup2:5060;user=phone
<sip:+19999999999;tgrp=0180401;trunk-context=aaaa.aaaa.ca#10.10.10.100:8000;user=phone;transport=udp>
<sip:19999999999;tgrp=0306001;trunk-context=aaaa.aaaa.ca#10.10.10.100:8000;transport=udp>
<sip:+19999999999;tgrp=SMPPDIN;trunk-context=aaaa.aaaa.ca#10.10.10.100:8000;transport=udp>
After a few hours I came up with this regular expression: /(\Wsip[:,+,=]+)(\w+)/g, but it does not work for the first example, because there is no non-word character before the attribute name.
How can I fix this expression to handle both cases, <sip... and sip..., the latter only when it is at the beginning of the string?
I use this function to extract both sip and tgrp values.
Replace \W with \b, and use
\b(sip[:+=]+)(\w+)
Or, to match at the beginning of a string:
^\W?(sip[:+=]+)(\w+)
See the first regex demo and the second regex demo.
As \W is a consuming pattern matching any non-word char (a char other than a letter/digit/_), you won't have a match at the start of the string. A \b word boundary will match both at the start of the string and when there is a non-word char before s.
If you literally need to find a match at the beginning of a string after an optional non-word char, the \W must be replaced with ^\W? where ^ matches the start of a string, and \W? matches 1 or 0 non-word chars.
Also, note that , inside a character class is matched as a literal ,. If you mean to use it to enumerate chars, you should remove it.
Pattern details:
\b - a word boundary
OR
^ - start of string
\W? - 1 or 0 (due to the ? quantifier) non-word chars (i.e. chars other than letters/digits and _)
(sip[:+=]+) - Group 1: sip substring followed with one or more :, + or = chars
(\w+) - Group 2: one or more word chars.
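As a sketch of how the \b variant could be wrapped in a function that takes the attribute name, here is a JavaScript example. The helper name getAttribute and its signature are assumptions made for illustration; the question does not show the original function.

```javascript
// Hypothetical helper: returns the value following `name` and one or more
// of the characters : + =, or null when the attribute is not present.
function getAttribute(input, name) {
  // Escape regex metacharacters so the name is matched literally.
  const escapedName = name.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const attrPattern = new RegExp('\\b(' + escapedName + '[:+=]+)(\\w+)');
  const attrMatch = attrPattern.exec(input);
  return attrMatch ? attrMatch[2] : null;
}

const sipPlain = 'sip:+19999999999#trunkgroup2:5060;user=phone';
const sipBracketed =
  '<sip:+19999999999;tgrp=0180401;trunk-context=aaaa.aaaa.ca#10.10.10.100:8000;user=phone;transport=udp>';

console.log(getAttribute(sipPlain, 'sip'));      // 19999999999
console.log(getAttribute(sipBracketed, 'sip'));  // 19999999999
console.log(getAttribute(sipBracketed, 'tgrp')); // 0180401
console.log(getAttribute(sipPlain, 'tgrp'));     // null
```

Since (\w+) stops at the first non-word character, values that contain dots or hyphens would be truncated; the value class would need widening for those.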
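To match at the beginning of the line, use ^, and to make the < optional, use ?. The commas are dropped from the character class, since inside [...] they would match a literal comma:
^<?(sip[:+=]+)(\w+)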
I'm looking for a regular expression that can match both of these lines:
foo/bar
foo/bar baz
And capture foo, bar, and baz into separate match groups.
I've tried with this regex:
^([^\/]+)\/([^\/#]+)? (\w+)$
You can use below regex
^(\w+)\/(\w+)\s*(\w+)?$
^: Start-of-line anchor
(\w+): Match one or more word characters (letters, digits, and underscore) and capture them in a group
\/: Match a forward slash
\s*: Match any number of whitespace characters
(\w+)?: Optionally match and capture one or more word characters
$: End-of-line anchor
Here's demo on RegEx101.com.
This captures the word before the / in the first group (accessible as $1), the word after the / in the second group ($2), and the optional trailing word in the third group ($3).
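A minimal JavaScript sketch of how the three groups come out for both sample lines. The function name parseLine and the property names are hypothetical, chosen only for this example.

```javascript
const pathPattern = /^(\w+)\/(\w+)\s*(\w+)?$/;

// Hypothetical helper: returns the three captures, or null when the
// line does not match. The third capture is undefined when absent.
function parseLine(line) {
  const pathMatch = pathPattern.exec(line);
  if (!pathMatch) return null;
  const [, first, second, third] = pathMatch;
  return { first, second, third };
}

console.log(parseLine('foo/bar'));
// { first: 'foo', second: 'bar', third: undefined }
console.log(parseLine('foo/bar baz'));
// { first: 'foo', second: 'bar', third: 'baz' }
```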
If the input can contain characters other than \w, i.e. [a-zA-Z0-9_], you can use the regex below:
^([^\/]+)\/(\S+)\s*(\S+)?$
[^\/]+ will match one or more characters except /. \S+ will match one or more non-space characters.
Try using this ^([^\/]+)\/([^\/#]+)\s*(\w*)$ with g and m flags.
I have to write a regex with matches following:
String should start with a letter - [a-zA-Z]
String can contain letters, spaces, numbers, _ and - (underscore and hyphen)
String should not end with _ or - (underscore and hyphen)
Underscore character should not have space before and after.
I came up with the following regex, but it doesn't seem to work:
/^[a-zA-Z0-9]+(\b_|_\b)[a-zA-Z0-9]+$/
Test case:
HelloWorld // Match
Hello_World //Match
Hello _World // doesn't match
Hello_ World // doesn't match
Hello _ World // doesn't match
Hello_World_1 // Match
He110_W0rld // Match
Hello - World // Match
Hello-World // Match
_HelloWorld // doesn't match
Hello_-_World // match
You may use
^(?!.*(?:[_-]$|_ | _))[a-zA-Z][\w -]*$
See the regex demo
Explanation:
^ - start of string
(?!.*(?:[_-]$|_ | _)) - after some chars (.*) there must not appear ((?!...)) a _ or - at the end of string ([_-]$), nor space+_ or _+space
[a-zA-Z] - the first char matched and consumed must be an ASCII letter
[\w -]* - 0+ word (\w = [a-zA-Z0-9_]) chars or space or -
$ - end of string
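The pattern can be run against the test cases listed in the question with a small JavaScript sketch. The variable names are illustrative only.

```javascript
const namePattern = /^(?!.*(?:[_-]$|_ | _))[a-zA-Z][\w -]*$/;

// [input, expected] pairs taken from the question's test cases.
const nameCases = [
  ['HelloWorld', true],
  ['Hello_World', true],
  ['Hello _World', false],
  ['Hello_ World', false],
  ['Hello _ World', false],
  ['Hello_World_1', true],
  ['He110_W0rld', true],
  ['Hello - World', true],
  ['Hello-World', true],
  ['_HelloWorld', false],
  ['Hello_-_World', true],
];

// Keep only the cases where the result differs from the expectation.
const nameFailures = nameCases.filter(
  ([input, expected]) => namePattern.test(input) !== expected
);

console.log(nameFailures.length === 0 ? 'all cases pass' : nameFailures);
// all cases pass
```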
You could use this one:
^(?!^[ _-]|.*[ _-]$|.* _|.*_ )[\w -]*$
For the test cases I used modifier gm to match each line individually.
If an empty string should not be considered acceptable, then change the final * to a +:
^(?!^[ _-]|.*[ _-]$|.* _|.*_ )[\w -]+$
Meaning of each part
^ and $ match the beginning/ending of the input
(?! ): list of things that should not match:
|: logical OR
^[ _-]: starts with any of these three characters
.*[ _-]$: ends with any of these three characters
.* _: has a space followed by an underscore anywhere
.*_ : has an underscore followed by a space anywhere (note the trailing space in the pattern)
[\w -]: any alphanumeric character or underscore (also matched by \w) or space or hyphen
*: zero or more times
+: one or more times
What about this?
^[a-zA-Z](\B_\B|[a-zA-Z0-9 -])*[a-zA-Z0-9 ]$
Broken down:
^
[a-zA-Z] allowed characters at beginning
(
\B_\B underscore with no word boundary on either side
| or
[a-zA-Z0-9 -] other allowed characters
)*
[a-zA-Z0-9 ] allowed characters at end
$
Oh! I love me some regex!
Would this work? /^[a-z]$|^[a-z](?:_(?=[^ ]))?(?:[a-z\d -][^ ]_[^ ])*[a-z\d -]*[^_-]$/i
I was a tad unsure of rule 4--do you mean underscores can have a space before or after or neither, but not before and after?
I am an amateur in JavaScript. I saw this other (now deleted) question, and it made me wonder. Can you tell me exactly what the regular expression below means?
split(/\|(?=\w=>)/)
Does it split the string with |?
The regular expression is contained in the slashes.
It means
\| # A pipe symbol. It needs to be escaped with a backslash
# because otherwise it means "OR"
(?= # a so-called lookahead group. It checks if its contents match
# at the current position without actually advancing in the string
\w=> # a word character (a-z, A-Z, 0-9, _) followed by =>
) # end of lookahead group.
It splits the string on |, but only if it's followed by a char in [a-zA-Z0-9_] and then =>
Example:
It will split a|b=> on the |
It will not split a|b on the |
It splits the string on every '|' that is followed by a word character (\w, shorthand for [a-zA-Z0-9_]) and then the character sequence '=>'.
Here's a link that can help you understand regular expressions in javascript
Breakdown of the regular expression:
/ regular expression literal start delimiter
\| match | in the string, | is a special character in regex, so \ is used to escape it
(?= starts a lookahead expression; it checks whether the pattern inside it matches at the current position without consuming any characters
\w=> matches a single word character (a letter, digit, or _), followed by =>
)/ marks the end of the lookahead expression and the end of the regex
In short, the string will be split on | if it is followed by any alphanumeric character or underscore and then =>.
In this case, the pipe character is escaped so it's treated as a literal pipe. The split occurs on pipes that are followed by any alphanumeric and '=>'.
The '|' is also used in regular expressions as a sort of OR operator. For example:
split(/k|i|tt|y/)
Would split on any of 'k', 'i', 'tt' or 'y'.
Trimming the delimiting characters, we get \|(?=\w=>)
| is a special character in regex, so it should be escaped with a backslash as \|
(?=REGEX) is the syntax for a positive lookahead: it matches only if REGEX matches, but doesn't consume the substring that matches REGEX. The match to the REGEX doesn't become part of the matched result. Had it been merely \|\w=>, the parent string would be split around |a=> instead of |.
Thus /\|(?=\w=>)/ matches only those | characters that are followed by \w=>. It matches the | in |a=>, but not the one in |a> or ||.
Consider the example string from the linked question: a=>aa|b=>b||b|c=>cc. If it weren't for the lookahead, split would yield an array of [a=>aa, b||b, cc]. With the lookahead, you'll get [a=>aa, b=>b||b, c=>cc], which is the desired output.
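The difference is easy to reproduce in JavaScript. This is a short sketch using the example string above; the variable names are made up for illustration.

```javascript
const record = 'a=>aa|b=>b||b|c=>cc';

// With the lookahead, only the | is removed at each split point.
const withLookahead = record.split(/\|(?=\w=>)/);
console.log(withLookahead);
// [ 'a=>aa', 'b=>b||b', 'c=>cc' ]

// Without it, the key and the => are consumed by the separator too.
const withoutLookahead = record.split(/\|\w=>/);
console.log(withoutLookahead);
// [ 'a=>aa', 'b||b', 'cc' ]
```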