JavaScript regex: string contains this, but not that

I'm trying to put together a regex pattern that matches a string that does contain the word "front" and does NOT contain the word "square". I can accomplish each of these individually, but am having trouble putting them together.
front=YES
^((?=front).)*$
square=NO
^((?!square).)*$
However, how do I combine these into a single regular expression?

You can use just a single negative lookahead for this:
/^(?!.*square).*front/
RegEx Details:
^: Start
(?!.*square) is a negative lookahead that fails the match if the text square is present anywhere in the input after the start position
.*front will match front anywhere in input
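As a quick check of the pattern above, here is a minimal sketch. The sample strings are made up for illustration and are not from the question.

```javascript
// Pattern from the answer above: "square" nowhere, "front" somewhere.
const frontNotSquare = /^(?!.*square).*front/;

// Illustrative inputs (assumptions, not from the original question).
const samples = ["front door", "square front", "front square", "back door"];

const results = samples.map((s) => frontNotSquare.test(s));
console.log(results); // [ true, false, false, false ]
```

Note that . does not match line terminators by default, so multi-line input may need the s flag.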

You could use lookahead assertions to express a logical AND. The final pattern would look like this:
^(?=.*?front)(?!.*square)
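Because this pattern consists only of an anchor and lookaheads, it consumes no characters. The sketch below (sample strings are assumptions) suggests using it with test() rather than reading the matched text:

```javascript
// Both conditions are lookaheads, so a successful match is zero-width.
const andPattern = /^(?=.*?front)(?!.*square)/;

const hit = "the front door".match(andPattern);
const zeroWidth = hit !== null && hit[0] === "";
console.log(zeroWidth); // true

// test() is enough when only a yes/no answer is needed.
const checks = ["the front door", "front of the square", "back door"].map(
  (s) => andPattern.test(s)
);
console.log(checks); // [ true, false, false ]
```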

Related

How to not match given prefix in RegEx without negative lookbehind?

Goal
The goal is matching a string in JavaScript without certain delimiters, i.e. a string between two characters (the characters can be included in the match).
For example, this string should be fully matched: $ test string $. This can appear anywhere in a string. That would be trivial, however, we want to allow escaping the syntax, e.g. The price is 5\$ to 10\$.
Summarized:
Match any string that is enclosed by two $ signs.
Do not match it if the dollar signs are escaped using \$.
Solution using negative lookbehind
A solution that achieves this goal perfectly is: (?<!\\)\$(.*?)(?<!\\)\$.
Problem
This solution uses negative lookbehind, which is not supported on Safari. How can the same matches be achieved without using negative lookbehind (i.e. on Safari)?
A solution that partially works is (?<!\\)\$(.*?)(?<!\\)\$. However, this will also match the character in front of the $ sign if it is not a \.
You might rule out what you don't want by matching it, and capture what you want to keep in group 1:
\\\$.*?\$|\$.*?\\\$|(\$.*?\$)
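A minimal sketch of how this alternation might be used: the first two branches consume escaped cases without capturing, so only matches where group 1 is set are kept. The helper name unescapedSpans and the sample strings are hypothetical.

```javascript
// Alternation from the answer above; only the third branch fills group 1.
const skipEscaped = /\\\$.*?\$|\$.*?\\\$|(\$.*?\$)/g;

// Hypothetical helper: keep only the matches where group 1 participated.
function unescapedSpans(text) {
  return [...text.matchAll(skipEscaped)]
    .map((m) => m[1])
    .filter((g) => g !== undefined);
}

// In JS source, "\\$" is the two characters \$ (sample input is an assumption).
const priceText = "The price is 5\\$ to 10\\$ and $ test string $ here";
console.log(unescapedSpans(priceText)); // [ '$ test string $' ]

// Caveat: a single escaped \$ before a real pair swallows the opening $.
const oddEscapes = "Costs 5\\$ and $ x $";
console.log(unescapedSpans(oddEscapes)); // []
```

The second call shows a limitation worth testing against your own data: with an odd number of escaped dollar signs ahead of a real pair, the first branch consumes the opening delimiter.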
You may use this regex and grab your inner text using capture group #1 as you are already doing in your current regex using lookbehind:
(?:^|[^\\])\$((?:\\.|[^$])*)\$
RegEx Details:
(?:^|[^\\]): Match start position or a non-backslash character in a non-capturing group
\$: Match starting $
(: Start capturing group
(?:\\.|[^$])*: Match any escaped character or a non-$ character. Repeat this group 0 or more times
): End capturing group
\$: Match closing $
PS: This regex will give the same matches as your current regex: (?<!\\)\$(.*?)(?<!\\)\$
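The breakdown above can be tried out as follows. This is a sketch with made-up sample input; the helper name innerTexts is hypothetical. Note that the full match includes the character before the opening $, so the inner text should be read from group 1.

```javascript
// Lookbehind-free pattern from the answer above.
const noLookbehind = /(?:^|[^\\])\$((?:\\.|[^$])*)\$/g;

// Hypothetical helper: collect the inner text (capture group 1) of each match.
function innerTexts(text) {
  return [...text.matchAll(noLookbehind)].map((m) => m[1]);
}

// In JS source, "\\$" is the two characters \$ (sample input is an assumption).
const sampleText = "The price is 5\\$ to 10\\$ and $ test string $ here";
console.log(innerTexts(sampleText)); // [ ' test string ' ]

// The full match carries the preceding non-backslash character (a space here).
const firstFull = [...sampleText.matchAll(noLookbehind)][0][0];
console.log(JSON.stringify(firstFull)); // " $ test string $"

// A delimiter at the very start of the string is handled by the ^ branch.
console.log(innerTexts("$ a $")); // [ ' a ' ]
```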

RegEx matching help: won't match on each appearance

I need to write a little RegEx matcher which will match any occurrence of strings in the form of
[a-zA-Z]+(_[a-zA-Z0-9]+)?
If I use the regex above, it matches the sections I need, but it also matches the abc part of 4_abc, which is not intended. I tried to exclude it with:
(?:[^a-zA-Z0-9_]|^)([a-zA-Z]+(_[a-zA-Z0-9]+)?)(?:[^a-zA-Z0-9_]|$)
The problem is that the 'not' matches at the beginning and end are not really working like I hoped they would. If I use them on the example
a_d Dd_da 4_d d_4
they block matching the second item, Dd_da, because the space was consumed by the first match. Sadly I can't use lookarounds because I am using JS.
So the input:
a_d Dd_da 4_d d_4
should match: a_d, Dd_da and d_4
but matches: a_d (there is a space at the end)
Is there another way to match the needed sections, or to not consume the 'anchor' matches?
I really appreciate your help.
You can make use of \b:
\b[a-zA-Z]+(_[a-zA-Z0-9]+)?\b
\b matches the (zero-width) point where either the preceding character or following character is a letter, digit or underscore, but not both. It also matches with the start/end of the string if the first/last character is a letter, digit or underscore.
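A short sketch of the \b version applied to the input from the question. Since \b is zero-width, the separating space is not consumed and every token gets a chance to match.

```javascript
// \b-anchored pattern from the answer above, with the g flag to get all matches.
const identifier = /\b[a-zA-Z]+(_[a-zA-Z0-9]+)?\b/g;

// Input string taken from the question.
const tokens = "a_d Dd_da 4_d d_4".match(identifier);
console.log(tokens); // [ 'a_d', 'Dd_da', 'd_4' ]
```

4_d is skipped because there is no word boundary between 4, _ and d, so the letter part can never start at a boundary.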

Regex Positive lookahead get first occurrence

I have a string like this, and I want to capture the characters between .html and the nearest slash before it:
http://example.org/some-path/some-title-in-1978.html
I want this part: some-title-in-1978. For that I came up with this regex:
/\/.+?(?=\.html)/ but the result is not what I want. It looks like this:
//example.org/some-path/some-title-in-1978
Use the following regex pattern:
[^/]+(?=\.html)
https://regex101.com/r/wep2Im/1
[^/]+ - matches one or more characters other than a forward slash, and the lookahead (?=\.html) requires that they be immediately followed by .html
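For comparison, a small sketch running both the question's pattern and the suggested one on the URL from the question. The slash inside the character class is escaped here only to keep the regex literal unambiguous.

```javascript
// URL taken from the question.
const url = "http://example.org/some-path/some-title-in-1978.html";

// Question's attempt: starts at the first slash, so it grabs too much.
const tooMuch = url.match(/\/.+?(?=\.html)/)[0];
console.log(tooMuch); // //example.org/some-path/some-title-in-1978

// Suggested pattern: a run of non-slash characters right before ".html".
const title = url.match(/[^\/]+(?=\.html)/)[0];
console.log(title); // some-title-in-1978
```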

Regular expression match array of words excluding words found in contractions

Say I have an array of words, for example: (hi|ll|this|that|etc) and I want to find it in the following text:
Hi, I'll match this and ll too
I'm using: \\b(hi|ll|this|that|etc)\\b
But I want to match only whole words, excluding words found in contractions. Basically, treat apostrophes as another "word separator". In this case, it shouldn't match the "ll" in "I'll".
Ideas?
Use the apostrophe in addition to \b to begin and end a match:
(?:\b|')(hi|ll|this|that|etc)(?:\b|')
(?:...) is a non-capturing group.
If you want to match just words, you can try:
(?:^|(?=[^']).\b)(hi|ll|th(?:is|at)|etc)\b
The words are captured in group 1. However, \b will still allow matches in fragments such as -this or #ll; I don't know whether that is the desired result.
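The three patterns discussed here can be compared on the sentence from the question. This is a sketch only: the i and g flags and the helper name group1 are assumptions added for the demonstration. In this run, only the last pattern leaves out the "ll" from "I'll".

```javascript
// Sentence from the question.
const sentence = "Hi, I'll match this and ll too";

// Hypothetical helper: list what capture group 1 holds for every match.
const group1 = (re) => [...sentence.matchAll(re)].map((m) => m[1]);

// Question's pattern: \b also sits next to the apostrophe.
const plain = group1(/\b(hi|ll|this|that|etc)\b/gi);
console.log(plain); // [ 'Hi', 'll', 'this', 'll' ]

// Apostrophe added as an alternative delimiter: it is consumed, not excluded.
const withQuote = group1(/(?:\b|')(hi|ll|this|that|etc)(?:\b|')/gi);
console.log(withQuote); // [ 'Hi', 'll', 'this', 'll' ]

// Requires string start, or a preceding character that is not an apostrophe.
const strict = group1(/(?:^|(?=[^']).\b)(hi|ll|th(?:is|at)|etc)\b/gi);
console.log(strict); // [ 'Hi', 'this', 'll' ]
```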

Regex to create a group from an entire line, or just up to a given token

I'm using a JavaScript Regex Engine.
The regex ^(.*?)\s*(?=[*\[]).* will capture a group containing all the characters up to a [ or * character. It works well with these lines, matching the entire line and capturing the first section:
This should be captured up to here[ but no further]
This should be captured up to this asterisk* but not after it*
However, I would like to also capture an entire line if it contains neither of these characters:
This entire line should be captured.
This regex ^(.*?)\s*(?=[*\[]).*|^(.*)$ will match the entire line, but it will not capture anything in group \1.
Is it possible to modify the lookahead so that it will also find no more characters?
Just add an end-of-line anchor inside the positive lookahead assertion:
^(.*?)\s*(?=[*\[]|$)
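A quick sketch of this pattern on the three lines from the question, each tested as a separate string (so $ means end of string and no m flag is needed):

```javascript
// Lookahead accepts "[" or "*" or the end of the input.
const upToToken = /^(.*?)\s*(?=[*\[]|$)/;

// Lines taken from the question.
const lines = [
  "This should be captured up to here[ but no further]",
  "This should be captured up to this asterisk* but not after it*",
  "This entire line should be captured.",
];

const captured = lines.map((line) => line.match(upToToken)[1]);
console.log(captured);
// [ 'This should be captured up to here',
//   'This should be captured up to this asterisk',
//   'This entire line should be captured.' ]
```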
You can use this regex:
/^(.*?)\s*(?=[*\[]|[^*\[]$)/
