make regex find smallest groups in bbcode - javascript

I am working in Javascript and I have the following regex:
[img]([a-z0-9\-\./]+[^"\' ]*)[/img]/g
When I have the following text (with space separating between the 2 groups):
[img]http://www.bla.com[/img] [img]http://www.bla.com[/img]
the regex finds the 2 separate groups successfully.
However when given the following text (without space separating between the 2 groups):
[img]http://www.bla.com[/img][img]http://www.bla.com[/img]
the regex does not separate it into 2 groups, but rather 1 big group with http://www.bla.com[/img][img]http://www.bla.com inside it.
What am I missing in order to make the regex find the smallest groups when they are not separated by a space?

You may use this regex:
/\[img]([-a-z0-9.\/]+[^"'\s]*?)\[\/img]/g
[ needs to be escaped so that it is not interpreted as the start of a character class, and / needs to be escaped because it delimits the regex literal.
The lazy quantifier *? matches as little as possible before [/img].
A - placed at the start or end of a character class doesn't need escaping.
dot doesn't need to be escaped in a character class
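To see the difference the lazy quantifier makes, here is a minimal sketch that runs both the greedy and the lazy version against the unspaced input from the question. The helper name `captures` and the variable names are illustrative only, not part of the original answer.

```javascript
// Sample input from the question: two [img] tags with no space between them.
const text = "[img]http://www.bla.com[/img][img]http://www.bla.com[/img]";

// Greedy version: the trailing * takes as much as it can, so it runs past the first [/img].
const greedy = /\[img]([-a-z0-9.\/]+[^"'\s]*)\[\/img]/g;
// Lazy version from the answer: *? stops at the first [/img] it can reach.
const lazy = /\[img]([-a-z0-9.\/]+[^"'\s]*?)\[\/img]/g;

// Illustrative helper: collect capture group 1 of every match.
const captures = (re, s) => Array.from(s.matchAll(re), (m) => m[1]);

const greedyUrls = captures(greedy, text);
const lazyUrls = captures(lazy, text);

console.log(greedyUrls.length); // 1 (one big group)
console.log(lazyUrls);          // [ 'http://www.bla.com', 'http://www.bla.com' ]
```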

why not just write it like this:
/\[img](.*?)\[\/img]/g
Note: the ? after * makes the quantifier lazy, which prevents greedy matching.

Related

javascript regular expression allow name with one space and special Alphabets

How do I write a regular expression that allows a name with one space and accented letters?
I tried [a-zA-Z]+(?:(?:\. |[' ])[a-zA-Z]+)* but it is not working for me.
Example string: Björk Guðmundsdóttir
You may try something along these lines:
^(?!.*[ ].*[ ])[ A-Za-zÀ-ÖØ-öø-ÿ]+$
The negative lookahead asserts that the name does not contain two spaces, so at most one space (or none) is present. The character class then matches any number of letters, including most accented Latin letters. Spaces are also in the class, but the lookahead has already ensured that at most one can appear.
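As a rough check of that pattern, the sketch below tests a few sample names. The accented ranges are written as \u escapes (equivalent to À-Ö, Ø-ö, ø-ÿ), and the inputs are normalized to NFC first, which is my own assumption so that precomposed letters such as ö are compared as single characters. The sample names other than the one from the question are made up.

```javascript
// Same pattern as above, with the accented ranges spelled as \u escapes.
const nameRe = /^(?!.*[ ].*[ ])[ A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]+$/;

// First sample is from the question; the rest are illustrative.
const samples = ["Björk Guðmundsdóttir", "Björk", "Anna Maria Smith", "R2 D2"];

// Normalize to NFC so that accented letters are single code points.
const results = samples.map((s) => nameRe.test(s.normalize("NFC")));

console.log(results); // [ true, true, false, false ]
```

The third name fails because it contains two spaces, and the fourth because digits are not in the character class.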
Use this one:
[a-zA-Z\u00C0-\u00ff]*[ ]{1}[a-zA-Z\u00C0-\u00ff]*

RegEx matching help: won't match on each appearance

I need to write a little RegEx matcher which will match any occurrence of strings in the form of
[a-zA-Z]+(_[a-zA-Z0-9]+)?
If I use the regex above it does match the sections needed but would also match onto the abc part of 4_abc which is not intended. I tried to exclude it with:
(?:[^a-zA-Z0-9_]|^)([a-zA-Z]+(_[a-zA-Z0-9]+)?)(?:[^a-zA-Z0-9_]|$)
The problem is that the 'not' matches at the beginning and end are not really working like I hoped they would. If I use them on the example
a_d Dd_da 4_d d_4
they would block matching the second item, Dd_da, because the space was used in the first match. Sadly I can't use lookarounds because I am using JS.
So the input:
a_d Dd_da 4_d d_4
should match: a_d, Dd_da and d_4
but matches: a_d (there is a space at the end)
Is there another way to match the needed sections, or to not consume the 'anchor' matches?
I really appreciate your help.
You can make use of \b:
\b[a-zA-Z]+(_[a-zA-Z0-9]+)?\b
\b matches the (zero-width) point where either the preceding character or following character is a letter, digit or underscore, but not both. It also matches with the start/end of the string if the first/last character is a letter, digit or underscore.
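A quick sketch of the \b version run against the example input from the question. Variable names are illustrative.

```javascript
// Example input from the question.
const input = "a_d Dd_da 4_d d_4";

// Pattern from the answer, with the g flag so that match() returns every occurrence.
const wordRe = /\b[a-zA-Z]+(_[a-zA-Z0-9]+)?\b/g;

const found = input.match(wordRe);
console.log(found); // [ 'a_d', 'Dd_da', 'd_4' ]

// In 4_abc there is no word boundary between "_" and "a", so nothing matches.
const rejected = "4_abc".match(wordRe);
console.log(rejected); // null
```

Because \b is zero-width, the spaces between items are never consumed, which is why Dd_da is no longer skipped.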

Matching variable-term equations

I am trying to develop a regular expression to match the following equations:
(Price+10%+100+200)
(Price+20%+200)
(Price+30%)
(Price+100)
(Price-10%-100-200)
(Price-20%-200)
(Price-30%)
(Price-100)
My regex so far is...
/([(])+([P])+([r])+([i])+([c])+([e])+([+]|[-]){1}([\d])+([+]|[-])?([\d])+([%])?([)])/g
..., but it only matches the following equations:
(Price+100+10%)
(Price+100+100)
(Price+200)
(Price-100-10%)
(Price-100-100)
(Price-200)
Can someone help me understand how to make my pattern match the full set of equations provided?
Note: Parentheses and 'Price' are musts in the equations that the pattern must match.
Try this, which matches all the input strings provided in the question:
/\(Price([+-]\d+%?){1,3}\)/g
You can test it in a regex fiddle.
Things to note:
Only use parentheses where you want to group. Parentheses around single-possibility, fixed-quantity matches (e.g. ([P])) provide no value.
Use character classes (opened with [ and closed with ]) for multiple characters that can match at a position in the pattern (e.g. [+-]). Single-possibility character classes (e.g. [P]) similarly provide no value.
Yes, character classes (generally) implicitly escape regex special characters within them (e.g. ( in [(] vs. equivalent \( outside a character class), but to just escape regex special characters (i.e. to match them literally), you are better off not using a character class and just escaping them (e.g. \() – unless multiple characters should match at a position in the pattern (per the previous point to note).
The quantifier {1} is (almost) always useless: drop it.
The quantifier + means "one or more" as you probably know. However, in a series of cases where you used it (i.e. ([(])+([P])+([r])+([i])+([c])+([e])+), it would match many values that I doubt you expect (e.g. ((((((PPPrriiiicccceeeeee): basically, don't overuse it. Stop to consider whether you really want to match one or more of the character (class) or group to which + applies in the pattern.
To match a literal string without any regex special characters like Price, just use the literal string at the appropriate position in the pattern – e.g. Price in \(Price.
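The following sketch checks the suggested pattern against the eight equations listed in the question. It drops the g flag, since test() with a global regex keeps state between calls. The variable names and the extra negative sample "(Price)" are my own additions for illustration.

```javascript
// Pattern from the answer, without the g flag so test() is stateless.
const priceRe = /\(Price([+-]\d+%?){1,3}\)/;

// The equations listed in the question.
const equations = [
  "(Price+10%+100+200)", "(Price+20%+200)", "(Price+30%)", "(Price+100)",
  "(Price-10%-100-200)", "(Price-20%-200)", "(Price-30%)", "(Price-100)",
];

const allMatch = equations.every((eq) => priceRe.test(eq));
console.log(allMatch); // true

// At least one signed term is required, so a bare "(Price)" is rejected.
const bareMatches = priceRe.test("(Price)");
console.log(bareMatches); // false
```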
/\(Price[+-](\d)+(%)?([+-]\d+%?)?([+-]\d+%?)?\)/g
works on http://www.regexr.com/
/^[(Price]+\d+\d+([%]|[)])&/i
try at your own risk!

Match simple regex pattern using JS (key: value)

I have a simple scenario where I want to match the following and capture the value:
stuff_in_string,
env: 'local', // want to match this and capture the content in quotes
more_stuff_in_string
I have never written a regex pattern before so excuse my attempt, I am well aware it is totally wrong.
This is what I am trying to say:
Match "env:"
Followed by none or more spaces
Followed by a single or double quote
Capture all until..
The next single or double quote
/env:*?\s+('|")+(.*?)+('|")/g
Thanks
PS here is a #failed fiddle: http://jsfiddle.net/DfHge/
Note: this is the regex I ended up using (not the answer below as it was overkill for my needs): /env:\s+(?:"|')(\w+)(?:"|')/
You can use this:
/\benv: (["'])([^"']*)\1/g
where \1 is a backreference to the first capturing group, thus your content is in the second. This is the simple way for simple cases.
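A minimal sketch of the simple pattern applied to the snippet from the question. The input is reconstructed from the question as a single string, and the variable names are illustrative.

```javascript
// Input reconstructed from the question.
const source = "stuff_in_string,\nenv: 'local',\nmore_stuff_in_string";

// Group 1 holds the opening quote, group 2 holds the content.
const m = /\benv: (["'])([^"']*)\1/.exec(source);
const env = m ? m[2] : null;

console.log(env); // local
```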
Now, other cases like:
env: "abc\"def"
env: "abc\\"
env: "abc\\\def"
env: "abc'def"
You must use a more constraining pattern:
first: avoid the different quotes problem:
/\benv: (["'])((?:[^"']+|(?!\1)["'])*)\1/g
I put all the possible content in a non-capturing group that I can repeat at will, and I use a negative lookahead (?!\1) to check that the allowed quote is not the same as the captured quote.
second: the backslash problem:
If a quote is escaped, it can't be the closing quote! Thus you must check if the quote is escaped or not and allow escaped quotes in the string.
I remove the backslashes from allowed content:
/\benv: (["'])((?:[^"'\\]+|(?!\1)["'])*)\1/g
I allow escaped characters:
/\benv: (["'])((?:[^"'\\]+|(?!\1)["']|\\[\s\S])*)\1/g
To allow a variable number of spaces before the quoted part, you can replace : by :\s*
/\benv:\s*(["'])((?:[^"'\\]+|(?!\1)["']|\\[\s\S])*)\1/g
You now have a working pattern.
third: pattern optimization
a simple alternation:
Using a capture group and a backreference is tempting for handling the two quote types, since it keeps the pattern concise. However, it requires a capture group plus a lookahead test in (?!\1)["'], so it is not very efficient. Writing a plain alternation makes the pattern longer and needs two capture groups, one per case, but it is more efficient:
/\benv:\s*(?:"((?:[^"\\]+|\\[\s\S])*)"|'((?:[^'\\]+|\\[\s\S])*)')/g
(note: if you decide to do that, you must check which of the two capture groups is defined.)
unrolling the loop:
To match the content inside quotes we use (?:[^"\\]+|\\[\s\S])* (for double quotes here). That works, but the number of steps can be reduced by unrolling the loop, which means rewriting the repeated alternation so that no alternation is needed:
[^"\\]*(?:\\[\s\S][^"\\]*)*
finally the whole pattern can be written like this:
/\benv:\s*(?:"([^"\\]*(?:\\[\s\S][^"\\]*)*)"|'([^'\\]*(?:\\[\s\S][^'\\]*)*)')/g
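Since the final pattern uses two capture groups, the caller has to pick whichever one is defined. A small sketch of that, where the function name `readEnv` is hypothetical and the g flag is dropped to keep exec() stateless:

```javascript
// Final pattern from above, without the g flag.
const envRe = /\benv:\s*(?:"([^"\\]*(?:\\[\s\S][^"\\]*)*)"|'([^'\\]*(?:\\[\s\S][^'\\]*)*)')/;

// Hypothetical helper: return the raw quoted content, or null if there is no match.
function readEnv(s) {
  const m = envRe.exec(s);
  if (!m) return null;
  // Group 1 is set for double quotes, group 2 for single quotes.
  return m[1] !== undefined ? m[1] : m[2];
}

console.log(readEnv("env: 'local'"));       // local
console.log(readEnv('env: "abc\\"def"'));   // abc\"def (escape sequence kept as written)
console.log(readEnv("env: \"abc'def\""));   // abc'def
console.log(readEnv("env: local"));         // null
```

The returned value is the raw text between the quotes, so escape sequences such as \" are not unescaped here.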
env: *('|").*?\1 is what you're looking for
the * means zero or more (here, zero or more spaces)
('|") matches either a single or double quote, and also saves it into a group for backreferencing
.*? is a reluctant (lazy) match of any characters, taking as few as possible
\1 references the first group, which was either a single or double quote
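regex = /env: ?['"]([^'"]+)['"]/
answer = str.match(regex)[1]
even better, make the closing quote match the opening one (a backreference does not work inside a character class, so use a lazy .*? instead; the value is then in group 2):
regex = /env: ?(['"])(.*?)\1/
answer = str.match(regex)[2]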

Matching by beginning of a string or by a single character in JavaScript

I have two regular expressions:
/\/(\w\w+)/g
/(^\w\w+)/g
and am wondering if there's any way to combine them into a single regex? Basically I want to find any part of a string that either starts with /, or is the beginning of the string, and then is a word with 2 or more characters in it.
Thanks.
Yes you can:
/(?:^|\/)(\w{2,})/g
Use a non-capturing group to alternate between the starting conditions.
This will keep the capturing group number the same as in the originals too.
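For illustration, here is the combined pattern run over a made-up path string. The sample input and variable names are assumptions, not from the question.

```javascript
// Combined pattern from the answer.
const combined = /(?:^|\/)(\w{2,})/g;

// Made-up sample input.
const path = "home/ab/c/users";

// Group 1 is the word itself, just as in the two original patterns.
const words = Array.from(path.matchAll(combined), (m) => m[1]);

console.log(words); // [ 'home', 'ab', 'users' ]
```

The single-character segment "c" is skipped because \w{2,} requires at least two word characters.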
