Need help writing a regex pattern

I am trying to find a pattern in a string that has a value that starts with ${ and ends with }. There will be a word between the curly brackets, but I won't know what word it is.
This is what I have \$\\{[a-zA-Z]\\}
${a} works, but ${aa} doesn't. It seems it's only looking for a single character.
I am unsure what I am doing wrong, or how to fix it and would appreciate any help anyone can provide.

I think this could help you
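var str = "The quick brown ${fox} jumps over the lazy ${dog}";
var re = /\$\{([a-z]+)\}/gi;
var match;
// re.exec returns the next match on each call, or null when none remain
while ((match = re.exec(str)) !== null) {
  console.log(match[1]); // the captured word between ${ and }
}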
Output:
"fox"
"dog"
Explanation
+ means match 1 or more of the previous term — in this example, match 1 or more of [a-z]
the (...) parentheses will "capture" the match so you can actually do something with it — in my example, I'm just using console.log to output it
the i modifier (at the end of the regexp) means perform a case-insensitive match
the g modifier means match all instances of this regexp in the target string
The while loop will continue running for each match that re.exec finds. Once re.exec cannot match another instance, it will return null and the loop will exit.
Additional information
Try console.log(match) using the code above. Each match comes with other useful information such as the string index where the match occurred
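A minimal sketch of what that extra information looks like, assuming an environment that supports String.prototype.matchAll (ES2020). The variable names are only illustrative.

```javascript
const str = "The quick brown ${fox} jumps over the lazy ${dog}";
const re = /\$\{([a-z]+)\}/gi;

// Collect the captured word and the string index of each match.
const found = [];
for (const match of str.matchAll(re)) {
  found.push({ word: match[1], index: match.index });
}

console.log(found);
```

Each entry pairs the captured word with the position where the full ${...} token starts.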
Gotchas
This will not work for nested ${} sets
For example, this regexp will not work on "The quick brown ${fox jumps ${over}} the lazy ${dog}."

You're close!
All you need is to use a + to tell the expression that there will be one or more of whatever was just before it (in this case [a-zA-Z]) like this:
\${[a-zA-Z]+}

A good website for regex reference and testing is http://rubular.com/
It looks like you need to add a +, which tells the regex to look for one or more of a character.
Try: \${[a-zA-Z]+}

You need to use * (zero or more) or + (one or more). So this [a-zA-Z] would be [a-zA-Z]+, meaning 1 or more letters. The entire regex would look like:
\$\{[a-zA-Z]+\}
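As a quick check of the difference between the quantifiers, here is a sketch that tries a few inputs. The ^ and $ anchors are an addition for testing whole strings and are not part of the answers above.

```javascript
const single = /^\$\{[a-zA-Z]\}$/;      // exactly one letter
const oneOrMore = /^\$\{[a-zA-Z]+\}$/;  // one or more letters
const zeroOrMore = /^\$\{[a-zA-Z]*\}$/; // zero or more letters

const inputs = ["${a}", "${aa}", "${}"];
const results = inputs.map((s) => ({
  input: s,
  single: single.test(s),
  oneOrMore: oneOrMore.test(s),
  zeroOrMore: zeroOrMore.test(s),
}));

console.table(results);
```

Only the * variant accepts the empty ${}, which is usually a reason to prefer + here.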

Related

Use regex to find and replace every second backtick in a string

I am trying to write a script that will replace every SECOND backtick with a backtick and semi colon. See below for expected behavior:
"`Here is my string`"
Needs to become:
"`Here is my string`;"
I have found a few helpful answers on stack, such as this one, this one and this one but when I try the replacement on this solution it selects all occurrences, rather than every second occurrence. And on this solution it selects every FIRST occurrence instead of every second one.
As of now I have tried...
str.replace(/\`.*?\`*/g, '`;')
...as well as...
str.replace(/\w*\`\b/gm, '`;')
Both have gotten me close but I can't seem to just get every SECOND backtick by itself.

If you want to replace every second backtick, you might use a capturing group and a negated character class
In the replacement you could use $1`;
(`[^`]*)`
Explanation
( Capture group
`[^`]* Match a backtick, match 0+ times any char except a backtick using a negated character class
)` Close group 1 and match a backtick
const regex = /(`[^`]*)`/g;
const str = `\`Here is my string\` this is another test \`Here is my string\``;
const result = str.replace(regex, `$1\`;`);
console.log(result);

Finding the second backtick is easy, but finding every second backtick is harder. I think this should work:
"`Here is my string` and `another` and `another`".replace(/`.*?(`.*?`)*?`/g, '$&;');
// -> "`Here is my string`; and `another`; and `another`;"
Let's dig into what that regex means.
It finds the first backtick, followed by anything. Note the ? in .*?: this makes the match lazy, so it finds the shortest match, not the longest.
It then finds an even number (0, 2, 4, ...) of following backticks (1 + even = an odd number of backticks so far), again separated by anything, matched lazily (.*?).
It then finds a final backtick.
It replaces that in the string with $& (everything that was matched), then adds the semicolon.
The g flag at the end then makes it global, so we replace every available match, not just the first one.
Depending on your input you might want to make this more rigorous, it's just a proof of concept. For potentially large inputs especially you may need to watch out for catastrophic backtracking with regular expressions including multiple .* sections like this.
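One way to sidestep the backtracking concern is to match single backticks and count them in a replacer function. This is a sketch of a different technique than the regex above, and the function name is hypothetical.

```javascript
// Appends ";" after every second backtick by counting occurrences.
// The regex only ever matches one character, so there is nothing to backtrack over.
function appendSemicolonToEverySecondBacktick(input) {
  let count = 0;
  return input.replace(/`/g, (tick) => {
    count += 1;
    return count % 2 === 0 ? tick + ";" : tick;
  });
}

console.log(
  appendSemicolonToEverySecondBacktick("`Here is my string` and `another`")
);
```

Because the counter is local to each call, the function can be reused on several strings without carrying state between them.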

You can try this
(`[^`]*)`
let str = "`Here is my string` some more string with ``` some more ``` and` and `"
let final = str.replace(/(`[^`]*)`/g,'$1`;')
console.log(final)

I will share a trick I spotted when I was learning regex. To match everything between two occurrences of a specific character, use this construction:
# is the character that encloses some content
#[^#]*#
In your case, I believe you want to use this approach:
`[^`]+(`)
Use * instead of + if you also want to match the case where the two backticks contain nothing.
This captures every second backtick in the first group. After that, you can replace that group with `;

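This expression might do that when each line contains a single pair of backticks. Because .* is greedy, only the last backtick on each line gets the semicolon: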
const regex = /(.*`.*`)/gm;
const str = `\`Here is my string\``;
const subst = `$1;`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log(result);
The expression is explained in the top-right panel of regex101.com, if you wish to explore, simplify, or modify it.

Inconsistent Regex Results [duplicate]

I need a regex that will only find matches where the entire string matches my query.
For instance if I do a search for movies with the name "Red October" I only want to match on that exact title (case insensitive) but not match titles like "The Hunt For Red October". Not quite sure I know how to do this. Anyone know?
Thanks!

Try the following regular expression:
^Red October$
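The ^ marks the start of the matching text and $ the end. By default, regular expressions are case sensitive, so enable your engine's case-insensitive option (for example, the i flag in JavaScript) if case should be ignored.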
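In JavaScript, the anchored pattern combined with the i flag could look like the sketch below. The helper name is hypothetical.

```javascript
// Returns true only when the whole title is "Red October", ignoring case.
const isExactTitle = (title) => /^Red October$/i.test(title);

console.log(isExactTitle("red october"));
console.log(isExactTitle("The Hunt For Red October"));
```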

Generally, and with default settings, ^ and $ anchors are a good way of ensuring that a regex matches an entire string.
A few caveats, though:
If you have alternation in your regex, be sure to enclose your regex in a non-capturing group before surrounding it with ^ and $:
^foo|bar$
is of course different from
^(?:foo|bar)$
Also, ^ and $ can take on a different meaning (start/end of line instead of start/end of string) if certain options are set. In text editors that support regular expressions, this is usually the default behaviour. In some languages, especially Ruby, this behaviour cannot even be switched off.
Therefore there is another set of anchors that are guaranteed to only match at the start/end of the entire string:
\A matches at the start of the string.
\Z matches at the end of the string or before a final line break.
\z matches at the very end of the string.
But not all languages support these anchors, most notably JavaScript.
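The two caveats can be checked in JavaScript with a short sketch. The sample strings are made up for illustration.

```javascript
const ungrouped = /^foo|bar$/;   // "starts with foo" OR "ends with bar"
const grouped = /^(?:foo|bar)$/; // the whole string is "foo" or "bar"

const checks = {
  ungroupedFoobaz: ungrouped.test("foobaz"),
  groupedFoobaz: grouped.test("foobaz"),
  // Without the m flag, ^ and $ anchor to the whole string.
  wholeString: /^bar$/.test("foo\nbar"),
  // With the m flag, ^ and $ also match at line boundaries.
  perLine: /^bar$/m.test("foo\nbar"),
};

console.log(checks);
```

Since JavaScript lacks \A and \z, the usual approach is simply to leave the m flag off when the whole string must match.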

I know that this may be a little late to answer this, but maybe it will come in handy for someone else.
Simplest way:
var someString = "...";
var someRegex = "...";
var match = Regex.Match(someString, someRegex);
// The match counts only if it spans the entire input string
if (match.Success && match.Value.Length == someString.Length) {
    // pass
} else {
    // fail
}

Use the ^ and $ anchors to denote where the regex pattern sits relative to the start and end of the string:
Regex.Match("Red October", "^Red October$"); // pass
Regex.Match("The Hunt for Red October", "^Red October$"); // fail

You need to enclose your regex in ^ (start of string) and $ (end of string):
^Red October$

If the string may contain regex metacharacters (. { } ( ) $ etc.), I propose using
^\QYourString\E$
\Q starts quoting all the characters until \E.
Otherwise the regex can be inappropriate or even invalid.
Note that \Q...\E is available in engines such as Java, Perl, and PCRE, but not in JavaScript or .NET (where Regex.Escape serves the same purpose).
If the language takes the regex as a string parameter (as I see in the example), a double backslash should be used:
^\\QYourString\\E$
Hope this tip helps somebody.
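JavaScript has no \Q...\E, so a rough equivalent is to escape the metacharacters yourself before building the pattern. The sketch below uses a commonly seen escaping idiom, and both function names are hypothetical.

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when candidate equals title as a whole string, ignoring case.
function matchesExactly(title, candidate) {
  const pattern = new RegExp("^" + escapeRegExp(title) + "$", "i");
  return pattern.test(candidate);
}

console.log(matchesExactly("Red October", "red october"));
console.log(matchesExactly("a.c", "abc"));
```

Without the escaping step, the dot in "a.c" would act as a wildcard and "abc" would match.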

Sorry, but that's a little unclear.
From what I read, you want to do a simple string comparison. You don't need regex for that.
string myTest = "Red October";
bool isMatch = (myTest.ToLower() == "Red October".ToLower());
Console.WriteLine(isMatch);
isMatch = (myTest.ToLower() == "The Hunt for Red October".ToLower());

You can do it like this. For example, the following pattern matches only a three-character string whose middle character is a lowercase e, and you can check it with myRegex.IsMatch():
^[^e][e]{1}[^e]$

What does this JavaScript Regular Expression /[^\d.-] mean?

We had a developer here who had added following line of code to a web application:
var amount = newValue.replace(/[^\d.-]/g, '');
The particular line deals with amount values that a user may enter into a field.
I know the following about the regular expression:
that it replaces the matches with empty strings (i.e. removes them)
that /g is a flag that means to match all occurrences inside "newValue"
that the brackets [] denote a special group
that ^ means beginning of the line
that d means digits
Unfortunately I do not know enough to determine what kind of strings this should match. I checked with some web-based regex testers if it matches e.g. strings like 98.- and other alternatives with numbers but so far no luck.
My problem is that it seems to make IE very slow so I need to replace it with something else.
Any help on this would be appreciated.
Edit:
Thanks to all who replied. I tried not just Google but sites like myregextester.com, regular-expressions.info, phpliveregex.com, and others. My problem was misunderstanding the meaning of ^ and expecting that this required a numeric string like 44.99.

Inside the brackets, when ^ is the first character, it negates the character class. In other words, it says to match any character that is not one of those listed.
So this will mean "match anything that is not a digit, a period, or a hyphen".

The ^ character is a negation character.
var newValue = " x44x.-x ";
var amount = newValue.replace(/[^\d.-]/g, '');
console.log(amount);
will print
44.-
I suspect the developer maybe just wanted to remove trailing whitespaces? I would rather try to parse the string for numbers and remove anything else.
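A sketch of the "parse the number instead" idea follows. It assumes amounts look like an optional minus sign, digits, and an optional decimal part. The function name and the accepted format are assumptions, not something stated in the question.

```javascript
// Extracts the first number-like token from the input, or null if none exists.
function parseAmount(input) {
  const match = /-?\d+(?:\.\d+)?/.exec(input);
  return match ? Number(match[0]) : null;
}

console.log(parseAmount(" x44.99x "));
console.log(parseAmount("no digits here"));
```

Unlike stripping characters, this never produces leftovers such as "44.-" that are not valid numbers.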

Why doesn’t the alternation (pipe) operator ( | ) in JavaScript regular expressions give me two matches?

Here is my regular expression:
"button:not([DISABLED])".match(/\([^()]+\)|[^()]+/g);
The result is:
["button:not", "([DISABLED])"]
Is it correct? I'm confused. Because the (pipe) operator | means "or", I think the correct result is:
["button:not", "[DISABLED]", "([DISABLED])"]
Because this:
["button:not", "[DISABLED]"]
is the result of:
"button:not([DISABLED])".match(/[^()]+/g);
and this:
["([DISABLED])"]
is the result of:
"button:not([DISABLED])".match(/\([^()]+\)/g);
But the result output in console tell me the result is:
["button:not", "([DISABLED])"]
Where is the problem?

The regex
/\([^()]+\)|[^()]+/g
Basically says: There are two options, match (1) \([^()]+\) OR (2) [^()]+, wherever you see any of them (/g).
Let's iterate at your sample string so you understand the reason behind the obtained result.
Starting string:
button:not([DISABLED])
Steps:
The cursor begins at the char b (actually it begins at the start-of-string anchor, ^, but for this example it is irrelevant).
Between the two available options, b can only match the (2), as the (1) requires a starting (.
Now that it has begun to match the (2), it will keep on matching it all the way, meaning it will consume everything that's not a ( or ).
From the item above, it consumes everything up to (and including) the t char (because the next char is a ( which does not match [^()]+) thus leaving button:not as first matched string.
Now the cursor is at (. Does it begin to match any of the options? Yes, the first one: \([^()]+\).
Again, now that it has begun to match the (1), it will go through it all the way, meaning it will consume everything that's not a ( or ) until it finds a ) (if while consuming it finds a ( before a ), it will backtrack as that will mean the (1) regex was ultimately not matched).
Now it keeps consuming all the remaining characters until it finds ), leaving then ([DISABLED]) as second matched string.
Since we have reached the last character, the regex processing ends.
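The walk-through above can be observed directly by printing where each match starts and ends. This is a sketch assuming String.prototype.matchAll (ES2020) is available.

```javascript
const selector = "button:not([DISABLED])";
const pattern = /\([^()]+\)|[^()]+/g;

// Record the text and position of each successive, non-overlapping match.
const steps = [...selector.matchAll(pattern)].map((m) => ({
  text: m[0],
  start: m.index,
  end: m.index + m[0].length,
}));

console.log(steps);
```

The second match starts exactly where the first one ends, which is why "[DISABLED]" never shows up as a separate result.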
Note about the precedence of expressions separated by |: Because of the way the JavaScript regex engine processes strings, the order in which the alternatives appear matters. It will evaluate each alternative in the order given. If one of those alternatives matches to the end, it will not attempt any other alternative, even if it could also match. Hopefully an example makes it clearer:
"aaa".match(/a|aa|aaa/g); // ==> ["a", "a", "a"]
"aaa".match(/aa|aaa|a/g); // ==> ["aa", "a"]
"aaa".match(/aaa|a|aa/g); // ==> ["aaa"]

Your understanding of the alternation operator seems to be incorrect. It does not look for all possible matches, only for the first one that matches (from left to right).
Consider (a | b) as "match either a or b".
See also: http://www.regular-expressions.info/alternation.html

I’m not very good on regular expressions, but I think they work by giving you one thing that matches them, rather than all things that could match them.
So, the | operator says: “give me something that matches the left regular expression, or something that matches the right regular expression”.
As your string contains something that matches the left regular expression, you just get that.

With the g flag, a regex returns successive non-overlapping matches, not every possible match. Once "([DISABLED])" has been matched, the search resumes after it, so "[DISABLED]", which is a substring of that match, is never reported separately.
Consider the following example:
"123 456789".match( /[0-9]{4,6}/g )
You want to find the one number that is between 4 and 6 digits long. If the result would be all possible numbers that match the regex, it wouldn't be of much use:
[ "4567", "5678", "6789", "45678", "56789", "456789" ] // you don't want this

JavaScript regexp not matching

I am having a difficult time getting a seemingly simple Regexp. I am trying to grab the last occurrences of word characters between square brackets in a string. My code:
pattern = /\[(\w+)\]/g;
var text = "item[gemstones_attributes][0][shape]";
if (pattern.test(text)) {
alert(RegExp.lastMatch);
}
The above code is outputting "gemstones_attributes", when I want it to output "shape". Why is this regexp not working, or is there something wrong with my approach to getting the last match? I'm sure that I am making an obvious mistake - regular expressions have never been my strong suit.
Edit:
There are cases in which the string will not terminate with a right-bracket.

You can greedily match as much as possible before your pattern which will result in your group matching only the last match:
pattern = /.*\[(\w+)\]/g;
var text = "item[gemstones_attributes][0][shape]";
var match = pattern.exec(text);
if (match != null) alert(match[1]);

RegExp.lastMatch gives the match of the last regular expression. It isn't the last match in the text.
Regular expressions parse left to right and are greedy. So your regexp matches the first '[' it sees and grabs the words between it. When you call lastMatch it gives you the last pattern matched. What you need is to match everything you can first .* and then your pattern.
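An alternative sketch that avoids RegExp.lastMatch entirely: collect every bracketed word and take the last one. It assumes String.prototype.matchAll (ES2020), and the function name is hypothetical.

```javascript
// Returns the last run of word characters enclosed in [ ], or null if none exists.
function lastBracketedWord(text) {
  const matches = [...text.matchAll(/\[(\w+)\]/g)];
  return matches.length > 0 ? matches[matches.length - 1][1] : null;
}

console.log(lastBracketedWord("item[gemstones_attributes][0][shape]"));
```

This also covers the case from the question's edit where the string does not end with a right bracket.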

I think your problem is in your regex, not in the .lastMatch line.
Your regex returns just the first match of your square brackets and not all matches. You can try adding some groups to your regular expression, and normally you should get all matches.

Use match() instead of test():
if (text.match(pattern))
test() only checks whether there is a match inside a string. It succeeds after the first occurrence, so no further parsing takes place.
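A short sketch of how the two methods behave with the g flag on the question's input. The variable names are illustrative only.

```javascript
const text = "item[gemstones_attributes][0][shape]";
const pattern = /\[(\w+)\]/g;

// With the g flag, match() returns every full match (brackets included).
const allMatches = text.match(pattern) || [];
const lastFullMatch =
  allMatches.length > 0 ? allMatches[allMatches.length - 1] : null;

// test() on a g-flagged regex is stateful: a successful call advances lastIndex.
const firstTest = pattern.test(text);
const lastIndexAfterTest = pattern.lastIndex;
pattern.lastIndex = 0; // reset before reusing the same regex object

console.log(allMatches, lastFullMatch, firstTest, lastIndexAfterTest);
```

Note that match() with the g flag returns the full matches rather than the capture groups, so the brackets still need to be stripped if only the word is wanted.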
