Matching "{CHARACTERS}" using Javascript RegExp


I'm really struggling with the Javascript version of Regular Expression matching, despite knowing how to do it in other languages like C# and PHP.
I wish to match {ANYCHARACTERS}.
It must have:
a { at the start
a } at the end
1 or more characters between (any characters, symbols etc.)
So far I have the following:
<script type="text/javascript">
// The string that I want to perform a match on
var str = "{ASTRINGINHERE£$%^&*éáó}";
// My matching expression
var patt1 = ^/{(.*){1,*}/}$/i;
// Write the matched result
document.write(str.match(patt1));
</script>

As written, your current pattern should result in a javascript syntax error. Here are the problems I see:
You have your ^ character outside the actual regular expression.
You have two regular expression ending characters (/).
See @kopischke's answer on why I removed the {1,} portion.
This should resolve your issues:
/^{(.+)}$/i

The start-of-string (^) and end-of-string ($) anchors belong inside the regex delimiters. Also, your repetition is unnecessarily complex: + already means "one or more". Finally, there is no need for the case-insensitive flag (i) when you match any character. This should do:
patt1 = /^{.+}$/
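As a minimal sketch of the corrected pattern in use (the braces are escaped here as a precaution, since unescaped braces are rejected when the u flag is set; the variable names are only illustrative):

```javascript
// Anchors live inside the slashes; (.+) captures one or more characters between the braces.
const pattern = /^\{(.+)\}$/;
const str = "{ASTRINGINHERE£$%^&*éáó}";

// match() returns null when nothing matches, so guard before reading the group.
const match = str.match(pattern);
const inner = match ? match[1] : null;
console.log(inner);

// An empty pair of braces does not satisfy "one or more characters".
const emptyBraces = "{}".match(pattern);
console.log(emptyBraces);
```

The capture group is optional; drop the parentheses if only a yes/no answer is needed and use pattern.test(str) instead.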

Related

Negate random regular expression

Is there a way to negate any regular expression? I'm using regular expressions to validate input on a form. I'm now trying to create a button that sanitizes my input. Is there a way so I can use the regular expression used for the validating also for stripping the invalid characters?
I'm using this regex for validation of illegal characters
<input data-val-regex-pattern="[^|<>:\?'\*\[\]\=%\$\+,;~&\{\}]*" type="text" />
When clicking on a button next to it, I'm calling this function:
$('#button').click(function () {
var inputElement = $(this).prev();
var regex = new RegExp(inputElement.attr('data-val-regex-pattern'), 'g');
var value = inputElement.val();
inputElement.val(value.replace(regex, ''));
});
At the moment the javascript is doing the exact opposite of what I'm trying to accomplish. I need to find a way to 'reverse' the regex.
Edit: I'm trying to reverse the regex in the javascript function. The regex in the data-val-regex-pattern-attribute is doing his job for validation.

To find the invalid characters, just remove the ^ from your regex. A caret at the start of a character class negates everything inside the brackets.
data-val-regex-pattern="[|<>:\?'\*\[\]\=%\$\+,;~&\{\}]*"
This will return the undesired characters so you can replace them.
Also, as you want to reject a lot of non-word characters, you could try a simpler regex. If you want to allow only word characters and whitespace, you could validate with something like this:
data-val-regex-pattern="[\w\s]*"

Your regex is:
[^|<>:\?'\*\[\]\=%\$\+,;~&\{\}]*
That means it matches any run of valid (non-forbidden) characters.
You then replace those runs with the empty string, so only the bad characters are left.
Try this instead, without the negation (the caret is moved away from the first position, so it is treated as a literal character):
[|^<>:\?'\*\[\]\=%\$\+,;~&\{\}]*
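A rough sketch of how the sanitizing side could derive its regex from the validation pattern, assuming the pattern always has the exact shape [^...]* as in the question. The helper name invalidCharsRegex is hypothetical, not part of any library:

```javascript
// Hypothetical helper: turns a validation pattern of the form "[^...]*"
// into a global RegExp that matches each forbidden character.
function invalidCharsRegex(validationPattern) {
  const parts = /^\[\^([\s\S]+)\]\*$/.exec(validationPattern);
  if (!parts) {
    throw new Error("expected a pattern of the form [^...]*");
  }
  // Rebuild the class without the leading caret, so it is no longer negated.
  return new RegExp("[" + parts[1] + "]", "g");
}

// The pattern from the data-val-regex-pattern attribute, written as a JS string.
const validation = "[^|<>:\\?'\\*\\[\\]\\=%\\$\\+,;~&\\{\\}]*";
const sanitized = "a<b>c*d".replace(invalidCharsRegex(validation), "");
console.log(sanitized);
```

This only works for a single negated character class; for arbitrary validation patterns there is no such mechanical inversion.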

The following answer is to the general question of negating a regular expression. In your specific case you just need to negate a character group, or more precisely remove the negation of a character group - which is detailed in other answers.
Regular languages (those consisting of all strings matched entirely by some RE) are in fact closed under negation: there is another RE which matches exactly those strings the original RE does not. It is, however, not trivial to construct, which perhaps explains why RE implementations often do not offer a negation operator.
However the Javascript regexp language has extensions that make it more expressive than regular languages; in particular there is the construct of negative lookahead.
If R1 is a regexp then
^(?!.*(R1))
matches precisely the strings that do not contain a match for R1.
And
^(?!R1$)
matches precisely the strings where the whole string is not a match for R1.
I.e., negation.
For rewriting any substring not matching a given regexp, the above is insufficient. One would have to do something like
((?!R1).)*
This would catch any substring not containing a sub-substring that matches R1. But consideration of the edge cases shows that this does not quite do what we are after. For example, ((?!ab).)* matches "b" in "ab", because "ab" is not a substring of "b".
One can cheat and write the regexp as:
(.*)(R1|$)
And rewrite to T1$2
Where T1 is the target string you want to rewrite to.
This should rewrite any portion of the string not matching R1 to T1. However I would be very careful about any edge cases for this. So much so that it might be better to write the regexp from scratch rather than trying a general approach.
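The lookahead constructions above can be sketched as follows, assuming single-line input (the dot does not match line terminators) and using "ab" as a stand-in for R1. The non-capturing group around R1 in the second pattern is an addition here, so that an R1 containing alternation stays bound to the $:

```javascript
// "ab" is only a placeholder for R1.
const r1 = "ab";

// True for strings that contain no match for R1 anywhere.
const lacksR1 = new RegExp("^(?!.*(" + r1 + "))");

// True for strings that are not, as a whole, a match for R1.
const notExactlyR1 = new RegExp("^(?!(?:" + r1 + ")$)");

const checks = {
  lacksPlain: lacksR1.test("xyz"),
  lacksWithAb: lacksR1.test("xaby"),
  wholeAb: notExactlyR1.test("ab"),
  wholeAbc: notExactlyR1.test("abc"),
};
console.log(checks);

// The edge case from the text: "b" is reported as a run not containing "ab".
const edgeCase = "ab".match(/((?!ab).)*/g);
console.log(edgeCase);
```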

JavaScript regexp not matching

I am having a difficult time getting a seemingly simple Regexp. I am trying to grab the last occurrences of word characters between square brackets in a string. My code:
pattern = /\[(\w+)\]/g;
var text = "item[gemstones_attributes][0][shape]";
if (pattern.test(text)) {
alert(RegExp.lastMatch);
}
The above code is outputting "gemstones_attributes", when I want it to output "shape". Why is this regexp not working, or is there something wrong with my approach to getting the last match? I'm sure that I am making an obvious mistake - regular expressions have never been my strong suit.
Edit:
There are cases in which the string will not terminate with a right-bracket.

You can greedily match as much as possible before your pattern which will result in your group matching only the last match:
pattern = /.*\[(\w+)\]/g;
var text = "item[gemstones_attributes][0][shape]";
var match = pattern.exec(text);
if (match != null) alert(match[1]);

RegExp.lastMatch gives the text matched by the most recent regular expression operation. It is not the last match in the text.
Regular expressions scan left to right, and quantifiers are greedy. So your regexp matches at the first '[' it sees and captures the word characters after it. When you read lastMatch, it gives you that most recent match. What you need is to first consume as much as you can with .* and then match your pattern.
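A short sketch comparing the greedy-prefix approach with an alternative that collects every match and keeps the last one. The matchAll variant is an addition here and assumes an environment that supports String.prototype.matchAll:

```javascript
const text = "item[gemstones_attributes][0][shape]";

// Greedy prefix: .* swallows as much as possible, so the group lands on the last bracket pair.
const greedy = /.*\[(\w+)\]/.exec(text);
const lastViaGreedy = greedy ? greedy[1] : null;

// Alternative: gather all bracketed words and take the final one.
const all = [...text.matchAll(/\[(\w+)\]/g)];
const lastViaAll = all.length > 0 ? all[all.length - 1][1] : null;

// Both approaches also work when the string does not end with a right bracket.
const trailing = /.*\[(\w+)\]/.exec("item[a][b]tail");
const lastWithTail = trailing ? trailing[1] : null;

console.log(lastViaGreedy, lastViaAll, lastWithTail);
```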

I think your problem is in your regex, not in the .lastMatch line.
Your regex returns just the first match of your square brackets and not all matches. You can try adding some groups to your regular expression, and then you should normally get all matches.
krikit

Use match() instead of test()
if (text.match(pattern))
test() checks for a match inside a string. It succeeds after the first occurrence, so there is no need for further parsing.

Nice way to do this regex substitution

I'm writing a JavaScript function which takes a regex and some elements, and matches the regex against each element's name attribute.
Let's say I'm passed this regex
/cmw_step_attributes\]\[\d*\]/
and a string that is structured like this
"foo[bar][]chicken[123][cmw_step_attributes][456][name]"
where all the numbers could vary, or be missing. I want to match the regex against the string in order to swap out the 456 for another number (which will vary), e.g. 789. So, I want to end up with
"foo[bar][]chicken[123][cmw_step_attributes][789][name]"
The regex will match the string, but I can't swap out the whole match for 789 as that will wipe out the "[cmw_step_attributes][" bit. There must be a clean and simple way to do this but I can't get my head round it. Any ideas?
thanks, max

Capture the first part and put it back into the string.
.replace(/(cmw_step_attributes\]\[)\d*/, '$1789');
// note I removed the closing ] from the end - quantifiers are greedy so all numbers are selected
// alternatively:
.replace(/cmw_step_attributes\]\[\d*\]/, 'cmw_step_attributes][789]')

Either literally rewrite the part that must remain the same in the replacement string, or place it inside capturing parentheses and reference it in the replacement.
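One way to sketch the capture-and-reinsert idea is with a replacement function, which sidesteps any ambiguity when digits follow a group reference such as $1 in a replacement string. The function name swapStepIndex is hypothetical:

```javascript
// Hypothetical helper: replaces the number after [cmw_step_attributes] with newIndex.
function swapStepIndex(name, newIndex) {
  return name.replace(
    /(cmw_step_attributes\]\[)\d*(\])/,
    // prefix and suffix are the two captured groups; only the digits between them change.
    (whole, prefix, suffix) => prefix + String(newIndex) + suffix
  );
}

const original = "foo[bar][]chicken[123][cmw_step_attributes][456][name]";
const swapped = swapStepIndex(original, 789);
console.log(swapped);

// \d* also covers the case where the number is missing.
const filled = swapStepIndex("x[cmw_step_attributes][][name]", 789);
console.log(filled);
```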

See answer on: Regular Expression to match outer brackets.
Regular expressions are the wrong tool for the job because you are dealing with nested structures, i.e. recursion.

Have you tried:
var str = 'foo[bar][]chicken[123][cmw_step_attributes][456][name]';
str.replace(/cmw_step_attributes\]\[\d*?\]/gi, 'cmw_step_attributes][XXX]');

JavaScript split function - use of escape characters?

The following two examples do the same thing.
I was wondering why Option 1 is given in a code example I found and not Option 2?
What is the significance of the forward/backward slashes in '/\&/'
Option 1.
var pairs = qString.split(/\&/);
Option 2.
var pairs = qString.split('&');

split() is a function that can take a regex as well as a string parameter. The forward-slash syntax is called a regex literal; it does not pass a string but a regex object.
The following statements in javascript are the same.
var regex = /\&/; // Literal
var regex = new RegExp("\\&"); // Explicit

Option 1 uses a RegEx literal, which is declared with surrounding forward slashes (/).
Option 2 uses a string.
See https://developer.mozilla.org/en/Core_JavaScript_1.5_Guide/Regular_Expressions

The first example splits on a regular expression (constructed using the leaning-toothpick (/.../) syntax), while the second splits on a plain string.
Regular expressions are a powerful sub-language that allow complex string matching; in this case, the overhead of using one to split on a literal character (while probably negligible) is a little silly. It's like hiring a top-notch architect to build a wooden cube.
In the first example, the & character is mistakenly escaped (with the \), since it is not special in regular expressions. The regular expression engine gracefully handles that, however, and still treats it as a literal &.
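A small sketch illustrating the points above: both options give the same result for a single literal character, and a regex only earns its keep when the separator is a pattern. The sample query strings are made up for illustration:

```javascript
const qString = "a=1&b=2&c=3";

// Regex literal and plain string split identically on a literal "&".
const byRegex = qString.split(/\&/);
const byString = qString.split("&");

// The literal and the constructor form describe the same pattern.
const sameSource = /\&/.source === new RegExp("\\&").source;

// A regex becomes useful when more than one separator is allowed.
const mixed = "a=1&b=2;c=3".split(/[&;]/);

console.log(byRegex, byString, sameSource, mixed);
```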

Javascript: String replace problem

I've got a string which contains q="AWORD" and I want to replace q="AWORD" with q="THEWORD". However, I don't know what AWORD is. Is it possible to combine a string and a regex to allow me to replace the parameter without knowing its value? This is what I've got thus far...
globalparam.replace('q="/+./"', 'q="AWORD"');

What you have is just a string, not a regular expression. I think this is what you want:
globalparam.replace(/q=".+?"/, 'q="THEWORD"');
I don't know where you got the idea that you have to "combine" a string and a regular expression, but a regex does not need to consist of wildcards only. A regex is a pattern that can contain wildcards but otherwise will try to match the exact characters given.
The expression shown above works as follows:
q=": Match the characters q, = and ".
.+?": Match any character (.) up to (and including) the next ". There must be at least one character (+) and the match is non-greedy (?), meaning it tries to match as few characters as possible. Otherwise, if you used .+", it would match all characters up to the last quotation mark in the string.
Learn more about regular expressions.
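The difference between the non-greedy and greedy forms can be sketched on a string with several quoted values. The sample string is an assumption, chosen only to make the difference visible:

```javascript
const sample = 'a="1" q="AWORD" z="2"';

// Non-greedy: stops at the first closing quote after q=".
const lazy = sample.replace(/q=".+?"/, 'q="THEWORD"');

// Greedy: runs on to the last quotation mark in the string, swallowing z="2".
const greedy = sample.replace(/q=".+"/, 'q="THEWORD"');

console.log(lazy);
console.log(greedy);
```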

Felix's answer will give you the solution, but if you actually want to construct a regular expression using a string you can do it this way:
var fullstring = 'q="AWORD"';
var sampleStrToFind = 'AWORD';
var mat = 'q="'+sampleStrToFind+'"';
var re = new RegExp(mat);
var newstr = fullstring.replace(re,'q="THEWORD"');
alert(newstr);
mat is the regex source you are building, by combining strings or whatever is needed.
re is the RegExp built by the constructor; pass flags such as g (global) or i (case-insensitive) as its second argument.
The last line is string.replace(RegExp, replacement);
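When the search word comes from elsewhere, it may contain characters that are special in a regex. A minimal sketch of escaping it first, where escapeRegExp and replaceParam are hypothetical helper names rather than built-ins:

```javascript
// Hypothetical helper: backslash-escapes characters that are special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replaces q="word" with q="replacement" everywhere in full.
function replaceParam(full, word, replacement) {
  const re = new RegExp('q="' + escapeRegExp(word) + '"', "g");
  // A function replacement keeps any "$" in the replacement text literal.
  return full.replace(re, () => 'q="' + replacement + '"');
}

const result = replaceParam('x q="A.WORD" q="AXWORD"', "A.WORD", "THEWORD");
console.log(result);
```

Without the escaping step, the dot in "A.WORD" would act as a wildcard and the second parameter would be replaced as well.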
