Regex back reference, what am I missing? - javascript

I am writing a micro-templating script where parts of a string are replaced with object options. Here is a short example:
var person = {name:"Smith",age:43};
var string = "I am {name} and I am {age} years old";
document.write(string.replace(/{([\s\S]+?)}/g,(person['$1']||"")));
document.write("<br/>");
document.write(string.replace(/{([\s\S]+?)}/g
, function($0,$1){return person[$1]||"";}));
The second expression works fine, but not the first one. Could anybody explain why? I thought $1 could be directly used as a back reference within a string.

$1, $2, ..., $& can only be used when they're part of a string value passed to replace:
string.replace(/{([\s\S]+?)}/g, '$1(matched)');
// result: "I am name(matched) and I am age(matched) years old"
But, the 1st snippet is effectively:
document.write(string.replace(/{([\s\S]+?)}/g,""));
That's because (person['$1']||"") is not passed to replace as-is. It's a property lookup followed by a logical OR, and both are evaluated before replace is called. person has no property literally named "$1", so the lookup yields undefined, and the resulting value, "", is what's actually passed to replace.
To be able to evaluate an expression after you have a match, you have to use a function to delay the evaluation, as you have in the 2nd snippet.
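The evaluation order can be seen in a small sketch that reuses the person object and string from the question. The variable names eagerArg, first and second are made up for illustration, and console.log stands in for document.write:

```javascript
var person = { name: "Smith", age: 43 };
var string = "I am {name} and I am {age} years old";

// The argument expression is evaluated before replace() is ever called.
// person has no property literally named "$1", so this is undefined || "".
var eagerArg = person['$1'] || "";

// Equivalent to the first snippet: every placeholder is replaced with "".
var first = string.replace(/{([\s\S]+?)}/g, eagerArg);

// A function is invoked once per match, so the lookup runs after matching.
var second = string.replace(/{([\s\S]+?)}/g, function (match, key) {
  return person[key] || "";
});

console.log(first);
console.log(second);
```

Here first ends up with both placeholders removed, while second has them filled in from person.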

It makes sense to me, I mean if you look at this:
person['$1']
You're basically saying "give me the property named $1 on person", which in this case is undefined. The lookup is evaluated before the regex ever captures anything. That's why the function replacement is needed, and why it works in this situation.

The String.prototype.replace method parses $n patterns (as well as $& and several others) only within the replacement string it actually receives. But something like person['$1'] is not a string; it's an expression that is evaluated before replace is called, and it may or may not evaluate to a string. Only after that evaluation are any backreference markers in the resulting value parsed.
I suppose the callback form of replace is the only reasonable way to go in your case.

You don't have single quotes around the $1 in the second regex.

Related

Regular expression (match function), javascript

I think this is a very basic question, but I really can't understand the concept. I have the following regular expression:
var t = '11:59 am';
t.match(/^(\d+)/);
Now, according to my understanding when I print the value I should just get 11 since I am just checking for digits. However, I get 11,11. I have to use 0th element to pick the required value like t.match(/^(\d+)/)[0].
This is because you are using a capture group, (), around the digits. Try replacing this with:
t.match(/^\d+/);
Note: this will still return an array, because that's just what .match() does.
match() returns an array whenever there is a match (and null when there is none). Without the g flag, element [0] is the whole match, and element [1] is what is inside the first set of parentheses.
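A short sketch of the three cases, using the t string from the question. The variable names and the third, non-matching pattern are assumptions added for illustration:

```javascript
var t = '11:59 am';

var withGroup = t.match(/^(\d+)/);   // capture group present
var withoutGroup = t.match(/^\d+/);  // same pattern, no capture group
var noMatch = t.match(/^[a-z]+/);    // pattern that does not match at all

console.log(withGroup[0], withGroup[1]); // whole match, then group 1
console.log(withoutGroup.length);        // only the whole match is present
console.log(noMatch);                    // null, not an empty array
```

Printing withGroup directly joins its two elements with a comma, which is where the "11,11" in the question comes from.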

JavaScript - Understanding the role of a variable in a function passed as an argument

I'm trying to work with regular expressions; in particular, I have to shuffle the middle content of a string. I found an example that fits my needs, and the answer I'm studying is the one by Brian Nickel.
This is the code proposed by Brian Nickel in the question:
myStr.replace(/\b([a-z])([a-z]+)([a-z])\b/ig, function(str, first, middle, last) {
return first +
middle.split('').sort(function(){return Math.random()-0.5}).join('') +
last;
});
I'm a complete beginner in JavaScript and RegEx. I see that a function is passed as an argument here, but I don't understand why it has four parameters. In particular, I don't understand the first parameter, str, or why the function no longer works correctly if I remove it.
I know it's a silly question, but I couldn't find what I wanted on the Web, or maybe I don't know how to search properly. Thanks in advance.
When replace is called with a RegExp and a function, the callback receives 1+n arguments, where n is the number of capture groups (parenthesized sub-patterns), followed by the match offset and the whole input string.
They always come in this order:
The complete matched string.
The first capture group.
The second capture group.
And so on.
If you remove str, the parameter named first receives the complete matched string instead, and every other parameter shifts by one. So even if you don't use this argument, you still have to declare it.
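The argument order can be checked with a small sketch. It reuses the regular expression from the question; the sample input, the calls array and the shifted variable are assumptions added for illustration:

```javascript
var myStr = "Hello brave world"; // sample input, assumed for illustration
var calls = [];

// Record what the callback receives for each match; return the match unchanged.
myStr.replace(/\b([a-z])([a-z]+)([a-z])\b/ig, function (str, first, middle, last) {
  calls.push({ str: str, first: first, middle: middle, last: last });
  return str;
});

// Same regex, but with the first parameter dropped: every name shifts by one.
var shifted = "Hello".replace(/\b([a-z])([a-z]+)([a-z])\b/ig, function (first, middle, last) {
  return first + "|" + middle + "|" + last;
});

console.log(calls);
console.log(shifted);
```

In the second call, first holds the whole match "Hello", middle holds the first group and last holds the second group, which is why removing str breaks the shuffle.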

How can I search for {{.*}} and replace with json

I am trying to create my own simple JavaScript template function.
I want to create an HTML page that looks like this:
<p>
{{HELLO_WORLD}}
<br />
{{MY_NAME_IS}}
</p>
and then use JavaScript to replace anything inside {{}}
with a value from a JSON variable that looks like this:
{HELLO_WORLD: "Hello World!", MY_NAME_IS: "My name is"}
I am a little confused about the right way to do this.
The point is to make a multilanguage web site, so that I can load the JSON for the desired language.
Thanks.
JavaScript supports regular expression-based find-and-replace, with functions for the replacement. So you can do this:
myInputString.replace( /\{\{([^\}]*)\}\}/g, function( s, v ) { return myJSON[v] } );
To explain:
replace takes 2 arguments. The first is a regular expression object. In this case we build one inline using JavaScript's /expression/flags syntax. It looks for 2 opening braces (which need to be escaped because they have special meaning in regular expressions) followed by any characters which are not a closing brace, followed by 2 closing braces. The g means "global", so that it will match all cases rather than just the first one.
When a match is found, the function will be called. The first argument (I called it s) is the full matched string (like "{{abc}}"), the second (I called it v) is set to the first bit in brackets (like "abc").
In real code, you should add error checking (variables which don't exist), and possibly convert to lowercase / whatever.
Full details on replace are here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace
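One possible way to add the error checking mentioned above is sketched below. The function name fillTemplate, the sample data and the decision to leave unknown placeholders untouched are all assumptions for illustration, not part of the original answer:

```javascript
// Hypothetical helper: replaces {{KEY}} with dict[KEY] when the key exists,
// and leaves the placeholder as-is when it does not.
function fillTemplate(input, dict) {
  return input.replace(/\{\{([^}]*)\}\}/g, function (whole, key) {
    var name = key.trim(); // tolerate spaces such as {{ HELLO_WORLD }}
    return Object.prototype.hasOwnProperty.call(dict, name)
      ? String(dict[name])
      : whole;
  });
}

var translations = { HELLO_WORLD: "Hello World!", MY_NAME_IS: "My name is" };
var html = "<p>{{HELLO_WORLD}}<br />{{MY_NAME_IS}} {{UNKNOWN}}</p>";
var rendered = fillTemplate(html, translations);

console.log(rendered);
```

Leaving unknown placeholders visible makes missing translations easy to spot; returning an empty string instead would be an equally reasonable choice.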

How to match between characters but not include them in the result

Say I have a string "&something=variable&something_else=var2"
I want to match between &something= and &, so I'll write a regular expression that looks like:
/(&something=).*?(&)/
And the result of .match() will be an array:
["&something=variable&", "&something=", "&"]
I've always solved this by just replacing the start and end elements manually but is there a way to not include them in the match results at all?
You're using the wrong capturing groups. You should be using this:
/&something=(.*?)&/
This means that instead of capturing the stuff you don't want (the delimiters), you capture what you do want (the data).
You can't avoid them showing up in your match results at all, but you can change how they show up and make it more useful for you.
If you change your match pattern to /&something=(.+?)&/ then using your test string of "&something=variable&something_else=var2" the match result array is ["&something=variable&", "variable"]
The first element is always the entire match, but the second one will be the captured portion from the parentheses, which is generally much more useful.
I hope this helps.
If you are trying to get variable out of the string, using replace with backreferences will get you what you want:
"&something=variable&something_else=var2".replace(/^.*&something=(.*?)&.*$/, '$1')
gives you
"variable"
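The capture-group approach from the answers above can be sketched as follows. The variable names are illustrative, and the second pattern, which also accepts the end of the string as a delimiter, is an assumption added for the case where the parameter comes last:

```javascript
var query = "&something=variable&something_else=var2";

// Capture the data rather than the delimiters.
var m = query.match(/&something=(.*?)&/);
var value = m ? m[1] : null; // match() returns null when nothing matches

// Assumed variant: also accept end-of-string when no trailing & follows.
var lastParam = "&a=1&something=last".match(/&something=(.*?)(?:&|$)/);
var lastValue = lastParam ? lastParam[1] : null;

console.log(value);
console.log(lastValue);
```

The (?:...) group is non-capturing, so it does not add an extra element to the result array.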

RegEx inner content

Using JavaScript, I'm looking to pinpoint text that's inside two other strings WITHOUT including those strings. For example:
input: ONE example TWO
regular expression: (?=ONE).+(?=TWO)
matches: ONE example
I want: example
I'm really surprised that the question-mark construct (which is supposed to just include that string in the query but not in the result) works at the end of the string, but not at the start.
Ah-ha! I figured it out.
For example, here's how to get the text inside parentheses without the parentheses themselves:
(?<=\().+(?=\))
Here's a nice reference: http://www.regular-expressions.info/lookaround.html
Part of my confusion was JavaScript's fault. At the time of writing it didn't support "lookbehinds" natively (they were added to the language in ES2018). I found this workaround though:
http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript
(I use Python's re module to show the examples -- exactly how to do this depends on your regexp implementation [some don't have groups, for example -- or backreferences])
Use a backwards assertion, not a forward assertion, for the first assertion.
>>> re.search(r"(?<=ONE).+(?=TWO)", "ONE x a b TWO").group()
' x a b '
The problem is that the zero width assertion (?=ONE) matches the text "ONE", but doesn't "consume" it -- i.e. it just checks that it's there, but leaves the string as-is. Then the .+ starts reading text, and does consume it.
Backwards assertions don't look ahead, they look behind, so .+ doesn't get run until whatever is behind it is "ONE".
It is probably better not to bother with these at all, but use groups. Consider:
>>> re.search(r"ONE(.+)TWO", "ONE x a b TWO").group(1)
' x a b '
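For comparison, here is a JavaScript sketch of the same three patterns applied to the input from the question. The variable names are illustrative, and the lookbehind line assumes an engine that implements ES2018 lookbehind assertions (current Node.js and evergreen browsers do):

```javascript
var input = "ONE example TWO";

// Lookahead at the start only checks that "ONE" is there; .+ then consumes it.
var viaLookahead = input.match(/(?=ONE).+(?=TWO)/)[0];

// A capture group works in every JavaScript engine.
var viaGroup = input.match(/ONE(.+)TWO/)[1];

// Lookbehind; assumes ES2018 support in the engine.
var viaLookbehind = input.match(/(?<=ONE).+(?=TWO)/)[0];

console.log(JSON.stringify(viaLookahead));
console.log(JSON.stringify(viaGroup));
console.log(JSON.stringify(viaLookbehind));
```

Both the group and the lookbehind versions keep the surrounding spaces, so a trim() call may be wanted afterwards.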
