Replacing dots with _ using JavaScript globally?

I'm trying to replace all the dots and spaces in
var name = "C. S. Lewis"
with _
and convert it into "C_S_LEWIS".
This is what I tried, but it converts the whole thing into underscores (_):
var mystring = "C. S. Lewis";
var find = ".";
var regex = new RegExp(find, "g");
alert(mystring.replace(regex, "_"));

That's because dots need to be escaped in regular expressions (unless they are part of a character class). This expression should work:
var regex = /[.\s]+/g;
alert(mystring.replace(regex, '_'))
It matches a sequence of at least one period or space, which is then replaced by a single underscore in the subsequent .replace() call.
Btw, this won't save the new string back into mystring. For that you need to assign the results of the replacement operation back into the same variable:
mystring = mystring.replace(regex, '_')
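To compare the patterns discussed here side by side, here is a minimal sketch. The variable names are illustrative, and the final toUpperCase() call is an assumption based on the "C_S_LEWIS" target in the question; the regex itself does not change case.

```javascript
var mystring = "C. S. Lewis";

// An unescaped "." matches any character, so every character is replaced.
var everything = mystring.replace(new RegExp(".", "g"), "_");

// "\\." in a string literal becomes \. in the regex: a literal dot only.
var dotsOnly = mystring.replace(new RegExp("\\.", "g"), "_");

// A run of dots and/or whitespace collapses into a single underscore.
var collapsed = mystring.replace(/[.\s]+/g, "_");

// Assumption: the asker also wants upper case, as in "C_S_LEWIS".
var upper = collapsed.toUpperCase();

console.log(everything); // ___________
console.log(dotsOnly);   // C_ S_ Lewis
console.log(collapsed);  // C_S_Lewis
console.log(upper);      // C_S_LEWIS
```

Note that mystring itself is unchanged throughout, because replace() returns a new string.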

The . means 'any character'. Use \. to get a literal dot, which means that in your regex string you'd have to write \\. so that the regex receives a literal backslash followed by a dot. But I'm not sure why you're building a string first; you could just do this:
var find = /\./g;
Of course, that's not what you're looking for - you want not just any dot, but just the dots followed by spaces. That's different:
var find = /\.\s+/g;

Related

Understanding some JavaScript with a RegExp

I have the following js code
var regex = new RegExp('([\'"]?)((?:\\\\\\1|.)+?)\\1(,|$)', 'g'),
key = regex.exec( m ),
val = regex.exec( m );
I would like to understand it.
In particular:
Why are there all those backslashes in the definition of the RegExp? I can clearly see that \\1 is a reference to the first saved element. Why, in a new RegExp using ' rather than ", do we need to write \\1 and not simply \1?
Why is there a comma between the two definitions of key and val? I would guess that it depends on the "instances" found using "g", but it is not very clear to me anyway.
I tried to execute the code with
m = 'batman, robin'
and the result is pretty much a mess, and I cannot really explain it very well.
The code is taken from JQuery Cookbook, 2.12
Why are there all those backslashes in the definition of the RegExp?
"\\" is a string whose value is \. One backslash is used as an escape, the second for the value. Then, within the regex you also need to escape the backslash character again because backslash characters are used to mean special things within regex.
For example
"\\1"
is a string whose value is \1, which, in a regular expression, matches the first captured group.
"\\\\"
is a string whose value is \\, which, in a regular expression, matches a single \ character.
"\\\\\\1"
is a string whose value is \\\1, which, in a regular expression, matches a single \ followed by the first captured group.
This need to escape backslashes and then escape them again is called "double escaping". You need to double escape so that the regular expression ends up with the correct value: the first escape makes sure that the string has the correct value, and the second makes sure that the regular expression matches the correct pattern.
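The three strings above can be checked directly. This is a small sketch with illustrative variable names; it only inspects string lengths and runs a few test() calls.

```javascript
// What each string literal actually contains after JavaScript unescapes it.
var backref = "\\1";           // 2 characters: \ and 1
var backslash = "\\\\";        // 2 characters: \ and \
var combined = "\\\\\\1";      // 4 characters: \ \ \ 1

// As a regex, \\ matches one literal backslash.
// "a\\b" is the 3-character string a, \, b.
var matchesBackslash = new RegExp(backslash).test("a\\b");

// As a regex, (a)\1 matches "a" followed by the same text again.
var matchesRepeat = new RegExp("(a)" + backref).test("aa");

// As a regex, (a)\\\1 matches "a", a literal backslash, then "a" again.
var matchesCombined = new RegExp("(a)" + combined).test("a\\a");

console.log(backref.length, backslash.length, combined.length); // 2 2 4
console.log(matchesBackslash, matchesRepeat, matchesCombined);  // true true true
```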
Why is there a comma between the two definitions of key and val?
The code you posted is a variable declaration. It's easier to see when formatted:
var regex = ...,
key = ...,
val = ...;
Each of the variable names in the list is declared via the var keyword. It is the same as declaring the variables first and assigning them separately:
var regex,
key,
val;
regex = ...
key = ...
val = ...
Which is the same as declaring each var with a different var keyword:
var regex = ...
var key = ...
var val = ...
There's a difference between writing dynamic regex objects and static regex objects. When you initialize a regex object with a string, the string has to be converted into a regex object. However, '\' has a special meaning not only within regex objects but also within JavaScript strings, hence the double escape.
Edit: Regarding your second question. You can do multiple declarations with comma, like so:
var one = 'one',
two = 'two',
three = 'three';
2nd Edit: Here's what happens with your string once it compiles into a RegEx object.
/(['"]?)((?:\\\1|.)+?)\1(,|$)/g
The regex is better represented as a regex literal:
var regex = /(['"]?)((?:\\\1|.)+?)\1(,|$)/g;
Backslashes are used to escape special characters. For example, if your regex needs to match a literal period, writing . will not work, since . matches any character: you need to "escape" the period with a backslash: \..
Backslashes that are not themselves part of an escape sequence must be escaped, so if you want to match just a backslash in the text, you must escape it with a backslash: \\.
The reason your regular expression is so complicated when passed into the RegExp constructor is because you are representing the above regular expression as a string, which adds another "layer" of escaping. Thus, every single backslash must be escaped by yet another backslash and because the string is enclosed in single quotes, your single quote must be escaped with yet another backslash:
var regex = new RegExp('([\'"]?)((?:\\\\\\1|.)+?)\\1(,|$)', 'g'),
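As a quick check that the two forms really describe the same pattern, the sketch below compares their source and then repeats the question's two exec() calls on 'batman, robin'. The variable names fromString and fromLiteral are illustrative.

```javascript
var fromString = new RegExp('([\'"]?)((?:\\\\\\1|.)+?)\\1(,|$)', 'g');
var fromLiteral = /(['"]?)((?:\\\1|.)+?)\1(,|$)/g;

// Both objects hold the same pattern text.
var sameSource = fromString.source === fromLiteral.source;

// With the g flag, each exec() call resumes where the previous match ended,
// so the first call returns the first field and the second call the next one.
var m = 'batman, robin';
var key = fromLiteral.exec(m);
var val = fromLiteral.exec(m);

console.log(sameSource); // true
console.log(key[2]);     // batman
console.log(val[2]);     // " robin" (the leading space is part of the field)
```

The leading space in the second field is probably what made the result look messy: the pattern splits on commas only and does not trim whitespace.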

what does `\\s` mean in regular expression

I have a problem with regular expression:
var regex = new RegExp('(^|\\s)' + clsName + '(\\s|$)');
What does (^|\\s) mean? Isn't it equal to (^|\s), what does (^|) mean?
Am I right, it means that the string should start with any letter or white space? I tried to test with browser and console.log but still can't get any solution.
In all tutorials \s is used to be a space pattern not \\s.
Ok i got it, the problem was:
When using the RegExp constructor: for each backslash in your regular expression, you have to type \\ in the RegExp constructor. (In JavaScript strings, \\ represents a single backslash!) For example, the following regular expressions match all leading and trailing whitespaces (\s); note that \\s is passed as part of the first argument of the RegExp constructor:
re = /^\s+|\s+$/g
re = new RegExp('^\\s+|\\s+$','g')
(^|\\s) means: start of the string (^) OR (|) a whitespace character (\s, written as \\s inside the string).
If clsName is "abc", for example, it builds the pattern (^|\s)abc(\s|$). That searches for "abc" at the start, middle, or end of the string, delimited by whitespace or by the string boundaries, so these are valid:
"abc"
"abc x"
"x abc"
"x abc y"
Note that here you are using a string to build a RegExp. JavaScript drops the backslash from escape sequences it doesn't recognize, so '\s' would be the same as 's', which isn't right.
Another option is to use word boundaries, but that might fail in some cases (for example, searching for btn would also match btn-primary):
var regex = new RegExp('\\b' + clsName + '\\b');
I'd also warn that clsName might contain regex meta-characters, so you may want to escape it.
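A minimal sketch of that escaping step follows. The helpers escapeRegExp and hasClass are hypothetical names introduced here for illustration; they are not built-in functions.

```javascript
// Hypothetical helper: put a backslash in front of every regex meta-character.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: true if classAttr contains clsName as a whole,
// whitespace-delimited entry.
function hasClass(classAttr, clsName) {
  var regex = new RegExp('(^|\\s)' + escapeRegExp(clsName) + '(\\s|$)');
  return regex.test(classAttr);
}

console.log(hasClass('x abc y', 'abc'));   // true
console.log(hasClass('btn-primary', 'btn')); // false
console.log(hasClass('axb', 'a.b'));       // false, the dot is literal

// For comparison, the word-boundary version matches inside btn-primary.
console.log(new RegExp('\\bbtn\\b').test('btn-primary')); // true
```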
Why not just split the string on " "?
var string = 'abc defh ij klm';
var elements = string.split(' ');
var clsName = 'abc';
elements.filter(function (el) {
    return el === clsName;
});
No need for a RegEx like the one you posted.

Javascript Regex with variable and $1

I have read How do you pass a variable to a Regular Expression javascript
I'm looking to create a regular expression to get and replace a value with a variable..
section = 'abc';
reg = new RegExp('\[' + section + '\]\[\d+\]','g');
num = duplicate.replace(reg,"$1++");
where $1 = \d+ +1
and... without increment... it doesn't work...
it returns something like:
[abc]$1
Any idea?
Your regex is on the right track, however to perform any kind of operation you must use a replacement callback:
section = "abc";
reg = new RegExp("(\\["+section+"\\]\\[)(\\d+)(\\])","g");
num = duplicate.replace(reg,function(_,before,number,after) {
    return before + (parseInt(number,10)+1) + after;
});
I think you need to read up more on Regular Expressions. Your current regular expression comes out to:
/[abc][d+]/g
Which will match an "a", "b", or "c", followed by a "d" or "+", like ad or c+ or bd, or even the a+ inside zebra++.
A great resource to get started is: http://www.regular-expressions.info/javascript.html
I see at least two problems.
The \ character has a special meaning in JavaScript strings. It is used to escape special characters in the string. For example: \n is a new line, and \r is a carriage return. You can also escape quotes and apostrophes to include them in your string: "This isn't a normally \"quoted\" string... It has actual \" characters inside the string as well as delimiting it."
The second problem is that, in order to use a backreference ($1, $2, etc.) you must provide a capturing group in your pattern (the regex needs to know what to backreference). Try changing your pattern to:
'\\[' + section + '\\]\\[(\\d+)\\]'
Note the double-backslashes. This escapes the backslash character itself, allowing it to be a literal \ in a string. Also note the use of ( and ) (the capturing group). This tells the regex what to capture for $1.
After the regex is instantiated, with section === 'abc':
new RegExp('\\[' + section + '\\]\\[(\\d+)\\]', 'g');
Your pattern is now:
/\[abc\]\[(\d+)\]/g
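And your .replace will substitute each match with the captured digits followed by a literal ++ (for example, [abc][5] becomes 5++). A replacement string cannot do arithmetic, so actually incrementing the number needs a callback.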
Demo: http://jsfiddle.net/U46yx/
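To contrast the two kinds of replacement, here is a small sketch. The function name incrementIndex and the sample value of duplicate are assumptions made for illustration, and section is assumed to contain no regex meta-characters.

```javascript
// Hypothetical helper: add 1 to the number in every [section][N] occurrence.
function incrementIndex(text, section) {
  var reg = new RegExp('(\\[' + section + '\\]\\[)(\\d+)(\\])', 'g');
  return text.replace(reg, function (match, before, number, after) {
    return before + (parseInt(number, 10) + 1) + after;
  });
}

// Hypothetical sample input.
var duplicate = 'field[abc][7] other[abc][41]';

// Callback replacement: the number is parsed and incremented.
var incremented = incrementIndex(duplicate, 'abc');

// String replacement: $1 is substituted, "++" is just literal text.
var literal = duplicate.replace(/\[abc\]\[(\d+)\]/g, '$1++');

console.log(incremented); // field[abc][8] other[abc][42]
console.log(literal);     // field7++ other41++
```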

Javascript reg exp not right

Here is a string str = '.js("aaa").js("bbb").js("ccc")', I want to write a regular expression to return an Array like this:
[aaa, bbb, ccc];
My regular expression is:
var jsReg = /.js\(['"](.*)['"]\)/g;
var jsAssets = [];
var js;
while ((js = jsReg.exec(find)) !== null) {
    jsAssets.push(js[1]);
}
But the jsAssets result is
[""aaa").js("bbb").js("ccc""]
What's wrong with this regular expression?
Use the lazy version of .*, written .*?:
/\.js\(['"](.*?)['"]\)/g
The ? after .* makes the quantifier lazy, so it matches the fewest characters possible up to the next quote.
It would also be better to escape the first dot, as done here.
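The difference between the greedy and lazy quantifier can be seen by running both against the question's string. This is a sketch; collect is a hypothetical helper that gathers capture group 1 from every match.

```javascript
// Hypothetical helper: run a global regex and gather capture group 1.
function collect(regex, text) {
  var out = [];
  var m;
  while ((m = regex.exec(text)) !== null) {
    out.push(m[1]);
  }
  return out;
}

var str = '.js("aaa").js("bbb").js("ccc")';

// Greedy: .* runs to the last quote, so there is a single long match.
var greedy = collect(/\.js\(['"](.*)['"]\)/g, str);

// Lazy: .*? stops at the first quote that is followed by ")".
var lazy = collect(/\.js\(['"](.*?)['"]\)/g, str);

console.log(greedy.length); // 1
console.log(lazy);          // [ 'aaa', 'bbb', 'ccc' ]
```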
If you want to allow escaped quotes, use something like this:
/\.js\(['"]((?:\\['"]|[^"])+)['"]\)/g
I believe it can be done in one-liner with replace and match method calls:
var str = '.js("aaa").js("bbb").js("ccc")';
str.replace(/[^(]*\("([^"]*)"\)[^(]*/g, '$1,').match(/[^,]+/g);
//=> ["aaa", "bbb", "ccc"]
The problem is that you are using .*. That will match any character. You'll have to be a bit more specific with what you are trying to capture.
If it will only ever be word characters you could use \w which matches any word character. This includes [a-zA-Z0-9_]: uppercase, lowercase, numbers and an underscore.
So your regex would look something like this :
var jsReg = /js\(['"](\w*)['"]\)/g;
In
/.js\(['"](.*)['"]\)/g
the (.*) is greedy: it matches as much as possible, so group 1 captures
aaa").js("bbb").js("ccc
given your example input.
Try
/\.js\(('(?:[^\\']|\\.)*'|"(?:[^\\"]|\\.)*")\)/
To break this down,
\. matches a literal dot
\.js\( matches the literal string ".js("
( starts to capture the string.
[^\\']|\\. matches a character other than a quote or backslash, or an escaped non-line-terminator character.
(?:[^\\']|\\.)* matches the body of a string
'(?:[^\\']|\\.)*' matches a single quoted string
(...|...) captures a single quoted or double quoted string
)\) closes the capturing group and matches a literal close parenthesis
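The second thing to check is your loop.
Calling exec repeatedly only advances through the string because of the g flag, which makes each call resume from the regex's lastIndex.
If you drop the g modifier, as in the pattern above, call exec once instead of looping, because without g it returns the first match every time and the while loop would never end.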
Try this one - http://jsfiddle.net/UDYAq/
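var str = '.js("aaa").js("bbb").js("ccc")';
var regex = /\.js\("(.*?)"\)/g;
// With the g flag, match() returns the full matches, not the captured groups.
var result = str.match(regex) || [];
for (var i = 0; i < result.length; i++) {
    // Pull the quoted text out of each full match.
    result[i] = result[i].match(/"(.*?)"/)[1];
}
console.log(result);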
To be sure that matched characters are surrounded by the same quotes:
/\.js\((['"])(.*?)\1\)/g
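What the backreference buys you shows up only when the quotes are mismatched. The sketch below uses a made-up input with one badly quoted call; collectGroup is a hypothetical helper. Note that with the extra group for the quote, the wanted text moves to group 2.

```javascript
// Hypothetical helper: run a global regex and gather one capture group.
function collectGroup(regex, text, groupIndex) {
  var out = [];
  var m;
  while ((m = regex.exec(text)) !== null) {
    out.push(m[groupIndex]);
  }
  return out;
}

// Hypothetical input: the first call opens with " but closes with '.
var input = ".js(\"aaa').js('bbb')";

// Any quote may close any quote, so the malformed call is accepted.
var loose = collectGroup(/\.js\(['"](.*?)['"]\)/g, input, 1);

// \1 requires the closing quote to equal the opening one.
var strict = collectGroup(/\.js\((['"])(.*?)\1\)/g, input, 2);

console.log(loose);  // [ 'aaa', 'bbb' ]
console.log(strict); // [ 'bbb' ]
```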

JS - Finding a match in array keys using regexp

There is an array
var words =new Array(
'apple',
'apa',
'found',
'stackoverflow',
'will'
);
and a variable
var search = 'papa.com';
Now I want to set an expression like this
var Flag=false;
var regexp;
for(var i in words)
{
    regexp = new RegExp('(^(.*\.))?' + words[i] + '\.([a-z]{2,3})(\.(\w+))?','i');
    if (regexp.test(search)) {Flag=true;}
}
alert (Flag);
The loop is supposed to go through the words array one entry at a time, build the regular expression for each, and test the search variable against it. If there are one or more matches, Flag should come out true.
But it doesn't work.
You need to escape your escape sequences when you build a regex from a string. This is because a \ in string literal notation also begins an escape sequence, and so the \ is removed.
To include a literal \ character in a string built from literal syntax, you need \\.
regexp = new RegExp('(^(.*\\.))?' + words[i] + '\\.([a-z]{2,3})(\\.(\\w+))?','i');
Your regex was ending up with . instead of \., which would of course have very different meaning.
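The effect is easiest to see on an input that has no dot at all. The sketch below wraps the corrected pattern in two hypothetical helpers (buildPattern, matchesAnyWord) and compares it with the single-backslash version; 'papaxcom' is a made-up test input.

```javascript
var words = ['apple', 'apa', 'found', 'stackoverflow', 'will'];

// Hypothetical helper: the corrected pattern, with doubled backslashes.
function buildPattern(word) {
  return new RegExp('(^(.*\\.))?' + word + '\\.([a-z]{2,3})(\\.(\\w+))?', 'i');
}

// Hypothetical helper: true if any word matches the search string.
function matchesAnyWord(search) {
  return words.some(function (word) {
    return buildPattern(word).test(search);
  });
}

// The single-backslash version: "\." in a string is just ".", and "\w" is "w".
var broken = new RegExp('(^(.*\.))?' + 'apa' + '\.([a-z]{2,3})(\.(\w+))?', 'i');

console.log(matchesAnyWord('papa.com')); // true
console.log(matchesAnyWord('papaxcom')); // false, there is no literal dot
console.log(broken.test('papaxcom'));    // true, "." matched the "x"
```

Using some() instead of for...in is a design choice of this sketch: it stops at the first match and avoids iterating array keys as strings.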
