I am trying to write a regular expression for the following pattern in JavaScript, but it is not working. Can someone please advise?
Pattern: ",anyONELetter,"
Sample string
James Yashiro,a,b,c Dean J Loza,a,b
Required string:
James Yashiro, Dean J Loza
My javascript:
var string = string.replace(/[,^a-z{1},]/gi, '');
Everything inside a pair of unescaped [...] is a character class that matches a single character. For example, [\s*] matches one whitespace character or a literal *. Your [,^a-z{1},] regex matches one character that is a comma, a ^, an ASCII letter, a {, a 1, or a }.
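To see what that class actually matches, here is a small sketch. The input 'a,1{^}B Z' is made up for illustration; the second call uses the sample string from the question.

```javascript
// Inside [...], the characters { 1 } and a non-leading ^ are plain literals.
const charClass = /[,^a-z{1},]/gi;

// Every match is exactly one character long.
const classMatches = 'a,1{^}B Z'.match(charClass);
console.log(classMatches);

// On the question's input the class removes every letter and every comma,
// so only the spaces are left.
const stripped = 'James Yashiro,a,b,c Dean J Loza,a,b'.replace(charClass, '');
console.log(JSON.stringify(stripped));
```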
You need the following regex to replace every run of "comma + standalone letter" with a single comma:
/(?:,[a-z]\b)+,*/gi
Replace each match with a single comma. See the regex demo
The (?:,[a-z]\b)+,* pattern matches:
(?:,[a-z]\b)+ - one or more sequences of a comma followed by a single ASCII letter (a whole word because of the \b after it)
,* - followed by zero or more commas.
Then, you will need to trim trailing , symbols from the string with another regex:
.replace(/,+$/, '')
The ,+$ matches one or more commas at the end of the string ($ asserts the position at the end of the string, so the commas inside the string will be kept intact).
JS demo:
var s = 'James Yashiro,a,b,c, Dean J Loza,a,b';
var res = s.replace(/(?:,[a-z]\b)+,*/gi, ',').replace(/,+$/, '');
document.write(res);
I think you should try a regex like this one:
/(,[a-z],*?)/gi
and replace with an empty string.
Example:
var test = 'James Yashiro,a,b,c, Dean J Loza,a,b';
var result = test.replace(/(,[a-z],*?)/gi, '');
document.write(result);
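One caveat with a pattern that has no word boundary: it also eats the first letter of any word that directly follows a comma. The sketch below uses a made-up input ('Yashiro,a,Dean,b', with no space after the commas) to compare the two behaviours.

```javascript
// Hypothetical input where a real word follows a comma without a space.
const input = 'Yashiro,a,Dean,b';

// Without \b, ",D" is treated like a standalone letter and removed.
const withoutBoundary = input.replace(/,[a-z]/gi, '');

// With \b, only letters that form a whole word are removed.
const withBoundary = input.replace(/,[a-z]\b/gi, '');

console.log(withoutBoundary); // "Yashiroean"
console.log(withBoundary);    // "Yashiro,Dean"
```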
I have the following regex in JavaScript for a split operation, since I can't use a negative lookbehind to find any , delimiters in a string that are not preceded by one or more \ escape characters.
[^\\],
The regex works fine for finding the commas that are not preceded by \, but it also includes the character that precedes the comma in the match and thus splits the string incorrectly.
For example if I had the string
hello\,there,are
The result is that e, matches my regex rather than just the comma, so the split array reads
[hello\,ther] [are]
Why does the regex I am using keep matching the comma and the preceding character instead of only the comma?
You cannot use split here because you would need a lookbehind, which JS regex did not support before ES2018. Instead, use match with an appropriate regex, like the one below:
/(?:[^\\,]|\\.)+/g
See the regex demo.
The pattern matches one or more (+) repetitions of either any char other than , and \ ([^\\,]) or (|) an escaped character, that is, a backslash followed by any char except a line break (\\.).
JS demo:
var regex = /(?:[^\\,]|\\.)+/g;
var str = "hello\\,there,are";
var res = str.match(regex);
console.log(res);
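For comparison, here is a sketch of why [^\\], loses a character, next to a lookbehind-based split. The lookbehind form assumes an engine with ES2018 lookbehind support, and it does not handle an escaped backslash directly before a comma.

```javascript
// The JS literal below represents the text: hello\,there,are
const escaped = 'hello\\,there,are';

// [^\\] consumes the character before the comma, so split drops it.
const consuming = escaped.split(/[^\\],/);

// (?<!\\) only asserts that no backslash precedes the comma; nothing extra is consumed.
const viaLookbehind = escaped.split(/(?<!\\),/);

console.log(consuming);     // [ 'hello\\,ther', 'are' ]
console.log(viaLookbehind); // [ 'hello\\,there', 'are' ]
```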
I have a basic string and would like to get only specific characters between the brackets
Base string: This is a test string [more or less]
regex: to capture all r's and e's works just fine.
(r|e)
=> This is a test string [more or less]
Now I want to combine the following regex with mine to get only the r's and e's between the brackets, but unfortunately this doesn't work:
\[(r|e)\]
Expected result should be : more or less
Can someone explain?
Edit: the problem is very similar to this one: Regular Expression to find a string included between two characters while EXCLUDING the delimiters
but with the difference that I don't want to get the whole string between the brackets.
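One way to restrict the match to the bracketed part is a lookahead that requires a closing ] before any other bracket. The sketch below adapts the lookahead pattern quoted later in this thread, with [re] in place of the character list, and assumes the brackets are balanced and not nested.

```javascript
const bracketText = 'This is a test string [more or less]';

// Match r or e only if a ] follows before any other [ or ].
const insideBrackets = /[re](?=[^\]\[]*\])/g;

const found = bracketText.match(insideBrackets);
console.log(found); // [ 'r', 'e', 'r', 'e' ]

// Upper-casing the matches makes it easy to see which characters were hit.
const marked = bracketText.replace(insideBrackets, (c) => c.toUpperCase());
console.log(marked); // "This is a test string [moRE oR lEss]"
```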
Follow up problem
base string = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed'
I need a regex for finding the non-ascii characters äöü in order to replace them but only in the link:...] substring which starts with the word link: and ends with a ] char.
The result string will look like this:
result string = 'this is a link:/en/test/apfel/ohr[MyLink_with_aou] BREAK äöü is now allowed again'
The regex /[äöü]+(?=[^\]\[]*])/g from the solution in the comments only delivers the äöü chars between the two brackets.
I know that there is a lookahead with a char list in the regex, but I wonder why this one does not work:
/link:([äöü]+(?=[^\]\[]*])/
thanks
You can use the following solution: match all between link: and ], and replace your characters only inside the matched substrings inside a replace callback method:
var hashmap = {"ä":"a", "ö":"o", "ü":"u"};
var s = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';
var res = s.replace(/\blink:[^\]]*/g, function(m) { // m = link:/en/test/äpfel/öhr[MyLink_with_äöü
  return m.replace(/[äöü]/g, function(n) { // n = ä, then ö, then ü
    return hashmap[n]; // each time replaced with the hashmap value
  });
});
console.log(res);
Pattern details:
\b - a leading word boundary
link: - whole word link with a : after it
[^\]]* - zero or more chars other than ] (a [^...] is a negated character class that matches any char/char range(s) but the ones defined inside it).
Also, see Efficiently replace all accented characters in a string?
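As an alternative to a hand-written map, the inner replacement can be sketched with Unicode normalization. This swaps in a different technique from the answer above: it only works for characters that decompose into a base letter plus a combining mark (ä, ö, ü do; ß, for example, does not).

```javascript
const linkText = 'this is a link:/en/test/äpfel/öhr[MyLink_with_äöü] BREAK äöü is now allowed';

// Same outer match as above: everything from "link:" up to the next ].
const asciiLinks = linkText.replace(/\blink:[^\]]*/g, (m) =>
  // NFD splits "ä" into "a" + U+0308; the class then drops the combining marks.
  m.normalize('NFD').replace(/[\u0300-\u036f]/g, '')
);

console.log(asciiLinks);
```

Text outside the link:...] span is not touched, because the normalization runs only inside the callback.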
I need a regular expression or any other method to add whitespace between numbers and letters in a string.
Example:
"E2356" => "E 2356"
"E123-F456" => "E 123-F 456"
I already found a regular expression capable of it, but it does not work in JavaScript:
(?<=[^0-9])(?=[0-9])
Thanks!
Instead of a look-behind, just match the non-digit:
[^0-9](?=[0-9])
And replace with "$& ".
The [^0-9] subpattern will match 1 character that is not a digit that can be referenced with $& (the whole matched text) in the replacement pattern. (?=[0-9]) lookahead will make sure there is a digit right after it.
See demo
var re = /[^0-9](?=[0-9])/g;
var str = 'E2356<br/>E123-F456';
var result = str.replace(re, '$& ');
document.write(result);
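For engines that support ES2018 lookbehind, the pattern from the question can also be used directly. This is a sketch under that assumption; the match is zero-width, so the replacement is just the space itself.

```javascript
// Zero-width position: a non-digit behind, a digit ahead.
const boundary = /(?<=[^0-9])(?=[0-9])/g;

const spacedSimple = 'E2356'.replace(boundary, ' ');
const spacedMixed = 'E123-F456'.replace(boundary, ' ');

console.log(spacedSimple); // "E 2356"
console.log(spacedMixed);  // "E 123-F 456"
```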
Match the two-character sequence of letter followed by number, with capture groups for both the letter and number, then use String#replace with the $1 and $2 placeholders to refer to the content of the capture groups, with a space in between.
str.replace(/([^0-9])([0-9])/g, '$1 $2')
             ^^^$1^^^^^$2^^^
The g flag ensures all occurrences are replaced, of course.
Use String#replace:
'E123-F456'.replace(/([A-Z])(\d)/g, '$1 $2')
// >>> "E 123-F 456"
$1 and $2 are the captured groups from the regex and are separated by a space. The expression assumes you only have uppercase characters. Remember to add the g flag to your expression to replace every occurrence.
A regex by itself does not format a string.
A regex helps you validate whether a certain string follows the language described by the expression.
A regex also helps you capture certain parts of the string, which you can then rearrange to get the desired output.
So I would suggest you do something like this:
var data = "E2304";
var regex = /([^0-9])([0-9]+)/g; // group 1: a non-digit, group 2: the digits after it
data = data.replace(regex, '$1 $2'); // "E 2304"
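Try the code below. Note that it keeps only the runs of letters and digits, so separators such as - are dropped from the result.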
var test = "E123-F456".match(/[a-zA-Z]+|[0-9]+/g);
console.log(test.join(' '));
fiddle http://jsfiddle.net/anandgh/h2g8cnha/
Here is a string str = '.js("aaa").js("bbb").js("ccc")', I want to write a regular expression to return an Array like this:
[aaa, bbb, ccc];
My regular expression is:
var jsReg = /.js\(['"](.*)['"]\)/g;
var jsAssets = [];
var js;
while ((js = jsReg.exec(find)) !== null) {
jsAssets.push(js[1]);
}
But the jsAssets result is
[""aaa").js("bbb").js("ccc""]
What's wrong with this regular expression?
Use the lazy version of .*:
/\.js\(['"](.*?)['"]\)/g
              ^
This matches as few characters as possible up to the next quote.
It is also better to escape the first dot, so that it matches a literal . rather than any character.
jsfiddle demo
If you want to allow escaped quotes, use something like this:
/\.js\(['"]((?:\\['"]|[^"])+)['"]\)/g
regex101 demo
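The difference between the greedy and the lazy quantifier can be checked with the loop from the question. In this sketch, collect is a hypothetical helper name introduced only for the comparison.

```javascript
const source = '.js("aaa").js("bbb").js("ccc")';

// Hypothetical helper: run a global regex with exec and gather group 1.
function collect(re, text) {
  const out = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    out.push(m[1]);
  }
  return out;
}

// Greedy .* runs to the last quote in the string: one long capture.
const greedy = collect(/\.js\(['"](.*)['"]\)/g, source);

// Lazy .*? stops at the first closing quote: one capture per call.
const lazy = collect(/\.js\(['"](.*?)['"]\)/g, source);

console.log(greedy); // [ 'aaa").js("bbb").js("ccc' ]
console.log(lazy);   // [ 'aaa', 'bbb', 'ccc' ]
```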
I believe it can be done in one-liner with replace and match method calls:
var str = '.js("aaa").js("bbb").js("ccc")';
str.replace(/[^(]*\("([^"]*)"\)[^(]*/g, '$1,').match(/[^,]+/g);
//=> ["aaa", "bbb", "ccc"]
The problem is that you are using .*, which is greedy: it matches any character as many times as possible. You need to be more specific about what you are trying to capture.
If it will only ever be word characters, you could use \w, which is equivalent to [a-zA-Z0-9_]: uppercase and lowercase letters, digits, and the underscore.
So your regex would look something like this :
var jsReg = /js\(['"](\w*)['"]\)/g;
In
/.js\(['"](.*)['"]\)/g
the .* matches as much as possible, so group 1 does not stop at the first closing quote and captures
aaa").js("bbb").js("ccc
given your example input.
Try
/\.js\(('(?:[^\\']|\\.)*'|"(?:[^\\"]|\\.)*")\)/
To break this down,
\. matches a literal dot
\.js\( matches the literal string ".js("
( starts to capture the string.
[^\\']|\\. matches a character other than a quote or backslash, or an escaped character that is not a line terminator.
(?:[^\\']|\\.)* matches the body of a string
'(?:[^\\']|\\.)*' matches a single quoted string
(...|...) captures a single quoted or double quoted string, including its quotes
)\) closes the capturing group and matches a literal close parenthesis
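A note on your loop:
Calling exec repeatedly only works because of the g modifier, which makes each call resume from the regex's lastIndex.
So keep (or add) the g modifier on the pattern used in the loop; without it exec returns the first match every time and the loop never ends.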
Try this one - http://jsfiddle.net/UDYAq/
var str = '.js("aaa").js("bbb").js("ccc")';
var regex = /\.js\("(.*?)"\)/g;
// match returns null when nothing is found, so fall back to an empty array
var result = str.match(regex) || [];
for (var i = 0; i < result.length; i++) {
  // pull the quoted text out of each full match
  result[i] = result[i].match(/"(.*?)"/)[1];
}
console.log(result);
To be sure that matched characters are surrounded by the same quotes:
/\.js\((['"])(.*?)\1\)/g
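A short sketch of the backreference in action, using a made-up input that mixes quote styles. It relies on String.prototype.matchAll (ES2020); the captured text is in group 2 because group 1 holds the opening quote.

```javascript
// Hypothetical input: double quotes, single quotes, and a quote inside a string.
const mixed = '.js("aaa").js(\'bbb\').js("c\'c")';

// \1 must be the same quote character that group 1 matched.
const sameQuote = /\.js\((['"])(.*?)\1\)/g;

const assets = Array.from(mixed.matchAll(sameQuote), (m) => m[2]);
console.log(assets); // [ 'aaa', 'bbb', "c'c" ]
```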
I am trying to do a basic string replace using a regular expression, but the answers I have found do not seem to help: they directly answer each person's unique requirement with little or no explanation.
I am using str = str.replace(/[^a-z0-9+]/g, ''); at the moment. But what I would like to do is allow all alphanumeric characters (a-z and 0-9) and also the '-' character.
Could you please answer this and explain how you concatenate expressions?
This should work :
str = str.replace(/[^a-z0-9-]/g, '');
Here is what each part of the expression means:
/ delimits your pattern, so there is one at the start and one at the end
[] defines a character class: a set of characters, one of which is matched at a single position
^ at the start of the class negates it, so it matches every character NOT listed after it
a-z matches any character between 'a' and 'z' inclusive
0-9 matches any digit between '0' and '9' inclusive (meaning any digit)
- is the literal '-' character (literal because it comes last in the class)
g at the end is a flag saying that you do not want your regex to stop at the first character matching your pattern but to continue through the whole string
So here you say "every character not being a letter, a digit or a '-' will be removed from the string".
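Where the hyphen sits inside the class matters, because between two characters it denotes a range. A minimal sketch with a made-up input, showing three equivalent ways to write a literal hyphen:

```javascript
const raw = 'Hello-World_42+x';

// A hyphen is literal when it is last, first (after ^), or escaped.
const hyphenLast = raw.replace(/[^a-z0-9-]/g, '');
const hyphenFirst = raw.replace(/[^-a-z0-9]/g, '');
const hyphenEscaped = raw.replace(/[^a-z\-0-9]/g, '');

console.log(hyphenLast); // "ello-orld42x" (uppercase letters are removed too)
```

Adding the i flag, or A-Z to the class, would keep the uppercase letters as well.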
Just change + to -:
str = str.replace(/[^a-z0-9-]/g, "");
You can read it as:
[^ ]: match NOT from the set
[^a-z0-9-]: match if not a-z, 0-9 or -
/ /g: do global match
More information:
https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Regular_Expressions
Your character class (the part in the square brackets) is saying that you want to match anything except 0-9 and a-z and +. You aren't explicit about how many a-z or 0-9 you want to match, but I assume the + means you want to replace strings of at least one alphanumeric character. It should read instead:
str = str.replace(/[^-a-z0-9]+/g, "");
Also, if you need to match upper-case letters along with lower case, you should use:
str = str.replace(/[^-a-zA-Z0-9]+/g, "");
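str = str.replace(/\W/g, "");
This is a shorter form, but it is not equivalent: \W means [^A-Za-z0-9_], so it keeps uppercase letters and underscores and removes the - character.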
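Since \W is defined as [^A-Za-z0-9_], it can be compared side by side with the explicit class from the earlier answers. The input below is made up for illustration.

```javascript
const mixedInput = 'Ab-c_d+1';

// \W keeps letters of either case, digits and _, and removes - and +.
const viaNonWord = mixedInput.replace(/\W/g, '');

// The explicit class keeps only lowercase letters, digits and -.
const viaClass = mixedInput.replace(/[^a-z0-9-]/g, '');

console.log(viaNonWord); // "Abc_d1"
console.log(viaClass);   // "b-cd1"
```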
We can use /[a-zA-Z]/g to select the lowercase and uppercase letters in a word or sentence and replace them.
var str = 'MM-DD-yyyy'
var modifiedStr = str.replace(/[a-zA-Z]/g, '_')
console.log(modifiedStr)