Is it possible to write a regex that handles multiple delimiters? For example, I want to split a string which can come in two forms: 1. "string1, string2, string3" or 2. "string1,string2,string3". I've been trying to do this in JavaScript but with no success so far.
Just use a regex split():
var string = "part1,part2, part3, part4, part5",
components = string.split(/,\s*/);
The reason I've used * rather than ? is simply that it allows for no white-space or many white-spaces, whereas ? matches zero or one white-space character (which is exactly what you asked for, but even so).
Incidentally, if there might possibly be white-spaces preceding the comma, then it might be worth amending the split() regex to:
var string = "part1,part2 , part3, part4, part5",
components = string.split(/\s*,\s*/);
console.log(components);
This splits the supplied string on zero or more white-space characters, followed by a comma, followed by zero or more white-space characters. This may, of course, be entirely unnecessary.
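To compare the three patterns that come up in this thread side by side, here is a small sketch. The sample string is made up for illustration and mixes every spacing variant around the commas.

```javascript
// Hypothetical input mixing all spacing variants around the commas.
const sample = "part1,part2 , part3,  part4";

// Comma followed by at most one whitespace character.
const zeroOrOne = sample.split(/,\s?/);
// Comma followed by any amount of whitespace.
const zeroOrMore = sample.split(/,\s*/);
// Any amount of whitespace on either side of the comma.
const bothSides = sample.split(/\s*,\s*/);

console.log(zeroOrOne);  // ["part1", "part2 ", "part3", " part4"]
console.log(zeroOrMore); // ["part1", "part2 ", "part3", "part4"]
console.log(bothSides);  // ["part1", "part2", "part3", "part4"]
```

Only the last pattern leaves every piece free of stray spaces, which is why it may be the safer default when the input format is not fully under your control.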
References:
Regular Expressions.
string.split().
Yes, make the whitespace (\s) optional using ?:
var s = "string1,string2,string3";
s.split(/,\s?/);
In addition to silva's answer:
just in case there may be more than one space (or no space at all) after the comma, use * instead:
var s = "string1, string2, string3";
s.split(/,\s*/);
Related
I have an input string like this:
ABCDEFG[HIJKLMN]OPQRSTUVWXYZ
How can I replace each character in the string between the [] with an X (resulting in the same number of Xs as there were characters)?
For example, with the input above, I would like an output of:
ABCDEFG[XXXXXXX]OPQRSTUVWXYZ
I am using JavaScript's RegEx for this and would prefer if answers could be an implementation that does this using JavaScript's RegEx Replace function.
I am new to RegEx so please explain what you do and (if possible) link articles to where I can get further help.
Use replace() with a function as the second argument: the function receives the match m, and Array(m.length - 1).join("X") generates the X's needed (m includes both brackets, and joining n empty slots yields n - 1 separators):
var str = "ABCDEFG[HIJKLMN]OPQRSTUVWXYZ"
str = str.replace(/\[[A-Z]*\]/g,(m)=>"["+Array(m.length-1).join("X")+"]")
console.log(str);
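The length arithmetic is easy to get wrong, so here is a small sketch of why m.length - 1 is the right argument. The helper name maskBrackets and the second test string are made up for this example.

```javascript
// Array(n).join("X") joins n empty slots with "X", producing n - 1 copies.
const threeXs = Array(4).join("X");
console.log(threeXs); // "XXX"

// Hypothetical helper: the match m includes both brackets, so it is
// (inner length + 2) long, and Array(m.length - 1) yields inner-length X's.
function maskBrackets(input) {
  return input.replace(/\[[A-Z]*\]/g, (m) => "[" + Array(m.length - 1).join("X") + "]");
}

console.log(maskBrackets("ABCDEFG[HIJKLMN]OPQRSTUVWXYZ")); // "ABCDEFG[XXXXXXX]OPQRSTUVWXYZ"
console.log(maskBrackets("A[BC]D[EFGH]"));                 // "A[XX]D[XXXX]"
```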
We could also use .*? instead of [A-Z]* in the regex to match any character (the lazy ? stops the match at the first closing bracket).
There are thousands of resources about regular expressions. For JavaScript specifically, see the Regular Expressions guide on MDN. The best way to learn, in my opinion, is practice; I find regex101 useful.
const str="ABCDEFG[HIJKLMN]OPQRSTUVWXYZ";
const run = str => str.replace(/\[.*]/, m => m.replace(/[^\[\]]/g, "X"));
console.log(run(str));
The first pattern, /\[.*]/, selects the bracketed section (brackets included); the second pattern, /[^\[\]]/g, matches every character in it that is not a bracket and replaces it with "X".
We can observe that every individual letter you wish to match is followed by a series of zero or more non-'[' characters, until a ']' is found. This is quite simple to express in JavaScript-friendly regex:
/[A-Z](?=[^\[]*\])/g
regex101 example
(?= ) is a "positive lookahead assertion": it peeks ahead of the current matching point, without consuming characters, to verify that its contents match. In this case, [^\[]*\] matches exactly what I described above.
Now you can substitute each [A-Z] matched with a single 'X'.
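As a minimal sketch of that substitution, using the input from the question (the variable names are made up for this example):

```javascript
const bracketed = "ABCDEFG[HIJKLMN]OPQRSTUVWXYZ";

// Match an uppercase letter only if a "]" follows before any "[" does.
const insideBrackets = /[A-Z](?=[^\[]*\])/g;

const masked = bracketed.replace(insideBrackets, "X");
console.log(masked); // "ABCDEFG[XXXXXXX]OPQRSTUVWXYZ"
```

Because each match is a single letter, a plain replacement string is enough and no callback is needed.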
You can use the following solution to replace a string between two square brackets:
const rxp = /\[.*?\]/g;
"ABCDEFG[HIJKLMN]OPQRSTUVWXYZ".replace(rxp, (x) => {
// x includes both brackets, so the inner text is x.length - 2 characters long
return "[" + "X".repeat(x.length - 2) + "]";
});
I've got three working regexps:
string.replace(/catalogue/g, "") // remove the word catalogue
string.replace(/[/:]/g, "") // remove the characters / and :
string.replace(/20%/g, "") // remove '20%'
Instead of replacing the string three times, I want to combine my regexp's.
Wanted result = 'removethewordnow';
var string = 'rem:ove20%the/word:catalogue20%now';
My latest try was:
string.replace(/\catalogue\b|[/20%:]/g, ""); // works, but catalogue is unaffected and 20% isn't treated as a word
Off the top of my head:
string.replace(/(catalogue|[\/:]|20%)/g,"");
Just use an alternative, i.e. separate each of the regular expressions you had before by the alternation operator |:
catalogue|20%|[/:]
Also note that you cannot just combine character classes and literal strings the way you did: inside [...] every character is matched individually, so [/20%:] matches a single /, 2, 0, % or :, not the string 20%. The naïve combination above works; anything beyond it would be optimisation (not that this case can be optimised further), and an optimisation is only valid if it doesn't change the language described by the regex.
You seem to have a typo there (\c). Also, you don't want 20% inside the character class (and you should escape the slash). You also need to remove the word boundary if you want catalogue20% to match: there is no boundary between catalogue and 20, so the \b fails:
string.replace(/catalogue|20%|[\/:]/g, "");
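A quick way to check the combined pattern is to run it on the string from the question. The sketch below also contrasts it with the character-class attempt; the second input, "v2.0:beta", is made up for illustration.

```javascript
const original = "rem:ove20%the/word:catalogue20%now";

// One pass: the word "catalogue", the literal "20%", or a single "/" or ":".
const cleaned = original.replace(/catalogue|20%|[\/:]/g, "");
console.log(cleaned); // "removethewordnow"

// For contrast: inside a character class every character stands alone,
// so [\/20%:] also strips any lone "2", "0" or "%".
const overStripped = "v2.0:beta".replace(/[\/20%:]/g, "");
console.log(overStripped); // "v.beta"
```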
var string = 'rem:ove20%the/word:catalogue20%now';
string.replace(/([:/]|20%|catalogue)/g, '');
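\b refers to a word boundary, but your word catalogue runs straight into other word characters (catalogue20%), so the boundary never matches. Keep 20% out of the character class too, otherwise every lone 2, 0 and % is removed. So your regex should be:
string.replace(/catalogue|20%|[\/:]/g, "");
Also do escape the / with \/.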
string.replace(/catalogue|20%|[/:]/g, '')
I recently came across the statement:
var cookies = document.cookie.split(/;/);
and
var pair = allCookies[i].split("=", 2);
if (pair[0].replace(/^ +/, "") == "lastvisit")
In the first statement, what does /;/ in the argument of split denote?
In the second statement, what does /^ +/ in the argument of replace denote?
These are Regular Expressions.
Javascript supports them natively.
In this particular example:
.split(/;/) uses ; as the split character;
.replace(/^ +/, "") removes (replaces with "") one or more (+) leading (^) spaces ( ).
In both examples, / surround or delimit the regular expression (or "regex"), informing Javascript that you're providing a regex.
Follow the links provided above for more information; regexes are broad in scope and worth learning.
Slashes delimit a regular expression, just like quotes delimit a string.
/;/ matches a semi-colon. Specifically:
var cookies = document.cookie.split(/;/);
Means we split the document.cookie string into an array, splitting it where there are semicolons. So it would take something like "a;b;c" and turn it into ["a", "b", "c"].
pair[0].replace(/^ +/, "")
Just strips all leading spaces. It turns
" lastvisit"
into
"lastvisit"
The caret ^ means "beginning of the string" (or of a line, when the m flag is set). It is followed by a space, and the + means to repeat the space one or more times, as many as possible.
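Putting both statements together, here is a minimal sketch of how they might be used to look up a cookie. The cookie string is a made-up stand-in for document.cookie, and the lookup object is an assumption of this example, not part of the original code.

```javascript
// Hypothetical cookie string standing in for document.cookie.
const cookieString = "theme=dark; lastvisit=2020-01-01; lang=en";

const lookup = {};
for (const entry of cookieString.split(/;/)) {
  // Split "name=value" into at most two pieces.
  const pair = entry.split("=", 2);
  // /^ +/ strips the spaces that follow each ";" separator.
  lookup[pair[0].replace(/^ +/, "")] = pair[1];
}

console.log(lookup.lastvisit);    // "2020-01-01"
console.log(Object.keys(lookup)); // ["theme", "lastvisit", "lang"]
```

Without the replace() call, the second and third keys would be " lastvisit" and " lang", and a comparison against "lastvisit" would fail.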
The // syntax denotes a regular expression (also known as a 'regex').
Regex is a syntax for searching and replacing strings.
The first example you gave is /;/. This is a very simple regex which just searches the string for semi-colons, and then splits it into an array based on the result. Since this is not using any special regex functionality, it could just as easily have been expressed as a simple string, i.e. split(";") (as has been done with the equal sign in your other example), without making any difference to the result.
The second example is /^ +/. This is more complex and requires a bit of knowledge of how regex works. In short, what it is doing is searching for leading spaces on a string, and removing them.
To learn more about regex, I recommend this site as a good starting point: http://www.regular-expressions.info/
Hope that helps.
I think that /^ +/ means: one or more " " characters at the start of the string. (The caret only means "not" inside a character class, as in [^ ]+.)
Need a function to strip off a set of illegal character in javascript: |&;$%#"<>()+,
This is a classic problem to be solved with regexes, which means now I have 2 problems.
This is what I've got so far:
var cleanString = dirtyString.replace(/\|&;\$%#"<>\(\)\+,/g, "");
I am escaping the regex special chars with a backslash but I am having a hard time trying to understand what's going on.
If I try with single literals in isolation most of them seem to work, but once I put them together in the same regex depending on the order the replace is broken.
i.e. this won't work --> dirtyString.replace(/\|<>/g, "")
Help appreciated!
What you need is a character class. Inside one, you only have to worry about escaping the ], \ and - characters (and ^ if you're placing it straight after the opening "[" of the character class).
Syntax: [characters] where characters is a list with characters.
Example:
var cleanString = dirtyString.replace(/[|&;$%#"<>()+,]/g, "");
I tend to look at it from the inverse perspective which may be what you intended:
What characters do I want to allow?
This is because lots of characters that you wouldn't expect could make it into a string somehow and blow stuff up.
For example, this one allows only letters and numbers, replacing each group of invalid characters with a hyphen:
"This¢£«±Ÿ÷could&*()\/<>be!##$%^bad".replace(/([^a-z0-9]+)/gi, '-');
//Result: "This-could-be-bad"
You need to wrap them all in a character class. The current version means replace this sequence of characters with an empty string. When wrapped in square brackets it means replace any of these characters with an empty string.
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");
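The difference between the two forms can be seen in a small sketch. The dirty string is made up for illustration and contains each illegal character once, separated by letters.

```javascript
const dirty = 'a|b&c;d$e%f#g"h<i>j(k)l+m,n';

// Without brackets the pattern is one long literal sequence, which this
// input never contains, so nothing is removed.
const asSequence = dirty.replace(/\|&;\$%#"<>\(\)\+,/g, "");
// With brackets each listed character matches on its own.
const asClass = dirty.replace(/[|&;$%#"<>()+,]/g, "");

console.log(asSequence === dirty); // true
console.log(asClass);              // "abcdefghijklmn"
```

None of the characters in this particular list needs a backslash inside the class, although escaping them as in the answer above is harmless.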
Put them in brackets []:
var cleanString = dirtyString.replace(/[\|&;\$%#"<>\(\)\+,]/g, "");
using http://www.regular-expressions.info/javascriptexample.html I tested the following regex
^\\{1}([0-9])+
this is designed to match a backslash and then a number.
It works there
If I then try this directly in code
var reg = /^\\{1}([0-9])+/;
reg.exec("/123")
I get no matches!
What am I doing wrong?
Update:
Regarding the update to your question: if the string starts with a forward slash, then the regex has to be:
var reg = /^\/(\d+)/;
You have to escape the slash inside the regex with \/.
The backslash needs to be escaped in the string too:
reg.exec("\\123")
Otherwise \123 will be treated as an escape sequence (a legacy octal escape) instead of a backslash followed by digits.
Btw, the regular expression can be simplified:
var reg = /^\\(\d+)/;
Note that I moved the quantifier + inside the capture group, otherwise it will only capture a single digit (namely 3) and not the whole number 123.
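Both points, the escaped backslash in the string and the position of the quantifier, can be checked with a short sketch (the variable names are made up for this example):

```javascript
// In a string literal "\\" is one backslash, so this text is \123.
const text = "\\123";
console.log(text.length); // 4

// Quantifier outside the group: the group is overwritten on each
// repetition and ends up holding only the last digit.
const lastDigit = /^\\(\d)+/.exec(text)[1];
console.log(lastDigit); // "3"

// Quantifier inside the group: the group holds the whole number.
const wholeNumber = /^\\(\d+)/.exec(text)[1];
console.log(wholeNumber); // "123"

// The forward-slash variant from the update.
const slashNumber = /^\/(\d+)/.exec("/123")[1];
console.log(slashNumber); // "123"
```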
You need to escape the backslash in your string:
"\\123"
Also, if the regex is ever given the g flag, exec() resumes from reg.lastIndex, so you may want to set reg.lastIndex = 0; before reusing it.
In addition, {1} is completely redundant, you can simplify your regex to /^\\(\d)+/.
One last note: (\d)+ will only capture the last digit, you may want (\d+).