In JavaScript, I am trying to split the following string
var s1 = "(-infinity, -3],(-3,-2),[1,infinity)";
into an array
["(-infinity, -3]","(-3,-2)","[1,infinity)"]
by using this statement
s1.split(/(?=[\]\)]),/);
To explain, I want to split the string on commas that follow a closing square bracket or parenthesis. I use the lookahead (?=[\]\)]), to do so, but it doesn't match any commas. When I change it to (?![\]\)]),, it matches every comma. What is the problem with my regex?
Your logic is backwards. (?=...) is a look-ahead group, not a look-behind. This means that s1.split(/(?=[\]\)]),/); matches only if the next character is simultaneously ] or ) and ,, which is impossible.
Try this instead:
s1.split(/,(?=[\[\(])/);
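To see the difference, here is a minimal sketch comparing the original pattern with the corrected one. The lookbehind variant at the end is an addition that assumes an engine with ES2018 lookbehind support; the variable names are illustrative only.

```javascript
const s1 = "(-infinity, -3],(-3,-2),[1,infinity)";

// Original attempt: the lookahead and the comma are tested at the same
// position, so the pattern can never match and nothing is split.
const unsplit = s1.split(/(?=[\]\)]),/);

// Corrected: split on commas that are followed by an opening bracket.
const byLookahead = s1.split(/,(?=[\[\(])/);

// Alternative (assumes lookbehind support, ES2018+): split on commas
// that are preceded by a closing bracket.
const byLookbehind = s1.split(/(?<=[\]\)]),/);

console.log(unsplit.length); // 1
console.log(byLookahead);
console.log(byLookbehind);
```

Both working variants leave the commas inside each interval untouched, because those commas are neither followed by an opening bracket nor preceded by a closing one.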
Related
I have an input string like this:
ABCDEFG[HIJKLMN]OPQRSTUVWXYZ
How can I replace each character in the string between the [] with an X (resulting in the same number of Xs as there were characters)?
For example, with the input above, I would like an output of:
ABCDEFG[XXXXXXX]OPQRSTUVWXYZ
I am using JavaScript's RegEx for this and would prefer if answers could be an implementation that does this using JavaScript's RegEx Replace function.
I am new to RegEx so please explain what you do and (if possible) link articles to where I can get further help.
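Use replace() with a function as the second argument: the function receives the match m (brackets included), and Array(m.length-1).join("X") generates the X's needed. Array(n).join("X") yields n-1 X's, so this produces m.length-2 of them, one per character between the brackets: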
var str = "ABCDEFG[HIJKLMN]OPQRSTUVWXYZ"
str = str.replace(/\[[A-Z]*\]/g,(m)=>"["+Array(m.length-1).join("X")+"]")
console.log(str);
We could also use .* instead of [A-Z]* in the regex to match any character (use the lazy .*? if the string may contain more than one bracketed group).
There are thousands of resources about regular expressions. For JavaScript specifically, see the MDN Regular Expressions guide. In my opinion the best way to learn is by practicing, and I find regex101 useful for that.
const str="ABCDEFG[HIJKLMN]OPQRSTUVWXYZ";
const run = (str) => str.replace(/\[.*]/, (match) => match.replace(/[^\[\]]/g, "X"));
console.log(run(str));
The first pattern, /\[.*]/, selects the bracketed part of the string, brackets included. The second pattern, /[^\[\]]/g, replaces every character in that part except the brackets themselves with "X".
We can observe that every individual letter you wish to match is followed by a series of zero or more non-'[' characters, until a ']' is found. This is quite simple to express in JavaScript-friendly regex:
/[A-Z](?=[^\[]*\])/g
regex101 example
(?= ) is a "positive lookahead assertion"; it peeks ahead of the current matching point, without consuming characters, to verify its contents are matched. In this case, [^\[]*\] matches exactly what I described above.
Now you can substitute each [A-Z] matched with a single 'X'.
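As a minimal sketch of that substitution (the second input string is a made-up example, added to show that several bracketed groups are handled independently):

```javascript
const input = "ABCDEFG[HIJKLMN]OPQRSTUVWXYZ";

// Replace every capital letter that is followed, with no "[" in between,
// by a closing "]", i.e. every letter inside the brackets.
const masked = input.replace(/[A-Z](?=[^\[]*\])/g, "X");
console.log(masked); // ABCDEFG[XXXXXXX]OPQRSTUVWXYZ

// Hypothetical input with two bracketed groups.
const maskedTwice = "AB[CD]EF[GH]IJ".replace(/[A-Z](?=[^\[]*\])/g, "X");
console.log(maskedTwice); // AB[XX]EF[XX]IJ
```

Letters outside the brackets are skipped because the lookahead runs into a "[" or the end of the string before it finds a "]".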
You can use the following solution to replace a string between two square brackets:
const rxp = /\[.*?\]/g;
"ABCDEFG[HIJKLMN]OPQRSTUVWXYZ".replace(rxp, (x) => {
  // x includes both brackets, so the content is x.length - 2 characters long
  return "[" + "X".repeat(x.length - 2) + "]";
});
I need to write a little RegEx matcher which will match any occurrence of strings in the form of
[a-zA-Z]+(_[a-zA-Z0-9]+)?
If I use the regex above it does match the sections needed but would also match onto the abc part of 4_abc which is not intended. I tried to exclude it with:
(?:[^a-zA-Z0-9_]|^)([a-zA-Z]+(_[a-zA-Z0-9]+)?)(?:[^a-zA-Z0-9_]|$)
The problem is that the 'not' matches at the beginning and end are not really working like I hoped they would. If I use them on the example
a_d Dd_da 4_d d_4
they would block matching the second Dd_da because the space was used in the first match. Sadly I can't use lookarounds because I am using JS.
So the input:
a_d Dd_da 4_d d_4
should match: a_d, Dd_da and d_4
but matches: a_d (there is a space at the end)
Is there another way to match the needed sections, or to not consume the 'anchor' matches?
I really appreciate your help.
You can make use of \b:
\b[a-zA-Z]+(_[a-zA-Z0-9]+)?\b
\b matches the (zero-width) point where either the preceding character or following character is a letter, digit or underscore, but not both. It also matches with the start/end of the string if the first/last character is a letter, digit or underscore.
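A quick check of the \b version against the sample input from the question; the variable names are illustrative:

```javascript
const text = "a_d Dd_da 4_d d_4";
const identifier = /\b[a-zA-Z]+(_[a-zA-Z0-9]+)?\b/g;

// With the g flag, match() returns every full match and ignores groups.
const found = text.match(identifier);
console.log(found.join(", ")); // a_d, Dd_da, d_4
```

The d in 4_d is rejected because the underscore before it is a word character, so there is no word boundary at that position. Since \b is zero-width, the space between two matches is never consumed.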
I want to get the last string between special characters. For square brackets I've used \[(.*)\]$
But when I use it on something like Blah [Hi]How is this[KoTuWa], I get the result [Hi]How is this[KoTuWa].
How do I modify it to get the last string, that is, KoTuWa?
Also, I would like to generalise to general special characters, instead of just matching the string between square brackets as above.
Thanks,
Sai
I would do this:
[^[\]]+(?=][^[\]]*$)
Debuggex Demo
To extend this to other types of brackets/special chars, say I also wanna match curly braces { and double quotes ":
[^{}"[\]]+(?=["\]}][^{}"[\]]*$)
Debuggex Demo (I added the multi-line /m only to show multiple examples)
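Both patterns can be tried in plain JavaScript. This is only a sketch: the second input string is a made-up example, and match() returns null when nothing matches, which the sketch does not handle.

```javascript
const sample = "Blah [Hi]How is this[KoTuWa]";

// Square brackets only: the match itself is the last bracketed text.
const lastSquare = sample.match(/[^[\]]+(?=][^[\]]*$)/)[0];
console.log(lastSquare); // KoTuWa

// Extended delimiter set: square brackets, curly braces and double quotes.
// The input below is hypothetical.
const mixed = 'say {one} and "two"';
const lastAny = mixed.match(/[^{}"[\]]+(?=["\]}][^{}"[\]]*$)/)[0];
console.log(lastAny); // two
```

Because the closing delimiter sits inside the lookahead, it is checked but not included in the result, so no capture group is needed.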
Here is one way to do it:
\[([^\[]*)\]$
You can require that the string between brackets does not contain brackets:
Edit: thanks to funkwurm and jcubic for pointing out an error. Here's the fixed expression:
\[([^[]+)\][^\[]*$
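A short sketch of how the two expressions differ, assuming the bracketed text is read from capture group 1. The second input, with text after the last bracket, is a made-up example.

```javascript
const firstTry = /\[([^\[]*)\]$/;
const fixed = /\[([^[]+)\][^\[]*$/;

const plain = "Blah [Hi]How is this[KoTuWa]";
const withTail = "Blah [Hi]How is this[KoTuWa] and more"; // hypothetical input

// Both expressions capture the last bracketed text when "]" ends the string.
const a = plain.match(firstTry)[1];
const b = plain.match(fixed)[1];

// Only the fixed expression tolerates text after the last closing bracket.
const c = withTail.match(firstTry); // null: "]" is not at the end
const d = withTail.match(fixed)[1];

console.log(a, b, c, d); // KoTuWa KoTuWa null KoTuWa
```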
If you need to use other separators than brackets, you should:
replace the \[ and \] with your new separators
replace the negative character classes with your beginning separator.
For example, assuming you need to use the separators <> instead of [], you'd do this:
<([^<]+)>[^<]*$
This is my string:
<address>tel+1234567890</address>
This is my regex:
([\d].*<)
which matches this:
1234567890<
but I don't want to match the last < character.
You can use a positive lookahead:
\d+(?=<)
The (?=...) syntax makes sure what's inside the parens matches at that position, without moving the match cursor forward, thus without consuming the input string. It's also called a zero-width assertion.
By the way, the square brackets in [\d] are redundant, so you can omit them. Also, I've changed the regex, but perhaps you really meant to match this:
\d.*?(?=<)
This pattern matches everything between a digit and a <, including the digit. It makes use of an ungreedy quantifier (*?) to match up until the first < if there are several.
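A small sketch of the two patterns side by side. The second input string is a made-up example chosen so that the patterns give different results.

```javascript
const html = "<address>tel+1234567890</address>";

// Digits only, and only when they are immediately followed by "<".
const digitsOnly = html.match(/\d+(?=<)/)[0];
console.log(digitsOnly); // 1234567890

// Hypothetical input where non-digits sit between the first digit and "<".
const messy = "tel+123-456 ext<";
const fromFirstDigit = messy.match(/\d.*?(?=<)/)[0];
console.log(fromFirstDigit); // 123-456 ext

// The digits-only pattern finds nothing here: no digit run touches "<".
const noMatch = messy.match(/\d+(?=<)/);
console.log(noMatch); // null
```

In both cases the "<" is asserted by the lookahead and never becomes part of the match.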
([\d]+)
This should work; try it out and let me know.
Check the demo
Also, as @LucasTrzesniewski said, you can use a lookahead:
(\d+(?=<))
Here is the demo
I tried to search for /\,$/ online, but couldn't find anything.
I have:
coords = coords.replace(/\,$/, "");
I'm guessing it returns the coords string index number. What should I search for online so I can learn more?
/\,$/ finds the comma character (,) at the end of a string (denoted by the $), and the replace() call swaps it for an empty string (""). You sometimes see this in code that cleans up excerpts of text.
It's a regular expression to remove a trailing comma.
That thing is a Regular Expression, also known as regex or regexp. It is a way to "match" strings using some rules. If you want to learn how to use it in JavaScript, read the Mozilla Developer Network page about RegExp.
By the way, regular expressions are also available in most languages and in some tools. It is a very useful thing to learn.
That's a regular expression that finds a comma at the end of a string. That code removes the comma.
The enclosing slashes (/ ... /) define a JavaScript regular expression literal, used to match a pattern within a string.
\,$ is the pattern
In this case \, translates to ,. A backslash is used to escape special characters, but here it's not necessary, because a comma has no special meaning. An example where it would be necessary is removing trailing periods. If you tried to do that with /.$/, the period would have a different meaning: it is a wildcard that matches [almost] any character (aside from line terminators). So to match a literal "." (period character) you have to escape it: /\.$/.
When $ is placed at the end of the pattern, it anchors the match to the end of the string. This means that you can't mistakenly find a comma in the middle of the string (e.g., not the one after help in help, me,), only at the end (trailing). It can also speed up the search. If you wanted to match characters only at the beginning of the string, you would start the pattern with a caret (^); for instance, /^,/ would find a comma at the start of a string if one existed.
It's also important to note that you're only removing one comma, whereas if you use the plus (+) after the comma, you'd be replacing one or more: /,+$/.
Without the +: "trailing commas,," becomes "trailing commas,"
With the +: "no trailing comma,," becomes "no trailing comma"
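The points above can be checked with a few one-liners. The helper names (stripOne, stripAll, stripDot) are hypothetical and exist only for this sketch.

```javascript
// Remove a single trailing comma, if there is one.
const stripOne = (s) => s.replace(/,$/, "");
// Remove all trailing commas.
const stripAll = (s) => s.replace(/,+$/, "");
// Remove a trailing period; the dot must be escaped to be literal.
const stripDot = (s) => s.replace(/\.$/, "");

console.log(stripOne("help, me,"));          // help, me
console.log(stripOne("trailing commas,,"));  // trailing commas,
console.log(stripAll("no trailing comma,,")); // no trailing comma
console.log(stripDot("the end."));           // the end

// An unescaped dot is a wildcard, so this removes any last character.
console.log("the end!".replace(/.$/, ""));   // the end
```

Note that the comma in the middle of "help, me," is left alone, since $ only allows a match at the very end of the string.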