Help interpreting a javascript Regex


I have found the following expression which is intended to modify the id of a cloned html element e.g. change contactDetails[0] to contactDetails[1]:
var nel = 1;
var s = $(this).attr(attribute);
s.replace(/([^\[]+)\[0\]/, "$1["+nel+"]");
$(this).attr(attribute, s);
I am not terribly familiar with regex. I have tried to interpret it with the help of The Regex Coach, but I am still struggling. It appears that ([^\[]+) matches one or more characters which are not '[' and \[0\]/ matches [0]. The / in the middle I interpret as an 'include both', so I don't understand why the author has even included the first expression.
I don't understand what the $1 in the replace string is. If I use the Regex Coach replace functionality with [0] as the search and 1 as the replace, I get the correct result; however, if I change the javascript to s.replace(/\[0\]/, "["+nel+"]"); the string s remains unchanged.
I would be grateful for any advice as to what the original author intended, and for help in finding a solution which will successfully replace a number in square brackets anywhere within a search string.

Find
/ # Signifies the start of a regex expression like " for a string
([^\[]+) # Capture one or more characters that are not [ into $1
\[0\] # Find [0]
/ # Signifies the end of a regex expression
Replace
"$1[" # Insert the item captured above And [
+nel+ # New index
"]" # Close with ]
To create an expression that captures any digit, you can replace the 0 with \d+ which will match a digit 1 or more times.
s = s.replace(/([^\[]+)\[\d+\]/, "$1["+nel+"]");

The $1 is a backreference to the first group in the regex. Groups are the pieces inside (). So, in this case $1 will be replaced by whatever the ([^\[]+) part matched.
If the string was contactDetails[0] the resulting string would be contactDetails[1].
Note that this regex only replaces 0s inside square brackets. If you want to replace any number you will need something like:
([^\[]+)\[\d+\]
The \d matches any digit character. \d+ then becomes any sequence of at least one digit.
But your code will still not work, because Javascript strings are immutable. That means they can't be changed once created. The replace method returns a new string, instead of changing the original one. You should use:
s = s.replace(...)
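Putting the two fixes together, a minimal sketch might look like the following. It leaves jQuery out and works on a plain string; the function name bumpIndex and the sample ids are made up for illustration.

```javascript
// Hypothetical helper: replaces the first bracketed number that follows
// at least one non-"[" character with a new index.
function bumpIndex(s, nel) {
  // replace() returns a new string and leaves the original untouched,
  // so the result has to be returned (or assigned back by the caller).
  return s.replace(/([^\[]+)\[\d+\]/, "$1[" + nel + "]");
}

console.log(bumpIndex("contactDetails[0]", 1));  // contactDetails[1]
console.log(bumpIndex("contactDetails[12]", 3)); // contactDetails[3]
```

In the original snippet this would presumably be used as $(this).attr(attribute, bumpIndex(s, nel)).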

It looks like it replaces an array index of 0 with 1.
For example: array[0] goes to array[1]
Explanation:
([^\[]+) - This part means save everything that is not a [ into variable $1
\[0\] - This part limits Part 1 to save everything up to a literal [0]
"$1["+nel+"]" - Print out the contents of $1 (loaded from part 1) and add the brackets with the value of nel. (in your example nel = 1)

Square braces define a set of characters to match. [abc] will match the letters a, b or c.
By adding the caret you are now specifying that you want characters not in the set. [^abc] will match any character that is not an a, b or c.
Because square brackets have special meaning in RegExps, you need to escape them with a backslash if you want to match one. [ starts a character set, \[ matches a literal bracket. (Same concept for closing brackets.)
So, [^\[]+ matches 1 or more characters that are not [.
Wrapping that in parentheses "captures" the matched portion of the string (in this case "contactDetails") so that you can use it in the replacement.
$1 uses the "captured" string (i.e. "contactDetails") in the replacement string.

This regex matches "something" followed by a [0].
"something" is identified by the expression [^\[]+ which matches one or more characters that are not a [. You can see the () around this expression, because the match is reused later with $1. The rest of your regex, that is \[0\], just matches the index [0]. The author had to write \[ and \] because [ and ] are special characters in regular expressions and have to be escaped.

$1 is a reference to the value of the first pair of parentheses. In your case the value of
[^\[]+
which matches one or more characters which are not a '['
The remaining part of the regexp matches string '[0]'.
So if s is 'foobar[0]' the result will be 'foobar[1]'.

[^\[] will match any character that is not [, and the '+' means one or more times. So [^\[]+ will match contactDetails. The parentheses capture this for later use. The '\' is an escape symbol, so the trailing \[0\] will match a literal [0]. The replace string uses $1, which is what was captured in the parentheses, and adds the new index.

Your interpretation of the regular expression is correct. It is intended to match one or more characters which are not [, followed by a literal [0]. And used in the replace method, the match would be replaced with the match of the first grouping (that’s what $1 is replaced with) together with the sequence [ followed by the value of nel and ] (that’s how "$1["+nel+"]" is to be interpreted).
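A simple s.replace(/\[0\]/, "["+nel+"]") does the same, except when there is nothing in front of [0]: in that case the first regex finds no match. Either way, the result has to be assigned back (s = s.replace(...)), because replace returns a new string and leaves s unchanged.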
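The difference between the two patterns, and the g flag for replacing a bracketed number anywhere in a string, can be checked with a short sketch. The sample strings and the helper name setAllIndexes are invented for illustration.

```javascript
const nel = 1;
const withPrefix = /([^\[]+)\[0\]/;
const simple = /\[0\]/;

// Both give the same result when something precedes "[0]".
console.log("contactDetails[0]".replace(withPrefix, "$1[" + nel + "]")); // contactDetails[1]
console.log("contactDetails[0]".replace(simple, "[" + nel + "]"));       // contactDetails[1]

// With nothing in front of "[0]", only the simple pattern matches.
console.log("[0]".replace(withPrefix, "$1[" + nel + "]")); // [0]
console.log("[0]".replace(simple, "[" + nel + "]"));       // [1]

// \d+ with the g flag replaces every bracketed number, wherever it appears.
function setAllIndexes(s, index) {
  return s.replace(/\[\d+\]/g, "[" + index + "]");
}
console.log(setAllIndexes("contacts[0].phones[0]", 2)); // contacts[2].phones[2]
```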

Related

Remove Last Instance Of Character From String - Javascript - Revisited

According to the accepted answer from this question, the following is the syntax for removing the last instance of a certain character from a string (In this case I want to remove the last &):
function remove (string) {
string = string.replace(/&([^&]*)$/, '$1');
return string;
}
console.log(remove("height=74&width=12&"));
But I'm trying to fully understand why it works.
According to regex101.com,
/&([^&]*)$/
& matches the character & literally (case sensitive)
1st Capturing Group ([^&]*)
Match a single character not present in the list below [^&]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
& matches the character & literally (case sensitive)
$ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)
So if we're matching the character & literally with the first &:
Then why are we also "matching a single character not present in the following list"?
Seems counter productive.
And then, "$ asserts position at the end of the string" - what does this mean? That it starts searching for matches from the back of the string first?
And finally, what is the $1 doing in the replaceValue? Why is it $1 instead of an empty string? ""

1- I think the solution to that problem is different from the solution you want:
That regex will remove the last "&" no matter where it is, in the middle or at the end of the string.
If you apply this regex to these two examples, you will see that the first one gets "incorrectly" replaced:
height=74&width=12&test=1
height=74&width=12&test=1&
They get replaced as :
height=74&width=12test=1
height=74&width=12&test=1
So to remove an "&" only when it is the last character, the only thing you need to do is:
string = string.replace(/&$/, '');
Now, if you want to remove the last occurrence of "&" no matter where it is, I will explain that regex:
$1 represents a capturing group: everything matched inside ([^&]*) is captured into $1. This is an oversimplification.
&([^&]*)$
& will match a literal "&". Then the capturing group that follows looks for any number (0 to infinite) of characters that are NOT EQUAL TO "&" (explained later) until the end of the string, or the end of a line if the regex uses the m flag. Anything captured in this capturing group will go into $1 when you apply the replacement.
So, if you apply this logic in your mind, you will see that it will always match the last & and replace it with whatever is on its right, which cannot contain a "&".
&(<nothing-like-a-&>*)<until-we-reach-the-end> is replaced by whatever was found inside (<nothing-like-a-&>*), that is, $1. Because * means 0 or more times, the capturing group $1 will sometimes be empty.
NOT EQUAL TO part:
The regex uses [^]. In simple terms, [] represents a set of individual characters. For example, [ab] and [ba] mean the same thing: match "a" or "b". Inside the brackets you can also use ranges such as 0 to 9: [0-9ba] matches any digit from 0 to 9, "a" or "b".
The "^" in [^] negates the set, so it matches any character not in it: [^0-9] matches any character that is not a digit. In your regex, [^&] was used to look for any character that is not a "&".
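To see the two behaviours side by side, and why the replacement is $1 rather than an empty string, here is a small sketch. The function names are made up; the sample strings are the ones used in this answer and in the question.

```javascript
// Removes the last "&" wherever it is; $1 puts back the text that followed it.
function removeLastAmp(s) {
  return s.replace(/&([^&]*)$/, "$1");
}

// Removes an "&" only when it is the very last character.
function removeTrailingAmp(s) {
  return s.replace(/&$/, "");
}

console.log(removeLastAmp("height=74&width=12&"));            // height=74&width=12
console.log(removeLastAmp("height=74&width=12&test=1"));      // height=74&width=12test=1
console.log(removeTrailingAmp("height=74&width=12&test=1"));  // height=74&width=12&test=1
console.log(removeTrailingAmp("height=74&width=12&test=1&")); // height=74&width=12&test=1

// Replacing with "" instead of "$1" would also delete the text after the "&".
console.log("height=74&width=12&test=1".replace(/&([^&]*)$/, "")); // height=74&width=12
```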

JavaScript - making my regular expression work

I have these 2 expressions:
1: [^a-zA-Z0-9]
2: [^a-zA-Z]
The first one must be used whenever my string starts with data- and the second one if it doesn't. However, I need this built-in into my regular expression (so using .slice(0, 5) == "data-" is no option for this situation).
Is it possible to do this inlined (so by just having to use 1 regular expression)? Or do I first have to validate (if string starts with data-) and then use the correct expression?
Some examples:
data-attribute#!#!19 => data-attribute19
data-attribute17 => data-attribute17
attribute19 => attribute
attribute1#!#!##183 => attribute

You can do something a bit like this:
/^(data-[a-zA-Z0-9]+).*?(\d*)$|^([a-zA-Z]+).*$/
Which will match what you want, and then return the results in capture groups 1 and 2 (when the string starts with data-) or in capture group 3 (when it does not).
Breaking it Down
Going from left to right:
The ^ character means "beginning of line" - in this case, the beginning of a single string.
The parentheses () indicate a capture group - some substring that you want to capture and output separately from your main match string.
data- indicates the literal string "data-", with the hyphen at the end.
[a-zA-Z0-9]+ is a character class, repeated one or more times.
.*? is zero or more of any character, matched lazily - meaning it takes as few characters as possible, expanding only as far as needed for the rest of the pattern to match. Zero or more (rather than one or more) lets a string such as data-attribute17, which has nothing to strip, still match in full.
\d* means zero or more digits (equivalent to [0-9]*).
The $ character means "match the end of the line" (again, in this case, the end of your string).
The | character means "alternate" - basically, it will match either the pattern on the left or the pattern on the right, enabling this single regex to match either of your two strings.
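As a rough sketch of how the capture groups could be read back, the snippet below uses .*? and .* (zero or more) for the filler parts, which is an assumption made here so that inputs with nothing to strip still match. The function name cleanName is hypothetical. Note that this approach only copes with one run of unwanted characters; letters that appear after that run in a data- name are dropped.

```javascript
// Filler parts use .*? and .* (zero or more): an assumption of this sketch.
const pattern = /^(data-[a-zA-Z0-9]+).*?(\d*)$|^([a-zA-Z]+).*$/;

// Hypothetical helper that stitches the captured groups back together.
function cleanName(str) {
  const m = str.match(pattern);
  if (m === null) return str;                  // no match: leave the input alone
  if (m[1] !== undefined) return m[1] + m[2];  // "data-" branch: groups 1 and 2
  return m[3];                                 // other branch: group 3
}

console.log(cleanName("data-attribute#!#!19")); // data-attribute19
console.log(cleanName("data-attribute17"));     // data-attribute17
console.log(cleanName("attribute19"));          // attribute
console.log(cleanName("attribute1#!#!##183"));  // attribute
```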

str = str.replace(/[#!]/g, '')
/^data-(.+)$/.test(str) // true or false
This should do the trick.
First we remove every special character (you can add your own). The pattern must be a regex literal, not a quoted string, and needs the g flag to replace every occurrence.
[abc] is a class of characters, which says to JavaScript: match any one of the characters between the square brackets
Then we test whether the result starts with data-
^ and $ match the beginning and end of the input
() captures the characters inside them. You can access what was captured through the array returned by str.match(), or through the legacy RegExp.$1 to RegExp.$9 properties
. means any character, except line terminators.
+ is a quantifier for 1 time or more. It is the same as {1,}.
Now you just have to test whether it matches the input. If it matches, the attribute starts with data-

Grab full regex word if pattern inside it matches

How do I retrieve an entire word that has a specific portion of it that matches a regex?
For example, I have the below text.
Using ^.[\.\?\!:;,]{2,} , I match the first 3, but not the last. The last should be matched as well, but $ doesn't seem to produce anything.
a!!!!!!
n.......
c..,;,;,,
huhuhu..
I want to get all strings that have an occurrence of certain characters equal to or more than twice. I produced the aforementioned regex, but on Rubular it only matches the characters themselves, not the entire string. Using ^ and $
I've read a few stackoverflow posts similar, but not quite what I'm looking for.

Change your regex to:
/^.*[.?!:;,]{2,}/gm
i.e. match zero or more characters before two or more of those special characters.

If I understand correctly, you are trying to match an entire string that contains the same punctuation character twice in a row:
^.*?([.?!:;,])\1.*
Note: if your string has newline characters, change .* to [\s\S]*
The trick is here:
([.?!:;,]) # captures the punct character in group 1
\1 # refers to the character captured in group 1
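A quick way to compare the two suggested patterns is to run both over the sample lines. This is only a sketch: the flags are dropped because each string is tested on its own, and the last two samples ("plain" and "x.,") are added here for contrast and are not from the question.

```javascript
const samples = ["a!!!!!!", "n.......", "c..,;,;,,", "huhuhu..", "plain", "x.,"];

const anyTwo = /^.*[.?!:;,]{2,}/;        // two or more characters from the set
const sameTwice = /^.*?([.?!:;,])\1.*/;  // the same character twice in a row

for (const s of samples) {
  const m = s.match(sameTwice);
  // Prints the sample, whether anyTwo matches, and the full sameTwice match.
  console.log(s, anyTwo.test(s), m ? m[0] : null);
}
```

The four strings from the question match both patterns in full. "x.," matches only the first pattern, because its two punctuation characters are different.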

can someone help to explain this regular expression in javascript?

This code is used to strip the MIME type from raw data, but I cannot understand how it works:
content.replace(/^[^,]*,/ , '')
It seems quite different from Java. Any help will be appreciated.

Your MIME type is probably at the beginning of your raw data and separated from the rest by a comma.
This regex says: starting from the beginning (^), take everything that is NOT a comma ([^,]*) (the star means as many characters as there are before the first comma) and take the comma itself (,). Then replace it with nothing ('').
This only affects the first occurrence, because the ^ requires the match to start at the beginning of the string.

The first thing you need to know is that there are regex literals in JavaScript, constructed by pairs of slashes. So like "..." is a string, /.../ is a regex. That's actually the only difference your code shows as compared to a Java regex.
Then, [abc] within a regex is called a character class, meaning "one character out of a, b or c". Conversely, [^abc] is a negated character class, meaning "one character except a, b or c".
So your sample means:
/ # Start of regex literal
^ # Start the match at the start of the string
[^,]* # Match any number of characters except commas
, # Match a comma
/ # End of regex literal

The regular expression is the text between the two forward slashes. The first caret (^) means "at the beginning of the string". The brackets define a character class, and the caret inside the brackets negates it, so the class means any character except a comma. The asterisk after the closing bracket means match zero or more of the characters defined by the class. Finally, the last comma means match the comma that follows all of this. The regex is then used in a replace function, so the matched text is replaced with the second parameter, which in your case is an empty string.
Basically it matches the first characters up to and including the first comma in the 'content' variable and then replaces it with an empty string.
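A short sketch of the behaviour described above. The sample input is hypothetical (a small base64 data URI); any "prefix,rest" string behaves the same way.

```javascript
// Hypothetical sample input, standing in for the raw data in the question.
const content = "data:text/plain;base64,SGVsbG8=";
const stripped = content.replace(/^[^,]*,/, "");
console.log(stripped); // SGVsbG8=

// Only the prefix up to and including the first comma is removed.
console.log("a,b,c".replace(/^[^,]*,/, "")); // b,c

// Without a comma there is no match and the string comes back unchanged.
console.log("no comma here".replace(/^[^,]*,/, "")); // no comma here
```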

Find the first letter of the last word with jquery inside a string (string can have multiple words)

Hi, is there a way to find the first letter of the last word in a string? The strings are results of an XML parser function. Inside the each() loop I get all the nodes and put every name inside a variable like this: var person = xml.find("name").find().text()
Now person holds a string, it could be:
Anamaria Forrest Gump
John Lock
As you see, the first string holds 3 words, while the second holds 2 words.
What I need are the first letters of the last words: "G", "L".
How do I accomplish this? Thank you.

This should do it:
var person = xml.find("name").find().text();
var names = person.split(' ');
var firstLetterOfSurname = names[names.length - 1].charAt(0);

This solution will work even if your string contains a single word. It returns the desired character:
myString.match(/(\w)\w*$/)[1];
Explanation: "Match a word character (and memorize it) (\w), then match any number of word characters \w*, then match the end of the string $". In other words : "Match a sequence of word characters at the end of the string (and memorize the first of these word characters)". match returns an array with the whole match in [0] and then the memorized strings in [1], [2], etc. Here we want [1].
Regexps are enclosed in / in javascript : http://www.w3schools.com/js/js_obj_regexp.asp
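One caveat: match returns null when nothing matches, so indexing [1] directly throws for a string that does not end in a word character. A minimal sketch with a guard, where the helper name firstLetterOfLastWord is made up for illustration:

```javascript
// Hypothetical helper; returns null when the string does not end in a word.
function firstLetterOfLastWord(str) {
  const m = str.match(/(\w)\w*$/);
  return m ? m[1] : null;
}

console.log(firstLetterOfLastWord("Anamaria Forrest Gump")); // G
console.log(firstLetterOfLastWord("John Lock"));             // L
console.log(firstLetterOfLastWord("Cher"));                  // C
console.log(firstLetterOfLastWord("John Lock "));            // null (trailing space)
```

Keep in mind that \w only covers A-Z, a-z, 0-9 and the underscore, so names containing accented letters would need a different character class.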

You can hack it with regex:
'Marry Jo Poppins'.replace(/^.*\s+(\w)\w+$/, "$1"); // P
'Anamaria Forrest Gump'.replace(/^.*\s+(\w)\w+$/, "$1"); // G
Otherwise Mark B's answer is fine, too :)
edit:
Alsciende's regex+javascript combo myString.match(/(\w)\w*$/)[1] is probably a little more versatile than mine.
regular expression explanation
/^.*\s+(\w)\w+$/
^ beginning of input string
.* followed by any character (.) 0 or more times (*)
\s+ followed by any whitespace (\s) 1 or more times (+)
( group and capture to $1
\w followed by any word character (\w)
) end capture
\w+ followed by any word character (\w) 1 or more times (+)
$ end of input string
Alsciende's regex
/(\w)\w*$/
( group and capture to $1
\w any word character
) end capture
\w* any word character (\w) 0 or more times (*)
summary
Regular expressions are awesomely powerful, or as you might say, "Godlike!" Regular-Expressions.info is a great starting point if you'd like to learn more.
Hope this helps :)
