Split a String by a Regex in JavaScript

I have the string:
"Vl.55.16b25.3d.42b50.59b30.90.24b35.3d.56.67b70.Tv.54b30.Vl.41b35.Tv.Bd.71b50.3d.99b20.03b50.Tv.73b50.Vl.05b25.12b40.Bd.Tv.82b25."
How can I split it to get results like:
["Vl.55.16b25", "3d.42b50.59b30.90.24b35", "3d.56.67b70", ...]
The logic:
Condition 1: A block ends with b followed by two digits, for example b20 or b25.
If condition 1 passes, I need to check condition 2.
Condition 2: The next item is "3d" or another two-character tag. If condition 2 does not match, the next item is appended to the current block.
Many thanks.

If I understand your question correctly, the following code should work:
var string = "Vl.55.16b25.3d.42b50.59b30.90.24b35.3d.56.67b70.Tv.54b30.Vl.41b35.Tv.Bd.71b50.3d.99b20.03b50.Tv.73b50.Vl.05b25.12b40.Bd.Tv.82b25.";
console.log(string.split(/(?<=b\d\d)\.(?=3d)/g))
Explanation:
(?<=) is look-behind.
b matches the literal character "b".
\d matches any digit so \d\d will match two digits in a row.
\. matches a literal ".", it needs the \ before it because otherwise it would match any character.
(?=) is look-ahead.
The g flag stands for global. It is not required here, because split always splits the string at every occurrence of the regular expression.
This means that the string will be split at every occurrence of "." that is preceded by the letter "b" and two digits, and followed by "3d".
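The lookahead above only splits before "3d", so blocks that begin with other tags such as Tv, Vl or Bd stay attached to the previous block. As a rough sketch, assuming every other tag consists of exactly two letters (an assumption read off the sample string, not something the question states), the lookahead could be widened like this:

```javascript
const input = "Vl.55.16b25.3d.42b50.59b30.90.24b35.3d.56.67b70.Tv.54b30.Vl.41b35.Tv.Bd.71b50.3d.99b20.03b50.Tv.73b50.Vl.05b25.12b40.Bd.Tv.82b25.";

// Split at a "." that follows "b" + two digits and precedes either "3d"
// or a two-letter tag (assumed tag shape). Lookbehind needs an ES2018 engine.
const parts = input.split(/(?<=b\d\d)\.(?=3d|[A-Za-z]{2})/);

console.log(parts.length); // 10
console.log(parts[1]);     // 3d.42b50.59b30.90.24b35
```

Note that the final block keeps the trailing "." of the input, because nothing follows it to trigger a split.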

Assuming you want each block to end at the last 'b' plus two digits that is followed by 3d, two non-digit characters, or the end of the string (this last case is necessary), and you want to omit the leading dot, you could use the following regular expression.
const
string = "Vl.55.16b25.3d.42b50.59b30.90.24b35.3d.56.67b70.Tv.54b30.Vl.41b35.Tv.Bd.71b50.3d.99b20.03b50.Tv.73b50.Vl.05b25.12b40.Bd.Tv.82b25.",
result = string.match(/[^.].*?b\d\d(?=\.(3d|\D\D|$))/g);
console.log(result);


regex - don't allow name to finish with hyphen

I'm trying to create a regex using javascript that will allow names like abc-def but will not allow abc-
(hyphen is also the only nonalpha character allowed)
The name has to be a minimum of 2 characters. I started with
^[a-zA-Z-]{2,}$, but it's not good enough so I'm trying something like this
^([A-Za-z]{2,})+(-[A-Za-z]+)*$.
It can have more than one - in a name but it should never start or finish with -.
It's allowing names like xx-x but not names like x-x. I'd like to achieve that x-x is also accepted but not x-.
Thanks!
Option 1
This option matches strings that begin and end with a letter and ensures that two - are never consecutive, so a string like a--a is invalid. To allow this case, see Option 2.
^[a-z]+(?:-?[a-z]+)+$
^ Assert position at the start of the line
[a-z]+ Match any lowercase ASCII letter one or more times (with i flag this also matches uppercase variants)
(?:-?[a-z]+)+ Match the following one or more times
-? Optionally match -
[a-z]+ Match any ASCII letter (with i flag)
$ Assert position at the end of the line
var a = [
"aa","a-a","a-a-a","aa-aa-aa","aa-a", // valid
"aa-a-","a","a-","-a","a--a" // invalid
]
var r = /^[a-z]+(?:-?[a-z]+)+$/i
a.forEach(function(s) {
console.log(`${s}: ${r.test(s)}`)
})
Option 2
If you want to match strings like a--a then you can instead use the following regex:
^[a-z]+[a-z-]*[a-z]+$
var a = [
"aa","a-a","a-a-a","aa-aa-aa","aa-a","a--a", // valid
"aa-a-","a","a-","-a" // invalid
]
var r = /^[a-z]+[a-z-]*[a-z]+$/i
a.forEach(function(s) {
console.log(`${s}: ${r.test(s)}`)
})
You can use a negative lookahead:
/(?!.*-$)^[a-z][a-z-]+$/i
Breakdown:
// Negative lookahead so that it can't end with a -
(?!.*-$)
// The actual string must begin with a letter a-z
[a-z]
// Any following strings can be a-z or -, there must be at least 1 of these
[a-z-]+
let regex = /(?!.*-$)^[a-z][a-z-]+$/i;
let test = [
'xx-x',
'x-x',
'x-x-x',
'x-',
'x-x-x-',
'-x',
'x'
];
test.forEach(string => {
console.log(string, ':', regex.test(string));
});
The problem is that the first group requires two or more [A-Za-z] characters. You need to change it to accept one or more characters:
^[A-Za-z]+((-[A-Za-z]{1,})+)?$
Edit: solved some commented issues
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-dfe'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('x-d'); // Logs true
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('xggg-'); // Logs false
Edit 2: Edited to accept characters only
/^[A-Za-z]+((-[A-Za-z]{1,})+)?$/.test('abc'); // Logs true
Use this if you also want to accept names such as A---A:
^(?!-|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/4UYd9l/4/
If you don't want to accept names such as A---A, use this:
^(?!-|.*[-]{2,}.*|.*-$)[A-Za-z-]{2,}$
https://regex101.com/r/qH4Q0q/4/
Both accept only names of two or more characters from the class [A-Za-z-], and the negative lookahead (?!-|.*-$) ensures that the name does not start or end with -.
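A quick way to compare the two lookahead variants is to run them side by side. The variable names and sample inputs below are made up for illustration:

```javascript
// Variant that allows consecutive hyphens (e.g. "A---A").
const allowRepeats = /^(?!-|.*-$)[A-Za-z-]{2,}$/;
// Variant that also rejects consecutive hyphens.
const noRepeats = /^(?!-|.*[-]{2,}.*|.*-$)[A-Za-z-]{2,}$/;

const samples = ["x-x", "abc-def", "A---A", "x-", "-x", "x"];
const report = samples.map(s => [s, allowRepeats.test(s), noRepeats.test(s)]);

// Each row: [input, allowRepeats result, noRepeats result]
report.forEach(row => console.log(row.join(" : ")));
```

Only "A---A" should differ between the two columns; the inputs that start or end with a hyphen, and the single-character input, fail both.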
Try this /([a-zA-Z]{1,}-[a-zA-Z]{1,})/g
I suggest the following:
^[a-zA-Z][a-zA-Z-]*[a-zA-Z]$
It validates:
that the matched string is composed of at least two characters (the first and last character classes are each matched exactly once)
that the first and the last characters aren't dashes (the first and last character classes do not include -)
that the string can contain dashes and be longer than 2 characters (the second character class includes dashes and will consume as many characters as needed, dashes included).
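The three properties listed above can be checked with a small sketch. The helper name isValidName is hypothetical and used only for this illustration:

```javascript
// isValidName is a hypothetical helper wrapping the suggested pattern.
function isValidName(name) {
  return /^[a-zA-Z][a-zA-Z-]*[a-zA-Z]$/.test(name);
}

console.log(isValidName("x-x"));  // true
console.log(isValidName("xx"));   // true
console.log(isValidName("a--a")); // true: the middle class allows repeated dashes
console.log(isValidName("x-"));   // false: ends with a dash
console.log(isValidName("-x"));   // false: starts with a dash
console.log(isValidName("x"));    // false: fewer than two characters
```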
^(?=[A-Za-z](?:-|[A-Za-z]))(?:(?:-|^)[A-Za-z]+)+$
Asserts that
the first character is a-z
the second is a-z or hyphen
If this matches
looks for groups of one or more letters prefixed by a hyphen or start of string, all the way to end of string.
You can also use the i flag to make it case insensitive.

How does the following code mean two consecutive numbers?

This is from an exercise on FCC beta and I cannot understand how the following code means two consecutive numbers, given that \D* means zero or more non-digits and \d means a digit. How does this add up to two numbers in a regexp?
let checkPass = /(?=\w{5,})(?=\D*\d)/;
This does not match two numbers. It doesn't really match anything except an empty string, as there is nothing preceding the lookaheads.
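To see the zero-width behaviour concretely, here is a small sketch (the test strings are made up for illustration). The regex succeeds on "abcd2", but the matched text is empty because lookaheads only assert and never consume characters:

```javascript
const checkPass = /(?=\w{5,})(?=\D*\d)/;

const m = checkPass.exec("abcd2");
console.log(m[0] === "");            // true: the match itself is empty
console.log(m.index);                // 0: both lookaheads succeed at the start
console.log(checkPass.test("abcde")); // false: five word characters but no digit
console.log(checkPass.test("ab2"));   // false: fewer than five word characters
```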
If you want to match two digits, you can do something like this:
(\d)(\d)
Or if you really want to use a positive lookahead with the (?=\D*\d) section, you will have to do something like this:
\d(?=\D*\d)
This will match any digit that is followed by zero or more non-digits and then another digit. A few examples (matched digits marked with ^):
2 hhebuehi3
^
245673
^^^^^
2v jugn45
^      ^
To also capture the second digit, you will have to put parentheses around both digits, i.e.:
(\d)(?=\D*(\d))
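As a small sketch of the two patterns above, using the same sample strings: matchAll lists every digit that has another digit somewhere after it, and the capturing version exposes both digits as groups.

```javascript
// Every digit that is followed (after optional non-digits) by another digit.
const hits = [..."2v jugn45".matchAll(/\d(?=\D*\d)/g)].map(m => [m[0], m.index]);
console.log(hits); // [["2", 0], ["4", 7]]

// Capture the matched digit and the digit seen by the lookahead.
const pair = "2 hhebuehi3".match(/(\d)(?=\D*(\d))/);
console.log(pair[1], pair[2]); // 2 3
```

The final "5" in "2v jugn45" is not reported because no digit follows it.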
In order to do what your original example wants, ie:
number
5+ \w characters
a non-number character
a number
... you will need to precede your original example with a \d character. This means that your lookaheads will actually match something which isn't just an empty string:
\d(?=\w{5,})(?=\D*\d)
IMPORTANT EDIT
After playing around a bit more with a JavaScript online console, I have worked out the problem with your original Regex.
This matches a string with 5 or more word characters that includes at least 1 digit. It can match a string with two digits, but it can also match one with 1 digit, 3 digits, 12 digits, etc. To require two consecutive digits in a string of 5-or-more characters, you should specify the number of digits you want in the second lookahead:
let regex = /(?=\w{5,})(?=\D*\d{2})/;
let string1 = "abcd2";
let regex1 = /(?=\w{5,})(?=\D*\d)/;
console.log("string 1 & regex 1: " + regex1.test(string1));
let regex2 = /(?=\w{5,})(?=\D*\d{2})/;
console.log("string 1 & regex 2: " + regex2.test(string1));
let string2 = "abcd23";
console.log("string 2 & regex 2: " + regex2.test(string2));
My original answer was about Regex in a vacuum and I glossed over the fact that you were using Regex in conjunction with JavaScript, which works a little differently when comparing Regex to a string. I still don't know why your original regex was supposed to match two numbers, but I hope this is a bit more helpful.
(?= starts a positive lookahead
\w matches any word character (equal to [a-zA-Z0-9_])
{5,} matches between 5 and unlimited times
\D matches any character that's not a digit (equal to [^0-9])
* matches between zero and unlimited times
\d matches a digit (equal to [0-9])
With the g flag the expression tries to find all matches; the regex in the question has no g flag, so it stops at the first match.
You can always check your expression using regex101

Regex to match # followed by square brackets containing a number

I want to parse a pattern similar to this using javascript:
#[10] or #[15]
With all my efforts, I came up with this:
#\\[(.*?)\\]
This pattern works fine but the problem is it matches anything b/w those square brackets. I want it to match only numbers. I tried these too:
#\\[(0-9)+\\]
and
#\\[([(0-9)+])\\]
But these match nothing.
Also, I want to match only patterns that are complete words and not part of a word in the string, i.e. the pattern should have spaces on both sides unless it starts or ends the string. That means it should not match a phrase like this:
abxdcs#[13]fsfs
Thanks in advance.
Use the regex:
/(?:^|\s)#\[([0-9]+)\](?=$|\s)/g
It matches only if the pattern (#[number]) is not part of a word: it must have whitespace on both sides unless it starts or ends the string.
It uses groups, so if you need the digits, use group 1.
Testing code:
console.log(/(?:^|\s)#\[([0-9]+)\](?=$|\s)/g.test("#[10]")); // true
console.log(/(?:^|\s)#\[([0-9]+)\](?=$|\s)/g.test("#[15]")); // true
console.log(/(?:^|\s)#\[([0-9]+)\](?=$|\s)/g.test("abxdcs#[13]fsfs")); // false
console.log(/(?:^|\s)#\[([0-9]+)\](?=$|\s)/g.test("abxdcs #[13] fsfs")); // true
var r1 = /(?:^|\s)#\[([0-9]+)\](?=$|\s)/g
var match = r1.exec("#[10]");
console.log(match[1]); // 10
var r2 = /(?:^|\s)#\[([0-9]+)\](?=$|\s)/g
var match2 = r2.exec("abxdcs #[13] fsfs");
console.log(match2[1]); // 13
var r3 = /(?:^|\s)#\[([0-9]+)\](?=$|\s)/g
var match3;
while (match3 = r3.exec("#[111] #[222]")) {
console.log(match3[1]);
}
// while's output:
// 111
// 222
You were close, but you need to use square brackets:
#\[[0-9]+\]
Or, a shorter version:
#\[\d+\]
The reason you need those backslashes is to "escape" the square brackets, which are normally used to denote a "character class".
[0-9] creates a character class which matches exactly one digit in the range of 0 to 9. Adding the + changes the meaning to "one or more". \d is just shorthand for [0-9].
Of course, the backslash character is also used to escape characters inside of a JavaScript string, which is why you must escape the backslashes themselves. So:
javascript
"#\\[\\d+\\]"
turns into:
regex
#\[\d+\]
which is used to match:
# a literal "#" symbol
\[ a literal "[" symbol
\d+ one or more digits (nearly identical to [0-9]+)
\] a literal "]" symbol
I say that \d is nearly identical to [0-9] because, in some regex flavors (including .NET), \d will actually match numeric digits from other cultures in addition to 0-9.
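The double escaping only matters when the pattern is written as a string. A minimal sketch comparing the string form with a regex literal (the variable names are illustrative):

```javascript
// Pattern built from a string: each backslash must itself be escaped.
const fromString = new RegExp("#\\[\\d+\\]");
// Regex literal: single backslashes are enough.
const literal = /#\[\d+\]/;

console.log(fromString.source);                    // #\[\d+\]
console.log(fromString.source === literal.source); // true
console.log(literal.test("#[10]"));                // true
console.log(literal.test("#[abc]"));               // false
```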
You don't need so many characters inside the character class. More importantly, you put the + in the wrong place. Try this: #\\[([0-9]+)\\].

Regular Expression - Match any character except +, empty string should also be matched

I am having a bit of trouble with one part of a regular expression that will be used in JavaScript. I need a way to match any character other than the + character, an empty string should also match.
[^+] is almost what I want except it does not match an empty string. I have tried [^+]* thinking: "any character other than +, zero or more times", but this matches everything including +.
Add a {0,1} to it so that it will only match zero or one times, no more no less:
[^+]{0,1}
Or, as FailedDev pointed out, ? works too:
[^+]?
As expected, testing with Chrome's JavaScript console shows no match for "+" but does match other characters:
x = "+"
y = "A"
x.match(/[^+]{0,1}/)
[""]
y.match(/[^+]{0,1}/)
["A"]
x.match(/[^+]?/)
[""]
y.match(/[^+]?/)
["A"]
[^+] means "match any single character that is not a +"
[^+]* means "match any number of characters that are not a +" - which almost seems like what I think you want, except that it will match zero characters if the first character (or even all of the characters) are +.
Use anchors to make sure that the expression validates the entire string:
^[^+]*$
means:
^ # assert at the beginning of the string
[^+]* # any character that is not '+', zero or more times
$ # assert at the end of the string
If you're just testing the string to see if it doesn't contain a +, then you should use:
^[^+]*$
This will match only if the ENTIRE string has no +.
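A short sketch of the difference the anchors make (sample strings chosen for illustration). Without anchors, [^+]* still "succeeds" on "+" by matching zero characters; with anchors, the whole string has to be free of +:

```javascript
const noPlus = /^[^+]*$/;

const results = ["", "abc", "a+b", "+"].map(s => [s, noPlus.test(s)]);
console.log(results); // [["", true], ["abc", true], ["a+b", false], ["+", false]]

// Unanchored: succeeds on "+" with an empty match at index 0.
const unanchored = "+".match(/[^+]*/);
console.log(unanchored[0] === ""); // true
```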

Help interpreting a javascript Regex

I have found the following expression which is intended to modify the id of a cloned html element e.g. change contactDetails[0] to contactDetails[1]:
var nel = 1;
var s = $(this).attr(attribute);
s.replace(/([^\[]+)\[0\]/, "$1["+nel+"]");
$(this).attr(attribute, s);
I am not terribly familiar with regex. I have tried to interpret it with the help of The Regex Coach, but I am still struggling. It appears that ([^\[]+) matches one or more characters which are not '[' and \[0\]/ matches [0]. The / in the middle I interpret as an 'include both', so I don't understand why the author has even included the first expression.
I don't understand what the $1 in the replace string is. If I use the Regex Coach replace functionality with [0] as the search and 1 as the replacement, I get the correct result; however, if I change the JavaScript to s.replace(/\[0\]/, "["+nel+"]"); the string s remains unchanged.
I would be grateful for any advice as to what the original author intended and help in finding a solution which will successfully replace a number in square brackets anywhere within a search string.
Find
/ # Signifies the start of a regex expression like " for a string
([^\[]+) # Capture one or more characters that aren't [ into $1
\[0\] # Find [0]
/ # Signifies the end of a regex expression
Replace
"$1[" # Insert the item captured above And [
+nel+ # New index
"]" # Close with ]
To create an expression that captures any digit, you can replace the 0 with \d+ which will match a digit 1 or more times.
s.replace(/([^\[]+)\[\d+\]/, "$1["+nel+"]");
The $1 is a backreference to the first group in the regex. Groups are the pieces inside (). So, in this case $1 will be replaced by whatever the ([^\[]+) part matched.
If the string was contactDetails[0] the resulting string would be contactDetails[1].
Note that this regex only replaces 0s inside square brackets. If you want to replace any number you will need something like:
([^\[]+)\[\d+\]
The \d matches any digit character. \d+ then becomes any sequence of at least one digit.
But your code will still not work, because Javascript strings are immutable. That means they can't be changed once created. The replace method returns a new string, instead of changing the original one. You should use:
s = s.replace(...)
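A minimal sketch of the immutability point, without jQuery. The helper name bumpIndex is hypothetical, and the plain string stands in for the attribute value read via $(this).attr(attribute):

```javascript
// bumpIndex is a hypothetical helper: it returns a NEW string with the
// bracketed number replaced by nel; the input string is never modified.
function bumpIndex(id, nel) {
  return id.replace(/([^\[]+)\[\d+\]/, "$1[" + nel + "]");
}

const s = "contactDetails[0]";
s.replace(/\[0\]/, "[1]");       // result discarded, s is unchanged
console.log(s);                  // contactDetails[0]

const updated = bumpIndex(s, 1); // keep the returned string instead
console.log(updated);            // contactDetails[1]
```

In the original snippet the same fix amounts to assigning the result back to s before passing it to attr.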
It looks like it replaces an array index of 0 with 1.
For example: array[0] goes to array[1]
Explanation:
([^\[]+) - This part means save everything that is not a [ into variable $1
\[0\] - This part limits Part 1 to save everything up to a [0]
"$1["+nel+"]" - Print out the contents of $1 (loaded from part 1) and add the brackets with the value of nel. (in your example nel = 1)
Square braces define a set of characters to match. [abc] will match the letters a, b or c.
By adding the caret you are now specifying that you want characters not in the set. [^abc] will match any character that is not an a, b or c.
Because square braces have special meaning in RegExps you need to escape them with a backslash if you want to match one. [ starts a character set, \[ matches a brace. (Same concept for closing braces.)
So, [^\[]+ captures 1 or more characters that are not [.
Wrapping that in parentheses "captures" the matched portion of the string (in this case "contactDetails") so that you can use it in the replacement.
$1 uses the "captured" string (i.e. "contactDetails") in the replacement string.
This regex matches "something" followed by a [0].
"something" is identified by the expression [^\[]+ which matches all characters that are not a [. You can see the () around this expression, because the match is reused with $1 later. The rest of your regex, that is \[0\], just matches the index [0]. The author had to write \[ and \] because [ and ] are special characters in regular expressions and have to be escaped.
$1 is a reference to the value of the first parenthesis pair. In your case the value of
[^\[]+
which matches one or more characters which are not a '['
The remaining part of the regexp matches string '[0]'.
So if s is 'foobar[0]' the result will be 'foobar[1]'.
[^\[] will match any character that is not [, and the '+' means one or more times. So [^\[]+ will match contactDetails. The parentheses will capture this for later use. The '\' is an escape symbol, so the final \[0\] will match [0]. The replace string will use $1, which is what was captured in the parentheses, and add the new index.
Your interpretation of the regular expression is correct. It is intended to match one or more characters which are not [, followed by a literal [0]. And used in the replace method, the match would be replaced with the match of the first grouping (that’s what $1 is replaced with) together with the sequence [ followed by the value of nel and ] (that’s how "$1["+nel+"]" is to be interpreted).
And again, a simple s.replace(/\[0\]/, "["+nel+"]") does the same, except when there is nothing in front of [0], because in that case the first regex wouldn’t find a match.
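The edge case can be seen in a short sketch (variable names and the fixed index 1 are illustrative). Both patterns give the same result when something precedes [0], but only the simpler one matches a bare "[0]":

```javascript
const withGroup = /([^\[]+)\[0\]/; // needs at least one non-"[" character first
const simple = /\[0\]/;            // matches "[0]" anywhere

const a = "contactDetails[0]".replace(withGroup, "$1[1]");
const b = "contactDetails[0]".replace(simple, "[1]");
const c = "[0]".replace(withGroup, "$1[1]");
const d = "[0]".replace(simple, "[1]");

console.log(a); // contactDetails[1]
console.log(b); // contactDetails[1]
console.log(c); // [0]  (no match, string returned as-is)
console.log(d); // [1]
```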
