How to split a string and keep the separators in JavaScript?

I have a string like "1 + 2 - 3 + 10".
I want to split it into "1", "+2", "-3", "+10".
Here is my code:
var expression = "1 + 2 - 3 + 10";
expression = expression.replace(/\s+/g, '');
let fields = expression.split(/([+-]\d+)/g);
console.log(fields);
But the result is
["1", "+2", "", "-3", "", "+10", ""]
How can I get the result ["1", "+2", "-3", "+10"]?

Your regular expression contains a capturing group
/([+-]\d+)/
 ^       ^ group
and split() includes the text captured by a group in the result array.
As a result, each separator contributes two items: the part before the separator and the captured separator itself.
"1" first part
"+2" group used as the separator for splitting, included in the result
"" second part, empty because the next separator follows immediately
"-3" second separator/group
"" third part, empty for the same reason
"+10" third separator
"" remaining part between the last separator and the end of the string
Instead, you could split on a positive lookahead for an operator.
const
string = '1 + 2 - 3 + 10',
result = string.replace(/\s+/g, '').split(/(?=[+-])/);
console.log(result);
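The two ways of avoiding the empty strings can be put side by side in a minimal sketch. The helper names splitTerms and splitTermsFiltered are made up for illustration, and the input is assumed to contain only integer terms joined by + or -.

```javascript
// Hypothetical helper: split before each sign without consuming it.
function splitTerms(expression) {
  const compact = expression.replace(/\s+/g, '');
  // (?=[+-]) is zero-width, so the sign stays attached to the term after it.
  return compact.split(/(?=[+-])/);
}

// Hypothetical alternative: keep the capturing-group split from the
// question and drop the empty strings it produces between separators.
function splitTermsFiltered(expression) {
  return expression
    .replace(/\s+/g, '')
    .split(/([+-]\d+)/)
    .filter(part => part !== '');
}

console.log(splitTerms('1 + 2 - 3 + 10'));         // ["1", "+2", "-3", "+10"]
console.log(splitTermsFiltered('1 + 2 - 3 + 10')); // ["1", "+2", "-3", "+10"]
```

The lookahead version needs no clean-up step, which is why it is usually the shorter of the two.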

I would handle this by first stripping all whitespace, then using match() with the regex pattern [/*+-]?\d+, which matches each run of digits together with an optional leading operator (absent for the first term).
var input = "1 + 2 - 3 + 10";
var matches = input.replace(/\s+/g, '').match(/[/*+-]?\d+/g);
console.log(matches);

Related

JAVASCRIPT Problem with regex in the .split() method

I have a string made of numbers and mathematical operators, such as "1 + 1 *1", which is the text content of the numbers appended to the screen div. I want to split it into an array using the mathematical operators (+, -, and so on) as separators. The split works except when the "-" sign is present: for the string "1 + 1 * 1 -1" the result is the array ["1", "1", "1-1"], while it should be ["1", "1", "1", "1"].
Thanks, everyone, in advance.
let regex = /[+ | - | * | / ]/
let Arrays
Arrays = screen.textContent.split(regex);
You seem to be confusing alternatives with character sets.
Put the operators inside a character set, and optional spaces around it.
You need to escape - because it's used to separate the ends of a character range (unless you put it at the beginning or end of the character set).
let regex = /\s*[+\-*/]\s*/;
let text = '1 + 1 * 1 -1';
console.log(text.split(regex));
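To see why the pattern from the question never splits on "-", it may help to test single characters against it. This is only an illustrative sketch; the variable names questionSet and fixedSet are made up here.

```javascript
// The pattern from the question. Everything between [ and ] is a set of
// single characters, so the spaces and the | are literal members of the set.
const questionSet = /[+ | - | * | / ]/;

// " - " inside a set is read as a range from space to space,
// so the hyphen itself is not a member.
console.log(questionSet.test('-')); // false
console.log(questionSet.test('|')); // true
console.log(questionSet.test(' ')); // true

// Escaping the hyphen (or placing it first or last) makes it a literal.
const fixedSet = /\s*[+\-*/]\s*/;
console.log('1 + 1 * 1 -1'.split(fixedSet)); // ["1", "1", "1", "1"]
```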
UPDATE
Splitting the string at +, -, *, /
let screentextContent = "1 + 1 * 1 -1"
let regex = /[+\-*/]/
let Arrays
Arrays = screentextContent.split(regex);
console.log(Arrays)
Note that whitespace before or after each 1 is preserved in the resulting items.
const str = "1 + 1 * 1 - 1";
let regex = /[+\-*/]/ // no | needed inside a character set; it would match a literal "|"
let arr
arr = str.split(regex).map(itm => itm.trim());
console.log(arr);

JavaScript: Splitting multiple space character string with a space character returns an array with 2 blank characters

I am trying to split a string by space character (\s) in JavaScript. But when I provide a string having multiple space characters, it returns an array with 2 blank characters. Below is my code:
let s = " ";
console.log(s.split(/\s+/));
This is the output:
(2) ["", ""]
Can anybody please explain what is happening here?
\s+ does not split the string at each space character but at each run of one or more whitespace characters. In other words, every sequence of spaces is treated as a single separator. For example:
let s = "asd fgh jkl";
console.log(s.split(/\s+/));
>> (3) ["asd", "fgh", "jkl"]
When you split, even an empty string returns an array with a single empty element, as long as the needle cannot match the empty string.
let s = "";
console.log(s.split(/\s+/));
>> (1) [""]
let s = "";
console.log(s.split(/a/));
>> (1) [""]
let s = "";
console.log(s.split(" "));
>> (1) [""]
If the needle occurs at the start or at the end of the string, the string is still split there and an empty element is pushed for the missing side. This happens with any kind of character (not just whitespace). Check these examples:
let s = " a ";
console.log(s.split(/\s+/));
>> (3) ["", "a", ""]
let s = "aaaaaaaaaa";
console.log(s.split(/a+/));
>> (2) ["", ""]
let s = "a a";
console.log(s.split(/a+/));
>> (3) ["", " ", ""]
Returning two empty elements from a non-empty string consisting only of whitespace is just a consequence of that, since the whitespace sequence both starts and ends the string:
let s = " ";
console.log(s.split(" "));
>> (2) ["", ""]
let s = " ";
console.log(s.split(/\s+/));
>> (2) ["", ""]
If you want to get an empty array instead, consider filtering the empty elements out of the resulting array:
let s = " ";
console.log(s.split(/\s+/).filter(b => b != ''));
>> []
or simply
let s = " ";
console.log(s.split(/\s+/).filter(b => b));
>> []
split() itself returns an empty array only if the string is empty and the needle can match the empty string:
let s = "";
console.log(s.split(/()/));
>> []
let s = "";
console.log(s.split(""));
>> []
Your regex \s+ matches one or more whitespace characters in a row and splits the string wherever it finds a match. Your string consists of one or more space characters, so split() removes those characters and divides the string into two halves: the part of the string before the match and the part of the string after the match. In your case, both of those parts have no characters in them. The split function doesn't care, however; it found the space characters it was looking for, so it did its job and split the string.
The regular expression you're using, \s+, matches any non-zero number of whitespace characters, and the entirety of the string assigned to the variable s matches that regular expression. Thus the whole of s acts as the separator, splitting what lies to its left, "", from what lies to its right, "".
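If the goal is simply a list of words, one way to sidestep the leading and trailing empty strings is to trim first and treat the all-whitespace case separately. The following is a minimal sketch; the helper name words is made up for illustration.

```javascript
// Hypothetical helper: returns the whitespace-separated words of a string,
// and an empty array when the string is empty or contains only whitespace.
function words(text) {
  const trimmed = text.trim();
  // "".split(/\s+/) would give [""], so handle the empty case explicitly.
  return trimmed === '' ? [] : trimmed.split(/\s+/);
}

console.log(words('   '));             // []
console.log(words('  asd  fgh jkl ')); // ["asd", "fgh", "jkl"]
console.log('   '.split(/\s+/));       // ["", ""] (for comparison)
```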

Split string to CamelCase words and also uppercase acronyms

Given a string containing CamelCase words and also uppercase acronyms, e.g. 'ManualABCTask':
How can it be split into a string with a space between all words and acronyms in a less wordy way?
I had the following process:
let initial = 'ManualABCTask'
// Split on a single uppercase letter followed by any number of lowercase letters:
.split(/([A-Z][a-z]*)/g)
// The returned array includes empty string entries, e.g. ["", "Manual", "", "A", "", "B", "", "C", "", "Task", ""], so remove these:
.filter(x => x != '');
// Joining the array directly puts spaces between the single acronym letters, e.g. 'Manual A B C Task', so instead reduce and add spaces only if the array entry has more than one character:
let word = initial.reduce((prevVal, currVal) => {
  return (currVal.length == 1) ? prevVal + currVal : prevVal + ' ' + currVal + ' ';
}, '');
This does the job on the combinations it needs to e.g:
'ManualABCTask' => 'Manual ABC Task'
'ABCManualTask' => 'ABC Manual Task'
'ABCManualDEFTask' => 'ABC Manual DEF Task'
But it was a lot of code for the job, and surely this could be handled in the initial regex.
I was experimenting while writing the question and, with a tweak to the regex, got it down to one line, a big improvement! So I am posting anyway, with the solution.
My regex know-how isn't great, so this could probably still be improved.
I know next to nothing about JavaScript, but I had a go at it:
let initial = 'ManualABCTask'
initial = initial.replace(/([A-Z][a-z]+)/g, ' $1 ').trim();
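There are two alternatives: an uppercase letter followed by lowercase letters, and a run of uppercase letters that stops before a letter followed by lowercase:
const find = new RegExp(
"(" +
"[A-Z][a-z]+" + // A capitalized word: uppercase letter followed by lowercase letters
"|" +
"[A-Z]+(?![a-z])" + // An acronym: uppercase letters, not followed by a lowercase letter
")",
"g"
)
// split() keeps the captured words but also returns empty strings between them,
// so drop those before joining with spaces.
const initial = 'ManualABCTask'.split(find).filter(Boolean).join(' ')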
As mentioned in the post, I changed it to handle this in the regex:
initial = 'ManualABCTask'.split(/([A-Z]{2,99})([A-Z][a-z]*)/g).join(' ');
This groups any run of 2 to 99 consecutive uppercase characters to get the acronyms, and any single uppercase character followed by any number of lowercase characters to get the other words, then joins them with a space.
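A variant that avoids empty array entries and stray spaces altogether is to insert a space at each word boundary with replace(). This is only a sketch under stated assumptions: the function name splitCamelCase is made up, and the input is assumed to consist of ASCII letters only (digits and other characters are not treated specially).

```javascript
// Hypothetical helper: insert a space
//   1. after a lowercase letter that is followed by an uppercase letter, and
//   2. after an uppercase letter that is followed by a capitalized word
//      (uppercase + lowercase), which marks the end of an acronym.
function splitCamelCase(name) {
  // Only one of the two groups participates in each match;
  // the other expands to an empty string in the replacement.
  return name.replace(/([a-z])(?=[A-Z])|([A-Z])(?=[A-Z][a-z])/g, '$1$2 ');
}

console.log(splitCamelCase('ManualABCTask'));    // "Manual ABC Task"
console.log(splitCamelCase('ABCManualTask'));    // "ABC Manual Task"
console.log(splitCamelCase('ABCManualDEFTask')); // "ABC Manual DEF Task"
```

Because the lookaheads consume nothing, no trimming or filtering is needed afterwards.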

How to change given string to regex modified string using javascript

Example strings :
2222
333333
12345
111
123456789
12345678
Expected result:
2#222
333#333
12#345
111
123#456#789
12#345#678
That is, a '#' should be inserted before every group of three digits, counting from the end of the string (so that it becomes the 4th, 8th, 12th, etc. character from the end of the result).
I believe this can be done using replace and some other methods in JavaScript.
For validation of the output string I have written this regex:
^(\d{1,3})(#\d{3})*?$
You can use this regular expression:
/(\d)(\d{3})$/
This captures one digit (\d) in the first group and the last three digits (\d{3}) before the end of the string in the second group. You can then reference the groups in the replacement string as $1 and $2 and place the # between them. Note that it inserts only a single #, so it covers strings of up to six digits.
See example below:
const transform = str => str.replace(/(\d)(\d{3})$/, '$1#$2');
console.log(transform("2222")); // 2#222
console.log(transform("333333")); // 333#333
console.log(transform("12345")); // 12#345
console.log(transform("111")); // 111
For larger strings of size N, you could use other methods such as .match() and reverse the string like so:
const reverse = str => Array.from(str).reverse().join('');
const transform = str => {
return reverse(reverse(str).match(/(\d{1,3})/g).join('#'));
}
console.log(transform("2222")); // 2#222
console.log(transform("333333")); // 333#333
console.log(transform("12345")); // 12#345
console.log(transform("111")); // 111
console.log(transform("123456789")); // 123#456#789
console.log(transform("12345678")); // 12#345#678
var test = [
'111',
'2222',
'333333',
'12345',
'123456789',
'1234567890123456'
];
console.log(test.map(function (a) {
return a.replace(/(?=(?:\B\d{3})+$)/g, '#');
}));
You could match positions rather than digits: a positive lookahead finds every position that is followed by one or more groups of three digits up to the end of the string, and the replacement inserts a # at each such position.
(?=(?:\B\d{3})+$)
(?= Positive lookahead, what is on the right is
(?:\B\d{3})+ Repeat 1+ times not a word boundary and 3 digits
$ Assert end of string
) Close lookahead
const regex = /^\d+$/;
["2222",
"333333",
"12345",
"111",
"123456789",
"12345678"
].forEach(s => console.log(
s.replace(/(?=(?:\B\d{3})+$)/g, "#")
));
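For comparison, the same grouping can be written without a regular expression by slicing from the right. This is a minimal sketch; the function name groupDigits and its separator and size parameters are made up for illustration.

```javascript
// Hypothetical helper: group a digit string into chunks of `size`,
// counted from the right, joined by `separator`.
function groupDigits(digits, separator = '#', size = 3) {
  const groups = [];
  // Walk from the right edge, taking up to `size` characters at a time.
  for (let end = digits.length; end > 0; end -= size) {
    groups.unshift(digits.slice(Math.max(0, end - size), end));
  }
  return groups.join(separator);
}

console.log(groupDigits('2222'));      // "2#222"
console.log(groupDigits('12345'));     // "12#345"
console.log(groupDigits('111'));       // "111"
console.log(groupDigits('123456789')); // "123#456#789"
```

The loop is longer than the lookahead one-liner, but it makes the "count from the right" rule explicit.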

JS string replace only replacing every other occurence

I have the following JS:
"a a a a".replace(/(^|\s)a(\s|$)/g, '$1')
I expect the result to be '', but am instead getting 'a a'. Can anyone explain what I am doing wrong?
Clarification: what I am trying to do is remove all occurrences of 'a' that are surrounded by whitespace (i.e. a whole token).
This is because the regex /(^|\s)a(\s|$)/g consumes the character before and the character after each a.
In the string "a a a a" the regex matches:
"a " (the first a), leaving "a a a" to check; but the search no longer begins at the start of the string and there is no space left before the next a, so the second a cannot match
" a " (the third a), leaving "a", which does not match because the space before it has already been consumed
Edit:
A slightly tricky but working alternative (without regex):
var a = "a a a a";
// Handle beginning case 'a '
var startI = a.indexOf("a ");
if (startI === 0){
var off = a.charAt(startI + 2) !== "a" ? 2 : 1; // test if "a" come next to keep the space before
a = a.slice(startI + off);
}
// Handle middle case ' a '
var iOf = -1;
while ((iOf = a.indexOf(" a ")) > -1){
var off = a.charAt(iOf + 3) !== "a" ? 3 : 2; // same here
a = a.slice(0, iOf) + a.slice(iOf+off, a.length);
}
// Handle end case ' a'
var endI = a.indexOf(" a");
if (endI === a.length - 2){
a = a.slice(0, endI);
}
a; // ""
First, "a " matches.
Then the regex tries to match against the remaining "a a a": it skips the first a, and then matches " a ".
Then it tries to match against the remaining "a", which does not match.
The first match is replaced with $1, which is the empty match at the beginning of the line => ""
Then comes the "a" that didn't match => "a"
The second match is replaced with $1, which is " " => " "
Then comes the "a" that didn't match => "a"
The result is "a a".
To get your desired result you can do this:
"a a a a".replace(/(?:\s+a(?=\s))+\s+|^a\s+(?=[^a]|$|a\S)|^a|\s*a$/g, '')
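The matching steps described above can be observed directly by listing the matches of the original pattern. This is a small illustrative sketch using String.prototype.matchAll (ES2020); the variable names tokenPattern and trace are made up here.

```javascript
// The pattern from the question; matchAll requires the g flag.
const tokenPattern = /(^|\s)a(\s|$)/g;

// Record where each match starts and which text it consumes.
const trace = [...'a a a a'.matchAll(tokenPattern)]
  .map(match => ({ index: match.index, text: match[0] }));

console.log(trace);
// [ { index: 0, text: "a " }, { index: 3, text: " a " } ]

// Only two of the four tokens are matched, so two survive the replacement.
console.log('a a a a'.replace(tokenPattern, '$1')); // "a a"
```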
As others have tried to point out, the issue is that the regex consumes the surrounding spaces as part of the match. Here's a [hopefully] more straightforward explanation of why that regex doesn't work as you expect:
First let's break down the regex: it says match a space or the start of the string, followed by an 'a', followed by a space or the end of the string.
Now let's apply it to the string. I've added character indexes beneath the string to make things easier to talk about:
a a a a
0123456
The regex looks at the character at index 0 and finds an 'a' at that location, followed by a space at index 1. This is a match because it is the start of the string, followed by an 'a', followed by a space. The length of our match is 2 (the 'a' and the space), so we consume two characters and start our next search at index 2.
Character 2 ('a') is neither a space nor the start of the string, and therefore it doesn't match the start of our regular expression, so we consume that character (without replacing it) and move on to the next.
Character 3 is a space, followed by an 'a', followed by another space, which is a match for our regex. We replace it with $1 (the captured leading space), consume the length of the match (3 characters - " a ") and move on to index 6.
Character 6 ('a') is neither a space nor the start of the string, and therefore it doesn't match the start of our regular expression, so we consume that character (without replacing it) and move on to the next.
Now we're at the end of the string, so we're done.
The regex @caeth suggested (/(^|\s+)a(?=\s|$)/g) works because of the (?=...) lookahead assertion. From the MDN RegExp documentation:
Matches x only if x is followed by y. For example, /Jack(?=Sprat)/ matches "Jack" only if it is followed by "Sprat". /Jack(?=Sprat|Frost)/ matches "Jack" only if it is followed by "Sprat" or "Frost". However, neither "Sprat" nor "Frost" is part of the match results.
So, in this case, the (?=...) lookahead checks whether the following character is a space (or the end of the string) without actually consuming that character.
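As a rough sketch of the lookahead idea applied to the original problem: because the trailing space is never consumed, it remains available as the leading space of the next token. The function name removeStandaloneA is made up for illustration, and here the match is replaced with an empty string and the result trimmed, which is one possible choice rather than the only one.

```javascript
// Hypothetical helper: remove every standalone "a" token.
function removeStandaloneA(text) {
  // (^|\s) consumes the space before the token (if any);
  // (?=\s|$) only checks what follows, leaving it for the next match.
  return text.replace(/(^|\s)a(?=\s|$)/g, '').trim();
}

console.log(removeStandaloneA('a a a a'));                    // ""
console.log(removeStandaloneA('a ba sl lf a df a a df r a')); // "ba sl lf df df r"
```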
(^|\s)a(?=\s|$)
Try this. Replace with $1. See the demo:
https://regex101.com/r/gQ3kS4/3
Use this instead:
"a a a a".replace(/(^|\s*)a(\s|$)/g, '$1')
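With \s* in place of \s, all the "a" occurrences are replaced. Be aware that the pattern then also strips an "a" at the end of a longer word (for example, "ba sl" becomes "bsl").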
Greetings
Or you can just split the string up, filter it and glue it back:
"a ba sl lf a df a a df r a".split(/\s+/).filter(function (x) { return x != "a" }).join(" ")
>>> "ba sl lf df df r"
"a a a a".split(/\s+/).filter(function (x) { return x != "a" }).join(" ")
>>> ""
Or in ECMAScript 6:
"a ba sl lf a df a a df r a".split(/\s+/).filter(x => x != "a").join(" ")
>>> "ba sl lf df df r"
"a a a a".split(/\s+/).filter(x => x != "a").join(" ")
>>> ""
I assume that there are no leading or trailing spaces. You can change the filter to x && x != 'a' if you want to drop that assumption.
