How do I split a string with multiple separators in JavaScript?
I'm trying to split on both commas and spaces, but AFAIK JavaScript's split() function only supports one separator.
Pass in a regexp as the parameter:
js> "Hello awesome, world!".split(/[\s,]+/)
Hello,awesome,world!
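One thing to watch with this pattern: if the string starts or ends with a separator, the result contains empty strings at the ends. A minimal sketch (the sample string and variable names are made up for illustration):

```javascript
// Leading or trailing separators leave empty strings at the ends of the result.
const raw = ", Hello awesome, world! ";
const parts = raw.split(/[\s,]+/);
// parts is ["", "Hello", "awesome", "world!", ""]

// filter(Boolean) removes every falsy element, which here means the empty strings.
const words = parts.filter(Boolean);
// words is ["Hello", "awesome", "world!"]
```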
Edited to add:
You can get the last element by selecting the length of the array minus 1:
>>> bits = "Hello awesome, world!".split(/[\s,]+/)
["Hello", "awesome", "world!"]
>>> bit = bits[bits.length - 1]
"world!"
... and if the pattern doesn't match:
>>> bits = "Hello awesome, world!".split(/foo/)
["Hello awesome, world!"]
>>> bits[bits.length - 1]
"Hello awesome, world!"
You can pass a regex into JavaScript's split() method. For example:
"1,2 3".split(/,| /)
["1", "2", "3"]
Or, if you want to allow multiple separators together to act as one only:
"1, 2, , 3".split(/(?:,| )+/)
["1", "2", "3"]
You have to use the non-capturing (?:) parenthesis, because
otherwise it gets spliced back into the result. Or you can be smart
like Aaron and use a character class.
Examples tested in Safari and Firefox.
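To see what "spliced back into the result" means, here is a small sketch comparing a capturing group, a non-capturing group and a character class (variable names are only for illustration):

```javascript
// With a capturing group, the matched separators are included in the result.
const withCapture = "1,2 3".split(/(,| )/);
// withCapture is ["1", ",", "2", " ", "3"]

// With a non-capturing group, only the pieces between separators are returned.
const withoutCapture = "1,2 3".split(/(?:,| )/);
// withoutCapture is ["1", "2", "3"]

// The character-class form collapses runs of separators in the same way.
const withClass = "1, 2, , 3".split(/[, ]+/);
// withClass is ["1", "2", "3"]
```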
Another simple but effective method is to use split + join repeatedly.
"a=b,c:d".split('=').join(',').split(':').join(',').split(',')
Essentially, a split followed by a join acts like a global replace: this replaces each separator with a comma, and once all are replaced it does a final split on the comma.
The result of the above expression is:
['a', 'b', 'c', 'd']
Expanding on this you could also place it in a function:
function splitMulti(str, tokens){
    var tempChar = tokens[0]; // We can use the first token as a temporary join character
    for(var i = 1; i < tokens.length; i++){
        str = str.split(tokens[i]).join(tempChar);
    }
    str = str.split(tempChar);
    return str;
}
Usage:
splitMulti('a=b,c:d', ['=', ',', ':']) // ["a", "b", "c", "d"]
If you use this functionality a lot it might even be worth considering wrapping String.prototype.split for convenience (I think my function is fairly safe - the only consideration is the additional overhead of the conditionals (minor) and the fact that it lacks an implementation of the limit argument if an array is passed).
Be sure to include the splitMulti function if using this approach, as the code below simply wraps it :). Also worth noting that some people frown on extending built-ins (many people do it wrong, and conflicts can occur), so if in doubt speak to someone more senior before using this or ask on SO :)
var splitOrig = String.prototype.split; // Maintain a reference to inbuilt fn
String.prototype.split = function (){
    // Array.isArray also copes with split() being called with no arguments
    if(Array.isArray(arguments[0]) && arguments[0].length > 0){
        return splitMulti(this, arguments[0]); // Call splitMulti
    }
    return splitOrig.apply(this, arguments); // Call original split maintaining context
};
Usage:
var a = "a=b,c:d";
a.split(['=', ',', ':']); // ["a", "b", "c", "d"]
// Test to check that the built-in split still works (although our wrapper wouldn't work if it didn't as it depends on it :P)
a.split('='); // ["a", "b,c:d"]
Enjoy!
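If the separators are only known at run time, the same idea can also be sketched with a regex built from the token list. Note that escapeRegExp and splitOnAny are hypothetical helper names, not built-ins, and this is an alternative to the split + join approach above rather than a drop-in replacement:

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: split on any of the given separator strings.
function splitOnAny(str, tokens) {
  // Longer tokens first, so "--" is tried before "-" when both are listed.
  const pattern = tokens
    .slice()
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  return str.split(new RegExp(pattern));
}

const result = splitOnAny("a=b,c:d", ["=", ",", ":"]);
// result is ["a", "b", "c", "d"]
const dotted = splitOnAny("1.23--4", [".", "--"]);
// dotted is ["1", "23", "4"]
```

Escaping matters because separators such as "." or "|" would otherwise be treated as regex syntax.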
Let's keep it simple: adding a "+" after a character class "[ ]" in your regex means "1 or more".
This means "+" and "{1,}" are the same.
var words = text.split(/[ .:;?!~,`"&|()<>{}\[\]\r\n/\\]+/); // note ' and - are kept
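A quick sketch of the difference the quantifier makes (the sample text is an assumption, chosen to have adjacent separators):

```javascript
const text = "one, two;  three";

// "+" and "{1,}" both treat a run of separators as a single split point.
const plus = text.split(/[ ,;]+/);
const braces = text.split(/[ ,;]{1,}/);
// plus and braces are both ["one", "two", "three"]

// Without a quantifier, every separator character splits on its own,
// which leaves empty strings between adjacent separators.
const single = text.split(/[ ,;]/);
// single is ["one", "", "two", "", "", "three"]
```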
Tricky method:
var s = "dasdnk asd, (naks) :d skldma";
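// each replace() with a string pattern changes only the first match;
// filter(Boolean) drops the empty strings left by adjacent spaces
var a = s.replace('(',' ').replace(')',' ').replace(',',' ').split(' ').filter(Boolean);
console.log(a);//["dasdnk", "asd", "naks", ":d", "skldma"]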
I'm surprised no one has suggested it yet, but my hack-ey (and crazy fast) solution was to just chain several 'replace' calls before splitting by the same character.
For example, to split on a, b, c, d, and e:
let str = 'afgbfgcfgdfgefg'
let array = str.replace('a','d').replace('b','d').replace('c','d').replace('e','d').split('d')
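This can be conveniently generalized for an array of splitters as follows:
function splitByMany( manyArgs, string ) {
  let args = manyArgs.slice() // work on a copy so the caller's array is not emptied
  while (args.length > 1) {
    let arg = args.pop()
    // split + join replaces every occurrence; replace() with a string pattern only replaces the first
    string = string.split(arg).join(args[0])
  }
  return string.split(args[0])
}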
So, in your case, you could then call
let array = splitByMany([" ", ","], 'My long string containing commas, and spaces, and more commas');
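A caveat with chained replace calls: when the first argument to replace is a string, only the first occurrence is replaced. A small sketch of the difference (the sample input is made up):

```javascript
const input = "a,b,c";

// A string pattern replaces only the first occurrence.
const first = input.replace(",", " ");
// first is "a b,c"

// A regex with the g flag replaces every occurrence.
const all = input.replace(/,/g, " ");
// all is "a b c"

// split + join has the same effect without a regex.
const viaSplit = input.split(",").join(" ");
// viaSplit is "a b c"
```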
For those of you who want more customization in their splitting function, I wrote a recursive algorithm that splits a given string with a list of characters to split on. I wrote this before I saw the above post. I hope it helps some frustrated programmers.
splitString = function(string, splitters) {
var list = [string];
for(var i=0, len=splitters.length; i<len; i++) {
traverseList(list, splitters[i], 0);
}
return flatten(list);
}
traverseList = function(list, splitter, index) {
if(list[index]) {
if((list.constructor !== String) && (list[index].constructor === String))
(list[index] != list[index].split(splitter)) ? list[index] = list[index].split(splitter) : null;
(list[index].constructor === Array) ? traverseList(list[index], splitter, 0) : null;
(list.constructor === Array) ? traverseList(list, splitter, index+1) : null;
}
}
flatten = function(arr) {
return arr.reduce(function(acc, val) {
return acc.concat(val.constructor === Array ? flatten(val) : val);
},[]);
}
var stringToSplit = "people and_other/things";
var splitList = [" ", "_", "/"];
splitString(stringToSplit, splitList);
Example above returns: ["people", "and", "other", "things"]
Note: flatten function was taken from Rosetta Code
You could just lump all the characters you want to use as separators either singularly or collectively into a regular expression and pass them to the split function. For instance you could write:
console.log( "dasdnk asd, (naks) :d skldma".split(/[ \(,\)]+/) );
And the output will be:
["dasdnk", "asd", "naks", ":d", "skldma"]
Here are some regex cases that may help:
\W matches any character that is not a word character [a-zA-Z0-9_]. Example:
("Hello World,I-am code").split(/\W+/); // would return [ 'Hello', 'World', 'I', 'am', 'code' ]
\s+ matches one or more whitespace characters
\d matches a digit
If you want to split on only certain characters, say , and -, you can use str.split(/[,-]+/), and so on.
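A few of these classes in action, as a rough sketch with made-up inputs:

```javascript
// \W+ splits on runs of non-word characters.
const byNonWord = "Hello World,I-am code".split(/\W+/);
// byNonWord is ["Hello", "World", "I", "am", "code"]

// [,-]+ splits only on commas and dashes (a trailing "-" in a class is literal).
const byCommaOrDash = "2021-03-04,05".split(/[,-]+/);
// byCommaOrDash is ["2021", "03", "04", "05"]

// \s+ splits on any run of whitespace, including tabs and newlines.
const byWhitespace = "a \t b\nc".split(/\s+/);
// byWhitespace is ["a", "b", "c"]
```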
My refactor of @Brian's answer
var string = 'and this is some kind of information and another text and simple and some egample or red or text';
var separators = ['and', 'or'];
function splitMulti(str, separators){
var tempChar = 't3mp'; // multi-character placeholder, unlikely to occur in the text itself
//split by regex e.g. \b(or|and)\b
var re = new RegExp('\\b(' + separators.join('|') + ')\\b' , "g");
str = str.replace(re, tempChar).split(tempChar);
// trim & remove empty
return str.map(el => el.trim()).filter(el => el.length > 0);
}
console.log(splitMulti(string, separators))
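As a possible simplification, split accepts the same word-boundary regex directly, so the temporary placeholder may not be needed. A sketch with a made-up sentence, using a non-capturing group so the separators are not kept in the result:

```javascript
const sentence = "red and green or blue and android";

const pieces = sentence
  .split(/\b(?:and|or)\b/)            // split on the whole words "and" / "or" only
  .map(part => part.trim())           // remove the spaces around each piece
  .filter(part => part.length > 0);   // drop empty pieces

// pieces is ["red", "green", "blue", "android"]
// "android" is untouched because \b requires a word boundary after "and".
```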
Here is a new way to achieve the same in ES6:
function SplitByString(source, splitBy) {
var splitter = splitBy.split('');
splitter.push([source]); //Push initial value
return splitter.reduceRight(function(accumulator, curValue) {
var k = [];
accumulator.forEach(v => k = [...k, ...v.split(curValue)]);
return k;
});
}
var source = "abc,def#hijk*lmn,opq#rst*uvw,xyz";
var splitBy = ",*#";
console.log(SplitByString(source, splitBy));
Please note that this function:
Involves no regex
Returns the split values in the same order as they appear in the source
The result of the above code would be:
["abc", "def", "hijk", "lmn", "opq", "rst", "uvw", "xyz"]
For example, if you need to combine replace and split on the string 07:05:45PM:
var time = "07:05:45PM";
var hour = time.replace("PM", "").split(":");
Result
[ '07', '05', '45' ]
I will provide a classic implementation of such a function. The code works in almost all versions of JavaScript and is reasonably efficient.
It doesn't use regex, which is hard to maintain
It doesn't use many new features of JavaScript (only let and Array.prototype.includes)
It doesn't use multiple .split() and .join() invocations, which require more memory
Just pure code:
var text = "Create a function, that will return an array (of string), with the words inside the text";
println(getWords(text));
function getWords(text)
{
let startWord = -1;
let ar = [];
for(let i = 0; i <= text.length; i++)
{
let c = i < text.length ? text[i] : " ";
if (!isSeparator(c) && startWord < 0)
{
startWord = i;
}
if (isSeparator(c) && startWord >= 0)
{
let word = text.substring(startWord, i);
ar.push(word);
startWord = -1;
}
}
return ar;
}
function isSeparator(c)
{
var separators = [" ", "\t", "\n", "\r", ",", ";", ".", "!", "?", "(", ")"];
return separators.includes(c);
}
You can see the code running in playground:
https://codeguppy.com/code.html?IJI0E4OGnkyTZnoszAzf
Splitting URL by .com/ or .net/
url.split(/\.com\/|\.net\//)
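For instance, with a made-up URL (example.com is just a placeholder domain):

```javascript
const url = "https://example.com/docs/page";

// The dots and slashes are escaped so they match literally.
const urlParts = url.split(/\.com\/|\.net\//);
// urlParts is ["https://example", "docs/page"]
```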
a = "a=b,c:d"
array = ['=',',',':'];
for(i=0; i< array.length; i++){ a= a.split(array[i]).join(); }
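This leaves a as the string "a,b,c,d": join() with no argument joins with a comma, so each special character is replaced by a comma. A final a.split(",") turns it into an array.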
I ran into this question while looking for a replacement for the C# string.Split() function, which splits a string using the characters in its argument.
In JavaScript you can do the same using map and reduce to iterate over the splitting characters and the intermediate values:
let splitters = [",", ":", ";"]; // or ",:;".split("");
let start= "a,b;c:d";
let values = splitters.reduce((old, c) => old.map(v => v.split(c)).flat(), [start]);
// values is ["a", "b", "c", "d"]
flat() is used to flatten the intermediate results so each iteration works on a list of strings without nested arrays. Each iteration applies split to all of the values in old and then returns the list of intermediate results to be split by the next value in splitters. reduce() is initialized with an array containing the initial string value.
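If flatMap is available (it arrived alongside flat in ES2019), the map and flat steps can be written as one call. A sketch, with variable names chosen for illustration:

```javascript
const separators = [",", ":", ";"];
const initial = "a,b;c:d";

// Each pass splits every piece on one separator and flattens the result by one level.
const values = separators.reduce(
  (pieces, sep) => pieces.flatMap(piece => piece.split(sep)),
  [initial]
);
// values is ["a", "b", "c", "d"]

// Adjacent separators still produce an empty string between them.
const withGap = [",", ";"].reduce(
  (pieces, sep) => pieces.flatMap(piece => piece.split(sep)),
  ["a,;b"]
);
// withGap is ["a", "", "b"]
```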
I find that one of the main reasons I need this is to split file paths on both / and \. It's a bit of a tricky regex so I'll post it here for reference:
var splitFilePath = filePath.split(/[\/\\]/);
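For example, with two made-up paths (note that a backslash has to be doubled inside a JavaScript string literal):

```javascript
const windowsPath = "C:\\Users\\me\\notes.txt"; // the text C:\Users\me\notes.txt
const mixedPath = "src/lib\\util/index.js";     // the text src/lib\util/index.js

const windowsParts = windowsPath.split(/[\/\\]/);
// windowsParts is ["C:", "Users", "me", "notes.txt"]

const mixedParts = mixedPath.split(/[\/\\]/);
// mixedParts is ["src", "lib", "util", "index.js"]

// The last element is the file name.
const fileName = mixedParts[mixedParts.length - 1];
// fileName is "index.js"
```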
I think it's easier if you specify what you want to keep, instead of what you want to remove.
For example, if you want only English words, you can use something like this:
text.match(/[a-z'\-]+/gi);
Examples (run snippet):
var R=[/[a-z'\-]+/gi,/[a-z'\-\s]+/gi];
var s=document.getElementById('s');
for(var i=0;i<R.length;i++)
{
var o=document.createElement('option');
o.innerText=R[i]+'';
o.value=i;
s.appendChild(o);
}
var t=document.getElementById('t');
var r=document.getElementById('r');
s.onchange=function()
{
r.innerHTML='';
var x=s.value;
if((x>=0)&&(x<R.length))
x=t.value.match(R[x]);
for(i=0;i<x.length;i++)
{
var li=document.createElement('li');
li.innerText=x[i];
r.appendChild(li);
}
}
<textarea id="t" style="width:70%;height:12em">even, test; spider-man
But saying o'er what I have said before:
My child is yet a stranger in the world;
She hath not seen the change of fourteen years,
Let two more summers wither in their pride,
Ere we may think her ripe to be a bride.
—Shakespeare, William. The Tragedy of Romeo and Juliet</textarea>
<p><select id="s">
<option selected>Select a regular expression</option>
<!-- option value="1">/[a-z'\-]+/gi</option>
<option value="2">/[a-z'\-\s]+/gi</option -->
</select></p>
<ol id="r" style="display:block;width:auto;border:1px inner;overflow:scroll;height:8em;max-height:10em;"></ol>
</div>
I don't know how RegEx performs, but here is an alternative to RegEx that leverages the native Set (a hash set) and works in O( max(str.length, delimiter.length) ) time instead. Note that it only handles single-character delimiters:
var multiSplit = function(str,delimiter){
if (!(delimiter instanceof Array))
return str.split(delimiter);
if (!delimiter || delimiter.length == 0)
return [str];
var hashSet = new Set(delimiter);
if (hashSet.has(""))
return str.split("");
var lastIndex = 0;
var result = [];
for(var i = 0;i<str.length;i++){
if (hashSet.has(str[i])){
result.push(str.substring(lastIndex,i));
lastIndex = i+1;
}
}
result.push(str.substring(lastIndex));
return result;
}
multiSplit('1,2,3.4.5.6 7 8 9',[',','.',' ']);
// Output: ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
multiSplit('1,2,3.4.5.6 7 8 9',' ');
// Output: ["1,2,3.4.5.6", "7", "8", "9"]
I solved this with reduce and filter. It might not be the most readable solution, or the fastest, and in real life I would probably use Aaron's answer here, but it was fun to write.
[' ','_','-','.',',',':','#'].reduce(
(segs, sep) => segs.reduce(
(out, seg) => out.concat(seg.split(sep)), []),
['E-mail Address: user#domain.com, Phone Number: +1-800-555-0011']
).filter(x => x)
Or as a function:
function msplit(str, seps) {
return seps.reduce((segs, sep) => segs.reduce(
(out, seg) => out.concat(seg.split(sep)), []
), [str]).filter(x => x);
}
This will output:
['E','mail','Address','user','domain','com','Phone','Number','+1','800','555','0011']
Without the filter at the end you would get empty strings in the array where two different separators are next to each other.
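To make that last point concrete, here is a sketch with the filter removed (msplitRaw is a hypothetical name for that variant):

```javascript
// Hypothetical variant of msplit without the final filter.
function msplitRaw(str, seps) {
  return seps.reduce((segs, sep) => segs.reduce(
    (out, seg) => out.concat(seg.split(sep)), []
  ), [str]);
}

// The comma and the space are adjacent, so an empty string appears between them.
const unfiltered = msplitRaw("com, Phone", [" ", ","]);
// unfiltered is ["com", "", "Phone"]

// filter(x => x) keeps only truthy values, which removes the empty string.
const cleaned = unfiltered.filter(x => x);
// cleaned is ["com", "Phone"]
```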
Not the best way, but it works for splitting with multiple, different separators/delimiters:
html
<button onclick="myFunction()">Split with Multiple and Different separators/delimiters</button>
<p id="demo"></p>
javascript
<script>
function myFunction() {
var str = "How : are | you doing : today?";
var res = str.split(' | ');
var str2 = '';
var i;
for (i = 0; i < res.length; i++) {
str2 += res[i];
if (i != res.length-1) {
str2 += " : "; // rejoin with the second separator so the final split catches every boundary
}
}
var res2 = str2.split(' : ');
//you can add countless options (with or without space)
document.getElementById("demo").innerHTML = res2;
}
</script>
Starting from @stephen-sweriduk's solution (the most interesting one to me!), I have slightly modified it to make it more generic and reusable:
/**
* Adapted from: http://stackoverflow.com/questions/650022/how-do-i-split-a-string-with-multiple-separators-in-javascript
*/
var StringUtils = {
/**
* Flatten a list of strings
* http://rosettacode.org/wiki/Flatten_a_list
*/
flatten : function(arr) {
var self=this;
return arr.reduce(function(acc, val) {
return acc.concat(val.constructor === Array ? self.flatten(val) : val);
},[]);
},
/**
* Recursively traverse a list and apply a function to each item
* @param list array
* @param expression Expression to use in func
* @param func function of (item,expression) to apply expression to item
*
*/
traverseListFunc : function(list, expression, index, func) {
var self=this;
if(list[index]) {
if((list.constructor !== String) && (list[index].constructor === String))
(list[index] != func(list[index], expression)) ? list[index] = func(list[index], expression) : null;
(list[index].constructor === Array) ? self.traverseListFunc(list[index], expression, 0, func) : null;
(list.constructor === Array) ? self.traverseListFunc(list, expression, index+1, func) : null;
}
},
/**
* Recursively map function to string
* @param string
* @param expressions Expressions to apply to func
* @param func function of (item, expressions[i])
*/
mapFuncToString : function(string, expressions, func) {
var self=this;
var list = [string];
for(var i=0, len=expressions.length; i<len; i++) {
self.traverseListFunc(list, expressions[i], 0, func);
}
return self.flatten(list);
},
/**
* Split a string
* @param splitters Array of characters to apply the split
*/
splitString : function(string, splitters) {
return this.mapFuncToString(string, splitters, function(item, expression) {
return item.split(expression);
})
},
}
and then
var stringToSplit = "people and_other/things";
var splitList = [" ", "_", "/"];
var splittedString=StringUtils.splitString(stringToSplit, splitList);
console.log(splitList, stringToSplit, splittedString);
which gives the same result as the original:
[ ' ', '_', '/' ] 'people and_other/things' [ 'people', 'and', 'other', 'things' ]
An easy way to do this is to process each character of the string with each delimiter and build an array of the splits:
splix = function ()
{
u = [].slice.call(arguments); v = u.slice(1); u = u[0]; w = [u]; x = 0;
for (i = 0; i < u.length; ++i)
{
for (j = 0; j < v.length; ++j)
{
if (u.slice(i, i + v[j].length) == v[j])
{
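// keep the whole remainder, so a delimiter that occurs more than once is not lost
y = w[x].split(v[j]); w[x] = y[0]; w[++x] = y.slice(1).join(v[j]);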
};
};
};
return w;
};
console.logg = function ()
{
document.body.innerHTML += "<br>" + [].slice.call(arguments).join();
}
splix = function() {
u = [].slice.call(arguments);
v = u.slice(1);
u = u[0];
w = [u];
x = 0;
console.logg("Processing: <code>" + JSON.stringify(w) + "</code>");
for (i = 0; i < u.length; ++i) {
for (j = 0; j < v.length; ++j) {
console.logg("Processing: <code>[\x22" + u.slice(i, i + v[j].length) + "\x22, \x22" + v[j] + "\x22]</code>");
if (u.slice(i, i + v[j].length) == v[j]) {
y = w[x].split(v[j]);
w[x] = y[0];
w[++x] = y.slice(1).join(v[j]); // keep the whole remainder
console.logg("Currently processed: " + JSON.stringify(w) + "\n");
};
};
};
console.logg("Return: <code>" + JSON.stringify(w) + "</code>");
};
setTimeout(function() {
console.clear();
splix("1.23--4", ".", "--");
}, 250);
@import url("http://fonts.googleapis.com/css?family=Roboto");
body {font: 20px Roboto;}
Usage: splix(string, delimiters...)
Example: splix("1.23--4", ".", "--")
Returns: ["1", "23", "4"]
Check out my simple library on Github
If you really do not want to visit or interact with the repo, here is the working code:
/**
*
* @param {string} input The string input to be split
* @param {boolean} includeTokensInOutput If true, the tokens are retained in the split output.
* @param {string[]} tokens The tokens to be employed in splitting the original string.
* @returns {Scanner}
*/
function Scanner(input, includeTokensInOutput, tokens) {
this.input = input;
this.includeTokensInOutput = includeTokensInOutput;
this.tokens = tokens;
}
Scanner.prototype.scan = function () {
var inp = this.input;
var parse = [];
this.tokens.sort(function (a, b) {
return b.length - a.length; // Descending by length: longer tokens are tried first
});
for (var i = 0; i < inp.length; i++) {
for (var j = 0; j < this.tokens.length; j++) {
var token = this.tokens[j];
var len = token.length;
if (len > 0 && i + len <= inp.length) {
var portion = inp.substring(i, i + len);
if (portion === token) {
if (i !== 0) {//avoid empty spaces
parse[parse.length] = inp.substring(0, i);
}
if (this.includeTokensInOutput) {
parse[parse.length] = token;
}
inp = inp.substring(i + len);
i = -1;
break;
}
}
}
}
if (inp.length > 0) {
parse[parse.length] = inp;
}
return parse;
};
The usage is very straightforward:
var tokens = new Scanner("ABC+DE-GHIJK+LMNOP", false , new Array('+','-')).scan();
console.log(tokens);
Gives:
['ABC', 'DE', 'GHIJK', 'LMNOP']
And if you wish to include the splitting tokens (+ and -) in the output, set the false to true and voila! it still works.
The usage would now be:
var tokens = new Scanner("ABC+DE-GHIJK+LMNOP", true , new Array('+','-')).scan();
and
console.log(tokens);
would give:
['ABC', '+', 'DE', '-', 'GHIJK', '+', 'LMNOP']
ENJOY!
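For comparison, a plain regex split can behave similarly when the tokens are simple: a capturing group in the pattern keeps the separators in the output. This is a sketch of an alternative, not part of the Scanner code above:

```javascript
const expression = "ABC+DE-GHIJK+LMNOP";

// Character class only: the separators are dropped.
const withoutTokens = expression.split(/[+-]/);
// withoutTokens is ["ABC", "DE", "GHIJK", "LMNOP"]

// Wrapping the class in a capturing group keeps each separator as its own element.
const withTokens = expression.split(/([+-])/);
// withTokens is ["ABC", "+", "DE", "-", "GHIJK", "+", "LMNOP"]
```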
I use regexp:
str = 'Write a program that extracts from a given text all palindromes, e.g. "ABBA", "lamal", "exe".';
var strNew = str.match(/\w+/g);
// Output: ["Write", "a", "program", "that", "extracts", "from", "a", "given", "text", "all", "palindromes", "e", "g", "ABBA", "lamal", "exe"]
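Keep in mind that match returns null rather than an empty array when nothing matches, so a fallback can be handy. A sketch (extractWords is a hypothetical helper name):

```javascript
// Hypothetical helper: always returns an array, even when no word is found.
function extractWords(text) {
  return text.match(/\w+/g) || [];
}

const found = extractWords('e.g. "ABBA"');
// found is ["e", "g", "ABBA"]

const none = extractWords("?!");
// none is []
```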
Related
How do I split a string with multiple separators in JavaScript?
I'm trying to split on both commas and spaces, but AFAIK JavaScript's split() function only supports one separator.
Pass in a regexp as the parameter:
js> "Hello awesome, world!".split(/[\s,]+/)
Hello,awesome,world!
Edited to add:
You can get the last element by selecting the length of the array minus 1:
>>> bits = "Hello awesome, world!".split(/[\s,]+/)
["Hello", "awesome", "world!"]
>>> bit = bits[bits.length - 1]
"world!"
... and if the pattern doesn't match:
>>> bits = "Hello awesome, world!".split(/foo/)
["Hello awesome, world!"]
>>> bits[bits.length - 1]
"Hello awesome, world!"
You can pass a regex into JavaScript's split() method. For example:
"1,2 3".split(/,| /)
["1", "2", "3"]
Or, if you want to allow multiple separators together to act as one only:
"1, 2, , 3".split(/(?:,| )+/)
["1", "2", "3"]
You have to use the non-capturing (?:) parenthesis, because
otherwise it gets spliced back into the result. Or you can be smart
like Aaron and use a character class.
Examples tested in Safari and Firefox.
Another simple but effective method is to use split + join repeatedly.
"a=b,c:d".split('=').join(',').split(':').join(',').split(',')
Essentially doing a split followed by a join is like a global replace so this replaces each separator with a comma then once all are replaced it does a final split on comma
The result of the above expression is:
['a', 'b', 'c', 'd']
Expanding on this you could also place it in a function:
function splitMulti(str, tokens){
var tempChar = tokens[0]; // We can use the first token as a temporary join character
for(var i = 1; i < tokens.length; i++){
str = str.split(tokens[i]).join(tempChar);
}
str = str.split(tempChar);
return str;
}
Usage:
splitMulti('a=b,c:d', ['=', ',', ':']) // ["a", "b", "c", "d"]
If you use this functionality a lot it might even be worth considering wrapping String.prototype.split for convenience (I think my function is fairly safe - the only consideration is the additional overhead of the conditionals (minor) and the fact that it lacks an implementation of the limit argument if an array is passed).
Be sure to include the splitMulti function if using this approach to the below simply wraps it :). Also worth noting that some people frown on extending built-ins (as many people do it wrong and conflicts can occur) so if in doubt speak to someone more senior before using this or ask on SO :)
var splitOrig = String.prototype.split; // Maintain a reference to inbuilt fn
String.prototype.split = function (){
if(arguments[0].length > 0){
if(Object.prototype.toString.call(arguments[0]) == "[object Array]" ) { // Check if our separator is an array
return splitMulti(this, arguments[0]); // Call splitMulti
}
}
return splitOrig.apply(this, arguments); // Call original split maintaining context
};
Usage:
var a = "a=b,c:d";
a.split(['=', ',', ':']); // ["a", "b", "c", "d"]
// Test to check that the built-in split still works (although our wrapper wouldn't work if it didn't as it depends on it :P)
a.split('='); // ["a", "b,c:d"]
Enjoy!
Lets keep it simple: (add a "[ ]+" to your RegEx means "1 or more")
This means "+" and "{1,}" are the same.
var words = text.split(/[ .:;?!~,`"&|()<>{}\[\]\r\n/\\]+/); // note ' and - are kept
Tricky method:
var s = "dasdnk asd, (naks) :d skldma";
var a = s.replace('(',' ').replace(')',' ').replace(',',' ').split(' ');
console.log(a);//["dasdnk", "asd", "naks", ":d", "skldma"]
I'm suprised no one has suggested it yet, but my hack-ey (and crazy fast) solution was to just append several 'replace' calls before splitting by the same character.
i.e. to remove a, b, c, d, and e:
let str = 'afgbfgcfgdfgefg'
let array = str.replace('a','d').replace('b','d').replace('c','d').replace('e','d').split('d')
this can be conveniently generalized for an array of splitters as follows:
function splitByMany( manyArgs, string ) {
do {
let arg = manyArgs.pop()
string = string.replace(arg, manyArgs[0])
} while (manyArgs.length > 2)
return string.split(manyArgs[0])
}
So, in your case, you could then call
let array = splitByMany([" ", ","], 'My long string containing commas, and spaces, and more commas');
For those of you who want more customization in their splitting function, I wrote a recursive algorithm that splits a given string with a list of characters to split on. I wrote this before I saw the above post. I hope it helps some frustrated programmers.
splitString = function(string, splitters) {
var list = [string];
for(var i=0, len=splitters.length; i<len; i++) {
traverseList(list, splitters[i], 0);
}
return flatten(list);
}
traverseList = function(list, splitter, index) {
if(list[index]) {
if((list.constructor !== String) && (list[index].constructor === String))
(list[index] != list[index].split(splitter)) ? list[index] = list[index].split(splitter) : null;
(list[index].constructor === Array) ? traverseList(list[index], splitter, 0) : null;
(list.constructor === Array) ? traverseList(list, splitter, index+1) : null;
}
}
flatten = function(arr) {
return arr.reduce(function(acc, val) {
return acc.concat(val.constructor === Array ? flatten(val) : val);
},[]);
}
var stringToSplit = "people and_other/things";
var splitList = [" ", "_", "/"];
splitString(stringToSplit, splitList);
Example above returns: ["people", "and", "other", "things"]
Note: flatten function was taken from Rosetta Code
You could just lump all the characters you want to use as separators either singularly or collectively into a regular expression and pass them to the split function. For instance you could write:
console.log( "dasdnk asd, (naks) :d skldma".split(/[ \(,\)]+/) );
And the output will be:
["dasdnk", "asd", "naks", ":d", "skldma"]
Here are some cases that may help by using Regex:
\W to match any character else word character [a-zA-Z0-9_]. Example:
("Hello World,I-am code").split(/\W+/); // would return [ 'Hello', 'World', 'I', 'am', 'code' ]
\s+ to match One or more spaces
\d to match a digit
if you want to split by some characters only let us say , and - you can use str.split(/[,-]+/)...etc
My refactor of #Brian answer
var string = 'and this is some kind of information and another text and simple and some egample or red or text';
var separators = ['and', 'or'];
function splitMulti(str, separators){
var tempChar = 't3mp'; //prevent short text separator in split down
//split by regex e.g. \b(or|and)\b
var re = new RegExp('\\b(' + separators.join('|') + ')\\b' , "g");
str = str.replace(re, tempChar).split(tempChar);
// trim & remove empty
return str.map(el => el.trim()).filter(el => el.length > 0);
}
console.log(splitMulti(string, separators))
Here is a new way to achieving same in ES6:
function SplitByString(source, splitBy) {
var splitter = splitBy.split('');
splitter.push([source]); //Push initial value
return splitter.reduceRight(function(accumulator, curValue) {
var k = [];
accumulator.forEach(v => k = [...k, ...v.split(curValue)]);
return k;
});
}
var source = "abc,def#hijk*lmn,opq#rst*uvw,xyz";
var splitBy = ",*#";
console.log(SplitByString(source, splitBy));
Please note in this function:
No Regex involved
Returns splitted value in same order as it appears in source
Result of above code would be:
Hi for example if you have split and replace in String 07:05:45PM
var hour = time.replace("PM", "").split(":");
Result
[ '07', '05', '45' ]
I will provide a classic implementation for a such function. The code works in almost all versions of JavaScript and is somehow optimum.
It doesn't uses regex, which is hard to maintain
It doesn't uses new features of JavaScript
It doesn't uses multiple .split() .join() invocation which require more computer memory
Just pure code:
var text = "Create a function, that will return an array (of string), with the words inside the text";
println(getWords(text));
function getWords(text)
{
let startWord = -1;
let ar = [];
for(let i = 0; i <= text.length; i++)
{
let c = i < text.length ? text[i] : " ";
if (!isSeparator(c) && startWord < 0)
{
startWord = i;
}
if (isSeparator(c) && startWord >= 0)
{
let word = text.substring(startWord, i);
ar.push(word);
startWord = -1;
}
}
return ar;
}
function isSeparator(c)
{
var separators = [" ", "\t", "\n", "\r", ",", ";", ".", "!", "?", "(", ")"];
return separators.includes(c);
}
You can see the code running in playground:
https://codeguppy.com/code.html?IJI0E4OGnkyTZnoszAzf
Splitting URL by .com/ or .net/
url.split(/\.com\/|\.net\//)
a = "a=b,c:d"
array = ['=',',',':'];
for(i=0; i< array.length; i++){ a= a.split(array[i]).join(); }
this will return the string without a special charecter.
I ran into this question wile looking for a replacement for the C# string.Split() function which splits a string using the characters in its argument.
In JavaScript you can do the same using map an reduce to iterate over the splitting characters and the intermediate values:
let splitters = [",", ":", ";"]; // or ",:;".split("");
let start= "a,b;c:d";
let values = splitters.reduce((old, c) => old.map(v => v.split(c)).flat(), [start]);
// values is ["a", "b", "c", "d"]
flat() is used to flatten the intermediate results so each iteration works on a list of strings without nested arrays. Each iteration applies split to all of the values in old and then returns the list of intermediate results to be split by the next value in splitters. reduce() is initialized with an array containing the initial string value.
I find that one of the main reasons I need this is to split file paths on both / and \. It's a bit of a tricky regex so I'll post it here for reference:
var splitFilePath = filePath.split(/[\/\\]/);
I think it's easier if you specify what you wanna leave, instead of what you wanna remove.
As if you wanna have only English words, you can use something like this:
text.match(/[a-z'\-]+/gi);
Examples (run snippet):
var R=[/[a-z'\-]+/gi,/[a-z'\-\s]+/gi];
var s=document.getElementById('s');
for(var i=0;i<R.length;i++)
{
var o=document.createElement('option');
o.innerText=R[i]+'';
o.value=i;
s.appendChild(o);
}
var t=document.getElementById('t');
var r=document.getElementById('r');
s.onchange=function()
{
r.innerHTML='';
var x=s.value;
if((x>=0)&&(x<R.length))
x=t.value.match(R[x]);
for(i=0;i<x.length;i++)
{
var li=document.createElement('li');
li.innerText=x[i];
r.appendChild(li);
}
}
<textarea id="t" style="width:70%;height:12em">even, test; spider-man
But saying o'er what I have said before:
My child is yet a stranger in the world;
She hath not seen the change of fourteen years,
Let two more summers wither in their pride,
Ere we may think her ripe to be a bride.
—Shakespeare, William. The Tragedy of Romeo and Juliet</textarea>
<p><select id="s">
<option selected>Select a regular expression</option>
<!-- option value="1">/[a-z'\-]+/gi</option>
<option value="2">/[a-z'\-\s]+/gi</option -->
</select></p>
<ol id="r" style="display:block;width:auto;border:1px inner;overflow:scroll;height:8em;max-height:10em;"></ol>
</div>
I don't know the performance of RegEx, but here is another alternative for RegEx leverages native HashSet and works in O( max(str.length, delimeter.length) ) complexity instead:
var multiSplit = function(str,delimiter){
if (!(delimiter instanceof Array))
return str.split(delimiter);
if (!delimiter || delimiter.length == 0)
return [str];
var hashSet = new Set(delimiter);
if (hashSet.has(""))
return str.split("");
var lastIndex = 0;
var result = [];
for(var i = 0;i<str.length;i++){
if (hashSet.has(str[i])){
result.push(str.substring(lastIndex,i));
lastIndex = i+1;
}
}
result.push(str.substring(lastIndex));
return result;
}
multiSplit('1,2,3.4.5.6 7 8 9',[',','.',' ']);
// Output: ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
multiSplit('1,2,3.4.5.6 7 8 9',' ');
// Output: ["1,2,3.4.5.6", "7", "8", "9"]
I solved this with reduce and filter. It might not be the most readable solution, or the fastest, and in real life I would probably use Aarons answere here, but it was fun to write.
[' ','_','-','.',',',':','#'].reduce(
(segs, sep) => segs.reduce(
(out, seg) => out.concat(seg.split(sep)), []),
['E-mail Address: user#domain.com, Phone Number: +1-800-555-0011']
).filter(x => x)
Or as a function:
function msplit(str, seps) {
return seps.reduce((segs, sep) => segs.reduce(
(out, seg) => out.concat(seg.split(sep)), []
), [str]).filter(x => x);
}
This will output:
['E','mail','Address','user','domain','com','0','Phone','Number','+1','800','555','0011']
Without the filter at the end you would get empty strings in the array where two different separators are next to each other.
Not the best way but works to Split with Multiple and Different seperators/delimiters
html
<button onclick="myFunction()">Split with Multiple and Different seperators/delimiters</button>
<p id="demo"></p>
javascript
<script>
function myFunction() {
var str = "How : are | you doing : today?";
var res = str.split(' | ');
var str2 = '';
var i;
for (i = 0; i < res.length; i++) {
str2 += res[i];
if (i != res.length-1) {
str2 += ",";
}
}
var res2 = str2.split(' : ');
//you can add countless options (with or without space)
document.getElementById("demo").innerHTML = res2;
}
</script>
Starting from #stephen-sweriduk solution (that was the more interesting to me!), I have slightly modified it to make more generic and reusable:
/**
* Adapted from: http://stackoverflow.com/questions/650022/how-do-i-split-a-string-with-multiple-separators-in-javascript
*/
var StringUtils = {
/**
* Flatten a list of strings
* http://rosettacode.org/wiki/Flatten_a_list
*/
flatten : function(arr) {
var self=this;
return arr.reduce(function(acc, val) {
return acc.concat(val.constructor === Array ? self.flatten(val) : val);
},[]);
},
/**
* Recursively Traverse a list and apply a function to each item
* @param list array
* @param expression Expression to use in func
* @param func function of (item,expression) to apply expression to item
*
*/
traverseListFunc : function(list, expression, index, func) {
var self=this;
if(list[index]) {
if((list.constructor !== String) && (list[index].constructor === String))
(list[index] != func(list[index], expression)) ? list[index] = func(list[index], expression) : null;
(list[index].constructor === Array) ? self.traverseListFunc(list[index], expression, 0, func) : null;
(list.constructor === Array) ? self.traverseListFunc(list, expression, index+1, func) : null;
}
},
/**
* Recursively map function to string
* @param string
* @param expression Expression to apply to func
* @param function of (item, expressions[i])
*/
mapFuncToString : function(string, expressions, func) {
var self=this;
var list = [string];
for(var i=0, len=expressions.length; i<len; i++) {
self.traverseListFunc(list, expressions[i], 0, func);
}
return self.flatten(list);
},
/**
* Split a string
* @param splitters Array of characters to apply the split
*/
splitString : function(string, splitters) {
return this.mapFuncToString(string, splitters, function(item, expression) {
return item.split(expression);
})
},
}
and then
var stringToSplit = "people and_other/things";
var splitList = [" ", "_", "/"];
var splittedString=StringUtils.splitString(stringToSplit, splitList);
console.log(splitList, stringToSplit, splittedString);
which gives the same result as the original:
[ ' ', '_', '/' ] 'people and_other/things' [ 'people', 'and', 'other', 'things' ]
An easy way to do this is to process each character of the string with each delimiter and build an array of the splits:
splix = function ()
{
u = [].slice.call(arguments); v = u.slice(1); u = u[0]; w = [u]; x = 0;
for (i = 0; i < u.length; ++i)
{
for (j = 0; j < v.length; ++j)
{
if (u.slice(i, i + v[j].length) == v[j])
{
y = w[x].split(v[j]); w[x] = y[0]; w[++x] = y[1];
};
};
};
return w;
};
console.logg = function ()
{
document.body.innerHTML += "<br>" + [].slice.call(arguments).join();
}
splix = function() {
u = [].slice.call(arguments);
v = u.slice(1);
u = u[0];
w = [u];
x = 0;
console.logg("Processing: <code>" + JSON.stringify(w) + "</code>");
for (i = 0; i < u.length; ++i) {
for (j = 0; j < v.length; ++j) {
console.logg("Processing: <code>[\x22" + u.slice(i, i + v[j].length) + "\x22, \x22" + v[j] + "\x22]</code>");
if (u.slice(i, i + v[j].length) == v[j]) {
y = w[x].split(v[j]);
w[x] = y[0];
w[++x] = y[1];
console.logg("Currently processed: " + JSON.stringify(w) + "\n");
};
};
};
console.logg("Return: <code>" + JSON.stringify(w) + "</code>");
};
setTimeout(function() {
console.clear();
splix("1.23--4", ".", "--");
}, 250);
@import url("http://fonts.googleapis.com/css?family=Roboto");
body {font: 20px Roboto;}
Usage: splix(string, delimiters...)
Example: splix("1.23--4", ".", "--")
Returns: ["1", "23", "4"]
Check out my simple library on Github
If you really do not want to visit or interact with the repo, here is the working code:
/**
*
* @param {type} input The string input to be split
* @param {type} includeTokensInOutput If true, the tokens are retained in the split output.
* @param {type} tokens The tokens to be employed in splitting the original string.
* @returns {Scanner}
*/
function Scanner(input, includeTokensInOutput, tokens) {
this.input = input;
this.includeTokensInOutput = includeTokensInOutput;
this.tokens = tokens;
}
Scanner.prototype.scan = function () {
var inp = this.input;
var parse = [];
this.tokens.sort(function (a, b) {
return b.length - a.length; // longest tokens first, so a longer token is tried before any shorter token it contains
});
for (var i = 0; i < inp.length; i++) {
for (var j = 0; j < this.tokens.length; j++) {
var token = this.tokens[j];
var len = token.length;
if (len > 0 && i + len <= inp.length) {
var portion = inp.substring(i, i + len);
if (portion === token) {
if (i !== 0) {//avoid empty spaces
parse[parse.length] = inp.substring(0, i);
}
if (this.includeTokensInOutput) {
parse[parse.length] = token;
}
inp = inp.substring(i + len);
i = -1;
break;
}
}
}
}
if (inp.length > 0) {
parse[parse.length] = inp;
}
return parse;
};
The usage is very straightforward:
var tokens = new Scanner("ABC+DE-GHIJK+LMNOP", false , new Array('+','-')).scan();
console.log(tokens);
Gives:
['ABC', 'DE', 'GHIJK', 'LMNOP']
And if you wish to include the splitting tokens (+ and -) in the output, set the false to true and voila! it still works.
The usage would now be:
var tokens = new Scanner("ABC+DE-GHIJK+LMNOP", true , new Array('+','-')).scan();
and
console.log(tokens);
would give:
['ABC', '+', 'DE', '-', 'GHIJK', '+', 'LMNOP']
ENJOY!
I use regexp:
str = 'Write a program that extracts from a given text all palindromes, e.g. "ABBA", "lamal", "exe".';
var strNew = str.match(/\w+/g);
// Output: ["Write", "a", "program", "that", "extracts", "from", "a", "given", "text", "all", "palindromes", "e", "g", "ABBA", "lamal", "exe"]
I'm starting my adventure with JavaScript and I've got one of my first tasks.
I must create a function that finds the letter with the longest run of consecutive occurrences in a string and writes the result to the console.
For example:
var string = "assssssadaaaAAAasadaaab";
and console.log should print (7,a),
because the longest run is 7 consecutive identical characters (yes, I call .toLowerCase() before counting, because the task requires it).
This is what I have so far, and I don't know what to do next.
Can anyone help?
var string = "assssssadaaaAAAasadaaab";
var string = string.toLowerCase();
function writeInConsole(){
console.log(string);
var count = (string.match(/a/g) || []).length;
console.log(count);
}
writeInConsole();
One option is to match all runs of consecutive identical characters using (.)\1* and keep the longest match.
Then return an array with the length of the string and the character.
Note that this will take the first longest occurrence in case of multiple characters with the same length.
function writeInConsole(s) {
var m = s.match(/(.)\1*/g);
if (m) {
var res = m.reduce(function(a, b) {
return b.length > a.length ? b : a;
})
return [res.length, res.charAt(0)];
}
return [];
}
["assssssadaaaAAAasadaaab", "a", ""].forEach(s => {
s = s.toLowerCase();
console.log(writeInConsole(s))
});
Another example when you have multiple consecutive characters with the same length
function writeInConsole(s) {
let m = s.match(/(.)\1*/g);
if (m) {
let sorted = m.sort((a, b) => b.length - a.length)
let maxLength = sorted[0].length;
let result = [];
for (let i = 0; i < sorted.length; i++) {
if (sorted[i].length === maxLength) {
result.push([maxLength, sorted[i].charAt(0)]);
continue;
}
break;
}
return result;
}
return [];
}
[
"assssssadaaaAAAasadaaab",
"aaabccc",
"abc",
"yyzzz",
"aa",
""
].forEach(s => {
s = s.toLowerCase();
console.log(writeInConsole(s))
});
I'm not sure if this works for you:
string source = "/once/upon/a/time/";
int count = 0;
foreach (char c in source)
if (c == '/') count++;
The answer given by using regular expressions is more succinct, but since you say you are just starting out with programming, I will offer a verbose one that might be easier to follow.
var string = "assssssadaaaAAAasadaaab";
var string = string.toLowerCase();
function computeLongestRun(s) {
// we set up for the computation at the first character in the string
var currentLetter = s[0]
var longestRunLetter = currentLetter
var currentRunLength = 1
var longestRunLength = 1
// loop through the string considering one character at a time
for (var i = 1; i < s.length; i++) {
if (s[i] == currentLetter) { // is this letter the same as the last one?
currentRunLength++ // if yes, reflect that
} else { // otherwise, reset to start counting a new run
currentRunLength = 1
currentLetter = s[i]
}
// check after every character, so a run that ends the string is counted too
if (currentRunLength > longestRunLength) {
longestRunLetter = currentLetter
longestRunLength = currentRunLength
}
}
return [longestRunLetter, longestRunLength]
}
console.log(computeLongestRun(string))
Let's say I have these two examples:
(A = 1) and ( B = 2)
(A = 1)(B = 2 ()).
I need a way to get the following array:
[(],[A][=][1],[)],[and],[(],[B],[=],[2],[)]
[(],[A][=][1],[)],[(],[B],[=],[2],[(],,[)][)]
What I tried to do is the following
Find the delimiters using the following function (in this case the delimiters are the space " " and the brackets ( and )):
function findExpressionDelimeter (textAreaValue){
var delimiterPositions = [];
var bracesDepth = 0;
var squareBracketsDepth = 0;
var bracketsDepth = 0;
for (var i = 0; i < textAreaValue.length; i++) {
switch (textAreaValue[i]) {
case '(':
bracketsDepth++;
delimiterPositions.push(i);
break;
case ')':
bracketsDepth--;
delimiterPositions.push(i);
break;
case '[':
squareBracketsDepth++;
break;
case ']':
squareBracketsDepth--;
break;
default:
if (squareBracketsDepth == 0 && textAreaValue[i] == ' ') {
delimiterPositions.push(i);
}
}
}
return delimiterPositions;
}
Then I tried to loop through the values returned and extract the values using substring. The issue is that when I have a ( or ) I need to get the next substring as well as the bracket. This is where I am stuck.
function getTextByDelimeter(delimiterPositions, value) {
var output = [];
var index = 0;
var length = 0;
var string = "";
for (var j = 0; j < delimiterPositions.length; j++) {
if (j == 0) {
index = 0;
} else {
index = delimiterPositions[j - 1] + 1;
}
length = delimiterPositions[j];
string = value.substring(index, length);
output.push(string);
}
string = value.substring(length, value.length);
output.push(string);
return output;
}
Any help would be appreciated.
You could just match the tokens you are interested in:
var str = "(A = 1) and ( B = 2)";
var arr = str.match(/[()]|[^()\s]+/g);
Result:
["(", "A", "=", "1", ")", "and", "(", "B", "=", "2", ")"]
The regex with some comments:
[()] # match a single character token
| # or
[^()\s]+ # match everything else except spaces
If you would like to add more single character tokens, like for example a =, just add it to both character classes. Ie: [()=]|[^()=\s]+
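The rule of adding each new single-character token to both character classes can be wrapped in a small helper. This is only a sketch: the function name tokenize and its singles parameter are assumptions, not part of the answer above.

```javascript
// Hypothetical helper: `singles` is a string of single-character tokens, e.g. "()=".
function tokenize(str, singles) {
  // Escape the few characters that are special inside a character class.
  const cls = singles.replace(/[\\\]^-]/g, '\\$&');
  // Same shape as [()]|[^()\s]+, with the class contents filled in.
  const re = new RegExp('[' + cls + ']|[^' + cls + '\\s]+', 'g');
  return str.match(re) || []; // match() returns null when nothing matches
}

console.log(tokenize('(A = 1) and ( B = 2)', '()'));
// [ '(', 'A', '=', '1', ')', 'and', '(', 'B', '=', '2', ')' ]
console.log(tokenize('(A=1)', '()='));
// [ '(', 'A', '=', '1', ')' ]
```

Building the pattern from one string keeps the two character classes in sync, which is easy to get wrong when editing the regex literal by hand.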
What you want is a lexical analyser.
Regular expressions won't allow you to parse a language (and a mathematical expression is one). The tree decomposition of the formula cannot be done with them.
However, a regex does allow you to discriminate tokens. This is usually done by reading the stream of characters. Once you've detected a lexeme, you generate the token.
If you want to check the validity of the formula, or compute its value, you need a parser (syntactic analyser). This can't be done using regex.
A similar question, with an answer, is here.
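As a rough illustration of that split between lexing and parsing, the sketch below uses a regex only to produce tokens and then checks bracket nesting with a counter, which is the part a JavaScript regex has no way to count for arbitrary depth. The names lex and isBalanced are hypothetical, and a balance check is far simpler than a real parser.

```javascript
// Lexing step: the regex only recognises individual tokens.
function lex(str) {
  return str.match(/[()]|[^()\s]+/g) || [];
}

// A very small validity check: track the nesting depth token by token.
function isBalanced(str) {
  let depth = 0;
  for (const tok of lex(str)) {
    if (tok === '(') {
      depth++;
    } else if (tok === ')') {
      depth--;
      if (depth < 0) return false; // a bracket was closed before it was opened
    }
  }
  return depth === 0; // every opened bracket must have been closed
}

console.log(isBalanced('(A = 1) and ( B = 2)')); // true
console.log(isBalanced('(A = 1)(B = 2 ()'));     // false
```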
You can split your string into characters (string.split('')) and then remove the whitespace entries from the array, or just check that array[i] != ' ' before your switch block.
I have a string:
var string = "aaaaaa<br />&dagger; bbbb<br />&Dagger; cccc"
And I would like to split this string with the delimiter <br /> followed by a special character.
To do that, I am using this:
string.split(/<br \/>&#?[a-zA-Z0-9]+;/g);
I am getting what I need, except that I am losing the delimiter.
Here is the example: http://jsfiddle.net/JwrZ6/1/
How can I keep the delimiter?
I was having a similar but slightly different problem. Anyway, here are examples of the different scenarios for where to keep the delimiter.
"1、2、3".split("、") == ["1", "2", "3"]
"1、2、3".split(/(、)/g) == ["1", "、", "2", "、", "3"]
"1、2、3".split(/(?=、)/g) == ["1", "、2", "、3"]
"1、2、3".split(/(?!、)/g) == ["1、", "2、", "3"]
"1、2、3".split(/(.*?、)/g) == ["", "1、", "", "2、", "3"]
Warning: The fourth will only work to split single characters. ConnorsFan presents an alternative:
// Split a path, but keep the slashes that follow directories
var str = 'Animation/rawr/javascript.js';
var tokens = str.match(/[^\/]+\/?|\//g);
Use (positive) lookahead so that the regular expression asserts that the special character exists, but does not actually match it:
string.split(/<br \/>(?=&#?[a-zA-Z0-9]+;)/g);
See it in action:
var string = "aaaaaa<br />&dagger; bbbb<br />&Dagger; cccc";
console.log(string.split(/<br \/>(?=&#?[a-zA-Z0-9]+;)/g));
If you wrap the delimiter in parentheses, it will be part of the returned array.
string.split(/(<br \/>&#?[a-zA-Z0-9]+;)/g);
// returns ["aaaaaa", "<br />&dagger;", " bbbb", "<br />&Dagger;", " cccc"]
Depending on which part you want to keep, change which subgroup you capture:
string.split(/(<br \/>)&#?[a-zA-Z0-9]+;/g);
// returns ["aaaaaa", "<br />", " bbbb", "<br />", " cccc"]
You could improve the expression by ignoring the case of letters:
string.split(/(<br \/>)&#?[a-z0-9]+;/gi);
And you can match for predefined groups like this: \d equals [0-9] and \w equals [a-zA-Z0-9_]. This means your expression could look like this.
string.split(/<br \/>(&#?[a-z\d]+;)/gi);
There is a good Regular Expression Reference on JavaScriptKit.
If you group the split pattern, its match will be kept in the output and it is by design:
If separator is a regular expression with capturing parentheses, then
each time separator matches, the results (including any undefined
results) of the capturing parentheses are spliced into the output
array.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split#description
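The "including any undefined results" part of that description is easy to miss. A small sketch with two alternative capturing groups (the input string is made up for illustration) shows a group that did not take part in a match being spliced in as undefined:

```javascript
// Two capturing groups, but only one of them participates in any given match.
const parts = 'a1b-c'.split(/(\d)|(-)/);
console.log(parts);
// [ 'a', '1', undefined, 'b', undefined, '-', 'c' ]

// With non-capturing groups nothing is spliced back in.
const plain = 'a1b-c'.split(/(?:\d)|(?:-)/);
console.log(plain);
// [ 'a', 'b', 'c' ]
```

If the undefined entries are unwanted, a single group such as /(\d|-)/ keeps the delimiters without them.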
You don't need a lookahead or global flag unless your search pattern uses one.
const str = `How much wood would a woodchuck chuck, if a woodchuck could chuck wood?`
const result = str.split(/(\s+)/);
console.log(result);
// We can verify the result
const isSame = result.join('') === str;
console.log({ isSame });
You can use multiple groups. You can be as creative as you like and what remains outside the groups will be removed:
const str = `How much wood would a woodchuck chuck, if a woodchuck could chuck wood?`
const result = str.split(/(\s+)(\w{1,2})\w+/);
console.log(result, result.join(''));
I also answered this here: JavaScript Split Regular Expression keep the delimiter
Use the (?=pattern) lookahead pattern in the regex.
Example:
var string = '500x500-11*90~1+1';
string = string.replace(/(?=[$-/:-?{-~!"^_`\[\]])/gi, ",");
string = string.split(",");
this will give you the following result.
[ '500x500', '-11', '*90', '~1', '+1' ]
Can also be directly split
string = string.split(/(?=[$-/:-?{-~!"^_`\[\]])/gi);
giving the same result
[ '500x500', '-11', '*90', '~1', '+1' ]
I made a modification to jichi's answer, and put it in a function which also supports multiple letters.
String.prototype.splitAndKeep = function(separator, method='seperate'){
var str = this;
if(method == 'seperate'){
str = str.split(new RegExp(`(${separator})`, 'g'));
}else if(method == 'infront'){
str = str.split(new RegExp(`(?=${separator})`, 'g'));
}else if(method == 'behind'){
str = str.split(new RegExp(`(.*?${separator})`, 'g'));
str = str.filter(function(el){return el !== "";});
}
return str;
};
The 3rd method from jichi's answer would not work in this function, so I took the 4th method and removed the empty strings to get the same result.
edit:
A second method, which accepts an array in order to split on char1 or char2:
String.prototype.splitAndKeep = function(separator, method='seperate'){
var str = this;
function splitAndKeep(str, separator, method='seperate'){
if(method == 'seperate'){
str = str.split(new RegExp(`(${separator})`, 'g'));
}else if(method == 'infront'){
str = str.split(new RegExp(`(?=${separator})`, 'g'));
}else if(method == 'behind'){
str = str.split(new RegExp(`(.*?${separator})`, 'g'));
str = str.filter(function(el){return el !== "";});
}
return str;
}
if(Array.isArray(separator)){
var parts = splitAndKeep(str, separator[0], method);
for(var i = 1; i < separator.length; i++){
var partsTemp = parts;
parts = [];
for(var p = 0; p < partsTemp.length; p++){
parts = parts.concat(splitAndKeep(partsTemp[p], separator[i], method));
}
}
return parts;
}else{
return splitAndKeep(str, separator, method);
}
};
usage:
str = "first1-second2-third3-last";
str.splitAndKeep(["1", "2", "3"]) == ["first", "1", "-second", "2", "-third", "3", "-last"];
str.splitAndKeep("-") == ["first1", "-", "second2", "-", "third3", "-", "last"];
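One thing to watch for with this approach: the separator is interpolated straight into a RegExp, so a separator such as "." or "+" is read as a pattern rather than as literal text. The sketch below shows one way to guard against that. Both function names are hypothetical and the code is a standalone illustration of the 'seperate' mode, not a change to the functions above.

```javascript
// Hypothetical helper: escape regex metacharacters so the text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Standalone sketch of the 'seperate' mode with the separator escaped first.
function splitKeepLiteral(str, separator) {
  return str.split(new RegExp('(' + escapeRegExp(separator) + ')', 'g'));
}

console.log(splitKeepLiteral('a.b.c', '.'));
// [ 'a', '.', 'b', '.', 'c' ]

// Unescaped, '.' matches any character, so every character becomes a separator.
console.log('a.b.c'.split(new RegExp('(.)', 'g')).length);
```

Escaping is only appropriate when the separator is meant as literal text; callers who want to pass a pattern on purpose would skip it.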
An extension function that splits a string by a substring or RegEx and keeps the delimiter at the front or the back of each part, depending on the second parameter:
String.prototype.splitKeep = function (splitter, ahead) {
var self = this;
var result = [];
if (splitter != '') {
var matches = [];
// Getting matched value and its index
var replaceName = splitter instanceof RegExp ? "replace" : "replaceAll";
var r = self[replaceName](splitter, function (m, i, e) {
matches.push({ value: m, index: i });
return getSubst(m);
});
// Finds split substrings
var lastIndex = 0;
for (var i = 0; i < matches.length; i++) {
var m = matches[i];
var nextIndex = ahead == true ? m.index : m.index + m.value.length;
if (nextIndex != lastIndex) {
var part = self.substring(lastIndex, nextIndex);
result.push(part);
lastIndex = nextIndex;
}
};
if (lastIndex < self.length) {
var part = self.substring(lastIndex, self.length);
result.push(part);
};
// Substitution of matched string
function getSubst(value) {
var substChar = value[0] == '0' ? '1' : '0';
var subst = '';
for (var i = 0; i < value.length; i++) {
subst += substChar;
}
return subst;
};
}
else {
result.push(self); // arrays have no add() method
};
return result;
};
The test:
test('splitKeep', function () {
// String
deepEqual("1231451".splitKeep('1'), ["1", "231", "451"]);
deepEqual("123145".splitKeep('1', true), ["123", "145"]);
deepEqual("1231451".splitKeep('1', true), ["123", "145", "1"]);
deepEqual("hello man how are you!".splitKeep(' '), ["hello ", "man ", "how ", "are ", "you!"]);
deepEqual("hello man how are you!".splitKeep(' ', true), ["hello", " man", " how", " are", " you!"]);
// Regex
deepEqual("mhellommhellommmhello".splitKeep(/m+/g), ["m", "hellomm", "hellommm", "hello"]);
deepEqual("mhellommhellommmhello".splitKeep(/m+/g, true), ["mhello", "mmhello", "mmmhello"]);
});
I've been using this:
String.prototype.splitBy = function (delimiter) {
var
delimiterPATTERN = '(' + delimiter + ')',
delimiterRE = new RegExp(delimiterPATTERN, 'g');
return this.split(delimiterRE).reduce((chunks, item) => {
if (item.match(delimiterRE)){
chunks.push(item)
} else {
chunks[chunks.length - 1] += item
};
return chunks
}, [])
}
Except that you shouldn't mess with String.prototype, so here's a function version:
var splitBy = function (text, delimiter) {
var
delimiterPATTERN = '(' + delimiter + ')',
delimiterRE = new RegExp(delimiterPATTERN, 'g');
return text.split(delimiterRE).reduce(function(chunks, item){
if (item.match(delimiterRE)){
chunks.push(item)
} else {
chunks[chunks.length - 1] += item
};
return chunks
}, [])
}
So you could do:
var haystack = "aaaaaa<br />&dagger; bbbb<br />&Dagger; cccc"
var needle = '<br \/>&#?[a-zA-Z0-9]+;';
var result = splitBy(haystack , needle)
console.log( JSON.stringify( result, null, 2) )
And you'll end up with:
[
"<br />&dagger; bbbb",
"<br />&Dagger; cccc"
]
Most of the existing answers predate the introduction of lookbehind assertions in JavaScript in 2018. You didn't specify how you wanted the delimiters to be included in the result. One typical use case would be sentences delimited by punctuation ([.?!]), where one would want the delimiters to be included at the ends of the resulting strings. This corresponds to the fourth case in the accepted answer, but as noted there, that solution only works for single characters. Arbitrary strings with the delimiters appended at the end can be formed with a lookbehind assertion:
'It is. Is it? It is!'.split(/(?<=[.?!])/)
/* [ 'It is.', ' Is it?', ' It is!' ] */
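A slightly extended sketch of the same lookbehind idea, assuming the whitespace between sentences should be dropped rather than kept at the start of the next piece. The function name splitSentences is hypothetical, and lookbehind needs an ES2018-capable engine.

```javascript
function splitSentences(text) {
  // Split on whitespace that is preceded by ., ? or !
  // The lookbehind consumes nothing, so the punctuation stays with its sentence.
  return text.split(/(?<=[.?!])\s+/);
}

console.log(splitSentences('It is. Is it? It is!'));
// [ 'It is.', 'Is it?', 'It is!' ]
console.log(splitSentences('Wait... what?! Yes.'));
// [ 'Wait...', 'what?!', 'Yes.' ]
```

Because the split point also requires whitespace, runs of punctuation such as "..." or "?!" are not broken apart.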
I know that this is a bit late but you could also use lookarounds
var string = "aaaaaa<br />&dagger; bbbb<br />&Dagger; cccc";
var array = string.split(/(?<=<br \/>)/);
console.log(array);
I also came up with this solution. No regex needed, very readable.
const str = "hello world what a great day today balbla"
const separatorIndex = str.indexOf("great")
const parsedString = str.slice(separatorIndex)
console.log(parsedString)
I have a string, let's say Hello world, and I need to replace the char at index 3. How can I replace a char by specifying an index?
var str = "hello world";
I need something like
str.replaceAt(0,"h");
In JavaScript, strings are immutable, which means the best you can do is to create a new string with the changed content and assign the variable to point to it.
You'll need to define the replaceAt() function yourself:
String.prototype.replaceAt = function(index, replacement) {
return this.substring(0, index) + replacement + this.substring(index + replacement.length);
}
And use it like this:
var hello = "Hello World";
alert(hello.replaceAt(2, "!!")); // He!!o World
There is no replaceAt function in JavaScript. You can use the following code to replace any character in any string at specified position:
function rep() {
var str = 'Hello World';
str = setCharAt(str,4,'a');
alert(str);
}
function setCharAt(str,index,chr) {
if(index > str.length-1) return str;
return str.substring(0,index) + chr + str.substring(index+1);
}
<button onclick="rep();">click</button>
You can't. Take the characters before and after the position and concat into a new string:
var s = "Hello world";
var index = 3;
s = s.substring(0, index) + 'x' + s.substring(index + 1);
str = str.split('');
str[3] = 'h';
str = str.join('');
There are a lot of answers here, and all of them are based on two methods:
METHOD1: split the string using two substrings and stuff the character between them
METHOD2: convert the string to character array, replace one array member and join it
Personally, I would use these two methods in different cases. Let me explain.
@FabioPhms: Your method was the one I initially used, and I was afraid that it would perform badly on strings with lots of characters. However, the question is: what counts as a lot of characters? I tested it on 10 "lorem ipsum" paragraphs and it took a few milliseconds. Then I tested it on a string 10 times larger, and there was really no big difference. Hm.
@vsync, @Cory Mawhorter: Your comments are unambiguous; however, again, what is a large string? I agree that for 32...100kb performance should be better and one should use the substring variant for this single character replacement.
But what will happen if I have to make quite a few replacements?
I needed to perform my own tests to prove which is faster in that case. Let's say we have an algorithm that manipulates a relatively short string of 1000 characters. We expect that on average each character in that string will be replaced ~100 times. So, the code to test something like this is:
var str = "... {A LARGE STRING HERE} ...";
for(var i=0; i<100000; i++)
{
var n = '' + Math.floor(Math.random() * 10);
var p = Math.floor(Math.random() * 1000);
// replace character *n* on position *p*
}
I created a fiddle for this, and it's here.
There are two tests, TEST1 (substring) and TEST2 (array conversion).
Results:
TEST1: 195ms
TEST2: 6ms
It seems that array conversion beats substring by more than an order of magnitude (roughly 30x)! So - what the hell happened here???
What actually happens is that all operations in TEST2 are done on the array itself, using an assignment expression like strarr2[p] = n. Assignment is really fast compared to substring on a large string, and it's clear that it's going to win.
So, it's all about choosing the right tool for the job. Again.
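The pattern behind TEST2 can be sketched as follows: convert to an array once, apply every replacement by assignment, and join once at the end. The function name replaceMany and the [index, character] edit format are assumptions made for illustration; this is not the code from the fiddle.

```javascript
// Hypothetical helper: `edits` is an array of [index, character] pairs.
function replaceMany(str, edits) {
  const chars = str.split('');   // convert to an array once
  for (const [index, ch] of edits) {
    // Plain assignment: no new string is built for each replacement.
    if (index >= 0 && index < chars.length) chars[index] = ch;
  }
  return chars.join('');         // build the final string once
}

console.log(replaceMany('hello world', [[0, 'H'], [6, 'W']])); // "Hello World"
```

For a single replacement the conversion cost outweighs the benefit, which matches the point that each method suits a different case.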
Working with arrays (vectors) is usually the most effective way to build up a string.
I suggest the following function:
String.prototype.replaceAt=function(index, char) {
var a = this.split("");
a[index] = char;
return a.join("");
}
Run this snippet:
String.prototype.replaceAt=function(index, char) {
var a = this.split("");
a[index] = char;
return a.join("");
}
var str = "hello world";
str = str.replaceAt(3, "#");
document.write(str);
In Javascript strings are immutable so you have to do something like
var x = "Hello world"
x = x.substring(0, i) + 'h' + x.substring(i+1);
To replace the character in x at i with 'h'
function dothis() {
var x = document.getElementById("x").value;
var index = document.getElementById("index").value;
var text = document.getElementById("text").value;
var length = document.getElementById("length").value;
var arr = x.split("");
arr.splice(index, length, text);
var result = arr.join("");
document.getElementById('output').innerHTML = result;
console.log(result);
}
dothis();
<input id="x" type="text" value="White Dog" placeholder="Enter Text" />
<input id="index" type="number" min="0"value="6" style="width:50px" placeholder="index" />
<input id="length" type="number" min="0"value="1" style="width:50px" placeholder="length" />
<input id="text" type="text" value="F" placeholder="New character" />
<br>
<button id="submit" onclick="dothis()">Run</button>
<p id="output"></p>
This method is good for small length strings but may be slow for larger text.
var x = "White Dog";
var arr = x.split(""); // ["W", "h", "i", "t", "e", " ", "D", "o", "g"]
arr.splice(6, 1, 'F');
/*
Here 6 is starting index and 1 is no. of array elements to remove and
final argument 'F' is the new character to be inserted.
*/
var result = arr.join(""); // "White Fog"
One-liner using String.replace with callback (no emoji support):
// 0 - index to replace, 'f' - replacement string
'dog'.replace(/./g, (c, i) => i == 0? 'f': c)
// "fog"
Explained:
//String.replace will call the callback on each pattern match
//in this case - each character
'dog'.replace(/./g, function (character, index) {
if (index == 0) //we want to replace the first character
return 'f'
return character //leaving other characters the same
})
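On the "no emoji support" caveat: /./g without the u flag walks the string one UTF-16 code unit at a time, so a character stored as a surrogate pair occupies two indexes. A minimal sketch of a code-point-aware variant follows. The function name replaceCodePointAt is hypothetical, and the index here counts code points rather than UTF-16 units.

```javascript
function replaceCodePointAt(str, index, replacement) {
  const chars = Array.from(str); // iterates by code point, so '😀' stays in one piece
  if (index < 0 || index >= chars.length) return str; // out of range: leave unchanged
  chars[index] = replacement;
  return chars.join('');
}

console.log('a😀c'.length);                      // 4 (the emoji takes two UTF-16 units)
console.log(replaceCodePointAt('a😀c', 1, 'b')); // "abc"
console.log(replaceCodePointAt('a😀c', 2, 'z')); // "a😀z"
```

Emoji built from several code points (for example sequences joined with a zero-width joiner) are still split by this approach, so it is only a partial answer.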
Generalizing Afanasii Kurakin's answer, we have:
function replaceAt(str, index, ch) {
return str.replace(/./g, (c, i) => i == index ? ch : c);
}
let str = 'Hello World';
str = replaceAt(str, 1, 'u');
console.log(str); // Hullo World
Let's expand and explain both the regular expression and the replacer function:
function replaceAt(str, index, newChar) {
function replacer(origChar, strIndex) {
if (strIndex === index)
return newChar;
else
return origChar;
}
return str.replace(/./g, replacer);
}
let str = 'Hello World';
str = replaceAt(str, 1, 'u');
console.log(str); // Hullo World
The regular expression . matches exactly one character. The g flag makes it match every character in turn, much like a for loop. The replacer function is called with both the original character and the index of that character in the string. A simple if statement decides whether to return origChar or newChar.
var str = "hello world";
console.log(str);
var arr = [...str];
arr[0] = "H";
str = arr.join("");
console.log(str);
This works similar to Array.splice:
String.prototype.splice = function (i, j, str) {
return this.substr(0, i) + str + this.substr(j, this.length);
};
You could try
var strArr = str.split("");
strArr[0] = 'h';
str = strArr.join("");
this is easily achievable with RegExp!
const str = 'Hello RegEx!';
const index = 11;
const replaceWith = 'p';
//'Hello RegEx!'.replace(/^(.{11})(.)/, `$1p`);
str.replace(new RegExp(`^(.{${ index }})(.)`), `$1${ replaceWith }`);
//< "Hello RegExp"
Using the spread syntax, you may convert the string to an array, assign the character at the given position, and convert back to a string:
const str = "hello world";
function replaceAt(s, i, c) {
const arr = [...s]; // Convert string to array
arr[i] = c; // Set char c at pos i
return arr.join(''); // Back to string
}
// prints "hallo world"
console.log(replaceAt(str, 1, 'a'));
Check out this function for printing steps
steps(3)
// '#  '
// '## '
// '###'
function steps(n, i = 0, arr = Array(n).fill(' ').join('')) {
if (i === n) {
return;
}
str = arr.split('');
str[i] = '#';
str = str.join('');
console.log(str);
steps(n, (i = i + 1), str);
}
@CemKalyoncu: Thanks for the great answer!
I also adapted it slightly to make it more like the Array.splice method (and took @Ates' note into consideration):
spliceString=function(string, index, numToDelete, char) {
return string.substr(0, index) + char + string.substr(index+numToDelete);
}
var myString="hello world!";
spliceString(myString,myString.lastIndexOf('l'),2,'mhole'); // "hello wormhole!"
If you want to replace characters in string, you should create mutable strings. These are essentially character arrays. You could create a factory:
function MutableString(str) {
var result = str.split("");
result.toString = function() {
return this.join("");
}
return result;
}
Then you can access the characters and the whole array converts to string when used as string:
var x = MutableString("Hello");
x[0] = "B"; // yes, we can alter the character
x.push("!"); // good performance: no new string is created
var y = "Hi, "+x; // converted to string: "Hi, Bello!"
You can extend the string type to include an insert method:
String.prototype.append = function (index,value) {
return this.slice(0,index) + value + this.slice(index);
};
Then you can call the function:
var s = "New string";
alert(s.append(4,"complete "));
You can use substring to take the text before the target index and the text after it, then concatenate them around the replacement character or string:
let myString = "Hello world"; // let, because the variable is reassigned below
const index = 3;
const stringBeforeIndex = myString.substring(0, index);
const stringAfterIndex = myString.substring(index + 1);
const replaceChar = "X";
myString = stringBeforeIndex + replaceChar + stringAfterIndex;
console.log("New string - ", myString)
or
let myString = "Hello world"; // let, because the variable is reassigned below
let index = 3;
myString = myString.substring(0, index) + "X" + myString.substring(index + 1);
I wrote a function that does something similar to what you ask: it checks whether each character in the string is in an array of disallowed characters and, if so, replaces it with ''.
var validate = function(value){
var notAllowed = [";","_",">","<","'","%","$","&","/","|",":","=","*"];
for(var i=0; i<value.length; i++){
if(notAllowed.indexOf(value.charAt(i)) > -1){
value = value.replace(value.charAt(i), "");
value = validate(value);
}
}
return value;
}
Here is a version I came up with if you want to style words or individual characters at their index in react/javascript.
replaceAt( yourArrayOfIndexes, yourString/orArrayOfStrings )
Working example: https://codesandbox.io/s/ov7zxp9mjq
function replaceAt(indexArray, [...string]) {
const replaceValue = i => string[i] = <b>{string[i]}</b>;
indexArray.forEach(replaceValue);
return string;
}
And here is another alternate method
function replaceAt(indexArray, [...string]) {
const startTag = '<b>';
const endTag = '</b>';
const tagLetter = i => string.splice(i, 1, startTag + string[i] + endTag);
indexArray.forEach(tagLetter);
return string.join('');
}
And another...
function replaceAt(indexArray, [...string]) {
for (let i = 0; i < indexArray.length; i++) {
string = Object.assign(string, {
[indexArray[i]]: <b>{string[indexArray[i]]}</b>
});
}
return string;
}
Here is my solution using the ternary operator and map. More readable, maintainable and easier to understand, if you ask me.
It is more into es6 and best practices.
function replaceAt() {
const replaceAt = document.getElementById('replaceAt').value;
const str = 'ThisIsATestStringToReplaceCharAtSomePosition';
const newStr = Array.from(str).map((character, charIndex) => charIndex === (replaceAt - 1) ? '' : character).join('');
console.log(`New string: ${newStr}`);
}
<input type="number" id="replaceAt" min="1" max="44" oninput="replaceAt()"/>
My safe approach with negative indexes
/**
* @param {string} str
* @param {number} index
* @param {string} replacement
* @returns {string}
*/
static replaceAt (str, index, replacement)
{
if (index < 0) index = str.length + index
if (index < 0 || index >= str.length) throw new Error(`Index (${index}) out of bounds "${str}"`)
return str.substring(0, index) + replacement + str.substring(index + 1)
}
Use it like that:
replaceAt('my string', -1, 'G') // 'my strinG'
replaceAt('my string', 2, 'yy') // 'myyystring'
replaceAt('my string', 22, 'yy') // Uncaught Error: Index (22) out of bounds "my string"
Let's say you want to replace the character at the Kth index (0-based) with 'Z'.
You could use a regex to do this.
var re = new RegExp("^(.{" + K + "}).");
str = str.replace(re, "$1Z");
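As a minimal sketch of the regex idea, assuming a hypothetical helper name `replaceAtWithRegex` (not part of the original answer): the pattern anchors at the start, captures the first `k` characters, and then consumes exactly one more character, which is swapped for the replacement. The `s` flag (ES2018) is assumed so that `.` also matches line breaks.

```javascript
// Hypothetical helper: replace the character at 0-based index k using a regex.
function replaceAtWithRegex(str, k, replacement) {
  // ^ anchors at the start, (.{k}) captures the first k characters,
  // and the trailing . consumes the character being replaced.
  const re = new RegExp("^(.{" + k + "}).", "s");
  // A function replacer avoids special handling of "$" in the replacement.
  return str.replace(re, (match, prefix) => prefix + replacement);
}

console.log(replaceAtWithRegex("hello world", 3, "Z")); // "helZo world"
console.log(replaceAtWithRegex("abc", 5, "Z"));         // "abc" (no match, unchanged)
```

If the index is past the end of the string, the pattern does not match and the string comes back unchanged.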
You can use the following function to replace a character or string at a particular position in a string. To replace all of the following matches, use the String.prototype.replaceAllMatches() function.
String.prototype.replaceMatch = function(matchkey, replaceStr, matchIndex) {
var retStr = this, repeatedIndex = 0;
for (var x = 0; (matchkey != null) && (retStr.indexOf(matchkey) > -1); x++) {
if (repeatedIndex == 0 && x == 0) {
repeatedIndex = retStr.indexOf(matchkey);
} else { // matchIndex > 0
repeatedIndex = retStr.indexOf(matchkey, repeatedIndex + 1);
}
if (x == matchIndex) {
retStr = retStr.substring(0, repeatedIndex) + replaceStr + retStr.substring(repeatedIndex + (matchkey.length));
matchkey = null; // To break the loop.
}
}
return retStr;
};
Test:
var str = "yash yas $dfdas.**";
console.log('Index Matched replace : ', str.replaceMatch('as', '*', 2) );
console.log('Index Matched replace : ', str.replaceMatch('y', '~', 1) );
Output:
Index Matched replace : yash yas $dfd*.**
Index Matched replace : yash ~as $dfdas.**
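For comparison, here is a sketch of the same idea as a standalone function rather than a `String.prototype` extension. The name `replaceNthMatch` is an assumption for illustration; it returns the string unchanged when the requested occurrence does not exist.

```javascript
// Hypothetical standalone variant: replace the occurrence of matchKey
// with 0-based number matchIndex, searching left to right.
function replaceNthMatch(str, matchKey, replaceStr, matchIndex) {
  if (matchKey === '' || matchIndex < 0) return str;
  let position = str.indexOf(matchKey);
  // Advance to the next occurrence until the requested one is reached.
  for (let seen = 0; position !== -1 && seen < matchIndex; seen++) {
    position = str.indexOf(matchKey, position + 1);
  }
  if (position === -1) return str; // not enough occurrences
  return str.substring(0, position) + replaceStr +
    str.substring(position + matchKey.length);
}

const sample = "yash yas $dfdas.**";
console.log(replaceNthMatch(sample, 'as', '*', 2)); // "yash yas $dfd*.**"
console.log(replaceNthMatch(sample, 'y', '~', 1));  // "yash ~as $dfdas.**"
```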
I use this to make a string proper case, that is, the first letter is upper case and all the rest are lower case:
function toProperCase(someString){
return someString.charAt(0).toUpperCase().concat(someString.toLowerCase().substring(1,someString.length));
};
The first thing it does is ensure that ALL of the string is lower case: someString.toLowerCase()
Then it converts the very first character to upper case: someString.charAt(0).toUpperCase()
Then it takes a substring of the remaining string, less the first character: someString.toLowerCase().substring(1,someString.length)
Then it concatenates the two and returns the new string: someString.charAt(0).toUpperCase().concat(someString.toLowerCase().substring(1,someString.length))
New parameters could be added for the replacement character index and the replacement character, then two substrings formed and the indexed character replaced then concatenated in much the same way.
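The extension described above might look like the following sketch. The name `replaceCharAt` and its parameter names are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper built the same way as toProperCase: two substrings
// are formed around the index and concatenated with the replacement.
function replaceCharAt(someString, index, replacement) {
  // Leave the string untouched when the index is out of range.
  if (index < 0 || index >= someString.length) {
    return someString;
  }
  return someString.substring(0, index)
    .concat(replacement, someString.substring(index + 1));
}

console.log(replaceCharAt("hello world", 0, "J")); // "Jello world"
console.log(replaceCharAt("hello world", 4, "0")); // "hell0 world"
```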
The solution does not work for negative indexes, so I added a patch to it.
String.prototype.replaceAt = function(index, character) {
// Normalize a negative index so that -1 refers to the last character.
if (index < 0) index += this.length;
return this.substring(0, index) + character + this.substring(index + character.length);
}
"hello world".replace(/(.{3})./, "$1h")
// 'helho world'
The methods on here are complicated.
I would do it this way:
var myString = "this is my string";
myString = myString.replace(myString.charAt(number goes here), "insert replacement here");
This is as simple as it gets.
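One caveat is worth checking with this approach, shown here as a sketch with a sample index: `replace()` called with a string pattern replaces the first occurrence of that string, so the result matches the intended index only when the same character does not appear earlier. The helper name `replaceAtIndex` is an assumption, included for comparison.

```javascript
const myString = "this is my string";

// charAt(5) is "i", but the first "i" in the string is at index 2,
// so that is the one replace() changes.
const viaReplace = myString.replace(myString.charAt(5), "X");
console.log(viaReplace); // "thXs is my string"

// Hypothetical index-based alternative for comparison.
function replaceAtIndex(str, index, replacement) {
  return str.slice(0, index) + replacement + str.slice(index + 1);
}
console.log(replaceAtIndex(myString, 5, "X")); // "this Xs my string"
```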