I have an array which is returned from an API in the format of [a, b, c, d, e, f ... ], where a,c,e and b,d,f are of the same type, respectively. Now I want to group the array into [ [a,b], [c,d], [e,f] ...]. It's fairly easy to do by creating a new array, but the array is large, so that could be slow.
So I'm wondering: are there any methods that can do it in-place?
Do you want it in chunks of two?
var o = ['a', 'b', 'c', 'd', 'e', 'f'],
    size = 2, i, ar = []; // the new array

for (i = 0; i < o.length; i += size) ar.push(o.slice(i, i + size));
Now, ar is:
[
['a', 'b'],
['c', 'd'],
['e', 'f']
]
No matter how you do it, there is always going to be some looping. The engine has to go through all the array elements to build the new array.
Speed Tests
So I'll create an array with this:
var l = 10000000, // The length
o = [], j;
for (j = 0; j < l; j += 1) o.push(j);
That will make an array with l items. Now to test the speed:
var start = performance.now(),
size = 2, ar = [];
for (i = 0; i < o.length; i += size) ar.push(o.slice(i,i + size));
console.log(performance.now() - start);
Tests:
100 Thousand: 0.092909533996135 seconds
1 Million: 0.359059600101318 seconds
10 Million: 10.138852232019417 seconds
The 10 million time might surprise you, but if you have an array that big you have bigger problems, such as memory issues. And if this array is coming from a server, you are probably putting excessive strain on it.
This is wanton use of a library given that the OP is concerned about performance, but I like using lodash/underscore for easily comprehensible code:
_.partition('a,b,c,d,e,f'.split(','), function(_, idx) {return !(idx % 2);})
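Worth noting: _.partition splits the items into the two type groups ([a, c, e] and [b, d, f]). If the goal is the pairwise [[a,b], [c,d], ...] grouping from the question, lodash also provides _.chunk, which splits an array into groups of a given size:

_.chunk('a,b,c,d,e,f'.split(','), 2)
// => [['a', 'b'], ['c', 'd'], ['e', 'f']]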
An in-place solution is to iterate as normal, building pairs and 'skipping' elements by splicing them out before you reach them.
var arr = ['a', 'b', 'c', 'd', 'e', 'f'];
function compact (arr) {
  for (var i = 0; i < arr.length; i++) {
    arr[i] = [arr[i], arr[i + 1]]; // replace the current element with a pair
    arr.splice(i + 1, 1);          // drop the element that was absorbed into the pair
  }
  return arr; // debug only
}
console.log(compact(arr.slice()));
// >> [["a", "b"], ["c", "d"], ["e", "f"]]
Untested as far as performance goes. I would agree with the comments that it's most likely slower to manipulate the array in place, as opposed to building a new array.
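A rough way to check would be a micro-benchmark along these lines (a sketch; buildNew is my name for the new-array approach, and it assumes the compact function above is in scope):

// Illustrative micro-benchmark; results are engine-dependent.
function buildNew(arr, size) {
  var out = [];
  for (var i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
}

var data = [];
for (var j = 0; j < 1e6; j++) data.push(j);

var t0 = performance.now();
buildNew(data, 2);
console.log('new array:', performance.now() - t0, 'ms');

var t1 = performance.now();
compact(data); // the in-place function defined above
console.log('in place:', performance.now() - t1, 'ms');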
Related
I have an array of objects
const objects = [a, b, c, d, e, f, g ... ]
and I want them to turn into
const result = [a, [b, c], d, [e, f], g ... ]
Any ideas?
[Edit] My apologies. This is my first post; I didn't know I had to show my attempts. I don't think I deserve the mean comments either; be nice, people. I solved it after a head-banging 4 hours. Here is my solution:
const result = []
const method = array => {
  for (let i = 0; i < array.length; i += 3) {
    const set = new Set([array[i + 1], array[i + 2]])
    if (i !== array.length - 1) {
      result.push(array[i])
      result.push(Array.from(set))
    } else {
      result.push(array[i])
    }
  }
}
Thanks for the responses guys! I read every single one of them.
You could take a while loop and push either an item or a pair of items.
var array = ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
    grouped = [],
    i = 0;

while (i < array.length) {
  grouped.push(array[i++]);
  if (i >= array.length) break;
  grouped.push(array.slice(i, i += 2));
}
console.log(grouped);
You can do this with a plain for loop and the % (modulo) operator.
const objects = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
const result = []
for (let i = 0; i < objects.length; i++) {
  if (i % 3 === 0) {
    const arr = objects.slice(i + 1, i + 3)
    result.push(objects[i])
    if (arr.length) result.push(arr)
  }
}
console.log(result)
This is my solution:
const objects = ["a", "b", "c", "d", "e", "f", "g"];
let result = [];
let toGroup = false;

for (let i = 0; i < objects.length; i++) {
  if (toGroup) {
    result.push([objects[i], objects[++i]]);
  }
  else result.push(objects[i]);
  toGroup = !toGroup;
}
This has a particular case, which you have not specified, where it doesn't work: for example, when objects contains only 2 elements. I don't know what you would like to do in that case; a small guard like the one sketched below would handle it.
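If the desired behavior is to keep a lone trailing element ungrouped, this variant adds that guard (a sketch; the function name is mine, and the assumption that a leftover element should be pushed on its own is not from the original question):

const groupAlternating = (objects) => {
  const result = [];
  let toGroup = false;
  for (let i = 0; i < objects.length; i++) {
    // only group when a partner element actually exists
    if (toGroup && i + 1 < objects.length) {
      result.push([objects[i], objects[++i]]);
    } else {
      result.push(objects[i]);
    }
    toGroup = !toGroup;
  }
  return result;
};

console.log(groupAlternating(['a', 'b'])); // ['a', 'b']
console.log(groupAlternating(['a', 'b', 'c', 'd', 'e', 'f', 'g']));
// ['a', ['b', 'c'], 'd', ['e', 'f'], 'g']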
What is the difference between the spread operator and array.concat()?
let parts = ['four', 'five'];
let numbers = ['one', 'two', 'three'];
console.log([...numbers, ...parts]);
Array.concat() function
let parts = ['four', 'five'];
let numbers = ['one', 'two', 'three'];
console.log(numbers.concat(parts));
Both results are the same. So in what kinds of scenarios would we want to use each? And which one is better for performance?
concat and spread are very different when the argument is not an array: concat adds it as a whole, while ... tries to iterate it and fails if it can't. Consider:
a = [1, 2, 3]
x = 'hello';
console.log(a.concat(x)); // [ 1, 2, 3, 'hello' ]
console.log([...a, ...x]); // [ 1, 2, 3, 'h', 'e', 'l', 'l', 'o' ]
Here, concat treats the string atomically, while ... uses its default iterator, char-by-char.
Another example:
x = 99;
console.log(a.concat(x)); // [1, 2, 3, 99]
console.log([...a, ...x]); // TypeError: x is not iterable
Again, for concat the number is an atom, while ... tries to iterate it and fails.
Finally:
function* gen() { yield *'abc' }
console.log(a.concat(gen())); // [ 1, 2, 3, Object [Generator] {} ]
console.log([...a, ...gen()]); // [ 1, 2, 3, 'a', 'b', 'c' ]
concat makes no attempt to iterate the generator and appends it as a whole, while ... nicely fetches all values from it.
To sum it up, when your arguments are possibly non-arrays, the choice between concat and ... depends on whether you want them to be iterated.
The above describes the default behaviour of concat; however, ES6 provides a way to override it with Symbol.isConcatSpreadable. By default, this symbol is true for arrays and false for everything else. Setting it to true tells concat to iterate the argument, just like ... does:
str = 'hello'
console.log([1,2,3].concat(str)) // [1,2,3, 'hello']
str = new String('hello');
str[Symbol.isConcatSpreadable] = true;
console.log([1,2,3].concat(str)) // [ 1, 2, 3, 'h', 'e', 'l', 'l', 'o' ]
Performance-wise concat is faster, probably because it can benefit from array-specific optimizations, while ... has to conform to the common iteration protocol. Timings:
let big = (new Array(1e5)).fill(99);
let i, x;
console.time('concat-big');
for(i = 0; i < 1e2; i++) x = [].concat(big)
console.timeEnd('concat-big');
console.time('spread-big');
for(i = 0; i < 1e2; i++) x = [...big]
console.timeEnd('spread-big');
let a = (new Array(1e3)).fill(99);
let b = (new Array(1e3)).fill(99);
let c = (new Array(1e3)).fill(99);
let d = (new Array(1e3)).fill(99);
console.time('concat-many');
for(i = 0; i < 1e2; i++) x = [1,2,3].concat(a, b, c, d)
console.timeEnd('concat-many');
console.time('spread-many');
for(i = 0; i < 1e2; i++) x = [1,2,3, ...a, ...b, ...c, ...d]
console.timeEnd('spread-many');
Well console.log(['one', 'two', 'three', 'four', 'five']) has the same result as well, so why use either here? :P
In general, you would use concat when you have two (or more) arrays from arbitrary sources, and you would use spread syntax in an array literal when the additional elements that are always part of the array are known beforehand. So if you would have an array literal with concat in your code, just go for spread syntax, and use concat otherwise:
[...a, ...b] // bad :-(
a.concat(b) // good :-)
[x, y].concat(a) // bad :-(
[x, y, ...a] // good :-)
Also the two alternatives behave quite differently when dealing with non-array values.
I am replying just to the performance question, since there are already good answers regarding the scenarios. I wrote a test and executed it on the most recent browsers. Below are the results and the code.
/*
* Performance results.
* Browser Spread syntax concat method
* --------------------------------------------------
* Chrome 75 626.43ms 235.13ms
* Firefox 68 928.40ms 821.30ms
* Safari 12 165.44ms 152.04ms
* Edge 18 1784.72ms 703.41ms
* Opera 62 590.10ms 213.45ms
* --------------------------------------------------
*/
Below is the code I wrote and used.
const array1 = [];
const array2 = [];
const mergeCount = 50;
let spreadTime = 0;
let concatTime = 0;
// Populate the arrays to merge with 10,000,000 elements.
for (let i = 0; i < 10000000; ++i) {
  array1.push(i);
  array2.push(i);
}

// The spread syntax performance test.
for (let i = 0; i < mergeCount; ++i) {
  const startTime = performance.now();
  const array3 = [ ...array1, ...array2 ];
  spreadTime += performance.now() - startTime;
}

// The concat performance test.
for (let i = 0; i < mergeCount; ++i) {
  const startTime = performance.now();
  const array3 = array1.concat(array2);
  concatTime += performance.now() - startTime;
}

console.log(spreadTime / mergeCount);
console.log(concatTime / mergeCount);
One difference I think is valid: spreading a large array into a function call such as push can throw "Maximum call stack size exceeded", because every element is passed as a separate argument; you can avoid this by using the concat method.
var someArray = new Array(600000);
var newArray = [];
var tempArray = [];
someArray.fill("foo");

try {
  newArray.push(...someArray);
} catch (e) {
  console.log("Using spread operator:", e.message);
}

tempArray = newArray.concat(someArray);
console.log("Using concat function:", tempArray.length);
There is one very important difference between concat and push: the former does not mutate the underlying array, so you must assign the result to the same or a different array:
let things = ['a', 'b', 'c'];
let moreThings = ['d', 'e'];
things.concat(moreThings);
console.log(things); // [ 'a', 'b', 'c' ]
things.push(...moreThings);
console.log(things); // [ 'a', 'b', 'c', 'd', 'e' ]
I've seen bugs caused by the assumption that concat changes the array (talking for a friend ;).
Update:
Concat is now always faster than spread. The following benchmark shows both small and large-size arrays being joined: https://jsbench.me/nyla6xchf4/1
// preparation
const a = Array.from({length: 1000}).map((_, i)=>`${i}`);
const b = Array.from({length: 2000}).map((_, i)=>`${i}`);
const aSmall = ['a', 'b', 'c', 'd'];
const bSmall = ['e', 'f', 'g', 'h', 'i'];
const c = [...a, ...b];
// vs
const c = a.concat(b);
const c = [...aSmall, ...bSmall];
// vs
const c = aSmall.concat(bSmall)
Previous:
Although some of the replies are correct when it comes to performance on big arrays, the performance is quite different when you are dealing with small arrays.
You can check the results for yourself at https://jsperf.com/spread-vs-concat-size-agnostic.
As you can see, spread is 50% faster for smaller arrays, while concat is multiple times faster on large arrays.
The answer by @georg was helpful for the comparison. I was also curious how .flat() would compare, and it was by far the worst. Don't use .flat() if speed is a priority (something I wasn't aware of until now).
let big = new Array(1e5).fill(99);
let i, x;
console.time("concat-big");
for (i = 0; i < 1e2; i++) x = [].concat(big);
console.timeEnd("concat-big");
console.time("spread-big");
for (i = 0; i < 1e2; i++) x = [...big];
console.timeEnd("spread-big");
console.time("flat-big");
for (i = 0; i < 1e2; i++) x = [[], big].flat();
console.timeEnd("flat-big");
let a = new Array(1e3).fill(99);
let b = new Array(1e3).fill(99);
let c = new Array(1e3).fill(99);
let d = new Array(1e3).fill(99);
console.time("concat-many");
for (i = 0; i < 1e2; i++) x = [1, 2, 3].concat(a, b, c, d);
console.timeEnd("concat-many");
console.time("spread-many");
for (i = 0; i < 1e2; i++) x = [1, 2, 3, ...a, ...b, ...c, ...d];
console.timeEnd("spread-many");
console.time("flat-many");
for (i = 0; i < 1e2; i++) x = [1, 2, 3, a, b, c, d].flat();
console.timeEnd("flat-many");
So let's say I have a set of items:
['a', 'b', 'c', 'd', 'e']
and I want to generate a random set (order does not matter) from those choices:
['a', 'e', 'd', 'c']
which is child's play, but instead of it being unlikely to generate a uniform result:
['c', 'c', 'c', 'c']
compared to something less uniform like:
['a', 'b', 'e', 'd']
I want it to be equally likely that a uniform set is generated as a non-uniform one.
Edit:
The result I'm trying to express is not just ['c', 'c', 'c', 'c', 'c', 'c'] or ['d', 'd', 'd', 'd', 'd', 'd'], but also the areas in between those uniformities, like ['c', 'c', 'c', 'c', 'c', 'a'] or ['d', 'd', 'd', 'd', 'd', 'b'] or ['c', 'c', 'c', 'c', 'b', 'a']. Making all of those uniform sets, and the areas in between, as likely as non-uniform results is what I find challenging. I'm at a loss for where to even begin creating a set generator that does that.
Further clarification:
So if I generate a set of 1000 items, I want it to be equally likely that the set is 90% uniform or 100% uniform or 80% uniform or 20% uniform.
How can/should this be done?
From what you're saying, you want to ignore the order of the elements in your random set, so if your original set was ab then the possible outcomes (ignoring order) would be aa, ab, bb, and you'd like to see each of those appearing with equal probability (of 1/3), no?
A brute-force solution to this would be:
generate all outcomes (see Finding All Combinations of JavaScript array values),
sort each of the results so the elements appear alphabetically,
remove duplicates (see Remove duplicates from an array of objects in javascript)
select one of the remaining results at random
So for example, with abc:
all combinations = [`aaa`, `aab`, `aac`
`aba`, `abb`, `abc`
`aca`, `acb`, `acc`
`baa`, `bab`, `bac`
`bba`, `bbb`, `bbc`
`bca`, `bcb`, `bcc`
`caa`, `cab`, `cac`
`cba`, `cbb`, `cbc`
`cca`, `ccb`, `ccc`]
sorted combinations = [`aaa`, `aab`, `aac`
`aab`, `abb`, `abc`
`aac`, `abc`, `acc`
`aab`, `abb`, `abc`
`abb`, `bbb`, `bbc`
`abc`, `bbc`, `bcc`
`aac`, `abc`, `acc`
`abc`, `bbc`, `bcc`
`acc`, `bcc`, `ccc`]
remove duplicates = [`aaa`, `aab`, `aac`,
`abb`, `abc`, `acc`,
`bbb`, `bbc`, `bcc`,
`ccc`]
then choose from these with equal probability of 1/10
EDIT: The final output above gives a clue to a non-brute-force solution: in each item, every letter is followed only by letters of equal or 'higher' value (alphabetically speaking). So 'b' will never be followed by 'a', 'c' will never be followed by 'a' or 'b', and so on.
So we can recursively generate all the combinations like this (sorry, it's in Python; a JavaScript translation follows below):
r = ['a', 'b', 'c', 'd']

def get_combos(bound, n):
    global r
    if n == 1:
        return r[bound:]
    result = []
    for i in range(bound, len(r)):
        for combo in get_combos(i, n - 1):
            result.append(r[i] + combo)
    return result
x = get_combos(0,len(r))
print(x) # ['aaaa', 'aaab', 'aaac', 'aaad', 'aabb', 'aabc', 'aabd', 'aacc', 'aacd', 'aadd', 'abbb', 'abbc', 'abbd', 'abcc', 'abcd', 'abdd', 'accc', 'accd', 'acdd', 'addd', 'bbbb', 'bbbc', 'bbbd', 'bbcc', 'bbcd', 'bbdd', 'bccc', 'bccd', 'bcdd', 'bddd', 'cccc', 'cccd', 'ccdd', 'cddd', 'dddd']
print(len(x)) # 35
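Since the rest of this thread is JavaScript, here is a rough translation of the Python above (a sketch; the function and variable names mirror the Python):

const r = ['a', 'b', 'c', 'd'];

// Generate all sorted combinations (with repetition) of length n,
// using only elements at index >= bound.
function getCombos(bound, n) {
  if (n === 1) return r.slice(bound);
  const result = [];
  for (let i = bound; i < r.length; i++) {
    for (const combo of getCombos(i, n - 1)) {
      result.push(r[i] + combo);
    }
  }
  return result;
}

const x = getCombos(0, r.length);
console.log(x.length); // 35, matching the Python output
console.log(x[Math.floor(Math.random() * x.length)]); // pick one uniformly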
This can in fact be done. Just get a random number, and if it is over 0.5 then generate a random set, otherwise generate an extreme set from a random index. See the following code:
function generateRandomOrExtremeSet(a) {
  var n = Math.random();
  var set = [];
  if (n > 0.5) {
    // non-extreme: every slot is an independent random pick
    for (var i = 0; i < a.length; ++i)
      set[i] = a[Math.round(Math.random() * (a.length - 1))];
  } else {
    // extreme: mostly repeat one randomly chosen element
    var index = Math.round(Math.random() * (a.length - 1));
    for (var i = 0; i < a.length; ++i) {
      if (Math.random() > 0.8) // change to adjust extremeness
        set[i] = a[Math.round(Math.random() * (a.length - 1))];
      else
        set[i] = a[index];
    }
  }
  return set;
}
Here's a simple snippet that gets it done, using the strategy of first deciding whether to generate an extreme or non-extreme set.
const choices = ['a', 'b', 'c', 'd', 'e']
const repeatN = (times, x) => {
  const ret = [];
  while (times--) {
    ret.push(x);
  }
  return ret;
};

const chooseN = (n, choices) => {
  let list = choices.slice();
  let ret = [];
  while (n-- && list.length) {
    let i = Math.floor(Math.random() * list.length);
    ret.push(list.splice(i, 1)[0]); // remove the pick so it can't repeat
  }
  return ret;
};

const set = Math.random() > 0.5 ?
  chooseN(5, choices) :
  repeatN(5, chooseN(1, choices)[0]);
console.log(set);
Your original question seems to be stated incorrectly, which is why you are getting so many incorrect responses. In fact, the simple approach will give you the result that you want. You should do this by choosing a random value from your original set, like this:
function randomSet(set, size) {
  let result = []
  for (let i = 0; i < size; i++) {
    // get a random index from the original set
    let index = Math.floor(Math.random() * set.length)
    result.push(set[index])
  }
  return result
}
console.log(randomSet(['a', 'b', 'c'], 3))
// These are all equally likely:
// ['a','a','a']
// ['a','b','b']
// ['a','b','c']
// ['c','b','a']
// ['a','b','a']
// ['a','a','b']
// ['b','a','a']
// ['b','a','b']
// ['b','b','b']
// etc.
Alternatively, it's possible that you are misunderstanding the definition of a set. Many of your examples like ['c', 'c', 'c', 'c', 'b', 'a'] are not sets, because they contain repeat characters. Proper sets cannot contain repeat characters and the order of their contents does not matter. If you want to generate a random set from your initial set (in other words, generate a subset), you can do that by picking a size less than or equal to your initial set size, and filling a new set of that size with random elements from the initial set:
function randomSet(set) {
  // work on a copy so we can remove items from it
  let result = set.slice()
  // pick a target size between 1 and the original length
  let size = Math.floor(Math.random() * set.length) + 1
  while (result.length !== size) {
    // in this case, we construct the new set simply by removing random items
    let index = Math.floor(Math.random() * result.length)
    result.splice(index, 1)
  }
  return result
}
console.log(randomSet(['a', 'b', 'c']))
// These are all equally likely:
// ['a','b','c']
// ['a','b']
// ['b','c']
// ['a','c']
// ['a']
// ['b']
// ['c']
// no other sets are possible
I'm working on improving my JavaScript skills and I want to understand the mechanics behind pop() and push(). I'm reading Marijn Haverbeke's Eloquent JavaScript book and I'm working on chapter 4's Reversing Array exercise. I was able to solve the problem; however, I ran into an interesting quirk. My first code attempt was:
var arr = ['a', 'b', 'c', 'd'];
function reverseArray(array) {
  var newArray = [];
  console.log(array.length);
  for (var i = 0; i <= array.length; i++) {
    newArray[i] = array.pop();
  }
  return newArray;
}
reverseArray(arr);
This resulted in ['d', 'c', 'b'], and the 'a' was missing. I don't understand why. Can someone explain?
My second code attempt was:
var arr = ['a', 'b', 'c', 'd'];
function reverseArray(array) {
  var newArray = [];
  console.log(array.length);
  for (var i = array.length - 1; i >= 0; i--) {
    newArray.push(array[i]);
  }
  return newArray;
}
console.log(reverseArray(arr));
This resulted in the correct reversal of the array: ['d', 'c', 'b', 'a']. Can someone explain why this worked?
Here lies your problem:

for (var i = 0; i <= array.length; i++) {
  newArray[i] = array.pop();
}

For each iteration:

i = 0   array.length: 4  // pops 'd'
i = 1   array.length: 3  // pops 'c'
i = 2   array.length: 2  // pops 'b'
i = 3   array.length: 1  // 'a' won't be popped

Now your loop stops, because you told it to in:

i <= array.length
// 3 <= 1 returns false, so the for loop stops
Not sure if you noticed, but push() and pop() change the .length property of an array.
The first function doesn't return the expected result because the array gets shorter as you pop its elements, and each turn the loop compares i against the array's new length.
You could store the array's initial length and compare against it:
var arr = ['a', 'b', 'c', 'd'];

function reverseArray(array) {
  var newArray = [],
      length = array.length; // save the initial length
  for (var i = 0; i < length; i++) {
    newArray[i] = array.pop();
  }
  return newArray;
}
console.log(reverseArray(arr));
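As a side note, since pop() shrinks the array, you can also let the array's length drive the loop directly. A minimal sketch (like the first attempt, it empties the input array):

function reverseArray(array) {
  var newArray = [];
  // pop() removes the last element and shrinks the array,
  // so the loop naturally stops once everything has been moved
  while (array.length > 0) {
    newArray.push(array.pop());
  }
  return newArray;
}

console.log(reverseArray(['a', 'b', 'c', 'd'])); // ['d', 'c', 'b', 'a']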
I have an array:
var arr = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
and I have an array of indices which I wish to remove:
var remove = [1, 3, 5]
so that the result is :
arr === ['A', 'C', 'E', 'G']
I can't do it with splice in a loop:

// WRONG
for (i = 0, l = remove.length; i < l; i++) {
  arr.splice(remove[i], 1);
}
because after every iteration the index of each element has changed.
So how can I do this?
> arr.filter(function(x,i){return remove.indexOf(i)==-1})
["A", "C", "E", "G"]
To be more efficient, convert remove into an object/hashtable first, like so:

var removeTable = {};
remove.forEach(function(x){ removeTable[x] = true; });

> arr.filter(function(x,i){ return !removeTable[i]; })
["A", "C", "E", "G"]
To not change your thinking too much: start at the end.
A B C D E F..
When you remove element 5, it becomes..
A B C D E
Then you remove element 3, it becomes..
A B C E
Which is exactly what you want.
Count backwards:

// RIGHT
for (i = remove.length - 1; i >= 0; i--) {
  arr.splice(remove[i], 1);
}
Start the loop from the end and remove the elements with the highest index first.
As an alternative suggestion, you could use .push() to send the items you want to keep to a third array, as sketched below. This would allow you to keep the original array intact, although it seems you don't want/need to do this.
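A sketch of that approach (variable names are mine):

var arr = ['A', 'B', 'C', 'D', 'E', 'F', 'G'];
var remove = [1, 3, 5];
var kept = [];

for (var i = 0; i < arr.length; i++) {
  // keep the element only if its index is not marked for removal
  if (remove.indexOf(i) === -1) kept.push(arr[i]);
}

console.log(kept); // ['A', 'C', 'E', 'G']
console.log(arr);  // the original array is untouched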