I have a sorted JavaScript array and want to insert one more item into it such that the resulting array remains sorted. I could certainly implement a simple quicksort-style insertion function:
var array = [1, 2, 3, 4, 5, 6, 7, 8, 9];
var element = 3.5;

function insert(element, array) {
  array.splice(locationOf(element, array) + 1, 0, element);
  return array;
}

function locationOf(element, array, start, end) {
  start = start || 0;
  end = end || array.length;
  var pivot = parseInt(start + (end - start) / 2, 10);
  if (end - start <= 1 || array[pivot] === element) return pivot;
  if (array[pivot] < element) {
    return locationOf(element, array, pivot, end);
  } else {
    return locationOf(element, array, start, pivot);
  }
}

console.log(insert(element, array));
[WARNING] This code has a bug when inserting at the beginning of the array: e.g. insert(2, [3, 7, 9]) incorrectly produces [3, 2, 7, 9].
However, I noticed that implementations of the Array.sort function might potentially do this for me, natively:
var array = [1, 2, 3, 4, 5, 6, 7, 8, 9];
var element = 3.5;

function insert(element, array) {
  array.push(element);
  array.sort(function (a, b) {
    return a - b;
  });
  return array;
}

console.log(insert(element, array));
Is there a good reason to choose the first implementation over the second?
Edit: Note that in the general case, an O(log(n)) insertion (as implemented in the first example) will be faster than a generic sorting algorithm; however, this is not necessarily the case for JavaScript in particular. Note that:
Best case for several sorting algorithms is O(n), which is still significantly different from O(log(n)), but not quite as bad as the O(n log(n)) mentioned below. It comes down to the particular sorting algorithm used (see Javascript Array.sort implementation?).
The sort method in JavaScript is a native function, so it potentially realizes huge benefits -- O(log(n)) with a huge coefficient can still be much worse than O(n) for reasonably sized data sets.
Simple (Demo):
function sortedIndex(array, value) {
  var low = 0,
      high = array.length;

  while (low < high) {
    var mid = (low + high) >>> 1;
    if (array[mid] < value) low = mid + 1;
    else high = mid;
  }
  return low;
}
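For example, insertion then becomes a one-liner with splice (a minimal usage sketch):

// find the index, then splice the value in
var arr = [10, 20, 30];
arr.splice(sortedIndex(arr, 25), 0, 25);
console.log(arr); // [10, 20, 25, 30]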
Just as a single data point, for kicks I tested this out, inserting 1000 random elements into an array of 100,000 pre-sorted numbers using the two methods, in Chrome on Windows 7:
First Method:
~54 milliseconds
Second Method:
~57 seconds
So, at least on this setup, the native method doesn't make up for it. This is true even for small data sets, inserting 100 elements into an array of 1000:
First Method:
~1 millisecond
Second Method:
34 milliseconds
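For reference, a rough sketch of the kind of harness described above (performance.now() is assumed available, as in browsers and modern Node; exact numbers will vary by engine and machine):

// times insertCount insertions into a pre-sorted array of baseSize numbers
function timeInserts(insertFn, baseSize, insertCount) {
  var arr = [];
  for (var i = 0; i < baseSize; i++) arr.push(i); // pre-sorted base array
  var t0 = performance.now();
  for (var j = 0; j < insertCount; j++) {
    insertFn(Math.random() * baseSize, arr); // insert(element, array) signature
  }
  return performance.now() - t0;
}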
Very good and remarkable question with a very interesting discussion! I, too, was using the Array.sort() function after pushing a single element into an array with some thousands of objects.
I had to extend your locationOf function for my purposes, because I have complex objects and therefore need a compare function, as with Array.sort():
function locationOf(element, array, comparer, start, end) {
  if (array.length === 0)
    return -1;

  start = start || 0;
  end = end || array.length;
  var pivot = (start + end) >> 1;  // should be faster than dividing by 2

  // note: the comparer must return exactly -1, 0, or 1
  var c = comparer(element, array[pivot]);
  if (end - start <= 1) return c == -1 ? pivot - 1 : pivot;

  switch (c) {
    case -1: return locationOf(element, array, comparer, start, pivot);
    case 0: return pivot;
    case 1: return locationOf(element, array, comparer, pivot, end);
  }
}
// sample comparer for objects like {lastName: 'Miller', ...}
var patientCompare = function (a, b) {
  if (a.lastName < b.lastName) return -1;
  if (a.lastName > b.lastName) return 1;
  return 0;
};
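A hypothetical usage, pairing this locationOf with the splice-at-index-plus-one pattern from the question:

var patients = [
  { lastName: 'Abbott' },
  { lastName: 'Miller' },
  { lastName: 'Smith' }
];
var newPatient = { lastName: 'Jones' };

// locationOf returns the index *before* the insertion point (or -1),
// so insert at index + 1, as in the question's insert()
patients.splice(locationOf(newPatient, patients, patientCompare) + 1, 0, newPatient);
// patients is now ordered: Abbott, Jones, Miller, Smith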
There's a bug in your code. It should read:
function locationOf(element, array, start, end) {
  start = start || 0;
  end = end || array.length;
  var pivot = parseInt(start + (end - start) / 2, 10);
  if (array[pivot] === element) return pivot;
  if (end - start <= 1)
    return array[pivot] > element ? pivot - 1 : pivot;
  if (array[pivot] < element) {
    return locationOf(element, array, pivot, end);
  } else {
    return locationOf(element, array, start, pivot);
  }
}
Without this fix the code will never be able to insert an element at the beginning of the array.
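A quick check of the fixed function, using the insert() from the question:

console.log(insert(2, [3, 7, 9])); // [2, 3, 7, 9] -- previously [3, 2, 7, 9]
console.log(insert(8, [3, 7, 9])); // [3, 7, 8, 9]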
I know this is an old question that already has an answer, and there are a number of other decent answers. I see some answers proposing that you can look up the correct insertion index in O(log n) -- you can, but you can't insert in that time, because the array needs to be partially copied out to make space.
Bottom line: if you really need O(log n) inserts and deletes into a sorted collection, you need a different data structure -- not an array. You should use a B-tree. The performance gains you will get from using a B-tree for a large data set will dwarf any of the improvements offered here.
If you must use an array, I offer the following code, based on insertion sort, which works if (and only if) the array is already sorted. This is useful for the case where you need to re-sort after every insert:
function addAndSort(arr, val) {
  arr.push(val);
  // note: declare i with var to avoid leaking a global
  for (var i = arr.length - 1; i > 0 && arr[i] < arr[i - 1]; i--) {
    var tmp = arr[i];
    arr[i] = arr[i - 1];
    arr[i - 1] = tmp;
  }
  return arr;
}
It should operate in O(n), which I think is the best you can do with an array. It would be nicer if JS supported multiple assignment (since ES6, destructuring assignment does).
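For what it's worth, the swap body above can now be written with destructuring:

// ES6 destructuring swap, equivalent to the tmp-based swap above
[arr[i], arr[i - 1]] = [arr[i - 1], arr[i]];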
Here's an example to play with:
Update:
This might be faster:
function addAndSort2(arr, val) {
  arr.push(val);
  var i = arr.length - 1;
  var item = arr[i];
  while (i > 0 && item < arr[i - 1]) {
    arr[i] = arr[i - 1];
    i -= 1;
  }
  arr[i] = item;
  return arr;
}
Update 2
#terrymorse pointed out in the comments that JavaScript's Array.splice method is crazy fast, and it's more than just a constant improvement in time complexity. It seems some linked-list magic is being used. This means you may still need a different data structure than a plain array -- it's just that JavaScript arrays might provide that different data structure natively.
Updated JS Bin link
Your insertion function assumes that the given array is sorted, so it searches directly for the location where the new element can be inserted, usually by looking at only a few of the elements in the array.
The general sort function of an array can't take these shortcuts. Obviously it at least has to inspect all elements in the array to see if they are already correctly ordered. This fact alone makes the general sort slower than the insertion function.
A generic sort algorithm is usually O(n ⋅ log(n)) on average, and depending on the implementation, an already-sorted array can actually be the worst case, leading to O(n²). Directly searching for the insertion position instead has just a complexity of O(log(n)), so it will always be much faster.
Here's a version that uses lodash.
const _ = require('lodash');

sortedArr.splice(_.sortedIndex(sortedArr, valueToInsert), 0, valueToInsert);
Note: sortedIndex does a binary search.
For a small number of items, the difference is pretty trivial. However, if you're inserting a lot of items, or working with a very large array, calling .sort() after each insertion will cause a tremendous amount of overhead.
I ended up writing a pretty slick binary search/insert function for this exact purpose, so I thought I'd share it. Since it uses a while loop instead of recursion, there is no overhead from extra function calls, so I think the performance will be even better than either of the originally posted methods. It emulates the default Array.sort() comparator by default, but accepts a custom comparator function if desired.
function insertSorted(arr, item, comparator) {
  if (comparator == null) {
    // emulate the default Array.sort() comparator
    comparator = function (a, b) {
      if (typeof a !== 'string') a = String(a);
      if (typeof b !== 'string') b = String(b);
      return (a > b ? 1 : (a < b ? -1 : 0));
    };
  }

  // get the index we need to insert the item at
  var min = 0;
  var max = arr.length;
  var index = Math.floor((min + max) / 2);
  while (max > min) {
    if (comparator(item, arr[index]) < 0) {
      max = index;
    } else {
      min = index + 1;
    }
    index = Math.floor((min + max) / 2);
  }

  // insert the item
  arr.splice(index, 0, item);
}
If you're open to using other libraries, lodash provides sortedIndex and sortedLastIndex functions, which could be used in place of the while loop. The two potential downsides are 1) performance isn't as good as my method (though I'm not sure how much worse it is) and 2) it does not accept a custom comparator function, only a method for getting the value to compare (using the default comparator, I assume).
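For reference, lodash's sortedIndexBy is the variant that takes such a value-extracting iteratee; a small sketch:

const _ = require('lodash');

const people = [{ age: 20 }, { age: 30 }];
const newPerson = { age: 25 };

// sortedIndexBy takes an iteratee that extracts the comparison value
const idx = _.sortedIndexBy(people, newPerson, p => p.age);
people.splice(idx, 0, newPerson); // [{age: 20}, {age: 25}, {age: 30}]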
Here are a few thoughts:
Firstly, if you're genuinely concerned about the runtime of your code, be sure to know what happens when you call built-in functions! I don't know up from down in JavaScript, but a quick google of the splice function returned this, which seems to indicate that you're creating a whole new array on each call! I don't know if it actually matters, but it is certainly relevant to efficiency. I see that Breton, in the comments, has already pointed this out, but it holds for whatever array-manipulating function you choose.
Anyways, onto actually solving the problem.
When I read that you wanted to sort, my first thought was insertion sort! It is handy because it runs in linear time on sorted, or nearly-sorted, lists. As your arrays will have only 1 element out of order, that counts as nearly-sorted (except for, well, arrays of size 2 or 3 or whatever, but at that point, c'mon). Now, implementing the sort isn't too bad, but it may be a hassle you don't want to deal with, and again, I don't know a thing about JavaScript and whether it will be easy or hard. This removes the need for your lookup function, and you just push (as Breton suggested).
Secondly, your "quicksort-esque" lookup function seems to be a binary search algorithm! It is a very nice algorithm, intuitive and fast, but with one catch: it is notoriously difficult to implement correctly. I won't dare say whether yours is correct or not (I hope it is, of course! :)), but be wary if you want to use it.
Anyways, summary: using "push" with insertion sort will work in linear time (assuming the rest of the array is sorted) and avoids any messy binary-search-algorithm requirements. I don't know if this is the best way (underlying implementation of arrays, maybe a crazy built-in function does it better, who knows), but it seems reasonable to me. :)
- Agor.
Here's a comparison of four different algorithms for accomplishing this:
https://jsperf.com/sorted-array-insert-comparison/1
Algorithms
Naive: just push and sort() afterwards
Linear: iterate over array and insert where appropriate
Binary Search: taken from https://stackoverflow.com/a/20352387/154329
"Quick Sort Like": the refined solution from syntheticzero (https://stackoverflow.com/a/18341744/154329)
Naive is always horrible. It seems that for small array sizes, the other three don't differ much, but for larger arrays, the last two outperform the simple linear approach.
The best data structure I can think of is an indexed skip list, which maintains the insertion properties of linked lists with a hierarchy that enables log-time operations. On average, search, insertion, and random-access lookup can all be done in O(log n) time.
An order statistic tree enables log time indexing with a rank function.
If you do not need random access but you need O(log n) insertion and searching for keys, you can ditch the array structure and use any kind of binary search tree.
None of the answers that use array.splice() are efficient in this sense, since splice is on average O(n) time (see What's the time complexity of array.splice() in Google Chrome?).
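For illustration, here is a minimal (unbalanced) binary search tree sketch in that spirit; note that guaranteed O(log n) behavior requires a self-balancing variant such as an AVL or red-black tree:

// Minimal unbalanced BST sketch (illustrative only; real O(log n)
// guarantees require a self-balancing tree).
class TreeNode {
  constructor(key) { this.key = key; this.left = null; this.right = null; }
}

class BST {
  constructor() { this.root = null; }

  insert(key) { // average O(log n); O(n) worst case when unbalanced
    const node = new TreeNode(key);
    if (!this.root) { this.root = node; return; }
    let curr = this.root;
    for (;;) {
      if (key < curr.key) {
        if (!curr.left) { curr.left = node; return; }
        curr = curr.left;
      } else {
        if (!curr.right) { curr.right = node; return; }
        curr = curr.right;
      }
    }
  }

  has(key) {
    let curr = this.root;
    while (curr) {
      if (key === curr.key) return true;
      curr = key < curr.key ? curr.left : curr.right;
    }
    return false;
  }
}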
Here is my function; it uses binary search to find the item's position and then inserts appropriately:
function binaryInsert(val, arr) {
  let len = arr.length;
  if (len === 0) {  // empty array: just append
    arr.push(val);
    return arr;
  }
  let mid,
      start = 0,
      end = len - 1;

  while (start <= end) {
    mid = Math.floor((end + start) / 2);
    if (val <= arr[mid]) {
      // guard mid === 0 so values smaller than every element still get inserted
      if (mid === 0 || val >= arr[mid - 1]) {
        arr.splice(mid, 0, val);
        break;
      }
      end = mid - 1;
    } else {
      // guard mid === len - 1 so values larger than every element still get inserted
      if (mid === len - 1 || val <= arr[mid + 1]) {
        arr.splice(mid + 1, 0, val);
        break;
      }
      start = mid + 1;
    }
  }
  return arr;
}

console.log(binaryInsert(16, [
  5, 6, 14, 19, 23, 35,
  44, 51, 63, 68, 71, 86,
  87, 117
]));
Don't re-sort after every item; it's overkill.
If there is only one item to insert, you can find the location to insert using binary search. Then use memcpy or similar to bulk copy the remaining items to make space for the inserted one. The binary search is O(log n), and the copy is O(n), giving O(n + log n) total. Using the methods above, you are doing a re-sort after every insertion, which is O(n log n).
Does it matter? Let's say you are randomly inserting k elements, where k = 1000, and the sorted list has 5000 items.
Binary search + move = k*(n + log n) = 1000*(5000 + 12) = 5,012,000 = ~5 million ops
Re-sort on each = k*(n log n) = ~60 million ops
If the k items to insert arrive whenever, then you must do search+move. However, if you are given the list of k items to insert into a sorted array ahead of time, you can do even better: sort the k items separately from the already-sorted array of n, then do a scan merge, in which you move down both sorted arrays simultaneously, merging one into the other (see the sketch below).
- One-step Merge sort = k log k + n = 9965 + 5000 = ~15,000 ops
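For illustration, a sketch of that one-step merge (sort the k new items, then walk both sorted sequences once; names are mine):

function mergeInsert(sorted, newItems) {
  const toInsert = newItems.slice().sort((a, b) => a - b); // k log k
  const result = [];
  let i = 0, j = 0;
  // single pass over both sorted sequences (n + k)
  while (i < sorted.length && j < toInsert.length) {
    result.push(sorted[i] <= toInsert[j] ? sorted[i++] : toInsert[j++]);
  }
  while (i < sorted.length) result.push(sorted[i++]);
  while (j < toInsert.length) result.push(toInsert[j++]);
  return result;
}

console.log(mergeInsert([1, 4, 9], [3, 2])); // [1, 2, 3, 4, 9]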
Update: regarding your question. First method = binary search + move = O(n + log n). Second method = re-sort = O(n log n). This exactly explains the timings you're getting.
TypeScript version with custom compare method:
const { compare } = new Intl.Collator(undefined, {
  numeric: true,
  sensitivity: "base"
});

const insert = (items: string[], item: string) => {
  let low = 0;
  let high = items.length;
  while (low < high) {
    const mid = (low + high) >> 1;
    compare(items[mid], item) > 0
      ? (high = mid)
      : (low = mid + 1);
  }
  items.splice(low, 0, item);
};
Use:
const items = [];
insert(items, "item 12");
insert(items, "item 1");
insert(items, "item 2");
insert(items, "item 22");
console.log(items);
// ["item 1", "item 2", "item 12", "item 22"]
Had your first code been bug-free, my best guess is that it would be how you'd do this job in JS. I mean:
Make a binary search to find the index of insertion
Use splice to perform your insertion.
This is almost always 2x faster than a top-down or bottom-up linear search-and-insert, as mentioned in domoarigato's answer, which I liked very much and took as the basis for my benchmark, and finally push and sort.
Of course, in many cases you are probably doing this job on some objects in real life, so here I have generated a benchmark test for these three cases with an array of size 100000 holding some objects. Feel free to play with it.
function insertElementToSorted(arr, ele, start = 0, end = null) {
  var n, mid;
  if (end == null) {
    end = arr.length - 1;
  }
  n = end - start;
  if (n % 2 == 0) {
    mid = start + n / 2;
  } else {
    mid = start + (n - 1) / 2;
  }

  if (start == end) {
    return start;
  }
  if (arr[0] > ele) return 0;
  if (arr[end] < ele) return end + 1; // insert after the last element
  if (arr[mid] >= ele && arr[mid - 1] <= ele) {
    return mid;
  }
  if (arr[mid] > ele && arr[mid - 1] > ele) {
    return insertElementToSorted(arr, ele, start, mid - 1);
  }
  if (arr[mid] <= ele && arr[mid + 1] >= ele) {
    return mid + 1;
  }
  if (arr[mid] < ele && arr[mid - 1] < ele) {
    return insertElementToSorted(arr, ele, mid, end);
  }
  if (arr[mid] < ele && arr[mid + 1] < ele) {
    return insertElementToSorted(arr, ele, mid + 1, end);
  }
}
// Example
var test = [1, 2, 5, 9, 10, 14, 17, 21, 35, 38, 54, 78, 89, 102];
insertElementToSorted(test, 6); // returns 3, the insertion index
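Since the function returns an insertion index rather than mutating the array, an actual insert would look like:

test.splice(insertElementToSorted(test, 6), 0, 6);
console.log(test); // [1, 2, 5, 6, 9, 10, 14, 17, 21, 35, 38, 54, 78, 89, 102]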
As a memo to my future self, here is yet another version, findOrAddSorted, with some optimizations for corner cases and a rudimentary test.
// returns BigInt(index) if the item has been found
// or BigInt(index) + BigInt(MAX_SAFE_INTEGER) if it has been inserted
function findOrAddSorted(items, newItem) {
  let from = 0;
  let to = items.length;
  let item;

  // check if the array is empty
  if (to === 0) {
    items.push(newItem);
    return BigInt(Number.MAX_SAFE_INTEGER);
  }

  // compare with the first item
  item = items[0];
  if (newItem === item) {
    return BigInt(0); // BigInt, for consistency with the other return values
  }
  if (newItem < item) {
    items.splice(0, 0, newItem);
    return BigInt(Number.MAX_SAFE_INTEGER);
  }

  // compare with the last item
  item = items[to - 1];
  if (newItem === item) {
    return BigInt(to - 1);
  }
  if (newItem > item) {
    items.push(newItem);
    return BigInt(to) + BigInt(Number.MAX_SAFE_INTEGER);
  }

  // binary search
  let where;
  for (;;) {
    where = (from + to) >> 1;
    if (from >= to) {
      break;
    }
    item = items[where];
    if (item === newItem) {
      return BigInt(where);
    }
    if (item < newItem) {
      from = where + 1;
    } else {
      to = where;
    }
  }

  // insert newItem
  items.splice(where, 0, newItem);
  return BigInt(where) + BigInt(Number.MAX_SAFE_INTEGER);
}
// generate a random integer < MAX_SAFE_INTEGER
const generateRandomInt = () => Math.floor(Math.random() * Number.MAX_SAFE_INTEGER);

// fill the array with random numbers
const items = new Array();
const amount = 1000;
let i = 0;
let where = 0;
for (i = 0; i < amount; i++) {
  where = findOrAddSorted(items, generateRandomInt());
  if (where < BigInt(Number.MAX_SAFE_INTEGER)) {
    break;
  }
}

if (where < BigInt(Number.MAX_SAFE_INTEGER)) {
  console.log(`items: ${i}, repeated at ${where}: ${items[Number(where)]}`);
} else {
  const at = Number(where - BigInt(Number.MAX_SAFE_INTEGER));
  console.log(`items: ${i}, last insert at: ${at}: ${items[at]}`);
}
console.log(items);
function insertOrdered(array, elem) {
  // linear scan for the first element >= elem, then splice in place
  let i = 0;
  while (i < array.length && array[i] < elem) { i++; }
  array.splice(i, 0, elem);
  return array;
}
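For example:

console.log(insertOrdered([1, 3, 5], 4)); // [1, 3, 4, 5]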
I'm trying to work on this problem where we compute the nested weight sum for a given array of numbers.
Given a nested list of integers, return the sum of all integers in the
list weighted by their depth.
For example, for [[1,1],2,[1,1]] the solution is 10: four 1's at depth 2 and one 2 at depth 1.
Here's the code I wrote:
var depthSum = function (nestedList, sum = 0, depth = 1) {
  for (let i = 0; i < nestedList.length; i++) {
    let val = nestedList[i];
    if (Array.isArray(val)) {
      // accumulate; an early return here would skip the remaining elements
      sum = depthSum(val, sum, depth + 1);
    } else {
      sum += val * depth;
    }
  }
  return sum;
};
I'm trying to work on the converse problem, i.e.:

Given a nested list of integers, return the sum of all integers in the list weighted by their depth. Where before the weight increased from root to leaf, the weight is now defined bottom-up: the leaf-level integers have weight 1, and the root-level integers have the largest weight.
Example:
[[1,1],2,[1,1]] ===> Solution is 8.
How can I use the same approach and solve this problem?
(https://leetcode.com/problems/nested-list-weight-sum-ii/description/)
This should do the job, but I wish I had a premium LeetCode account to verify it. The idea is to do a search to find the maximum depth in the structure, then use your previous algorithm but with the depth calculation inverted. Also, doing it without recursion means less chance of timing out and no chance of blowing the stack. I added a few basic test cases, but again, no guarantees.
const search = a => {
  let sum = 0;
  let depth = 0;
  const stack = [[a, 0]];

  // first pass: find the maximum depth
  while (stack.length) {
    const curr = stack.pop();
    if (curr[1] > depth) {
      depth = curr[1];
    }
    for (const e of curr[0]) {
      if (Array.isArray(e)) {
        stack.push([e, curr[1] + 1]);
      }
    }
  }

  // second pass: weights decrease with depth
  stack.push([a, ++depth]);
  while (stack.length) {
    const curr = stack.pop();
    for (const e of curr[0]) {
      if (Array.isArray(e)) {
        stack.push([e, curr[1] - 1]);
      } else {
        sum += e * curr[1];
      }
    }
  }
  return sum;
};
console.log(search([[1,1],2,[1,1]]));
console.log(search([]));
console.log(search([6]));
console.log(search([[[[3]]]]));
console.log(search([[2],1]));
A basic recursive solution alone, like your original depthSum, probably won't work for the second requirement, because you need to figure out the total depth before you know the multiplier for the items at the top level of the array. One option is to figure out the depth of the deepest array first, and then use something similar to your original depthSum.
You can use reduce (which is the appropriate method for reducing an array to a single value) and the conditional (ternary) operator to make your code concise and less repetitive:
const depthCheck = (item) => (
  Array.isArray(item)
    ? 1 + Math.max(...item.map(depthCheck))
    : 0
);
// verification:
console.log(depthCheck([[1,1],2,[1,1]])); // total depth 2
console.log(depthCheck([[1,1],2,[1,1,[2,2]]])) // total depth 3
console.log(depthCheck([[1,1,[2,[3,3]]],2,[1,1,[2,2]]])) // total depth 4
console.log('-----')
const depthSum = (nestedList, weight = depthCheck(nestedList)) => (
  nestedList.reduce((a, val) => a + (
    Array.isArray(val)
      ? depthSum(val, weight - 1)
      : val * weight
  ), 0)
);
console.log(depthSum([[1,1],2,[1,1]])) // (2)*2 + (1+1+1+1)*1
console.log(depthSum([[1,1],2,[1,1,[2,2]]])) // (2)*3 + (1+1+1+1)*2 + (2+2)*1
console.log(depthSum([[1,1,[2,[3,3]]],2,[1,1,[2,2]]])) // (2)*4 + (1+1+1+1)*3 + (2)*2 + (3+3)*1
You can do this without two traversals of the nested array if you store the per-depth sums of the elements in an array during the traversal. Afterwards, you know that the length of this array is the maximum depth, and you can multiply each sum by its correct weight.
The traversal can be done using recursion or a stack, as explained in the other answers. Here's an example using recursion:
function weightedSum(array) {
  var sums = [], total = 0;
  traverse(array, 0);
  for (var i in sums)
    total += sums[i] * (sums.length - i);
  return total;

  function traverse(array, depth) {
    if (sums[depth] === undefined)
      sums[depth] = 0;
    for (var i in array) {
      if (typeof array[i] === "number")
        sums[depth] += array[i];
      else traverse(array[i], depth + 1);
    }
  }
}
console.log(weightedSum([[],[]]));
console.log(weightedSum([[1,1],2,[1,1]]));
console.log(weightedSum([1,[[],2,2],1,[[3,3,[[5]]],[3]],[]]));
Maybe you can do it as follows, with a simple recursive reducer:
var weightOfNested = (a, d = 1) =>
  a.reduce((w, e) => Array.isArray(e) ? w + weightOfNested(e, d + 1)
                                      : w + d * e, 0);

console.log(weightOfNested([[1,1,[3]],2,[1,1]]));
So, OK, as mentioned in the comments, the above code weighs the deeper elements more. In order to weigh the shallower ones more, we need to know the depth of the array in advance. I believe that one way or another you end up traversing the array twice: once for the depth and once for calculating the weighted sum.
var weightOfNested = (a, d = getDepth(a)) =>
      a.reduce((w, e) => Array.isArray(e) ? w + weightOfNested(e, d - 1)
                                          : w + d * e, 0),
    getDepth = (a, d = 1, t = 1) =>
      a.reduce((r, e) => Array.isArray(e) ? r === t ? getDepth(e, ++r, t + 1)
                                                    : getDepth(e, r, t + 1)
                                          : r, d);

console.log(weightOfNested([[1,1,[3]],2,[1,1]])); // depth is 3
I entered a coding test where one of the questions was this: given an array A of integers of any length, and two numbers N and Z, say whether there are Z (distinct) numbers in A whose sum is N.
So for example (in the format A N Z):
for [1,2,3] 5 2 the answer is YES, because 2+3=5
for [1,2,3] 6 2 the answer is NO, because no two numbers in A add up to 6
My solution (below) first enumerates every (unordered) combination of Z numbers in A, then sums each combination, and then searches for N in the list of sums.
Although this solution works fine (passed all test cases, with no timeout), I was told the score was too low for me to continue in the test.
So the question is, what can be improved?
An obvious optimization would be to calculate the sum of each combination immediately and stop as soon as a match with N is found; but since I didn't run into time issues, I don't think this is the problem. What would be a better, more elegant/efficient solution?
function main(a, n, z) {
  var spacea = [], // array of unordered combinations of z integers from a
      space = [],  // array of unique sums of combinations from spacea
      res = 0;     // result (1 or 0)

  // produce combinations
  spacea = combo(a, z);

  // put unique sums in space
  spacea.forEach(function (arr) {
    var s = arr.reduce(function (a, b) {
      return a + b;
    });
    if (space.indexOf(s) < 0) space.push(s);
  });

  // is n in space?
  res = space.indexOf(n) === -1 ? "NO" : "YES";
  return res;
}
// produces combinations (outputs array of arrays)
function combo(a, z) {
  var i,
      r = [],
      head,
      right;

  if (z > a.length || z <= 0) {
    // do nothing, r is already set to []
  } else if (a.length === z) {
    r = [a];
  } else if (1 === z) {
    // r = array of single-element arrays of values from a
    a.forEach(function (e) {
      r.push([e]);
    });
  } else { // by virtue of the above tests, z > 1 and z < a.length
    for (i = 0; i < a.length - z + 1; i++) {
      head = a.slice(i, i + 1);
      right = combo(a.slice(i + 1), z - 1);
      right.forEach(function (e) {
        r.push(head.concat(e));
      });
    }
  }
  return r;
}
This is a variation of the subset sum problem, which can be solved more efficiently with dynamic programming.
The main difference here is that you have an extra restriction: the number of elements that must be used. This extra restriction can be handled by adding another variable (dimension): the number of elements already used.
The recursive formulas (which you will build the DP solution from) should be:
D(0, 0, 0) = true
D(i, k, x) = false   if i < 0 or k < 0
D(i, k, x) = D(i-1, k, x) OR D(i-1, k-1, x - arr[i])
In the above, D(i,k,x) is true if and only if there is a selection of exactly k numbers from the first i elements that sums to x.
The complexity of this solution is O(n*N*Z), where n is the number of elements in the array, N is the target sum, and Z is the number of elements that must be used.
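For illustration, a minimal bottom-up sketch of that DP in JavaScript (the function name is mine; it assumes non-negative integers, since negative values would require an index offset):

// dp[k][x] === true means: some selection of exactly k of the elements
// seen so far sums to x. (Assumes non-negative integers.)
function hasSubsetOfSize(a, n, z) {
  const dp = Array.from({ length: z + 1 }, () => new Array(n + 1).fill(false));
  dp[0][0] = true; // the empty selection sums to 0

  for (const value of a) {
    // iterate downwards so each element is used at most once
    for (let k = z; k >= 1; k--) {
      for (let x = n; x >= value; x--) {
        if (dp[k - 1][x - value]) dp[k][x] = true;
      }
    }
  }
  return dp[z][n] ? "YES" : "NO";
}

console.log(hasSubsetOfSize([1, 2, 3], 5, 2)); // YES (2 + 3)
console.log(hasSubsetOfSize([1, 2, 3], 6, 2)); // NO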