Here's a simple JavaScript performance test:
const iterations = new Array(10 ** 7);
var x = 0;
var i = iterations.length + 1;
console.time('negative');
while (--i) {
x += iterations[-i];
}
console.timeEnd('negative');
var y = 0;
var j = iterations.length;
console.time('positive');
while (j--) {
y += iterations[j];
}
console.timeEnd('positive');
The first loop counts from 10,000,000 down to 1 and accesses an array with a length of 10 million using a negative index on each iteration. So it goes through the array from beginning to end.
The second loop counts from 9,999,999 down to 0 and accesses the same array using a positive index on each iteration. So it goes through the array in reverse.
On my PC, the first loop takes longer than 6 seconds to complete, but the second one only takes ~400ms.
Why is the second loop faster than the first?
Because iterations[-1] evaluates to undefined. That lookup is slow: it has to walk up the whole prototype chain and can't take a fast path. Adding undefined to x also turns x into NaN, and math with NaN is slow because it is the uncommon case.
Also, initializing iterations with numbers would make the whole test more useful.
Pro tip: if you compare the performance of two pieces of code, both should perform the same operation in the end ...
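A fairer version of the test might look like the sketch below. The array size (10 ** 6) and the helper names sumAscending and sumDescending are my own choices, not part of the question. The point is only that the array holds real numbers and both loops read valid indices, so they do the same work.

```javascript
// Hypothetical helper names. The array is filled with numbers so that both
// loops read real elements and perform ordinary number addition.
const data = Array.from({ length: 10 ** 6 }, (_, k) => k);

function sumAscending(arr) {
  let total = 0;
  for (let k = 0; k < arr.length; k++) total += arr[k];
  return total;
}

function sumDescending(arr) {
  let total = 0;
  for (let k = arr.length - 1; k >= 0; k--) total += arr[k];
  return total;
}

console.time('ascending');
const up = sumAscending(data);
console.timeEnd('ascending');

console.time('descending');
const down = sumDescending(data);
console.timeEnd('descending');

console.log(up === down); // true when both loops did the same work
```

With this setup, any remaining timing difference comes from the iteration direction itself rather than from failed property lookups.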
Some general words about performance tests:
Performance is the compiler's job these days; code optimized by the compiler will generally be faster than code you try to optimize through some "tricks". Therefore you should write code that the compiler is likely to optimize, and that is, in every case, the code that everyone else writes (your coworkers will also love you if you do so). Optimizing that code is the most beneficial from the engine's point of view. Therefore I'd write:
let acc = 0;
for(const value of array) acc += value;
// or
const acc = array.reduce((a, b) => a + b, 0);
However, in the end it's just a loop: you won't waste much time if the loop performs badly, but you will if the whole algorithm performs badly (time complexity of O(n²) or more). Focus on the important things, not the loops.
To elaborate on Jonas Wilms' answer, JavaScript does not support negative indices (unlike languages like Python).
iterations[-1] is equivalent to iterations["-1"], which looks up the property named "-1" on the array object. No such property exists, which is why iterations[-1] evaluates to undefined.
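A small demonstration of that behaviour (the array contents are made up for illustration):

```javascript
const arr = [10, 20, 30];

console.log(arr[-1]);              // undefined: no property named "-1" exists
arr[-1] = 99;                      // creates an ordinary property, not an element
console.log(arr["-1"]);            // 99
console.log(arr.length);           // 3, the length is unchanged
console.log(arr[arr.length - 1]);  // 30, the usual way to read the last element
```

Writing to a negative "index" therefore never touches the array's elements; it just adds a string-keyed property to the object.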
This is a follow-up question to: Finding max value of a weighted subset sum of a power set
Whereas the previous question solves (to optimality) problems of size <= 15 in reasonable time, I would like to solve problems of size ~2000 to near-optimality.
As a small example problem, I am given a certain range of nodes:
var range = [0,1,2,3,4];
A function creates a power set for all the nodes in the range and assigns each combination a numeric score. Negative scores are removed, resulting in the following array S. S[n][0] is the bitwise OR of all included nodes, and S[n][1] is the score:
var S = [
[1,0], //0
[2,0], //1
[4,0], //2
[8,0], //3
[16,0], //4
[3,50], //0-1
[5,100], //0-2
[6,75], //1-2
[20,130], //2-4
[7,179] //0-1-2 e.g. combining 0-1-2 has a key of 7 (bitwise `1|2|4`) and a score of 179.
];
The optimal solution, maximizing the score, would be:
var solution = [[8,3,20],180]
Where solution[0] is an array of combinations from S, and solution[1] is the resulting score. Note that the chosen keys are pairwise disjoint (8 & 3, 8 & 20, and 3 & 20 are all 0), signifying that each node is used only once.
Problem specifics: Each node must be used exactly once and the score for the single-node combinations will always be 0, as shown in the S array above.
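To make the "exactly once" constraint concrete, here is a minimal sketch of a validity check. The function name isExactCover is hypothetical, and because it relies on 32-bit bitwise operators it only works for up to 30 nodes; a problem with ~2000 nodes would need BigInt or an array-based bitset instead.

```javascript
// Hypothetical helper: checks that the chosen keys are pairwise disjoint
// and together cover every node in the range.
function isExactCover(keys, nodeCount) {
  let used = 0;
  for (const key of keys) {
    if ((used & key) !== 0) return false; // some node would be used twice
    used |= key;
  }
  return used === (1 << nodeCount) - 1;   // every node is covered
}

console.log(isExactCover([8, 3, 20], 5)); // true
console.log(isExactCover([7, 20], 5));    // false: node 2 appears in both
console.log(isExactCover([3, 8], 5));     // false: nodes 2 and 4 are missing
```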
My current solution (seen here) uses dynamic programming and works for small problems. I have seen heuristics involving dynamic programming, such as https://youtu.be/ze1Xa28h_Ns, but can't figure out how I'd apply that to a multi-dimensional problem. Given the problem constraints, what would be a reasonable heuristic to apply?
EDIT:
Things I've tried
Greedy approach (sort score greatest to least, pick the next viable candidate)
Same as above, but sort by score/cardinality(combo)
GRASP (edit each score by up to 10%, then sort, repeat until a better solution hasn't been found in x seconds)
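For reference, the first two attempts can be sketched as one function with a pluggable ranking. The names greedy and cardinality are mine, and the 32-bit bitwise operators limit this sketch to small ranges. It relies on the single-node combinations always being present, so every node can still be covered at the end.

```javascript
const S = [
  [1, 0], [2, 0], [4, 0], [8, 0], [16, 0],
  [3, 50], [5, 100], [6, 75], [20, 130], [7, 179]
];

// Number of set bits in a key, i.e. how many nodes the combination covers.
function cardinality(key) {
  let count = 0;
  for (; key !== 0; key &= key - 1) count++;
  return count;
}

// Greedy selection: rank candidates by rankFn (higher is better), then keep
// each candidate that does not overlap anything already chosen.
function greedy(candidates, rankFn) {
  const ranked = candidates.slice().sort((a, b) => rankFn(b) - rankFn(a));
  const chosen = [];
  let used = 0;
  let score = 0;
  for (const [key, value] of ranked) {
    if ((used & key) === 0) {
      chosen.push(key);
      used |= key;
      score += value;
    }
  }
  return [chosen, score];
}

const byScore = greedy(S, c => c[1]);
const byDensity = greedy(S, c => c[1] / cardinality(c[0]));
console.log(JSON.stringify(byScore), JSON.stringify(byDensity));
```

On this tiny example, ranking by raw score yields 179 while ranking by score per node happens to reach 180; neither ranking gives any guarantee on larger instances.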
This problem is really an integer optimization problem, with binary variables x_i indicating if the i^th element of S is selected and constraints indicating that each bit is used exactly once. The objective is to maximize the score attained across the selected elements. If we defined S_i to be the i^th element of S, L_b to be the indices of elements in S with bit b set, w_i to be the score associated with element i, and assumed there were n elements in set S and k bits, we could write this in mathematical notation as:
max_{x} \sum_{i=1..n} w_i*x_i
s.t. \sum_{i \in L_b} x_i = 1 \forall b = 1..k
x_i \in {0, 1} \forall i = 1..n
In many cases, linear programming solvers are much (much, much) more effective than exhaustive search at solving these sorts of problems. Unfortunately I am not aware of any javascript linear programming libraries (a Google query turned up SimplexJS and glpk.js and node-lp_solve -- I have no experience with any of these and couldn't immediately get any to work). As a result I will do the implementation in R using the lpSolve package.
w <- c(0, 0, 0, 0, 0, 50, 100, 75, 130, 179)
elt <- c(1, 2, 4, 8, 16, 3, 5, 6, 20, 7)
k <- 5
library(lpSolve)
mod <- lp(direction = "max",
objective.in = w,
const.mat = t(sapply(1:k, function(b) 1*(bitwAnd(elt, 2^(b-1)) > 0))),
const.dir = rep("=", k),
const.rhs = rep(1, k),
all.bin = TRUE)
elt[mod$solution > 0.999]
# [1] 8 3 20
mod$objval
# [1] 180
As you'll note, this is an exact formulation of your problem. However, by setting a timeout (you'd actually need to use the lpSolveAPI package in R instead of the lpSolve package to do this), you can get the best solution found by the solver before reaching your specified timeout. This may not be an optimal solution, and you can control how long before the heuristic stops trying to find better solutions. If the solver terminates before the timeout, the solution is guaranteed to be optimal.
A reasonable heuristic (the first that comes to my mind) would be one that iteratively took the feasible element with the largest score, eliminating all elements that have overlapping bits with the selected element.
I would implement this by first sorting in decreasing order by score and then iteratively adding the first element and filtering the list, removing any element that overlaps the selected element.
In JavaScript:
function comp(a, b) {
if (a[1] < b[1]) return 1;
if (a[1] > b[1]) return -1;
return 0;
}
S.sort(comp); // Sort descending by score
var output = []
var score = 0;
while (S.length > 0) {
output.push(S[0][0]);
score += S[0][1];
var newS = [];
for (var i=0; i < S.length; i++) {
if ((S[i][0] & S[0][0]) == 0) {
newS.push(S[i]);
}
}
S = newS;
}
alert(JSON.stringify([output, score]));
This selects elements 7, 8, and 16, with score 179 (as opposed to the optimal score of 180).
This code takes 3 seconds on Chrome and 6s on Firefox.
If I write the code in Java and run it under Java 7.0 it takes only 10ms.
Chrome's JS engine is usually very fast. Why is it so slow here?
By the way, this code is just for testing. I know it's not a very practical way to write a Fibonacci function.
fib = function(n) {
if (n < 2) {
return n;
} else {
return fib(n - 1) + fib(n - 2);
}
};
console.log(fib(32));
This isn't the fault of JavaScript, but of your algorithm. You're recomputing the same subproblems over and over again, and it gets worse as n grows. This is the call graph for a single call:
                     F(32)
                 /           \
         F(31)                   F(30)
       /       \               /       \
   F(30)       F(29)       F(29)       F(28)
  /     \     /     \     /     \     /     \
F(29) F(28) F(28) F(27) F(28) F(27) F(27) F(26)
... deeper and deeper
As you can see from this tree, you're computing some Fibonacci numbers several times; for example, F(28) already appears 4 times in the part shown here. From the "Algorithm Design Manual" book:
How much time does this algorithm take to compute F(n)? Since F(n+1)/F(n) ≈ φ = (1 + sqrt(5))/2 ≈ 1.61803, this means that F(n) > 1.6^n. Since our recursion tree has only 0 and 1 as leaves, summing up to such a large number means we must have at least 1.6^n leaves or procedure calls! This humble little program takes exponential time to run!
You have to use memoization or build the solution bottom-up (i.e. small subproblems first).
This solution uses memoization (thus, we're computing each Fibonacci number only once):
var cache = {};
function fib(n) {
if (cache[n] === undefined) { // !cache[n] would never treat the cached fib(0) = 0 as a hit
cache[n] = (n < 2) ? n : fib(n - 1) + fib(n - 2);
}
return cache[n];
}
This one solves it bottom up:
function fib(n) {
if (n < 2) return n;
var a = 0, b = 1;
while (--n) {
var t = a + b;
a = b;
b = t;
}
return b;
}
As is fairly well known, the Fibonacci function you gave in your question requires a lot of steps when implemented naively. In particular, fib(32) takes 7,049,155 calls.
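That call count is easy to check by instrumenting the naive function. The calls counter below is an addition of mine for illustration:

```javascript
let calls = 0;

function fib(n) {
  calls++; // count every invocation, including the leaf calls
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

const result = fib(32);
console.log(result, calls); // 2178309 7049155
```

The count equals 2 * F(33) - 1, so it grows at the same exponential rate as the Fibonacci numbers themselves.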
However, these kinds of algorithms can be greatly sped up with a technique known as memoization. If you see the function call fib(32) taking several seconds, the function is being implemented naively. If it returns instantly, there is a high probability that the implementation is using memoization.
Based on the evidence already provided the conclusion I draw is:
When the code is not run from the console (like in the jsFiddle where my machine, a Sandy Bridge Macbook Air, computes it in 55ms) the JS engine is able to JIT and possibly automatically memoize the algorithm.
When run from the js console none of this occurs. On my machine it was only under 10x slower: 460ms.
I then edited the code to look for F(38) which bumped the times up to 967ms and 9414ms so it has maintained a similar speedup factor. This indicates that no memoization is being performed and the speedup is probably due to JITting.
Just a comment...
Function calls are relatively expensive, and recursion is usually slower than an equivalent efficient loop. E.g., the following is thousands of times faster than the recursive alternative in IE:
function fib2(n) {
var arr = [0, 1];
var len = 2;
while (len <= n) {
arr[len] = arr[len-1] + arr[len-2];
++len;
}
return arr[n];
}
And as noted in other answers, it seems the OP algorithm is also inherently slow, but I guess that isn't really the issue.
In addition to the memoization approach recommended by @galymzhan, you could also use another algorithm altogether. Traditionally, the formula for the nth Fibonacci number is F(n) = F(n-1) + F(n-2). Computed iteratively, this takes time directly proportional to n.
Dijkstra came up with an algorithm to derive Fibonacci numbers in less than half the steps required by the conventional formula. This was outlined in his note EWD 654. With F(0) = 0 and F(1) = 1, it goes:
For even indices, F(2n) = (2F(n+1) - F(n)) * F(n), or equivalently F(2n) = (2F(n-1) + F(n)) * F(n)
For odd indices, F(2n+1) = F(n)^2 + F(n+1)^2, or equivalently F(2n-1) = F(n-1)^2 + F(n)^2
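A minimal sketch of this doubling idea, assuming the standard indexing F(0) = 0, F(1) = 1 and the identities F(2k) = F(k) * (2F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2. The names fibPair and fastFib are my own, and this is not Dijkstra's original formulation. Because it uses plain numbers, results are only exact while they fit in a double (roughly up to F(78)).

```javascript
// Returns the pair [F(n), F(n + 1)] using O(log n) recursive steps.
function fibPair(n) {
  if (n === 0) return [0, 1];
  const [a, b] = fibPair(Math.floor(n / 2)); // a = F(k), b = F(k + 1)
  const even = a * (2 * b - a);              // F(2k)
  const odd = a * a + b * b;                 // F(2k + 1)
  return n % 2 === 0 ? [even, odd] : [odd, even + odd];
}

function fastFib(n) {
  return fibPair(n)[0];
}

console.log(fastFib(32)); // 2178309
```

Each call halves n, so fastFib(32) needs only a handful of recursive steps instead of millions of calls.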
I was helping somebody out with his JavaScript code and my eyes were caught by a section that looked like this:
function randOrd(){
return (Math.round(Math.random())-0.5);
}
coords.sort(randOrd);
alert(coords);
My first thought was: hey, this can't possibly work! But then I did some experimenting and found that it does at least seem to provide nicely randomized results.
Then I did some web searching and, almost at the top, found an article from which this code was most certainly copied. It looked like a pretty respectable site and author...
But my gut feeling tells me that this must be wrong, especially as the sorting algorithm is not specified by the ECMA standard. I think different sorting algorithms will result in different non-uniform shuffles. Some sorting algorithms may even loop infinitely...
But what do you think?
And as another question... how would I now go and measure how random the results of this shuffling technique are?
update: I did some measurements and posted the results below as one of the answers.
After Jon has already covered the theory, here's an implementation:
function shuffle(array) {
var tmp, current, top = array.length;
if(top) while(--top) {
current = Math.floor(Math.random() * (top + 1));
tmp = array[current];
array[current] = array[top];
array[top] = tmp;
}
return array;
}
The algorithm is O(n), whereas sorting should be O(n log n). Depending on the overhead of executing JS code compared to the native sort() function, this might lead to a noticeable difference in performance, which should increase with array size.
In the comments to bobobobo's answer, I stated that the algorithm in question might not produce evenly distributed probabilities (depending on the implementation of sort()).
My argument goes along these lines: A sorting algorithm requires a certain number c of comparisons, e.g. c = n(n-1)/2 for Bubblesort. Our random comparison function makes the outcome of each comparison equally likely, i.e. there are 2^c equally probable results. Now, each result has to correspond to one of the n! permutations of the array's entries, which makes an even distribution impossible in the general case. (This is a simplification, as the actual number of comparisons needed depends on the input array, but the assertion should still hold.)
As Jon pointed out, this alone is no reason to prefer Fisher-Yates over using sort(), as the random number generator will also map a finite number of pseudo-random values to the n! permutations. But the results of Fisher-Yates should still be better:
Math.random() produces a pseudo-random number in the range [0;1[. As JS uses double-precision floating point values, this corresponds to 2^x possible values where 52 ≤ x ≤ 63 (I'm too lazy to find the actual number). A probability distribution generated using Math.random() will stop behaving well if the number of atomic events is of the same order of magnitude.
When using Fisher-Yates, the relevant parameter is the size of the array, which should never approach 2^52 due to practical limitations.
When sorting with a random comparison function, the function basically only cares whether the return value is positive or negative, so this will never be a problem. But there is a similar one: because the comparison function is well-behaved, the 2^c possible results are, as stated, equally probable. If c ~ n log n then 2^c ~ n^(a·n) where a = const, which makes it at least possible that 2^c is of the same magnitude as (or even less than) n!, leading to an uneven distribution even if the sorting algorithm were to map onto the permutations evenly. Whether this has any practical impact is beyond me.
The real problem is that the sorting algorithms are not guaranteed to map onto the permutations evenly. It's easy to see that Mergesort does, as it's symmetric, but reasoning about something like Bubblesort or, more importantly, Quicksort or Heapsort is not easy.
The bottom line: As long as sort() uses Mergesort, you should be reasonably safe except in corner cases (at least I'm hoping that 2^c ≤ n! is a corner case), if not, all bets are off.
It's never been my favourite way of shuffling, partly because it is implementation-specific as you say. In particular, I seem to remember that the standard library sorting from either Java or .NET (not sure which) can often detect if you end up with an inconsistent comparison between some elements (e.g. you first claim A < B and B < C, but then C < A).
It also ends up as a more complex (in terms of execution time) shuffle than you really need.
I prefer the shuffle algorithm which effectively partitions the collection into "shuffled" (at the start of the collection, initially empty) and "unshuffled" (the rest of the collection). At each step of the algorithm, pick a random unshuffled element (which could be the first one) and swap it with the first unshuffled element - then treat it as shuffled (i.e. mentally move the partition to include it).
This is O(n) and only requires n-1 calls to the random number generator, which is nice. It also produces a genuine shuffle - any element has a 1/n chance of ending up in each space, regardless of its original position (assuming a reasonable RNG). The sorted version approximates to an even distribution (assuming that the random number generator doesn't pick the same value twice, which is highly unlikely if it's returning random doubles) but I find it easier to reason about the shuffle version :)
This approach is called a Fisher-Yates shuffle.
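The partition description above can be sketched in JavaScript roughly as follows. The function name fisherYates and the injectable random parameter are assumptions of mine (the parameter exists only so the function can be tested deterministically); it defaults to Math.random.

```javascript
// Shuffles in place. Everything before index `start` is already shuffled.
// `random` must return numbers in [0, 1); it defaults to Math.random.
function fisherYates(items, random = Math.random) {
  for (let start = 0; start < items.length - 1; start++) {
    // Pick any unshuffled element, possibly the first unshuffled one itself.
    const pick = start + Math.floor(random() * (items.length - start));
    [items[start], items[pick]] = [items[pick], items[start]];
  }
  return items;
}

console.log(fisherYates(['a', 'b', 'c', 'd', 'e']));
```

The loop stops one element early because the last remaining element has nowhere else to go, which gives the n-1 calls to the random number generator mentioned above.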
I would regard it as a best practice to code up this shuffle once and reuse it everywhere you need to shuffle items. Then you don't need to worry about sort implementations in terms of reliability or complexity. It's only a few lines of code (which I won't attempt in JavaScript!)
The Wikipedia article on shuffling (and in particular the shuffle algorithms section) talks about sorting a random projection - it's worth reading the section on poor implementations of shuffling in general, so you know what to avoid.
I did some measurements of how random the results of this random sort are...
My technique was to take a small array [1,2,3,4] and create all (4! = 24) permutations of it. Then I would apply the shuffling function to the array a large number of times and count how many times each permutation is generated. A good shuffling algorithm would distribute the results quite evenly over all the permutations, while a bad one would not create that uniform result.
Using the code below I tested in Firefox, Opera, Chrome, IE6/7/8.
Surprisingly for me, the random sort and the real shuffle both created equally uniform distributions. So it seems that (as many have suggested) the main browsers are using merge sort. This of course doesn't mean that there can't be a browser out there that does it differently, but I would say it means that this random-sort method is reliable enough to use in practice.
EDIT: This test didn't really measure the randomness (or lack thereof) correctly. See the other answer I posted.
But on the performance side, the shuffle function given by Christoph was a clear winner. Even for small four-element arrays, the real shuffle performed about twice as fast as the random sort!
// The shuffle function posted by Christoph.
var shuffle = function(array) {
var tmp, current, top = array.length;
if(top) while(--top) {
current = Math.floor(Math.random() * (top + 1));
tmp = array[current];
array[current] = array[top];
array[top] = tmp;
}
return array;
};
// the random sort function
var rnd = function() {
return Math.round(Math.random())-0.5;
};
var randSort = function(A) {
return A.sort(rnd);
};
var permutations = function(A) {
if (A.length == 1) {
return [A];
}
else {
var perms = [];
for (var i=0; i<A.length; i++) {
var x = A.slice(i, i+1);
var xs = A.slice(0, i).concat(A.slice(i+1));
var subperms = permutations(xs);
for (var j=0; j<subperms.length; j++) {
perms.push(x.concat(subperms[j]));
}
}
return perms;
}
};
var test = function(A, iterations, func) {
// init permutations
var stats = {};
var perms = permutations(A);
for (var i in perms){
stats[""+perms[i]] = 0;
}
// shuffle many times and gather stats
var start=new Date();
for (var i=0; i<iterations; i++) {
var shuffled = func(A);
stats[""+shuffled]++;
}
var end=new Date();
// format result
var arr=[];
for (var i in stats) {
arr.push(i+" "+stats[i]);
}
return arr.join("\n")+"\n\nTime taken: " + ((end - start)/1000) + " seconds.";
};
alert("random sort: " + test([1,2,3,4], 100000, randSort));
alert("shuffle: " + test([1,2,3,4], 100000, shuffle));
Interestingly, Microsoft used the same technique in their pick-random-browser-page.
They used a slightly different comparison function:
function RandomSort(a,b) {
return (0.5 - Math.random());
}
Looks almost the same to me, but it turned out to be not so random...
So I made some test runs again with the same methodology used in the linked article, and indeed, it turned out that the random-sorting method produced flawed results. New test code here:
function shuffle(arr) {
arr.sort(function(a,b) {
return (0.5 - Math.random());
});
}
function shuffle2(arr) {
arr.sort(function(a,b) {
return (Math.round(Math.random())-0.5);
});
}
function shuffle3(array) {
var tmp, current, top = array.length;
if(top) while(--top) {
current = Math.floor(Math.random() * (top + 1));
tmp = array[current];
array[current] = array[top];
array[top] = tmp;
}
return array;
}
var counts = [
[0,0,0,0,0],
[0,0,0,0,0],
[0,0,0,0,0],
[0,0,0,0,0],
[0,0,0,0,0]
];
var arr;
for (var i=0; i<100000; i++) {
arr = [0,1,2,3,4];
shuffle3(arr);
arr.forEach(function(x, i){ counts[x][i]++;});
}
alert(counts.map(function(a){return a.join(", ");}).join("\n"));
I have placed a simple test page on my website showing the bias of your current browser versus other popular browsers using different methods to shuffle. It shows the terrible bias of just using Math.random()-0.5, another 'random' shuffle that isn't biased, and the Fisher-Yates method mentioned above.
You can see that on some browsers there is as high as a 50% chance that certain elements will not change place at all during the 'shuffle'!
Note: you can make the implementation of the Fisher-Yates shuffle by @Christoph slightly faster for Safari by changing the code to:
function shuffle(array) {
for (var tmp, cur, top=array.length; top--;){
cur = (Math.random() * (top + 1)) << 0;
tmp = array[cur]; array[cur] = array[top]; array[top] = tmp;
}
return array;
}
Test results: http://jsperf.com/optimized-fisher-yates
I think it's fine for cases where you're not picky about distribution and you want the source code to be small.
In JavaScript (where the source is transmitted constantly), small makes a difference in bandwidth costs.
It's been four years, but I'd like to point out that the random comparator method won't be correctly distributed, no matter what sorting algorithm you use.
Proof:
For an array of n elements, there are exactly n! permutations (i.e. possible shuffles).
Every comparison during a shuffle is a choice between two sets of permutations. For a random comparator, there is a 1/2 chance of choosing each set.
Thus, for each permutation p, the chance of ending up with permutation p is a fraction with denominator 2^k (for some k), because it is a sum of such fractions (e.g. 1/8 + 1/16 = 3/16).
For n = 3, there are six equally-likely permutations. The chance of each permutation, then, is 1/6. 1/6 can't be expressed as a fraction with a power of 2 as its denominator.
Therefore, the coin flip sort will never result in a fair distribution of shuffles.
The only sizes that could possibly be correctly distributed are n=0,1,2.
As an exercise, try drawing out the decision tree of different sort algorithms for n=3.
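The exercise can also be done mechanically. The sketch below enumerates the whole decision tree for one particular algorithm, a plain insertion sort that I picked only because its tree is small (it is not what any given browser necessarily uses), and adds up the probability of every branch. The function names are hypothetical.

```javascript
// A plain insertion sort; picked here only because its decision tree is small.
function insertionSort(items, compare) {
  const a = items.slice();
  for (let i = 1; i < a.length; i++) {
    for (let j = i; j > 0 && compare(a[j - 1], a[j]) > 0; j--) {
      [a[j - 1], a[j]] = [a[j], a[j - 1]];
    }
  }
  return a;
}

// Walks every branch of the decision tree. Each finished branch that used
// k coin flips contributes probability 2^-k to the permutation it produced.
function outcomeProbabilities(items) {
  const probabilities = {};
  const pending = [[]];               // coin-flip prefixes still to explore
  while (pending.length > 0) {
    const flips = pending.pop();
    let used = 0;
    let ranOut = false;
    const scripted = () => {
      if (used < flips.length) return flips[used++];
      ranOut = true;                  // the sort needs more flips than given
      return -1;                      // dummy answer; this run is discarded
    };
    const result = insertionSort(items, scripted);
    if (ranOut) {
      pending.push(flips.concat(-1), flips.concat(1));
    } else {
      const key = result.join('');
      probabilities[key] = (probabilities[key] || 0) + Math.pow(2, -flips.length);
    }
  }
  return probabilities;
}

console.log(outcomeProbabilities([1, 2, 3]));
```

For this insertion sort, two of the six permutations come out with probability 1/4 and the other four with probability 1/8. Every value is a multiple of a power of 1/2, so none of them can equal 1/6.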
There is a gap in the proof: If a sort algorithm depends on the consistency of the comparator, and has unbounded runtime with an inconsistent comparator, it can have an infinite sum of probabilities, which is allowed to add up to 1/6 even if every denominator in the sum is a power of 2. Try to find one.
Also, if a comparator has a fixed chance of giving either answer (e.g. (Math.random() < P)*2 - 1, for constant P), the above proof holds. If the comparator instead changes its odds based on previous answers, it may be possible to generate fair results. Finding such a comparator for a given sorting algorithm could be a research paper.
It is a hack, certainly. In practice, an infinitely looping algorithm is not likely.
If you're sorting objects, you could loop through the coords array and do something like:
for (var i = 0; i < coords.length; i++)
coords[i].sortValue = Math.random();
coords.sort(useSortValue)
function useSortValue(a, b)
{
return a.sortValue - b.sortValue;
}
(and then loop through them again to remove the sortValue)
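A variant of the same idea that avoids writing sortValue onto the objects (and so needs no cleanup pass) is to wrap each element together with its key. The name shuffleByKey and the injectable random parameter are assumptions of this sketch, not part of the original suggestion:

```javascript
// Pair each element with its own random key, sort by the key, then unwrap.
// `random` must return numbers in [0, 1); it defaults to Math.random.
function shuffleByKey(items, random = Math.random) {
  return items
    .map(value => ({ value, key: random() }))
    .sort((a, b) => a.key - b.key)
    .map(entry => entry.value);
}

console.log(shuffleByKey(['a', 'b', 'c', 'd']));
```

Because each key is fixed before sorting starts, the comparator is consistent, unlike a comparator that flips a coin on every call. The input array is left untouched and a new array is returned.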
Still a hack though. If you want to do it nicely, you have to do it the hard way :)
If you're using D3 there is a built-in shuffle function (using Fisher-Yates):
var days = ['Lundi','Mardi','Mercredi','Jeudi','Vendredi','Samedi','Dimanche'];
d3.shuffle(days);
And here is Mike going into details about it:
http://bost.ocks.org/mike/shuffle/
No, it is not correct. As other answers have noted, it will lead to a non-uniform shuffle and the quality of the shuffle will also depend on which sorting algorithm the browser uses.
Now, that might not sound too bad to you, because even if theoretically the distribution is not uniform, in practice it's probably nearly uniform, right? Well, no, not even close. The following charts show heat-maps of which indices each element gets shuffled to, in Chrome and Firefox respectively: if the pixel (i, j) is green, it means the element at index i gets shuffled to index j too often, and if it's red then it gets shuffled there too rarely.
These screenshots are taken from Mike Bostock's page on this subject.
As you can see, shuffling using a random comparator is severely biased in Chrome and even more so in Firefox. In particular, both have a lot of green along the diagonal, meaning that too many elements get "shuffled" somewhere very close to where they were in the original sequence. In comparison, a similar chart for an unbiased shuffle (e.g. using the Fisher-Yates algorithm) would be all pale yellow with just a small amount of random noise.
Here's an approach that uses a single array:
The basic logic is:
Starting with an array of n elements
Remove a random element from the array and push it onto the array
Remove a random element from the first n - 1 elements of the array and push it onto the array
Remove a random element from the first n - 2 elements of the array and push it onto the array
...
Remove the first element of the array and push it onto the array
Code:
for(i=a.length;i--;) a.push(a.splice(Math.floor(Math.random() * (i + 1)),1)[0]);
Can you use the Array.sort() function to shuffle an array – Yes.
Are the results random enough – No.
Consider the following code snippet:
/*
* The following code sample shuffles an array using Math.random() trick
* After shuffling, the new position of each item is recorded
* The process is repeated 100 times
* The result is printed out, listing each item and the number of times
* it appeared on a given position after shuffling
*/
var array = ["a", "b", "c", "d", "e"];
var stats = {};
array.forEach(function(v) {
stats[v] = Array(array.length).fill(0);
});
var i, clone;
for (i = 0; i < 100; i++) {
clone = array.slice();
clone.sort(function() {
return Math.random() - 0.5;
});
clone.forEach(function(v, i) {
stats[v][i]++;
});
}
Object.keys(stats).forEach(function(v, i) {
console.log(v + ": [" + stats[v].join(", ") + "]");
});
Sample output:
a: [29, 38, 20, 6, 7]
b: [29, 33, 22, 11, 5]
c: [17, 14, 32, 17, 20]
d: [16, 9, 17, 35, 23]
e: [ 9, 6, 9, 31, 45]
Ideally, the counts should be evenly distributed (for the above example, all counts should be around 20). But they are not. Apparently, the distribution depends on what sorting algorithm is implemented by the browser and how it iterates the array items for sorting.
There is nothing wrong with it.
The function you pass to .sort() usually looks something like
function sortingFunc( first, second )
{
// example:
return first - second ;
}
Your job in sortingFunc is to return:
a negative number if first goes before second
a positive number if first should go after second
and 0 if they are completely equal
The above sorting function puts things in order.
If you return negative and positive values at random, as your code does, you get a random ordering.
Like in MySQL:
SELECT * from table ORDER BY rand()