Why are negative array indices much slower than positive array indices? - javascript

Here's a simple JavaScript performance test:
const iterations = new Array(10 ** 7);
var x = 0;
var i = iterations.length + 1;
console.time('negative');
while (--i) {
x += iterations[-i];
}
console.timeEnd('negative');
var y = 0;
var j = iterations.length;
console.time('positive');
while (j--) {
y += iterations[j];
}
console.timeEnd('positive');
The first loop counts from 10,000,000 down to 1 and accesses an array with a length of 10 million using a negative index on each iteration. So it goes through the array from beginning to end.
The second loop counts from 9,999,999 down to 0 and accesses the same array using a positive index on each iteration. So it goes through the array in reverse.
On my PC, the first loop takes longer than 6 seconds to complete, but the second one only takes ~400ms.
Why is the second loop faster than the first?

Because iterations[-1] will evaluate to undefined (which is slow as it has to go up the whole prototype chain and can't take a fast path) also doing math with NaN will always be slow as it is the non common case.
Also initializing iterations with numbers will make the whole test more useful.
Pro Tip: If you try to compare the performance of two codes, they should both result in the same operation at the end ...
Some general words about performance tests:
Performance is the compiler's job these days, code optimized by the compiler will always be faster than code you are trying to optimize through some "tricks". Therefore you should write code that is likely by the compiler to optimize, and that is in every case, the code that everyone else writes (also your coworkers will love you if you do so). Optimizing that is the most benefitial from the engine's view. Therefore I'd write:
let acc = 0;
for(const value of array) acc += value;
// or
const acc = array.reduce((a, b) => a + b, 0);
However in the end it's just a loop, you won't waste much time if the loop is performing bad, but you will if the whole algorithm performs bad (time complexity of O(n²) or more). Focus on the important things, not the loops.

To elaborate on Jonas Wilms' answer, Javascript does not work with negative indice (unlike languages like Python).
iterations[-1] is equal to iteration["-1"], which look for the property named -1 in the array object. That's why iterations[-1] will evaluate to undefined.

Related

Time Complexity Hash Table

I implemented a simple algorithm in 2 ways. One using indexOf and other one using Hash Table.
The problem:
Given an arbitrary ransom note string and another string containing letters from all the magazines, write a function that will return true if the ransom note can be constructed from the magazines ; otherwise, it will return false.
First one. Is it Time Complexity O(N^2) because I have N letters in the ransomNote and I can do N searches in the indexOf?
var canConstruct = function(ransomNote, magazine) {
if(magazine.length < ransomNote.length) return false;
const arr = magazine.split("");
for(let i=0; i<ransomNote.length; i++) {
if(arr.indexOf(ransomNote[i]) < 0)
return false;
const index = arr.indexOf(ransomNote[i]);
arr.splice(index, 1);
}
return true;
};
Second one. What is the time complexity? Does the hash table makes it O(N)?
var canConstruct = function(ransomNote, magazine) {
if(magazine.length < ransomNote.length) return false;
const map = new Map();
for(let i =0; i<magazine.length; i++) {
if(map.has(magazine[i]))
map.set(magazine[i], map.get(magazine[i])+1);
else
map.set(magazine[i], 1);
}
for(let i=0; i<ransomNote.length; i++) {
if(!map.has(ransomNote[i]))
return false;
else {
const x = map.get(ransomNote[i]) - 1;
if(x > 0)
map.set(ransomNote[i], x)
else
map.delete(ransomNote[i]);
}
}
return true;
};
Thanks
FIRST solution
Well one thing you have to take in consideration, especially in the first solution is that split, slice and indexOf methods all have their own time complexity.
Let's say you have m letters in magazine. When you split it into array, you will already use O(m) time complexity there (and of course O(m) space complexity due to the fact you store it all in a new array, that has a size of m).
Now you enter a for loop that will run n times (where n is the number of letters in ransomNote). So right then and there you have O(m * n) time complexity. indexOf operation will also be called n times with the caveat that it runs O(m) every time it's called. You can see there how quickly you start adding up your time complexity.
I see it being like O(3 * m * n^2) which rounds up to O(m * n^n) time complexity. I would strongly suggest not calling indexOf multiple times, just call it once and store its result somewhere. You will either have an index or -1 meaning it wasn't found.
SECOND solution
Much better. Here you populate a hash map (so some extra memory used, but considering you also used a split in first, and stored it, it should roughly be the same).
And then you just simply go with one loop over a randomNote and find a letter in the hashMap. Finding a letter in map is O(1) Time so it's really useful for this type of algorithm.
I would argue the complexity would be O(n * m) in the end so much better than the first one.
Hope I made sense to you. IF you want we can dive a bit deeper in space analysis later in the comments if you respond
The version with indexOf() is O(N*M), where N is the length of the ransom note, and M is the length of the magazine. You perform up to N searches, and each search is linear through the magazine array that's N characters. Also, array.splice() is O(M) because it has to copy all the array elements after index to the lower index.
Hash table access and updates are generally considered to be O(1). The second version performs N hash table lookups and updates, so the overall complexity is O(N).

Question concerning efficiency in calculated array indices and repeated references to a specific array value

In trying to minimize the time it takes a JavaScript function to process, please consider this set up. In the function is a loop that operates on an array of similar objects. The index of the array is [4 + loop counter] and there are several references to array[4+i][various property names], such as a[4+i].x, a[4+i].y, a[4+i].z in each loop iteration.
Is it okay to keep calculating 4+i several times within each loop iteration, or would efficiency be gained by declaring a variable at the top of the loop to hold the value of 4+i and use that variable as the index, or declare a variable to be a reference to the a[4+i] object? Is it more work for the browser to declare a new variable or to add 4+i ten times? Does the browser work each time to find a[n] such that, if one needs to use the object in a[n] multiple times per loop iteration, it would be better to set x = a[n] and just reference x.property_names ten times?
Thank you for considering my very novice question.
Does the browser work each time to find a[n] such that, if one needs to use the object in a[n] multiple times per loop iteration, it would be better to set x = a[n] and just reference x.property_names ten times?
Yes. Although the JavaScript engine may be able to optimize away the repeated a[4+i] work, it also might not be able to, depending on what your code is doing. In contrast, creating a local variable to store the reference in is very, very little work.
Subjectively, it's also probably clearer to the reader and more maintainable to do x = a[4+i] once and then use x.
That said, the best way to know the answer to this question in your specific situation is to do it and see if there's an improvement.
This snippet runs for a bit more than half minutes...
function m(f){
const t=[Date.now()];
const s=[];
for(let r=0;r<10;r++){
s.push(f());
t.push(Date.now());
}
for(let i=0;i<t.length-1;i++)
t[i]=t[i+1]-t[i];
t.pop();
t.sort((a,b)=>a-b);
t.pop();
t.pop();
return t.reduce((a,b)=>a+b);
}
const times=1000;
const bignumber=100000;
const bigarray=new Array(bignumber);
for(let i=0;i<bignumber;i++)
bigarray[i]={x:Math.random(),y:Math.random(),z:Math.random()};
for(let i=0;i<4;i++)
bigarray.push(bigarray[i]);
console.log("idx",m(function(){
let sum=0;
for(let r=0;r<times;r++)
for(let i=0;i<bignumber;i++)
sum+=bigarray[i].x+bigarray[i].y+bigarray[i].z;
return sum;
}));
console.log("idx+4",m(function(){
let sum=0;
for(let r=0;r<times;r++)
for(let i=0;i<bignumber;i++)
sum+=bigarray[i+4].x+bigarray[i+4].y+bigarray[i+4].z;
return sum;
}));
console.log("item",m(function(){
let sum=0;
for(let r=0;r<times;r++)
for(let i=0;i<bignumber;i++){
let item=bigarray[i];
sum+=item.x+item.y+item.z;
}
return sum;
}));
console.log("item+4",m(function(){
let sum=0;
for(let r=0;r<times;r++)
for(let i=0;i<bignumber;i++){
let item=bigarray[i+4];
sum+=item.x+item.y+item.z;
}
return sum;
}));
... and produces output like
idx 2398
idx+4 2788
item 2252
item+4 2303
for me on Chrome. The numbers are runtime in milliseconds of 8 runs (8 best out of 10).
Where
idx is bigarray[b].x+bigarray[b].y+bigarray[b].z, repeated access to the same element with a named index (i)
idx+4 is bigarray[i+4].x+bigarray[i+4].y+bigarray[i+4].z, repeated access to the same element with a calculated index (i+4)
item is item.x+item.y+item.z, so an array element was stored in a variable
item+4 is item.x+item.y+item.z too, just the array element was picked from i+4
Your question is very visibly the outlier here. Repeated access to an element with a "fixed" index (idx case) is already a bit slower than getting out the element into a variable (item and item+4 cases, where +4 is the slower one of course, that addition is executed 800 million times after all). But the 3 times repeated access to an element with a calculated index (idx+4 case) is 15-20+% slower than any of the others.
Here the array is so small that it fits into the L3 cache. If you "move" a couple 0-s from times to bignumber, the overall difference decreases to 10-15%, and anything else than idx+4 performs practically the same.

Why is using a loop to iterate from start of array to end faster than iterating both start to end and end to start?

Given an array having .length 100 containing elements having values 0 to 99 at the respective indexes, where the requirement is to find element of of array equal to n : 51.
Why is using a loop to iterate from start of array to end faster than iterating both start to end and end to start?
const arr = Array.from({length: 100}, (_, i) => i);
const n = 51;
const len = arr.length;
console.time("iterate from start");
for (let i = 0; i < len; i++) {
if (arr[i] === n) break;
}
console.timeEnd("iterate from start");
const arr = Array.from({length: 100}, (_, i) => i);
const n = 51;
const len = arr.length;
console.time("iterate from start and end");
for (let i = 0, k = len - 1; i < len && k >= 0; i++, k--) {
if (arr[i] === n || arr[k] === n) break;
}
console.timeEnd("iterate from start and end");
jsperf https://jsperf.com/iterate-from-start-iterate-from-start-and-end/1
The answer is pretty obvious:
More operations take more time.
When judging the speed of code, you look at how many operations it will perform. Just step through and count them. Every instruction will take one or more CPU cycles, and the more there are the longer it will take to run. That different instructions take a different amount of cycles mostly does not matter - while an array lookup might be more costly than integer arithmetic, both of them basically take constant time and if there are too many, it dominates the cost of our algorithm.
In your example, there are few different types of operations that you might want to count individually:
comparisons
increments/decrements
array lookup
conditional jumps
(we could be more granular, such as counting variable fetch and store operations, but those hardly matter - everything is in registers anyway - and their number basically is linear to the others).
Now both of your code iterate about 50 times - they element on which they break the loop is in the middle of the array. Ignoring off-by-a-few errors, those are the counts:
| forwards | forwards and backwards
---------------+------------+------------------------
>=/===/< | 100 | 200
++/-- | 50 | 100
a[b] | 50 | 100
&&/||/if/for | 100 | 200
Given that, it's not unexpected that doing twice the works takes considerably longer.
I'll also answer a few questions from your comments:
Is additional time needed for the second object lookup?
Yes, every individual lookup counts. It's not like they could be performed at once, or optimised into a single lookup (imaginable if they had looked up the same index).
Should there be two separate loops for each start to end and end to start?
Doesn't matter for the number of operations, just for their order.
Or, put differently still, what is the fastest approach to find an element in an array?
There is no "fastest" regarding the order, if you don't know where the element is (and they are evenly distributed) you have to try every index. Any order - even random ones - would work the same. Notice however that your code is strictly worse, as it looks at each index twice when the element is not found - it does not stop in the middle.
But still, there are a few different approaches at micro-optimising such a loop - check these benchmarks.
let is (still?) slower than var, see Why is using `let` inside a `for` loop so slow on Chrome? and Why is let slower than var in a for loop in nodejs?. This tear-up and tear-down (about 50 times) of the loop body scope in fact does dominate your runtime - that's why your inefficient code isn't completely twice as slow.
comparing against 0 is marginally faster than comparing against the length, which puts looping backwards at an advantage. See Why is iterating through an array backwards faster than forwards, JavaScript loop performance - Why is to decrement the iterator toward 0 faster than incrementing and Are loops really faster in reverse?
in general, see What's the fastest way to loop through an array in JavaScript?: it changes from engine update to engine update. Don't do anything weird, write idiomatic code, that's what will get optimised better.
#Bergi is correct. More operations is more time. Why? More CPU clock cycles.
Time is really a reference to how many clock cycles it takes to execute the code.
In order to get to the nitty-gritty of that you need to look at the machine level code (like assembly level code) to find the true evidence. Each CPU (core?) clock cycle can execute one instruction, so how many instructions are you executing?
I haven't counted the clock cycles in a long time since programming Motorola CPUs for embedded applications. If your code is taking longer then it is in fact generating a larger instruction set of machine code, even if the loop is shorter or runs an equal amount of times.
Never forget that your code is actually getting compiled into a set of commands that the CPU is going to execute (memory pointers, instruction-code level pointers, interrupts, etc.). That is how computers work and its easier to understand at the micro controller level like an ARM or Motorola processor but the same is true for the sophisticated machines that we are running on today.
Your code simply does not run the way you write it (sounds crazy right?). It is run as it is compiled to run as machine level instructions (writing a compiler is no fun). Mathematical expression and logic can be compiled in to quite a heap of assembly, machine level code and that is up to how the compiler chooses to interpret it (it is bit shifting, etc, remember binary mathematics anyone?)
Reference:
https://software.intel.com/en-us/articles/introduction-to-x64-assembly
Your question is hard to answer but as #Bergi stated the more operations the longer, but why? The more clock cycles it takes to execute your code. Dual core, quad core, threading, assembly (machine language) it is complex. But no code gets executed as you have written it. C++, C, Pascal, JavaScript, Java, unless you are writing in assembly (even that compiles down to machine code) but it is closer to actual execution code.
A masters in CS and you will get to counting clock cycles and sort times. You will likely make you own language framed on machine instruction sets.
Most people say who cares? Memory is cheap today and CPUs are screaming fast and getting faster.
But there are some critical applications where 10 ms matters, where an immediate interrupt is needed, etc.
Commerce, NASA, a Nuclear power plant, Defense Contractors, some robotics, you get the idea . . .
I vote let it ride and keep moving.
Cheers,
Wookie
Since the element you're looking for is always roughly in the middle of the array, you should expect the version that walks inward from both the start and end of the array to take about twice as long as one that just starts from the beginning.
Each variable update takes time, each comparison takes time, and you're doing twice as many of them. Since you know it will take one or two less iterations of the loop to terminate in this version, you should reason it will cost about twice as much CPU time.
This strategy is still O(n) time complexity since it only looks at each item once, it's just specifically worse when the item is near the center of the list. If it's near the end, this approach will have a better expected runtime. Try looking for item 90 in both, for example.
Selected answer is excellent. I'd like to add another aspect: Try findIndex(), it's 2-3 times faster than using loops:
const arr = Array.from({length: 900}, (_, i) => i);
const n = 51;
const len = arr.length;
console.time("iterate from start");
for (let i = 0; i < len; i++) {
if (arr[i] === n) break;
}
console.timeEnd("iterate from start");
console.time("iterate using findIndex");
var i = arr.findIndex(function(v) {
return v === n;
});
console.timeEnd("iterate using findIndex");
The other answers here cover the main reasons, but I think an interesting addition could be mentioning cache.
In general, sequentially accessing an array will be more efficient, particularly with large arrays. When your CPU reads an array from memory, it also fetches nearby memory locations into cache. This means that when you fetch element n, element n+1 is also probably loaded into cache. Now, cache is relatively big these days, so your 100 int array can probably fit comfortably in cache. However, on an array of much larger size, reading sequentially will be faster than switching between the beginning and the end of the array.

Javascript performance: reduce() vs for-loop

I was attempting this Codewars challenge and the problem involves finding the divisors of a number and then calculating the sum of these divisors squared. I found two approaches to this problem.
The first approach is based on another Stackoverflow questions about finding the sum of all divisors and seems clever at first:
function divisorsSquared(n) {
// create a numeric sequence and then reduce it
return [...Array(n+1).keys()].slice(1)
.reduce((sum, num)=>sum+(!(n % (num)) && Math.pow(num,2)), 0);
}
The second approach I used was using a simple for-loop:
function divisorsSquared(n) {
var sum = 0;
for(var i = 1; i<= n; i++){
if(n % i === 0) sum += Math.pow(i,2);
}
return sum;
}
Now I noticed that the first approach is significantly slower than the second and a quick jsperf test confirms this.
My questions are: Why is the first approach so much slower and what approach is preferable in production code?
On Codewars I notice that for many challenges there are clever one-line solutions using similar array methods. As a beginner, may such solutions be considered better practice than for-loops, even if performance is worse?
Array(n+1) allocates an array with n + 1 elements, Array(n+1).keys() returns an iterator over the created array's indices, but the spread operator [...Iterator] helps "unwrap" this iterator into yet another array, then finally slice(1) comes along to copy the secondly created array starting at index 1 which allocates yet another array (third one) with the number 0 discarded. So that were 3 allocations but 2 were dicarded. Your for-loop does not allocate any arrays, it is a simple traversal in O(n) with only 2 allocations for i and sum, so it is more efficient
sum+(!(n % (num)) && Math.pow(num,2)) is essentially the same as if(n % i === 0) sum += Math.pow(i,2); but the if approach is way more readable. If I were the judge, I would pick the second approach because it is more memory efficient, yet it favors readability.
Looking into the code, for loop is obviously less complex and more readable.
Consider you are working within a team, maximum number of your team members will know what the code is doing right away.
Some will have to look up what the reduce() method is, but then they'll also know what's going on.
So here, a for loop is easier for others to read and understand.
On the other side, native array functions (filter(), map(), reduce()) will save you from writing some extra code
and also slower in performance.
For a beginner, I think for-loops should be better over native array functions.
Functional or imperative approaches makes a difference in JS. Imperative always wins.
Yet, the real thing is most of time a better algorithm is the winner. Your code is a naive approach. You can tune it to work much better just by checking the integers up until the square root of the target value and you will get two answers per check. If target is 100 if 2 is a dividend then 100/2 must be a dividend too.. So it's fair to check up to Math.sqrt(100) - 1 and handle 10 with care in order to not consider it twice.
Accordingly now the functional solution with reduce beats the imperative naive solution.
function divisorsSquared(n) {
var sn = Math.sqrt(n);
return Array.from({length:~~sn-1},(_,i) => i+1)
.reduce((s,m) => n%m ? s : s + m*m + (n/m)*(n/m), 0) + (n%sn ? 0 : sn*sn);
}
var result = 0;
console.time("functional and tuned");
result = divisorsSquared(1000000);
console.timeEnd("functional and tuned");
console.log("for input: 1000000 the result is:",result);
function dvssqr(n) {
var sum = 0;
for(var i = 1; i<= n; i++){
if(n % i === 0) sum += Math.pow(i,2);
}
return sum;
}
console.time("imperative and naive");
result = dvssqr(1000000);
console.timeEnd("imperative and naive");
console.log("for input: 1000000 the result is:",result);

Efficient computation of n choose k in Node.js

I have some performance sensitive code on a Node.js server that needs to count combinations. From this SO answer, I used this simple recursive function for computing n choose k:
function choose(n, k) {
if (k === 0) return 1;
return (n * choose(n-1, k-1)) / k;
}
Then since we all know iteration is almost always faster than recursion, I wrote this function based on the multiplicative formula:
function choosei(n,k){
var result = 1;
for(var i=1; i <= k; i++){
result *= (n+1-i)/i;
}
return result;
}
I ran a few benchmarks on my machine. Here are the results of just one of them:
Recursive x 178,836 ops/sec ±7.03% (60 runs sampled)
Iterative x 550,284 ops/sec ±5.10% (51 runs sampled)
Fastest is Iterative
The results consistently showed that the iterative method is indeed about 3 to 4 times faster than the recursive method in Node.js (at least on my machine).
This is probably fast enough for my needs, but is there any way to make it faster? My code has to call this function very frequently, sometimes with fairly large values of n and k, so the faster the better.
EDIT
After running a few more tests with le_m's and Mike's solutions, it turns out that while both are significantly faster than the iterative method I proposed, Mike's method using Pascal's triangle appears to be slightly faster than le_m's log table method.
Recursive x 189,036 ops/sec ±8.83% (58 runs sampled)
Iterative x 538,655 ops/sec ±6.08% (51 runs sampled)
LogLUT x 14,048,513 ops/sec ±9.03% (50 runs sampled)
PascalsLUT x 26,538,429 ops/sec ±5.83% (62 runs sampled)
Fastest is PascalsLUT
The logarithmic look up method has been around 26-28 times faster than the iterative method in my tests, and the method using Pascal's triangle has been about 1.3 to 1.8 times faster than the logarithmic look up method.
Note that I followed le_m's suggestion of pre-computing the logarithms with higher precision using mathjs, then converted them back to regular JavaScript Numbers (which are always double-precision 64 bit floats).
Never compute factorials, they grow too quickly. Instead compute the result you want. In this case, you want the binomial numbers, which have an incredibly simple geometric construction: you can build pascal's triangle, as you need it, and do it using plain arithmetic.
Start with [1] and [1,1]. The next row has [1] at the start, [1+1] in the middle, and [1] at the end: [1,2,1]. Next row: [1] at the start, the sum of the first two terms in spot 2, the sum of the next two terms in spot three, and [1] at the end: [1,3,3,1]. Next row: [1], then 1+3=4, then 3+3=6, then 3+1=4, then [1] at the end, and so on and so on. As you can see, no factorials, logarithms, or even multiplications: just super fast addition with clean integer numbers. So simple, you can build a massive lookup table by hand.
And you should.
Never compute in code what you can compute by hand and just include as constants for immediate lookup; in this case, writing out the table for up to something around n=20 is absolutely trivial, and you can then just use that as your "starting LUT" and probably never even access the high rows.
But, if you do need them, or more, then because you can't build an infinite lookup table you compromise: you start with a pre-specified LUT, and a function that can "fill it up" to some term you need that's not in it yet:
// step 1: a basic LUT with a few steps of Pascal's triangle
const binomials = [
[1],
[1,1],
[1,2,1],
[1,3,3,1],
[1,4,6,4,1],
[1,5,10,10,5,1],
[1,6,15,20,15,6,1],
[1,7,21,35,35,21,7,1],
[1,8,28,56,70,56,28,8,1],
// ...
];
// step 2: a function that builds out the LUT if it needs to.
module.exports = function binomial(n,k) {
while(n >= binomials.length) {
let s = binomials.length;
let nextRow = [];
nextRow[0] = 1;
for(let i=1, prev=s-1; i<s; i++) {
nextRow[i] = binomials[prev][i-1] + binomials[prev][i];
}
nextRow[s] = 1;
binomials.push(nextRow);
}
return binomials[n][k];
};
Since this is an array of ints, the memory footprint is tiny. For a lot of work involving binomials, we realistically don't even need more than two bytes per integer, making this a minuscule lookup table: we don't need more than 2 bytes until you need binomials higher than n=19, and the full lookup table up to n=19 takes up a measly 380 bytes. This is nothing compared to the rest of your program. Even if we allow for 32 bit ints, we can get up to n=35 in a mere 2380 bytes.
So the lookup is fast: either O(constant) for previously computed values, (n*(n+1))/2 steps if we have no LUT at all (in big O notation, that would be O(n²), but big O notation is almost never the right complexity measure), and somewhere in between for terms we need that aren't in the LUT yet. Run some benchmarks for your application, which will tell you how big your initial LUT should be, simply hard code that (seriously. these are constants, they're are exactly the kind of values that should be hard coded), and keep the generator around just in case.
However, do remember that you're in JavaScript land, and you are constrained by the JavaScript numerical type: integers only go up to 2^53, beyond that the integer property (every n has a distinct m=n+1 such that m-n=1) is not guaranteed. This should hardly ever be a problem, though: once we hit that limit, we're dealing with binomial coefficients that you should never even be using.
The following algorithm has a run-time complexity of O(1) given a linear look-up table of log-factorials with space-complexity O(n).
Limiting n and k to the range [0, 1000] makes sense since binomial(1000, 500) is already dangerously close to Number.MAX_VALUE. We would thus need a look-up table of size 1000.
On a modern JavaScript engine, a compact array of n numbers has a size of n * 8 bytes. A full look-up table would thus require 8 kilobytes of memory. If we limit our input to the range [0, 100], the table would only occupy 800 bytes.
var logf = [0, 0, 0.6931471805599453, 1.791759469228055, 3.1780538303479458, 4.787491742782046, 6.579251212010101, 8.525161361065415, 10.60460290274525, 12.801827480081469, 15.104412573075516, 17.502307845873887, 19.987214495661885, 22.552163853123425, 25.19122118273868, 27.89927138384089, 30.671860106080672, 33.50507345013689, 36.39544520803305, 39.339884187199495, 42.335616460753485, 45.38013889847691, 48.47118135183523, 51.60667556776438, 54.78472939811232, 58.00360522298052, 61.261701761002, 64.55753862700634, 67.88974313718154, 71.25703896716801, 74.65823634883016, 78.0922235533153, 81.55795945611504, 85.05446701758152, 88.58082754219768, 92.1361756036871, 95.7196945421432, 99.33061245478743, 102.96819861451381, 106.63176026064346, 110.32063971475739, 114.0342117814617, 117.77188139974507, 121.53308151543864, 125.3172711493569, 129.12393363912722, 132.95257503561632, 136.80272263732635, 140.67392364823425, 144.5657439463449, 148.47776695177302, 152.40959258449735, 156.3608363030788, 160.3311282166309, 164.32011226319517, 168.32744544842765, 172.3527971391628, 176.39584840699735, 180.45629141754378, 184.53382886144948, 188.6281734236716, 192.7390472878449, 196.86618167289, 201.00931639928152, 205.1681994826412, 209.34258675253685, 213.53224149456327, 217.73693411395422, 221.95644181913033, 226.1905483237276, 230.43904356577696, 234.70172344281826, 238.97838956183432, 243.2688490029827, 247.57291409618688, 251.8904022097232, 256.22113555000954, 260.5649409718632, 264.9216497985528, 269.2910976510198, 273.6731242856937, 278.0675734403661, 282.4742926876304, 286.893133295427, 291.3239500942703, 295.76660135076065, 300.22094864701415, 304.6868567656687, 309.1641935801469, 313.65282994987905, 318.1526396202093, 322.66349912672615, 327.1852877037752, 331.7178871969285, 336.26118197919845, 340.815058870799, 345.37940706226686, 349.95411804077025, 354.5390855194408, 359.1342053695754, 363.73937555556347];
function binomial(n, k) {
return Math.exp(logf[n] - logf[n-k] - logf[k]);
}
console.log(binomial(5, 3));
Explanation
Starting with the original iterative algorithm, we first replace the product with a sum of logarithms:
function binomial(n, k) {
var logresult = 0;
for (var i = 1; i <= k; i++) {
logresult += Math.log(n + 1 - i) - Math.log(i);
}
return Math.exp(logresult);
}
Our loop now sums over k terms. If we rearrange the sum, we can easily see that we sum over consecutive logarithms log(1) + log(2) + ... + log(k) etc. which we can replace by a sum_of_logs(k) which is actually identical to log(k!). Precomputing these values and storing them in our lookup-table logf then leads to the above one-liner algorithm.
Computing the look-up table:
I recommend precomputing the lookup-table with higher precision and converting the resulting elements to 64-bit floats. If you do not need that little bit of additional precision or want to run this code on the client side, use this:
var size = 1000, logf = new Array(size);
logf[0] = 0;
for (var i = 1; i <= size; ++i) logf[i] = logf[i-1] + Math.log(i);
Numerical precision:
By using log-factorials, we avoid precision problems inherent to storing raw factorials.
We could even use Stirling's approximation for log(n!) instead of a lookup-table and still get 12 significant figures for above computation in both run-time and space complexity O(1):
function logf(n) {
return n === 0 ? 0 : (n + .5) * Math.log(n) - n + 0.9189385332046728 + 0.08333333333333333 / n - 0.002777777777777778 * Math.pow(n, -3);
}
function binomial(n , k) {
return Math.exp(logf(n) - logf(n - k) - logf(k));
}
console.log(binomial(1000, 500)); // 2.7028824094539536e+299
Using Pascal's triangle is a fast method for calculating n choose k.
The fastest method I know of would be to make use of the results from "On the Complexity of Calculating Factorials". Just calculate all 3 factorials, then perform the two division operations, each with complexity M(n logn).

Categories