I've recently fallen into the trap of premature optimization again, and hopefully climbed back out. During a short intermission, however, I encountered something I'd like to confirm.
My very basic performance test yielded similar results on Chrome (all variables are just declared globally; the first snippet takes ~9 ms, the second ~7.5 ms):
input = Array.from({ length: 1000000 }, () => Math.random() > 0.8 ? 0 : Math.random() * 1000000000 );
start = performance.now();
for (let i = 0; i < 1000000; i++) {
input[i] = input[i] === 0 ? 0 : 1;
}
console.log(performance.now() - start);
and
input = Array.from({ length: 1000000 }, () => Math.random() > 0.8 ? 0 : Math.random() * 1000000000 );
start = performance.now();
for (let i = 0; i < 1000000; i++) {
input[i] = ((input[i] | (~input[i] + 1)) >>> 31) & 1; // branchless: (x | -x) has the sign bit set iff x !== 0
}
console.log(performance.now() - start);
Taking into account that the loop and assignment themselves already take a lot of the time (~4.5 ms), the second version potentially takes ~33% less time, which is however a far smaller gain than I'd expect, and within measurement inaccuracy: on Firefox, for example, both take much longer, and the second is ~33% worse.
Can I conclude at this point that an optimization of the condition is already taking place, and that a similar code transformation is already being applied, or am I falling for some mirage of a microbenchmark? In my mind, a branch of this category should be vastly more time-consuming than the calculation.
I am primarily skeptical of my own reasoning because I know these kinds of tests can very easily produce skewed results for unforeseen reasons.
Yes, it's most likely a combination of measurement noise and JIT-compiler effects.
The heuristic-driven JIT compilers in modern browsers' JS engines are so good that you won't have any chance of observing the "real" performance benefit of the intuitively faster statement.
The variance in my browser (Opera) is so high that both snippets need approximately equal time, which matches the effect you saw in Firefox.
Be glad that you found that 1.5 ms advantage; you won't see more.
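If you still want to compare the two loops, one mitigation is to warm the code up first, so the JIT has already optimized it, and then take the median of many runs instead of trusting a single timing. A minimal sketch, with a hypothetical timeIt helper that is not from the original post:
function timeIt(fn, runs = 50) {
  // Warmup: give the JIT a chance to optimize fn before measuring.
  for (let i = 0; i < 10; i++) fn();
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  // The median is less sensitive to GC pauses and scheduler noise than a single run.
  samples.sort((a, b) => a - b);
  return samples[Math.floor(runs / 2)];
}
Even then, the result only describes the particular engine version you ran it on.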
I need some help optimizing the code below; I can't figure out how to rewrite it to avoid the deoptimizations.
Below is code that runs very slowly on the Node platform. I took it from the benchmarksgame-team binary-trees benchmark and made minor changes.
When it is run with --trace-deopt, it shows that the functions on the hot path are deoptimized, i.e.:
[bailout (kind: deopt-lazy, reason: (unknown)): begin. deoptimizing 0x02117de4aa11 <JSFunction bottomUpTree2 (sfi = 000002F24C7D7539)>, opt id 3, bytecode offset 9, deopt exit 17, FP to SP delta 80, caller SP 0x00807d7fe6f0, pc 0x7ff6afaca78d]
Here is the benchmark; run it with node --trace-deopt a 20
function mainThread() {
const maxDepth = Math.max(6, parseInt(process.argv[2]));
const stretchDepth = maxDepth + 1;
const check = itemCheck(bottomUpTree(stretchDepth));
console.log(`stretch tree of depth ${stretchDepth}\t check: ${check}`);
const longLivedTree = bottomUpTree(maxDepth);
for (let depth = 4; depth <= maxDepth; depth += 2) {
const iterations = 1 << (maxDepth - depth + 4);
work(iterations, depth);
}
console.log(`long lived tree of depth ${maxDepth}\t check: ${itemCheck(longLivedTree)}`);
}
function work(iterations, depth) {
let check = 0;
for (let i = 0; i < iterations; i++) {
check += itemCheck(bottomUpTree(depth));
}
console.log(`${iterations}\t trees of depth ${depth}\t check: ${check}`);
}
function TreeNode(left, right) {
return {left, right};
}
function itemCheck(node) {
if (node.left === null) {
return 1;
}
return 1 + itemCheck2(node);
}
function itemCheck2(node) {
return itemCheck(node.left) + itemCheck(node.right);
}
function bottomUpTree(depth) {
return depth > 0
? bottomUpTree2(depth)
: new TreeNode(null, null);
}
function bottomUpTree2(depth) {
return new TreeNode(bottomUpTree(depth - 1), bottomUpTree(depth - 1));
}
console.time();
mainThread();
console.timeEnd();
(V8 developer here.)
The premise of this question is incorrect: a few deopts don't matter, and don't move the needle regarding performance. Trying to avoid them is an exercise in futility.
The first step when trying to improve performance of something is to profile it. In this case, a profile reveals that the benchmark is spending:
about 46.3% of the time in optimized code (about 4/5 of that for tree creation and 1/5 for tree iteration)
about 0.1% of the time in unoptimized code
about 52.8% of the time in the garbage collector, tracing and freeing all those short-lived objects.
This is as artificial a microbenchmark as they come. 50% GC time never happens in real-world code that does useful things aside from allocating multiple gigabytes of short-lived objects as fast as possible.
In fact, calling them "short-lived objects" is a bit inaccurate in this case. While the vast majority of the individual trees being constructed are indeed short-lived, the code allocates one super-large long-lived tree early on. That fools V8's adaptive mechanisms into assuming that all future TreeNodes will be long-lived too, so it allocates them in "old space" right away -- which would save time if the guess was correct, but ends up wasting time because the TreeNodes that follow are actually short-lived and would be better placed in "new space" (which is optimized for quickly freeing short-lived objects). So just by reshuffling the order of operations, I can get a 3x speedup.
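To illustrate that reshuffling, here is one way such a change could look (an assumption on my part, not necessarily the exact change referred to above): allocate the long-lived tree only after the allocation-heavy loop, so the adaptive mechanisms first observe TreeNodes that die young:
function mainThreadReshuffled() {
  const maxDepth = Math.max(6, parseInt(process.argv[2]));
  const stretchDepth = maxDepth + 1;
  console.log(`stretch tree of depth ${stretchDepth}\t check: ${itemCheck(bottomUpTree(stretchDepth))}`);
  for (let depth = 4; depth <= maxDepth; depth += 2) {
    work(1 << (maxDepth - depth + 4), depth);
  }
  // Moved down: the long-lived tree is now allocated after the short-lived ones.
  const longLivedTree = bottomUpTree(maxDepth);
  console.log(`long lived tree of depth ${maxDepth}\t check: ${itemCheck(longLivedTree)}`);
}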
This is a typical example of one of the general problems with microbenchmarks: by doing something extreme and unrealistic, they often create situations that are not at all representative of typical real-world scenarios. If engine developers optimized for such microbenchmarks, engines would perform worse for real-world code. If JavaScript developers try to derive insights from microbenchmarks, they'll write code that performs worse under realistic conditions.
Anyway, if you want to optimize this code, avoid as many of those object allocations as you can.
Concretely:
An artificial microbenchmark like this, by its nature, intentionally does useless work (such as: computing the same value a million times). You said you wanted to optimize it, which means avoiding useless work, but you didn't specify which parts of the useless work you'd like to preserve, if any. So in the absence of a preference, I'll assume that all useless work is useless. So let's optimize!
Looking at the code, it creates perfect binary trees of a given depth and counts their nodes. In other words, it sums up the 1s in these examples:
depth=0:
1
depth=1:
1
/ \
1 1
depth=2:
1
/ \
1 1
/ \ / \
1 1 1 1
and so on. If you think about it for a bit, you'll realize that such a tree of depth N has (2 ** (N+1)) - 1 nodes. So we can replace:
itemCheck(bottomUpTree(depth));
with
(2 ** (depth+1)) - 1
(and analogously for the "stretchDepth" line).
Next, we can take care of the useless repetitions. Since x + x + x + x + ... N times is the same as x*N, we can replace:
let check = 0;
for (let i = 0; i < iterations; i++) {
check += (2 ** (depth + 1)) - 1;
}
with just:
let check = ((2 ** (depth + 1)) - 1) * iterations;
With that, we're down from 12 seconds to about 0.1 seconds. Not bad for five minutes of work, eh?
And that remaining time is almost entirely due to the longLivedTree. To apply the same optimizations to the operations creating and iterating that tree, we'd have to move them together, getting rid of its "long-livedness". Would you find that acceptable? You could get the overall time down to less than a millisecond! Would that make the benchmark useless? Not actually any more useless than it was to begin with, just more obviously so.
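Putting those pieces together, a minimal sketch of the "optimized" benchmark (keeping the long-lived tree for now, per the remark above) could look like this:
function mainThreadOptimized() {
  const maxDepth = Math.max(6, parseInt(process.argv[2]));
  const stretchDepth = maxDepth + 1;
  // Closed form: a perfect tree of depth N has (2 ** (N+1)) - 1 nodes.
  console.log(`stretch tree of depth ${stretchDepth}\t check: ${2 ** (stretchDepth + 1) - 1}`);
  const longLivedTree = bottomUpTree(maxDepth);
  for (let depth = 4; depth <= maxDepth; depth += 2) {
    const iterations = 1 << (maxDepth - depth + 4);
    const check = ((2 ** (depth + 1)) - 1) * iterations;
    console.log(`${iterations}\t trees of depth ${depth}\t check: ${check}`);
  }
  console.log(`long lived tree of depth ${maxDepth}\t check: ${itemCheck(longLivedTree)}`);
}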
Just to train myself a bit in TypeScript I wrote a simple ES6 Map+Set-like implementation based on a plain JS object. It works only for primitive keys, so no buckets, no hash codes, etc.
The problem I encountered is implementing the delete method. Using plain delete is just unacceptably slow: for large maps it's about 300-400x slower than ES6 Map's delete. I noticed a huge performance degradation once the object becomes large. On Node.js 7.9.0 (and Chrome 57, for example), if the object has 50855 properties, delete performance is the same as ES6 Map's, but with 50856 properties the ES6 Map is faster by two orders of magnitude. Here is simple code to reproduce it:
// for node 6: 76300
// for node 7: 50855
const N0 = 50855;
function fast() {
const N = N0
const o = {}
for ( let i = 0; i < N; i++ ) {
o[i] = i
}
const t1 = Date.now()
for ( let i = 0; i < N; i++ ) {
delete o[i]
}
const t2 = Date.now()
console.log( N / (t2 - t1) + ' KOP/S' )
}
function slow() {
const N = N0 + 1 // adding just 1
const o = {}
for ( let i = 0; i < N; i++ ) {
o[i] = i
}
const t1 = Date.now()
for ( let i = 0; i < N; i++ ) {
delete o[i]
}
const t2 = Date.now()
console.log( N / (t2 - t1) + ' KOP/S' )
}
fast()
slow()
I guess that instead of deleting properties I could just set them to undefined or some guard object, but that would mess up the code, because hasOwnProperty would no longer work correctly, for...in loops would need an additional check, and so on. Are there nicer solutions?
P.S. I'm using Node 7.9.0 on OS X Sierra.
Edit
Thanks for the comments, guys; I fixed OP/S => KOP/S. I think I asked a rather badly specified question, so I changed the title. After some investigation I found out that, for example, Firefox has no such problem: the deletion cost grows linearly. So it's a problem of the super-smart V8, and I think it's just a bug :(
(V8 developer here.) Yes, this is a known issue. The underlying problem is that objects should switch their elements backing store from a flat array to a dictionary when they become too sparse, and the way this has historically been implemented was for every delete operation to check if enough elements were still present for that transition not to happen yet. The bigger the array, the more time this check took. Under certain conditions (recently created objects below a certain size), the check was skipped -- the resulting impressive speedup is what you're observing in the fast() case.
I've taken this opportunity to fix the (frankly quite silly) behavior of the regular/slow path. It should be sufficient to check every now and then, not on every single delete. The fix will be in V8 6.0, which should be picked up by Node in a few months (I believe Node 8 is supposed to get it at some point).
That said, using delete causes various forms and magnitudes of slowdown in many situations, because it tends to make things more complicated, forcing the engine (any engine) to perform more checks and/or fall off various fast paths. It is generally recommended to avoid using delete whenever possible. Since you have ES6 maps/sets, use them! :-)
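For illustration, the timing loop from the question rewritten on top of an ES6 Map (a sketch; Map's delete does not go through the object-elements machinery discussed above):
const N = 50856; // the "slow" size from the question
const m = new Map();
for (let i = 0; i < N; i++) m.set(i, i);
const t1 = Date.now();
for (let i = 0; i < N; i++) m.delete(i);
const t2 = Date.now();
console.log(N / (t2 - t1) + ' KOP/S');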
To answer the question of why adding 1 to N slows down the delete operation:
My guess: the slowness comes from the way the memory is allocated for your Object.
Try changing your code to this:
(() => {
const N = 50855
const o = {}
for ( let i = 0; i < N; i++ ) {
o[i] = i
}
// Show the heap memory allocated
console.log(process.memoryUsage().heapTotal);
const t1 = Date.now()
for ( let i = 0; i < N; i++ ) {
delete o[i]
}
const t2 = Date.now()
console.log( N / (t2 - t1) + ' OP/S' )
})();
Now, when you run with N = 50855 the memory allocated is "8306688 bytes" (8.3 MB).
When you run with N = 50856 the memory allocated is "8929280 bytes" (8.9 MB).
So you get a 600 KB increase in allocated memory just by adding one more key to your object.
Now, I said I "guess" that this is where the slowness comes from, but I think it makes sense for the delete operation to get slower as the size of your object increases.
If you try N = 70855 you would still see the same 8.9 MB used. This is because memory allocators usually allocate memory in fixed "batches" while growing an array/object, in order to reduce the number of allocations they perform.
Now, the same thing might happen with delete and the GC. The memory you delete has to be picked up by the GC, and if the object is larger the GC will be slower. The memory might also be released once the number of keys drops below a specific threshold.
(You should read about memory allocation for dynamic arrays if you want to know more; there was a cool article on what growth rate to use, but I can't find it at the moment.)
P.S.: delete is not "extremely slow"; you just computed the op/s wrong. The elapsed time is in milliseconds, not seconds, so you have to multiply by 1000.
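For reference, the corrected computation with the same variables would be:
console.log( N / (t2 - t1) * 1000 + ' OP/S' ) // t2 - t1 is in milliseconds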
I'm doing some work that involves processing an insane amount of data in the browser. As a result I'm trying to optimize everything down to the nuts and bolts. I don't need anyone telling me that I'm wasting my time or that premature optimization is the root of all evil.
I would just like to know whether anyone who understands how JS works can tell me if a less-than comparison runs faster than a less-than-or-equals comparison. What I mean by that is, would:
return (i<2? 0:1)
Be parsed and run faster than:
return (i<=1? 0:1)
In this example we're assuming that i is an integer. Thanks.
The JavaScript standard describes the steps that need to be taken in order to evaluate those expressions. You can take a look at the ECMAScript 2015 Language Specification, section 12.9.3.
Be aware that even if there is a slight difference between the steps of those two operations, other things in your application that you cannot control in JavaScript will have much more influence on performance: the work of the garbage collector, the just-in-time compiler, and so on.
Even trying to measure the time in JavaScript will not work, as just taking timestamps has a much bigger influence on performance than the actual expression you want to measure. Also, the code you wrote might not be what is actually evaluated, as the engine may apply optimizations before running it.
I wouldn't call this micro-optimisation, but rather nano-optimisation.
The cases are so similar that you'll most likely have a measurement precision below the gain you can expect...
(Edit)
If this code gets optimised, the generated assembly will just change from JA to JAE (on x86), and those use the same cycle count: a 0.0000% change.
If it does not, you might win one step within a select inside the engine...
The annoying thing is that this makes you miss the larger picture: unless I'm wrong, you need a branch here, and if you're that worried about time, the statistical distribution of your input will influence the execution time WAY more (but still not that much...).
So take a step back and compare:
if (i<2)
return 0;
else
return 1;
and :
if (i>=2)
return 1;
else
return 0;
You can see that for (100, 20, 10, 1, 50, 10), (1) will branch far more often, and that for (0, 1, 0, 0, 20, 1), (2) branches more.
That will make much more of a difference... and it might still be very difficult to measure!
(As a question left to the reader, I wonder how return +(i>1) compiles, and whether there's a trick to avoid branching...)
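On that last point, here are two branch-free ways to spell it in source (a small sketch of my own; whether the engine emits an actual branch for any of these is entirely up to the JIT):
function step(i) { return +(i > 1); } // the boolean is coerced to 0 or 1
function step2(i) { return (i > 1) | 0; } // same idea via bitwise OR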
(By the way, I'm not against early optimisation; I even posted some advice here, if it might interest you: https://gamealchemist.wordpress.com/2016/04/15/writing-efficient-javascript-a-few-tips/ )
I have created a fiddle using the performance.now and console.time APIs.
Both APIs report how many ms it took to execute the functions/loops.
I feel the major difference is in the result: performance.now gives a more precise value, i.e. down to 1/1000th of a ms.
https://jsfiddle.net/ztacgxf1/
function lessThan(){
var t0 = performance.now();
console.time("lessThan");
for(var i = 0; i < 100000; i++){
if(i < 1000){}
}
console.timeEnd("lessThan");
var t1 = performance.now();
console.log("Perf -- >>" + (t1-t0));
}
function lessThanEq(){
var t0 = performance.now();
console.time("lessThanEq")
for(var i = 0; i < 100000; i++){
if(i <= 999){}
}
console.timeEnd("lessThanEq");
var t1 = performance.now();
console.log("Perf -- >>" + (t1-t0));
}
lessThan()
lessThanEq()
I didn't see much difference. Iterating more might give a different result.
Hope this helps you.
This code takes 3 seconds on Chrome and 6 seconds on Firefox.
If I write the same code in Java and run it under Java 7.0, it takes only 10 ms.
Chrome's JS engine is usually very fast. Why is it so slow here?
BTW, this code is just for testing; I know it's not a very practical way to write a Fibonacci function.
fib = function(n) {
if (n < 2) {
return n;
} else {
return fib(n - 1) + fib(n - 2);
}
};
console.log(fib(32));
This isn't the fault of JavaScript, but of your algorithm. You're recomputing the same subproblems over and over again, and it gets worse as N grows. This is the call graph for a single call:
F(32)
/ \
F(31) F(30)
/ \ / \
F(30) F(29) F(29) F(28)
/ \ / \ / \ | \
F(29) F(28) F(28) F(27) F(28) F(27) F(27) F(26)
... deeper and deeper
As you can see from this tree, you're computing some Fibonacci numbers several times; for example, F(28) is computed four times in the levels shown alone. From "The Algorithm Design Manual":
How much time does this algorithm take to compute F(n)? Since F(n+1)/F(n) ≈ φ = (1 + sqrt(5))/2 ≈ 1.61803, this means that F(n) > 1.6^n. Since our recursion tree has only 0 and 1 as leaves, summing up to such a large number means we must have at least 1.6^n leaves or procedure calls! This humble little program takes exponential time to run!
You have to use memoization or build the solution bottom-up (i.e. small subproblems first).
This solution uses memoization (so each Fibonacci number is computed only once):
var cache = {};
function fib(n) {
if (cache[n] === undefined) { // !cache[n] would treat the cached 0 for fib(0) as a miss
cache[n] = (n < 2) ? n : fib(n - 1) + fib(n - 2);
}
return cache[n];
}
This one solves it bottom up:
function fib(n) {
if (n < 2) return n;
var a = 0, b = 1;
while (--n) {
var t = a + b;
a = b;
b = t;
}
return b;
}
As is fairly well known, the implementation of the Fibonacci function you gave in your question requires a lot of steps if implemented naively; in particular, fib(32) takes 7,049,155 calls.
However, these kinds of algorithms can be greatly sped up with a technique known as memoization. If the call fib(32) takes several seconds, the function is implemented naively. If it returns instantly, there is a high probability that the implementation uses memoization.
Based on the evidence already provided, the conclusion I draw is:
When the code is not run from the console (as in the jsFiddle, where my machine, a Sandy Bridge MacBook Air, computes it in 55 ms), the JS engine is able to JIT and possibly automatically memoize the algorithm.
When run from the JS console, none of this occurs. On my machine it was just under 10x slower: 460 ms.
I then edited the code to compute F(38), which bumped the times up to 967 ms and 9414 ms, so the speedup factor stayed similar. This indicates that no memoization is being performed and that the speedup is probably due to JITting.
Just a comment...
Function calls are relatively expensive; recursion is very expensive and always slower than an equivalent using an efficient loop. E.g. the following is thousands of times faster than the recursive alternative in IE:
function fib2(n) {
var arr = [0, 1];
var len = 2;
while (len <= n) {
arr[len] = arr[len-1] + arr[len-2];
++len;
}
return arr[n];
}
And as noted in other answers, the OP's algorithm seems inherently slow as well, but I guess that isn't really the issue.
In addition to the memoization approach recommended by @galymzhan, you could also use another algorithm altogether. Traditionally, the nth Fibonacci number is computed as F(n) = F(n-1) + F(n-2), which has a time complexity directly proportional to n.
Dijkstra came up with an algorithm that derives Fibonacci numbers in less than half the steps required by the conventional formula, outlined in his writing EWD654. It goes:
For odd indices, F(2n+1) = F(n)^2 + F(n+1)^2
For even indices, F(2n) = (2F(n+1) - F(n)) * F(n), or equivalently F(2n+2) = (2F(n) + F(n+1)) * F(n+1)
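For instance, here is a sketch of that doubling idea in JavaScript (my own illustrative implementation, not from Dijkstra's note), computing F(n) in O(log n) steps via the identities above:
function fibDoubling(n) {
  // pair(k) returns [F(k), F(k+1)].
  function pair(k) {
    if (k === 0) return [0, 1];
    const [a, b] = pair(k >> 1); // a = F(m), b = F(m+1), m = floor(k/2)
    const c = (2 * b - a) * a;   // F(2m)
    const d = a * a + b * b;     // F(2m+1)
    return (k & 1) ? [d, c + d] : [c, d];
  }
  return pair(n)[0];
}
console.log(fibDoubling(32)); // 2178309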
How is randomness achieved with Math.random in JavaScript? I've made something that picks randomly between around 50 different options, and I'm wondering how comfortable I should be with using Math.random to get my randomness.
From the specifications:
random():
Returns a Number value with positive sign, greater than or equal to 0 but less than 1, chosen randomly or pseudo randomly with approximately uniform distribution over that range, using an implementation-dependent algorithm or strategy. This function takes no arguments.
So the answer is that it depends on what JavaScript engine you're using.
I'm not sure whether all browsers use the same strategy, or what that strategy is, unfortunately.
It should be fine for your purposes. Only if you were generating a very large quantity of numbers would you begin to see a pattern.
Using Math.random() is fine if you're not centrally pooling and using the results, e.g. for OAuth.
For example, our site used Math.random() to generate random "nonce" strings for use with OAuth. The original JavaScript library did this by choosing a character from a predetermined list using Math.random(), i.e.:
for (var i = 0; i < length; ++i) {
var rnum = Math.floor(Math.random() * chars.length);
result += chars.substring(rnum, rnum+1);
}
The problem is, users were getting duplicate nonce strings (even using a 10-character length, theoretically ~10^18 combinations), usually within a few seconds of each other. My guess is that this is due to Math.random() seeding from the timestamp, as one of the other posters mentioned.
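If you need values that must not collide like that, here is a sketch of a CSPRNG-based alternative (assuming an environment that exposes the Web Crypto API, i.e. modern browsers or recent Node; randomNonce is a hypothetical helper, not from the library above):
function randomNonce(length) {
  const chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  const bytes = new Uint8Array(length);
  crypto.getRandomValues(bytes); // cryptographically strong, unlike Math.random()
  let result = '';
  for (let i = 0; i < length; i++) {
    // The modulo introduces a slight bias; acceptable for nonces.
    result += chars[bytes[i] % chars.length];
  }
  return result;
}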
The exact implementation can of course differ somewhat depending on the browser, but they all use some kind of pseudo-random number generator. Although it's not truly random, it's certainly good enough for general purposes.
You should only be worried about the randomness if you're using it for something that needs exceptionally good randomness, like encryption or simulating a game of chance played for money, but then you would hardly use JavaScript anyway.
It's 100% random enough for your purposes. It's seeded by time, so every time you run it, you'll get different results.
Paste this into your browser's address bar...
javascript:alert(Math.random() * 2 > 1);
and press [Enter] a few times... I got "true, false, false, true" - random enough :)
This is a little overkill... but I couldn't resist doing it :)
You can execute this in your browser's address bar. It generates a random number between 0 and 4, 100000 times, then outputs the number of times each number was generated and the number of times each number followed each other number.
I executed this in Firefox 3.5.2. All the counts seem to be about equal, indicating no bias and no obvious pattern in the way the numbers are generated.
javascript:
var max = 5;
var transitions = new Array(max);
var frequency = new Array(max);
for (var i = 0; i < max; i++)
{
transitions[i] = new Array(max);
}
var old = 0, curr = 0;
for (var i = 0; i < 100000; i++)
{
curr = Math.floor(Math.random()*max);
if (frequency[curr] === undefined)
{
frequency[curr] = 0; // was -1, which made every count off by one after the increment below
}
frequency[curr] += 1;
if (transitions[old][curr] === undefined)
{
transitions[old][curr] = 0; // likewise, start transition counts at 0
}
transitions[old][curr] += 1;
old = curr;
}
alert(frequency);
alert(transitions);