I'm working on a very performance critical part of a browser game and was just splitting apart a big pile of code into more manageable chunks but it seems that I'm paying a pretty serious (~40% in total) performance loss for these extra function calls.
At first I figured that V8 just doesn't do inlining upon compilation but then I tried out this little test:
const nn = 1000000000;
(()=>{
var t = Date.now();
var total = 0;
for (var i = 0; i < nn; i++) {
total += i;
}
console.log(total, Date.now() - t);
})();
(()=>{
var t = Date.now();
var total = 0;
function useless() {}
for (var i = 0; i < nn; i++) {
total += i;
useless();useless();useless();
}
console.log(total, Date.now() - t);
})();
(new class {
useless() {}
test() {
var t = Date.now();
var total = 0;
for (var i = 0; i < nn; i++) {
total += i;
this.useless();this.useless();this.useless();
}
console.log(total, Date.now() - t);
}
}).test();
And in each case, I get an identical result, useless function calls or not. That tells me that there is practically no cost for a useless function call.
Yet in my real code, adding useless function calls incurs a very real penalty for each added, for example in
add_expected_length(v) {
this.set_block_raw(); // <- this is the useless one
while (v > 127) { this.buffer.push((v & 127) + 128); v = Math.trunc(v / 128); }
this.buffer.push(v);
}
even if set_block_raw method is empty, adding it there makes the whole algorithm 5% slower, add two in a row and its 7%, three is 9% and on and on, it seems to scale almost linearly with each useless call added I get a 1-2% performance decrease.
Now if I break my class apart and start examining individual calls, isolating pieces of code here and there the problem goes away, the useless call is ignored but trying to figure out what is wrong in this huge inter-dependent algorithm like that would just take forever.
This seems very bizarre to me and I really want to dig into what V8 generates to see what is causing this. Is there a way to peer behind the js and see what chrome and V8 actually does with it?
I am implementing a selection algorithm that selects an object based on a probability proportional to its score value. This makes higher-scoring objects more likely to be chosen.
My implementation is as follows:
var pool = [];
for (var i = 0; i < 100; i++)
pool.push({ Id: i, Score: Math.random() * 100 << 0 });
const NUM_RUNS = 100000;
var start = new Date();
for( var i = 0; i < NUM_RUNS; i++ )
rouletteSelection(pool);
var end = new Date();
var runningTime = (end.getTime() - start.getTime()) / 1000;
var avgExecutionTime = ( runningTime / NUM_RUNS ) * Math.pow(10,9);
console.log('Running Time: ' + runningTime + ' seconds');
console.log('Avg. Execution Time: ' + avgExecutionTime + ' nanoseconds');
function rouletteSelection(pool) {
// Sum all scores and normalize by shifting range to minimum of 0
var sumScore = 0, lowestScore = 0;
pool.forEach((obj) => {
sumScore += obj.Score;
if (obj.Score < lowestScore)
lowestScore = obj.Score;
})
sumScore += Math.abs(lowestScore * pool.length);
var rouletteSum = 0;
var random = Math.random() * sumScore << 0;
for (var i in pool) {
rouletteSum += pool[i].Score + lowestScore;
if (random < rouletteSum)
return pool[i];
}
//Failsafe
console.warn('Could not choose roulette winner, selecting random');
return pool[Math.random() * pool.length];
};
When run, the performance isn't bad: on my machine, each call to rouleteSelection() takes about 2500-3200 nanoseconds. However, before being used in production, I want to optimize this and shave off as much overhead as I can, as this function could be potentially called tens of millions of times in extreme cases.
An obvious optimization would be to somehow merge everything into a single loop so the object array is only iterated over once. The problem is, in order to normalize the scores (negative scores are shifted to 0), I need to know the lowest score to begin with.
Does anyone have any idea as to how to get around this?
At the risk of stating the obvious: just don't do the normalisation in every call to rouletteSelection. Do it once, after you constructed the pool.
I've got multiple long running tasks, as in longer than ~10ms, that impact the responsiveness of the browser. The worst ones, such as loading and parsing 3D models from files, are already offloaded to Web Workers so that they won't affect the render loop.
Some tasks, however, aren't easily ported to Workers and therefore have to be distributed over multiple frames in the main thread. Instead of doing a 1 second task in one go, I'd like to split it into ~5ms packages to give the browser the chance to execute other events (mouse move, requestAnimationFrame, ...) in between.
Generator functions, in combination with setTimeout, seem to be the easiest way to do that. I've hacked something together that does the job but I'm wondering if there is a nicer/cleaner way to solve this issue.
The code below computes the mean of 100 million invocations of Math.random().
The first version computes the mean in one go, but stalls the browser for ~1.3 seconds.
The second version abuses generator functions to yield after every 5 million points, thereby giving the browser the chance to execute other events (mouse move) in between. The generator function is repeatedly called through a setTimout loop, until it has processed all 100 million samples.
<html>
<head></head>
<body>
<script>
let samples = 100 * 1000 * 1000;
{ // run complete task at once, possibly stalling the browser
function run(){
let start = performance.now();
let sum = 0.0;
for(let i = 0; i < samples; i++){
sum = sum + Math.random();
}
let mean = sum / samples;
let duration = performance.now() - start;
console.log(`single-run: duration: ${duration}`);
console.log(`single-run: sum: ${sum}`);
console.log(`single-run: mean: ${mean}`);
}
run();
}
{ // run in smaller packages to keep browser responsive
// move mouse to check if this callback is executed in between
document.body.addEventListener("mousemove", () => {
console.log("mouse moved");
});
function * distributedRun(){
let start = performance.now();
let packageSize = 5 * 1000 * 1000;
let sum = 0.0;
for(let i = 0; i < samples; i++){
sum = sum + Math.random();
if((i % packageSize) === 0){
yield sum;
}
}
let mean = sum / samples;
let duration = performance.now() - start;
console.log(`distributed-run: duration: ${duration}`);
console.log(`distributed-run: sum: ${sum}`);
console.log(`distributed-run: mean: ${mean}`);
yield sum;
}
let generatorInstance = distributedRun();
function loop(){
let result = generatorInstance.next();
console.log(`distributed-run intermediate result: ${result.value}`);
if(!result.done){
setTimeout(loop, 0);
}
}
loop();
}
</script>
</body>
</html>
ES2018 has async iterators which kind of sound like what I'm looking for but I'm not sure if they're really meant for this kind of problem. Using it like this still stalls the browser:
for await (const result of distributedRun()) {
...
}
(Tried some async's here and there and at the runDistributed() function but tbh, I'm still learning the details of await/async)
Here is a slightly modified version of your code. If you ajust chunk depending on your computation complexity and the amount of lag you can allow, it should work fine.
let samples = 100 * 1000 * 1000;
let chunk = 100000;
async function run() {
let sum = 0.0;
for(let i=0; i<samples; i++) {
sum += Math.random();
if (i % chunk === 0) {
console.log("finished chunk")
// wait for the next tick
await new Promise(res => setTimeout(res, 0));
}
}
let mean = sum / samples;
console.log("finished computation", mean);
}
setTimeout(run, 0);
Regardless of functional differences, does using the new keywords 'let' and 'const' have any generalized or specific impact on performance relative to 'var'?
After running the program:
function timeit(f, N, S) {
var start, timeTaken;
var stats = {min: 1e50, max: 0, N: 0, sum: 0, sqsum: 0};
var i;
for (i = 0; i < S; ++i) {
start = Date.now();
f(N);
timeTaken = Date.now() - start;
stats.min = Math.min(timeTaken, stats.min);
stats.max = Math.max(timeTaken, stats.max);
stats.sum += timeTaken;
stats.sqsum += timeTaken * timeTaken;
stats.N++
}
var mean = stats.sum / stats.N;
var sqmean = stats.sqsum / stats.N;
return {min: stats.min, max: stats.max, mean: mean, spread: Math.sqrt(sqmean - mean * mean)};
}
var variable1 = 10;
var variable2 = 10;
var variable3 = 10;
var variable4 = 10;
var variable5 = 10;
var variable6 = 10;
var variable7 = 10;
var variable8 = 10;
var variable9 = 10;
var variable10 = 10;
function varAccess(N) {
var i, sum;
for (i = 0; i < N; ++i) {
sum += variable1;
sum += variable2;
sum += variable3;
sum += variable4;
sum += variable5;
sum += variable6;
sum += variable7;
sum += variable8;
sum += variable9;
sum += variable10;
}
return sum;
}
const constant1 = 10;
const constant2 = 10;
const constant3 = 10;
const constant4 = 10;
const constant5 = 10;
const constant6 = 10;
const constant7 = 10;
const constant8 = 10;
const constant9 = 10;
const constant10 = 10;
function constAccess(N) {
var i, sum;
for (i = 0; i < N; ++i) {
sum += constant1;
sum += constant2;
sum += constant3;
sum += constant4;
sum += constant5;
sum += constant6;
sum += constant7;
sum += constant8;
sum += constant9;
sum += constant10;
}
return sum;
}
function control(N) {
var i, sum;
for (i = 0; i < N; ++i) {
sum += 10;
sum += 10;
sum += 10;
sum += 10;
sum += 10;
sum += 10;
sum += 10;
sum += 10;
sum += 10;
sum += 10;
}
return sum;
}
console.log("ctl = " + JSON.stringify(timeit(control, 10000000, 50)));
console.log("con = " + JSON.stringify(timeit(constAccess, 10000000, 50)));
console.log("var = " + JSON.stringify(timeit(varAccess, 10000000, 50)));
.. My results were the following:
ctl = {"min":101,"max":117,"mean":108.34,"spread":4.145407097016924}
con = {"min":107,"max":572,"mean":435.7,"spread":169.4998820058587}
var = {"min":103,"max":608,"mean":439.82,"spread":176.44417700791374}
However discussion as noted here seems to indicate a real potential for performance differences under certain scenarios: https://esdiscuss.org/topic/performance-concern-with-let-const
TL;DR
In theory, an unoptimized version of this loop:
for (let i = 0; i < 500; ++i) {
doSomethingWith(i);
}
might be slower than an unoptimized version of the same loop with var:
for (var i = 0; i < 500; ++i) {
doSomethingWith(i);
}
because a different i variable is created for each loop iteration with let, whereas there's only one i with var.
Arguing against that is the fact the var is hoisted so it's declared outside the loop whereas the let is only declared within the loop, which may offer an optimization advantage.
In practice, here in 2018, modern JavaScript engines do enough introspection of the loop to know when it can optimize that difference away. (Even before then, odds are your loop was doing enough work that the additional let-related overhead was washed out anyway. But now you don't even have to worry about it.)
Beware synthetic benchmarks as they are extremely easy to get wrong, and trigger JavaScript engine optimizers in ways that real code doesn't (both good and bad ways). However, if you want a synthetic benchmark, here's one:
const now = typeof performance === "object" && performance.now
? performance.now.bind(performance)
: Date.now.bind(Date);
const btn = document.getElementById("btn");
btn.addEventListener("click", function() {
btn.disabled = true;
runTest();
});
const maxTests = 100;
const loopLimit = 50000000;
const expectedX = 1249999975000000;
function runTest(index = 1, results = {usingVar: 0, usingLet: 0}) {
console.log(`Running Test #${index} of ${maxTests}`);
setTimeout(() => {
const varTime = usingVar();
const letTime = usingLet();
results.usingVar += varTime;
results.usingLet += letTime;
console.log(`Test ${index}: var = ${varTime}ms, let = ${letTime}ms`);
++index;
if (index <= maxTests) {
setTimeout(() => runTest(index, results), 0);
} else {
console.log(`Average time with var: ${(results.usingVar / maxTests).toFixed(2)}ms`);
console.log(`Average time with let: ${(results.usingLet / maxTests).toFixed(2)}ms`);
btn.disabled = false;
}
}, 0);
}
function usingVar() {
const start = now();
let x = 0;
for (var i = 0; i < loopLimit; i++) {
x += i;
}
if (x !== expectedX) {
throw new Error("Error in test");
}
return now() - start;
}
function usingLet() {
const start = now();
let x = 0;
for (let i = 0; i < loopLimit; i++) {
x += i;
}
if (x !== expectedX) {
throw new Error("Error in test");
}
return now() - start;
}
<input id="btn" type="button" value="Start">
It says that there's no significant difference in that synthetic test on either V8/Chrome or SpiderMonkey/Firefox. (Repeated tests in both browsers have one winning, or the other winning, and in both cases within a margin of error.) But again, it's a synthetic benchmark, not your code. Worry about the performance of your code when and if your code has a performance problem.
As a style matter, I prefer let for the scoping benefit and the closure-in-loops benefit if I use the loop variable in a closure.
Details
The important difference between var and let in a for loop is that a different i is created for each iteration; it addresses the classic "closures in loop" problem:
function usingVar() {
for (var i = 0; i < 3; ++i) {
setTimeout(function() {
console.log("var's i: " + i);
}, 0);
}
}
function usingLet() {
for (let i = 0; i < 3; ++i) {
setTimeout(function() {
console.log("let's i: " + i);
}, 0);
}
}
usingVar();
setTimeout(usingLet, 20);
Creating the new EnvironmentRecord for each loop body (spec link) is work, and work takes time, which is why in theory the let version is slower than the var version.
But the difference only matters if you create a function (closure) within the loop that uses i, as I did in that runnable snippet example above. Otherwise, the distinction can't be observed and can be optimized away.
Here in 2018, it looks like V8 (and SpiderMonkey in Firefox) is doing sufficient introspection that there's no performance cost in a loop that doesn't make use of let's variable-per-iteration semantics. See this test.
In some cases, const may well provide an opportunity for optimization that var wouldn't, especially for global variables.
The problem with a global variable is that it's, well, global; any code anywhere could access it. So if you declare a variable with var that you never intend to change (and never do change in your code), the engine can't assume it's never going to change as the result of code loaded later or similar.
With const, though, you're explicitly telling the engine that the value cannot change¹. So it's free to do any optimization it wants, including emitting a literal instead of a variable reference to code using it, knowing that the values cannot be changed.
¹ Remember that with objects, the value is a reference to the object, not the object itself. So with const o = {}, you could change the state of the object (o.answer = 42), but you can't make o point to a new object (because that would require changing the object reference it contains).
When using let or const in other var-like situations, they're not likely to have different performance. This function should have exactly the same performance whether you use var or let, for instance:
function foo() {
var i = 0;
while (Math.random() < 0.5) {
++i;
}
return i;
}
It's all, of course, unlikely to matter and something to worry about only if and when there's a real problem to solve.
"LET" IS BETTER IN LOOP DECLARATIONS
With a simple test (5 times) in navigator like that:
// WITH VAR
console.time("var-time")
for(var i = 0; i < 500000; i++){}
console.timeEnd("var-time")
The mean time to execute is more than 2.5ms
// WITH LET
console.time("let-time")
for(let i = 0; i < 500000; i++){}
console.timeEnd("let-time")
The mean time to execute is more than 1.5ms
I found that loop time with let is better.
T.J. Crowder's answer is so excellent.
Here is an addition of: "When would I get the most bang for my buck on editing existing var declarations to const ?"
I've found that the most performance boost had to do with "exported" functions.
So if file A, B, R, and Z are calling on a "utility" function in file U that is commonly used through your app, then switching that utility function over to "const" and the parent file reference to a const can eak out some improved performance. It seemed for me that it wasn't measurably faster, but the overall memory consumption was reduced by about 1-3% for my grossly monolithic Frankenstein-ed app. Which if you're spending bags of cash on the cloud or your baremetal server, could be a good reason to spend 30 minutes to comb through and update some of those var declarations to const.
I realize that if you read into how const, var, and let work under the covers you probably already concluded the above... but in case you "glanced" over it :D.
From what I remember of the benchmarking on node v8.12.0 when I was making the update, my app went from idle consumption of ~240MB RAM to ~233MB RAM.
T.J. Crowder's answer is very good but :
'let' is made to make code more readable, not more powerful
by theory let will be slower than var
by practice the compiler can not solve completely (static analysis) an uncompleted program so sometime it will miss the optimization
in any-case using 'let' will require more CPU for introspection, the bench must be started when google v8 starts to parse
if introspection fails 'let' will push hard on the V8 garbage collector, it will require more iteration to free/reuse. it will also consume more RAM. the bench must take these points into account
Google Closure will transform let in var...
The effect of the performance gape between var and let can be seen in real-life complete program and not on a single basic loop.
Anyway, to use let where you don't have to, makes your code less readable.
Just did some more tests, Initially I concluded that there is a substantial difference in favor of var. My results initially showed that between Const / Let / Var there was a ratio from 4 / 4 / 1 to 3 / 3 / 1 in execution time.
After Edit in 29/01/2022 (according to jmrk's remark to remove global variables in let and const tests) now results seem similar 1 / 1 / 1.
I give the code used below. Just let me mention that I started from the code of AMN and did lots of tweaking, and editing.
I did the tests both in w3schools_tryit editor and in Google_scripts
My Notes:
In GoogleScripts there seems that the 1st test ALWAYS takes longer, no-matter which one, especially for reps<5.000.000 and before separating them in individual functions
For Reps < 5.000.000 JS engine optimizations are all that matters, results go up and down without safe conclusions
GoogleScripts constantly does ~1.5x time longer, I think it is expected
There was a BIG difference when all tests where separated in individual functions, execution speed was at-least doubled and 1st test's delay almost vanished!
Please don't judge the code, I did try but don't pretend to be any expert in JS.
I would be delighted to see your tests and opinions.
function mytests(){
var start = 0;
var tm1=" Const: ", tm2=" Let: ", tm3=" Var: ";
start = Date.now();
tstLet();
tm2 += Date.now() - start;
start = Date.now();
tstVar();
tm3 += Date.now() - start;
start = Date.now();
tstConst();
tm1 += (Date.now() - start);
var result = "TIMERS:" + tm1 + tm2 + tm3;
console.log(result);
return result;
}
// with VAR
function tstVar(){
var lmtUp = 50000000;
var i=0;
var item = 2;
var sum = 0;
for(i = 0; i < lmtUp; i++){sum += item;}
item = sum / 1000;
}
// with LET
function tstLet(){
let lmtUp = 50000000;
let j=0;
let item = 2;
let sum=0;
for( j = 0; j < lmtUp; j++){sum += item;}
item = sum/1000;
}
// with CONST
function tstConst(){
const lmtUp = 50000000;
var k=0;
const item = 2;
var sum=0;
for( k = 0; k < lmtUp; k++){sum += item;}
k = sum / 1000;
}
code with 'let' will be more optimized than 'var' as variables declared with var do not get cleared when the scope expires but variables declared with let does. so var uses more space as it makes different versions when used in a loop.
I have a fairly long running (3 to 10 second) function that loads data in the background for a fairly infrequently used part of the page. The question I have is what is the optimal running time per execution and delay time between to ensure that the rest of the page stays fairly interactive, but the loading of the data is not overly delayed by breaking it up?
For example:
var i = 0;
var chunkSize = 10;
var timeout = 1;
var data; //some array
var bigFunction = function() {
var nextStop = chunkSize + i; //find next stop
if (nextStop > data.length) { nextStop = data.length }
for (; i < nextStop; i++) {
doSomethingWithData(data[i]);
}
if (i == data.length) { setTimeout(finalizingFunction, timeout); }
else { setTimeout(bigFunction , timeoutLengthInMs); }
};
bigFunction(); //start it all off
Now, I've done some testing, and a chunkSize that yields about a 100ms execution time and a 1ms timeout seem to yield a pretty responsive UI, but some examples I've seen recommend much larger chunks (~300ms) and longer timeouts (20 to 100 ms). Am I missing some dangers in cutting my function into too many small chunks, or is trial and error a safe way to determine these numbers?
Any timeout value less than roughly 15ms is equivalent - the browser will update the DOM, etc and then execute the timeout. See this and this for more info. I tend to use setTimeout(fn, 0).
I would check the time elapsed instead of guessing numbers, because as Jason pointed out, there will be speed differences between clients:
var data; // some array
var i = 0;
var MAX_ITERS = 20; // in case the system time gets set backwards, etc
var CHECK_TIME_EVERY_N_ITERS = 3; // so that we don't check elapsed time excessively
var TIMEOUT_EVERY_X_MS = 300;
var bigFunction = function () {
var maxI = i + MAX_ITERS;
if (maxI > data.length) { maxI = data.length }
var nextTimeI;
var startTime = (new Date()).getTime(); // ms since 1970
var msElapsed;
while (i < maxI) {
nextTimeI = i + CHECK_TIME_EVERY_N_ITERS;
if (nextTimeI > data.length) { nextTimeI = data.length }
for (; i < nextTimeI; i++) {
doSomethingWithData(data[i]);
}
msElapsed = (new Date()).getTime() - startTime;
if (msElapsed > TIMEOUT_EVERY_X_MS) {
break;
}
}
if (i == data.length) {
setTimeout(finalizingFunction, 0);
} else {
setTimeout(bigFunction , 0);
}
};
bigFunction(); //start it all off
The 1ms timeout is not actually 1 ms. By the time the thread yields, to achieve the 1ms, it probably yielded much more - because the typical thread time-slice is 30ms. Any number of other threads may have executed during the 1ms timeout, which could mean that 200ms went by before you received control again.
Why not just have the 'doSomethingWithData' execute in an entirely different thread?