I am curious about the performance difference between initializing a function outside a loop vs inline:
Outside the loop:
const reducer = (acc, val) => {
// work
};
largeArray.reduce(reducer);
Inline:
largeArray.reduce((acc, val) => {
// work
});
I encounter this kind of situation regularly, and, unless I'm going to reuse the function, it seems useful to avoid introducing another variable into my scope by using the inline version.
Is there a performance difference in these two examples or does the JS engine optimize them identically?
For example: is the inline function being created every time the loop runs and then garbage collected? And if so:
What kind of effect does this have on performance, and
Does the size of the function affect this? For example, a function that is 200 vs 30_000 unicode characters.
Are there any other differences or things I'm not considering?
Hopefully you understand my train of thought and can provide some insight about this. I realize that I can read all of the docs and source code for V8 or other engines, and I would get my answer, but that seems like an overwhelming task to understand this concept.
I ran tests on jsben:
SET1 (random used twice): http://jsben.ch/8Dukx
SET2 (used once): http://jsben.ch/SnvxV
Setup
const arr = [ ...Array(100).keys() ];
const reducer = (acc, cur) => (acc + cur);
Test 1
let sumInline = arr.reduce((acc, cur) => (acc + cur), 0);
let sumInlineHalf = arr.slice(0, 50).reduce((acc, cur) => (acc + cur), 0);
console.log(sumInline, sumInlineHalf);
Test 2
let sumOutline = arr.reduce(reducer, 0);
let sumOutlineHalf = arr.slice(0, 50).reduce(reducer, 0);
console.log(sumOutline, sumOutlineHalf);
Be surprised
What kind of effect does this have on performance, and
None.
Does the size of the function affect this? For example, a function that is 200 vs 30_000 unicode characters.
Functions aren't executed as "unicode characters". It doesn't matter how "long" the code is.
Are there any other differences or things I'm not considering?
A very important one: Code is written for humans, not computers. And why do you even ask me?
is the inline function being created every time the loop runs and then garbage collected?
That would be unnecessary and slow. So probably: no.
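A small sketch may make that last point concrete. A function expression produces a new function object each time the expression itself is evaluated. Passed as an argument to reduce, it is evaluated once per reduce call, not once per element; only when the expression sits inside a loop body is a new object created per iteration. The names below (perIterationFunctions, receivedCallbacks, tracingReduce) are made up for illustration, and whether an engine optimizes such allocations away is not something this code can show.

```javascript
// Case 1: the arrow function expression sits inside the loop body,
// so it is evaluated (and a new function object created) on every iteration.
const perIterationFunctions = [];
for (let i = 0; i < 3; i++) {
  perIterationFunctions.push((x) => x + i);
}
const loopCreatesDistinctObjects =
  perIterationFunctions[0] !== perIterationFunctions[1];

// Case 2: the arrow function is an argument to a call. It is evaluated once,
// before the call starts, and the same object is invoked for every element.
// tracingReduce is a hypothetical hand-written reduce that records what it calls.
const receivedCallbacks = new Set();
function tracingReduce(list, callback, initial) {
  let acc = initial;
  for (const item of list) {
    receivedCallbacks.add(callback);
    acc = callback(acc, item);
  }
  return acc;
}

const total = tracingReduce([1, 2, 3, 4], (acc, val) => acc + val, 0);

console.log(loopCreatesDistinctObjects); // true
console.log(receivedCallbacks.size);     // 1
console.log(total);                      // 10
```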
Related
I want to write this more concisely, ideally in one line. How can I do that?
this.xxx = smt.filter(item => item.Id === this.smtStatus.ONE);
this.yyy = smt.filter(item => item.Id === this.smtStatus.TWO);
this.zzz = smt.filter(item => item.Id === this.smtStatus.THREE);
Which array methods should I use?
Thanks for your help!
Well, you could shorten it using some destructuring, assuming this is the current scope:
const [xxx, yyy, zzz] = ['ONE', 'TWO', 'THREE'].map(x => smt.filter(item => item.Id === this.smtStatus[x]));
That said, I wouldn't do this, as it's harder to read and maintain.
Steve's answer also gave me another idea: you can use a getter function in your filter function to make your code more compact:
// Arrow function, so `this` is still the enclosing object that holds smtStatus
const filterItemListById = (list, getter) =>
  list.filter(item => item.Id === getter(this.smtStatus));
this.xxx = filterItemListById(smt, s => s.ONE);
this.yyy = filterItemListById(smt, s => s.TWO);
this.zzz = filterItemListById(smt, s => s.THREE);
Just because something is short, doesn't mean it is good/better. Code that "reads" should be the priority both for yourself and your teammates.
If you just don't like the look of this, you could always toss the filtering into its own function. Might look like this:
function filterItemListById(list, value) {
if (list == null) return null;
return list.filter(item => item.Id === value);
}
this.xxx = filterItemListById(smt, this.smtStatus.ONE);
// etc...
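A self-contained sketch of the helper approach, for trying it outside a class. The sample smt items and the smtStatus values are invented for illustration; only the Id comparison comes from the question.

```javascript
// Hypothetical sample data; the real smt and smtStatus come from the asker's code.
const smtStatus = { ONE: 1, TWO: 2, THREE: 3 };
const smt = [
  { Id: 1, name: "a" },
  { Id: 2, name: "b" },
  { Id: 1, name: "c" },
  { Id: 3, name: "d" },
];

// Returns the items whose Id equals value, or null when there is no list.
function filterItemListById(list, value) {
  if (list == null) return null;
  return list.filter((item) => item.Id === value);
}

const xxx = filterItemListById(smt, smtStatus.ONE);
const yyy = filterItemListById(smt, smtStatus.TWO);
const zzz = filterItemListById(smt, smtStatus.THREE);

console.log(xxx.map((item) => item.name)); // [ 'a', 'c' ]
console.log(yyy.length, zzz.length);       // 1 1
```

Three plain calls with a descriptive helper name stay readable, which is the point being made above.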
Now, that is assuming you did NOT mean to combine all three lines into one. If that is what you meant, well...
Ok, so you can't assign to multiple variables (this.xxx, this.yyy, this.zzz) like that, usually. I know some people declare multiple variables in one line like:
var myInt1 = 0, myInt2 = 1, myInt3 = 2; // and so on
Declaring multiple primitives like this is fine, but I would never do this with any complicated logic, too messy looking.
To shorten, you have two options:
Put your logic into one or more functions with descriptive names, nothing wrong with this.
Put your variables into a list and loop over those variables while filtering. This is an awful approach, I like what you already have just fine, nothing wrong with it.
In short, don't worry about writing cute one-liners, you will only confuse yourself and your team later on. Focus on writing readable code so anyone can understand what is going on just by reading it line by line.
I am new to JS and was learning functional programming when I came across the term "referential transparency". I also found this statement: "Referential transparency says it's safe to replace a pure function with its value". Does that mean RT makes it easy for a JIT compiler to replace a function call with its return value once the function gets hot? Is that true?
Here's an example:
This is a pure function: it will always return the same output for the same input
const even = x => x % 2 === 0;
And let's create isTenEven(), which will check whether 10 is an even number or not:
const isTenEven = () => even(10);
Since we're guaranteed that even(10) === true will always be true then we can indeed replace a function call with a value:
const isTenEven = () => true;
And your program would still work.™
However you wouldn't be able to do so if even wasn't pure!
Here's a silly example: once per month 10 won't be an even number anymore:
const even = x => (new Date()).getDate() === 15 ? false : x % 2 === 0;
Perhaps your program is expecting isTenEven() to return either true or false, so forcing it to always assume that it will return true could lead to unexpected consequences.
Of course in this particular case I'm not sure what those consequences would be but you never know... which is exactly the point.
Yes, that is exactly an advantage of RT. The compiler can not only inline a function but also replace its invocations with the corresponding return value; that is, it can eliminate common sub-expressions and rewrite code according to specific rules, much as you can rewrite formulas in math. This way of reasoning about a program is called equational reasoning and is also very helpful for the programmer.
But RT allows other optimization techniques as well, like lazy evaluation. If you want to automatically delay the evaluation of an arbitrary expression up to the point where its result is actually needed, you need the guarantee that this expression yields the same result no matter when you actually evaluate it. RT gives this guarantee.
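The lazy-evaluation point can be sketched by hand. The lazy helper below is hypothetical, not a language feature or something a JS engine does for you: it defers a computation until first use and caches the result. That caching is only safe because the wrapped expression is referentially transparent.

```javascript
const even = (x) => x % 2 === 0;

// Deliberate side effect, added only so we can observe when evaluation happens.
let evaluations = 0;

// Hypothetical helper: wraps a zero-argument function and evaluates it at most once.
function lazy(compute) {
  let done = false;
  let value;
  return () => {
    if (!done) {
      value = compute();
      done = true;
    }
    return value;
  };
}

const isTenEven = lazy(() => {
  evaluations += 1; // counts how often the underlying expression actually runs
  return even(10);
});

const evaluationsBeforeUse = evaluations; // nothing has run yet
const first = isTenEven();
const second = isTenEven();

console.log(evaluationsBeforeUse); // 0
console.log(first, second);        // true true
console.log(evaluations);          // 1
```

With the date-dependent version of even, the cached value could be wrong on a later call, which is why the guarantee matters.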
So here is some code I am trying to work with:
const someFunc = (a) => (b) => a + b;
const someArray = [1, 2];
const firstOrder = someArray.map(a => someFunc(a));
firstOrder[0] === firstOrder[1]; // returns false
I am not sure why this is a function with a different memory location.
I was expecting to accomplish a similar functionality wherein
firstOrder[0] === firstOrder[1]; // should return true
I am not sure if something like this is even possible.
The primary motivation here is to avoid memory footprint.
I guess I could use some help here.
Thanks in advance.
As said in the comment, functions with different scopes are never === to each other.
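A quick way to see this, plus one hedged way to get identical references when the argument is identical: cache the inner function per argument. memoizedSomeFunc and its Map cache are illustrative additions, not part of the question. Functions created for different values of a remain distinct either way, since each closes over a different value, and the cache itself holds memory.

```javascript
const someFunc = (a) => (b) => a + b;

// Every call creates a fresh closure, even for the same argument.
const freshAreDistinct = someFunc(1) !== someFunc(1);

// Hypothetical memoized variant: one inner function per distinct value of a.
const cache = new Map();
const memoizedSomeFunc = (a) => {
  if (!cache.has(a)) {
    cache.set(a, someFunc(a));
  }
  return cache.get(a);
};

const sameArgSameFunction = memoizedSomeFunc(1) === memoizedSomeFunc(1);
const differentArgsDiffer = memoizedSomeFunc(1) !== memoizedSomeFunc(2);

console.log(freshAreDistinct);       // true
console.log(sameArgSameFunction);    // true
console.log(differentArgsDiffer);    // true
console.log(memoizedSomeFunc(1)(5)); // 6
```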
The memory overhead of a simple function is next to nothing, especially on modern hardware and modern JS engines, so before spending effort on this, make sure this is not a case of premature optimization - run a performance test, and make sure this is actually a bottleneck first.
You're currently passing around an array of functions, presumably so they can be iterated through and called by something later. Consider passing around just the someArray and a someFunc that takes 2 arguments and returns a number instead; an array of primitives takes less memory than an array of functions. For example, the following code takes up ~1,400M memory on Chrome for me:
const someFunc = (a) => (b) => a + b;
const arrayOfFunctions = Array.from({ length: 1e7 }, (_, i) => someFunc(i));
// eventually use arrayOfFunctions
But if you just store your someArray, and call the function only when you need access to the final number it returns, the memory footprint is much lighter:
const someFunc = (a, b) => a + b;
const someArray = Array.from({ length: 1e7 }, (_, i) => i);
// eventually, once you need access to the final numbers, iterate through someArray and call someFunc with it:
// ...
const theBArgument = 5;
const result = someArray.map(a => someFunc(a, theBArgument));
Before result is computed, this uses only ~120M memory on Chrome, for me.
I know that Ramda.js provides a reduce function, but I am trying to learn how to use ramda and I thought a reducer would be a good example. Given the following code, what would be a more efficient and functional approach?
(function(){
// Some operators. Sum and multiplication.
const sum = (a, b) => a + b;
const mult = (a, b) => a * b;
// The reduce function
const reduce = R.curry((fn, accum, list) => {
const op = R.curry(fn);
while(list.length > 0){
accum = R.pipe(R.head, op(accum))(list);
list = R.drop(1, list);
}
return accum;
});
const reduceBySum = reduce(sum, 0);
const reduceByMult = reduce(mult, 1);
const data = [1, 2, 3, 4];
const result1 = reduceBySum(data);
const result2 = reduceByMult(data);
console.log(result1); // 1 + 2 + 3 + 4 => 10
console.log(result2); // 1 * 2 * 3 * 4 => 24
})();
Run this on the REPL: http://ramdajs.com/repl/
I'm assuming this is a learning exercise and not for real-world application. Correct?
There are certainly some efficiencies you could gain over that code. At the core of Ramda's implementation, when all the dispatching, transducing, etc. are stripped away, is something like:
const reduce = curry(function _reduce(fn, acc, list) {
var idx = 0;
while (idx < list.length) {
acc = fn(acc, list[idx]);
idx += 1;
}
return acc;
});
I haven't tested, but this probably gains on your version because it only uses the number of function calls strictly needed: one for each member of the list, and it does that with bare-bones iteration. Your version adds the call to curry, and then, on each iteration, calls to pipe and head, to that curried op function, to the result of the pipe call, and to drop. So this one should be faster.
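For experimenting without Ramda loaded, the loop above can be run with a stand-in for curry. The curry3 helper here is a deliberately minimal assumption that only collects arguments until it has three; it is not Ramda's curry.

```javascript
// Minimal stand-in for curry, for three-argument functions only (assumption).
const curry3 = (fn) =>
  function curried(...args) {
    return args.length >= 3
      ? fn(...args)
      : (...more) => curried(...args, ...more);
  };

const reduce = curry3(function _reduce(fn, acc, list) {
  let idx = 0;
  while (idx < list.length) {
    acc = fn(acc, list[idx]); // exactly one callback call per element
    idx += 1;
  }
  return acc;
});

const sum = (a, b) => a + b;
const mult = (a, b) => a * b;

const reduceBySum = reduce(sum, 0);
const reduceByMult = reduce(mult, 1);

console.log(reduceBySum([1, 2, 3, 4]));  // 10
console.log(reduceByMult([1, 2, 3, 4])); // 24
```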
On the other hand, this code is as imperative as it gets. If you want to go with something more purely functional, you would need to use a recursive solution. Here's one version:
const reduce = curry(function _reduce(fn, acc, list) {
return (list.length) ? _reduce(fn, fn(acc, head(list)), tail(list)) : acc;
});
This sacrifices all the performance of the above to the calls to tail. But it's clearly more of a straightforward functional implementation. In many modern JS engines, however, this will fail to even work on larger lists due to the stack depth.
Because it is tail-recursive, it would be able to take advantage of tail-call optimization specified by ES2015 but so far little implemented. Until then, it's mostly of academic interest. And even when that is available, because of the head and -- especially -- tail call in there, it's going to be much slower than the imperative implementation above.
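A plain-JavaScript sketch of the recursive form, using array destructuring in place of head and tail (an assumption made so it runs without Ramda). Each step copies the remainder of the list, which is the cost attributed to tail above, and the recursion depth grows with the list length, so this is only meant for small inputs.

```javascript
// Recursive reduce: no loop and no reassignment,
// but one array copy and one stack frame per element.
function reduceRecursive(fn, acc, list) {
  if (list.length === 0) return acc;
  const [head, ...tail] = list; // copies the rest of the list on every call
  return reduceRecursive(fn, fn(acc, head), tail);
}

const add = (a, b) => a + b;
const multiply = (a, b) => a * b;

console.log(reduceRecursive(add, 0, [1, 2, 3, 4]));      // 10
console.log(reduceRecursive(multiply, 1, [1, 2, 3, 4])); // 24
```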
You might be interested to know that Ramda was the second attempt at the API that's generated. Its original authors (disclaimer: I'm one of them) first built Eweda on the lines of this latter version. That experiment failed for exactly these reasons. JavaScript simply cannot handle this sort of recursion... yet.
I'm working in a language that translates to JavaScript. In order to avoid some stack overflows, I'm applying tail call optimization by converting certain functions to for loops. What is surprising is that the conversion is not faster than the recursive version.
http://jsperf.com/sldjf-lajf-lkajf-lkfadsj-f/5
Recursive version:
(function recur(a0,s0){
return a0==0 ? s0 : recur(a0-1, a0+s0)
})(10000,0)
After tail call optimization:
ret3 = void 0;
a1 = 10000;
s2 = 0;
(function(){
while (!ret3) {
a1 == 0
? ret3 = s2
: (a1_tmp$ = a1 - 1 ,
s2_tmp$ = a1 + s2,
a1 = a1_tmp$,
s2 = s2_tmp$);
}
})();
ret3;
After some cleanup using Google Closure Compiler:
ret3 = 0;
a1 = 1E4;
for(s2 = 0; ret3 == 0;)
0 == a1
? ret3 = s2
: (a1_tmp$ = a1 - 1 ,
s2_tmp$ = a1 + s2,
a1 = a1_tmp$,
s2 = s2_tmp$);
c=ret3;
The recursive version is faster than the "optimized" ones! How can this be possible, if the recursive version has to handle thousands of context changes?
There's more to optimising than tail-call optimisation.
For instance, I notice you're using two temporary variables, when all you need is:
s2 += a1;
a1--;
This alone reduces the number of operations by roughly a third, resulting in a performance increase of about 50%.
In the long run, it's important to optimise what operations are being performed before trying to optimise the operations themselves.
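Put together, the suggestion above amounts to something like the following. The function names are made up for illustration; the recursive version is included only to check on a small input that both compute the same sum.

```javascript
// Recursive form from the question, wrapped in a named function.
function sumRecursive(a0, s0) {
  return a0 === 0 ? s0 : sumRecursive(a0 - 1, a0 + s0);
}

// Loop form without the temporaries. Updating s2 before a1 means the old
// value of a1 is still available when it is needed, so no copies are required.
function sumLoop(n) {
  let a1 = n;
  let s2 = 0;
  while (a1 !== 0) {
    s2 += a1;
    a1--;
  }
  return s2;
}

console.log(sumLoop(10000));       // 50005000
console.log(sumRecursive(100, 0)); // 5050
console.log(sumLoop(100));         // 5050
```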
EDIT: Here's an updated jsperf
As Kolink says, all your code does is add n to the total, decrement n by 1, and loop until n reaches 0.
So just do that:
let n = 10000, o = 0; while (n) o += n--;
It's faster and more readable than the recursive version, and of course it outputs the same result.
There are not as many context changes inside the recursive version as you might expect, since the named function recur is contained in the scope of recur itself, so they share the same scope. The reason has to do with the way JavaScript engines evaluate scope, and there are plenty of websites that explain this topic, so I will not do it here. On a second look you will notice that recur is also a so-called "pure" function, which basically means it never has to leave its own scope while the internal execution runs (simply put: until it returns a value). These two facts are what make it fast. I just want to mention here that the first example is the only tail-call-optimized one of all three: a tail-call optimization can only be done in recursive functions, and this is the only recursive one.
However, a second look at the second example (no pun intended) reveals that the "optimizer" made things worse for you, since it introduced scopes into the formerly pure function by splitting the operation into
variables instead of arguments
a while loop
an IIFE (immediately invoked function expression) that separates the introduced inner and outer variables
This leads to poorer performance, since now the engine has to handle 10000 context changes.
To tell you the truth I do not know why the third example is poorer in performance than the recursive one, so maybe it has to do with:
the browser you use (ever tried another one and compared the results?)
the number of variables
stack frames created by for-loops (never heard of that, though), which would have to do with the first example: the JS engines interpret a pure recursive function until they find a return statement. If the last thing following the statement is a function call, they evaluate any expressions and variables to pass as arguments, call the function and throw away the frame
something only the browser vendors can truly tell you :)
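If you want to compare engines yourself, a rough timing harness could look like the sketch below. The time helper and the repetition counts are assumptions chosen for illustration, and the recursion depth is kept at 1000 to stay well within typical stack limits. Micro-benchmarks like this are sensitive to warm-up and to whatever the engine decides to optimize, so treat the numbers as a hint, not as proof.

```javascript
// Use performance.now() where available, otherwise fall back to Date.now().
const now =
  typeof performance !== "undefined" ? () => performance.now() : () => Date.now();

function recur(a0, s0) {
  return a0 === 0 ? s0 : recur(a0 - 1, a0 + s0);
}

function loop(n) {
  let total = 0;
  while (n) total += n--;
  return total;
}

// Hypothetical helper: runs fn repeatedly and returns the elapsed
// milliseconds together with the last result.
function time(fn, repetitions) {
  let result;
  const start = now();
  for (let i = 0; i < repetitions; i++) {
    result = fn();
  }
  return { ms: now() - start, result };
}

const recursiveRun = time(() => recur(1000, 0), 1000);
const loopRun = time(() => loop(1000), 1000);

console.log("recursive:", recursiveRun.ms.toFixed(2), "ms, result", recursiveRun.result);
console.log("loop:", loopRun.ms.toFixed(2), "ms, result", loopRun.result);
```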