NodeJS for loop optimization

NodeJS for loop optimization - javascript

I know that in browser it is more optimal to write a for loop along the lines of
for(var i=0, l=arr.length; i<l; i++){ }
instead of
for(var i=0; i<arr.length; i++){ }
But is this true in NodeJS or does the V8 engine optimize it?
I know in ecma-262 5.1 sec-15.4 array length is defined as such:
The value of the length property is numerically greater than the name of every property whose name is an array index; whenever a property of an Array object is created or changed, other properties are adjusted as necessary to maintain this invariant.
Thus if the length doesn't change the only reason this method would be slower is because you have to access the property. What I'm looking for is a reasonable example/ explanation that would show whether or not the V8 engine (which is used in NodeJS) suffers in performance when accessing this property.

If arr is a pure local variable and the loop doesn't touch it in any way, then yes. However even if the optimization fails, loading the same field over and over again doesn't really cost anything due to CPU cache.

I would always use the first if applicable because it informs both the interpreter and any future reader of the code that the loop does not modify the length of the array.
Even if it's not faster (although http://jsperf.com/array-length-vs-cached suggests it actually is) it's simply good practise to factor constant expressions from outside of a loop.

The problem is length's calculation. In this example for(var i=0; i<arr.length; i++){ } statement arr.length will be calculated at every loop iteration. But with for(var i=0, l=arr.length; i<l; i++){ } value will be taken without any calculations. Simply getting a value is faster than calculating length of the array.
Getting length can't be optimized by any compiler because it could be changed. So it's calculated each iteration.

Related

Javascript For-Loop Condition Caching

I've put together a jsperf test that compares for loops that iterate over an array with caching the array length condition vs not caching. I thought that caching the variable before to avoid recalculating the array length each iteration would be faster, but the jsperf test says otherwise. Can someone explain why this is? I also thought that including the array length variable definition (when cached) within the for loop's initialization would reduce the time since the parse doesn't need to look up the "var" keyword twice, but that also doesn't appear to be the case.
example without caching:
for(var i = 0; i < testArray.length; i++){
//
}
example with caching
var len = testArray.length;
for(var i = 0; i < len; i++){
//
}
example with caching variable defined in for loop initialization
for(var i = 0, len=testArray.length; i < len; i++){
//
}
http://jsperf.com/for-loop-condition-caching

Can someone explain why this is?
This optimization and case is extremely common so modern JavaScript engines will automatically perform this optimization for you.
Some notes:
This is not the case when iterating a NodeList (such as the result of querySelectorAll
This is an overkill micro optimization for most code paths anyway, usually the body of the loop takes more time than this comparison anyway.

The performance of the scenarios that you posted depends on how smart the JS engine's optimizer is. An engine with a very dump optimizer (or no optimizer at all) will likely be faster when you are using the variable, but you can't even rely on that. After all, length's type is well-known, while a variable can be anything and may require additional checks.
Given a perfect optimizer, all three examples should have the same performance. And as your examples are pretty simple, modern engines should reach that point soon.

Is the following considered micro optimization

I am not very well parsed in javascript, but do you call the following micro optimization?
for(var j=0;j < document.getElementsByTagName('a').length; j++)
{
//...
}
var Elements = document.getElementsByTagName('a');
var ElementsLength = Elements.length;
for(var j=0;j < ElementsLength ; j++)
{
//...
}
var Elements = document.getElementsByTagName('a');
for(var j=0;j < Elements.length; j++)
{
//...
}
does document.getElementByTagName really get called in every loop
cycle in the first case?
do browsers try to optimize the javascript we write?
is there any difference between the second and third case considering
that the collection will never change?

does document.getElementByTagName really get called in every loop cycle in the first case?
Yes.
do browsers try to optimize the javascript we write?
It won't change it functionally. There's a difference between calling a function once and calling it every time in a loop; perhaps you meant to call the function every time. "Optimising" that away is actually changing what the program does.
is there any difference between the second and third case considering that the collection will never change?
Not functionally, but performance wise yes. Accessing the length attribute is a little more overhead than reading the number from a simple variable. Probably not so much you'd really notice, but it's there. It's also a case that cannot be optimised away, since Elements.length may change on each iteration, which would make the program behave differently. A good optimiser may be able to detect whether the attribute is ever changing and optimise in case it's certain it won't; but I don't know how many implementations really do that, because this can become quite complex.

for..in statement not supported for XML DOM childNodes in IE9?

We're developing a javascript-based GIS application and everything was going fine until someone decided to fire up IE9 and gave it a spin over there too. As you have probably already guessed, the app broke down.
Turned out that for some inexplicable reason for..in loop is unable to iterate over childNodes:
var xmldoc = new ActiveXObject("Msxml2.DOMDocument.6.0");
xmldoc.loadXML("<i18n><item>one</item><item>two</item></i18n>");
var currentItem = xmldoc.getElementsByTagName("i18n")[0].firstChild;
while (currentItem) {
if (currentItem.nodeType == 1) {
for (var i in currentItem.childNodes) {
console.log(currentItem.childNodes[i]);
}
}
currentItem = currentItem.nextSibling;
}
However, when the internal for loop in above code is replaced with
for (var j = 0; j < currentItem.childNodes.length; j++) {
console.log(currentItem.childNodes[j]);
}
everything works as expected - it walks the child nodes without any issues.
Although I discovered a workaround, the issue still annoys me as it is not clear why it happens. MSDN documentation both for XMLDocument and for..in mentions nothing.
Is this a bug or another case of poorly-documented oddness that IE is famous of?

Well, it's not a bug since DOM spec doesn't define that there should be numeric properties, just the item method which [] desugars to.
Simply don't use for..in.

Short answer: Don't use for..in to iterate arrays. Arrays should always be iterated with a for() loop. Only non-array objects should use for...in.
If you must use a for..in here, you should filter the loop using .hasOwnProperty() to prevent unwanted iterations on properties that belong to the array prototype rather than the array itself.
This point applies to all for..in loops: to ensure your code is robust, you should get into the habit of always filtering them with .hasOwnProperty(), even on plain objects, but especially on arrays, since we already know they contain additional properties.
See here for more on this point.

Why is for-in slow in javascript?

I have read in several places that for-in loops are slower than looping over an array... Though I understand that moving forward in sizeof (type) blocks is practically effortless compared to whatever happens behind to the scenes to iterate over an object's keys, I am still curious, what the exact reason is that it's so slow...
Is it having to do a reverse hash function to get the key, and that process is what is slow?

The real answer to this in the case of any particular engine will likely depend on that engine's implementation. (As will the size of the difference, if any.)
However, there are invariants. For instance, consider:
var obj = {a: "alpha", b: "beta"};
var name;
for (name in obj) {
console.log(obj[name]);
}
var arr = ["alpha", "beta"];
var index;
for (index = 0; index < arr.length; ++index) {
console.log(arr[index]);
}
In the case of obj, the engine has to use a mechanism to keep track of which properties you've already iterated over and which ones you haven't, as well as filtering out the non-enumerable properties. E.g., there's some kind of iterator object behind the scenes (and the way the spec is defined, that may well be a temporary array).
In the case of arr, it doesn't; you're handling that in your code, in a very simple, efficient way.
The content of the block of each loop is the same: A property lookup on an object. (In the latter case, in theory, there's a number-to-string conversion as well.)
So I expect the only non-implementation-specific answer to this is: Additional overhead.

for..each loops use iterators and generators.
Iterator is an object that has a next() method. Generator is a factory function that contains yield() expressions. Both constructs are more complicated than an integer index variable.
In a typical for(var i = 0; i < arr.length; i++) loop, the two commands that execute in almost all iterations are i++ and i < arr. This is arguably much faster than making a function call (next() or yield()).
Moreover, the loop initiation (var i = 0) is also faster than creating the iterator object with next() method or calling the generator to create the iterator. However, it highly depends on the implementation and the creators of Javascript engines do their best to accelerate such commonly used language features.
I would say the difference is so marginal that I might want to spend my time optimizing other parts of the code. The choice of syntax should consider code readability and maintainability more than performance, when the performance gain is so small for adding complexity. Having said that, use the syntax that makes more sense to you and other developers who maintain your code after you get rich and famous! ;)

How expensive are JS function calls (compared to allocating memory for a variable)?

Given some JS code like that one here:
for (var i = 0; i < document.getElementsByName('scale_select').length; i++) {
document.getElementsByName('scale_select')[i].onclick = vSetScale;
}
Would the code be faster if we put the result of getElementsByName into a variable before the loop and then use the variable after that?
I am not sure how large the effect is in real life, with the result from getElementsByName typically having < 10 items. I'd like to understand the underlying mechanics anyway.
Also, if there's anything else noteworthy about the two options, please tell me.

Definitely. The memory required to store that would only be a pointer to a DOM object and that's significantly less painful than doing a DOM search each time you need to use something!
Idealish code:
var scale_select = document.getElementsByName('scale_select');
for (var i = 0; i < scale_select.length; i++)
scale_select[i].onclick = vSetScale;

Caching the property lookup might help some, but caching the length of the array before starting the loop has proven to be faster.
So declaring a variable in the loop that holds the value of the scale_select.length would speed up the entire loop some.
var scale_select = document.getElementsByName('scale_select');
for (var i = 0, al=scale_select.length; i < al; i++)
scale_select[i].onclick = vSetScale;

A smart implementation of DOM would do its own caching, invalidating the cache when something changes. But not all DOMs today can be counted on to be this smart (cough IE cough) so it's best if you do this yourself.

In principle, would the code be faster if we put the result of getElementsByName into a variable before the loop and then use the variable after that?
yes.

Use variables. They're not very expensive in JavaScript and function calls are definitely slower. If you loop at least 5 times over document.getElementById() use a variable. The idea here is not only the function call is slow but this specific function is very slow as it tries to locate the element with the given id in the DOM.

There's no point storing the scaleSelect.length in a separate variable; it's actually already in one - scaleSelect.length is just an attribute of the scaleSelect array, and as such it's as quick to access as any other static variable.

I think so. Everytime it loops, the engine needs to re-evaluate the document.getElementsByName statement.
On the other hand, if the the value is saved in a variable, than it allready has the value.

# Oli
Caching the length property of the elements fetched in a variable is also a good idea:
var scaleSelect = document.getElementsByName('scale_select');
var scaleSelectLength = scaleSelect.length;
for (var i = 0; i < scaleSelectLength; i += 1)
{
// scaleSelect[i]
}

We Keep Coding

JavaScript is the programming language of the Web.

NodeJS for loop optimization - javascript

If arr is a pure local variable and the loop doesn't touch it in any way, then yes. However even if the optimization fails, loading the same field over and over again doesn't really cost anything due to CPU cache.

Related

Javascript For-Loop Condition Caching

Is the following considered micro optimization

for..in statement not supported for XML DOM childNodes in IE9?

Why is for-in slow in javascript?

How expensive are JS function calls (compared to allocating memory for a variable)?

Categories

Resources