Why does Safari call function.apply recursively? - javascript

Consider the following:
var foo = []
for (var i=0; i<100000; i++) { foo.push(97); }
var bar = String.fromCharCode.apply(String,foo)
Most browsers run it fine, but Safari throws: RangeError: Maximum call stack size exceeded.
Based on this, it appears that Safari's implementation of Function.prototype.apply is recursive. Is this true?
The MDN page linked above mentions potential issues with the JS engine's argument length limit, but that's clearly not the case here.
EDIT: I still don't think it's an argument length issue. Via this page and my own testing, it looks like Safari can handle up to 524197 arguments, which the above code does not exceed.
Bonus question: We can rewrite the above code to avoid using apply by explicitly calling String.fromCharCode on each element of the array and joining the results together, but I suspect that would be slower (for the browsers that support the large-input apply). What's the best way to assemble a large string from an array of integer character codes?

In some browsers, apply has a limit on the number of arguments it accepts. WebKit has an observed limit of 2^16, so if you need to pass more than that you may want to follow a strategy of breaking the arguments up into chunks. If you read the details of the bug, it's an enforced limitation, as opposed to a problem arising from recursion (the bug in question also threw a similar RangeError).
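For reference, a minimal sketch of such a chunking strategy might look like the following (the chunk size here is an arbitrary "safe" value, not a measured per-browser limit):
// Build the string in chunks so no single apply call exceeds the engine's argument limit.
function fromCharCodes(codes) {
    var CHUNK = 32768; // arbitrary safe chunk size
    var parts = [];
    for (var i = 0; i < codes.length; i += CHUNK) {
        parts.push(String.fromCharCode.apply(String, codes.slice(i, i + CHUNK)));
    }
    return parts.join('');
}
var bar = fromCharCodes(foo); // same result as the single apply call, without the RangeError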
Anyway, I believe your hunch about string concatenation was correct - join isn't necessarily as good as other methods. Here's a test against string concatenation where I first break up the arguments (similar to the strategy in the MDN discussion of apply), and it edges out join. Directly adding strings together even edged out join, which surprised me a little (in Chrome, at least; I'd imagine they must have some smart GC that can reuse the existing string to great effect, but I can't say for sure).
Edit - interestingly, it looks like Chrome is the odd one out in terms of how slow join is - in every other browser it was much closer to concatenation in performance, or even better.

Related

Why "Map" manipulation is much slower than "Object" in JavaScript (v8) for integer keys?

I was happily using Map for indexed access everywhere in my JavaScript codebase, but I've just stumbled upon this benchmark: https://stackoverflow.com/a/54385459/365104
I've re-created it here as well: https://jsben.ch/HOU3g
What the benchmark does is basically fill a map with 1M elements, then iterate over them.
I'd expect the results for Map and Object to be on par, but they differ drastically - in favor of Object.
Is this expected behavior? Can it be explained? Is it because of the ordering requirement? Or because Map does some key hashing? Or just because Map allows any object as a key (I'd expect it to use the pointer address as the key in that case, which needs no hashing)? What is the difference between the Map and Object indexing algorithms?
This is quite unexpected and discouraging - basically I'll have to revert back to the old-school "object as map" coding style.
Update #1
As suggested in the comments, the Object might be optimized to an Array (since it is indexed by integers, starting from zero).
Changing the iteration order from size down to 0, Object is still 2x faster. When using strings as keys, Map performs 2x better.
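For context, the benchmark being discussed has roughly this shape (a hypothetical reconstruction, not the exact jsben.ch code):
// Fill a Map with 1M consecutive integer keys, then iterate.
const N = 1000000;
const m = new Map();
for (let i = 0; i < N; i++) m.set(i, i);
let sumMap = 0;
for (const [, v] of m) sumMap += v;

// Same with a plain object and integer keys.
const o = {};
for (let i = 0; i < N; i++) o[i] = i;
let sumObj = 0;
for (const k in o) sumObj += o[k];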
(V8 developer here.)
I'll have to revert back to the old-school "object as map" coding style.
If you do that, you will have fallen victim to a misleading microbenchmark.
In the very special case of using consecutive integers as keys, a plain Object will be faster, yes. Nothing beats a contiguous array in that scenario. So if the "indexed accesses everywhere in your codebase" that you mentioned are indeed using index sets like the integers from 0 to 1M, then using an Object or Array is a good idea. But that's a special case. If the index space is sparse, things will already look different.
In the general case of using arbitrary strings in random order, a Map will perform significantly better than an Object. Even more importantly, the way such object property accesses are handled (in V8, and quite possibly in other engines too) has non-local effects: if one function puts excessive stress on the slow path of the object property lookup handling system, then that will likely slow down some other functions relying on that same slow path for their property accesses.
The fundamental reason is that engines optimize different things for different usage patterns. An engine could implement Objects and Maps pretty much the same under the hood; but that wouldn't be the ideal behavior, because different usage patterns benefit from different internal representations and implementation choices. So engines allow you to provide them with a hint: if you use a Map, the engine will know that you're planning to use the thing as a map (duh!), where random keys will come and go. If you use an Object, then the engine will (at least at first) assume that you want the set of optimizations that work best for your average object, where the set of properties is fairly small and static. If you use an Array (or Object with only integer properties, which is nearly the same thing in JS), then you're making it easy for the engine to give you fast integer-indexed accesses.
Using "x" + i as key is a good suggestion to demonstrate how quickly a microbenchmark can be changed so it appears to produce opposite results. But here's a spoiler: if you do (only) this modification, then a large part of what you're measuring will be number-to-string conversion and string internalization, not Map/Object access performance itself.
Beware of microbenchmarks; they are misleading. You really have to analyze them quite deeply (by profiling, and/or by inspecting generated code, and/or by tracing other engine internals) to be sure that they're measuring what you think they're measuring, and hence are producing results that are telling you what you think they're telling you.
In general, it is strongly recommended to use representative test cases for performance measurements. Ideally, your app itself, or a realistic part of it extracted into a test case operating on realistic data. If you can't measure a difference between two implementation choices with a stress test for your entire production app, then it's not a difference worth worrying about. With a microbenchmark (i.e. a couple of artificially crafted lines), I can "prove" almost anything that doesn't apply to the general case.

Chrome performance: "standard" property names vs. non-standard

So this is an interesting one... While I was testing the performance of setAttribute vs. normal property set on an element, I found an odd behavior, which I then tested on regular objects and... It's still odd!
So if you have an object A = {},
and you set its property like A['abc_def'] = 1, or A.abc_def = 1, they are basically the same.
But then if you do A['abc-def'] = 1 or A['123-def'] = 1 then you are in trouble. It goes wayyy slower.
I set up a test here: http://jsfiddle.net/naPYL/1/. They all behave the same on all browsers except Chrome.
The funny thing is that for the "abc_def" property, Chrome is actually much faster than Firefox and IE, as I expected. But for "abc-def" it's at least twice as slow.
So what basically happens here (at least in my tests) is that when you use a "correct" property name (a legal identifier, which you can use with dot notation) it's fast, but when you use a name that forces bracket syntax (a[...]) you're in trouble.
I tried to imagine what implementation detail would distinguish between the two in such a way, and couldn't. Because as I think of it, if you do support those non-standard names, you are probably translating all names to the same mechanism, and the rest is just syntax which is compiled down to that mechanism. So the . syntax and [] should all be the same after compilation. But obviously something else is going on here...
Without looking at V8's source code, could anyone think of a really satisfying answer? (Think of it as an exercise :-))
Here's also a quick jsperf.com example
Thanks to NDM for the jsperf example!
Edit:
To clarify, of course I also want a concrete answer based on the real code (which I already found), or to be more precise, the reason behind that specific implementation. That is one of the reasons I asked you to look at it "as an exercise": to look behind the technical implementation and try to find the reason.
But I also wanted to see how other people's minds work in cases like these.
This may sound "vague" to some of you - but it is very useful to try and think
like other people from time to time, or take their point of view. It
enhances your own ways of thinking.
So JS objects can be used for two conflicting purposes. They can be used as objects, but they can be used as hash tables too. However, what is fast and makes sense for objects is not so for hash tables, so V8 tries to guess what a given object is.
Some signals the user can give that they want a dictionary are deleting a property, or giving a property a name that cannot be accessed using dot notation.
Some other heuristics are also used; I have made a gist: https://gist.github.com/petkaantonov/6327915.
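To illustrate the two signals mentioned above (a sketch of the heuristics, which may vary between V8 versions):
var a = {};
a.abc_def = 1;       // dot-accessible name: the object can stay in "fast" mode

var b = {};
b['abc-def'] = 1;    // name that cannot be used with dot notation:
                     // one signal that the object is being used as a dictionary

var c = { x: 1, y: 2 };
delete c.x;          // deleting a property is another such signal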
There is, however, a really cool hack that redeems an object from hash table hell:
function ensureFastProperties(obj) {
    // Making the object a constructor's prototype prompts V8 to convert it
    // out of dictionary (hash table) mode and back to fast properties.
    function f() {}
    f.prototype = obj;
    return obj;
}
See it in action: http://jsperf.com/property-dash-parformance/2.
The redeemed object is not as fast as the original because the properties are stored in the external properties array rather than in-object. But that's still far better than a hash table. Note that this is still a pretty broken benchmark; don't think for a second that hash tables are only 2x slower than in-object properties.

JavaScript: Why is native Array.prototype.map faster than for loop in Chrome console?

See an example here: http://jsperf.com/map-vs-for-basic
Yet in the Chrome console, I get the opposite results (map is sometimes 6-10 times faster than the for loop). I would have guessed it would be the other way around.
var input = [];
for (var i = 0; i < 10000; i++) input[i] = new Date(i);
var output = [];
function perform(value, index) {
    return value.toString() + index * index;
}
console.time(1); output = input.map(perform); console.timeEnd(1);
// 1: 45.000ms
console.time(1); for (var i = 0; i < input.length; i++) output[i] = perform(input[i], i); console.timeEnd(1);
// 1: 68.000ms
First of all, your test is not realistic: the function "perform" (and the update of the web page DOM) is much slower than the difference between a plain loop and using "map". It is like comparing a 100m sprint where at every step the runners need to take a coffee break and write a book.
You should do the test on a very fast function.
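For example, a sketch of a tighter comparison built around a trivial function (console.time granularity still limits precision, so run it several times):
// Compare map vs. a for loop with a cheap callback over plain numbers.
var input = [];
for (var i = 0; i < 100000; i++) input[i] = i;

function square(x) { return x * x; }

console.time('map');
var out1 = input.map(square);
console.timeEnd('map');

console.time('for');
var out2 = [];
for (var j = 0; j < input.length; j++) out2[j] = square(input[j]);
console.timeEnd('for');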
Why is there a difference between browsers?
Map may be implemented internally as:
A native/binary function with optimizations: in this case, using it is much faster. Chrome probably does that.
Just a loop, like the one you wrote: in this case, performance is similar, but the additional call to "map" and its internal checks may take a little more time.
Why a native implementation is faster
JavaScript is interpreted code, that is, an executable takes the source code and tries to perform the requested operations, which means parsing the code and executing the resulting tree (a lot of work). Native code is generally much faster and better optimized.
If map is implemented with native code, that allows optimizations and much faster code than a plain JS loop (supposing both implementations are correct and optimal).

In what Javascript engines does Function.prototype.toString not return the source code of that function?

EDIT: To be explicit, I am not looking for advice or opinions on the qualitative merit of the various issues implied by the functionality in question — neither am I looking for a reliable solution to a practical problem; I am simply looking for technical, verifiable answers to the question in the title. I have appended the question with a list of non-conforming browsers.
Using a function's .toString method will typically render the source code of that function. The problem is that this behaviour isn't specified: the spec refrains from making any commitment as to what the behaviour should be when applied to functions. Chrome's console will even tell you (when you pass anything other than a function to Function.toString.call) that Function.prototype.toString is not generic.
This blog post suggests this can be used as a method to produce a readable syntax for multi-line strings (by storing the string as a multi-line comment in the body of a no-op function). The author suggests this usage in the context of writing Node.js applications, with the caveat that this behaviour is only reliable because Node.js runs in a controlled environment. But on the open web, any engine can come along and interpret the code, and we shouldn't rely on unspecified behaviour.
In practice though, I've set up a fiddle which renders a select box whose contents are determined by a large multi-line string to test the code, and every browser on my workstation (Chrome 27, Firefox 21, Opera 12, Safari 5, Internet Explorer 8) executes as intended.
What current Javascript engines don't behave as follows?
Given that:
function uncomment(fn){
    return fn.toString().split(/\/\*\n|\n\*\//g).slice(1,-1).join();
}
The following:
uncomment(function(){/*
erg
arg
*/});
Should output:
erg
arg
List of non-conforming browsers:
Firefox 16
…
What current Javascript engines don't behave this way?
Your question isn't really well-defined, given that you haven't defined "popular". Is IE6 popular? IE5? IE4? Netscape Navigator? Lynx? The only way to properly answer your question is to enumerate which browsers you wish to support and check them. Unfortunately kangax's table http://kangax.github.io/es5-compat-table/# doesn't test Function.prototype.toString
the spec refrains from making any commitment as to what the behaviour should be when applied to functions
The required behavior is specified in ECMA-262 version 1 (from 1997, http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%201st%20edition,%20June%201997.pdf). You have to chase it down:
http://www.ecma-international.org/ecma-262/5.1/#sec-4.3.24 "function ... member of the Object type that is an instance of the standard built-in Function constructor and that may be invoked as a subroutine"
From that, we deduce that functions are objects.
http://www.ecma-international.org/ecma-262/5.1/#sec-9.8 "Let primValue be ToPrimitive(input argument, hint String)."
So now what is ToPrimitive?
http://www.ecma-international.org/ecma-262/5.1/#sec-9.1 " The default value of an object is retrieved by calling the [[DefaultValue]] internal method of the object, passing the optional hint PreferredType. "
So we need to know what DefaultValue does
http://www.ecma-international.org/ecma-262/5.1/#sec-8.12.8 (lots of words that basically say if the thing has a toString method, then call it)
Now we just need to find where Function.prototype.toString is described:
http://www.ecma-international.org/ecma-262/5.1/#sec-15.3.4.2 "An implementation-dependent representation of the function is returned. This representation has the syntax of a FunctionDeclaration. Note in particular that the use and placement of white space, line terminators, and semicolons within the representation String is implementation-dependent."
So you are guaranteed that you get a proper javascript representation (not some IL gobbledegook) but not necessarily with the comments. For example, the technique breaks in Firefox 16 (but then you have to ask if it is current).
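If you still want to use the trick, one defensive option (an illustrative sketch, not a guarantee against future engines) is to feature-detect whether comments survive toString before relying on it:
// Feature test: does this engine keep comments in Function.prototype.toString()?
var preservesComments = (function () {
    function probe() {/* marker */}
    return /marker/.test(probe.toString());
})();

if (!preservesComments) {
    // fall back to plain string literals / array joins for multi-line strings
}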
So Kangax has returned to the subject matter (intrigued as he was by the fact that Angular uses this hack for core functionality in client-side code) and written up an analysis of the practice, and produced a test table for the state of function decompilation in Javascript.
The takeaway points are that:
The technique is only remotely reliable for user-defined function declarations.
Some old mobile browsers will still collapse functional code, allegedly for performance reasons.
Other old browsers will reveal optimized code, something like what you might get out of Closure Compiler.
Yet others will remove comments and alter whitespace.
Internet Explorer will occasionally add comments and whitespace around the functions.
The AngularJS team seem to think this technique is robust enough to include in their library without explicit caveat. They then tokenize (!) the code and re-evaluate it (!!).
For my purposes, this makes me reasonably confident I can do something relatively undemanding like detect whether a function has an uppercase name or not by parsing it as follows:
/function\s*[A-Z]/.test( fn )
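For example (Widget and helper are hypothetical function names; test coerces the function to its source string):
function Widget() {}
function helper() {}
/function\s*[A-Z]/.test(Widget);  // true
/function\s*[A-Z]/.test(helper);  // false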

Do modern JavaScript JITers need array-length caching in loops?

I find the practice of caching an array's length property inside a for loop quite distasteful. As in,
for (var i = 0, l = myArray.length; i < l; ++i) {
    // ...
}
In my eyes at least, this hurts readability a lot compared with the straightforward
for (var i = 0; i < myArray.length; ++i) {
    // ...
}
(not to mention that it leaks another variable into the surrounding function due to the nature of lexical scope and hoisting.)
I'd like to be able to tell anyone who does this "don't bother; modern JS JITers optimize that trick away." Obviously it's not a trivial optimization, since you could e.g. modify the array while it is being iterated over, but I would think given all the crazy stuff I've heard about JITers and their runtime analysis tricks, they'd have gotten to this by now.
Anyone have evidence one way or another?
And yes, I too wish it would suffice to say "that's a micro-optimization; don't do that until you profile." But not everyone listens to that kind of reason, especially when it becomes a habit to cache the length and they just end up doing so automatically, almost as a style choice.
It depends on a few things:
Whether you've proven your code is spending significant time looping
Whether the slowest browser you're fully supporting benefits from array length caching
Whether you or the people who work on your code find the array length caching hard to read
It seems from the benchmarks I've seen (for example, here and here) that performance in IE < 9 (which will generally be the slowest browsers you have to deal with) benefits from caching the array length, so it may be worth doing. For what it's worth, I have a long-standing habit of caching the array length and as a result find it easy to read. There are also other loop optimizations that can have an effect, such as counting down rather than up.
Here's a relevant discussion about this from the JSMentors mailing list: http://groups.google.com/group/jsmentors/browse_thread/thread/526c1ddeccfe90f0
My tests show that all major newer browsers cache the length property of arrays. You don't need to cache it yourself unless you're concerned about IE6 or IE7 (I don't remember exactly which). However, I have been using another style of iteration since those days, since it gives me another benefit, which I'll describe in the following example:
var arr = ["Hello", "there", "sup"];
for (var i=0, str; str = arr[i]; i++) {
// I already have the item being iterated in the loop as 'str'
alert(str);
}
You must realize that this iteration style stops as soon as it reaches a falsy value, so it cannot be used if the array is allowed to contain falsy values.
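For example:
var mixed = ["a", "", 0, "b"];           // contains falsy values
for (var i = 0, v; v = mixed[i]; i++) {
    console.log(v);                      // logs only "a": the loop stops at ""
}
// Safer when falsy entries are possible: compare against the length instead.
for (var j = 0; j < mixed.length; j++) {
    console.log(mixed[j]);
}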
First of all, how is this harder to do or less legible?
var i = someArray.length;
while (i--) {
    // doStuff to someArray[i]
}
This is not some weird cryptic micro-optimization. It's just a basic work avoidance principle. Not using the '.' or '[]' operators more than necessary should be as obvious as not recalculating pi more than once (assuming you didn't know we already have that in the Math object).
[rantish elements yoinked]
If someArray is entirely internal to a function it's fair game for JIT optimization of its length property which is really like a getter that actually counts up the elements of the array every time you access it. A JIT could see that it was entirely locally scoped and skip the actual counting behavior.
But this involves a fair amount of complexity. Every time you do anything that mutates that Array, you have to treat length like a static property and tell your array-altering methods (the native-code side of them, I mean) to set the property manually, whereas normally length just counts the items up every time it's referenced. That means every time a new array-altering method is added, you have to update the JIT to branch behavior for length references of a locally scoped array.
I could see Chrome doing this eventually, but I don't think it does yet, based on some really informal tests. I'm not sure IE will ever have this level of performance fine-tuning as a priority. As for the other browsers, you could make a strong argument that the maintenance issue of having to branch behavior for every new array method is more trouble than it's worth. At the very least, it would not get top priority.
Ultimately, accessing the length property every loop cycle isn't going to cost you a ton, even in the old browsers, for a typical JS loop. But I would advise getting in the habit of caching any property lookup that is done more than once, because with getter properties you can never be sure how much work is being done, which browsers optimize in what ways, or what kind of performance costs you could hit down the road when somebody decides to move someArray outside of the function, which could lead to the call object being checked in a dozen places before finding what it's looking for every time you do that property access.
Caching property lookups and method returns is easy, cleans your code up, and ultimately makes it more flexible and performance-robust in the face of modification. Even if one or two JITs did make it unnecessary in circumstances involving a number of 'ifs', you couldn't be certain they always would or that your code would continue to make it possible to do so.
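For instance (someObj, items, and process are hypothetical placeholders, not anything from the question):
// Cache the nested lookup once instead of repeating it on every iteration.
var items = someObj.deeply.nested.items;
for (var i = 0, len = items.length; i < len; i++) {
    process(items[i]);
}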
So yes, apologies for the anti-let-the-compiler-handle-it rant, but I don't see why you would ever want to not cache your properties. It's easy. It's clean. It guarantees better performance regardless of browser, or of the object whose properties are being examined moving to an outer scope.
But it really does piss me off that Word docs load as slowly now as they did back in 1995, and that people continue to write horrendously slow Java websites even though Java's VM supposedly beats all non-compiled contenders for performance. I think this notion that you can let the compiler sort out the performance details, and that "modern computers are SO fast", has a lot to do with that. We should always be mindful of work-avoidance when the work is easy to avoid and doesn't threaten legibility/maintainability, IMO. Doing it differently has never helped me (or, I suspect, anybody) write code faster in the long term.
