To access the data in an array, I created an enum-like object that gives the fields human-readable identifiers.
var columns = { first: 0, second: 1 };
var array = ['first', 'second'];
var data = array[columns.first];
When I found out about Object.freeze I wanted to use this for the enum so that it can not be changed, and I expected the VM to use this info to its advantage.
As it turns out, the tests get slower on Chrome and Node, but slightly faster on Firefox (compared to direct access by number).
The code is available here: http://jsperf.com/array-access-via-enum
Here are the benchmarks from Node (corresponding to the JSPerf tests):
fixed Number: 12ms
enum: 12ms
frozenEnum: 85ms
Does V8 just not yet have a great implementation, or is there something suboptimal with this approach for my use-case?
I tried your test in Firefox 20, which is massively faster across the board, and in IE 10, which is slightly faster and more consistent.
So my answer is: no, V8 does not yet have a great implementation.
According to this bug report, freezing an object currently puts it in "dictionary-mode", which is slow.
So instead of improving the performance, it becomes a definite slowdown for "enums"/small arrays.
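For reference, here is a minimal sketch of the variants being discussed, plus an alternative that uses plain constants instead of an object. The names COLUMN_FIRST, COLUMN_SECOND and canOverwrite are made up for illustration, and nothing here says which variant is faster on a given engine; that still has to be measured.

```javascript
// Plain "enum" object: properties can still be reassigned by accident.
var columns = { first: 0, second: 1 };

// Frozen variant: writes are rejected.
var frozenColumns = Object.freeze({ first: 0, second: 1 });

// Hypothetical alternative: constants instead of an object lookup.
const COLUMN_FIRST = 0;
const COLUMN_SECOND = 1;

var array = ['first', 'second'];

// Tries to overwrite `first` and reports whether the write took effect.
function canOverwrite(target) {
  'use strict';
  try {
    target.first = 42;
  } catch (e) {
    // In strict mode, writing to a frozen object throws a TypeError.
  }
  return target.first === 42;
}

var plainChanged = canOverwrite({ first: 0, second: 1 });
var frozenChanged = canOverwrite(Object.freeze({ first: 0, second: 1 }));

// All three ways read the same element.
var viaPlain = array[columns.first];
var viaFrozen = array[frozenColumns.first];
var viaConst = array[COLUMN_FIRST];
```

Freezing only protects against accidental writes; it is a correctness feature, not a documented performance hint.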
Story
During some tests for performance-critical code, I observed a side effect of Math.random() that I do not understand. I am looking for
some deep technical explanation
a falsification for my test (or expectation)
link to a V8 problem/bug ticket
Problem
It looks like calling Math.random() allocates some memory that needs to be cleaned up by the garbage collector (GC).
Test: With Math.random()
const numberOfWrites = 100;
const obj = {
  value: 0
};
let i = 0;
function test() {
  for (i = 0; i < numberOfWrites; i++) {
    obj.value = Math.random();
  }
}
window.addEventListener('DOMContentLoaded', () => {
  setInterval(() => {
    test();
  }, 10);
});
Observation 1: Chrome profile
Chrome: 95.0.463869, Windows 10, Edge: 95.0.1020.40
Running this code in the browser and recording a performance profile results in a classic memory zig-zag.
Memory profile of Math.random() test
Observation 2: Firefox
Firefox Developer: 95, Windows 10
No Garbage Collection (CC/GCMinor) detected - memory quite linear
Workarounds
crypto.getRandomValues()
Replace Math.random() with a large enough array of pre-calculated random numbers using self.crypto.getRandomValues.
(V8 developer here.)
Yes, this is expected. It's a (pretty fundamental) design decision, not a bug, and not strictly related to Math.random(). V8 "boxes" floating-point numbers as objects on the heap. That's because it uses 32 bits per field in an object, which obviously isn't enough for a 64-bit double, and a layer of indirection solves that.
There are a number of special cases where this boxing can be avoided:
in optimized code, for values that never leave the current function.
for numbers whose values are sufficiently small integers ("Smis", signed 31-bit integer range).
for elements in arrays that have only ever seen numbers as elements (e.g. [1, 2.5, NaN], but not [1, true, "hello"]).
possibly other cases that I'm not thinking of right now. Also, all these internal details can (and do!) change over time.
Firefox uses a fundamentally different technique for storing internal references. The benefit is that it avoids having to box numbers, the drawback is that it uses more memory for things that aren't numbers. Neither approach is strictly better than the other, it's just a different tradeoff.
Generally speaking you're not supposed to have to worry about this, it's just your JavaScript engine doing its thing :-)
Problem: Running this code in the browser and recording a performance profile results in a classic memory zig-zag
Why is that a problem? That's how garbage-collected memory works. (Also, just to put things in perspective: the GC only spends ~0.3ms every ~8s in your profile.)
Workaround: Replace Math.random() with a large enough array of pre-calculated random numbers using self.crypto.getRandomValues.
Replacing tiny short-lived HeapNumbers with a big and long-lived array doesn't sound like a great way to save memory.
If it really matters, one way to avoid boxing of numbers is to store them in arrays instead of as object properties. But before going through hard-to-maintain contortions in your code, be sure to measure whether it really matters for your app. It's easy to demonstrate huge effects in a microbenchmark; it's rare to see them have much impact in real apps.
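A sketch of that idea, assuming the hot value is a double that is written very frequently. A Float64Array is used here as one kind of number-only array; that is my choice for the example, not something prescribed above. The names `boxed` and `unboxed` are illustrative, and whether this actually reduces GC activity in a given V8 version should be confirmed with a profile.

```javascript
const numberOfWrites = 100;

// Original shape: a double stored as an object property.
const boxed = { value: 0 };

// Alternative: a one-element typed array holding the same value.
// Typed arrays keep raw 64-bit doubles in their backing buffer.
const unboxed = new Float64Array(1);

function writeToProperty() {
  for (let i = 0; i < numberOfWrites; i++) {
    boxed.value = Math.random();
  }
}

function writeToTypedArray() {
  for (let i = 0; i < numberOfWrites; i++) {
    unboxed[0] = Math.random();
  }
}

writeToProperty();
writeToTypedArray();
```

The cost is readability: `unboxed[0]` says much less than `obj.value`, which is why this is only worth considering after measuring.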
Having grown up with C++, I am always conscious of which algorithm fits which problem. So when I noticed the application starting to behave sluggishly on mobile phones, I immediately started looking at the data structures and how they are represented.
I am noticing a very strange effect: Array.includes is an order of magnitude faster than Set.has, even though Set.has has much more potential to be optimized for lookup; that is the whole idea of using a set.
My initialize code is (this code is outside the timing for the tests):
function shuffle(a) {
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
}
const arr = [];
for (let i = 0; i < 1000; i += 1) {
  arr.push(i);
}
shuffle(arr);
const prebuildset = new Set(arr);
And the tests are:
(new Set(arr)).has(-1); //20.0 kOps/s
arr.includes(-1); //632 kOps/s
(new Set(arr)).has(0); //20.0 kOps/s
arr.includes(0); //720 kOps/s
prebuildset.has(-1); //76.7 kOps/s
prebuildset.has(0); //107 kOps/s
Tested with chrome 73.0.3683.103 on Ubuntu 18.04 using https://jsperf.com/set-array-has-test/1
I can more or less expect the versions that create a set on the fly to be slower than directly testing an array for inclusion. (Though I'd wonder why Chrome doesn't JIT-optimize the array away. I've also tested using a literal array, and a literal array versus a variable makes no difference in speed.)
However, even prebuilt sets are an order of magnitude slower than an array-inclusion test, even for the most negative case (entry not inside the array).
Why is this? What kind of black magic is happening?
EDIT: I've updated the tests to shuffle the results so as to not skew too much to an early stoppage for array.includes()- While no longer 10 times as slow it is still many times slower, very relevant and out of what I expect it to be.
I'll start by stating that I'm not an expert on JavaScript engine implementations and performance optimization; but in general, you should not trust these kinds of tests to give you a reliable assessment of performance.
Time complexity of the underlying algorithm only becomes a meaningful factor over very (very) large numbers, and as a rule of thumb, 1000 is certainly not such a large number, especially for a simple array of integer values.
Over a small amount of millisecond-timed operations, you are going to have many other things happening in the engine at a similar time scale that will throw your measurements off wildly. Optimizations, unexpected overheads, and so on.
As an example, I edited your tests by simply increasing the size of the array to 100,000. The results on my poor old laptop look like this:
arr.includes(-1); //3,323 Ops/s
arr.includes(0); //6,132 Ops/s
prebuildset.has(-1); //41,923,084 Ops/s
prebuildset.has(0); //39,613,278 Ops/s
Which is, clearly, extremely different from your results. My point is, don't try to measure microperformance for small tasks. Use the data structure that makes the most sense for your project, keep your code clean and reasonable, and if you need to scale, prepare accordingly.
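To see how the picture changes with size, a rough sketch like the following can be run for several array lengths. The helper names (`buildInput`, `timeLookups`), the sizes and the iteration count are illustrative, the array is not shuffled here, and `Date.now()` is coarse. Absolute numbers will differ per machine and engine; only the trend is of interest.

```javascript
// Builds an array of 0..size-1 and a Set with the same values.
function buildInput(size) {
  const arr = [];
  for (let i = 0; i < size; i++) arr.push(i);
  return { arr: arr, set: new Set(arr) };
}

// Runs `lookup` repeatedly; returns elapsed milliseconds and hit count.
function timeLookups(lookup, iterations) {
  let hits = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    if (lookup(-1)) hits++; // value is absent: full scan for an array
    if (lookup(0)) hits++;  // value is present
  }
  return { ms: Date.now() - start, hits: hits };
}

const report = [];
for (const size of [1000, 100000]) {
  const input = buildInput(size);
  const viaArray = timeLookups((v) => input.arr.includes(v), 200);
  const viaSet = timeLookups((v) => input.set.has(v), 200);
  report.push({
    size: size,
    arrayMs: viaArray.ms,
    setMs: viaSet.ms,
    arrayHits: viaArray.hits,
    setHits: viaSet.hits
  });
}
console.log(report);
```

Counting hits keeps the lookups from being trivially dead code and doubles as a sanity check that both structures agree.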
Let's say I have a javascript object.
var hash = { a : [ ] };
Now I want to edit the array hash.a in two ways: first by accessing hash.a every time, and second by creating a reference var arr = hash.a that points to the same array. Is the second way faster, or are they the same?
Example:
// first way
hash.a.push(1);
hash.a.push(2);
hash.a.push(3);
hash.a.push(4);
hash.a.push(5);
//second way
var arr = hash.a;
arr.push(1);
arr.push(2);
arr.push(3);
arr.push(4);
arr.push(5);
Thanks a lot!
I don't think there would be any real performance gain, and if there is a slight gain it isn't worth it: you hamper the legibility and maintainability of the code and use slightly more memory (creation and garbage collection of another variable, arr, and more chances for memory leaks if you don't handle it properly). I wouldn't recommend it.
In a typical software project, only 20% of the time is development, rest 80% is testing and maintaining it.
You're doing the compiler's job in this situation - any modern compiler will optimize this code when interpreting it so both cases should be the same in terms of performance.
Even if these two cases weren't optimized by the compiler, the performance gains would be negligible. Focus on making your code as readable as possible, and let the compiler handle these referencing optimizations.
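To make the reference semantics concrete: var arr = hash.a does not copy the array, it only stores a second reference to the same object, so both styles end up mutating the same array. A minimal sketch (the variables sameObject and contents exist only for inspection):

```javascript
var hash = { a: [] };

// First way: look the property up each time.
hash.a.push(1);
hash.a.push(2);

// Second way: keep a local reference to the same array.
var arr = hash.a;
arr.push(3);
arr.push(4);

// Both names refer to one and the same array object.
var sameObject = (arr === hash.a);

// A copy of the contents, taken only to look at the result.
var contents = hash.a.slice();
```

Since the result is identical, the choice between the two is about readability rather than behaviour.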
Performance associated with Arrays and Objects in JavaScript (especially Google V8) would be very interesting to document. I find no comprehensive article on this topic anywhere on the Internet.
I understand that some Objects use classes as their underlying data structure. If there are a lot of properties, it is sometimes treated as a hash table?
I also understand that Arrays are sometimes treated like C++ Arrays (i.e. fast random indexing, slow deletion and resizing). And, other times, they are treated more like Objects (fast indexing, fast insertion/removal, more memory). And, maybe sometimes they are stored as linked lists (i.e. slow random indexing, fast removal/insertion at the beginning/end)
What is the precise performance of Array/Object retrievals and manipulations in JavaScript? (specifically for Google V8)
More specifically, what is the performance impact of:
Adding a property to an Object
Removing a property from an Object
Indexing a property in an Object
Adding an item to an Array
Removing an item from an Array
Indexing an item in an Array
Calling Array.pop()
Calling Array.push()
Calling Array.shift()
Calling Array.unshift()
Calling Array.slice()
Any articles or links for more details would be appreciated, as well. :)
EDIT: I am really wondering how JavaScript arrays and objects work under the hood. Also, in what context does the V8 engine "know" to "switch-over" to another data structure?
For example, suppose I create an array with...
var arr = [];
arr[10000000] = 20;
arr.push(21);
What's really going on here?
Or... what about this...???
var arr = [];
//Add lots of items
for (var i = 0; i < 1000000; i++)
  arr[i] = Math.random();
//Now I use it like a queue...
while (arr.length > 0)
{
  var item = arr.shift();
  //Do something with item...
}
For conventional arrays, the performance would be terrible; whereas, if a LinkedList was used... not so bad.
I created a test suite, precisely to explore these issues (and more) (archived copy).
And in that sense, you can see the performance issues in this 50+ test case tester (it will take a long time).
Also, as its name suggests, it explores using the native linked-list nature of the DOM structure.
(Currently down, rebuild in progress) More details on my blog regarding this.
The summary is as follows:
V8 Array is Fast, VERY FAST
Array push / pop / shift is roughly 20x+ faster than any object equivalent.
Surprisingly, Array.shift() is fast: roughly 6x slower than an array pop, but roughly 100x faster than an object attribute deletion.
Amusingly, Array.push( data ); is faster than Array[nextIndex] = data by almost 20x (dynamic array) to 10x (fixed array).
Array.unshift(data) is slower, as expected, and is roughly 5x slower than adding a new property.
Nulling the value (array[index] = null) is roughly 4x+ faster than deleting it (delete array[index], which leaves undefined) in an array.
Surprisingly, nulling a value in an object (obj[attr] = null) is roughly 2x slower than just deleting the attribute (delete obj[attr]).
Unsurprisingly, mid-array Array.splice(index,0,data) is slow, very slow.
Surprisingly, Array.splice(index,1,data) has been optimized (no length change) and is 100x faster than a plain Array.splice(index,0,data).
Unsurprisingly, the divLinkedList is inferior to an array in all respects, except dll.splice(index,1) removal (where it broke the test system).
BIGGEST SURPRISE of it all [as jjrv pointed out]: V8 array writes are slightly faster than V8 reads =O
Note: These metrics apply only to large arrays/objects which V8 does not "entirely optimise out". There can be very isolated optimised performance cases for array/object sizes less than an arbitrary size (24?). More details can be seen extensively across several Google I/O videos.
Note 2: These wonderful performance results are not shared across browsers, especially *cough* IE. Also, the test is huge, so I have yet to fully analyze and evaluate the results: please edit it in =)
Updated Note (Dec 2012): Google representatives have videos on YouTube describing the inner workings of Chrome itself (like when it switches from a linked-list array to a fixed array, etc.), and how to optimize for them. See GDC 2012: From Console to Chrome for more.
At a basic level that stays within the realms of JavaScript, properties on objects are much more complex entities. You can create properties with setters/getters, with differing enumerability, writability, and configurability. An item in an array isn't able to be customized in this way: it either exists or it doesn't. At the underlying engine level this allows for a lot more optimization in terms of organizing the memory that represents the structure.
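As a small illustration of the property attributes mentioned above, here is a sketch using the standard Object.defineProperty API. The object and property names (point, id, y, stored) are made up for the example.

```javascript
var point = { x: 1 };

// A read-only, non-enumerable data property.
Object.defineProperty(point, 'id', {
  value: 7,
  writable: false,
  enumerable: false,
  configurable: false
});

// An accessor property backed by a getter and a setter.
var stored = 0;
Object.defineProperty(point, 'y', {
  get: function () { return stored; },
  set: function (v) { stored = Number(v); },
  enumerable: true,
  configurable: true
});

// The assignment goes through the setter, which converts to a number.
point.y = '5';

// 'id' is hidden from enumeration, so only 'x' and 'y' are listed.
var visibleKeys = Object.keys(point);
```

Every one of these attributes is something the engine has to be prepared for on any object property, which is part of why plain indexed elements are easier to optimize.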
In terms of telling an array apart from an object (dictionary), JS engines have always drawn explicit lines between the two. That's why there's a multitude of articles on methods of trying to make a semi-fake Array-like object that behaves like one but allows other functionality. The reason this separation even exists is because the JS engines themselves store the two differently.
Properties can be stored on an array object but this simply demonstrates how JavaScript insists on making everything an object. The indexed values in an array are stored differently from any properties you decide to set on the array object that represents the underlying array data.
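The observable side of this distinction can be seen from plain JavaScript. The sketch below only shows language-level behaviour (it says nothing about how a particular engine lays things out in memory); the names list and label are illustrative.

```javascript
var list = [10, 20, 30];

// A named property lives on the array object, not among its elements.
list.label = 'scores';

// The length only counts indexed elements.
var length = list.length;

// Own enumerable keys: the indices first, then the named property.
var keys = Object.keys(list);

// JSON serialization of an array only includes the indexed elements.
var json = JSON.stringify(list);

// It is still a genuine array.
var stillArray = Array.isArray(list);
```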
Whenever you're using a legit array object and using one of the standard methods of manipulating that array you're going to be hitting the underlying array data. In V8 specifically, these are essentially the same as a C++ array so those rules will apply. If for some reason you're working with an array that the engine isn't able to determine with confidence is an array, then you're on much shakier ground. With recent versions of V8 there's more room to work though. For example, it's possible to create a class that has Array.prototype as its prototype and still gain efficient access to the various native array manipulation methods. But this is a recent change.
Specific links to recent changes to array manipulation may come in handy here:
http://code.google.com/p/v8/source/detail?r=10024
http://code.google.com/p/v8/source/detail?r=9849
http://code.google.com/p/v8/source/detail?r=9747
As a bit of extra, here's Array Pop and Array Push directly from V8's source, both implemented in JS itself:
function ArrayPop() {
  if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
    throw MakeTypeError("called_on_null_or_undefined",
                        ["Array.prototype.pop"]);
  }
  var n = TO_UINT32(this.length);
  if (n == 0) {
    this.length = n;
    return;
  }
  n--;
  var value = this[n];
  this.length = n;
  delete this[n];
  return value;
}
function ArrayPush() {
  if (IS_NULL_OR_UNDEFINED(this) && !IS_UNDETECTABLE(this)) {
    throw MakeTypeError("called_on_null_or_undefined",
                        ["Array.prototype.push"]);
  }
  var n = TO_UINT32(this.length);
  var m = %_ArgumentsLength();
  for (var i = 0; i < m; i++) {
    this[i+n] = %_Arguments(i);
  }
  this.length = n + m;
  return this.length;
}
I'd like to complement existing answers with an investigation to the question of how implementations behave regarding growing arrays: If they implement them the "usual" way, one would see many quick pushes with rare, interspersed slow pushes at which point the implementation copies the internal representation of the array from one buffer to a larger one.
You can see this effect very nicely, this is from Chrome:
16: 4ms
40: 8ms 2.5
76: 20ms 1.9
130: 31ms 1.7105263157894737
211: 14ms 1.623076923076923
332: 55ms 1.5734597156398105
514: 44ms 1.5481927710843373
787: 61ms 1.5311284046692606
1196: 138ms 1.5196950444726811
1810: 139ms 1.5133779264214047
2731: 299ms 1.5088397790055248
4112: 341ms 1.5056755767118273
6184: 681ms 1.5038910505836576
9292: 1324ms 1.5025873221216042
Even though each push is profiled, the output contains only those that take time above a certain threshold. For each test I customized the threshold to exclude all the pushes that appear to be representing the fast pushes.
So the first number represents which element has been inserted (the first line is for the 17th element), the second is how long it took (each step pushes to many arrays, so the time is summed over all of them), and the last value is the first number divided by the first number of the previous line.
All lines that have less than 2ms execution time are excluded for Chrome.
You can see that Chrome grows the array by a factor of 1.5 each time, plus some offset to account for small arrays.
For Firefox, it's a factor of two:
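The Chrome numbers above are consistent with a simple growth rule. The sketch below is inferred from the measurements only, not taken from V8's source: each new capacity is the old one, plus half of it (rounded down), plus a constant offset of 16. growCapacity and growthSequence are hypothetical names.

```javascript
// Assumed rule, fitted to the measurements: old + floor(old / 2) + 16.
function growCapacity(capacity) {
  return capacity + (capacity >> 1) + 16;
}

// Applies the rule repeatedly, starting from `start`.
function growthSequence(start, count) {
  var result = [start];
  while (result.length < count) {
    result.push(growCapacity(result[result.length - 1]));
  }
  return result;
}

// Indices at which the slow pushes were observed in Chrome.
var observed = [16, 40, 76, 130, 211, 332, 514, 787,
                1196, 1810, 2731, 4112, 6184, 9292];

var predicted = growthSequence(16, observed.length);
var matches = predicted.every(function (value, index) {
  return value === observed[index];
});
console.log(matches);
```

Because the capacity grows geometrically, the number of reallocations for n pushes grows only logarithmically with n, which is why a push stays cheap on average despite the occasional slow one.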
126: 284ms
254: 65ms 2.015873015873016
510: 28ms 2.0078740157480315
1022: 58ms 2.003921568627451
2046: 89ms 2.0019569471624266
4094: 191ms 2.0009775171065494
8190: 364ms 2.0004885197850513
I had to put the threshold up quite a bit in Firefox, that's why we start at #126.
With IE, we get a mix:
256: 11ms 256
512: 26ms 2
1024: 77ms 2
1708: 113ms 1.66796875
2848: 154ms 1.6674473067915691
4748: 423ms 1.6671348314606742
7916: 944ms 1.6672283066554338
It's a factor of two at first and then it moves to a factor of five thirds.
So all common implementations use the "normal" way for arrays (instead of going crazy with ropes, for example).
Here's the benchmark code and here's the fiddle it's in.
var arrayCount = 10000;
var dynamicArrays = [];
for (var j = 0; j < arrayCount; j++)
  dynamicArrays[j] = [];
var lastLongI = 1;
for (var i = 0; i < 10000; i++)
{
  var before = Date.now();
  for (var j = 0; j < arrayCount; j++)
    dynamicArrays[j][i] = i;
  var span = Date.now() - before;
  if (span > 10)
  {
    console.log(i + ": " + span + "ms" + " " + (i / lastLongI));
    lastLongI = i;
  }
}
While running under node.js 0.10 (built on v8) I was seeing CPU usage that seemed excessive for the workload. I traced one performance problem to a function that was checking for the existence of a string in an array. So I ran some tests.
loaded 90,822 hosts
loading config took 0.087 seconds (array)
loading config took 0.152 seconds (object)
Loading 91k entries into an array (with validate & push) is faster than setting obj[key]=value.
In the next test, I looked up every hostname in the list one time (91k iterations, to average the lookup time):
searching config took 87.56 seconds (array)
searching config took 0.21 seconds (object)
The application here is Haraka (a SMTP server) and it loads the host_list once at startup (and after changes) and subsequently performs this lookup millions of times during operation. Switching to an object was a huge performance win.
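The difference comes from how each lookup is done: indexOf on an array scans the entries one by one, while an object keyed by hostname is looked up by key. A minimal sketch with made-up hostnames and helper names; this is not Haraka's actual code.

```javascript
var hostList = ['example.com', 'example.net', 'example.org'];

// Array lookup: linear scan over all entries.
function inArray(host) {
  return hostList.indexOf(host) !== -1;
}

// Object lookup: build the index once, then check by key.
// Object.create(null) avoids inherited keys such as "toString".
var hostIndex = Object.create(null);
hostList.forEach(function (host) {
  hostIndex[host] = true;
});

function inObject(host) {
  return hostIndex[host] === true;
}
```

With three entries the two are indistinguishable; the gap reported above only shows up because the list has tens of thousands of entries and the lookup runs very often.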
That is, would I be better suited to use some kind of tree or skip list data structure if I need to be calling this function a lot for individual array insertions?
You might consider whether you want to use an object instead; all JavaScript objects (including Array instances) are (highly-optimized) sets of key/value pairs with an optional prototype. An implementation should (note I don't say "does") have a reasonably performant hashing algorithm. (Update: That was in 2010. Here in 2018, objects are highly optimized on all significant JavaScript engines.)
Aside from that, the performance of splice is going to vary a lot between implementations (e.g., vendors). This is one reason why "don't optimize prematurely" is even more appropriate advice for JavaScript applications that will run in multiple vendor implementations (web apps, for instance) than it is even for normal programming. Keep your code well modularized and address performance issues if and when they occur.
Here's a good rule of thumb, based on tests done in Chrome, Safari and Firefox: Splicing a single value into the middle of an array is roughly half as fast as pushing/shifting a value to one end of the array. (Note: Only tested on an array of size 10,000.)
http://jsperf.com/splicing-a-single-value
That's pretty fast. So, it's unlikely that you need to go so far as to implement another data structure in order to squeeze more performance out.
Update: As eBusiness points out in the comments below, the test performs an expensive copy operation along with each splice, push, and shift, which means that it understates the difference in performance. Here's a revised test that avoids the array copying, so it should be much more accurate: http://jsperf.com/splicing-a-single-value/19
Move single value
// tmp = arr[1][i];
// arr[1].splice(i, 1); // splice is slow in FF
// arr[1].splice(end0_1, 0, tmp);
tmp = arr[1][i];
ii = i;
while (ii < end0_1)
{
  arr[1][ii] = arr[1][++ii];
  cycles++;
}
arr[1][end0_1] = tmp;
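A self-contained version of the loop above, as a sketch: it moves the element at index `from` to index `to` (assuming `from <= to`) by shifting the elements in between one slot to the left, without calling splice. moveToIndex is a hypothetical helper name, and the cycles counter from the fragment is omitted.

```javascript
function moveToIndex(arr, from, to) {
  var tmp = arr[from];
  for (var i = from; i < to; i++) {
    arr[i] = arr[i + 1]; // shift each following element left by one
  }
  arr[to] = tmp;
  return arr;
}

// Same move expressed with two splice calls, for comparison.
function moveWithSplice(arr, from, to) {
  var tmp = arr[from];
  arr.splice(from, 1);
  arr.splice(to, 0, tmp);
  return arr;
}

var moved = moveToIndex(['a', 'b', 'c', 'd', 'e'], 1, 3);
var movedViaSplice = moveWithSplice(['a', 'b', 'c', 'd', 'e'], 1, 3);
```

Both functions modify the array in place and produce the same order; which one is faster depends on the engine, as the comment in the original fragment suggests for Firefox.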