I have an n*n*n array in JavaScript, in which I need to perform a LOT of accesses.
I don't need to access all elements sequentially, but only at specific positions. If possible, I also want to avoid allocating memory for the array cells until they are used (otherwise it would take several MB of memory right away).
I'm looking for the most efficient way to do this.
I tried using a dictionary indexed by a built key ( x + '#' + y + '#' + z ), but it's clearly not efficient enough.
Could you suggest some other efficient ways to achieve this?
There's no faster way to access objects than a dictionary lookup, I'm afraid, because that's what everything in JavaScript really is. To avoid allocating the memory up front, you can use an object instead of an array:
var grid = {};
var key = x + '#' + y + '#' + z;
grid[key] = 'some value';
This will at least address your memory concern, though I'm not sure it's really much of a concern. (Also, I'm not even sure that using an array WILL allocate the memory up front, because I'm unfamiliar with memory allocation in JavaScript.)
I think your multidimensional array is perfectly fine. If created sparse, it will not eat up all the memory, and will act more like a simple "dictionary" object; you could use nested objects as well. I'd also suggest that the nested lookup will be faster than in one huge dictionary, since the hash function gets simpler with fewer keys. Also, loading or iterating a full array along the innermost dimension will be noticeably faster than querying each single item from the huge dictionary.
After all, if you don't actually experience any essential performance issues use what you find easier to write/read/use.
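The nested-object idea above can be sketched like this (the name Grid3D is illustrative, not from the question; inner objects are only created when a cell in that slice is first written):

```javascript
// A lazily-allocated 3D grid built from nested objects.
function Grid3D() {
  this.data = {};
}

Grid3D.prototype.set = function (x, y, z, value) {
  var plane = this.data[x] || (this.data[x] = {});
  var row = plane[y] || (plane[y] = {});
  row[z] = value;
};

Grid3D.prototype.get = function (x, y, z) {
  var plane = this.data[x];
  if (!plane) return undefined;
  var row = plane[y];
  return row ? row[z] : undefined;
};

var g = new Grid3D();
g.set(10, 20, 30, 'hello');
console.log(g.get(10, 20, 30)); // "hello"
console.log(g.get(0, 0, 0));    // undefined
```

Each lookup is three small property accesses instead of one string concatenation plus one lookup in a huge dictionary, which is the trade-off the paragraph above describes.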
Related
I have a question about how object key-value access works when we get a value by key. Does it need to look up the key first and return the corresponding value?
For example, if we have an object like
var obj = { keyX: valueX, keyY: valueY, keyZ: valueZ }
Then we retrieve the value obj.keyY; does it look through those object keys one by one to find keyY?
So for a large object (e.g. an object with 1 million keys), is it slow to get a value by key?
I appreciate any help. Thank you.
Then we retrieve the value obj.keyY; does it look through those object keys one by one to find keyY?
It's up to the implementation, but even in old JavaScript engines, the answer is no, it's much, much more efficient than that. Object property access is a hugely common operation, so JavaScript engines aggressively optimize it, and are quite sophisticated at doing so.
In modern JavaScript engines objects are often optimized to just-in-time generated class machine code, so property lookup is blindingly fast. If they aren't optimized for some reason (perhaps only used infrequently), typically a structure like a hash table is used, so lookup remains much better than linear access (looking at each property).
Objects with large numbers of properties, or properties that vary over time, may not be as well optimized as objects with more reasonable numbers of properties. But they'll at least be optimized (in anything vaguely modern) to a hash table level of access time. (FWIW: For objects that have varying properties over time, you may be better off with a Map in the first place.)
Although interesting academically, don't worry about this in terms of writing your code until/unless you run into a performance problem that you trace to slow property access. (Which I literally never have in ~20 years of JavaScript coding. :-) )
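A minimal illustration of the Map alternative mentioned above (the keys and values are made up for the example):

```javascript
// Map is designed for key sets that vary over time, and it accepts
// non-string keys without coercing them.
var cache = new Map();
cache.set('keyY', 'valueY');
cache.set(42, 'numeric key');

console.log(cache.get('keyY')); // "valueY"
console.log(cache.get(42));     // "numeric key"
console.log(cache.has('keyX')); // false
```

A plain object would coerce the key 42 to the string "42"; a Map keeps it as a number.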
Little benchmark:
//setup..
var obj = {};
for (var i = 0; i < 1000000; i++) obj['key' + i] = i;
var a;
//find 1
console.time(1);
a = obj.key1;
console.timeEnd(1);
//find last
console.time(2);
a = obj.key999999;
console.timeEnd(2);
As you can see, it does not look them up one by one.
I was wondering what the use-cases for code like var foo = new Array(20), var foo = [1,2,3]; foo.length = 10 or var foo = [,,,] were (also, why would you want to use the delete operator instead of just removing the item from the array). As you may know already, all these will result in sparse arrays.
But why are we allowed to do the above things? Why would anyone want to create an array whose length is 20 by default (as in the first example)? Why would anyone want to modify the length property of an array (as in the second example)? Why would anyone want to do something like [, , ,]? Why would you use delete instead of just removing the element from the array?
Could anyone provide some use-cases for these statements ?
I have been searching for some answers for ~3 hours. Nothing. The only thing most sources (2ality blog, JavaScript: The Definitive Guide 6th edition, and a whole bunch of other articles that pop up in the Google search results when you search for anything like "JavaScript sparse arrays") say is that sparse arrays are weird behavior and that you should stay away from them. No sources I read explained, or at least tried to explain, why we were allowed to create sparse arrays in the first place. Except for You Don't Know JS: Types & Grammar, here is what the book says about why JavaScript allows the creation of sparse arrays:
An array that has no explicit values in its slots, but has a length property that implies the slots exist, is a weird exotic type of data structure in JS with some very strange and confusing behavior. The capability to create such a value comes purely from old, deprecated, historical functionalities ("array-like objects" like the arguments object).
So, the book implies that the arguments object somehow, somewhere, uses one of the examples I listed above to create a sparse array. So, where and how does arguments use sparse arrays ?
Something else that is confusing me is this part in the book "JavaScript: The Definitive Guide 6th Edition":
Arrays that are sufficiently sparse are typically implemented in a slower, more memory-efficient way than dense arrays are.
"more memory-efficient" appears like a contradiction to "slower" to me, so what is the difference between the two, in the context of sparse arrays especially ? Here is a link to that specific part of the book.
I was wondering what the use-cases for code like var foo = new Array(20), var foo = [1,2,3]; foo.length = 10 or var foo = [,,,] were
in theory, for the same reasons people usually use sparse data structures ( not necessarily in order of importance ): memory usage ( var x = []; x[0]=123;x[100000]=456; won't consume 100000 'slots' ), performance ( say, taking the average of the aforementioned x via for-in or reduce() ) and convenience ( no 'hard' out-of-bounds errors, no need to grow/shrink explicitly );
that said, semantically, a js array is just a special associative collection with index keys and a special property 'length' satisfying the invariant of being greater than all its index properties. While a pretty elegant definition, it has the drawback of rendering sparsely defined arrays somewhat confusing and error-prone, as you noticed.
But why are we allowed to do the above things ?
even if we were not allowed to define sparse arrays, we could still put undefined elements into arrays, resulting in basically the same usability problems you see with sparse arrays.
So, say, having [0,undefined,...,undefined,1,undefined] be the same as [0,...,1,] would buy you nothing but more memory-consuming arrays and slower iteration.
Arrays that are sufficiently sparse are typically implemented in a slower, more memory-efficient way than dense arrays are. more memory-efficient and slower appear like a contradiction to me
"dense arrays" used for general purpose data are typically implemented as a contiguous block of memory filled with elements of the same size; if you add more elements, you continue filling the memory block allocating a new block if exhausted. Given that reallocation implies moving all elements to the new memory block, said memory is typically allocated in abundance to minimize chances of reallocation ( something like the golden ratio times the last capacity ).
Hence, such data structures are typically the fastest for ordered/local traversal ( being more CPU/cache friendly ), the slowest for unpredictable insertions/deletions ( for sufficiently big N ) and have high memory overhead ~ sizeof(elem) * N + extra space for future elems.
Conversely, "sparse arrays/matrices/..." are implemented by 'linking' together smaller memory blocks spread in memory, or by using some 'logically compressed' form of a dense data structure, or both; in either case, memory consumption is reduced for obvious reasons, but traversing them comparatively requires more work and less local memory access patterns.
So, if compared relative to the same effectively traversed elements sparse arrays consume much less memory but are much slower than dense arrays. However, given that you use sparse arrays with sparse data and algorithms acting trivially on 'zeros', sparse arrays can turn out much faster in some scenarios ( eg. multiply very big matrices with few non zero elements ... ).
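The memory and iteration points above can be seen directly in a small sketch (the values are the ones from the earlier x example):

```javascript
// A sparse array: only two slots are actually stored, despite the large length.
var x = [];
x[0] = 123;
x[100000] = 456;

console.log(x.length);              // 100001 (the length, not the stored slots)
console.log(Object.keys(x).length); // 2, only the defined indices exist

// forEach (like for-in) visits only the defined elements, skipping the holes:
var sum = 0, count = 0;
x.forEach(function (v) { sum += v; count++; });
console.log(sum / count);           // 289.5, the average of the two stored values
```

This is the "algorithms acting trivially on 'zeros'" case: the traversal cost scales with the two stored elements, not with the 100001 logical slots.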
Because a JS Array is a very curious kind of data type that does not obey the time-complexity rules you would probably expect when the right tool is used, by which I mean the for-in loop or the Object.keys() method. Despite being a very functional person, I would lean towards the for-in loop here, since it can be broken out of.
There are some very beneficial use cases of sparse arrays in JS, such as inserting and deleting items into a sorted array in O(1) without disturbing the sorted structure, if your values are numerals, like a Limit Order Book. In other words, this works if you can establish a direct numerical correlation between your keys and values.
I searched for this info on the web; some say yes, because JavaScript must create a new string object to store the result of the concatenation, and some say no, because string objects are not collected.
Maybe it depends on the context. For example, if I had a collection of objects keyed by name, like
animals = { "blue_dog": …, "red_dog": …, "yellow_cat": …, "red_bird": …, "green_bird": …, … }
and I had a function with animal and color arguments, in this function I would access my object like this:
animals[animal+"_"+color].
Most of the time I do concatenations when drawing text, which obviously doesn't happen a lot of times per frame. So even if it becomes garbage, it is insignificant. But when using a concatenation as an object's key, because of loops this concatenation could happen a thousand times per frame, and then this may become a problem.
Doing something like animals[animal + "_" + color] creates a temporary value for accessing the object. That temporary will be collected either by the garbage collector or when the block/function call ends (implementation-dependent).
My assumption (based on my own experience in working with compilers) is that since the result isn't stored in a variable, it will be placed on the stack and released once the function completes. But, again, this depends on how the engine is written.
I'm in no way an expert on compilers.
It may or may not, depending on how good the compiler's optimizations are.
This: a + b + c is (a + b) + c, so you have two concatenations. The result of (a + b) will be a temporary object (a string here). That temporary is garbage.
For the given expression, the form a.concat(b, c) is conceptually better; in principle it does not require intermediate temporaries.
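A small sketch of the two forms (the string values are illustrative):

```javascript
var a = 'foo', b = 'bar', c = 'baz';

// Two concatenations: (a + b) produces a temporary string first.
var viaPlus = a + b + c;

// One call: the engine can build the result in a single pass.
var viaConcat = a.concat(b, c);

console.log(viaPlus);               // "foobarbaz"
console.log(viaPlus === viaConcat); // true
```

Whether the temporary from (a + b) actually becomes garbage-collector work depends on the engine; modern engines may optimize the whole expression either way.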
In my experience, concatenating strings can definitely produce garbage. I had a scenario where I had a lot of cells in a large table and each cell could have a different css class (from a set of combinations of maybe 30 different combinations) assigned. Doing this the normal way:
const cssClass = group + empty + active;
Would produce a lot of garbage. I created a memoized function that received a bitmask as an argument, which produced the class string, and got rid of the garbage that way.
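A sketch of that memoized bitmask approach (the flag names and values are hypothetical, not from the original code):

```javascript
// Hypothetical flags for the parts of a cell's class string.
var GROUP = 1, EMPTY = 2, ACTIVE = 4;

// The cache maps each bitmask to its class string, so the concatenation
// (and its temporary garbage) happens at most once per combination.
var classCache = {};

function classForMask(mask) {
  var cached = classCache[mask];
  if (cached !== undefined) return cached;
  var parts = [];
  if (mask & GROUP) parts.push('group');
  if (mask & EMPTY) parts.push('empty');
  if (mask & ACTIVE) parts.push('active');
  return (classCache[mask] = parts.join(' '));
}

console.log(classForMask(GROUP | ACTIVE)); // "group active"
```

With ~30 combinations the cache stays tiny, and after warm-up each render frame does only a cheap integer lookup instead of fresh string concatenations.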
This reply is an excellent list of what to avoid:
https://stackoverflow.com/a/18411275
I want to create an array in javascript and remember two ways of doing it so I just want to know what the fundamental differences are and if there is a performance difference in these two "styles"
var array_1 = new Array("fee","fie","foo","fum");
var array_2 = ['a','b','c'];
for (let i=0; i<array_1.length; i++){
console.log(array_1[i])
}
for (let i=0; i<array_2.length; i++){
console.log(array_2[i])
}
They do the same thing. Advantages to the [] notation are:
It's shorter.
If someone does something silly like redefine the Array symbol, it still works.
There's no ambiguity when you only define a single entry, whereas when you write new Array(3), if you're used to seeing entries listed in the constructor, you could easily misread that to mean [3], when in fact it creates a new array with a length of 3 and no entries.
It may be a tiny little bit faster (depending on JavaScript implementation), because when you say new Array, the interpreter has to go look up the Array symbol, which means traversing all entries in the scope chain until it gets to the global object and finds it, whereas with [] it doesn't need to do that. The odds of that having any tangible real-world impact in normal use cases are low. Still, though...
So there are several good reasons to use [].
Advantages to new Array:
You can set the initial length of the array, e.g., var a = new Array(3);
I haven't had any reason to do that in several years (not since learning that arrays aren't really arrays and there's no point trying to pre-allocate them). And if you really want to, you can always do this:
var a = [];
a.length = 3;
There's no difference in your usage.
The only real usage difference is passing an integer parameter to new Array() which will set an initial array length (which you can't do with the [] array-literal notation). But they create identical objects either way in your use case.
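The single-integer-argument behavior described above can be seen directly:

```javascript
var a = new Array(3); // length 3, no entries (every slot is a hole)
var b = [3];          // length 1, single entry 3

console.log(a.length); // 3
console.log(0 in a);   // false, the slot doesn't actually exist
console.log(b.length); // 1
console.log(b[0]);     // 3
```

This is exactly the misreading hazard mentioned earlier: new Array(3) and [3] look similar but create very different arrays.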
This benchmark on JSPerf shows the array literal form to be generally faster than the constructor on some browsers (and not slower on any).
This behavior is, of course, totally implementation dependent, so you'll need to run your own test on your own target platforms.
I believe the performance benefits are negligible.
See http://jsperf.com/new-array-vs-literal-array/4
I think both ways are the same in terms of performance, since they both eventually create an "Array object". So once you start accessing the array, the mechanism will be the same. I'm not too sure how different the mechanisms for constructing the arrays are (in terms of performance), but there shouldn't be any noticeable gains from using one over the other.
I was reading this post the other night about the inner workings of the Array, and learned a lot from the answers posted, especially from Jonathan Holland's one.
So the reason you give a size to an array beforehand is so that space can be reserved up front, so that elements in the array will be placed next to each other in memory, thus providing O(1) access time because of pointer + offset traversal.
But in JavaScript, you can initialize an array like such:
var anArray = []; //Initialize an empty array, without a dimension
So my question is: since in JavaScript you can initialize an array without specifying a dimension beforehand, how is memory allocated for an array so that it still provides O(1) access time, given that the 'amount' of memory locations is not specified beforehand?
Hmm. You should distinguish between arrays and associative arrays.
arrays:
A=[0,1,4,9,16];
associative arrays:
B={a:'ha',b:27,c:30};
The former has a length, the latter does not. When I run this in a javascript shell, I get:
js>A=[0,1,4,9,16];
0,1,4,9,16
js>A instanceof Array
true
js>A.length
5
js>B={a:'ha',b:27,c:30};
[object Object]
js>B instanceof Array
false
js>B.length
js>
How arrays "work" in Javascript is implementation-dependent. (Firefox, Microsoft, Opera and Google Chrome may all use different methods.) My guess is that they (arrays, not associative arrays) use something like STL's std::vector. Your question:
how is memory allocated for an array to still provide a O(1) access time since the 'amount' of memory locations is not specified beforehand ?
is more along the lines of how std::vector (or similar resizable arrays) works: it reallocates to a larger array as necessary. Insertions at the end take amortized O(1) time; if you insert N elements where N is large, the total time is O(N). Individual inserts that do have to resize the array may take longer, but on average each one takes O(1).
Arrays in Javascript are "fake". They are implemented as hash maps. So in the worst case their access time is not O(1). They also need more memory and you can use any string as an array index. You think that's weird? It is.
As I understand, it's like this:
There are two different things in JavaScript: arrays and objects. They both act as hashtables, although the underlying implementation is specific to each runtime. The difference between the two is that an array has an implicit length property, while an object does not. Otherwise you can use the [] or . syntax for both of them. Yes, that means objects can have numerical properties and arrays can have string indices. No problem. However, the length property might not be what you expect when using such tricks or sparse arrays; you should rely on it only if the array is not sparse and its indices start from 0.
As for the performance - sorry, it's not the O(1) you'd expect. As stated before - it's actually implementation specific. But in general case it's not possible to ensure that there will be O(1) performance for all operations in a hashtable. That said, I'd expect that decent implementations should have quite a few optimizations in place for standard cases, which would make the performance quite close to O(1) under most scenarios. But at any rate - storing huge volumes of data in JavaScript is not a wise idea.
It is the same in PHP. I came from a PHP/JavaScript background, and dimensionalizing arrays really tripped me up when I moved on to other languages.
JavaScript has no real arrays.
Elements are allocated as you define them.
It is a flexible tool. You can use it for many purposes, but as a general-purpose tool it is not as efficient as special-purpose arrays.
As Jason said, unless it is explicitly specified by the ECMAScript standard (unlikely), it is implementation dependent. The article shown by Feet shows that IE's implementation was poor (until IE8?), which is confirmed by JavaScript loop performance.
Other JS engines probably take a more pragmatic approach. For example, they can do as Lua does, having a true array part and an associative array part: if the array is dense, it lives in the true array (which can still be extended at the cost of re-allocation), and you can still have sparse high indices living in the associative part.
Thus you have the best of two worlds, speed of access to dense parts and low memory use for sparse parts.
You can even do:
var asocArray = {key: 'val', 0: 'one', 1: 'two'}
and
var array = []; array['key'] = 'val'; array[0] = 'one'; array[1] = 'two';
When looping, you can use them in the same way with a for...in loop, instead of using that for asocArray and a for (var i = 0; i < array.length; i++) loop for "array". The only difference is that if you expect the indices in "array" to satisfy typeof i === 'number' or (i).constructor === Number, that will only hold in the for (var i = 0; i < array.length; i++) loop; the for...in loop makes all the keys (even the indices) a String.
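A small sketch of that key-type difference, using the "array" from above:

```javascript
var array = [];
array['key'] = 'val';
array[0] = 'one';
array[1] = 'two';

// for...in yields property names, which are always strings:
for (var k in array) {
  console.log(k, typeof k); // "0 string", "1 string", "key string"
}

// A counting loop uses a numeric counter instead, and never sees 'key':
for (var i = 0; i < array.length; i++) {
  console.log(i, typeof i); // "0 number", "1 number"
}
```

Note also that the counting loop only covers the index properties 0..length-1, while for...in visits 'key' as well.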