Javascript Array lookup efficiency: associative vs. stored associative? - javascript

I've been reading, and they're saying that associative arrays won't give you the same efficiency as arrays. An associative array can look things up in O(N) time, where an array can look things up in O(1).
Here's my question: which one would be more efficient in terms of looking up values quickly and not hogging too much memory?
Associative:
var myVars=new Array();
myVars['test1'] = a;
myVars['test2'] = b;
myVars['test3'] = c;
... (up to 200+ values)
console.log(myVars['test2']);
Stored Associative:
var myVars=new Array();
var TEST1 = 1;
var TEST2 = 2;
var TEST3 = 3;
... (up to 200+ values)
myVars[TEST1] = a;
myVars[TEST2] = b;
myVars[TEST3] = c;
... (up to 200+ values)
console.log(myVars[TEST2]);

First, the first usage of Array is wrong. Although it is possible to do it, it does not mean you should. You are "abusing" the fact that arrays are objects too. This can lead to unexpected behaviour, e.g. although you add 200 values, myVars.length will be 0.
Don't use a JavaScript array as associative array. Use plain objects for that:
var myVars = {};
myVars['test1'] = a;
myVars['test2'] = b;
myVars['test3'] = c;
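As a minimal sketch of the length behaviour described above (the numbers 1 and 2 are placeholder values standing in for a and b, and the names asArray and asObject are made up for illustration):

```javascript
// String keys on an array become ordinary properties; length ignores them.
const asArray = [];
asArray['test1'] = 1;
asArray['test2'] = 2;
console.log(asArray.length);               // 0

// A plain object is the idiomatic container for named keys.
const asObject = {};
asObject['test1'] = 1;
asObject['test2'] = 2;
console.log(Object.keys(asObject).length); // 2
```

The values are still reachable through asArray['test1'], which is why the misuse often goes unnoticed until length or an array method is involved.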
Second, in JavaScript there is no real difference between the two (objects and arrays). Arrays extend objects and add some behaviour, but they are still objects. The elements are stored as properties of the array.
You can find more information in the specification:
Array objects give special treatment to a certain class of property names. A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32−1. (...)
So both:
var obj = {'answer': 42};
obj['answer'];
and
var arr = [42];
arr[0];
have the same access time†, which is definitely not O(n).
†: It is better to say should have. Apparently this varies in different implementations.
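The array index rule quoted from the specification can be approximated in script. This is only a sketch: isArrayIndex is a hypothetical helper name, it expects a string argument, and it relies on the fact that the >>> 0 operation applies ToUint32.

```javascript
// Hypothetical helper: true when the property name p (a string) is an array
// index under the quoted rule:
//   ToString(ToUint32(p)) === p  and  ToUint32(p) !== 2^32 - 1
function isArrayIndex(p) {
  const n = p >>> 0; // unsigned right shift by 0 applies ToUint32
  return String(n) === p && n !== 4294967295;
}

console.log(isArrayIndex('0'));          // true
console.log(isArrayIndex('42'));         // true
console.log(isArrayIndex('01'));         // false (does not round-trip)
console.log(isArrayIndex('test1'));      // false
console.log(isArrayIndex('4294967295')); // false (2^32 - 1 is excluded)
```

Any key that fails this check, such as 'test1', is stored as an ordinary property even when the container is an array.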
Apart from that, your second example is horrible to maintain. If you assign numbers to variables, why not use the numbers directly?
var myVars = [];
myVars[0] = a;
myVars[1] = b;
myVars[2] = c;
Update:
More importantly: You have to choose the right data structure for your needs and this is not only determined by the access time of a single element, but also:
Are the keys consecutive numbers or arbitrary strings/numbers?
Do you have to access all (i.e. loop over all) elements of the collection?
Numerical arrays (arrays) and associative arrays (or hash tables/maps (objects in JS)) provide different solutions for different problems.

I posit that the present responses do not fully consider more practical use cases. I created this jsperf to demonstrate. While @Felix's jsperf demonstrates lookup speed, it's not performed on sufficiently large objects to be really useful. I think 10,000 simple properties is more reasonable. Further, you need to randomly select keys in the sequence to read, modify, delete and create to truly demonstrate the performance differences between the two types.
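The jsperf itself is not reproduced here. A rough sketch of the read half of such a test might look like the following; SIZE, LOOKUPS and timeLookups are assumed names, the timing uses Date.now() and is therefore coarse, and modify/delete/create are not covered.

```javascript
const SIZE = 10000;      // number of properties, as suggested above
const LOOKUPS = 100000;  // assumed number of random reads

// Build an object with string keys and an array with integer indexes.
const obj = {};
const arr = [];
for (let i = 0; i < SIZE; i++) {
  obj['key' + i] = i;
  arr[i] = i;
}

// Pre-compute the random picks so both containers see the same access
// pattern and key construction is kept out of the timed loop.
const pickedIndexes = [];
const pickedKeys = [];
for (let i = 0; i < LOOKUPS; i++) {
  const n = Math.floor(Math.random() * SIZE);
  pickedIndexes.push(n);
  pickedKeys.push('key' + n);
}

// Runs read(k) for every pick, prints the elapsed time, returns the sum.
function timeLookups(label, read) {
  const start = Date.now();
  let sum = 0;
  for (let k = 0; k < LOOKUPS; k++) {
    sum += read(k);
  }
  console.log(label + ': ' + (Date.now() - start) + ' ms');
  return sum;
}

const objSum = timeLookups('object', function (k) { return obj[pickedKeys[k]]; });
const arrSum = timeLookups('array', function (k) { return arr[pickedIndexes[k]]; });
```

Results from a loop like this vary between engines and runs, so treat the numbers as a rough indication only.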

First of all, whoever they are, feel free to ignore them.
Every decent implementation of every decent scripting language, including JavaScript, will give you associative arrays that are either O(log(n)) access time, or else O(1) average access time, O(n) worst case (which you almost never hit). Either way in practice a lookup is fast.
Arrays have O(1) guaranteed access time, which is incredibly fast. But in some scripting languages (e.g. PHP) there isn't even a native array type provided. They just use associative arrays for both.

Answer: test it out yourself.
Update: After some back-and-forth with Felix, it appears that array access is usually faster than both associative arrays and objects. This is not always the case, notably in Chrome. In Chrome 11 on Ubuntu 11, arrays are faster. In Chrome 11 on Mac OS 10.6 there is no notable difference between them.
These tests did not measure manipulation, only reading.

Related

Javascript Hashing Two Numbers

I have two numbers in javascript.
They are integral coordinates.
I also have an object: var regions = {};
I'd like to be able to access specific objects as quickly as possible.
E.g. there could be an object at (5,-2)
What's a fast way of creating unique hashes, that do not clash, for two numbers like this?
I'm assuming the "specific objects" you want to access are referenced by properties on regions. Object properties in JavaScript are named either by string names or by Symbol names. You'd be using strings in this case.
Since it's going to be by string, the simple solution is just create a string, and look it up with that:
var obj = regions[num1 + "," + num2];
That leaves it up to the JavaScript engine to do a good job looking up the property. JavaScript engines are very good (read: fast) at this, because they have to do it a lot.
How the engine does the lookup will depend in some measure on how the regions object is created. Wherever possible, modern engines will create objects that are effectively micro-classes and provide very quick resolution of property names. If you do certain things (like using delete on one of the properties), a modern engine may fall back to "dictionary" mode where it uses a hash lookup. This is still fast, just not as fast as the optimized form.
I doubt you'll find a quicker way than that. In theory, if regions were a contiguous array whose elements referenced contiguous arrays, the JavaScript engine could make those true arrays under the covers, but one of your example numbers is negative, which would prevent a true array from being used. And you'd still be doing two lookups (the first number, then the second), and there's no guarantee it would be faster. It would certainly be more complicated.
As you've said you'll receive the information in dribs and drabs, I'd go with a simple compound string key until/unless you have a performance problem with that solution, then look at trying to get contiguous arrays happening on your target engines (converting those negative indexes into something else, which could be problematic). I doubt you'll find that the lookup is ever the bottleneck.
You can use array notation:
var regions = {}
regions[[5,-2]] = myObject;
regions[[5,-2]]; // myObject
Object keys can't be arrays, so under the hood [5,-2] will be stringified to "5,-2". But I think that using array notation is prettier than stringifying manually.
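The stringification can be checked directly. In this small sketch, myObject is a placeholder value:

```javascript
const regions = {};
const myObject = { name: 'example region' }; // placeholder value

// The array used as a key is converted to the string "5,-2".
regions[[5, -2]] = myObject;

console.log(String([5, -2]));                    // "5,-2"
console.log(Object.keys(regions));               // [ '5,-2' ]
console.log(regions['5,-2'] === myObject);       // true
console.log(regions[5 + ',' + -2] === myObject); // true
```

Both notations therefore address the same property; the array form just builds the string implicitly on every access.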
Example:
var regions = {
'5,-2': { ... },
'1,3': { ... }
};
and then if you have the 2 numbers you could easily generate the key and access the corresponding object:
var x = 5;
var y = -2;
var obj = regions[x + ',' + y];
I used two functions, encode and decode.
var x = 10;
var y = -3.2;
function encode(x, y) {
return x + ',' + y;
}
function decode(code) {
var xy = code.split(',');
return [+xy[0], +xy[1]];
}
var code = encode(x, y);
console.log(code);
var xy = decode(code);
console.log(xy);
The resulting code can be used as a dictionary (object) key.

Why does javascript array with very high index numbers lead to crash / slow down / trouble?

Basically the code that broke my node js express server was this:
resultArr = [];
resultArr["test"] = [];
resultArr["test"][2015073012] = someObject;
when I changed this to this, it ran without problems
resultArr = [];
resultArr["test"] = {};
resultArr["test"][2015073012] = someObject;
I did work like this in a loop.
Why did it break my app?
As you found, you shouldn't be using arrays for this, you should be using objects. But you should go a step further and use an object for the top level as well. And since your 2015073012 value will be used as a string, it's a good practice to make it one from the start:
var results = {};
results.test = {};
results.test['2015073012'] = someObject;
or:
var results = {};
results['test'] = {};
results['test']['2015073012'] = someObject;
Now you won't have any problem in any JavaScript engine.
(As an aside, I changed the name from resultArr to results so the name doesn't make it sound like it's an array.)
JavaScript arrays are for cases where you have sequential entries like array[0], array[1], array[2], etc. When you have arbitrary strings or arbitrarily large numbers for keys, do not use an array, use an object.
Don't be confused by other languages such as PHP that have a single array type which serves both as a sequential 0,1,2,3,... array and as a dictionary of key-value pairs. JavaScript has both arrays and objects: use arrays for the sequential case and objects for the key-value case.
Back to your question, why did this code break:
resultArr = [];
resultArr["test"] = [];
resultArr["test"][2015073012] = someObject;
One possible explanation is that the JavaScript engine is doing exactly what you told it to do when you assign a value to the [2015073012] array index: it creates an array with 2,015,073,013 entries (one more than the value you gave, because array indexes start at 0). That is over two billion entries in your array! You can probably see how that would cause a problem - and it certainly isn't what you intended.
Other engines may realize that this is a ridiculously large number and treat it as a string instead of a number, as if you'd used an object instead of an array in the first place. (A JavaScript array is also an object and can have key-value pairs as well as numeric indices.)
In fact, I crossed my fingers and tried this in the JavaScript console in the latest version of Chrome, and it worked with no problem:
a = [];
a[2015073012] = {};
But you weren't as lucky. In any case, you should always use objects instead of arrays for this kind of use, to ensure that they are treated as key-value pairs instead of creating enormous arrays with mostly-empty elements.
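To see what the large index does to an array compared with an object, here is a sketch that assumes an engine which stores the assignment sparsely (as the Chrome console did in the experiment above). someObject is a placeholder.

```javascript
const someObject = { value: 1 }; // placeholder

// Array: one element is stored, but length jumps past two billion.
const sparse = [];
sparse[2015073012] = someObject;
console.log(sparse.length);              // 2015073013
console.log(Object.keys(sparse).length); // 1

// Object: the same key is just one property, with no length involved.
const results = { test: {} };
results.test['2015073012'] = someObject;
console.log(Object.keys(results.test).length);        // 1
console.log(results.test[2015073012] === someObject); // true
```

Anything that walks the array from 0 to length, such as a plain for loop or serialization, would then have to step through about two billion positions, which is one plausible way for a server to grind to a halt.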

Non-functionals of Arrays in JavaScript [duplicate]

The difference between a JavaScript Array, and Object is not very big. In fact it seems Array mainly adds the length field, so you can use both Arrays and Objects as numeric arrays:
var ar = new Array();
ar[0] = "foo";
ar["bar"] = "foo";
var ob = new Object();
ob[0] = "foo";
ob["bar"] = "foo";
console.assert([ob[0], ar["0"], ob["0"], ar.bar, ob.bar].every(function (v) { return v === ar[0]; })); // Should be true.
So my question is, in popular JavaScript engines (V8, JavaScriptCore, SpiderMonkey, etc.), how is this handled? Obviously we do not want our arrays to be actually stored as hash maps with key values! How can we be reasonably sure our data is stored as an actual array?
As far as I can see there are a few approaches engines could take:
Array is implemented exactly the same way as Object - as an associative array with string keys.
Array is a special case, with a std::vector-like array backing the numeric keys, and some density heuristic to prevent insane memory use if you do ar[100000000] = 0;
Array is the same as Object, and all objects get a heuristic to see if using an array would make more sense.
Something insanely complicated that I haven't thought of.
Really this would be simpler if there were a proper array type (cough WebGL typed arrays cough).
In SpiderMonkey, arrays are implemented basically as C arrays of jsvals. These are referred to as "dense arrays". However, if you start doing un-array-like things to them -- like treating them like objects -- their implementation is changed to something which very much resembles objects.
Moral of the story: when you want an array, use an array. When you want an object, use an object.
Oh, a jsval is a sort of variant type which can represent any possible JavaScript value in a 64 bit C type.
In V8 and Carakan (and presumably Chakra), all (non-host) objects (both those that are arrays and those that aren't) with properties whose names are array indexes (as defined in ES5) are stored as either a dense array (a C array containing some value wrapper) or a sparse array (which is implemented as a binary search tree).
The unified object representation shows through in that it affects enumeration order: with an object, SpiderMonkey and SquirrelFish both give all properties in insertion order; and with an array, they in general (there are special cases in SM at least!) give array indexes first, then all other properties in insertion order. V8, Carakan, and Chakra always give array indexes first, then all other properties in insertion order, regardless of object type.
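The "array indexes first" ordering can be observed from script. The sketch below reflects what current engines do (this ordering was later written into the ECMAScript specification); the older engine versions discussed above may have behaved differently for plain objects.

```javascript
// Plain object: index-like keys come first in ascending order,
// then the remaining string keys in insertion order.
const obj = {};
obj.b = 'first';
obj[2] = 'second';
obj.a = 'third';
obj[1] = 'fourth';
console.log(Object.keys(obj)); // [ '1', '2', 'b', 'a' ]

// Array: same rule.
const arr = [];
arr.b = 'first';
arr[1] = 'second';
arr[0] = 'third';
console.log(Object.keys(arr)); // [ '0', '1', 'b' ]
```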

Javascript Array index fundamentals

I am not sure how the Javascript engines (specifically browser engines) store an array.
For example - how much memory would this use?
var x = new Array(0, 1, 2, 1000, 100000000);
I want to map integer dates as array indexes, but I need to be sure it isn't a bad idea.
Arrays are "special" in only a couple ways:
They've got some interesting array-like methods from their prototype ("slice()" etc)
They've got a "magic" length property that tracks the largest numeric property "name"
If you store something at position 10299123 in a brand-new array, the runtime does not use up all your memory allocating an actual, empty array. Instead, it stores whatever you want to store and makes sure that length is updated to 10299124.
Now the problem specifically with dates, if you're talking about storing the timestamp, is that (I think) they're bigger than 32-bit integers. Array indexes are limited to that size. However, all that means is that length won't be correct. If you don't really care about any of the array stuff anyway, then really all you need is a plain object:
var dateStorage = {};
dateStorage[someDate.getTime()] = "whatever";
JavaScript objects can be used as name-value maps as long as the name can be represented as a string (which is clearly true for numbers).
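A short sketch of the point about timestamps and length. The date is an arbitrary example and the variable names are illustrative:

```javascript
const someDate = new Date(Date.UTC(2015, 6, 30)); // arbitrary example date
const stamp = someDate.getTime();                 // milliseconds since the epoch

// Millisecond timestamps are far larger than the biggest array index.
console.log(stamp > Math.pow(2, 32) - 1); // true

// On an array, such a key is an ordinary property and length is untouched.
const asArray = [];
asArray[stamp] = 'whatever';
console.log(asArray.length);              // 0

// A key within the index range does update length.
const smallIndex = [];
smallIndex[10299123] = 'whatever';
console.log(smallIndex.length);           // 10299124

// A plain object handles either kind of key the same way.
const dateStorage = {};
dateStorage[stamp] = 'whatever';
console.log(dateStorage[stamp]);          // "whatever"
```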

How are JavaScript arrays implemented?

Namely, how does the following code:
var sup = new Array(5);
sup[0] = 'z3ero';
sup[1] = 'o3ne';
sup[4] = 'f3our';
document.write(sup.length + "<br />");
output '5' for the length, when all you've done is set various elements?
My 'problem' with this code is that I don't understand how length changes without calling a getLength() or a setLength() method. When I do any of the following:
a.length
a['length']
a.length = 4
a['length'] = 5
on a non-array object, it behaves like a dict / associative array. When I do this on the array object, it has special meaning. What mechanism in JavaScript allows this to happen? Does JavaScript have some type of property system which translates
a.length
a['length']
into "get" methods and
a.length = 4
a['length'] = 5
into "set" methods?
Everything in JavaScript is an object. In the case of an Array, the length property returns the size of the internal storage area for indexed items of the array. Some of the confusion may come into play in that the [] operator works for both numeric and string arguments. For an array, if you use it with a numeric index, it returns/sets the expected indexed item. If you use it with a string, it returns/sets the named property on the array object - unless the string corresponds to a numeric value, then it returns the indexed item. This is because in JavaScript array indexes are coerced to strings by an implicit toString() call. Frankly, this is just one more of those things that makes you scratch your head and say "JavaScript, this, this is why they laugh at you."
The actual underlying representation may differ between browsers (or it may not). I wouldn't rely on anything other than the interface that is supplied when working with it.
You can find out more about JavaScript arrays at MDN.
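The way the [] operator treats numbers and strings on an array can be checked with a few lines. This is a sketch with made-up values:

```javascript
const arr = ['zero', 'one'];

// A numeric index and its canonical string form name the same element.
console.log(arr[1] === arr['1']); // true

// Assigning through the string form still counts as an indexed write.
arr['2'] = 'two';
console.log(arr.length);          // 3

// Strings that are not canonical indexes become named properties instead.
arr['02'] = 'not an index';
arr.label = 'named';
console.log(arr.length);          // 3
console.log(arr['02']);           // "not an index"
console.log(arr[2]);              // "two"
```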
Characteristics of a JavaScript array
Dynamic - Arrays in JavaScript can grow dynamically, for example via .push
Can be sparse - for example, array[50000] = 2;
Can be dense - for example, array = [1, 2, 3, 4, 5]
In JavaScript, it is hard for the runtime to know whether the array is going to be dense or sparse. So all it can do is take a guess. All implementations use a heuristic to determine if the array is dense or sparse.
For example, code in point 2 above, can indicate to the JavaScript runtime that this is likely a sparse array implementation. If the array is initialised with an initial count, this could indicate that this is likely a dense array.
When the runtime detects that the array is sparse, it is implemented in a similar way to an object. So instead of maintaining a contiguous array, a key/value map is built.
For more references, see How are JavaScript arrays implemented internally?
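The internal representation an engine picks is not visible from script, but the holes in a sparse array are. The sketch below only shows that observable side and makes no claim about which storage strategy a given engine chooses:

```javascript
const dense = [1, 2, 3, 4, 5];

const sparse = [];
sparse[50000] = 2;

// length covers the whole index range...
console.log(sparse.length);   // 50001

// ...but only one index actually exists; the rest are holes.
console.log(0 in sparse);     // false
console.log(50000 in sparse); // true

// Iteration methods such as forEach skip the holes.
let visited = 0;
sparse.forEach(function () { visited++; });
console.log(visited);         // 1

// In the dense array every index exists.
console.log(Object.keys(dense).length); // 5
```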
This really depends on what you intend to do with it.
[].length is "magical".
It doesn't actually return the number of items in the array. It returns one more than the largest index in use.
var testArr = []; testArr[5000] = "something"; testArr.length; // 5001
But the method behind the setter is hidden in the engine itself.
Some engines in some browsers will give you access to their implementations of those magic-methods.
Others will keep everything completely locked down.
So don't rely on __defineGetter__ and __defineSetter__ methods, or even, really, __proto__, unless you know which browsers you're targeting and which you aren't.
This will change in the future, where opt-in applications written in ECMAScript Next/6 will have access to more.
ECMAScript 5-compliant browsers are already starting to offer get and set magic methods in objects and there's more to come... ...but it's probably a while away before you can dump support for oldIE and a tonne of smartphones, et cetera...
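As an illustration of the ES5 get and set mechanism mentioned above, here is a sketch of a hypothetical array-like object whose length is an accessor property. It only imitates the visible behaviour; real arrays are not specified this way (their length is a data property with special assignment rules), and the names list and items are invented for the example.

```javascript
// Hypothetical array-like object backed by a real array in "items".
const list = { items: ['a', 'b', 'c', 'd', 'e'] };

Object.defineProperty(list, 'length', {
  get: function () {
    return this.items.length;
  },
  set: function (n) {
    // Truncate or extend the backing storage, loosely imitating Array.
    this.items.length = n;
  }
});

console.log(list.length);    // 5 (runs the getter)
list.length = 3;             // runs the setter
console.log(list.items);     // [ 'a', 'b', 'c' ]
console.log(list['length']); // 3 (bracket access runs the getter too)
```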
It is important to know that when you do sup['look'] = 4; you are not using an associative array, but rather modify properties on the object sup.
It is equivalent to sup.look = 4; since you can dynamically add properties on JavaScript objects at any time. sup['length'] would, for instance, output 5 in your first example.
To add to tvanfosson's answer: In ECMA-262 (the 3.0 specification, I believe), arrays are simply defined as having this behavior for setting properties (See 15.4.5.1). There's no general mechanism underlying it (at least as of now) - this is just how it's defined, and how JavaScript interpreters must behave.
As other people have mentioned, a property in JavaScript can basically act as both a getter and a setter of your array (or string or other inputs).
As a matter of fact, you might try this yourself:
const test = [1, 2, 3, 4, 5]
test.length = 3
console.log(test) // [1, 2, 3]
test.length = 5
console.log(test) // Guess what happens here!
As far as I know, arrays in JavaScript do not work exactly like associative arrays and you have elements which are put in memory as contiguously as possible (given that you can have arrays of mixed objects), depending on the JavaScript engine you are considering.
As a side note, I am a bit baffled that the most voted answer keeps spreading the over-simplified myth (or half-truth) of "everything being an object in JavaScript"; that is not exactly true, otherwise you will never study primitives, for example.
Try to do this:
const pippi = "pippi"
pippi.cat = "cat"
console.log(pippi.cat) // Will it work? Throw an error? Guess why again
Spoiler: the string is wrapped in a throwaway object for that specific operation on the second line, and then in the following one you are just going to access a property of the primitive which is not there (provided you did not play with String.prototype or the like), so you get undefined.
The Array constructor, being a function, inherits caller, constructor, length, and name properties from Function.prototype.
A JavaScript array is an object just like any other object, but JavaScript gives it special syntax.
arr[5] = "yo"
You can think of the above as if it were syntactic sugar for a hypothetical
arr.insert(5,"yo")
(no such built-in method exists; it is only a mental model). In that picture, it's what is inside the insert method that changes the value of arr.length
See my implementation of a customArray type here: http://jsfiddle.net/vfm3vkxy/4/
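The linked fiddle is not reproduced here. As a separate, minimal sketch of the same idea, a hypothetical CustomArray with an insert method that maintains length could look like this (all names are invented for illustration):

```javascript
// Hypothetical CustomArray: a plain object plus an insert method
// that keeps a length property up to date.
function CustomArray() {
  this.length = 0;
}

CustomArray.prototype.insert = function (index, value) {
  this[index] = value;
  // Grow length when writing at or beyond the current end.
  if (index >= this.length) {
    this.length = index + 1;
  }
};

const arr = new CustomArray();
arr.insert(0, 'hi');
arr.insert(5, 'yo');
console.log(arr.length); // 6
console.log(arr[5]);     // "yo"
```

Unlike a real array, a direct assignment such as arr[9] = 'x' would bypass insert here and leave length unchanged.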
