Objects and associative arrays in PHP/JS are hash maps,
so I worry about reliability if I store many elements in one.
I worry that data will be overwritten when a collision occurs.
Which hash is used in PHP/JS, and how can I find two colliding strings to test it?
[...] the array's internal hashing mechanism will convert your string
to an integer it can then use to address a bucket within the array.
PHP's arrays aren't true/real arrays - they are some sort of Linked
HashMap internally. Considering that multiple strings can boil down
to the same hash, each bucket is a list itself. If there are multiple
elements within the same bucket, each key has to be evaluated.
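To illustrate the bucket idea from the quote above, here is a minimal sketch of a hash map with chained buckets. It is not how PHP or any JavaScript engine is actually implemented; `toyHash` and `ChainedMap` are hypothetical names, and the hash is deliberately weak so that "ab" and "ba" collide.

```javascript
// Toy hash (an assumption for illustration): sums character codes,
// so anagrams such as "ab" and "ba" land in the same bucket on purpose.
function toyHash(key, bucketCount) {
  let sum = 0;
  for (let i = 0; i < key.length; i++) {
    sum += key.charCodeAt(i);
  }
  return sum % bucketCount;
}

class ChainedMap {
  constructor(bucketCount = 8) {
    // Each bucket is itself a list of { key, value } entries.
    this.buckets = Array.from({ length: bucketCount }, () => []);
  }

  set(key, value) {
    const bucket = this.buckets[toyHash(key, this.buckets.length)];
    const entry = bucket.find((e) => e.key === key);
    if (entry) {
      entry.value = value; // same key: overwrite the value
    } else {
      bucket.push({ key, value }); // different key, same bucket: append
    }
  }

  get(key) {
    const bucket = this.buckets[toyHash(key, this.buckets.length)];
    // Every key in the bucket is compared, as the quote describes.
    const entry = bucket.find((e) => e.key === key);
    return entry ? entry.value : undefined;
  }
}

const map = new ChainedMap();
map.set("ab", 1);
map.set("ba", 2); // collides with "ab" under toyHash
console.log(map.get("ab"), map.get("ba")); // 1 2
```

A collision only makes the lookup slower, because the full keys are compared inside the bucket; it does not overwrite the other entry.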
Extracted from PHP associative array's keys (indexes) limitations? (rodneyrehm answer)
Related
How can relatively modern languages such as Ruby, Python, and JS store multiple data types in arrays and still access any element of the array by its index in O(1) time?
As far as I understand, we do simple mathematics to determine the memory address pointing to any element, and we do so by the index multiplied by the size of each element of the array.
Firstly, neither the Ruby Language Specification nor the Python Language Specification nor the ECMAScript Language Specification prescribe any particular implementation strategy for arrays (or lists as they are called in Python). Every implementor is free to implement them however they wish.
Secondly, lumping them all together doesn't make much sense. For example, in ECMAScript, arrays are really just objects with numeric properties, and actually, those numeric properties aren't even really numeric, they are strings.
Third, they don't really store multiple data types. E.g. Ruby only has one data type: objects. Since everything is an object, everything has the same type, so there is no problem storing objects in arrays.
Fourth, at least the Ruby Language Specification does not actually guarantee that array access is O(1). It is highly likely that a Ruby Implementation which does not provide O(1) access would be rejected by the community, but it would not violate any spec.
Now, of course, any implementor is allowed to be as clever as they want to be. E.g. V8 detects when all values of an array are numbers and then stores the array differently. But that is a private internal implementation detail of V8.
In memory, a heterogeneous array is an array of pointers. Every array element stores the memory address of the item at that position in the array.
Since memory addresses are all the same size, you can find the address of each slot by multiplying the array index by the address size, and adding it to the base address of the array.
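The arithmetic can be sketched like this. The base address and the 8-byte pointer size are made-up numbers for illustration (a 64-bit machine is assumed), and `slotAddress` is a hypothetical helper, not something a JavaScript program can observe directly.

```javascript
// Assumption: 64-bit pointers, so every slot in the array is 8 bytes wide.
const POINTER_SIZE = 8;

// Address of the slot holding the pointer for element `index`.
function slotAddress(baseAddress, index) {
  return baseAddress + index * POINTER_SIZE;
}

console.log(slotAddress(0x1000, 0).toString(16)); // 1000
console.log(slotAddress(0x1000, 3).toString(16)); // 1018
```

The cost is one multiplication and one addition regardless of the index, which is why the lookup is O(1) even when the pointed-to items have different types and sizes.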
I have two key strings, representing two objects of two different classes of objects. The keys are unique within their classes. I need to create a unique key representing the combination of these two keys. How do I do this?
I can't just concatenate them, because any character could be present in the base keys. I could maybe escape the concatenation delimiter before I concatenate? I could create a hash of the two keys, but this feels heavy.
Are there built in functions to help me do this? Is this "called something"? It seems like this would be a common (solved) problem.
(I'm looking for this specifically in Javascript, but I'm curious about higher level solutions or frameworks as well)
Simply concatenate the two strings, but prefix the first string with its length, and a separator. For example, if the input strings are "cat" and "doorbell", the output string would be "3:catdoorbell". On the other hand, if the inputs were "catdoor" and "bell", the output would be "7:catdoorbell".
With this method, the outputs are unique, and the original keys are recoverable from the output string.
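A minimal sketch of this length-prefix scheme, with hypothetical helper names `combineKeys` and `splitKey`:

```javascript
// Prefix the first key with its length and ":" so the boundary is unambiguous.
function combineKeys(first, second) {
  return `${first.length}:${first}${second}`;
}

// Recover the two original keys from a combined key.
function splitKey(combined) {
  // The first ":" is always the separator, because only digits precede it.
  const sep = combined.indexOf(":");
  const firstLength = Number(combined.slice(0, sep));
  const rest = combined.slice(sep + 1);
  return [rest.slice(0, firstLength), rest.slice(firstLength)];
}

console.log(combineKeys("cat", "doorbell")); // 3:catdoorbell
console.log(combineKeys("catdoor", "bell")); // 7:catdoorbell
console.log(splitKey("7:catdoorbell"));
```

Because the length is counted and sliced in the same units (UTF-16 code units), the keys may contain any character, including ":" itself.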
Create a third type of object that has its own key (or not, the key is optional) to track all the combinations. If you are dynamically creating these entries you may want something to check and enforce the uniqueness of the values.
I usually call this a crosswalk table or a linking table.
The wiki page covering this topic calls it an Associative Entity
Wikipedia - Associative entity
So, I just learnt about python's implementation of a hash-table, which is dictionary.
So here are what I understand so far, please correct me if I'm wrong:
A dictionary is basically a data structure that contains key-value pairs.
When we want to search for a key, we can directly call dict[key]. This is possible because Python applies a hash function to the key. The hash result is the index of the value in the dictionary. This way, we can get to the value directly after computing the hash, instead of iterating through a list.
Python will grow the hash table by increasing the number of 'buckets' when the table is two-thirds full.
Python will always ensure that every 'bucket' has only 1 entry in it, so that lookup performance is optimal and no iteration is needed.
My first question is, am I understanding python dictionary correctly?
Second, does the JavaScript object also have all 4 of these features? If not, is there another built-in JavaScript implementation of a dictionary/hash table in general?
JavaScript Objects can be used as dictionaries, but see Map for details on a JavaScript Map implementation. Some key takeaways are:
The Object prototype can potentially cause key collisions
Object keys can be Strings or Symbols. Map keys can be any value.
There is no direct means of determining how many "map" entries an Object has, whereas Map.prototype.size tells you exactly how many entries it has.
As a rule of thumb: if you're creating what is semantically a collection (an associative array), use a Map. If you've got different types of values to store, use an Object.
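A short sketch of the three takeaways above, using only standard built-ins (the variable names are just for illustration):

```javascript
// 1. Prototype collisions: a plain object already "has" inherited keys.
const obj = {};
console.log("toString" in obj); // true, inherited from Object.prototype

// An object created without a prototype avoids that.
const bare = Object.create(null);
console.log("toString" in bare); // false

// 2. Map keys can be any value and keep their type.
const m = new Map();
const objectKey = { id: 1 };
m.set(objectKey, "keyed by an object");
m.set(1, "number one");
m.set("1", "string one");
console.log(m.get(1), "/", m.get("1")); // number one / string one

// 3. Map knows its size; for an object the keys have to be collected and counted.
console.log(m.size); // 3

// Object keys are coerced to strings, so 1 and "1" are the same property.
obj[1] = "number one";
obj["1"] = "string one";
console.log(Object.keys(obj).length); // 1
```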
TL;DR
Besides the convenient Array helper functions (which I could theoretically create for objects), and considering the performance advantage of Object lookups, what reason could be given to use an Array instead of an Object?
Objects
From what I understand, because JavaScript objects use hash tables to look up their key -> data pairs, the lookup time is very small no matter how many keys the object has.
For example, if I want a really fast dictionary lookup, in the past I've (and we can condense the syntax but that's beside the point) stored dictionary data in JSON as
"apple" : "apple",
and then used
if (Dictionary.apple) console.log("Yep it's a word!");
And the result returns very, very fast regardless of whether my dictionary contains 30,000 words or 300,000.
Arrays
On the other hand, unless I know the number an array item is attached to, I have to loop through the entire array, causing larger lookup times the further the item is down the list.
The good thing I know of about using an array is that I get access to convenient functions such as slice, but these could probably be created for use with objects.
My Question
So, considering the lookup efficiency of objects, I'd currently choose an object over an array for every situation. But I could easily be wrong about this.
Besides the convenient Array helper functions (which I could theoretically create for objects), and considering the performance advantage of Object lookups, what reason could be given to use an Array instead of an Object?
You're comparing apples to oranges here. If you need to map from arbitrary string keys to values, as in your example with "apple", then you use an object. (In ES2015, you might alternatively use a Map instance.)
If you have a whole bunch of oranges, and you want to keep them in a list numbered from 0, you put the oranges in an array and index by which (numbered) orange you want.
The process of locating a property on an object is the same whether the object is a plain Object instance or an Array instance. In modern JavaScript runtime environments, it's safe to assume that the process for looking up number-indexed array properties is appropriately optimized to be even faster than the hash lookup for arbitrary string-named properties. That, however, is a completely separate issue from the nature of the work you need to do and the choice of data structure. Either you have a list of things, such that the order of the things in the list is the salient relationship between them, or you have named things that you need to access by those names. The two situations are conceptually different.
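A small sketch of the two situations. The word list is made up, and `Set` is used here as one possible choice for name-based membership tests (the question used a plain object):

```javascript
// A list: the position of each item is what matters.
const words = ["apple", "banana", "cherry"];
console.log(words[1]); // banana (direct access by index)
console.log(words.includes("cherry")); // true, found by scanning the list

// Named things: access by the name itself.
const wordSet = new Set(words);
console.log(wordSet.has("cherry")); // true, found by hashed lookup
console.log(wordSet.has("constructor")); // false

// A plain object used as a dictionary also exposes inherited properties,
// so a truthiness check can report words that were never added.
const dict = { apple: "apple" };
console.log(Boolean(dict.apple)); // true
console.log(Boolean(dict.constructor)); // true, inherited from Object.prototype
```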
One big difference is the order of elements.
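Looping through an object's keys does not necessarily give you the order in which you wrote them: historically no order was guaranteed, and current engines list integer-like keys first, in ascending order.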
Looping through array keys will always give you the same order of elements.
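For what it's worth, engines that follow ES2015 and later do produce a defined order for an object's own keys, but it is not always the order in which the keys were written: integer-like keys come first in ascending order, then string keys in insertion order. A quick sketch with made-up data:

```javascript
const scores = { b: 1, 10: "ten", a: 2, 2: "two" };

// Integer-like keys are moved to the front and sorted numerically.
console.log(Object.keys(scores)); // ["2", "10", "b", "a"]

// An array keeps exactly the order in which the elements were placed.
const list = ["b", "ten", "a", "two"];
list.forEach((item, index) => console.log(index, item));
```

If the order of the elements matters, an array (or a `Map`, which iterates in insertion order) expresses that more directly than an object.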
As far as I know, unlike arrays, object hashmaps use literal keys to query the value. Is it possible to query an object hashmap using an index, like one would do with arrays? I know this is possible using frameworks like underscore.js, but I just want a vanilla JavaScript method if it is possible at all.
Basically, an Array is a data structure loosely defined as a collection of values (which may be of different types in JS), whereas an Object or hashmap is (traditionally) a record of key-value pairs, i.e. each value is mapped to a key.
So the elements of an Array can be accessed by index, while an Object is accessed by key. But you can certainly have an array of Objects, which you can then access by index.
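If positional access to a plain object is really needed, one vanilla approach is to turn its keys, values, or entries into an array first. This is only a sketch: `valueAt` is a hypothetical helper, and it relies on the object's key order (integer-like keys are listed first, then string keys in insertion order).

```javascript
const person = { name: "Ada", role: "engineer", city: "London" };

// Object.entries gives an array of [key, value] pairs that can be indexed.
const entries = Object.entries(person);
const [key, value] = entries[1];
console.log(key, value); // role engineer

// Hypothetical helper: the value at a given position in key order.
function valueAt(obj, index) {
  return Object.values(obj)[index];
}

console.log(valueAt(person, 0)); // Ada
console.log(valueAt(person, 5)); // undefined (out of range)
```

Each call builds a new array, so for frequent positional access it is usually simpler to store the data in an array of objects in the first place.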
Hope this answers your query.