JavaScript: memory/efficiency of associative arrays? - javascript

I am building a tree-like data structure out of associative arrays. Each key is 1-2 characters. Keys are unique within their respective level. There will be no more than 40 keys on the root level and no more than 5 keys on each subsequent level of the tree. It might look something like this:
{a:{b:null,c:null},de:{f:{g:null}},h:null,i:null,j:null,k:null}
Initially, I thought that creating so many objects with so few keys (on average, < 3) would be inefficient and memory intensive. In that case, I would implement my own hash table like so:
//Suppose keys is a multi-dimensional array [[key,data],...]
//Must be called with `new`, since get is attached to `this`
var hash = function(keys){
    var max = keys.length*3, tbl = [];
    //Get key hash value
    var code = function(key){
        return (key.charCodeAt(0)*31)%max;
    };
    //Get key values
    this.get = function(key){
        //2 character keys actually have a separate hash generation algorithm...
        //we'll ignore them for now
        var map = code(key), i=map;
        //Find the key value (linear probing; an empty slot means "not found")
        while(true){
            if (typeof tbl[i] == 'undefined') return false;
            if (code(tbl[i][0]) == map && tbl[i][0] == key) return tbl[i][1];
            else i = (i+1)%max;
        }
    };
    //Instantiate the class
    for (var i=0; i<keys.length; i++){
        var index = code(keys[i][0]);
        while(typeof tbl[index] != 'undefined')
            index = (index+1)%max;
        tbl[index] = keys[i];
    }
};
Then, I read somewhere that JavaScript's arrays are sometimes implemented as associative arrays when sparsely filled, which could defeat the purpose of making my own hash structure. But I'm not sure. So, which would be more efficient, in terms of memory and speed?

Read this article: http://mrale.ph/blog/2011/11/05/the-trap-of-the-performance-sweet-spot.html
Basically due to the dynamic nature of JavaScript, your data structures will not be very efficient. If you do need very efficient data structures, you should try using the new Typed Arrays introduced recently.
If you aren't into theoretical results, Resig has done real-world performance testing on different types of trees, looking at data size as well as parsing and processing performance: http://ejohn.org/blog/javascript-trie-performance-analysis/

Your solution, if I understand it correctly, will definitely perform worse. You express a concern with this:
[...] creating so many objects with so few keys (on average, < 3) [...]
but your solution is doing the same thing. Every one of your nested hashes will still be an object with a small number of keys, only now one of those keys is a closure named get (which will have higher memory requirements, since it implicitly closes over variables such as tbl and code, where code is another closure . . .).
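For comparison, here is a minimal sketch of the plain nested-object approach from the question, with no custom hash table at all. The helper name hasPath is hypothetical and only there for illustration; it assumes leaves are stored as null, as in the question's example.

```javascript
// Tree built from plain nested objects, as in the question's example.
var tree = {a: {b: null, c: null}, de: {f: {g: null}}, h: null};

// Hypothetical helper: walk the tree along an array of keys.
// Returns true if every key on the path exists, false otherwise.
function hasPath(root, path) {
  var node = root;
  for (var i = 0; i < path.length; i++) {
    // A null node is a leaf, so there is nothing below it.
    if (node === null ||
        !Object.prototype.hasOwnProperty.call(node, path[i])) {
      return false;
    }
    node = node[path[i]];
  }
  return true;
}
```

Each level is just an ordinary object, so the engine's own property lookup does the work that the hand-written probing loop would otherwise do.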

Related

Checking if element exists in array without iterating through it

my array:
var tempListArray = [{"id":"12","value":false},{"id":"10","value":false},{"id":"9","value":false},{"id":"8","value":false}];
To check if an element exists I would do this:
for (var i in tempListArray) {
//check flag
if (tempListArray[i].id == Id) {
flagExistsLoop = 1;
break;
}
}
Is there any way I can check if an Id exists without looping through the whole array? Basically I am worried about performance if, say, I have 100 elements.
Thanks
No, without using custom dictionary objects (which you seriously don't want to for this) there's no faster way than doing a 'full scan' of all contained objects.
As a general rule of thumb, don't worry about performance in any language or any situation until the total number of iterations hits 5 digits, most often 6 or 7. Scanning a table of 100 elements should be a few milliseconds at worst. Worrying about performance impact before you have noticed performance impact is one of the worst kinds of premature optimization.
No, you can't know that without iterating the array.
However, note for...in loops are a bad way of iterating arrays:
There is no guarantee that it will iterate the array in order
It will also iterate (enumerable) non-numeric own properties
It will also iterate (enumerable) properties that come from the prototype, i.e., defined in Array.prototype and Object.prototype.
I would use one of these:
for loop with a numeric index:
for (var i=0; i<tempListArray.length; ++i) {
if (tempListArray[i].id == Id) {
flagExistsLoop = 1;
break;
}
}
Array.prototype.some (EcmaScript 5):
var flagExistsLoop = tempListArray.some(function(item) {
return item.id == Id;
});
Note it may be slower than the other ones because it calls a function at each step.
for...of loop (EcmaScript 6):
for (var item of tempListArray) {
if (item.id == Id) {
flagExistsLoop = 1;
break;
}
}
Depending on your scenario, you may be able to use Array.indexOf() which will return -1 if the item is not present.
Granted it is probably iterating behind the scenes, but the code is much cleaner. Also note how object comparisons are done in javascript, where two objects are not equal even though their values may be equal. See below:
var tempListArray = [{"id":"12","value":false},{"id":"10","value":false},{"id":"9","value":false},{"id":"8","value":false}];
var check1 = tempListArray[2];
var check2 = {"id":"9","value":false};
doCheck(tempListArray, check1);
doCheck(tempListArray, check2);
function doCheck(array, item) {
var index = array.indexOf(item);
if (index === -1)
document.write("not in array<br/>");
else
document.write("exists at index " + index + "<br/>");
}
Try php.js. It may help, since it lets you use the same function names as in PHP, and it has some useful functionality.
There is no way without iterating through the elements (that would be magic).
But, you could consider using an object instead of an array. The object would use the (presumably unique) id value as the key, and the value could have the same structure you have now (or without the redundant id property). This way, you can efficiently determine if the id already exists.
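A minimal sketch of that idea, assuming the ids are unique strings as in the question's data. The function name indexById is made up for this example.

```javascript
var tempListArray = [
  {id: "12", value: false},
  {id: "10", value: false},
  {id: "9", value: false},
  {id: "8", value: false}
];

// Build the lookup object once; this is the only full pass over the array.
function indexById(list) {
  // Object.create(null) avoids inherited keys such as "toString".
  var byId = Object.create(null);
  for (var i = 0; i < list.length; i++) {
    byId[list[i].id] = list[i];
  }
  return byId;
}

var byId = indexById(tempListArray);
var exists = "9" in byId;   // membership test without a loop in your code
```

The trade-off is that the object has to be kept in sync whenever items are added to or removed from the array.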
There is a possible cheat for limited cases :) and it is magic...cough cough (math)
imagine you have 3 elements:
1
2
3
and you want to know if one of these is in an array without iterating it...
we could make a number that contains a numerical flavor of the array. we do this by assigning prime numbers to the elements:
1 - 2
2 - 3
3 - 5
The array is then represented by a single number, Flavor (which has to start at 1, the empty product). When we add item 2, we first check that the array doesn't already contain the prime associated with that item, by checking (if Flavor!=0 && (Flavor%3)!=0), and then multiply the prime in: Flavor*=3;
Now we can tell that the second element is in the array by looking at the number:
if Flavor!=0 && (Flavor%3)==0 // it's there!
Of course, this is limited to the numerical representation that can be handled by the computer, and for small array sizes (1-3 elements) it might still be faster to scan. But it's just one idea.
The basis is pretty sound. However, this method becomes unusable if you cannot correlate elements one to one with a set of primes. You'll want to have the primes calculated in advance, and verify that their product is less than the maximum number you can represent. (Also be careful with floating point, because it might not be able to represent the product exactly at higher values due to the gaps between representable values.) You will probably have the best luck with an unsigned integer type.
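Here is a small sketch of the prime trick, assuming the three possible elements 1, 2 and 3 from the example above. The names PRIME_FOR and PrimeSet are invented for illustration, and the product starts at 1 so that multiplication works.

```javascript
// Hypothetical mapping from each possible element to its own prime.
var PRIME_FOR = {1: 2, 2: 3, 3: 5};

function primeFor(item) {
  var p = PRIME_FOR[item];
  if (p === undefined) throw new Error("no prime assigned to " + item);
  return p;
}

function PrimeSet() {
  this.flavor = 1; // empty product: nothing has been added yet
}

// An item is present exactly when its prime divides the product.
PrimeSet.prototype.has = function (item) {
  return this.flavor % primeFor(item) === 0;
};

// Multiply the prime in only once, so the product stays small.
PrimeSet.prototype.add = function (item) {
  if (!this.has(item)) this.flavor *= primeFor(item);
};
```

With JavaScript numbers the product is only exact up to Number.MAX_SAFE_INTEGER, so this stays a toy for a handful of elements.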
This method will probably be too limiting. And there is something else you can do to possibly speed up your system if you don't want to iterate the entire array.
Use different structures:
dictionaries/maps/trees etc.
If you're attached to the array, another method is a Bloom filter. It can tell you that an element is definitely not in your set (it may occasionally give a false "maybe"), which can be just as useful.
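To make the Bloom filter idea concrete, here is a rough sketch. The two hash functions and the size of 64 are arbitrary choices for illustration, not tuned values, and the whole BloomFilter type is hypothetical rather than taken from any library.

```javascript
function BloomFilter(size) {
  this.size = size;
  this.bits = new Uint8Array(size); // one byte per bit, for simplicity
}

// Two simple string hashes (illustrative only).
BloomFilter.prototype.positions = function (key) {
  var h1 = 0, h2 = 5381;
  for (var i = 0; i < key.length; i++) {
    var c = key.charCodeAt(i);
    h1 = (h1 * 31 + c) >>> 0;
    h2 = ((h2 * 33) ^ c) >>> 0;
  }
  return [h1 % this.size, h2 % this.size];
};

BloomFilter.prototype.add = function (key) {
  var p = this.positions(String(key));
  this.bits[p[0]] = 1;
  this.bits[p[1]] = 1;
};

// false means "definitely absent"; true only means "possibly present".
BloomFilter.prototype.mightContain = function (key) {
  var p = this.positions(String(key));
  return this.bits[p[0]] === 1 && this.bits[p[1]] === 1;
};

var filter = new BloomFilter(64);
["12", "10", "9", "8"].forEach(function (id) { filter.add(id); });
```

A true result still needs to be confirmed by a real scan; only the false result lets you skip the loop.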

JavaScript object vs. array lookup performance

What is the performance difference between retrieving the value by key in a JavaScript object vs iterating over an array of individual JavaScript objects?
In my case, I have a JavaScript object containing user information where the keys are the user's IDs and the values are each user's information.
The reason I ask this is because I would like to use the angular-ui-select module to select users, but I can't use that module with a Javascript object - it requires an array.
How much, if anything, am I sacrificing by switching from a lookup by key, to a lookup by iteration?
By key:
var user = users[id];
By iteration
var user;
for (var i = 0; i < users.length; i ++) {
if (users[i].id == id) {
user = users[i]; break;
}
}
The answer to this is browser dependent; however, there are a few performance tests on jsperf.com on this matter. It also comes down to the size of your data. Generally it is faster to use object key-value pairs when you have large amounts of data. For small datasets, arrays can be faster.
Array search will perform differently depending on where in the array your target item is. Object search will have more consistent performance, as keys don't have a specific order.
Also, looping through an array is faster than looping through an object's keys, so if you plan on doing operations on all items, it can be wise to put them in an array. In some of my projects I do both, since I need to do bulk operations as well as fast lookups by identifier.
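A sketch of the "do both" approach for the question's case: keep the object for lookups and derive an array from it for the widget that requires one. The sample users and the helper name toArray are made up for this example.

```javascript
// Hypothetical user data keyed by id, as described in the question.
var users = {
  "17": {id: "17", name: "Ann"},
  "42": {id: "42", name: "Bob"}
};

// Derive an array of the same user objects (not copies).
function toArray(obj) {
  return Object.keys(obj).map(function (key) {
    return obj[key];
  });
}

var userList = toArray(users); // hand this to the array-only component
var user = users["42"];        // keep using the object for lookup by id
```

Because both structures hold references to the same objects, edits to a user show up in both; only additions and removals need to be applied twice.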
A test:
http://jsben.ch/#/Y9jDP
This problem touches all programming languages. It depends on many factors:
size of your collection - arrays will get slower when you are searching for the last key and the array is quite long
can elements repeat themselves - if yes, then you need an array. If no, you need either a dictionary (map), or you need to write an add method that iterates your array on each add to find possible duplicates, which can be troublesome when dealing with large lists
average key usage - you will lose performance if the most requested userId is at the end of the list.
In your example a map would be the better solution.
Secondly, you need to add a break to your code :)
var user;
for (var i = 0; i < users.length; i ++) {
if (users[i].id == id) {
user = users[i]; break;
}
}
Or you will lose performance:)
Associative arrays are much slower than arrays with numbered indexes, because associative arrays work by doing string comparisons, which are much, much slower than number comparisons!

Which javascript structure has a faster access time for this particular case?

I need to map specific numbers to string values. These numbers are not necessarily consecutive, and so for example I may have something like this:
var obj = {};
obj[10] = "string1";
obj[126] = "string2";
obj[500] = "string3";
If I'm doing a search like this obj[126] would it be faster for me to use an object {} or an array []?
There will be no difference. ECMAScript arrays, if sparse (that is, they don't have consecutive indices set), are typically implemented as hash tables. In either case, lookup by key is expected to take constant time, O(1) on average, so this shouldn't concern you at all.
I created a microbenchmark for you; also check out the more comprehensive test by @Bergi. In my browser the object literal is a little bit slower, but not significantly. Try it yourself.
A JS array is an object, so it should not matter which you choose.
Created a jsperf test (http://jsperf.com/array-is-object) to demonstrate this.
Definitely an object should be the best choice.
If you have code like this:
var arr = [];
arr[10] = 'my value';
your array gets a length of 11:
alert(arr.length); // will show you 11
where the first 10 slots are undefined.
Obviously you don't need an array of length 1000 to store just one string:
var arr = [];
arr[999] = 'the string';
I should also note that in programming you have to choose the appropriate structure for each particular case.
Your task is to make a map of key: value pairs, and an object is the better choice here.
If your task was to make an ordered collection, then sure, you'd need an array.
UPDATE:
Answering to your question in comments.
Imagine that you have two "collections" - an array and an object. Each of them has only one key/index equal to 999.
If you need to find a value, you need to iterate through your collection.
For array you'll have 999 iterations.
For object - only one iteration.
http://jsfiddle.net/f0t0n/PPnKL/
var arrayCollection = [],
objectCollection = {};
arrayCollection[999] = 1;
objectCollection[999] = 1;
var i = 0,
l = arrayCollection.length;
for(; i < l; i++) {
if(arrayCollection[i] == 1) {
alert('Count of iterations for array: ' + i); // displays 999
}
}
i = 0;
for(var prop in objectCollection) {
i++;
if(objectCollection[prop] == 1) {
alert('Count of iterations for object: ' + i); // displays 1
}
}
In total:
You have to design an application properly and take into account possible future tasks that will require different manipulations of your collection.
If you need your collection to be ordered, you have to choose an array.
Otherwise an object could be a better choice, since the speed of access to its properties is roughly the same as the speed of access to an array's items, but searching for a value in an object will be faster than in a sparse array.
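As a side note to the iteration counts above, a sparse array does not have to be scanned slot by slot. The sketch below relies on two standard behaviours: Array.prototype.forEach skips indices that were never assigned, and Object.keys returns only the indices that exist.

```javascript
var arrayCollection = [];
arrayCollection[999] = 1;

var visited = 0;
arrayCollection.forEach(function (value, index) {
  visited++; // runs only for indices that were actually assigned
});

// Only the assigned index shows up as a key.
var keys = Object.keys(arrayCollection);
```

So the callback runs once here, even though arrayCollection.length is 1000. A plain for loop from 0 to length still visits every slot.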

How to document JavaScript/CoffeeScript data structures

I am looking for a descriptive way to document the used data structures in my JavaScript application. I find it hard to get this done due to the dynamic character of JavaScript.
For instance, what would be a good way to state that a variable distance is a two-dimensional array with dimensions i and j that stores numbers between -1 and MAX_INT? I could think of something like this:
distance[i][j] = -1 <= n <= MAX_INT
What about an object which is used as a map/dictionary for certain data types? What about a two-dimensional array where the first element of each array holds different data than the rest, etc.?
Of course, it is always possible to document these things in a text, I just thought, maybe there is a well known and used way to do this in a semiformal way.
Although it's not too widely adopted (yet?), there is a draft standard for JSON schema. I'm just learning it myself but you could write a schema for your two-dimensional array (wrapped inside of an object) as:
{
    "description":"Two dimensional array of numbers",
    "type":"object",
    "properties":{
        "two-d-array":{
            "description":"columns",
            "type":"array",
            "items":{
                "description":"rows",
                "type":"array",
                "items": {
                    "description":"values",
                    "type":"number",
                    "minimum":-1,
                    "maximum":Number.MAX_VALUE
                }
            }
        }
    }
}
or simply:
{
    "type":"array",
    "items":{
        "type":"array",
        "items": {
            "type":"number",
            "minimum":-1,
            "maximum":Number.MAX_VALUE
        }
    }
}
There is no CoffeeScript implementation that I know of, but there is a list of several JavaScript validators here. I'm playing with the one that's written by the spec authors called (simply enough) json-schema and I like it well enough calling it from CoffeeScript.
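If pulling in a validator feels like too much, the same constraints can be written as a small hand-rolled check that doubles as documentation. This is only a sketch: isDistanceMatrix is a made-up name, and the MAX_INT value below is an assumption standing in for whatever bound the application really uses.

```javascript
// Assumed upper bound; the question only calls it MAX_INT.
var MAX_INT = 2147483647;

// True if value is an array of arrays of numbers in [-1, MAX_INT].
function isDistanceMatrix(value) {
  if (!Array.isArray(value)) return false;
  return value.every(function (row) {
    return Array.isArray(row) && row.every(function (n) {
      // NaN fails both comparisons, so it is rejected as well.
      return typeof n === "number" && n >= -1 && n <= MAX_INT;
    });
  });
}
```

Calling it in a test or an assertion keeps the documented shape and the real data from drifting apart.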
What I tend to do in my JavaScript when I am replicating a lot of data models is to write out what their class definition would be in comments. I am not sure if this is what you meant with your question.
// JavaScript Class jsHomeLocation
// jsHomeLocation{
// string name
// string address
// }
var jsHomeLocation = {};
jsHomeLocation.name = "Travis";
jsHomeLocation.address = "California";
You could also use javascript objects to track the information of the example, a two-dimensional array
var distanceData = {};
distanceData.type = "two-dimensional array";
distanceData.length = i * j;
distanceData.min = -1;
distanceData.max = MAX_INT;

JavaScript Multidimensional arrays - column to row

Is it possible to turn a column of a multidimensional array into a row using JavaScript (maybe jQuery), without looping through it?
so in the example below:
var data = new Array();
//data is a 2D array
data.push([name1,id1,major1]);
data.push([name2,id2,major2]);
data.push([name3,id3,major3]);
//etc..
Is it possible to get a list of IDs from data without looping? Thanks
No, it is not possible to construct an array of IDs without looping.
In case you were wondering, you'd do it like this:
var ids = [];
for(var i = 0; i < data.length; i++)
ids.push(data[i][1]);
For better structural integrity, I'd suggest using an array of objects, like so:
data.push({"name": name1, "id": id1, "major":major1});
data.push({"name": name2, "id": id2, "major":major2});
data.push({"name": name3, "id": id3, "major":major3});
Then iterate through it like so:
var ids = [];
for(var i = 0; i < data.length; i++)
ids.push(data[i].id);
JavaScript doesn't really have multidimensional arrays. What JavaScript allows you to have is an array of arrays, with which you can interact as if it was a multidimensional array.
As for your main question, no, you would have to loop through the array to get the list of IDs. It means that such an operation cannot be done faster than in linear time O(n), where n is the height of the "2D array".
Also keep in mind that arrays in JavaScript are not necessarily represented in memory as contiguous blocks. Therefore any fast operations that you might be familiar with in other low level languages will not apply. The JavaScript programmer should treat arrays as Hash Tables, where the elements are simply key/value pairs, and the keys are the indices (0, 1, 2...). You can still access/write elements in constant time O(1) (at least in modern JavaScript engines), but copying of elements will often be done in O(n).
You could use the Array map function which does the looping for you:
var ids = data.map(function(x) { return x[1] });
Unfortunately, like many things on the web that would be really nice to use, older versions of Internet Explorer (IE8 and earlier) don't support it.
See this page for details on how the map function works:
https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Array/map
The good news is that the link above provides some nice code in the "Compatibility" section which will check for the existence of Array.prototype.map and define it if it's missing.
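If you would rather not patch Array.prototype, a small helper can do the same existence check locally and fall back to a plain loop. This is a sketch, and the function name column is hypothetical.

```javascript
// Return the values at position `index` from every row of a 2D array.
function column(rows, index) {
  if (typeof Array.prototype.map === "function") {
    return rows.map(function (row) { return row[index]; });
  }
  // Fallback for engines without Array.prototype.map.
  var out = [];
  for (var i = 0; i < rows.length; i++) {
    out.push(rows[i][index]);
  }
  return out;
}
```

For the question's data, column(data, 1) would give the list of IDs. Either branch still loops once over the rows; map only hides the loop.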
You don't need anything special: make a string by joining with newlines, and match the middle field of each line. (This assumes none of the values contain a comma or a newline.)
var data1=[['Tom Swift','gf102387','Electronic Arts'],
['Bob White','ea3784567','Culinarey Arts'],
['Frank Open','bc87987','Janitorial Arts'],
['Sam Sneer','qw10214','Some Other Arts']];
// Match a field that is followed by exactly one more field before the line ends
data1.join('\n').match(/[^,\n]+(?=,[^,\n]+(?:\n|$))/g)
/* returned value: (Array)
gf102387,ea3784567,bc87987,qw10214
*/
