javascript hash/array hybrid performance


For my application I need a collection that can do fast iteration and fast lookups by key.
Example Data
var data = [
{ myId: 4324, val: "foo"},
{ myId: 6280, val: "bar"},
{ myId: 7569, val: "baz"},
... x 100,000
];
The key is contained within the objects that I want to store. I hacked together Harray (Hash Array) https://gist.github.com/3451147.
Here's how you use it
// initialize with the key property name
var coll = new Harray("myId");
// populate with data
data.forEach(function(item){ coll.add(item); });
// key lookup
coll.h[4324] // => { myId: 4324, val: "foo"}
// array functionality
coll[1] // => { myId: 6280, val: "bar"}
coll.map(function(item){ return item.val; }); // => ["foo", "bar", "baz"]
coll.length // => 3
// remove value
coll.remove(coll[0]); // delete => { myId: 4324, val: "foo"}
// by key
coll.removeKey(7569) // delete => { myId: 7569, val: "baz"}
// by index
coll.removeAt(0); // delete => { myId: 6280, val: "bar"}
Removal speed seems to be the only tradeoff I can see. The stored objects are shared between the h Object and the Array so I'm not storing 2 copies of anything.
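To make the tradeoff concrete, here is a minimal sketch of the same idea: one ordered array plus one key index, both pointing at the same objects. This is not the code from the gist. The class name HashArraySketch, its items and index fields, and the use of a Map for the index are all assumptions made for illustration.

```javascript
// Hypothetical sketch of a hash/array hybrid; not the gist's Harray.
class HashArraySketch {
  constructor(keyName) {
    this.keyName = keyName; // property that holds each item's key
    this.items = [];        // ordered storage for fast iteration
    this.index = new Map(); // key -> item for fast lookup
  }

  add(item) {
    this.items.push(item);
    this.index.set(item[this.keyName], item); // same object, not a copy
  }

  get(key) {
    return this.index.get(key);
  }

  removeKey(key) {
    const item = this.index.get(key);
    if (item === undefined) return undefined;
    this.index.delete(key);
    // Finding the array position is a linear scan: the removal tradeoff.
    this.items.splice(this.items.indexOf(item), 1);
    return item;
  }
}

const sketch = new HashArraySketch("myId");
[
  { myId: 4324, val: "foo" },
  { myId: 6280, val: "bar" },
  { myId: 7569, val: "baz" }
].forEach(function (item) { sketch.add(item); });

console.log(sketch.get(6280).val); // "bar"
console.log(sketch.items.length);  // 3
sketch.removeKey(4324);
console.log(sketch.items.map(function (item) { return item.val; })); // ["bar", "baz"]
```

Lookups and iteration stay cheap here; only removeKey pays for the indexOf scan over the array.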
Question
Should I just stick with using for in to iterate through object properties?
Keep an array of the object's keys instead of the objects themselves?
other options?
Note: browser compatibility is not a factor. This is exclusively for chrome.

To find out whether a specialized collection is worth it, you must:
first verify that there is a performance problem. If it's fast enough, don't worry. If the whole page is slow, use a profiler such as the Chrome profiler to confirm that the bottleneck is in the collection you're currently using;
then check that the alternate collection you're building is really faster. A common way to do this is to benchmark both solutions with a big enough data set using a site like http://jsperf.com/ (or simply by building your own timed tests).
Only after that should you work on making your solution API-complete, bug-free (using unit tests) and so on.
Doing the two checks above first might prevent useless work, as the standard objects in the V8 engine are amazingly fast and smart.
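A home-made timed test can be as small as the sketch below. It is only a rough illustration: the timeIt helper, the data shape and the 100,000 item count are assumptions, and Date.now() gives coarse millisecond timings, so run it several times before drawing conclusions.

```javascript
// Hypothetical timing harness; names and sizes are illustrative only.
function timeIt(label, fn) {
  const start = Date.now();
  const result = fn();
  console.log(label + ": " + (Date.now() - start) + " ms");
  return result;
}

// Build the same data set twice: as an array and as an object keyed by id.
const COUNT = 100000;
const asArray = [];
const asObject = {};
for (let i = 0; i < COUNT; i++) {
  const item = { myId: i * 7, val: "v" + i };
  asArray.push(item);
  asObject[item.myId] = item;
}

// Candidate 1: plain indexed loop over the array.
const arrayTotal = timeIt("array iteration", function () {
  let total = 0;
  for (let i = 0; i < asArray.length; i++) total += asArray[i].myId;
  return total;
});

// Candidate 2: for...in over the object's keys.
const objectTotal = timeIt("for...in iteration", function () {
  let total = 0;
  for (const key in asObject) total += asObject[key].myId;
  return total;
});
```

Returning a value from each timed function (and comparing the two totals) helps make sure both candidates really do the same work.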

Related

Remove a property from array of object using reduce API

I am trying to delete the name property from every object in an array. It works properly using the filter API:
const users = [
{ name: 'Tyler', age: 28},
{ name: 'Mikenzi', age: 26},
{ name: 'Blaine', age: 30 }
];
const myProp = users.filter(function (props) {
delete props.name;
return true;
});
console.table(myProp);
const myProp2 = users.reduce((people, user) => {
console.log(people);
console.log(user);
delete user.name;
return people;
}, []);
console.log(myProp2);
I am trying to do the same thing using the reduce API. However, it's not working as expected.

It's not working because you're not pushing to the accumulator (you are always returning the empty array). You need to change it to:
const myProp2 = users.reduce((people, user) => {
delete user.name;
people.push(user)
return people;
}, []);
Please note that is not the intended use for reduce though - map is the operation you are looking for:
const myProp2 = users.map(u=> ({age: u.age}));

You actually want to use map for this, because you are selecting a transformation of the data into a new object (similar to Select in SQL or LINQ):
const myProps = users.map(u=> ({age: u.age}))
Also although the filter method worked, this is actually abuse of the filter method. The filter method is supposed to remove elements from the array depending on a condition. Your method worked because you returned true (which removed no elements) but you modified the current value on each iteration.
This is bad practice because you will confuse the next person to look at your code, they will wonder why you used filter as a method to transform the data rather than map.
Also, don't use reduce here: reduce is an aggregation function, intended to combine the elements of an array into a single result. Since the number of elements you are returning is the same as the number you started with, map is the better fit.
reduce would be better suited if you wanted to find the average, max, min or median age, the most popular name, and so on.
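As a rough illustration of that split, the sketch below uses map for the one-to-one transformation and reduce for aggregates. The variable names (people, agesOnly, averageAge, maxAge) are made up for the example.

```javascript
const people = [
  { name: 'Tyler', age: 28 },
  { name: 'Mikenzi', age: 26 },
  { name: 'Blaine', age: 30 }
];

// map: one output element per input element, originals left untouched
const agesOnly = people.map(function (person) {
  return { age: person.age };
});

// reduce: fold the whole array into a single value
const totalAge = people.reduce(function (sum, person) {
  return sum + person.age;
}, 0);
const averageAge = totalAge / people.length;

const maxAge = people.reduce(function (max, person) {
  return Math.max(max, person.age);
}, -Infinity);

console.log(agesOnly);   // [{ age: 28 }, { age: 26 }, { age: 30 }]
console.log(averageAge); // 28
console.log(maxAge);     // 30
```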

Storing things in objects vs arrays in JavaScript

This is more of a general question than a problem I need solved. I'm just a beginner trying to understand the proper way to do things.
What I want to know is whether or not I should only use objects as prototypes (if that's the correct term to use here) or whether or not it's OK to use them to store things.
As an example, in the test project I'm working on, I wanted to store some images for use later. What I currently have is something like:
var Images = {
james: "images/james.png",
karen: "images/karen.png",
mike: "images/mike.png"
};
Because I would know the position, I figure I could also put them in an array and reference the position in the array appropriately:
var images = ["images/james.png", "images/karen.png", "images/mike.png"];
images[0];
Using the object like this works perfectly fine but I'm wondering which is the more appropriate way to do this. Is it situational? Are there any performance reasons to do one over the other? Is there a more accepted way that, as a new programmer, I should get used to?
Thanks in advance for any advice.

Introduction
Unlike PHP, JavaScript does not have associative arrays. The two main data structures in this language are the array literal ([]) and the object literal ({}). Using one or another is not really a matter of style but a matter of need, so your question is relevant.
Let's make an objective comparison...
Array > Object
An array literal (which is indirectly an object) has many more methods than an object literal. Indeed, an object literal is a direct instance of Object and only has access to Object.prototype methods. An array literal is an instance of Array and has access not only to Array.prototype methods, but also to Object.prototype ones (this is how the prototype chain is set up in JavaScript).
let arr = ['Foo', 'Bar', 'Baz'];
let obj = {foo: 'Foo', bar: 'Bar', baz: 'Baz'};
console.log(arr.constructor.name);
console.log(arr.__proto__.__proto__.constructor.name);
console.log(obj.constructor.name);
In ES6, object literals are not iterable (according to the iterable protocol). But arrays are iterable. This means that you can use a for...of loop to traverse an array literal, but it will not work if you try to do so with an object literal (unless you define a [Symbol.iterator] property).
let arr = ['Foo', 'Bar', 'Baz'];
let obj = {foo: 'Foo', bar: 'Bar', baz: 'Baz'};
// OK
for (const item of arr) {
console.log(item);
}
// TypeError
for (const item of obj) {
console.log(item);
}
If you want to make an object literal iterable, you should define the iterator yourself. You could do this using a generator.
let obj = {foo: 'Foo', bar: 'Bar', baz: 'Baz'};
obj[Symbol.iterator] = function* () {
yield obj.foo;
yield obj.bar;
yield obj.baz;
};
// OK
for (const item of obj) {
console.log(item);
}
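The generator above lists each key by hand. A more generic variant, sketched here under the assumption that iterating means "yield every value in property order", can delegate to Object.values. The object name labels is made up for the example.

```javascript
const labels = { foo: 'Foo', bar: 'Bar', baz: 'Baz' };

// Assumption: iterating the object should yield its values.
// Object.values ignores symbol keys, so the iterator does not yield itself.
labels[Symbol.iterator] = function* () {
  yield* Object.values(this);
};

const collected = [];
for (const item of labels) {
  collected.push(item);
}
console.log(collected); // ["Foo", "Bar", "Baz"]

// Without touching the object at all, Object.entries gives an iterable array:
for (const [key, value] of Object.entries(labels)) {
  console.log(key, value);
}
```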
Array < Object
An object literal is better than an array if, for some reason, you need descriptive keys. In arrays, keys are just numbers, which is not ideal when you want to create an explicit data model.
// This is meaningful
let me = {
firstname: 'Baptiste',
lastname: 'Vannesson',
nickname: 'Bada',
username: 'Badacadabra'
};
console.log('First name:', me.firstname);
console.log('Last name:', me.lastname);
// This is ambiguous
/*
let me = ['Baptiste', 'Vannesson', 'Bada', 'Badacadabra'];
console.log('First name:', me[0]);
console.log('Last name:', me[1]);
*/
An object literal is extremely versatile, an array is not. Object literals make it possible to create "idiomatic" classes, namespaces, modules and much more...
let obj = {
attribute: 'Foo',
method() {
return 'Bar';
},
[1 + 2]: 'Baz'
};
console.log(obj.attribute, obj.method(), obj[3]);
Array = Object
Array literals and object literals are not enemies. In fact, they are good friends if you use them together. The JSON format makes intensive use of this powerful friendship:
let people = [
{
"firstname": "Foo",
"lastname": "Bar",
"nicknames": ["foobar", "barfoo"]
},
{
"firstname": "Baz",
"lastname": "Quux",
"nicknames": ["bazquux", "quuxbaz"]
}
];
console.log(people[0].firstname);
console.log(people[0].lastname);
console.log(people[1].nicknames[0]);
In JavaScript, there is a hybrid data structure called array-like object that is extensively used, even though you are not necessarily aware of that. For instance, the good old arguments object within a function is an array-like object. DOM methods like getElementsByClassName() return array-like objects too. As you may imagine, an array-like object is basically a special object literal that behaves like an array literal:
let arrayLikeObject = {
0: 'Foo',
1: 'Bar',
2: 'Baz',
length: 3
};
// At this level we see no difference...
for (let i = 0; i < arrayLikeObject.length; i++) {
console.log(arrayLikeObject[i]);
}
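The difference does show up once array methods are needed. A small sketch (the names arrayLike and realArray are made up for the example):

```javascript
const arrayLike = { 0: 'Foo', 1: 'Bar', 2: 'Baz', length: 3 };

// An array-like object has indexes and a length, but no Array.prototype methods
console.log(typeof arrayLike.map); // "undefined"

// Array.from copies the indexed entries (0 .. length - 1) into a real array
const realArray = Array.from(arrayLike);
console.log(Array.isArray(realArray)); // true
console.log(realArray.map(function (s) { return s.toUpperCase(); })); // ["FOO", "BAR", "BAZ"]
```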
Conclusion
Array literals and object literals have their own strengths and weaknesses, but with all the information provided here, I think you can now make the right decision.
Finally, I suggest you try the new data structures introduced by ES6: Map, Set, WeakMap, WeakSet. They offer lots of cool features, but detailing them here would take us too far...
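Just as a taste, here is a very short sketch of Map applied to the image paths from the question. The variable name imagesByName is made up for the example.

```javascript
const imagesByName = new Map([
  ['james', 'images/james.png'],
  ['karen', 'images/karen.png'],
  ['mike', 'images/mike.png']
]);

console.log(imagesByName.get('karen')); // "images/karen.png"
console.log(imagesByName.size);         // 3

// A Map is iterable and keeps insertion order
for (const [name, path] of imagesByName) {
  console.log(name, path);
}

imagesByName.delete('mike');
console.log(imagesByName.has('mike'));  // false
```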

Actually, the way you declared things brings up the "difference between associative arrays and arrays".
An associative array, in JS, is really similar to an object (because it's one):
When you write var a = {x:0, y:1, z:3} you can access x using a.x (object notation) or a["x"] (associative array notation).
On the other hand, regular arrays can be perceived as associative arrays that use unsigned integers as IDs for their indexes.
Therefore, to answer your question, which one should we pick?
It depends: I would use an object whenever I need to put names/labels on things (typically not for a plain collection of variables, for instance). If the things you want to store are homogeneous in type, you will probably use an array (but you can still go for an object if you really want to); if some or all of your things have different types, then you should go for an object (but in theory you could still go for an array).
Let's see this :
var a = {
x:0,
y:0,
z:0
}
x, y and z each have a different meaning (the components of a point), so an object is the better choice (in terms of semantics) to implement a point.
Because var a = [0,0,0] is less meaningful than an object, we will not go for an array in this situation.
var storage = {
one:"someurl",
two:"someurl2",
three:"someurl3",
}
This is correct, but we don't need an explicit name for every item, so we might choose var storage = ["someurl","someurl2","someurl3"] instead.
Last but not least, the "difficult" choice :
var images = {
cathy: "img/cathy",
bob: "img/bob",
randompelo: "img/randompelo"
}
and
var images = ["img/cathy","img/bob","img/randompelo"]
are correct but the choice is hard. Therefore the question to ask is : "Do we need a meaningful ID ?".
Let's say we work with a database: a meaningful id would be better, to avoid dozens of loops each time you want to do something. On the other hand, if it's just a list where the index carries no meaning (e.g. creating an image for each element of the array), an array is probably the way to go.
The question to ask when you hesitate between array and object is : Are keys/IDs important in terms of meaning ?
If they are then go for an object, if they're not go for an array.

You're correct that it would be situational, but in general it's not a good idea to limit your program by only allowing a finite set of supported options like:
var Images = {
james: "images/james.png",
karen: "images/karen.png",
mike: "images/mike.png"
};
Unless, of course, you happen to know that these will be the only cases which are possible - and you actively do not want to support other cases.
Assuming that you don't want to limit the possibilities, your array approach would be just fine, although personally I might go with an array of objects with identifiers, so that you aren't forced to track the index elsewhere.
Something like:
var userProfiles = [
{"username": "james", "image": "images/james.png"},
{"username": "karen", "image": "images/karen.png"},
{"username": "mike", "image": "images/mike.png"}
];
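With that shape, looking an entry up by its identifier no longer depends on knowing its position. A minimal sketch, where the findProfile helper and the extra 'zoe' entry are hypothetical:

```javascript
const profiles = [
  { username: "james", image: "images/james.png" },
  { username: "karen", image: "images/karen.png" },
  { username: "mike", image: "images/mike.png" }
];

// Hypothetical helper: linear search by identifier instead of by index
function findProfile(list, username) {
  return list.find(function (profile) {
    return profile.username === username;
  });
}

console.log(findProfile(profiles, "karen").image); // "images/karen.png"
console.log(findProfile(profiles, "zoe"));         // undefined

// New entries can be added without changing any other code
profiles.push({ username: "zoe", image: "images/zoe.png" });
console.log(findProfile(profiles, "zoe").image);   // "images/zoe.png"
```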

Array.splice (+ loops) vs Delete Object[...] vs Object[...] = undefined

I have a bunch of (10,000+) objects that I need to remove/replace. For simplicity, we can presume that each object contains a unique (String) 'id'.
The items often need to be renamed (change id), but not as often as they are looked up.
{ id: 'one' }, { id: 'two' }, ...
If I place them inside an 'associative array' (bad practice), I can access them quickly, but need to loop (slow) to remove. (NOTE: this doesn't actually work, because findIndex only works correctly on proper arrays, but a for loop would do the same thing.)
arr = [];
arr['one'] = { id: 'one' };
arr['two'] = { id: 'two' };
arr.splice(arr.findIndex(function(i) { return i.id === 'one'; }), 1);
If I place them in a normal array, I have to loop (slow) to find the item by ID, and deleting would require a loop (slow) as well (Edit: In my particular case deleting it should be relatively quick as I'll have already looked it up and have a reference, but obviously slower if I lose reference)
arr = [{ id: 'one', }, { id: 'two' }];
arr.splice(arr.findIndex(function(i) { return i.id === 'one'; }), 1);
or, if I store them the obviously correct way, I have the choice of using the delete keyword (which I've always been told is slow and breaks optimisations), or setting as undefined (which leaves me with a lot of elements that exist - memory leaks? and slower loops)
obj = { one: { id: 'one' }, two: { id: 'two' } };
delete obj['one'];
...
obj = { one: { id: 'one' }, two: { id: 'two' } };
obj['one'] = undefined;
I'm thinking that delete object[...] is the best choice, but I'm interested in others' feedback. Which should I use, and why?

There's a difference between the three methods.
Array.splice removes an element and shifts every element after it back by one, so the indexing has no gap.
delete removes a property from an object, but may fail (for example on a non-configurable property). This doesn't free memory directly, it only breaks the reference. The garbage collector can free the corresponding memory later.
Setting a property to undefined also drops the reference to the old object, so the garbage collector can reclaim it. It won't happen instantly, only whenever the engine decides to. The key itself stays on the object, though, so it still shows up in for...in loops and in Object.keys.
Setting a property to undefined is therefore not the same as deleting it: after delete the key is gone ('one' in obj is false), whereas after assigning undefined the key still exists. In both cases reading obj.one gives undefined rather than throwing.
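The three behaviours can be checked side by side. A small sketch using the ids from the question (the variable names deleted, cleared and list are made up for the example):

```javascript
// 1. delete: the key disappears from the object
const deleted = { one: { id: 'one' }, two: { id: 'two' } };
delete deleted.one;
console.log('one' in deleted);     // false
console.log(Object.keys(deleted)); // ["two"]
console.log(deleted.one);          // undefined (no error)

// 2. assigning undefined: the key stays, only the value is dropped
const cleared = { one: { id: 'one' }, two: { id: 'two' } };
cleared.one = undefined;
console.log('one' in cleared);     // true
console.log(Object.keys(cleared)); // ["one", "two"]

// 3. splice on a real array: the element is removed and the rest shifts down
const list = [{ id: 'one' }, { id: 'two' }];
list.splice(list.findIndex(function (item) { return item.id === 'one'; }), 1);
console.log(list.length);          // 1
console.log(list[0].id);           // "two"
```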

How to iterate over an array of objects in JavaScript?

I'm using PHP to fetch "tasks" from my database and encoding it as JSON. When I transfer the data over to javascript, I end up with something like this:
Array {
[0] => Task {
id: 2,
name: 'Random Task',
completed: 0
}
[1] => Task {
id: 8,
name: 'Another task',
completed: 1
}
}
etc.
I guess my real question is, what's the most efficient way to find the task by its id? Iterating through the array and checking each object seems like it might not be the most efficient? Is there any other way to do this?

The thing about JavaScript objects is that they are essentially maps. You can access properties using both dot notation (object.property) and index notation (object["property"]). You can also enumerate an array's elements or an object's properties, using either a for (i...) or a for (in...) loop:
for (var i = 0; i < arrayObj.length; i++) { ... }
for (var prop in arrayObj) { ... }
What I have been doing recently is building some Linq-esque extensions to the array object:
Array.prototype.Where = function(predicate) {
Throw.IfArgumentNull(predicate, "predicate");
Throw.IfNotAFunction(predicate, "predicate");
var results = new Array();
for (var i = 0; i < this.length; i++) {
var item = this[i];
if (predicate(item))
results.push(item);
}
return results;
};
Ignoring my custom Throw type, it basically allows you to do something like:
var item = arrayObj.Where(function(i) { return (i.id == 8); }).FirstOrDefault();
I'll publish it all at some point if you are interested?

Usually the most efficient way to iterate over an array collection in Javascript is to stick to the native for loop. The reason I say "usually" is that the implementation comes down to each unique browser's implementation of javascript so there is no absolute definitive answer.
There's a nice post at http://solutoire.com/2007/02/02/efficient-looping-in-javascript/ which covers the performance of each of the main iteration methods and empirically comes to the same conclusion.

If you don't need to maintain order, then the best way is to use a regular object and index by task id. That gives you O(1) access.
var tasks = {
'2': {
id: 2,
name: 'Random Task',
completed: 0
},
...
}
If you also need ordering maintained, then write an OrderedMap "class" that maintains the order by creating an array of task ids, but the actual tasks will still be stored in an object indexed by task id. So essentially you would have:
// internal API (to help maintain order)
taskIDs = [a, b, c, ..];
// internal API (for actual storage)
tasks = {
a: { .. },
b: { .. },
};
// external API for iterating objects in order
forEach(fn);
// external API for accessing task by ID
get(id);
The outside world can be ignorant of how you maintain order as long as you provide a nice encapsulated way of iterating these in order, and accessing them by task id.
If you need reference for implementing such a class, see the source for LinkedMap from Google Closure Library.
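For a rough idea of what such a class could look like, here is a minimal sketch. It is not the Closure Library's LinkedMap: the name OrderedMapSketch and its set/get/forEach methods are assumptions, and removal is left out to keep it short.

```javascript
// Hypothetical sketch; not the Closure Library's LinkedMap.
function OrderedMapSketch() {
  this.ids = [];                   // internal: remembers insertion order
  this.byId = Object.create(null); // internal: id -> task storage
}

OrderedMapSketch.prototype.set = function (id, task) {
  if (!(id in this.byId)) this.ids.push(id); // record order only once per id
  this.byId[id] = task;
};

OrderedMapSketch.prototype.get = function (id) {
  return this.byId[id];
};

OrderedMapSketch.prototype.forEach = function (fn) {
  for (var i = 0; i < this.ids.length; i++) {
    fn(this.byId[this.ids[i]], this.ids[i]);
  }
};

var ordered = new OrderedMapSketch();
ordered.set(8, { id: 8, name: 'Another task', completed: 1 });
ordered.set(2, { id: 2, name: 'Random Task', completed: 0 });

console.log(ordered.get(2).name); // "Random Task"

var names = [];
ordered.forEach(function (task) { names.push(task.name); });
console.log(names); // ["Another task", "Random Task"]
```

The separate ids array matters here: a plain object enumerates integer-like keys in ascending numeric order, so iterating byId directly would visit task 2 before task 8 regardless of insertion order.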

Just a little more food for thought, this is what I ended up with:
this.find = function (test) {
var results = [];
for (var i = 0,l = this.tasks.length; i < l; i++) {
var t = this.tasks[i];
if (eval(test)) {
results.push(this.tasks[i]);
}
}
return results;
}
this allows me to do a simple tasks.find('t.id == 2') or tasks.find('t.completed == 1');

If id is unique (and mostly continuous) you can do a one time rearrange of the array so that array index reflects the id. If they're not unique, you can sort them and do a binary search.
But this would be useful only if you access the items by id from the array frequently, otherwise the overhead of sorting won't be worth it.
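The sort-then-binary-search option could look roughly like this. The findById function name and the third sample task are made up for the example.

```javascript
// Binary search over an array that is already sorted by ascending id
function findById(sortedTasks, id) {
  let low = 0;
  let high = sortedTasks.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1; // middle index, rounded down
    const midId = sortedTasks[mid].id;
    if (midId === id) return sortedTasks[mid];
    if (midId < id) low = mid + 1;
    else high = mid - 1;
  }
  return undefined; // not found
}

const taskList = [
  { id: 8, name: 'Another task', completed: 1 },
  { id: 2, name: 'Random Task', completed: 0 },
  { id: 5, name: 'Third task', completed: 0 } // hypothetical extra row
];

// One-time cost: sort by id before the first lookup
taskList.sort(function (a, b) { return a.id - b.id; });

console.log(findById(taskList, 8).name); // "Another task"
console.log(findById(taskList, 3));      // undefined
```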

Is your array large? If not, you probably won't win many microseconds on optimizing it.
If it is large, you should really return a dictionary instead (as Matthew Flaschen commented), which uses the task's ID as key. That way you'll get constant-time lookup (at least if the JavaScript implementation is optimal).
Just use an ordinary PHP associative array, and run it through json_encode or whatever you're using.
//Assume you have your original Array named $tasks:
$dictionary = Array();
foreach($tasks as $task)
$dictionary[$task->getID()] = $task;
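If the server keeps sending a plain array, an equivalent dictionary can also be built once on the JavaScript side. A minimal sketch, where fetchedTasks stands in for the decoded JSON and tasksById is a made-up name:

```javascript
// Stand-in for the array decoded from the server's JSON
const fetchedTasks = [
  { id: 2, name: 'Random Task', completed: 0 },
  { id: 8, name: 'Another task', completed: 1 }
];

// One pass over the array builds the id -> task dictionary
const tasksById = {};
fetchedTasks.forEach(function (task) {
  tasksById[task.id] = task;
});

console.log(tasksById[8].name); // "Another task"
console.log(tasksById[3]);      // undefined
```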

How to get javascript object references or reference count?

How to get reference count for an object
Is it possible to determine if a javascript object has multiple references to it?
Or if it has references besides the one I'm accessing it with?
Or even just to get the reference count itself?
Can I find this information from javascript itself, or will I need to keep track of my own reference counters.
Obviously, there must be at least one reference to it for my code access the object. But what I want to know is if there are any other references to it, or if my code is the only place it is accessed. I'd like to be able to delete the object if nothing else is referencing it.
If you know the answer, there is no need to read the rest of this question. Below is just an example to make things more clear.
Use Case
In my application, I have a Repository object instance called contacts that contains an array of ALL my contacts. There are also multiple Collection object instances, such as friends collection and a coworkers collection. Each collection contains an array with a different set of items from the contacts Repository.
Sample Code
To make this concept more concrete, consider the code below. Each instance of the Repository object contains a list of all items of a particular type. You might have a repository of Contacts and a separate repository of Events. To keep it simple, you can just get, add, and remove items, and add many via the constructor.
var Repository = function(items) {
this.items = items || [];
}
Repository.prototype.get = function(id) {
for (var i=0,len=this.items.length; i<len; i++) {
if (this.items[i].id === id) {
return this.items[i];
}
}
}
Repository.prototype.add = function(item) {
if (Array.isArray(item)) {
// concat returns a new array, so the result must be assigned back
this.items = this.items.concat(item);
}
else {
this.items.push(item);
}
}
Repository.prototype.remove = function(id) {
for (var i=0,len=this.items.length; i<len; i++) {
if (this.items[i].id === id) {
this.removeIndex(i);
return;
}
}
}
Repository.prototype.removeIndex = function(index) {
if (this.items[index]) {
if (/* nothing else references this.items[index] */) {
// Only remove item from repository if nothing else references it
this.items.splice(index,1);
return;
}
}
}
Note the line in removeIndex with the comment. I only want to remove the item from my master repository of objects if no other objects have a reference to the item. Here's Collection:
var Collection = function(repo,items) {
this.repo = repo;
this.items = items || [];
}
Collection.prototype.remove = function(id) {
for (var i=0,len=this.items.length; i<len; i++) {
if (this.items[i].id === id) {
// Remove object from this collection
this.items.splice(i,1);
// Tell repo to remove it (only if no other references to it).
// The repo is asked by id, because i is an index into this collection, not into the repo.
this.repo.remove(id);
return;
}
}
}
And then this code uses Repository and Collection:
var contactRepo = new Repository([
{id: 1, name: "Joe"},
{id: 2, name: "Jane"},
{id: 3, name: "Tom"},
{id: 4, name: "Jack"},
{id: 5, name: "Sue"}
]);
var friends = new Collection(
contactRepo,
[
contactRepo.get(2),
contactRepo.get(4)
]
);
var coworkers = new Collection(
contactRepo,
[
contactRepo.get(1),
contactRepo.get(2),
contactRepo.get(5)
]
);
contactRepo.items; // contains item ids 1, 2, 3, 4, 5
friends.items; // contains item ids 2, 4
coworkers.items; // contains item ids 1, 2, 5
coworkers.remove(2);
contactRepo.items; // contains item ids 1, 2, 3, 4, 5
friends.items; // contains item ids 2, 4
coworkers.items; // contains item ids 1, 5
friends.remove(4);
contactRepo.items; // contains item ids 1, 2, 3, 5
friends.items; // contains item ids 2
coworkers.items; // contains item ids 1, 5
Notice how coworkers.remove(2) didn't remove id 2 from contactRepo? This is because it was still referenced from friends.items. However, friends.remove(4) causes id 4 to be removed from contactRepo, because no other collection is referring to it.
Summary
The above is what I want to do. I'm sure there are ways I can do this by keeping track of my own reference counters and such. But if there is a way to do it using javascript's built-in reference management, I'd like to hear about how to use it.

No, no, no, no; and yes, if you really need to count references you will have to do it manually. JS has no interface to this, GC, or weak references.
Whilst you could implement a manual reference-counted object list, it's questionable whether all the extra overhead (in performance terms but more importantly code complexity) is worth it.
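To give an idea of that overhead, a manual counter could be sketched as below. Everything here (RefCounter, retain, release) is a hypothetical helper, not a JavaScript built-in, and every collection would have to call it faithfully on each add and remove.

```javascript
// Hypothetical manual reference counter; nothing here is built into JavaScript.
function RefCounter() {
  this.counts = new Map(); // object -> number of registered holders
}

// Call whenever a collection starts holding the object.
RefCounter.prototype.retain = function (obj) {
  this.counts.set(obj, (this.counts.get(obj) || 0) + 1);
};

// Call whenever a collection drops the object.
// Returns true when the last registered holder has released it.
RefCounter.prototype.release = function (obj) {
  var remaining = (this.counts.get(obj) || 0) - 1;
  if (remaining > 0) {
    this.counts.set(obj, remaining);
    return false;
  }
  this.counts.delete(obj);
  return true;
};

var counter = new RefCounter();
var jane = { id: 2, name: "Jane" };
counter.retain(jane); // held by friends
counter.retain(jane); // held by coworkers

console.log(counter.release(jane)); // false: friends still holds it
console.log(counter.release(jane)); // true: safe to drop from the repository
```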
In your example code it would seem simpler to forget the Repository, use a plain Array for your lists, and let standard garbage collection take care of dropping unused people. If you needed to get a list of all people in use, you'd just concat the friends and coworkers lists (and sort/uniquify them if you needed to).
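The concat-and-uniquify step could be sketched like this, assuming the lists share the same object instances (as they do when both were filled from the same source). The names friendsList, coworkersList and inUse are made up for the example.

```javascript
var jane = { id: 2, name: "Jane" };
var friendsList = [jane, { id: 4, name: "Jack" }];
var coworkersList = [{ id: 1, name: "Joe" }, jane, { id: 5, name: "Sue" }];

// A Set keeps each object once; objects are compared by identity
var inUse = Array.from(new Set(friendsList.concat(coworkersList)));

// Optional: sort by id for a stable listing
inUse.sort(function (a, b) { return a.id - b.id; });

console.log(inUse.map(function (person) { return person.id; })); // [1, 2, 4, 5]
```

Once a person is removed from every list, nothing references the object any more and the garbage collector takes care of it.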

You may be interested in looking into reduce functions and Array.prototype.map. map could be used to help identify where your collections intersect, or whether there is an intersection at all. A user-defined reduce function could be used like a merge (somewhat like overriding the addition operator so that you can apply the operation to objects), or to merge all collections on "id" if that is how you define your reduce function. You can then assign the result to your master reference array. I recommend keeping a shadow array that holds all of the root objects/values in case you would like to rewind. Note: one must be careful of prototype chains when reducing an object or array. The map function will be very helpful in this case.
I would suggest not removing the object or record that is in your Repository, as you may want to reference it again later. My approach would be to create a ShadowRepository that reflects all records/objects that have at least one "reference". From your description and the code presented here, it appears you are initializing all of the data and storing references to items 1, 2, 4 and 5.
var contactRepo = new Repository([
{id: 1, name: "Joe"},
{id: 2, name: "Jane"},
{id: 3, name: "Tom"},
{id: 4, name: "Jack"},
{id: 5, name: "Sue"}
]);
var friends = new Collection(contactRepo,[
contactRepo.get(2),
contactRepo.get(4)
]);
var coworkers = new Collection(contactRepo,[
contactRepo.get(1),
contactRepo.get(2),
contactRepo.get(5)
]);
Given how the Repository and the collections are initialized, what you are asking for ("remove an item from the repository if there are no references to it") means item 3 would need to be removed immediately. You can, however, track the references in a few different ways.
I have considered using Object.observe for a similar situation. However, Object.observe does not work in all browsers. I have recently turned to WatchJS.
I am working on understanding the code behind Watch.JS so that a list of observers on an object can be created dynamically. This would also allow an item that is no longer watched to be removed, though I suggest removing the reference at the point of access. What I mean is that a variable which shares the immediate lexical scope with an object, and which is the single point of reference to its sibling, can be removed, making the exposed record/item/property/object of the sibling no longer accessible from outside. Once the reference that all of your other references depended on is removed, access to the underlying data stops. I am generating a unique id for each origin reference to avoid accidentally reusing the same one.
Thank you for sharing your question and the structure you are using. It has helped me consider one of my own cases, where I was generating uniquely identified references to a lexical sibling and keeping those unique ids on the one object that had scope. After reading this, I have reconsidered and decided to expose only one reference, then assign that reference to a variable name wherever it is needed, such as when creating a watcher, an observer or another Collection.
