Array.splice (+ loops) vs Delete Object[...] vs Object[...] = undefined - javascript

So I have a bunch of (10000+) objects that I need to remove/replace. For simplicity, we can presume that each object contains a unique (String) 'id'.
The items often need to be renamed (change id), but not as often as they're looked up.
{ id: 'one' }, { id: 'two' }, ...
If I place them inside an 'associative array' (bad practise), I can access them quickly, but need to loop (slow) to remove (NOTE: This doesn't actually work, because findIndex only works correctly on proper arrays, but a for loop would do the same thing)
arr = [];
arr['one'] = { id: 'one' };
arr['two'] = { id: 'two' };
arr.splice(arr.findIndex(function(i) { return i.id === 'one'; }), 1);
If I place them in a normal array, I have to loop (slow) to find the item by ID, and deleting would require a loop (slow) as well (Edit: In my particular case deleting it should be relatively quick as I'll have already looked it up and have a reference, but obviously slower if I lose reference)
arr = [{ id: 'one', }, { id: 'two' }];
arr.splice(arr.findIndex(function(i) { return i.id === 'one'; }), 1);
or, if I store them the obviously correct way, I have the choice of using the delete keyword (which I've always been told is slow and breaks optimisations), or setting as undefined (which leaves me with a lot of elements that exist - memory leaks? and slower loops)
obj = { one: { id: 'one' }, two: { id: 'two' } };
delete obj['one'];
...
obj = { one: { id: 'one' }, two: { id: 'two' } };
obj['one'] = undefined;
I'm thinking that delete object[...] is the best choice, but I'm interested in others' feedback. Which should I use, and why?

There's a difference between the three methods.
Array.splice removes an element and shifts every element after it down by one, so the indexes stay contiguous.
delete removes a property from an object, but may fail (for example, on a non-configurable property). It doesn't free memory by itself, it only removes the reference. The garbage collector can free the corresponding memory later, once nothing else references the object.
Setting a property to undefined also drops the reference to the old object, so the garbage collector can reclaim it. It won't happen instantly, only whenever the JavaScript engine decides to. The key itself stays on the object, though, so it still shows up in loops.
Setting a property to undefined is not the same as deleting it. After delete the key no longer exists, so code that expects the key to be there may run into errors; after setting it to undefined the key is still present and merely holds undefined.
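To make the difference concrete, here is a minimal sketch. The store object and its ids are sample data made up for illustration, mirroring the question's example.

```javascript
// Hypothetical sample data; the ids mirror the question's example.
const store = {
  one: { id: "one" },
  two: { id: "two" },
  three: { id: "three" },
};

delete store.one;      // the key "one" is removed entirely
store.two = undefined; // the key "two" stays, holding undefined

const keysAfter = Object.keys(store); // ["two", "three"]
const hasOne = "one" in store;        // false: the key is gone
const hasTwo = "two" in store;        // true: the key still exists
const readOne = store.one;            // undefined, and no error is thrown
```

Either way the old objects become unreachable from store, so the choice is mostly about whether the key should keep showing up in Object.keys and for...in loops.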

Related

Is it cheating to wrap a large array in an object to avoid deep cloning in React?

I am very confused.
Say I have a messages array that is structured like this:
this.state.messages = [
{ message_id: 0, body: "hello", author: "john" },
{ message_id: 1, body: "how", author: "lilly" },
{ message_id: 2, body: "are", author: "abe" },
{ message_id: 3, body: "you", author: "josh" }
];
and the user wants to edit the third message. React says to be immutable. That is, it is bad to do this:
this.state.messages[2].body = "test";
this.forceUpdate();
The problem with this is that React can't tell the messages state variable changed, because its reference is still the same. Firstly, does this even matter? I'm calling forceUpdate anyways, so it's going to re-render regardless. Shouldn't this only matter if you are using a PureComponent or a custom shouldComponentUpdate function?
Okay, well let's say it does matter (for some reason). Then it's advised to do a deep copy so that React knows the object changed. But how deep is deep enough?
For example, is it enough to do this?
let x = this.state.messages.slice(0);
x[2].body = "test";
this.setState({ messages: x });
This copies over the references to all the array elements, but it's not entirely deep! Because (before we do setState of course), this.state.messages[0].body === x[0].body. The string was not copied. It is still sharing internal state with the previous object. It is not a fully deep copy.
Okay, but if we don't need to do a full deep copy and the only thing that matters is that the parent node reference changes, then can we not just cheat and do this?
this.state.messages = { a: [
{ message_id: 0, body: "hello", author: "john" },
{ message_id: 1, body: "how", author: "lilly" },
{ message_id: 2, body: "are", author: "abe" },
{ message_id: 3, body: "you", author: "josh" }
]};
let x = { a: this.state.messages.a };
this.setState({ messages: x });
Now instead of copying over all the message references (of which there may be thousands), it is a single pointer change. But both messages are not deep copies. If the slice version is okay then should this not also be okay? Neither are deep copies it seems.
And if this latter method is okay, is there a more elegant way of writing it such that you don't need an arbitrary container object (a in this case)? A way to just change the pointer somehow to signal to React that the contents changed?
Edit: Okay, I don't think I explained myself well enough. Sorry, let me try again. Forget about performance for a second. What I'm wondering is: do you need to do deep clones or not for state updates? For example, what would be the proper way to update a body in the messages array in the example posted above? .slice(0) is not a deep copy, because it is still sharing internal structure (it is only copying references). Is this okay? Or do you need to do a deep copy to be proper? And if it IS okay, then should it not also be okay to just have a wrapper object and just only change THAT pointer instead?
Further Edit: I'm not sure if I'm just not explaining myself properly or if I am missing something very obvious. Does React need deep clones or not? I feel like this is an either-or sort of thing. I find it very doubtful that React requires 70% of a deep clone, but not 30%. Either it needs a full deep clone or it doesn't. If it doesn't, then shouldn't just changing a single wrapper pointer be enough? And if it isn't, then isn't slice(0) and Object.assign also not enough, because they are also shallow clones? In the case of these clones the internal objects still maintain the same structure (for example, the String references don't change).
You shouldn't be using force updates.
You shouldn't mutate the state object - it should be immutable for a reason (easier to debug, better performance, etc)
https://redux.js.org/faq/immutable-data#what-are-the-benefits-of-immutability
https://medium.com/#kkranthi438/dont-mutate-state-in-react-6b25d5e06f42
https://daveceddia.com/why-not-modify-react-state-directly/
Do not worry about performance unless there is a real performance issue - premature optimizations like this could actually harm performance, and slow your development (it makes your app harder to debug).
"Thousands of objects" is nothing to modern day computers / mobile devices. There are many other things you could try before breaking the immutability convention - you could try to split your state in smaller chunks, use selectors, better organize your components so not everything has to be re-rendered when only a small chunk of your state changed.
It's usually better to use flat / simple objects for your state - so it's much easier to create new copies of them. (when you do Object.assign or { ...foo } you are only getting shallow copies.) If you can't do away with deep nested objects you could try to add a 3rd party library like immutability-helper, or lodash.cloneDeep
In short, only update your state with setState - try to avoid forceUpdate and things like this.state.foo = bar. Even if some deeply nested objects are still referencing their old state, as long as you are following the rules you should be ok in most cases.
However do try to keep your state objects shallow whenever possible.
In your example, you mentioned this.state.messages[0].body === x[0].body. Strings are primitive values in JS, so they behave as if they were copied: your expression compares two string values, not their references.
// Given:
let obj = { foo: { bar: 'baz' } };
let fooObj = obj.foo; // fooObj is a reference to obj.foo;
let str = obj.foo.bar; // str is a COPY of the string 'baz';
str === fooObj.bar; // true, you are comparing their values, not references.
obj.foo.bar = 'baz2';
fooObj === obj.foo; // true, because you are comparing their references.
str === obj.foo.bar // false - str value does not change when obj.foo.bar changes.
Is immutability absolutely required in react?
Short answer: NO. You can do whatever you want. React is not going to throw errors at you when parts of your new state still reference the old state.
However, never mutate your state directly. Always do it via setState. Don't worry too much about if your state object is 100% deep cloned or not. As long as you can make sure no parts of your app modifies your state, React can handle the rest.
There is a really good example of storing a todo list as a map of todo items and a list of their ids:
const state = {
todos: {
0: "Buy a milk",
1: "Buy a bread",
},
todosList: [0, 1]
}
Maybe you should split your state and store messages as a map with the ID as the key. It gives you O(1) time for adding, changing and deleting messages.
And if you would store a messages map as a separate state, you'll be able to change any message like this:
this.setState({
[messageId]: newValue
})
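As a rough sketch of that idea outside of React, the update can be written as a plain function. The names normalizedState, messagesById, messageOrder and updateMessageBody are all made up for this illustration.

```javascript
// Hypothetical normalized state: messages keyed by id, plus a display order.
const normalizedState = {
  messagesById: {
    0: { message_id: 0, body: "hello", author: "john" },
    2: { message_id: 2, body: "are", author: "abe" },
  },
  messageOrder: [0, 2],
};

// Returns a new state object; only the edited message is replaced.
function updateMessageBody(state, messageId, body) {
  return {
    ...state,
    messagesById: {
      ...state.messagesById,
      [messageId]: { ...state.messagesById[messageId], body },
    },
  };
}

const nextNormalizedState = updateMessageBody(normalizedState, 2, "test");
```

Finding the message is a direct key lookup, although the spread of messagesById still copies one reference per message, so the update as a whole is not free.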
Stop mutating your data!
One of React's primary features is doing change detection on data held in the 'state' object. The key to this change detection is to make sure you are always using previous state, modifying whatever data you need and React will handle the change detection, then update the UI accordingly and as efficiently as possible.
this.setState((prevState)=> {
return {
...prevState,
messages: prevState.messages.map((obj,i)=> i === 2 ? { ...obj, body:"test" } : obj)
}
})
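The same update can be tried outside of React to see how deep the copy really goes. This is only a sketch using the question's sample messages; the variable names are made up.

```javascript
const prevMessages = [
  { message_id: 0, body: "hello", author: "john" },
  { message_id: 1, body: "how", author: "lilly" },
  { message_id: 2, body: "are", author: "abe" },
  { message_id: 3, body: "you", author: "josh" },
];

// Same update as in the setState callback above, without React.
const nextMessages = prevMessages.map((msg, i) =>
  i === 2 ? { ...msg, body: "test" } : msg
);

const arrayChanged = nextMessages !== prevMessages;          // true: new array
const untouchedShared = nextMessages[0] === prevMessages[0]; // true: same object
const editedReplaced = nextMessages[2] !== prevMessages[2];  // true: new object
const oldBody = prevMessages[2].body;                        // "are": old state intact
```

Only the array and the one edited object are new; everything else is shared with the previous state. That is neither a full deep clone nor a bare wrapper swap: every object on the path to the change gets a new reference.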

delete property vs creating new object

So I have an object as follows:
var data = [
{
unmatchedLines: [], //possible big array ~700+ lines per entry
statusCode: 200,
title: "title",
message: "some message",
details: [],
objectDetails: []
},
{
//same object here
}
];
This array gets filled by service calls and then looped through for merging the output before sending this result back to the frontend.
function convertResultToOneCsvOutput(data) {
var outPutObject = {
unmatchedLines: [],
responses: []
};
for(var i = 0; i < data.length; i++) {
if(data[i].fileData) {
outPutObject.unmatchedLines = outPutObject.unmatchedLines.concat(data[i].fileData);
}
outPutObject.responses.push({
statusCode: data[i].statusCode,
title: data[i].title,
message: data[i].message,
details: data[i].details,
objectDetails: data[i].objectDetails,
})
}
return outPutObject;
};
Now I was first using delete to get rid of the unmatchedLines from the data object so that, instead of creating a whole new object and pushing this to the output, I could do:
delete data[i].unmatchedLines;
outPutObject.responses.push(data[i]);
But then I read that delete is very slow and stumbled upon a post stating that setting the property to undefined instead of using delete would be faster.
My question:
What is better to use in the end? delete, setting to undefined or creating a new object in itself like I am doing right now?
Neither is "better."
It is true that when you delete a property from an object, on many modern engines that puts the object into a slower "dictionary mode" than it would be if you didn't delete the property from it, meaning that subsequent property lookups on that object (data[i] in your case) will be slower than they were before the delete. So setting the property to undefined instead (which is not quite the same thing) may be appropriate in those very rare situations where the speed of property access on the object matters.
It could be true, but it really depends on the length of the list. However, any performance difference between the two techniques is going to be immeasurably small in any typical case.
( more info about delete )
Anyway: if you really need extra performance, you should consider dealing with clones and an immutable data-set ( more info about immutable performances )
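For the clone-based route, one option (a sketch, not something the answers above spell out) is object rest destructuring, which builds a new object without the unwanted property and leaves the original untouched. The entry object below is made-up sample data shaped like one element of the question's array.

```javascript
// Hypothetical entry shaped like one element of the question's data array.
const entry = {
  unmatchedLines: ["line 1", "line 2"],
  statusCode: 200,
  title: "title",
  message: "some message",
  details: [],
  objectDetails: [],
};

// Rest destructuring collects every property except unmatchedLines
// into a new object; entry itself is not modified.
const { unmatchedLines, ...response } = entry;
```

The copy is shallow: response.details is the same array as entry.details, which is usually fine when the result is only serialized and sent to the frontend.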

Understanding lists in JavaScript

I am reading through Eloquent JavaScript and have been stuck trying to understand lists for about two days so I figured I would finally ask a question. The example they give in the book is:
var list = {
value: 1,
rest: {
value: 2,
rest: {
value: 3,
rest: null
}
}
};
Now I think I understand the example... There is a list object and it has properties value and rest. Then, rest has properties of value and rest, etc... However, I don't understand what rest is or even stands for. Does the rest property contain an object? So, list.rest.value would == 2? How is this useful? Some ways I could see this as useful are having a list Car, with prop engine, gauge, etc, with further properties of accelerate, brake, low fuel... How would something like this be achieved?
I do apologize for the "all overness" of this post, I don't exactly know what to ask or how to phrase it. It seems like the book only explained objects and properties, but never actually having objects as an objects property.
Thank you all in advance, and if you need any clarification or more info I will try to provide it.
This code simply uses JavaScript object literal notation to define an object named list.
// Would simply define an empty object.
var list = {};
Now you can add some properties to the object.
// Would define an object with a single property: `value`.
var list = {
value: 1
};
Using nested object declarations, you can give the list object child objects as well:
var list = {
value: 1,
rest: {}
};
Now list.rest is an empty object. You can fill that out by adding some properties:
var list = {
value: 1,
rest: {
value: 2
}
};
And your nesting can continue ad infinitum. With the object in your original post, the following is possible:
console.log(list.value); // 1
console.log(list.rest.value); // 2
console.log(list.rest.rest.value); // 3
It's important to understand that this in no way creates a class or includes any additional methods with the object. It seems to be structured as a linked list but provides no functionality to add/remove/modify (except by directly modifying the original object).
In the example above the list variable works like an associative array. This is JavaScript's version of an "object". While the property list.value holds a number, the property list.rest holds a nested object. The properties themselves can be any valid type. Many jQuery plugins are coded where the properties themselves are actually delegate functions.
The object you have described above in the example does not seem to me to be terribly useful beyond being an example of how this kind of object can contain references to other objects. However, when you begin applying this in an "object oriented" concept (keep in mind that it is not truly object oriented), it becomes more useful. You can then create your own "namespace" with properties, functions and delegates that can be re-used time and again.
Thank you all for your information. I don't know if there is a best answer selection on this site or not, but I really do appreciate the help Justin, Joel, and Evan. I think the main part I was confused about is just practical application for real applications. I have messed around a little bit and came up with this and have a much better basic understanding now:
var car = {
engine: {
turn_on: "Turned engine on",
turn_off: "Turned engine off",
desc: {
size: "V6",
year: 2000
}
},
fuel: {
level: 55
}
};
function CheckFuel(fuel){
if(fuel > 50){
console.log("In good shape");
}
else{
console.log("We should fuel up");
}
}
console.log(car.engine.turn_on);
console.log(car.engine.turn_off);
console.log(car.engine.desc.size);
console.log(car.engine.desc.year);
CheckFuel(car.fuel.level);
Now time to practice iterating through. Thanks again!
This is an implementation of a linked list. Each node in the list has a reference to the next node. 'Rest' is an object (the next node in the list) that also contains every other node in the list (via its rest property).
The first value in the list would be list.value;. The second value in the list would be list.rest.value;. The items in the list can be shown as:
item1 = list;
item2 = list.rest;
item3 = item2.rest;
This continues until itemX.rest is null.
These two functions could be used to manage the list and may help you understand how iterating through it would work:
function addToList(item)
{
if(!list)
{
list = item;
return;
}
var temp = list;
while(temp.rest)
{
temp = temp.rest;
}
temp.rest = item;
}
function printList()
{
var temp = list;
while (temp)
{
console.log(temp.value);
temp = temp.rest;
}
}
The add function would be called like this: addToList({ value:10, rest:null });
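Going one step further, a pair of helpers can convert between an array and this nested structure. This is only a sketch; the names arrayToList and listToArray are chosen here for illustration and do not come from the answer above.

```javascript
// Builds the nested { value, rest } structure from an array, back to front.
function arrayToList(values) {
  let list = null;
  for (let i = values.length - 1; i >= 0; i--) {
    list = { value: values[i], rest: list };
  }
  return list;
}

// Walks the list node by node until rest is null, collecting the values.
function listToArray(list) {
  const values = [];
  for (let node = list; node; node = node.rest) {
    values.push(node.value);
  }
  return values;
}

const builtList = arrayToList([1, 2, 3]);
const roundTrip = listToArray(builtList);
```

Here builtList has the same shape as the list object in the question, and roundTrip contains the values 1, 2 and 3 again.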

javascript hash/array hybrid performance

For my application I need a collection that can do fast iteration and fast lookups by key.
Example Data
var data = [
{ myId: 4324, val: "foo"},
{ myId: 6280, val: "bar"},
{ myId: 7569, val: "baz"},
... x 100,000
];
The key is contained within the objects that I want to store. I hacked together Harray (Hash Array) https://gist.github.com/3451147.
Here's how you use it
// initialize with the key property name
var coll = new Harray("myId");
// populate with data
data.forEach(function(item){ coll.add(item); });
// key lookup
coll.h[4324] // => { myId: 4324, val: "foo"}
// array functionality
coll[1] // => { myId: 6280, val: "bar"}
coll.map(function(item){ return item.val; }); // => ["foo", "bar", "baz"]
coll.length // => 3
// remove value
coll.remove(coll[0]); // delete => { myId: 4324, val: "foo"}
// by key
coll.removeKey(7569) // delete => { myId: 7569, val: "baz"}
// by index
coll.removeAt(0); // delete => { myId: 6280, val: "bar"}
Removal speed seems to be the only tradeoff I can see. The stored objects are shared between the h Object and the Array so I'm not storing 2 copies of anything.
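For comparison, here is a minimal sketch of the same idea built on the standard Map, which keeps insertion order and so covers both keyed lookup and ordered iteration. KeyedList is a hypothetical name and this is not the Harray gist's code.

```javascript
// KeyedList is a hypothetical name; this is not the Harray gist's code.
class KeyedList {
  constructor(keyName) {
    this.keyName = keyName;
    this.byKey = new Map(); // a Map remembers insertion order
  }
  add(item) {
    this.byKey.set(item[this.keyName], item);
    return this;
  }
  get(key) {
    return this.byKey.get(key);
  }
  removeKey(key) {
    return this.byKey.delete(key); // no array scan or splice needed
  }
  get length() {
    return this.byKey.size;
  }
  // Iterates the stored objects in insertion order.
  [Symbol.iterator]() {
    return this.byKey.values();
  }
}

const sampleItems = [
  { myId: 4324, val: "foo" },
  { myId: 6280, val: "bar" },
  { myId: 7569, val: "baz" },
];
const keyed = new KeyedList("myId");
sampleItems.forEach((item) => keyed.add(item));

const found = keyed.get(6280);
keyed.removeKey(6280);
const remainingVals = [...keyed].map((item) => item.val);
```

The trade-off is that there is no positional access like coll[1]; if that matters, the values have to be spread into an array first.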
Question
Should I just stick with using for in to iterate through object properties?
Keep an array of the object's keys instead of the objects themselves?
other options?
Note: browser compatibility is not a factor. This is exclusively for chrome.
In order to know if a specific collection can be helpful, you must:
first verify there is a performance problem. If it's fast enough, don't worry. To check this, supposing the whole page is slow, use a profiler like the Chrome profiler to check the problem is in the collection you're currently using
then check the alternate collection you're building is really faster. To do this a common solution is to benchmark both solutions with a big enough data set using a site like http://jsperf.com/ (or simply by building your own timed tests).
Only after that should you work on ensuring your solution is API-complete, totally bug free (using unit tests), and so on.
Doing the two checks I mentioned first might prevent useless work as the standard objects in the V8 engine are amazingly fast and smart.

How to get javascript object references or reference count?

How to get reference count for an object
Is it possible to determine if a javascript object has multiple references to it?
Or if it has references besides the one I'm accessing it with?
Or even just to get the reference count itself?
Can I find this information from javascript itself, or will I need to keep track of my own reference counters.
Obviously, there must be at least one reference to it for my code to access the object. But what I want to know is if there are any other references to it, or if my code is the only place it is accessed. I'd like to be able to delete the object if nothing else is referencing it.
If you know the answer, there is no need to read the rest of this question. Below is just an example to make things more clear.
Use Case
In my application, I have a Repository object instance called contacts that contains an array of ALL my contacts. There are also multiple Collection object instances, such as friends collection and a coworkers collection. Each collection contains an array with a different set of items from the contacts Repository.
Sample Code
To make this concept more concrete, consider the code below. Each instance of the Repository object contains a list of all items of a particular type. You might have a repository of Contacts and a separate repository of Events. To keep it simple, you can just get, add, and remove items, and add many via the constructor.
var Repository = function(items) {
this.items = items || [];
}
Repository.prototype.get = function(id) {
for (var i=0,len=this.items.length; i<len; i++) {
if (this.items[i].id === id) {
return this.items[i];
}
}
}
Repository.prototype.add = function(item) {
if (Array.isArray(item)) {
this.items = this.items.concat(item); // concat returns a new array, so reassign
}
else {
this.items.push(item);
}
}
Repository.prototype.remove = function(id) {
for (var i=0,len=this.items.length; i<len; i++) {
if (this.items[i].id === id) {
this.removeIndex(i);
}
}
}
Repository.prototype.removeIndex = function(index) {
if (this.items[index]) {
if (/* this.items[index] has no other references to it */) {
// Only remove item from repository if nothing else references it
this.items.splice(index,1);
return;
}
}
}
Note the line in removeIndex with the comment. I only want to remove the item from my master repository of objects if no other objects have a reference to the item. Here's Collection:
var Collection = function(repo,items) {
this.repo = repo;
this.items = items || [];
}
Collection.prototype.remove = function(id) {
for (var i=0,len=this.items.length; i<len; i++) {
if (this.items[i].id === id) {
// Remove object from this collection
this.items.splice(i,1);
// Tell repo to remove it (only if no other references to it)
this.repo.remove(id); // i is an index into this collection, not into the repo
return;
}
}
}
And then this code uses Repository and Collection:
var contactRepo = new Repository([
{id: 1, name: "Joe"},
{id: 2, name: "Jane"},
{id: 3, name: "Tom"},
{id: 4, name: "Jack"},
{id: 5, name: "Sue"}
]);
var friends = new Collection(
contactRepo,
[
contactRepo.get(2),
contactRepo.get(4)
]
);
var coworkers = new Collection(
contactRepo,
[
contactRepo.get(1),
contactRepo.get(2),
contactRepo.get(5)
]
);
contactRepo.items; // contains item ids 1, 2, 3, 4, 5
friends.items; // contains item ids 2, 4
coworkers.items; // contains item ids 1, 2, 5
coworkers.remove(2);
contactRepo.items; // contains item ids 1, 2, 3, 4, 5
friends.items; // contains item ids 2, 4
coworkers.items; // contains item ids 1, 5
friends.remove(4);
contactRepo.items; // contains item ids 1, 2, 3, 5
friends.items; // contains item ids 2
coworkers.items; // contains item ids 1, 5
Notice how coworkers.remove(2) didn't remove id 2 from contactRepo? This is because it was still referenced from friends.items. However, friends.remove(4) causes id 4 to be removed from contactRepo, because no other collection is referring to it.
Summary
The above is what I want to do. I'm sure there are ways I can do this by keeping track of my own reference counters and such. But if there is a way to do it using javascript's built-in reference management, I'd like to hear about how to use it.
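No, no, no, no; and yes, if you really need to count references you will have to do it manually. JS has no interface to reference counts or to the GC. (It had no weak references either when this was written; WeakMap, WeakRef and FinalizationRegistry exist now, but none of them exposes a reference count.)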
Whilst you could implement a manual reference-counted object list, it's questionable whether all the extra overhead (in performance terms but more importantly code complexity) is worth it.
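To give a feel for that overhead, here is a minimal sketch of manual reference counting. CountingRepository, acquire and release are hypothetical names, not part of the question's code, and every collection would have to remember to call release for each item it drops.

```javascript
// Hypothetical manual reference counting; callers must pair acquire/release.
class CountingRepository {
  constructor(items = []) {
    this.items = new Map(items.map((item) => [item.id, item]));
    this.refCounts = new Map(); // id -> number of outstanding acquires
  }
  // Hands out the item and records one more reference to it.
  acquire(id) {
    const item = this.items.get(id);
    if (item === undefined) return undefined;
    this.refCounts.set(id, (this.refCounts.get(id) || 0) + 1);
    return item;
  }
  // Drops one reference; removes the item once the last reference is released.
  release(id) {
    if (!this.refCounts.has(id)) return; // never acquired: nothing to do
    const remaining = this.refCounts.get(id) - 1;
    if (remaining > 0) {
      this.refCounts.set(id, remaining);
    } else {
      this.refCounts.delete(id);
      this.items.delete(id);
    }
  }
}

const countedRepo = new CountingRepository([
  { id: 1, name: "Joe" },
  { id: 2, name: "Jane" },
  { id: 3, name: "Tom" },
  { id: 4, name: "Jack" },
  { id: 5, name: "Sue" },
]);
[2, 4].forEach((id) => countedRepo.acquire(id));    // friends
[1, 2, 5].forEach((id) => countedRepo.acquire(id)); // coworkers

countedRepo.release(2); // coworkers drops Jane; friends still holds her
countedRepo.release(4); // friends drops Jack; nobody else holds him
```

After these calls id 2 is still in the repository and id 4 is gone, matching the behaviour the question asks for. Id 3 stays because it was never acquired.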
In your example code it would seem simpler to forget the Repository, use a plain Array for your lists, and let standard garbage collection take care of dropping unused people. If you needed to get a list of all people in use, you'd just concat the friends and coworkers lists (and sort/uniquify them if you needed to).
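A minimal sketch of that simpler approach, using plain arrays and a Set to drop duplicates (the variable names are made up for illustration):

```javascript
// Plain objects and arrays; no Repository involved.
const joe = { id: 1, name: "Joe" };
const jane = { id: 2, name: "Jane" };
const jack = { id: 4, name: "Jack" };
const sue = { id: 5, name: "Sue" };

const friendsList = [jane, jack];
const coworkersList = [joe, jane, sue];

// A Set keeps each object once (compared by identity), then sort by id.
const peopleInUse = [...new Set([...friendsList, ...coworkersList])]
  .sort((a, b) => a.id - b.id);

const idsInUse = peopleInUse.map((person) => person.id);
```

Here idsInUse is [1, 2, 4, 5]. Once a person is removed from every list, nothing references the object any more and the garbage collector is free to reclaim it.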
You may be interested in looking into reduce functions and array.map functions. map could be used to help identify where your collections intersect, or whether there is an intersection at all. A user-defined reduce function could be used like a merge (kind of like overriding the addition operator so that you can apply the operation to objects, or merge all collections on "id" if that is how you define your reduce function), then assign the result to your master reference array. I recommend keeping a shadow array that holds all of the root objects/values in case you would like to REWIND or something. Note: one must be careful of prototype chains when reducing an object or array. The map function will be very helpful in this case.
I would suggest not to remove the object or record that is in your Repository as you may want to reference it again later. My approach would be to create a ShadowRepository that would reflect all records/objects that have at least one "Reference". From your description and code presented here it appears you are initializing all of the data and storing reference to 1,2,4,5 as appears in your code.
var contactRepo = new Repository([
{id: 1, name: "Joe"},
{id: 2, name: "Jane"},
{id: 3, name: "Tom"},
{id: 4, name: "Jack"},
{id: 5, name: "Sue"}
]);
var friends = new Collection(contactRepo,[
contactRepo.get(2),
contactRepo.get(4)
]);
var coworkers = new Collection(contactRepo,[
contactRepo.get(1),
contactRepo.get(2),
contactRepo.get(5)
]);
Given the initialization of the Repository and the collections, your requirement ("remove item from repository if there are no references to it") means item 3 would need to be removed immediately. You can however track the references in a few different ways.
I have considered using Object.observe for a similar situation. However, Object.observe does not work in all browsers. I have recently turned to WatchJS
I am working on understanding the code behind Watch.JS to allow a list of observers on an object to be created dynamically. This would also allow one to remove an item that is no longer watched, though I suggest removing the reference at the point of access. What I mean is that a variable that shares the immediate lexical scope with an object that has given a single point of reference to its sibling can be removed, making it no longer accessible outside of the object that had exposed the record/item/property/object of its sibling. With the reference that all of your other references depended on removed, access to the underlying data is stopped. I am generating a unique id for origin references to avoid accidentally reusing the same one.
Thank you for sharing your question and the structure you are using, it has helped me consider one of my own specific cases where I was generating uniquely identified references to a lexical sibling these unique ids were kept on the ONE object that had scope, After reading here I have reconsidered and decided to expose only one reference then assign that reference to a variable name where ever it is needed such as in creating a watcher or observer or other Collection.
