I understand that in Redux's reducers we don't change the state, we make a copy of the state, and then we change the copy.
But what happens if the state is very large? For example, maybe the state holds data we got from the server and is several megabytes in size.
So if we make a copy of it each time in the reducer, wouldn't that slow down the app?
Note: I looked at the Redux documentation (https://redux.js.org/docs/faq/Performance.html) with regard to performance. It speaks about 'shallow copies' vs. 'deep copies', and how shallow copies don't hamper performance as much.
Let's say I have an object of students that is 40mb in size. If I move these to a new state object, does that mean just a pointer gets modified so there's no performance impact?
The short answer: it depends.
Specifically, it depends on the kinds of operations that you are using in your reducers. If you find yourself frequently creating/modifying complex subtrees of your app's state tree, you might want to look at libraries that efficiently handle immutability (see Immutable.js).
If you profile your application and find that you are needlessly modifying some parts of your state's subtrees, you could take a look at redux-ignore, which allows you to ignore certain actions based on predicate functions.
With that said, most applications don't need either of the two libraries above, as you will most likely only be modifying a small section of your state tree in response to an action. If you've split your root reducer into smaller reducers that handle specific parts of your tree (technique explained here), then you are most likely creating only a few objects per action, regardless of how many total objects exist in your state tree.
While it's true that your root reducer is returning a new object for every state change, you are not required to create a deep copy of your previous state for every state change. Most of the time, the object returned by your root reducer will reuse a majority of the objects that were in your previous state.
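For instance, here is a minimal sketch of that reducer-splitting technique (the todos/visibilityFilter slices and action types are hypothetical, not taken from the question):
import { combineReducers } from 'redux';

// A slice reducer returns its previous state, by reference, for any
// action it does not handle, so that whole subtree is reused as-is.
function todos(state = [], action) {
  switch (action.type) {
    case 'ADD_TODO':
      return [...state, { text: action.text, completed: false }];
    default:
      return state; // same reference: no copy made
  }
}

function visibilityFilter(state = 'SHOW_ALL', action) {
  return action.type === 'SET_FILTER' ? action.filter : state;
}

// The root reducer creates one new wrapper object per action, while
// untouched slices keep their old references.
const rootReducer = combineReducers({ todos, visibilityFilter });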
In your example of having a huge object containing large strings, like so:
let tree = {
  student1: ..., // huge string
  student2: ..., // enormous string
  ...            // more huge strings
};
let newTree = Object.assign({}, tree);
Our newTree is actually created extremely quickly, because the strings in the object are passed by reference. Object.assign's job is very easy: it just has to copy the existing references to the student strings into a new object that will serve as our application's updated state tree.
The same conclusion is true if our student properties were pointing to huge objects. Those objects would not be recreated when we create a new tree. Instead, the object references will be copied over to the new state tree.
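You can verify this reference sharing directly (continuing the snippet above):
console.log(newTree === tree);                   // false: the root object is new
console.log(newTree.student1 === tree.student1); // true: the huge string was shared, not copied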
In summary, as long as you are not making deep copies of your state tree for each action, your new state tree will reuse most of your old tree's properties, making the state update performant.
Ideally you're only copying the piece of state that needs to be modified, not an entire large, deeply-nested object. You can read through these links to see discussion points that may be relevant to your implementation/usage. Performance is unlikely to be a real concern if used as designed.
Redux Performance FAQ
Redux Performance Discussion
Reddit Discussion
I have created an immutable map and am currently using it with Redux, and I have some general questions about immutability. From what I understand, when you pass props down to a component, the props trigger an initial render. If the value of a prop is changed, the component doesn't re-render, since JavaScript does an === comparison to check the memory address rather than the values at that address. What immutability does is change the address in memory to trigger the re-render. My concern right now is: aren't we wasting memory resources if I plan on never using the map stored at the old address? Also, if this happens repeatedly as the user clicks, with the immutable map expanding its memory usage more and more, couldn't this cause performance issues? Is there a way to remove the old address in memory after the new one is created?
Here is some of my Redux code; could you give me pointers on whether I am doing anything wrong?
import {Map} from 'immutable'

const likesAndSaved = new Map()

function likesAndSavedReducer(state = likesAndSaved, action) {
  switch (action.type) {
    case 'updateObj':
      return state.set(action.payloadId, action.payloadData)
    default:
      return state;
  }
}
Unreferenced objects are removed from memory by the garbage collector. That also applies to "the old value" in an Immutable.js Map.
To cite the intro page of immutable-js.com:
These data structures are highly efficient on modern JavaScript VMs by using structural sharing via hash maps tries and vector tries as popularized by Clojure and Scala, minimizing the need to copy or cache data.
When you build a Map with 1000 entries and then change a single one, this type of data structure does not have to copy all those references and/or values. Instead it just has to extend a small part of the trie. So even if you kept a reference to both the old and the new state, the increase in memory usage would be minimal. When you clear the reference to one (e.g. the old) state, the JavaScript VM can reclaim that memory whenever it feels like doing so.
You might even gain performance, because shallow-checking properties with === is now enough to determine changes. You may now safely use PureComponent. No need to deeply analyze objects in order to determine whether data has actually changed.
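A small sketch of both points, assuming Immutable.js (the sizes are only illustrative):
const { Map } = require('immutable');

// Build a Map with 1000 entries...
const pairs = [];
for (let i = 0; i < 1000; i++) pairs.push([i, { id: i }]);
const before = Map(pairs);

// ...then change a single one.
const after = before.set(42, { id: 42, updated: true });

console.log(before === after);               // false: a cheap signal that something changed
console.log(before.get(7) === after.get(7)); // true: untouched entries are shared, not copied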
Disclaimer: I am part of the maintainer team of Immutable.js and am also heavily using it in large applications.
In a state-managing JavaScript framework (e.g. React), if you have a collection of objects to store in state, which is the more useful and/or performant dataset type to hold them all: an object or an array? Here are a few of the differences I can think of that might come up when using them in state:
Referencing entries:
With objects you can reference an entry directly by its key, whereas with an array you would have to use a function like dataset.find(). The performance difference might be negligible when doing a single lookup on a small dataset, but I imagine it gets larger if the find function has to pore over a large set, or if you need to reference many entries at once.
Updating dataset:
With objects you can add new entries with {...dataset, [newId]: newEntry}, edit old entries with {...dataset, [id]: alteredEntry} and even edit multiple entries in one swoop with {...dataset, [id1]: alteredEntry1, [id2]: alteredEntry2}. Whereas with arrays, adding is easy [...dataset, newEntry1, newEntry2], but to edit you have to use find(), and then probably write a few lines of code cloning the dataset and/or the entry for immutability's sake. And then for editing multiple entries it's going to either require a loop of find() functions (which sounds bad for large lists) or use filter() and then have to deal with adding them back into the dataset afterwards.
Deleting
To delete a single entry from the object dataset you would do delete dataset[id], and for multiple entries you would either use a loop or a lodash function like _.omit(). To remove entries from an array (and keep it dense) you'd have to either use findIndex() and then .splice(index, 1), or just use filter(), which would work nicely for single or multiple deletes. I'm not sure about the performance implications of any of these options.
Looping/Rendering: For an array you can use dataset.map() or even easily render a specialized set on the fly with dataset.filter() or dataset.sort(). For the object to render in React you would have to use Object.values(dataset) before running one of the other iteration functions on it, which I suppose might create a performance hit depending on dataset size.
Are there any points I'm missing here? Does the usability of either one depend perhaps on how large the dataset is, or possibly how frequent the need to use "look up" operations are? Just trying to pin down what circumstances might dictate the superiority of one or the other.
There's no one real answer; the only valid answer is It depends™.
There are different use-cases that require different solutions. It all boils down to how the data is going to be used.
A single array of objects
Best used when the order matters and when it's likely to be rendered as a whole list, where each item is passed down directly from the list loop and items are rarely accessed individually.
This is the quickest (least developer-time consuming) way of storing received data, if the data is already using this structure to begin with, which is often the case.
Pros of array state
Item order can be tracked easily
Easy looping, where the individual items are passed down from the list
It's often the original structure returned from API endpoints
Cons of an array state
Updating an item would trigger a render of the full list.
Needs a little more code to find/edit individual items (see the sketch after this list).
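For example, the "little more code" an array edit takes might look like this (a sketch; the id field is an assumption about your data):
// Immutably replace one item in an array state, matched by id.
// Untouched items keep their references; only the match is recreated.
function editItem(items, id, changes) {
  return items.map(item => (item.id === id ? { ...item, ...changes } : item));
}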
A single object by id
Best used when the order doesn't matter, and it's mostly used to render individual items, like on an edit item page. It's a step in the direction of a normalized state, explained in the next section.
Pros of an object state
Quick and easy to access/update by id
Cons of an object state
Can't re-order items easily
Looping requires an extra step (e.g. Object.keys().map; see the sketch after this list)
Updating an item would trigger a render of the full list
Likely needs to be parsed into the target state object structure
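A sketch of that extra looping step (items is a hypothetical object keyed by id):
const items = {
  a1: { id: 'a1', name: 'first' },
  b2: { id: 'b2', name: 'second' },
};

// Materialize an array before map/filter/sort can run.
const names = Object.values(items).map(item => item.name);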
Normalized state
Implemented using both an object of all items by id, and an array of all the id strings.
{
  items: {
    byId: { /* ... */ },
    allIds: ['abc123', 'zxy456', /* etc. */],
  }
}
This becomes necessary when:
all use-cases are equally likely,
performance is a concern (e.g. a huge list),
the data is deeply nested and/or duplicated at different levels,
re-rendering the list has undesirable side-effects.
An example of an undesirable side-effect: updating an item, which triggers a full list re-render, could lose a modal open/close state.
Pros
Item order can be tracked
Referencing individual items is quick
Updating an item:
Requires minimal code
Doesn't trigger a full list render, since the full list loops over the allIds strings
Changing the order is quick and clear, with minimal rendering
Adding an item is simple but requires adding it to both datasets
Avoids duplicated objects in nested data structures
Cons
Individual removal is the worst-case scenario, though not a huge deal either.
A little more code is needed to manage the state overall.
Might be confusing to keep both state datasets in sync.
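A sketch of what keeping the two datasets in sync involves (the helper names and shapes here are assumptions, not prescriptions):
// Add: the item goes into byId, its id into allIds.
function addItem(state, item) {
  return {
    byId: { ...state.byId, [item.id]: item },
    allIds: [...state.allIds, item.id],
  };
}

// Remove: drop the entry from byId and filter its id out of allIds.
function removeItem(state, id) {
  const { [id]: removed, ...byId } = state.byId;
  return {
    byId,
    allIds: state.allIds.filter(existingId => existingId !== id),
  };
}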
This approach is a common normalization process used in a lot of places; here are additional references:
Redux's state normalization is a strongly recommended best practice.
The normalizr lib.
I am reading the Facebook tutorial for React. About 40% of the way through, there's a section called Why Immutability is Important where they state (regarding the importance of immutability):
The end result is the same but by not mutating (or changing the underlying data) directly we now have an added benefit that can help us increase component and overall application performance.
My question is: why/how? That is, in React, specifically why/how does immutability (using Object.assign(...), etc.) "help increase...overall application performance"?
I think that it is easier to understand with arrays:
Imagine that you have an array, containing many, many entries.
You replace one entry with another; to see if anything changed, React has to loop through the whole array.
Now, imagine that every time you make a change you create a new array: then the only thing React has to do is compare references.
Object.assign does the same. Instead of changing the existing object, you create a new one, so that React can detect changes simply by comparing references.
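Roughly, in code (a sketch of the comparison, not of React's actual internals):
// Mutating in place keeps the same reference, so a cheap check learns nothing:
const mutated = [1, 2, 3];
mutated[1] = 99; // still the very same array object

// Replacing produces a new reference, so one comparison suffices:
const before = [1, 2, 3];
const after = [...before.slice(0, 1), 99, ...before.slice(2)];
console.log(before === after); // false: something changed, no element-by-element loop needed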
So I started learning React a week ago and I inevitably got to the problem of state and how components are supposed to communicate with the rest of the app. I searched around and Redux seems to be the flavor of the month. I read through all the documentation and I think it's actually a pretty revolutionary idea. Here are my thoughts on it:
State is generally agreed to be pretty evil and a large source of bugs in programming. Instead of scattering it all throughout your app Redux says why not just have it all concentrated in a global state tree that you have to emit actions to change? Sounds interesting. All programs need state so let's stick it in one impure space and only modify it from within there so bugs are easy to track down. Then we can also declaratively bind individual state pieces to React components and have them auto-redraw and everything is beautiful.
However, I have two questions about this whole design. For one, why does the state tree need to be immutable? Say I don't care about time travel debugging, hot reload, and have already implemented undo/redo in my app. It just seems so cumbersome to have to do this:
case COMPLETE_TODO:
  return [
    ...state.slice(0, action.index),
    Object.assign({}, state[action.index], {
      completed: true
    }),
    ...state.slice(action.index + 1)
  ];
Instead of this:
case COMPLETE_TODO:
  state[action.index].completed = true;
Not to mention I am making an online whiteboard just to learn and every state change might be as simple as adding a brush stroke to the command list. After a while (hundreds of brush strokes) duplicating this entire array might start becoming extremely expensive and time-consuming.
I'm OK with a global state tree that is independent from the UI and mutated via actions, but does it really need to be immutable? What's wrong with a simple implementation like this (very rough draft, written in a minute)?
var store = { items: [] };

export function getState() {
  return store;
}

export function addTodo(text) {
  store.items.push({ text: text, completed: false });
}

export function completeTodo(index) {
  store.items[index].completed = true;
}
It's still a global state tree mutated via actions emitted but extremely simple and efficient.
Isn't Redux just glorified global state?
Of course it is. But the same holds for every database you have ever used. It is better to treat Redux as an in-memory database - which your components can reactively depend upon.
Immutability makes checking whether any sub-tree has been altered very efficient, because the check simplifies down to an identity check.
Yes, your implementation is efficient, but the entire virtual DOM will have to be re-rendered each time the tree is manipulated somehow.
If you are using React, it will eventually do a diff against the actual DOM and perform minimal batch-optimized manipulations, but the full top-down re-rendering is still inefficient.
For an immutable tree, stateless components just have to check whether the subtree(s) they depend on differ in identity compared to the previous value(s), and if not, rendering can be avoided entirely.
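A sketch of that check in a component (React.PureComponent gives you the shallow version of this for free; the todos prop shape is an assumption):
import React from 'react';

class TodoList extends React.Component {
  shouldComponentUpdate(nextProps) {
    // With immutable updates, one identity check is enough to know
    // whether this subtree's data changed.
    return nextProps.todos !== this.props.todos;
  }

  render() {
    return (
      <ul>
        {this.props.todos.map(todo => (
          <li key={todo.id}>{todo.text}</li>
        ))}
      </ul>
    );
  }
}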
Yes it is!!!
Since there is no governance of who is allowed to write a specific property/variable/entry to the store, and since you can practically dispatch any action from anywhere, the code tends to become harder to maintain and even spaghetti-like as your code base grows and/or is managed by more than one person.
I had the same questions and issues with Redux when I started using it, so I created a library that fixes these issues:
It is called Yassi:
Yassi solves the problems you mentioned by defining a globally readable and privately writable store. It means that anyone can read a property from the store (as in Redux, but simpler).
However, only the owner of the property, meaning the object that declares the property, can write/update that property in the store.
In addition, Yassi has other perks, such as zero boilerplate for declaring an entry in the store by using annotations (use #yassit('someName')).
Updating the value of that entry does not require actions/reducers or other such cumbersome code snippets; instead, you just update the variable as on a regular object.
Imagine a situation where John has two children, Alice and Bob, and Bob has a cat, Orion.
var Immutable = require('immutable');

var parent = Immutable.Map({name: 'John'});
var childrens = Immutable.List([
  Immutable.Map({name: 'Alice', parent: parent}),
  Immutable.Map({name: 'Bob', parent: parent})
]);
var cat = Immutable.Map({name: 'Orion', owner: childrens.get(1)});
After a few years, John wants to be renamed to Jane.
var renamedParent = parent.set('name', 'Jane');
...and let childrens know about it.
childrens = childrens.map(function(children) {
  // Without the return, map would produce a list of undefined entries.
  return children.set('parent', renamedParent);
});
Then I have to update cat because Bob changed.
cat = cat.set('owner', childrens.get(1));
Is it possible to automatically update all related objects when one object changes? I looked at Cursors, but I'm not sure if they are a solution. If it is possible, can you give me an example?
Paraphrasing the question:
Is it possible to automatically update all related objects when one object changes in an immutable collection?
Short answer
No.
Long answer
No, but nothing ever changes in an immutable data structure so that's not a problem.
Even longer answer
It's more complicated...
Immutability
The whole point of immutable objects is that if you have a reference to an immutable object you don't ever have to bother checking whether any of its properties have changed. So, is it a non-issue? Well...
Consequences
There are some consequences of that - whether they are good or bad depends on your expectations:
There is no difference between pass-by-value and pass-by-reference semantics
Some comparisons can be easier
When you pass a reference to an object somewhere you don't have to worry that some other part of code will change it
When you get a reference to an object from somewhere you know it will never change
You avoid some problems with concurrency because there is no notion of a change in time
When nothing ever changes you don't have to worry whether changes are atomic
It's easier to implement software transactional memory (STM) with immutable data structures
But the world is mutable
Of course, in practice we often deal with values that change in time. It may seem that immutable state can't describe a mutable world, but there are some ways people deal with it.
Look at it this way: if you have your address printed on some ID card and you move to a different address, that ID card has to be changed to be consistent with the new, true data, because it is no longer true that you live at that address. But when you buy something and get an invoice containing your address, and you then change your address, the invoice stays the same, because it was still true that you lived at that address when the invoice was written. Some data representations in the real world are immutable, like invoices in that example, and some are mutable, like ID cards.
Now taking your example: if you choose to use immutable structures to model your data, you have to think about them the way you think about the invoice from my example. It may be true that the data is not up to date, but it will always be consistent and true for some point in time, and it will never change.
How to deal with change
So how to model change with immutable data? There is a nice way it has been solved in Clojure using Vars, Refs, Atoms, and Agents, and ClojureScript (a compiler for Clojure that targets JavaScript) supports some of them (in particular Atoms are supposed to work as in Clojure but there are no Refs, STM, Vars or Agents - see what are the differences between ClojureScript and Clojure regarding concurrency features).
Looking at how Atoms are implemented in ClojureScript it seems that you can just use ordinary JavaScript objects to achieve the same. It will work for things like having a JavaScript object that is itself mutable, but it has a property that is a reference to an immutable object - you will not be able to change any properties of that immutable object, but you will be able to construct a different immutable object, and swap the old one to the new one in your top-level mutable object.
Other languages like Haskell that are purely functional may have different ways to deal with mutable world, like monads (a concept notoriously hard to explain - Douglas Crockford, author of JavaScript: The Good Parts and discoverer of JSON attributes it to "the monadic curse" in his talk Monads and Gonads).
Your question seems simple but the problem it touches is actually quite complicated. Of course it would be missing the point to just answer "No" to your question whether it is possible to automatically update all related objects when one object changes, but it is more complicated than that, and saying that in immutable objects nothing ever changes (so this problem never happens) will be equally unhelpful.
Possible solutions
You can have a top level object or variable from which you always access all your structures. Let's say you have:
var data = { value: Immutable.Map({...}) }
If you always access your data using data.value (or some better name), then you can pass the data to some other part of your code, and whenever your state changes you can just assign a new Immutable.Map({...}) to your data.value; at that point all of your code that uses data will get fresh values.
How and when to update the data.value to a new immutable structure could be solved by making it automatically triggered from your setter functions that you would use to update your state.
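For example, a minimal sketch of such a setter (the render function is a stand-in for whatever should react to a state change, not a real API):
var Immutable = require('immutable');

var data = { value: Immutable.Map({name: 'John'}) };

function render() {
  // hypothetical: re-render the UI, or notify whatever depends on data
  console.log('state is now', data.value.toJS());
}

function setState(updater) {
  // Swap in a brand-new immutable value; every piece of code that
  // reads data.value afterwards sees the fresh state.
  data.value = updater(data.value);
  render();
}

setState(function (state) { return state.set('name', 'Jane'); });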
Another way would be to use similar tricks at the level of every structure, for example - I use the original spelling of the variables:
var parent = {data: Immutable.Map({name: 'John'})};
var childrens = {data: Immutable.List([
  Immutable.Map({name: 'Alice', parent: parent}),
  Immutable.Map({name: 'Bob', parent: parent})
])};
but then you have to remember that what you have as values are not immutable structures but rather those additional objects holding references to immutable structures, which introduces an additional level of indirection.
Some reading
What I would suggest is to look at some of those projects and articles:
Immutable React article by Peter Hausel
Morearty.js - using immutable state in React like in Om but written in pure JavaScript
react-cursor - functional state management abstraction for use with Facebook React
Omniscient - a library providing an abstraction for React components that allows for fast top-down rendering embracing immutable data
Om - ClojureScript interface to React
mori - a library for using ClojureScript's persistent data structures in JavaScript
Fluxy - an implementation of Facebook's Flux architecture
Facebook's Immutable - persistent data collections for JavaScript that, when combined with Facebook React and Facebook Flux, face similar problems to yours (searching for how to combine Immutable with React and Flux may give you some good ideas)
I hope this answer, even if it does not give a simple solution, will nevertheless be helpful, because describing a mutable world with immutable structures is a very interesting and important problem.
Other approaches
For a different approach to immutability, with immutable objects that are proxies to data that is not necessarily constant, see the Immutable Objects vs. Common Sense webinar by Yegor Bugayenko and his articles:
Objects Should Be Immutable
How Immutability Helps
How an Immutable Object Can Have State and Behavior?
Immutable Objects Are Not Dumb
Yegor Bugayenko uses the term "immutable" in a slightly different sense than what it usually means in the context of functional programming. Instead of using immutable or persistent data structures and functional programming, he advocates the use of object-oriented programming in the original sense of the term, in a way that you never actually mutate any object, but you can ask it to change some state, or mutate some data, that is itself considered to be separate from the object.

It's easy to imagine an immutable object that talks to a relational database. The object itself can be immutable, but it can still update the data stored in the database. It's somewhat harder to imagine that some data stored in RAM can be thought of as equally separate from the object as the data in the database, but there is really not much difference. If you think about objects as autonomous entities that expose some behavior, and you respect their abstraction boundaries, then it actually makes a lot of sense to think about objects as something different than just data, and then you can have immutable objects with mutable data.
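A toy sketch of that idea (all names here are invented for illustration):
// The object itself is immutable: fixed identity, frozen fields.
// The data it manages lives elsewhere and can still change.
const db = new Map([[1, { name: 'John' }]]);

class UserRecord {
  constructor(storage, id) {
    this.storage = storage; // handle to external, mutable storage
    this.id = id;           // fixed identity
    Object.freeze(this);    // the object's own fields never change
  }
  name() {
    return this.storage.get(this.id).name; // reads the current data
  }
  rename(newName) {
    // Mutates the storage, not the object.
    this.storage.set(this.id, { ...this.storage.get(this.id), name: newName });
  }
}

const user = new UserRecord(db, 1);
user.rename('Jane');
console.log(user.name()); // 'Jane': the object never changed, its data did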
Please comment if anything needs some clarification.
As rsp clarified, this problem reflects the inherent trade-off offered by immutability.
Here's an example React/Flux-ish app demonstrating one way to deal with this problem. The key here is that numeric IDs are used for references rather than JavaScript references.
From what I understand, you might be using immutable objects at a bit too high a level. As far as I know, immutable objects are useful for representing time-captured states, so it's easy to compare state1 === state2. In your example, however, you have also time-captured the relationships in the state. That is fine for comparing the complete state with another complete state, because it's unchanging data and very readable, but it is therefore also hard to update. I suggest that before you add another library to fix this problem, you try to see whether you can apply Immutable at a lower level, where you actually NEED to compare time-captured states. For example, to know whether one entity (person/cat) changed.
So where you do want to use immutable objects, use them. But if you want dynamic objects, don't use immutable objects.
It might be that this situation can be solved by using ids to reference the parents rather than the parent records themselves. I'm looking to SQL (sorta) for guidance here: a record should not hold a pointer to another record in memory, but rather should know the id of the other record.
In my apps I like having each object know only the primary keys of the objects it references, so I don't have to worry about questions like "what if an object changes?". As long as its key stays the same (as it should), I can always find it:
var Immutable = require('immutable');

var people = Immutable.OrderedMap({
  0: Immutable.Map({name: 'John', id: '0'}),
  1: Immutable.Map({name: 'Alice', id: '1', parentId: '0'}),
  2: Immutable.Map({name: 'Bob', id: '2', parentId: '0'})
});
var cat = Immutable.Map({name: 'Orion', ownerId: '2'});
Now I can change any characteristic of any record (except the id of course) using the standard immutable.js tools without having to do any additional updating.
people = people.set('0', people.get('0').set('name', 'Jane'))
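The payoff is that related records never need cascading updates; following the stored id always finds the current version (continuing the snippet above):
// Jane's rename touched only her own record. The cat still finds
// its owner through the stable id.
var owner = people.get(cat.get('ownerId'));
console.log(owner.get('name'));           // 'Bob'
console.log(people.get('0').get('name')); // 'Jane'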
After being frustrated and getting nowhere with this exact same problem, I decided to move relationships out of my objects completely and manage the logic of the various common associations (one-to-one, one-to-many, many-to-many) separately.
Essentially, I'm managing relationships with bi-directional maps and keeping the relationship logic centralized in one object, instead of in each piece of my modeled data as is traditionally the case. That means I needed to come up with the logic to observe, for example, when a parent has changed, and propagate the changes to the child relationships (hence the bimaps). Since it's abstracted in its own module, however, it can be reused to model future relationships.
I decided not to go with ImmutableJS for handling this, instead rolling my own library (https://github.com/jameslk/relatedjs) using ES6 features, but I imagine it can be done in a similar way.