Improving Performance on massive IndexedDB Insert - javascript

We are trying to pre-cache a large amount of data into IndexedDB when our web application loads. From my performance testing, the speed is decent in a desktop browser (e.g. Internet Explorer), where I can insert 10,000 records in around 2 seconds. But the exact same functionality on the iPad drops to 30 seconds. That comparison just blew my mind.
Does anyone know of any hints or tricks for inserting large data sets into IndexedDB? I don't know if it is possible at all, but could we build up a copy of an IndexedDB database server-side with all the data prepopulated, then just ship it over to the client and have it stored directly in the browser? Is anything along these lines doable?
Thanks

I had problems with massive bulk inserts (100,000 - 200,000 records). I've solved all my IndexedDB performance problems using the Dexie library. It has this important feature:
Dexie has a kick-ass performance. Its bulk methods take advantage of
a little-known feature in IndexedDB that makes it possible to store
stuff without listening to every onsuccess event. This speeds up the
performance to a maximum.
Dexie: https://github.com/dfahlander/Dexie.js
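For illustration, here is a minimal sketch of Dexie's bulk API; the database name, table name, and schema are assumptions, not something from the question:

import Dexie from 'dexie';

// hypothetical database and schema, for illustration only
const db = new Dexie('cacheDb');
db.version(1).stores({ records: 'id' }); // 'id' is the primary key

// bulkPut stores the whole array without a per-record onsuccess handler
db.records.bulkPut(records)
  .then(() => console.log('all records stored'))
  .catch(err => console.error('bulk insert failed', err));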

Some pretty bad IndexedDB performance problems are caused by a prolonged period of the browser doing nothing but calling onsuccess callbacks, running into event-loop overhead after the actual work is done. The performance pattern I observed in my app, which was doing this, was that it did a bunch of work and then spent a long stretch answering thousands of callbacks very inefficiently:
In a profiler timeline, that long stretch is the callbacks firing on every request. The solution, of course, is to not put a callback on every request, but it was previously unclear to me how to do this.
The way Dexie.js accomplishes this (for details, see src/dbcore/dbcore-indexeddb.ts) is to save the last request sent (e.g. an IDBObjectStore.put) and set an onsuccess callback on that one alone, which then collects the results from the rest of the requests. Thus it avoids the callback hell.
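In raw IndexedDB terms, the trick looks roughly like this (a sketch of the idea, not Dexie's actual code; store and records are assumed to already exist):

var lastRequest;
for (var i = 0; i < records.length; i++) {
  // fire-and-forget: no handler attached to the individual requests
  lastRequest = store.put(records[i]);
}
// requests in a transaction are processed in order, so when the last
// one succeeds, every earlier put has already been handled
lastRequest.onsuccess = function () {
  console.log('batch written');
};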
Another approach is to use the IDBTransaction.oncomplete event and not worry about callbacks on the individual requests at all.
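A sketch of that variant, assuming an open IDBDatabase db and an object store named 'items':

var tx = db.transaction('items', 'readwrite');
var store = tx.objectStore('items');
records.forEach(function (record) {
  store.put(record); // again, no per-request callbacks
});
tx.oncomplete = function () {
  console.log('transaction committed');
};
tx.onerror = function () {
  console.error('transaction failed', tx.error);
};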
(Note: yes, I know how old this question is. I had this problem today and wanted to put something more useful here, since this question ranks high in Google results.)

How is your data stored in IndexedDB? Is everything in a single object store, or do you use multiple object stores? Do you need all the cached data immediately?
If you only have a single object store, you can start by storing all the data you initially need, commit that transaction, and start a new one for the rest. This way you can start retrieving the initial data while inserting the rest. IndexedDB is async, so it shouldn't block you.
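A rough sketch of that idea (the store name 'cache' and the two record arrays are assumptions):

var tx1 = db.transaction('cache', 'readwrite');
var store1 = tx1.objectStore('cache');
initialRecords.forEach(function (r) { store1.put(r); });
tx1.oncomplete = function () {
  // the initial data is committed; the app can start reading it now
  var tx2 = db.transaction('cache', 'readwrite');
  var store2 = tx2.objectStore('cache');
  remainingRecords.forEach(function (r) { store2.put(r); });
};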
If you have multiple object stores, you can use the same strategy: first fill up the object store you need immediately and delay the others.
Or maybe consider using the AppCache API instead of the IndexedDB API. With it you can just cache a JavaScript file containing all the JSON objects you want to cache. This is more suited to the case where you don't need a lot of querying on the data.

Related

Should updates to Firestore items in AngularFire be done through the AngularFirestoreCollection?

In my app, I have a list that requires an "or" condition. But, as the docs say:
In this case, you should create a separate query for each OR condition and merge the query results in your app.
As a result, in my service, I'm managing two queries and surfacing them as a single observable list to consumers.
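For context, that merge can be done with something like RxJS combineLatest; this is a hedged sketch where the collection path, field names, and query conditions are all invented:

import { combineLatest } from 'rxjs';
import { map } from 'rxjs/operators';

const open$ = afs.collection('items', ref => ref.where('status', '==', 'open'))
  .valueChanges({ idField: 'id' });
const mine$ = afs.collection('items', ref => ref.where('owner', '==', uid))
  .valueChanges({ idField: 'id' });

// merge and de-duplicate, since a doc can match both OR branches
const items$ = combineLatest([open$, mine$]).pipe(
  map(([open, mine]) => {
    const byId = new Map();
    open.concat(mine).forEach(doc => byId.set(doc.id, doc));
    return Array.from(byId.values());
  })
);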
The problem comes in with updating. I have the choice of doing extra work to match the item needing an update to the correct collection so I can do the following:
myCollection.doc(item.id).update(item);
or I can make this much simpler and just:
angularFirestore.doc(`path/to/${item.id}`).update(item);
I'm operating under the assumption that the first method will result in faster updates, since I'm using the same reference, which would optimistically update instantly; and that the latter will be slower and more roundabout, updating the persistence layer first, with the collection reference getting notified about it later (probably still a small delay).
All of the above is assumption, however. I back it up only with a few random instances where I've seen it take a second or two for an update or delete to show up in another part of the view, but I haven't been able to actually inspect the process.
Does anyone know if the above is correct? Should I be doing the extra work to write through the collection references, or does AngularFire (and/or Firestore) handle this and make them effectively the same operation under the hood?
AngularFire2 is a thin wrapper around RxFire, which itself is a relatively thin wrapper around the Firebase JavaScript SDK.
There should be no significant performance difference between updating a document through AngularFire or updating it directly through the JavaScript SDK. In both cases the majority of the time is spent in the JavaScript SDK, and on the wire between the client and server. For this reason I typically update directly through the JavaScript SDK, since it's often a bit more direct and the AngularFire abstraction has little advantage for me in write operations. Given that AngularFire is built on top of this SDK, it picks up the changes instantly even when they're not made through AngularFire.
If you have an instance where this does not seem to be the case, I recommend creating a question with the minimal, complete/standalone code that reproduces that problem.

Slow app with LocalStorage and AngularJS

I created an app that stores, compares, filters, and produces statistics from a collection of records. I've made it work offline, since in some use cases the user might not have constant (or any) internet access.
My problem is that after I've added ~60 records, the app starts to behave really slowly. For instance, I list a collection of simple objects from LocalStorage in an ng-model (select list), and once those ~60 records are in, opening the select box is seriously slowed down.
What could the problem be? I'm thinking either some function is consuming more resources than necessary, or LocalStorage is not intended for such uses?
I'm starting to get into PouchDB, would you say that migrating all to Pouch instead of LocalStorage would be a good move?
I can't paste the whole controller here as it's huge, but I've put an online version for testing. You can see it here.
For you not to have to create 60 records just to see the effect, you can download this CSV and import it in the app.
In order to import, the pass for Edit Mode is: admin
Let's see if someone has a tip for this one!
I see you are storing all your records inside a single LocalStorage value (with the key being recordspax). So yeah, that will get quite slow, because your app has to 1) JSON parse/stringify and 2) store/retrieve the entire list every time you read/write data to the database.
Basically you are reading your entire database in and out of disk for every operation. Since both LocalStorage and JSON stringify/parse happen synchronously on the main thread, it can block DOM rendering and will thus slow down your app.
PouchDB could help here, but you could also benefit from something simpler like LocalForage, or simply from changing your DB design so that every record has its own key/value pair rather than storing everything under a single key with a single value (see the sketch below).
(Both LocalForage and PouchDB use IndexedDB/WebSQL rather than LocalStorage, meaning that database operations are not synchronous and do not block the DOM. However, you still don't want to stuff everything into a single document and therefore read the entire DB in and out of disk. :))
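The per-record-key design mentioned above could look like this; the 'recordspax:' prefix scheme is just an assumption for illustration:

// write and read one record at a time instead of the whole list
function saveRecord(record) {
  localStorage.setItem('recordspax:' + record.id, JSON.stringify(record));
}

function loadRecord(id) {
  var json = localStorage.getItem('recordspax:' + id);
  return json ? JSON.parse(json) : null;
}

Each write now serializes a single record instead of the entire collection.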

Node.js, MongoDB, and Concurrency

I'm working on a game prototype and am worried about the following case: the browser does an AJAX call to Node.js, which has to do several MongoDB operations using async.series.
What prevents multiple requests at the same time from causing database issues? New events (i.e. db operations) seem like they could run out of order or in between the async.series steps.
In other words, what happens if a user fires AJAX calls very quickly, before the prior ones have finished their async.series? Hopefully that makes sense.
If this is indeed an issue, what is the proper way to handle it?
First and foremost, #fmodos's comment should be completely disregarded. It is wrong on many levels, but most simply: you could have any number of nodes running (say, on Heroku), and there is no guarantee that subsequent requests will hit the same node.
Now, I'm going to answer your question by asking more questions. (You really didn't give me a choice here)
What are these operations doing? Inserting documents? Updating existing documents? Removing documents? This is very important, because if all you're doing is inserting documents, then why does it matter if one finishes before the other? If you're updating documents, then you should NOT be issuing a find, grabbing a ref to the object, and then calling save. (I'm assuming you're using Mongoose; if you're not, I would.) Instead, you should be using built-in Mongo operators like $inc, which properly handle concurrent requests.
http://docs.mongodb.org/manual/reference/operator/update/inc/
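To make that concrete, a hedged Mongoose-style sketch (the model and field names are invented):

// race-prone read-modify-write: two concurrent requests can both read
// the same value, and one increment gets lost
Player.findById(id, function (err, player) {
  player.gold += 25;
  player.save();
});

// atomic alternative: Mongo applies the increment server-side,
// so concurrent requests cannot clobber each other
Player.update({ _id: id }, { $inc: { gold: 25 } }, function (err) {
  // handle err
});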
Does that help at all? If not, please let me know and I will give it another shot.
Mongo has database-wide read/write locks. It gives preference to writes on the same collection first, then fulfills reads. So if, by chance, Bill is writing to the db while Joe is reading at the same time, Bill's write will execute first while Joe waits, and only once the write is complete is he given all the data (including Bill's write).

Syncing Database and JavaScript

I'm working on a real-time JavaScript application that requires all changes to a database to be mirrored instantly in JavaScript, and vice versa.
Right now, when changes are made in JavaScript, I make an AJAX call to my API and make the corresponding changes to the DOM. On the server, the API handles the request and finishes up by sending a push via PubNub to the other current JavaScript users with the change that has been made. I also include a sequential changeID so JavaScript can resync the entire data set if it missed a push. Here is an example of such a push:
{
    "changeID": "2857693",
    "type": "update",
    "table": "users",
    "where": {
        "id": "32"
    },
    "set": {
        "first_name": "Johnny",
        "last_name": "Applesead"
    }
}
When JavaScript gets this change, it updates the local storage and makes the corresponding DOM changes based on which table is being changed. Please keep in mind that my issue is not with updating the DOM, but with syncing the data from the database to JavaScript both quickly and seamlessly.
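For what it's worth, applying a push like the one above might look roughly like this (the storage layout and function name are assumptions):

function applyChange(change) {
  var rows = JSON.parse(localStorage.getItem(change.table) || '[]');
  if (change.type === 'update') {
    rows.forEach(function (row) {
      if (row.id === change.where.id) {
        Object.assign(row, change.set);
      }
    });
  }
  localStorage.setItem(change.table, JSON.stringify(rows));
  // if change.changeID skips a number, refetch the full data set
}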
Going through this, I can't help but think that this is a terribly complicated solution to something that should be reasonably simple. Am I missing a gotcha? How would you sync multiple JavaScript clients with a MySQL database seamlessly?
Just to update the question a few months later - I ended up sticking with this method and it works quite well.
I know this is an old question, but I've spent a lot of time working on this exact problem, although in a completely different context. I am creating a PhoneGap app, and it has to work offline and sync at a later point.
The big revelation for me was that what I really need is version control between the browser and the server, so that's what I made. SyncIt stores data in sets and keys within those sets, and versions all of those individually. When things go wrong, there is a conflict-resolution callback you can use to resolve it.
I just put the project on GitHub; its URL is https://github.com/forbesmyester/SyncIt

Can an entire web document be cached/archived (including the precise state of the DOM) and later reloaded?

Imagine that we have loaded a complex website with lots of JavaScript, which has loaded all sorts of info via AJAX and by making computations based on user input. Now suppose we want to archive it in such a way that we can reliably load it later from a file (maybe even without an internet connection) and study its behavior, debug it, etc. Is there a way to do this?
Browsers already do this to make the "Back" button work fast -- in Firefox it's called "bfcache". This cache lives only in memory, though. I don't know if it's possible to serialize it to a file; if yes, that would be very interesting.
I don't think there's a way to export the entire DOM state without manually looking at each piece and storing it. There is a lot more information that goes into representing the DOM than what's visible in the source.
For instance, you might want to save the window scroll position, which is available on the window object as window.scrollX and window.scrollY. This is just one example, but there's plenty of other state information to save, including attached event handlers etc.
If you could identify the pieces that are relevant for your purposes while ignoring others, you could store them locally using Google Gears (now obsolete) or the new Local Storage introduced in HTML5, and since you are already serializing this information, you could pass it on to some server and restore it from there. The new storage mechanism in HTML5 is called DOM Storage, but that name is a little misleading because it's just key-value pair storage where both the keys and the values are strings.
Edit: This might be a different perspective on the problem, but here it goes. Instead of storing the entire DOM state, you could store just the initial state and the relevant actions that change it. To get to the final state, a replay mechanism runs each action in sequence. This is a popular design pattern known as the Command pattern. It's how multiplayer games keep each player up to date and in sync: they pass only the player actions, like a keystroke or mouse movement, instead of the entire view, and the receiver applies those actions to update its state. It's a lot more complicated than that in practice, but that's the crux of it.
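A tiny sketch of that replay idea (the action shapes are made up):

var initialState = { scrollY: 0, items: [] };
var actions = []; // appended as the user interacts, e.g. { type: 'scroll', y: 120 }

function replay(state, actionLog) {
  return actionLog.reduce(function (s, a) {
    switch (a.type) {
      case 'scroll':
        return Object.assign({}, s, { scrollY: a.y });
      case 'addItem':
        return Object.assign({}, s, { items: s.items.concat(a.item) });
      default:
        return s;
    }
  }, state);
}

// persist JSON.stringify({ initialState: initialState, actions: actions }),
// then rebuild the final state later with replay(initialState, actions)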
Where would you want to store it?
Currently there's no way to store much of anything browser-side (apart from new browser features that very few people have installed). The only realistic solution would be a cookie, if the DOM is small enough (a cookie can only hold a limited amount of data).
If you're looking at storing the DOM server-side, you could use document.body.innerHTML to read the current DOM state and then send it to your server.
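As a sketch (the /snapshot endpoint is made up):

// serialize the current markup and ship it to the server
var xhr = new XMLHttpRequest();
xhr.open('POST', '/snapshot');
xhr.setRequestHeader('Content-Type', 'text/html');
xhr.send(document.body.innerHTML);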
