How to call a mongodb function using monk/node.js?

Trying to implement this:
http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/#auto-increment-counters-collection
to provide sequence counters for a few things.
I've got the function stored in my db, but I can't call it without an error:
var model_id = db.eval('getNextSequence("model")');
Returns:
Object # has no method 'getNextSequence'
Is this because monk doesn't support the use of db functions via eval?

The getNextSequence method is part of your application code and runs in your application, not in the database. In the example from the mongodb docs, that application is the mongo shell, which is basically a simple JavaScript wrapper where you can easily declare your own methods.
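For illustration, here is a minimal sketch of that counter pattern living in application code with monk. The counters collection, the seq field, and the connection string are assumptions, and the exact option for returning the updated document varies between monk/driver versions:

```javascript
var monk = require('monk');
var db = monk('localhost/mydb');     // placeholder connection string
var counters = db.get('counters');   // documents look like { _id: 'model', seq: 3 }

// Atomically increment and return the next value for a named sequence.
function getNextSequence(name) {
  return counters.findOneAndUpdate(
    { _id: name },
    { $inc: { seq: 1 } },
    { upsert: true, returnDocument: 'after' } // older drivers use { returnOriginal: false }
  ).then(function (doc) {
    return doc.seq;
  });
}

getNextSequence('model').then(function (modelId) {
  console.log('next model id:', modelId);
});
```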
In any case, implementing reliable, gap-free counters isn't trivial and should be avoided unless absolutely required:
incrementing a counter before insert will lead to gaps if the subsequent insert fails because of client crash, network partition, etc.
incrementing the counter after insert requires an optimistic insert loop (which doesn't need an explicit counter, as demonstrated in the link; see the sketch at the end of this answer). This is more reliable, but gets inefficient with concurrent writers because it performs lots of queries and failed updates.
In a nutshell, such counters are OK if you're using them e.g. within a single account / tenant where the users of an account are humans. If your accounts are huge or if you have API clients, things get messy because they might burst and do 10,000 inserts in a few seconds which leads to a whole lot of conflicts.
Never use increment keys as primary keys.
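For completeness, here is a rough sketch of that optimistic insert loop, again with monk and reusing db from the snippet above. The models collection, the seq field, and the retry limit are invented, and it assumes a unique index on seq:

```javascript
var models = db.get('models'); // assumes a unique index on { seq: 1 }

// Read the current maximum, try to insert with max + 1, and retry if another
// writer grabbed that value first (duplicate key error, code 11000).
function insertWithSeq(doc, attempt) {
  attempt = attempt || 0;
  return models.findOne({}, { sort: { seq: -1 } })
    .then(function (last) {
      doc.seq = last ? last.seq + 1 : 1;
      return models.insert(doc);
    })
    .catch(function (err) {
      if (err.code === 11000 && attempt < 5) {
        return insertWithSeq(doc, attempt + 1);
      }
      throw err;
    });
}
```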

Related

Should updates to Firestore items in AngularFire be done through the AngularFirestoreCollection?

In my app, I have a list that requires an "or" condition. But, as the docs say:
In this case, you should create a separate query for each OR condition and merge the query results in your app.
As a result, in my service, I'm managing two queries and surfacing them as a single observable list to consumers.
The problem comes in with updating. I have the choice of doing extra work to match up the item needing update to the correct collection so I can do the following:
myCollection.doc(item.id).update(item);
or I can make this much more simple and just:
angularFirestore.doc(`path/to/${item.id}`).update(item);
I'm operating under the assumption that the first method will result in faster updates, since I'm using the same reference, so it would be optimistically updated instantly. And that the latter will be slower because it takes a more roundabout path: it updates the persistence layer first, and the collection reference only gets notified about the change afterwards (probably still a small delay).
All of the above is assumption, however. I back it up only with a few random instances where I've seen it take a second or two for an update or delete to show up in another part of the view, but I haven't been able to actually inspect the process.
Does anyone know if the above is correct? Should I be doing the extra work to write through the collection references or does angularfire(and/or firestore) handle this and make them effectively the same operation under the hood?
AngularFire2 is a thin wrapper around RxFire, which itself is a relatively thin wrapper around the Firebase JavaScript SDK.
There should be no significant performance difference between updating a document through AngularFire or updating it directly through the JavaScript SDK. In both cases the majority of the time is spent in the JavaScript SDK, and on the wire between the client and server. For this reason I typically update directly through the JavaScript SDK, since it's often a bit more direct and the AngularFire abstraction has little advantage for me in write operations. Given that AngularFire is built on top of this SDK, it picks up the changes instantly even when they're not made through AngularFire.
If you have an instance where this does not seem to be the case, I recommend creating a question with the minimal, complete/standalone code that reproduces that problem.

Can Firebase transform data server-side before writing it?

According to this documentation, and this accompanying example, Firebase tends to follow the following flow when transforming newly written data:
Client writes data to Firebase, which is immediately accepted
The supplied Cloud Function is triggered, which transforms the data (in the example above, it removes swear words)
The transformed data is written again, overwriting the original data written in step 1
Maybe I'm missing something here, but this flow seems to present some problems. For example, if there is an error in step 2 above, and step 3 is never fired, the un-transformed data will just linger in the database. It seems like it would be better to transform the data as soon as it hits the server, but before writing. This would be followed by a single write operation, which will leave no loose artifacts behind if it fails. Is there any way in the current Firebase + Google Cloud Functions stack to add these types of pre-write data transforms?
My (tentative and weird) solution so far is to have a "shadow" /_temp/{endpoint} area in my Firebase db, so that when I want to write to /{endpoint}, I write there instead, which then triggers the relevant cloud function to do the transformation before writing to /{endpoint}. This at least prevents potentially incomplete data from leaking into my database, but it seems very inelegant and "hacky."
I'd also be interested to know if there are any server-side methods for transforming data before responding to read requests.
There is no hook in the Firebase Database (neither through Cloud Functions nor elsewhere) that allows you to modify values before they're written to the database. The temporary queue is the idiomatic way to address this use case; it functions much like a moderation queue in most forum software.
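A rough sketch of that queue pattern as a Realtime Database trigger; the /_temp/messages path, the target /messages path, and removeSwearWords are all placeholders, not Firebase APIs:

```javascript
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

// Hypothetical transform; substitute your real moderation/validation logic.
function removeSwearWords(text) {
  return String(text || '').replace(/badword/gi, '*****');
}

// Move newly queued items to their real location after transforming them,
// then delete the queued copy.
exports.moderate = functions.database
  .ref('/_temp/messages/{pushId}')
  .onCreate((snapshot, context) => {
    const raw = snapshot.val();
    const clean = Object.assign({}, raw, { text: removeSwearWords(raw.text) });
    return admin.database()
      .ref(`/messages/${context.params.pushId}`)
      .set(clean)
      .then(() => snapshot.ref.remove());
  });
```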
You could use an HTTP Function to create an endpoint that your code calls and then perform the transformation there. You could use a similar pattern for reading data, although you'd have to rebuild the realtime synchronization capabilities of Firebase yourself.
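Continuing the same hypothetical functions file, an HTTP Function that transforms first and then performs the one and only write might look like this:

```javascript
// HTTPS endpoint: transform/validate before anything touches the database.
// The '/messages' path is a placeholder.
exports.writeMessage = functions.https.onRequest((req, res) => {
  const clean = removeSwearWords(req.body.text);
  const ref = admin.database().ref('/messages').push(); // key generated locally
  ref.set({ text: clean })
    .then(() => res.status(201).json({ id: ref.key }))
    .catch(err => res.status(500).send(err.message));
});
```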

Improving Performance on massive IndexedDB Insert

We are trying to pre-cache a large amount of data into IndexedDB on load of our web application. From my performance testing the speed is decent on a desktop browser (e.g. Internet Explorer), where I can insert 10,000 records in around 2 seconds. But the exact same functionality on the iPad drops to 30 seconds. That comparison just blew my mind.
Does anyone know of any hints or tricks for inserting large data sets into IndexedDB? I don't know if it is possible at all, but could we build up a copy of an IndexedDB database server-side with all the data pre-populated and then just ship it over to the client so it stores it straight into the browser? Is anything along those lines doable?
Thanks
I had problems with massive bulk inserts (100,000–200,000 records). I've solved all my IndexedDB performance problems using the Dexie library. It has this important feature:
Dexie has a kick-ass performance. Its bulk methods take advantage of a not well known feature in IndexedDB that makes it possible to store stuff without listening to every onsuccess event. This speeds up the performance to a maximum.
Dexie: https://github.com/dfahlander/Dexie.js
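For reference, a minimal Dexie sketch; the database name, store name, and key are assumptions:

```javascript
// npm install dexie
const Dexie = require('dexie');

const db = new Dexie('cacheDB');
db.version(1).stores({ records: 'id' }); // 'id' is the assumed primary key

const largeArrayOfRecords = []; // imagine the 10,000+ records fetched from your server

// bulkPut avoids listening to every individual onsuccess event.
db.records.bulkPut(largeArrayOfRecords)
  .then(() => console.log('all records stored'))
  .catch(Dexie.BulkError, (err) => {
    console.error(err.failures.length + ' records failed');
  });
```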
Some pretty bad IndexedDB performance problems can be caused by a prolonged period of the browser just calling onsuccess callbacks and running into event-loop overhead after the work is actually done. The pattern I observed in my app was that it did a bunch of work, then spent a long tail just answering thousands of per-request callbacks very inefficiently (clearly visible in a profiler trace). The solution, of course, is not to put a callback on every request, but it was previously unclear to me how to do that.
The way that Dexie.js accomplishes this (for details, see src/dbcore/dbcore-indexeddb.ts) is that it keeps track of the last request sent (e.g. IDBObjectStore.put) and sets an onsuccess callback only on that one, which then collects the results from the rest of the requests. This way it avoids the callback flood.
Another approach is to use the IDBTransaction.oncomplete event and not worry about callbacks on the individual requests at all.
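A hedged sketch of that second approach with plain IndexedDB (the store name is made up): no per-request handlers, just the transaction's completion events:

```javascript
// Bulk insert without per-request success callbacks; rely on the
// transaction's oncomplete/onerror events instead.
function bulkInsert(db, records) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('records', 'readwrite');
    const store = tx.objectStore('records');
    for (const record of records) {
      store.put(record); // no onsuccess handler on the individual requests
    }
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}
```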
(note: yes, I know how old this question is, I had this problem today and wanted to put something more useful for this question which is high in Google results)
How is your data stored in IndexedDB? Is everything in a single object store, or do you use multiple object stores? Do you need all the cached data immediately?
If you only have a single object store, you can start by storing the data you initially need, commit that transaction, and start a new one for all the rest. This way you can start retrieving the initial data while inserting the rest. IndexedDB is async, so it shouldn't block you.
If you have multiple object stores, you can use the same strategy: first fill up the object store you need immediately and delay the others.
Or maybe consider using the AppCache API instead of the IndexedDB API. With it you can just cache a JavaScript file containing all the JSON objects you want to cache. This fits better when you don't need a lot of querying on the data.

When is it appropriate to use a setTimeout vs a Cron?

I am building a Meteor application that is using a mongo database.
I have a collection that could potentially have 1000s of documents that need to be updated at different times.
Do I run setTimeouts on creation or a cron job that runs every second and loops over every document?
What are the pros and cons of doing each?
To put this into context:
I am building an online tournament system. I can have 100s of tournaments running which means I could have 1000s of matches.
Each match needs to absolutely end at a specific time, and can end earlier under a condition.
Using an OS-level cron job won't work because you can only check with a 60-second resolution. So by "cron job", I think you mean a single setTimeout (or synced-cron). Here are some thoughts:
Single setTimeout
strategy: Every second wake up and check a large number of matches, updating those which are complete. If you have multiple servers, you can prevent all but one of them from doing the check via synced-cron.
The advantage of this strategy is that it's straightforward to implement. The disadvantages are:
You may end up doing a lot of unnecessary database reads.
You have to be extremely careful that your processing time does not exceed the length of the period between checks (one second).
I'd recommend this strategy if you are confident that the runtime can be controlled. For example, if you can index your matches on an endTime so only a few matches need to be checked in each cycle.
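A minimal sketch of that loop in Meteor (the collection and field names are assumptions; wrap it in synced-cron if you run more than one server):

```javascript
// Meteor server code. Assumes an index on { finished: 1, endTime: 1 }
// so each tick only touches the few matches that are actually due.
Meteor.setInterval(function () {
  Matches.update(
    { finished: false, endTime: { $lte: new Date() } },
    { $set: { finished: true } },
    { multi: true }
  );
}, 1000);
```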
Multiple setTimeouts
strategy: Add a setTimeout for each match on creation or when the server starts. As each timeout expires, update the corresponding match.
The advantage of this strategy is that it potentially removes a considerable amount of unnecessary database traffic. The disadvantages are:
It may be a little trickier to implement, e.g. you have to consider what happens on a server restart.
The naive implementation doesn't scale past a single server (see 1).
I'd recommend this strategy if you think you will use a single server for the foreseeable future.
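And a sketch of the per-match timers (same invented collection and field names, with endTime assumed to be a Date), including re-arming them after a restart:

```javascript
// Arm one timer per open match; the update is a no-op if the match
// already ended earlier for some other reason.
function scheduleMatchEnd(match) {
  const delay = Math.max(0, match.endTime.getTime() - Date.now());
  Meteor.setTimeout(function () {
    Matches.update(
      { _id: match._id, finished: false },
      { $set: { finished: true } }
    );
  }, delay);
}

// Re-create timers lost in a server restart.
Meteor.startup(function () {
  Matches.find({ finished: false }).forEach(scheduleMatchEnd);
});
```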
Those are the trade-offs which occurred to me given the choices you presented. A more robust solution would probably involve technology outside of the meteor/mongo stack. For example, storing match times in redis and then listening for keyspace notifications.
This is all a matter of preference, to be honest with you.
I'm a big fan of writing small, independent programs, that each do one thing, and do it well. If you're also like this, it's probably better to write separate programs to run periodically via cron.
This way you get guaranteed OS-controlled precision for the time, and small, simple programs that are easy to debug outside the context of your webapp.
This is just a preference though.

Node.js, MongoDB, and Concurrency

I'm working on a game prototype and worried about the following case: Browser does AJAX to Node.JS, which has to do several MongoDB operations using async.series.
What prevents multiple requests arriving at the same time from causing database issues? New events (i.e. db operations) seem like they could be run out of order or in between the async.series steps.
In other words, what happens if a user fires AJAX calls very quickly, before the prior ones have finished their async.series? Hopefully that makes sense.
If this is indeed an issue, what is the proper way to handle it?
First and foremost, #fmodos's comment should be completely disregarded. It is wrong on many levels, but most simply: you could have any number of nodes running (say, on Heroku), and there is no guarantee that subsequent requests will hit the same node.
Now, I'm going to answer your question by asking more questions. (You really didn't give me a choice here)
What are these operations doing? Inserting documents? Updating existing documents? Removing documents? This is very important, because if all you're doing is inserting documents, then why does it matter if one finishes before the other? If you're updating documents, you should NOT be issuing a find, grabbing a ref to the object, and then calling save. (I'm assuming you're using mongoose; if you're not, I would.) Instead, you should be using built-in mongo update operators like $inc, which handle concurrent requests properly.
http://docs.mongodb.org/manual/reference/operator/update/inc/
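For example, a hedged mongoose sketch (the model and field names are placeholders; older mongoose versions call this Model.update rather than updateOne):

```javascript
const mongoose = require('mongoose');

// Hypothetical model, just for illustration.
const Player = mongoose.model('Player', new mongoose.Schema({
  name: String,
  gold: Number
}));

// Atomic increment: concurrent requests each apply their own $inc on the
// server, so there is no read-modify-write race to worry about.
function addGold(playerId, amount) {
  return Player.updateOne({ _id: playerId }, { $inc: { gold: amount } }).exec();
}
```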
Does that help at all? If not, please let me know and I will give it another shot.
Mongo has database-wide read/write locks. It gives preference to writes on the same collection first, then fulfills reads. So if, by chance, Bill is writing to the db while Joe is reading at the same time, Bill's write will execute first while Joe waits until the write is complete, and then he is given all the data (including Bill's).
