Do I need synchronous calls / transactions using mongodb from node.js? - javascript

I'm currently experimenting with node.js and WebSocket, building a simple MMO server and client (nothing complex, just to learn node.js and HTML5). Basic functionality is complete except data persistence. So far, all data has been kept in memory, but now I would like to add persistent storage with mongodb (or something similar). My question is about how to realize the communication between the node.js application and the database.
The main activity of the server can be described as follows:
receive incoming message (via WebSocket)
-->
1. read data from DB
2. computation on data from step 1
3. read more data from DB (data required depends on outcome of step 2)
4. computation on data from steps 1 and 3
5. write data to DB (create or update)
-->
send response message (via WebSocket)
(steps 3-5 do not always occur)
I guess this is a very common use case.
Questions:
1. Since the data read in steps 1 and 3 must be consistent AND the write in step 5 might no longer be valid if data in the DB has changed between steps 3 and 5, it seems to me that I cannot use async calls to the DB at all (because then data in the DB might be changed by other code in between the above steps). Is that correct?
2. When thinking of a deployment with multiple instances of node.js working on the same DB (I think this is what nodejitsu calls "drones", right?), I even have to use database transactions spanning steps 1 through 5. Is that correct?
3. It seems to me that using synchronous calls to the DB and having transactions in all these cases would be poor design and introduce performance issues. Is there a better way to do this?
Any hints would be greatly appreciated! Thanks so much in advance!!

Question 1:
node.js is asynchronous. That is a fact. If you are not able to design your app around this pattern, you cannot use node.js.
Of course, there are techniques and patterns that allow you to do what you want within an asynchronous environment. Here are some keywords, functions, and links that might point you in the right direction:
Queueing
Optimistic concurrency control
mongoDB/findAndModify
redis/transactions
Question 2:
It might be nice to use transactions, but at the time of writing you cannot, because mongoDB does not have any (multi-document transactions only arrived later, in MongoDB 4.0). You have to program them yourself on the node.js side. That's one of the mongoDB fundamentals: all power to the programmer (and not the database). Weal and woe!
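As a minimal sketch of the optimistic concurrency control keyword from the list above: each document carries a version counter, and the final write only succeeds if the version is still the one that was read, otherwise the whole read-compute-write cycle starts over. The in-memory `createStore` below is a hypothetical stand-in for a collection, not the MongoDB API; with a real driver the conditional write would typically be an update whose filter includes the version field.

```javascript
// Hypothetical in-memory stand-in for a collection; not the MongoDB API.
function createStore() {
  const docs = new Map();
  return {
    async read(id) {
      const doc = docs.get(id);
      return doc ? { ...doc } : null;
    },
    // Conditional write: applied only if the stored version still matches.
    async writeIfVersion(id, expectedVersion, fields) {
      const current = docs.get(id);
      const currentVersion = current ? current.version : 0;
      if (currentVersion !== expectedVersion) return false;
      docs.set(id, { ...fields, version: expectedVersion + 1 });
      return true;
    },
  };
}

// Read, compute, write; start over if another writer got in between.
async function updateWithRetry(store, id, compute, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = (await store.read(id)) || { version: 0 };
    const { version, ...fields } = doc;
    const next = await compute(fields);
    if (await store.writeIfVersion(id, version, next)) return next;
  }
  throw new Error('too many concurrent modifications of ' + id);
}
```

All calls stay asynchronous; consistency comes from the conditional write, not from blocking. The same check also works when several node.js instances share one database, because the comparison happens where the data lives.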

Related

Redux heavy lifting: where should I perform the download and saving to FS of multiple files?

I have a React+Redux application (built and distributed with electron) that once a day, at a given hour in the night, should download and save to the user's filesystem multiple files.
The API calls and the overall number of operations seem to me a bit too much to be done in the reducers, so I'm here to ask if there is a better design pattern for this.
Just to give you an idea, here are the operations I should perform to complete this task:
[API call] get a list of folders from the remote service
[API call] for each folder, get a list of contents
[FS]: verify if the local content is present and the same version as the remote one
[API call] if not, download content
[FS] save content to filesystem
The number of involved folders ranges from 10 to 30, and the contents could easily go up to 100 or more.
Key points:
The user is not using the app during this operation, so no need for webworkers or other async black magic
The sync could be done by an external script in another language, but I'd rather keep all the logic in a single app for ease of distribution and setup
All the points above marked as [API call] are asynchronous in my current setup, so there's a bit of non-trivial callback management involved
Any idea on where I could put this whole bunch of code, while still keeping my code readable and maintainable? Should it be the reducer, action creator, container component, presentational component or something else?
Thanks!
I'd create a redux-saga for this task. Check out the readme; it gives a pretty good idea of what it is. It can be quite difficult to understand at first, but in the end you get code that is easy to read and test.
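Whatever ends up driving it (a saga, a thunk, or a plain timer), the five steps from the question can live in one function outside the reducers, which then only receive the resulting actions. A minimal sketch, assuming hypothetical `api` and `localFiles` helpers that are not part of Redux or redux-saga:

```javascript
// `api` and `localFiles` are hypothetical helpers injected by the caller:
//   api.listFolders()                   -> Promise<string[]>
//   api.listContents(folder)            -> Promise<{ name, version }[]>
//   api.download(folder, name)          -> Promise<data>
//   localFiles.getVersion(folder, name) -> Promise<version | null>
//   localFiles.save(folder, item, data) -> Promise<void>
async function syncAll(api, localFiles) {
  const downloaded = [];
  const folders = await api.listFolders();
  for (const folder of folders) {
    const contents = await api.listContents(folder);
    for (const item of contents) {
      const localVersion = await localFiles.getVersion(folder, item.name);
      if (localVersion === item.version) continue; // already up to date
      const data = await api.download(folder, item.name);
      await localFiles.save(folder, item, data);
      downloaded.push(folder + '/' + item.name);
    }
  }
  return downloaded; // e.g. report this in a single "sync finished" action
}
```

Inside a saga, a function like this could be invoked through redux-saga's `call` effect, for example `yield call(syncAll, api, localFiles)`. Because the dependencies are injected, it can also be tested without a network or filesystem.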

How to tell whether my firebase node is used?

Firebase offers some overall Analytics in their App Dashboard; however, I need to know, on a per-node basis, whether my stored data is ever used or is just lying idle.
Why? It's simple: we are learning while developing, which makes the app a very fast evolving one. Not only the logic changes, but also the data stored need to be refactored from time to time. I would like to get rid of abandoned and forgotten data. Any ideas?
In best case, I would like to know this:
When was a node used last time? (was it used at all?)
How many times was it used in 1h/24h/1w/1M?
Differentiate between read/write operations
2017 update
Cloud Functions trigger automatically and run on server.
https://firebase.google.com/docs/functions/
https://howtofirebase.com/firebase-cloud-functions-753935e80323
2016 answer
Apparently Firebase itself doesn't provide any of this.
The only way I can think of right now is to create wrappers for the firebase query and write functions and either do the statistics in a client app or create a dedicated node for storing the statistical data.
In case of storing the data in firebase, the wrapper for writing functions (set, update, push, remove, setWithPriority) is relatively easy. The query functions (on, once) will have to write in a successCallback.
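A minimal sketch of such a wrapper, assuming an SDK version whose `once` and `set` return promises. `trackUsage`, `createUsageLog`, and the `recordUsage` callback are hypothetical names introduced here for illustration; only `once` and `set` are Firebase reference methods.

```javascript
// Wraps a Firebase-style reference so that every read and write is counted.
// `ref` only needs `once(eventType)` and `set(value)` returning promises;
// `recordUsage(kind, path)` is a hypothetical callback that stores the statistic.
function trackUsage(ref, path, recordUsage) {
  return {
    once(eventType) {
      return ref.once(eventType).then((snapshot) => {
        recordUsage('read', path); // counted only after the read succeeded
        return snapshot;
      });
    },
    set(value) {
      return ref.set(value).then((result) => {
        recordUsage('write', path);
        return result;
      });
    },
  };
}

// One possible recorder: in-memory counters plus a last-used timestamp.
function createUsageLog() {
  const stats = {};
  return {
    stats,
    record(kind, path) {
      const entry = stats[path] || (stats[path] = { read: 0, write: 0, lastUsed: null });
      entry[kind] += 1;
      entry.lastUsed = Date.now();
    },
  };
}
```

The remaining write functions (`update`, `push`, `remove`) could be wrapped the same way as `set`. To keep the statistics in Firebase rather than in memory, `record` would write to a dedicated statistics node.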

Handling rollbacked MySQL transactions in Node.js

I've been dealing with a problem for a couple of days, and I'm really hoping you can help me.
It's a node.js based API using sequelize for MySQL.
On certain API calls the code starts SQL transactions which lock certain tables, and if I send multiple requests to the API simultaneously, I get LOCK_WAIT_TIMEOUT errors.
var SQLProcess = function () {
    var self = this;
    var _arguments = arguments;
    return sequelize.transaction(function (transaction) {
        // Pass the managed transaction on so every query runs inside it.
        return doSomething({transaction: transaction});
    })
    .catch(function (error) {
        // On a lock wait timeout, retry after a random delay of up to 1 second.
        if (error && error.original && error.original.code === 'ER_LOCK_WAIT_TIMEOUT') {
            return Promise.delay(Math.random() * 1000) // Promise.delay comes from Bluebird
            .then(function () {
                return SQLProcess.apply(self, _arguments);
            });
        } else {
            throw error;
        }
    });
};
My problem is that the simultaneously running requests lock each other for a long time, and my request returns only after a very long time (~60 seconds).
I hope I explained this clearly and that you can offer me a solution.
This may not be a direct answer to your question, but looking at why you have this problem might also help.
1) What does doSomething() do? Is there any way to improve it?
First, a transaction that takes 60 seconds is suspicious. A typical db operation runs for 10 - 100 ms, so if you lock a table for that long, chances are the design should be revisited.
Ideally, all data preparation, including reads from the database, should be done outside the transaction, and the transaction should contain only the operations that really need to be transactional.
2) Is it possible to use a MySQL stored procedure?
True, a MySQL stored procedure is not compiled the way PL/SQL is for Oracle, but it still runs on the database server. If your application is really complicated and the transaction involves a lot of back-and-forth network traffic between the database and your node application, on top of many layers of JavaScript calls, that could really slow things down. If 1) doesn't save you a lot of time, consider using a MySQL stored procedure.
The drawback of this approach, obviously, is that it is harder to maintain code in both nodejs and MySQL.
If 1) and 2) are definitely not possible, you may consider some kind of flow control or queuing tool. Either your app makes sure the second request doesn't start until the first one finishes, or you use a third-party queuing tool to handle that. It seems you don't need any parallelism in running those requests anyway.
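The flow-control option can be sketched as a small promise chain that runs tasks strictly one after another. This is only an illustration: `createSerialQueue` is a hypothetical helper, and it serializes work within a single node.js process only, so several processes would still need an external queue.

```javascript
// Runs async tasks strictly one after another, in the order they were queued.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    // Start this task only after everything queued before it has settled.
    const result = tail.then(() => task());
    // Keep the chain alive even if this task rejects.
    tail = result.catch(() => {});
    return result;
  };
}
```

With the code from the question, each request handler would call something like `enqueue(function () { return SQLProcess(); })`, so that only one transaction is open at a time and lock wait timeouts between the app's own requests can no longer occur.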
The main reason for deadlocks is poor database design. Without further information about your database design and which exact queries might or might not lock each other it is impossible to give you a specific solution for your problem.
However, I can give you some general advice for approaching this issue:
I would make sure that your database is normalized at least to Third Normal Form or, if that still isn't enough, even further. There might be tools to automate this process for you.
Aside from reducing the likelihood of deadlocks, this also helps keep your data consistent, which is always a good thing.
Keep your transactions as slim as possible. If you insert new rows into your tables and update other tables accordingly, you might want to use a trigger rather than another SQL statement to do so. The same applies to reading rows and values: such things can be done before or after your transaction.
Choose the correct Isolation Level. Possible isolation levels are:
READ_UNCOMMITTED
READ_COMMITTED
REPEATABLE_READ
SERIALIZABLE
Sequelize's official documentation describes how you can set the isolation level and lock/unlock transactions by yourself.
As I said, without further insight into your database and query design, that's all I can do for you right now.
Hope this helps.

Save to 3 firebase locations with a slow internet connection

Sometimes I'm having issues with firebase when the user is on a slow mobile connection. When the user saves an entry to firebase I actually have to write to 3 different locations. Sometimes, the first one works, but if the connection is slow the 2nd and 3rd may fail.
This leaves me with entries in the first location that I constantly need to clean up.
Is there a way to help prevent this from happening?
var newTikiID = ref.child("tikis").push(tiki, function(error){
    if(!error){
        console.log("new tiki created")
        var tikiID = newTikiID.key()
        saveToUser(tikiID)
        saveToGeoFire(tikiID, tiki.tikiAddress)
    } else {
        console.log("an error occurred during tiki save")
    }
});
There is no Firebase method to write to multiple paths at once. Some future tools planned by the team (e.g. Triggers) may resolve this in the future.
This topic has been explored before and the firebase-multi-write README contains a lot of discussion on the topic. The repo also has a partial solution to client-only atomic writes. However, there is no perfect solution without a server process.
It's important to evaluate your use case and see if this really matters. If the second and third writes failed to write to a geo query, chances are, there's really no consequence. Most likely, it's essentially the same as if the first write had failed, or if all writes had failed; it won't appear in searches by geo location. Thus, the complexity of resolving this issue is probably a time sink.
Of course, it does cost a few bytes of storage. If we're working with millions of records, that may matter. A simple solution for this scenario would be to run an audit report that detects broken links between the data and geofire tables and cleans up old data.
If an atomic operation is really necessary, such as gaming mechanics where fairness or cheating could be an issue, or where integrity is lost by having partial results, there are a couple options:
1) Master Record approach
Pick a master path (the one that must exist) and use security rules to ensure other records cannot be written unless the master path exists.
".write": "root.child('master_path').child(newData.child('master_record_id').val()).exists()"
2) Server-side script approach
Instead of writing the paths separately, use a queue strategy.
Create a single event by writing it to a queue
Have a server-side process monitor the queue and process events
The server-side process does the multiple writes and ensures they all succeed
If any fail, the server-side process handles rollbacks or retries
By using the server-side queue, you remove the risk of a client going offline between writes. The server can safely survive restarts and retry events or failures when using the queue model.
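The "all succeed or roll back" part of such a server-side process might look like the sketch below. `applyAllOrRollback` and the `{ apply, undo }` step shape are hypothetical names chosen for illustration; how each step actually writes to Firebase is left to the caller.

```javascript
// Each step is a hypothetical { apply, undo } pair of async functions.
async function applyAllOrRollback(steps) {
  const done = [];
  try {
    for (const step of steps) {
      await step.apply();
      done.push(step);
    }
    return true;
  } catch (error) {
    // Undo the completed writes in reverse order.
    for (const step of done.reverse()) {
      await step.undo();
    }
    return false;
  }
}
```

A worker would build one step per location for each queued event, and re-queue or discard the event depending on the returned flag.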
I had the same problem and ended up using Conditional Requests with the Firebase REST API to write data transactionally. See my question and answer: Firebase: How to update multiple nodes transactionally? Swift 3.
If you need to write to several paths at once (atomically, though not as a read-modify-write transaction), you can do that now, as Firebase supports multi-path updates. https://firebase.google.com/docs/database/rest/save-data
https://firebase.googleblog.com/2015/09/introducing-multi-location-updates-and_86.html
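Applied to the tiki example from the question, a multi-path update might be prepared as in the sketch below. The paths and the `buildTikiFanOut` helper are hypothetical, since the question does not show the data layout; GeoFire stores locations in its own format, so the third path is only a placeholder for "some location index".

```javascript
// Builds one update object covering every location that must change together.
// The paths below are hypothetical; adjust them to the real data layout.
function buildTikiFanOut(tikiId, tiki, userId) {
  const updates = {};
  updates['tikis/' + tikiId] = tiki;
  updates['users/' + userId + '/tikis/' + tikiId] = true;
  updates['tikiLocations/' + tikiId] = tiki.tikiAddress;
  return updates;
}

// Usage with a Firebase reference (not executed here):
//   var tikiId = ref.child('tikis').push().key;   // .key() in the 2.x SDK
//   ref.update(buildTikiFanOut(tikiId, tiki, userId), function (error) { ... });
```

Calling `push()` without a value only generates the key, so nothing is written until the single `update` call, which either applies all paths or none.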

Node.js, MongoDB, and Concurrency

I'm working on a game prototype and worried about the following case: Browser does AJAX to Node.JS, which has to do several MongoDB operations using async.series.
What prevents multiple requests at the same time causing the database issues? New events (i.e. db operations) seem like they could be run out of order or in between the async.series steps.
In other words, what happens if a user does AJAX calls very quickly, before the prior ones have finished their async.series. Hopefully that makes sense.
If this is indeed an issue, what is the proper way to handle it?
First and foremost, @fmodos's comment should be completely disregarded. It is wrong on many levels, but most simply: you could have any number of nodes running (say on Heroku), and there is no guarantee that subsequent requests will hit the same node.
Now, I'm going to answer your question by asking more questions. (You really didn't give me a choice here)
What are these operations doing? Inserting documents? Updating existing documents? Removing documents? This is very important, because if all you're doing is inserting documents, then why does it matter if one finishes before the other? If you're updating documents, then you should NOT be issuing a find, grabbing a ref to the object, and then calling save. (I'm assuming you're using mongoose; if you're not, I would.) Instead, you should be using built-in mongo operators like $inc, which properly handle concurrent requests.
http://docs.mongodb.org/manual/reference/operator/update/inc/
Does that help at all? If not, please let me know and I will give it another shot.
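To illustrate why find-then-save is a problem, here is a small self-contained simulation. `createScores` is a hypothetical in-memory stand-in for a collection with asynchronous access; it is not mongoose or the MongoDB driver, and its `inc` method merely mimics what an atomic `$inc` update does on the server.

```javascript
// Hypothetical in-memory stand-in for a collection with async access.
function createScores() {
  const data = { alice: 0 };
  const tick = () => new Promise((resolve) => setImmediate(resolve));
  return {
    async get(id) { await tick(); return data[id]; },
    async set(id, value) { await tick(); data[id] = value; },
    // Atomic increment: read and write happen in one step, like $inc.
    async inc(id, amount) { await tick(); data[id] += amount; return data[id]; },
  };
}

// Find-then-save: two overlapping calls both read the old value.
async function addPointsUnsafe(scores, id, points) {
  const current = await scores.get(id);
  await scores.set(id, current + points);
}

// Single atomic update, comparable to
//   collection.updateOne({ _id: id }, { $inc: { score: points } })
async function addPointsAtomic(scores, id, points) {
  await scores.inc(id, points);
}
```

Running two `addPointsUnsafe` calls concurrently loses one of the updates, because both read the score before either has written it back. Two concurrent `addPointsAtomic` calls both take effect.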
Mongo has database-wide read/write locks (this describes older versions; the WiredTiger storage engine, the default since MongoDB 3.2, uses document-level concurrency control). It gives preference to writes, fulfilling them first and then the reads. So if, by chance, Bill is writing to the db while Joe is reading, Bill's write will execute first while Joe waits until the write is complete, and then he is given all the data (including Bill's).
