Synchronizing MongoDB server data to an IndexedDB local store - javascript

I'm trying to evaluate using IndexedDB to solve the offline issue. It would be populated with data currently stored in a MongoDB database (as is).
Once data is stored in IndexedDB, it may be changed on the MongoDB server and I need to propagate those changes. Is there any existing framework or library to do something like this for Mongo? I already know about CouchDB/PouchDB and am not exploring those two.

[Sync solution for 2021]
I know the question asked was for MongoDB specifically, but since this is an old thread I thought readers might be looking for other solutions for new apps or rebuilds. I can really recommend checking out AceBase, because it does exactly what you were looking for back then.
AceBase is a free and open-source realtime database that enables easy storage and synchronization between browser and server databases. It uses IndexedDB in the browser, and its own binary database, SQL Server, or SQLite storage on the server. Offline edits are synced upon reconnect, and clients are notified of remote database changes in realtime through a websocket (FAST!).
On top of this, AceBase has a unique feature called "live data proxies" that allows all changes to in-memory objects to be persisted and synced to the local and server databases, and remote changes to automatically update your in-memory objects. This means you can forget about database coding altogether and code as if you're only using local objects, no matter whether you're online or offline.
The following example shows how to create a local IndexedDB database in the browser, how to connect to a remote database server that syncs with the local database, and how to create a live data proxy that eliminates further database coding. AceBase supports authentication and authorization as well, but I left it out for simplicity.
const { AceBaseClient } = require('acebase-client');
const { AceBase } = require('acebase');

// Create local database with IndexedDB storage:
const cacheDb = AceBase.WithIndexedDB('mydb-local');

// Connect to server database, use local db for offline storage:
const db = new AceBaseClient({ dbname: 'mydb', host: 'db.myproject.com', port: 443, https: true, cache: { db: cacheDb } });

// Wait for remote database to be connected, or ready to use when offline:
db.ready(async () => {

    // Create live data proxy for a chat:
    const emptyChat = { title: 'New chat', messages: {} };
    const proxy = await db.ref('chats/chatid1').proxy(emptyChat); // Use emptyChat if chat node doesn't exist

    // Get object reference containing live data:
    const chat = proxy.value;

    // Update the chat's properties to save to the local database,
    // sync to the server AND all other clients monitoring this chat in realtime:
    chat.title = 'Changing the title';
    chat.messages.push({
        from: 'ewout',
        sent: new Date(),
        text: 'Sending a message that is stored in the database and synced automatically was never this easy! ' +
              'This message might have been sent while we were offline. Who knows!'
    });

    // To monitor and handle realtime changes to the chat:
    chat.onChanged((val, prev, isRemoteChange, context) => {
        if (val.title !== prev.title) {
            alert(`Chat title changed to "${val.title}" by ${isRemoteChange ? 'someone else' : 'us'}`);
        }
    });
});
For more examples and documentation, see the AceBase realtime database engine on npmjs.com.

Open up a change stream with the resumeToken. There's no guarantee of causal consistency, however, since we're talking about multiple disparate databases.
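For illustration, a change stream consumer with resume-token bookkeeping might look roughly like this, using the official MongoDB Node.js driver (change streams require a replica set; the collection name and the pushToClients/persistResumeToken helpers are hypothetical placeholders):

const { MongoClient } = require('mongodb');

async function watchTodos(savedResumeToken) {
  const client = await MongoClient.connect('mongodb://127.0.0.1:27017');
  const todos = client.db('mydb').collection('todos');

  // Resume from the last processed event if a token was persisted earlier
  const options = savedResumeToken ? { resumeAfter: savedResumeToken } : {};
  const changeStream = todos.watch([], options);

  changeStream.on('change', change => {
    // change.operationType is 'insert', 'update', 'delete', ...
    pushToClients(change);          // hypothetical: fan out to browsers (e.g. over a websocket)
    persistResumeToken(change._id); // hypothetical: change._id is the resume token
  });
}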

I haven't worked with IndexedDB, but the design problem isn't that uncommon. My understanding of your app is that when the client makes the connection to MongoDB, you pull a set of documents down for local storage and disconnect. The client can then do things locally (not connected to the data server) and later push up the changes.
The way I see it, you've got to handle two general cases:
- When the MongoDB server is updated and breaks continuity with the client, the client will have to either poll for the data (on a timer?) or keep a websocket open so notifications can flow freely over the pipe.
- When the user needs to push changed data back up the pipe, you can reconnect asynchronously and check for state changes (resolving conflicts according to your business rules), or have a (light) server-side interface for handling conflicts. Depending on the complexity of your app, comparing timestamps of state changes in MongoDB against IndexedDB updates should suffice; a simple timestamp comparison is sketched below.
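As a rough illustration of that timestamp comparison (not a full conflict-resolution strategy), a last-write-wins merge could look like this, assuming every record carries an updatedAt field on both sides:

// Last-write-wins: keep whichever side changed most recently; ties go to the server.
function resolveConflict(serverDoc, clientDoc) {
  return new Date(clientDoc.updatedAt) > new Date(serverDoc.updatedAt)
    ? clientDoc
    : serverDoc;
}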

Related

Confusion over session data being stored in a database being a violation of REST principles

I am writing a report about an application I have designed that includes what I believe to be a REST API on the backend.
The way the application authorises users to request resources from the database is by using session cookies. I understand there is a lot of debate about whether server-side sessions violate REST, but I have not found any specific clarification on whether the way I am using them breaks REST rules.
I am using the node Express framework with the express-session package. The cookies are created and stored through a middleware that saves the session data to my MongoDB instance with connect-mongodb-session, like so:
app.js
// app.js imports start
const mongoose = require("mongoose");
const session = require("express-session");
const config = require("config");
const uuid = require("uuid");
const MongoDBStore = require("connect-mongodb-session")(session);
// app.js imports end

const mdbStore = new MongoDBStore({
  uri: config.get("mongoURI"),
  mongooseConnection: mongoose.connection,
  collection: "sessions",
  ttl: config.get("sessionLife") / 1000,
});

// Session middleware
app.use(
  session({
    name: config.get("sessionName"),
    genid: function () {
      return uuid.v4();
    },
    secret: config.get("sessionKey"),
    resave: false,
    saveUninitialized: false,
    cookie: {
      sameSite: true,
      httpOnly: true,
      maxAge: config.get("sessionLife"),
    },
    store: mdbStore,
  })
);
This means that when a client request comes in, the client's authorisation data will be available via req.session, but that data is coming from my database, not being stored on the server anywhere.
So ultimately this means that my server doesn't store any user data directly, but has a dependency on the state of a session cookie stored in the database. Does this mean the API is not RESTful?
I have read through the answers at Do sessions really violate RESTfulness? and only found a small mention of cookies stored in a database, but I would still really appreciate any comments/clarifications/criticisms. Thanks.
It depends on the nature of the front end:
- If you use a mobile application deployed in a public store, where anyone downloads it and auto-registers using a social ID, then this approach is not a good fit. Usually, for an enterprise mobile application, the session data should be encrypted, sent back and forth in the request/response, and maintained in the mobile code.
- If this is simply a web page, and the REST API is available on the same server where the HTML is deployed, then the session can be stored in the DB.
- If the REST API is separated onto another computer, and you invoke it from the front-end server-side code via an internal IP/host address that is not exposed to the public, then this approach is not a good fit. "Front-end server-side code" here means a dedicated server responsible for executing the React JS, which contains no database access code; it only exposes AJAX services (which are, obviously, REST). There can then be another server that receives a further REST call and talks to yet another computer where MySQL or Oracle is installed. That means one web server, one app server, and one database server, like real-world enterprise applications.
- If your DB is not configured on the same computer, then storing sessions in the business DB is not a good idea. Instead, create a cache DB server like Redis or Couchbase on the first computer and store the sessions there; leave the business DB alone, separated from your UI logic and needs (a Redis-backed store is sketched below).
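If you go the Redis route, a minimal sketch of swapping the MongoDB store for a Redis-backed one might look like this (assuming connect-redis v6 with a node-redis v3 client; option names follow those packages, and the secret is a placeholder):

const session = require("express-session");
const redis = require("redis");
const RedisStore = require("connect-redis")(session);

// Cache DB on the web server itself, away from the business database
const redisClient = redis.createClient({ host: "127.0.0.1", port: 6379 });

app.use(
  session({
    store: new RedisStore({ client: redisClient }),
    secret: "replace-me", // placeholder secret
    resave: false,
    saveUninitialized: false,
    cookie: { sameSite: true, httpOnly: true, maxAge: 1000 * 60 * 60 },
  })
);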

Sync data from MongoDB to Firebase and vice-versa

My current situation:
I have created an application using React, NodeJS and Electron. Most of my users are offline users; they use the application without an internet connection.
Next plans:
Now, I am planning to create a mobile application for them. I plan to create that application using React-Native.
Since their database is offline, I plan to give them a "sync to Firebase" button in the desktop application. When the user clicks it, the data in their local MongoDB should synchronize with Firebase.
My thoughts:
when a new record is added to MongoDB, I will store a new key with that record which will look like: new: true.
when a record is updated, I will store a key named updated: true.
similarly for deletes...
And then when the user presses Sync to Firebase, I will search for those records and add/update/delete the respective records on Firebase, then remove those keys from the MongoDB database (a rough sketch of this flag-and-sweep approach follows below).
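For concreteness, the flag-and-sweep plan above might look roughly like the sketch below, using the MongoDB Node.js driver and the Firestore Admin SDK (the collection names and the syncState field are hypothetical, and this ignores the problems discussed next):

// Sweep locally flagged records up to Firestore, then clear the flags.
async function syncToFirebase(localDb, firestore) {
  const dirty = await localDb.collection('todos')
    .find({ syncState: { $in: ['new', 'updated', 'deleted'] } })
    .toArray();

  for (const doc of dirty) {
    const ref = firestore.collection('todos').doc(String(doc._id));
    if (doc.syncState === 'deleted') {
      await ref.delete();                                            // remove remotely...
      await localDb.collection('todos').deleteOne({ _id: doc._id }); // ...and locally
    } else {
      const { _id, syncState, ...data } = doc;
      await ref.set(data);                                           // add or update remotely
      await localDb.collection('todos')
        .updateOne({ _id }, { $unset: { syncState: '' } });          // clear the flag
    }
  }
}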
Problems in executing my thoughts:
At first glance it does not smell like a good approach to me, as I think it will be time consuming: I will have to perform operations on Firebase as well as MongoDB.
Another problem with this approach is that, thinking the other way round, when a user adds/updates/deletes a record from the React-Native app, Firebase will have those new/updated/deleted keys, and when the user presses the sync button in the desktop application, I will have to do the same thing in reverse.
Yet another problem: if the user accidentally uninstalls my application and then reinstalls it, what should I do?
And the biggest problem is managing all of this.
My Expectations:
So, I want a clean and maintainable approach. Does anyone have any ideas on how to sync data from MongoDB to Firebase and vice-versa?
Both database systems support some sort of operation log or trigger system. You can use these to capture changes as they happen and sync the two databases in near real time.
For MongoDB
You can use the oplog to see what changes were made to the database (insert/update/delete) and run a suitable function to sync Firebase.
oplog: A capped collection that stores an ordered history of logical writes to a MongoDB database. The oplog is the basic mechanism enabling replication in MongoDB.
There are small libraries that help you easily subscribe to these events.
Example (mongo-oplog)
import MongoOplog from 'mongo-oplog'
const oplog = MongoOplog('mongodb://127.0.0.1:27017/local', { ns: 'test.posts' })
oplog.tail();
oplog.on('op', data => {
console.log(data);
});
oplog.on('insert', doc => {
console.log(doc);
});
oplog.on('update', doc => {
console.log(doc);
});
oplog.on('delete', doc => {
console.log(doc.o._id);
});
For Firebase
You can use Cloud Functions. With Cloud Functions you can watch triggers such as Cloud Firestore Triggers or Realtime Database Triggers and run a function to sync the MongoDB database (a sketch of such a sync function follows the official example below).
With Cloud Functions, you can handle events in the Firebase Realtime Database with no need to update client code. Cloud Functions lets you run database operations with full administrative privileges, and ensures that each change to the database is processed individually.
// Listens for new messages added to /messages/:pushId/original and creates an
// uppercase version of the message at /messages/:pushId/uppercase
exports.makeUppercase = functions.database.ref('/messages/{pushId}/original').onWrite((event) => {
  // Grab the current value of what was written to the Realtime Database.
  const original = event.data.val();
  console.log('Uppercasing', event.params.pushId, original);
  const uppercase = original.toUpperCase();
  // You must return a Promise when performing asynchronous tasks inside a
  // Function, such as writing to the Firebase Realtime Database.
  // Setting an "uppercase" sibling in the Realtime Database returns a Promise.
  return event.data.ref.parent.child('uppercase').set(uppercase);
});
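Along the same lines, the sync function described above could be sketched like this: a Realtime Database trigger that mirrors each write into MongoDB (same pre-1.0 functions API as the example above; the connection string and collection names are placeholders):

const functions = require('firebase-functions');
const { MongoClient } = require('mongodb');

exports.syncToMongo = functions.database.ref('/todos/{todoId}')
  .onWrite(async (event) => {
    const client = await MongoClient.connect(process.env.MONGO_URI); // placeholder URI
    const todos = client.db('mydb').collection('todos');

    if (!event.data.exists()) {
      // Node was deleted in Firebase: remove the mirrored document
      await todos.deleteOne({ _id: event.params.todoId });
    } else {
      // Insert or update the mirrored document with the new value
      await todos.updateOne(
        { _id: event.params.todoId },
        { $set: event.data.val() },
        { upsert: true }
      );
    }
    return client.close();
  });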

How do I sync data with remote database in case of offline-first applications?

I am building a "TODO" application which uses Service Workers to cache the request's responses and in case a user is offline, the cached data is displayed to the user.
The Server exposes an REST-ful endpoint which has POST, PUT, DELETE and GET endpoints exposed for the resources.
Considering that when the user is offline and submitting a TODO item, I save that to local IndexedDB, but I can't send this POST request for the server since there is no network connection. The same is true for the PUT, DELETE requests where a user updates or deletes an existing TODO item
Questions
What patterns are in use to sync the pending requests with the REST-ful Server when the connection is back online?
The Background Sync API is suitable for this scenario. It enables web applications to synchronize data in the background, deferring actions until the user has a reliable connection and ensuring that whatever the user wants to send is actually sent. Even if the user navigates away or closes the browser, the action is still performed, and you can notify the user if desired.
Since you're saving to IndexedDB, you could register a sync event when the user adds, deletes, or updates a TODO item:
function addTodo(todo) {
  return addToIndexedDB(todo).then(() => {
    // Wait for the scoped service worker registration to get a
    // service worker with an active state
    return navigator.serviceWorker.ready;
  }).then(reg => {
    return reg.sync.register('add-todo');
  }).then(() => {
    console.log('Sync registered!');
  }).catch(() => {
    console.log('Sync registration failed :(');
  });
}
You've registered a sync event of type add-todo, which you'll listen for in the service worker. When you get this event, you retrieve the data from IndexedDB and POST it to your RESTful API:
self.addEventListener('sync', event => {
  if (event.tag == 'add-todo') {
    event.waitUntil(
      getTodo().then(todos => {
        // Post the todos to the server
        return fetch('/add', {
          method: 'POST',
          body: JSON.stringify(todos),
          headers: { 'Content-Type': 'application/json' }
        }).then(() => {
          // Success! Optionally clean up the synced todos in IndexedDB here.
        });
      })
    );
  }
});
This is just an example of how you could achieve it using Background Sync. Note that you'll have to handle conflict resolution on the server.
You could use PouchDB on the client and Couchbase or CouchDB on the server. With PouchDB on the client, you can save data locally and set it to automatically sync/replicate whenever the user is online. When the databases synchronize and there are conflicting changes, CouchDB will detect this and flag the affected document with the special attribute "_conflicts": true. It determines which revision to use as the latest and saves the others as previous revisions of that record. It does not attempt to merge the conflicting revisions; it is up to you to dictate how the merging should be done in your application. Couchbase is not so different. See the links below for more on conflict resolution:
Conflict Management with CouchDB
Understanding CouchDB Conflict
Resolving Couchbase Conflict
Demystifying Conflict Resolution in Couchbase Mobile
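For reference, the PouchDB side of this setup is small. A minimal live replication sketch (the remote URL is a placeholder) looks like:

const PouchDB = require('pouchdb');

const local = new PouchDB('todos');
const remote = new PouchDB('https://couch.example.com/todos'); // placeholder

// live: keep replicating as changes happen; retry: recover after going offline
local.sync(remote, { live: true, retry: true })
  .on('change', info => console.log('replicated change', info))
  .on('error', err => console.error('sync error', err));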
I've used PouchDB and Couchbase/CouchDB/IBM Cloudant, but I've done so through Hoodie. It has user authentication out of the box, handles conflict management, and more; think of it as your backend. For your TODO application, Hoodie would be a great fit. I've written about how to use it; see the links below:
How to build offline-smart application with Hoodie
Introduction to offline data storage and sync with PouchBD and Couchbase
At the moment I can think of two approaches, and which to choose depends on what storage option you are using on your backend.
If you are using an RDBMS to store all data:
The problem with offline-first systems in this approach is the possibility of conflicts when you post new data or update existing data.
As a first measure to avoid conflicts, you will have to generate unique IDs for all objects on your clients, in such a way that they remain unique when posted to the server and saved in the database. For this you can safely rely on UUIDs: a UUID guarantees uniqueness across systems in a distributed setting, and whatever your implementation language, you will have methods to generate UUIDs without any hassle.
Design your local database so you can use the UUID as the primary key locally. On the server end you can have both an auto-incremented, indexed integer primary key and a VARCHAR column to hold the UUID. The integer primary key uniquely identifies objects in that table, while the UUID uniquely identifies records across tables and databases.
So when posting your object to the server at sync time, you just check whether an object with that UUID is already present and take the appropriate action from there. When you fetch objects from the server, send both the primary key of the object from your table and the UUID. This way, when you serialise the response into model objects or save them to the local database, you can tell objects which have been synced from ones which haven't: objects that still need syncing will have no server primary key in your local database, just the UUID.
There may be a case where your server malfunctions and refuses to save data while you are syncing. In that case you can keep an integer field on your objects counting the number of sync attempts; if it exceeds a certain value, say 3, you move on to sync the next object. What you then do with the unsynced objects is up to the policy you have for such objects: as a solution, you could discard them or keep them locally only. A sketch of the UUID bookkeeping follows below.
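A minimal sketch of that UUID bookkeeping, using the uuid npm package on the client and node-postgres on the server (field, table, and column names are illustrative):

const { v4: uuidv4 } = require('uuid');

// Client side: every new object gets a UUID that stays stable across devices.
const todo = {
  uuid: uuidv4(),  // identity shared by client and server
  title: 'Buy milk',
  serverId: null,  // filled in once the server has synced it
  syncAttempts: 0, // the retry counter described above
};

// Server side (sketch, assuming node-postgres and a UNIQUE constraint on uuid):
// insert if the UUID is new, otherwise update the existing row, so that
// re-sent objects never create duplicates.
async function saveTodo(pg, todo) {
  await pg.query(
    `INSERT INTO todos (uuid, title) VALUES ($1, $2)
     ON CONFLICT (uuid) DO UPDATE SET title = EXCLUDED.title`,
    [todo.uuid, todo.title]
  );
}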
If you are not using an RDBMS:
As an alternate approach, instead of keeping all objects, you could keep the transactions that each client performs locally and sync just those to the server. While fetching, you derive the current state by replaying all the transactions from the bottom up. This is very similar to what Git does: it saves the changes in your repository as transactions recording what was added (or removed) and by whom, and the current state of the repository for each user is derived from those transactions. This approach will not result in conflicts, but as you can see it is a little tricky to develop; a minimal replay sketch follows below.
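A toy version of the transaction-log idea (the operation shape is made up for illustration):

// The client stores operations, not state; state is derived by replay.
const ops = [
  { type: 'add',    id: 'a1', data: { title: 'Buy milk' } },
  { type: 'update', id: 'a1', data: { title: 'Buy oat milk' } },
];

function replay(ops) {
  const state = new Map();
  for (const op of ops) {
    if (op.type === 'add' || op.type === 'update') {
      state.set(op.id, { ...(state.get(op.id) || {}), ...op.data });
    } else if (op.type === 'remove') {
      state.delete(op.id);
    }
  }
  return state;
}

console.log(replay(ops).get('a1')); // { title: 'Buy oat milk' }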

Google Cloud Storage change notifications with Node.js

I have a Firebase storage bucket and I would like to use the Node.js google-cloud notification API in order to listen for changes in the storage.
What I have so far:
const gcloud = require('google-cloud');
const storage = gcloud.storage({
  projectId: 'projectId',
  credentials: serviceAccount
});
const storageBucket = storage.bucket('bucketId');
Now from what I understand I have to create a channel in order to listen to storage changes.
So I have:
const storageBucketNotificationChannel = storage.channel('channelId', 'resourceId');
This is the point where the docs stop being clear, as I can't figure out what channelId and resourceId stand for.
Nor do I understand how to listen for changes on the channel itself. Are there any lifecycle-type methods to do so?
Can I do something like this?
storageBucketNotificationChannel.onMessage(message => { ... })
Based on the existing documentation of the Google Cloud Node.js client and the feedback from this GitHub issue, there is presently no way for the Node client to create a channel or subscribe to object change notifications.
One of the reasons is that the machine using the client may not necessarily be the machine on which the application runs, which would be a security risk. One can still, however, subscribe to object change notifications for a given bucket and have the notifications received by a Node.js GAE application.
Using Objects: watchAll JSON API
When using gsutil to subscribe, gsutil sends a POST request to https://www.googleapis.com/storage/v1/b/bucket/o/watch, where bucket is the name of the bucket to be watched. This is essentially a wrapper around the JSON API's Objects: watchAll. Once a desired application/endpoint has been authorized as described in Notification Authorization, one can send the appropriate POST request to said API and provide the desired endpoint URL in address. For instance, address could be https://my-node-app.example.com/change (a sketch of this request follows below).
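The watch request itself could be sketched as follows (assumes a valid OAuth2 access token, an already-authorized endpoint domain, and node-fetch; the bucket name and channel id are placeholders you choose):

const fetch = require('node-fetch');

async function watchBucket(accessToken) {
  await fetch('https://www.googleapis.com/storage/v1/b/my-bucket/o/watch', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${accessToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      id: 'my-channel-id',                              // unique channel id (your choice)
      type: 'web_hook',
      address: 'https://my-node-app.example.com/change' // the notification endpoint
    }),
  });
}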
The Node/Express application would then need to listen for POST requests on the path /change for notifications resembling the documented example, and act upon that data accordingly. Note that the application should respond to the request as described in Reliable Delivery for Cloud Storage, so Cloud Storage retries if delivery failed and stops retrying once it succeeds; such an endpoint is sketched below.
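A hedged sketch of the receiving side in Express (header names per the Object Change Notification docs; responding with a 2xx tells Cloud Storage to stop retrying):

const express = require('express');
const app = express();
app.use(express.json());

app.post('/change', (req, res) => {
  const state = req.get('X-Goog-Resource-State'); // 'sync', 'exists' or 'not_exists'
  if (state === 'sync') {
    // First message after the channel is created; nothing to process yet.
    return res.status(200).end();
  }
  // For 'exists' / 'not_exists' the body is the object resource JSON.
  console.log(`Object ${state}:`, req.body.name, req.body.bucket);
  res.status(200).end(); // acknowledge so the notification is not retried
});

app.listen(8080);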

Node JS live text update with CloudMQTT

I have a node server which connects to CloudMQTT and receives messages in app.js. I have my client web app running on the same node server, and I want to display the messages received in app.js elsewhere, in a .ejs file. I'm struggling with how best to do this.
app.js
// Create a MQTT client
var mqtt = require('mqtt');

// Create a client connection to CloudMQTT for live data
var client = mqtt.connect('xxxxxxxxxxx', {
  username: 'xxxxx',
  password: 'xxxxxxx'
});

client.on('connect', function() { // When connected
  console.log("Connected to CloudMQTT");
  // Subscribe to the Motion topic
  client.subscribe('Motion', function() {
    // When a message arrives, do something with it
    client.on('message', function(topic, message, packet) {
      // ** Need to pass message out **
    });
  });
});
Basically you need a way for the client (browser code with EJS: HTML, CSS and JS) to receive live updates. There are two general ways to do this from the client to the node service:
A websocket session instantiated by the client.
A polling approach.
What's the difference?
Under the hood, a websocket is a full-duplex communication mechanism. That means you can open a socket from the client (browser) to the node server and they can talk to each other both ways over a long-lived session. The pro is that updates are often instantaneous, without incurring the cost of making another HTTP request as in the polling case. The con is that it uses a socket connection that may be long-lived, and any server has a socket pool with a limited ability to handle many sockets. There are ways to scale around this issue, but if it's a big concern for you, you may want to go with polling.
Polling is where you set up an endpoint on your server that the client JS code hits every now and then; that endpoint returns the updated information. The con is that you are making a new request for every update check, which may not be desirable if many updates are expected and the app must be updated in the timeliest manner possible (most of the time polling is sufficient, though). The pro is that you do not have a live connection open on the server indefinitely.
Again, there are many more pros and cons; these are just the obvious ones. You decide how to implement it. When the client receives the data through either of these mechanisms, you may update the UI in any suitable manner. A websocket relay is sketched below.
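As a concrete example of the websocket route, a small socket.io relay could look like this (a sketch; assumes socket.io v4 and that `app` is the Express app serving your .ejs views, with `client` being the MQTT client from your app.js):

const http = require('http');
const { Server } = require('socket.io');

const server = http.createServer(app);
const io = new Server(server);

// Replace the placeholder comment in the MQTT handler: fan the payload
// out to every connected browser.
client.on('message', function (topic, message, packet) {
  io.emit('mqtt-message', { topic: topic, text: message.toString() });
});

server.listen(3000);

// In the .ejs page (client side):
//   <script src="/socket.io/socket.io.js"></script>
//   <script>
//     const socket = io();
//     socket.on('mqtt-message', msg => {
//       document.getElementById('motion').textContent = msg.text;
//     });
//   </script>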
From the server end, you will need a way to persist the information coming from CloudMQTT. There are multiple ways to do this. If you do not care about memory consumption and are OK with potentially throwing away old data when a client does not ask for it for a while, it may be fine to store it in memory in a regular JavaScript object {}. If you do care about persisting the data between server restarts/crashes (probably best), you can persist it to something like Redis, Mongo, any of the SQL stores if your data is relational in nature, or even a regular JSON file on disk (see fs.writeFile).
Hope this helped give you a step in the right direction!
