Suppose I want to build an e-commerce system. I have two aggregates here: ProductAggregate and UserAggregate. The product aggregate contains productId and price; the user aggregate contains userId and balance. Here's the problem: in event sourcing we should not rely on the read model, since there might be eventual-consistency problems. So we should rely on the command model, right? But these are two different command models. I've also read elsewhere that an aggregate should only rely on its own state. Say a user wants to buy a product: I have to check whether he has enough balance, and to do that I need to know the price of the product. So the read model is not allowed, and querying another aggregate is not allowed. What options do I have here?
const ProductAggregate = {
  state: {
    productId: "product-1",
    price: 100
  }
}

const UserAggregate = {
  state: {
    userId: "userId-1",
    balance: 50
  },
  handlePurchase({ userId, productId }) {
    // TODO: I got productId from the client, but how can I retrieve its price?
    if (this.state.balance < price) { // `price` is exactly what I don't have here
      throw new Error("Insufficient balance bro.")
    }
  }
}
So I thought it must be my bad aggregate design that makes UserAggregate require state from outside its own context. In this situation, how do I properly design the aggregates for User and Product?
Edited:
I have been thinking about a solution all day and came up with this approach. Instead of putting the purchase command in the UserAggregate, I put it in the ProductAggregate and call it OrderProductCommand, which is a bit weird for me since the product itself can't create an order, but the user can (it seems to work anyway, I don't even know?). With this approach I can now retrieve the price and send another command, DeductBalanceCommand, which deducts that amount of money from the user.
const ProductAggregate = {
  state: {
    productId: "product-1",
    price: 100
  },
  async handleOrder({ productId, userId }) {
    try {
      await commandBus.send({
        command: "handleDeduct",
        params: {
          userId: userId,
          amount: this.state.price
        }
      })
      eventBus.publish({
        event: "OrderCreated",
        params: {
          productId: productId,
          userId: userId
        }
      })
    } catch (e) {
      throw new Error("Unable to create order due to " + e.message)
    }
  }
}
const UserAggregate = {
  state: {
    userId: "userId-1",
    balance: 50
  },
  handleDeduct({ userId, amount }) {
    if (this.state.balance < amount) {
      throw new Error("Insufficient balance bro.")
    }
    eventBus.publish({
      event: "BalanceDeducted",
      params: {
        userId: userId,
        amount: amount
      }
    })
  }
}
Is it fine and correct to use this approach? It feels a bit weird to me, or maybe it's just the way of thinking in the DDD world?
P.S. I added the javascript tag so my code gets syntax highlighting and is easier to read.
First of all, regarding your handle, you're not stupid :)
A few points:
In many situations you can query the read model even though there's eventual consistency. If you reject a command that would have been accepted had a pending update become visible in the read model, that can typically be retried. If you accept a command that would have been rejected, there's often a compensating action that can be applied after the fact (e.g. a delay between ordering a physical product and that product being delivered).
There are a couple of patterns that can be useful. One is the saga pattern where you would model the process of a purchase. Rather than "user A buys product X", you might have an aggregate corresponding to "user A's attempt to purchase product X", which validates and reserves that user A is able to buy X and that X is able to be purchased.
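As a very rough sketch (in the same ad-hoc style as the question's code, so PurchaseSaga, the command and event names, and the commandBus/eventBus helpers are all hypothetical, not a specific framework's API), such a process might look like:

// Hypothetical sketch of a "user A attempts to purchase product X" process.
// Each handler reacts to one event and issues at most one command, so every
// step touches a single aggregate at a time.
const PurchaseSaga = {
  state: {
    purchaseId: "purchase-1",
    userId: "userId-1",
    productId: "product-1",
    status: "started" // started -> fundsReserved -> completed | failed
  },
  async start() {
    // ask the Product side for a price / availability reservation
    await commandBus.send({
      command: "QuoteProduct",
      params: { productId: this.state.productId, purchaseId: this.state.purchaseId }
    })
  },
  async onProductQuoted({ price }) {
    // the saga now knows the price, so it can ask the User side to reserve funds
    await commandBus.send({
      command: "ReserveFunds",
      params: { userId: this.state.userId, amount: price, purchaseId: this.state.purchaseId }
    })
  },
  onFundsReserved() {
    this.state.status = "completed"
    eventBus.publish({ event: "OrderCreated", params: { purchaseId: this.state.purchaseId } })
  },
  onFundsRejected({ reason }) {
    this.state.status = "failed"
    eventBus.publish({ event: "PurchaseFailed", params: { purchaseId: this.state.purchaseId, reason } })
  }
}

The cross-aggregate data (the quoted price, the outcome) lives in the saga itself, so neither UserAggregate nor ProductAggregate ever needs to reach outside its own state.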
Every write model with an aggregate implies the existence of one sufficiently consistent read model for that aggregate. One can thus define queries or "read-only" commands against the write model. CQRS (IMO) shouldn't be interpreted as "don't query the write model" but "before trying to optimize the write model for reads (whether ease, performance, etc.), give strong consideration to handling that query with a read model": i.e. if you're querying the write model, you give up some of the right to complain about the queries being slow or difficult. Depending on how you're implementing aggregates this option may or may not be easy to do.
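For example (purely a sketch reusing the hypothetical aggregates from the question; loadAggregate stands in for however you rehydrate an aggregate from its event stream):

// A "read-only command" / query against the write model: rehydrate the
// Product aggregate and return its current price without changing any state.
const getProductPrice = async (productId) => {
  const product = await loadAggregate(ProductAggregate, productId)
  return product.state.price
}

// The purchase handler can then validate against a sufficiently consistent price:
const handlePurchase = async ({ userId, productId }) => {
  const price = await getProductPrice(productId)
  const user = await loadAggregate(UserAggregate, userId)
  if (user.state.balance < price) {
    throw new Error("Insufficient balance.")
  }
  // ...emit BalanceDeducted / OrderCreated here
}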
I'm new to Firestore and need a little help.
I have a function that updates the amount of "stock shares". It's a "sell" function, so the amount can only go down.
The problem is, I don't want it to go below 0.
So I want to get the PREVIOUS amount of shares before I update to the new amount, so I can make sure I don't go below 0.
There are two ways to do this: 1) using Firestore rules, 2) getting the previous amount as described.
Can you guys help me get the previous amount before the UPDATE stage?
code:
function sellStock() {
  db.collection("Stocks")
    .where("ticker", "==", props.name)
    .get()
    .then((querySnapshot) => {
      if (!querySnapshot.empty) {
        querySnapshot.forEach(function (doc) {
          db.collection("myStocks")
            .doc(doc.id)
            .update({
              shares: doc.data().shares - amount
            });
        });
      }
    });
}
"shares" is the previous amount.
"amount" is the number of shares we want to sell.
Try this way:
.update({
shares: (doc.data().shares - amount) >= 0 ? (doc.data().shares - amount) : 0
})
Updated after discussion in the comments.
There is an important aspect to consider with your business logic: do you need to execute an atomic operation on several documents? Example: you are subtracting the value of amount from the value of shares, since it is a selling operation, but I guess that somewhere else (in another document) you are also adding some value, for example to the bank account of the seller.
In such a case you should use a Transaction: "if a transaction reads documents and another client modifies any of those documents, Cloud Firestore retries the transaction". You need to include in the Transaction all the documents that need to be locked while the operation is ongoing (i.e. all the docs that are involved in the operation, the ones on which you subtract and the ones on which you add).
However, since you want to update several documents, returned by a query, you cannot use a Transaction of one of the mobile/web SDKs (e.g. iOS, Android, Web/JavaScript), because the mobile/web SDKs use optimistic concurrency controls to resolve data contention.
What you can do is to use one of the Admin SDKs, like the Node.js one, since it uses pessimistic concurrency controls and therefore offers the possibility to run a Transaction on a query (see that you can pass a query to the get() method). So you could do that in a Callable Cloud Function.
Here is an example of a Transaction that will atomically update all the docs on which you subtract. Since you didn't share the entire business logic (we don't know which docs you need to update by adding a value), it's a bit difficult to go deeper in the example.
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.updateTickers = functions.https.onCall((data, context) => {
  // get the value of the filter (i.e. props.name) and the amount via the data object
  const filter = data.filter;
  const amount = data.amount;
  const db = admin.firestore();

  return db.runTransaction(transaction => {
    const queryRef = db.collection("Stocks").where("ticker", "==", filter);
    return transaction.get(queryRef)
      .then((querySnapshot) => {
        querySnapshot.forEach((doc) => {
          const currentValue = doc.get('shares');
          if (currentValue - amount >= 0) {  // never let the count go below 0
            transaction.update(doc.ref, { shares: currentValue - amount });
          }
        });
      });
  })
  .then(() => {
    return { result: "Amounts update successful" };
  });
});
The only sure-fire way to prevent the shares count from going below 0 is to enforce that in security rules:
allow write: if request.resource.data.shares >= 0;
Any other method can be bypassed by a malicious user who uses your project configuration data with their code.
With this in place, the simplest (and fastest) way to subtract the amount from the shares count is with an atomic increment operation:
db.collection("myStocks")
.doc(doc.id)
.update({
shares: firebase.firestore.FieldValue.increment(-1 * amount)
})
It seems I have misunderstood Sequelize's .hasMany() and .belongsTo() associations and how to use them in a service. I have two models:
const User = db.sequelize.define("user", {
uid: { /*...*/ },
createdQuestions: {
type: db.DataTypes.ARRAY(db.DataTypes.UUID),
unique: true,
allowNull: true,
},
});
const Question = db.sequelize.define("question", {
qid: { /*...*/ },
uid: {
type: db.DataTypes.TEXT,
},
});
Given that one user can have many questions and each question belongs to only one user, I have the following associations:
User.hasMany(Question, {
sourceKey: "createdQuestions",
foreignKey: "uid",
constraints: false,
});
Question.belongsTo(User, {
foreignKey: "uid",
targetKey: "createdQuestions",
constraints: false,
});
What I want to achieve is this: after creating a question object, the qid should reside in the user object under createdQuestions, just as the uid resides in the question object under uid. What I thought Sequelize associations would do for me is save me from individually calling and updating the user object. Is there a corresponding method? What I have so far is:
const create_question = async (question_data) => {
const question = { /*... question body containing uid and so forth*/ };
return new Promise((resolve, rejected) => {
Question.sync({ alter: true }).then(
async () =>
await db.sequelize
.transaction(async (t) => {
const created_question = await Question.create(question, {
transaction: t,
});
})
.then(() => resolve())
.catch((e) => rejected(e))
);
});
};
This however only creates a question object but does not update the user. What am I missing here?
Modelling a One-to-many relationship in SQL
SQL vs NoSQL
In SQL, contrary to how it is in NoSQL, every attribute has a fixed data type with a fixed limit of bits. That's manifested by the SQL command when creating a new table:
CREATE TABLE teachers (
name VARCHAR(32),
department VARCHAR(64),
age INTEGER
);
The reason behind this is to allow us to easily access any attribute in the database by knowing the length of each row. In our case, each row will need the space required to store:
32 bytes (name) + 64 bytes (department) + 4 bytes (age) = 100 bytes
This is a very powerful feature of relational databases, as it keeps data retrieval at constant time since we know where each piece of data is located in memory.
One-to-Many Relationship: Case Study
Now, let's consider two tables: teachers and classes.
Let's say we want to create a one-to-many relation between classes and teachers, where a teacher can give many classes.
One way we might think of it is to store a list of class IDs directly on each teacher row. But this model is not possible for 2 main reasons:
It will make us lose our constant-time retrieval since we don't know the size of the list anymore
We fear that the amount of space given to the list attribute won't be enough for future data. Let's say we allocate space needed for 10 classes and we end up with a teacher giving 11 classes. This will push us to recreate our database to increase the column size.
Another way would be to duplicate the teacher's information in every class row.
While this approach fixes the limited column size problem, we no longer have a single source of truth: the same data is duplicated and stored multiple times.
That's why, for this one-to-many relationship, we need to store the id of the teacher inside the classes table.
This way, we still can find all the classes a teacher can teach by running
SELECT *
FROM classes
WHERE teacherID = teacher_id
And we'll avoid all the problems discussed earlier.
Your relation is a one-to-many relation: one User can have multiple Questions. In SQL, this kind of relation is modelled by adding an attribute to Question called userId (or uid, as you did). In Sequelize, this is achieved through hasMany and belongsTo, like this:
User.hasMany(Question)
Question.belongsTo(User, {
foreignKey: 'userId',
constraints: false
})
In other words, I don't think you need the CreatedQuestions attribute under User. Only one foreign key is needed to model the oneToMany relation.
Now, when creating a new question, you just need to add the userId this way
createNewQuestion = async (userId, title, body) => {
const question = await Question.create({
userId: userId, // or just userId
title: title, // or just title
body: body // or just body
})
return question
}
Remember, we do not store arrays in SQL. Even if we can find a way to do it, it is not what we need; there is almost always a better way.
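For completeness, here is a minimal sketch (assuming the association above, that uid is the user's key, and Sequelize's default accessor naming) of how you would read the questions back without any createdQuestions array:

// Load a user together with all of their questions via the association.
const getUserWithQuestions = async (userId) => {
  return User.findOne({
    where: { uid: userId },
    include: [Question] // joins on the question's foreign key
  })
}

// Or use the accessor generated by User.hasMany(Question):
const listQuestions = async (user) => {
  return user.getQuestions()
}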
I'm currently seeding data with Sequelize.js and using hard coded values for association IDs. This is not ideal because I really should be able to do this dynamically right? For example, associating users and profiles with a "has one" and "belongs to" association. I don't necessarily want to seed users with a hard coded profileId. I'd rather do that in the profiles seeds after I create profiles. Adding the profileId to a user dynamically once profiles have been created. Is this possible and the normal convention when working with Sequelize.js? Or is it more common to just hard code association IDs when seeding with Sequelize?
Perhaps I'm going about seeding wrong? Should I have a one-to-one number of seeds files with migrations files using Sequelize? In Rails, there is usually only 1 seeds file you have the option of breaking out into multiple files if you want.
In general, just looking for guidance and advice here. These are my files:
users.js
// User seeds
'use strict';
module.exports = {
up: function (queryInterface, Sequelize) {
/*
Add altering commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkInsert('Person', [{
name: 'John Doe',
isBetaMember: false
}], {});
*/
var users = [];
for (let i = 0; i < 10; i++) {
users.push({
fname: "Foo",
lname: "Bar",
username: `foobar${i}`,
email: `foobar${i}@gmail.com`,
profileId: i + 1
});
}
return queryInterface.bulkInsert('Users', users);
},
down: function (queryInterface, Sequelize) {
/*
Add reverting commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkDelete('Person', null, {});
*/
return queryInterface.bulkDelete('Users', null, {});
}
};
profiles.js
// Profile seeds
'use strict';
var models = require('./../models');
var User = models.User;
var Profile = models.Profile;
module.exports = {
up: function (queryInterface, Sequelize) {
/*
Add altering commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkInsert('Person', [{
name: 'John Doe',
isBetaMember: false
}], {});
*/
var profiles = [];
var genders = ['m', 'f'];
for (let i = 0; i < 10; i++) {
profiles.push({
birthday: new Date(),
gender: genders[Math.round(Math.random())],
occupation: 'Dev',
description: 'Cool yo',
userId: i + 1
});
}
return queryInterface.bulkInsert('Profiles', profiles);
},
down: function (queryInterface, Sequelize) {
/*
Add reverting commands here.
Return a promise to correctly handle asynchronicity.
Example:
return queryInterface.bulkDelete('Person', null, {});
*/
return queryInterface.bulkDelete('Profiles', null, {});
}
};
As you can see, I'm just using a hard-coded for loop for both (not ideal).
WARNING: after working with sequelize for over a year, I've come to realize that my suggestion is a very bad practice. I'll explain at the bottom.
tl;dr:
never use seeders, only use migrations
never use your sequelize models in migrations, only write explicit SQL
My other suggestion still holds up that you use some "configuration" to drive the generation of seed data. (But that seed data should be inserted via migration.)
vv DO NOT DO THIS vv
Here's another pattern, which I prefer, because I believe it is more flexible and more readily understood. I offer it here as an alternative to the accepted answer (which seems fine to me, btw), in case others find it a better fit for their circumstances.
The strategy is to leverage the sqlz models you've already defined to fetch data that was created by other seeders, use that data to generate whatever new associations you want, and then use bulkInsert to insert the new rows.
In this example, I'm tracking a set of people and the cars they own. My models/tables:
Driver: a real person, who may own one or more real cars
Car: not a specific car, but a type of car that could be owned by someone (i.e. make + model)
DriverCar: a real car owned by a real person, with a color and a year they bought it
We will assume a previous seeder has stocked the database with all known Car types: that information is already available and we don't want to burden users with unnecessary data entry when we can bundle that data in the system. We will also assume there are already Driver rows in there, either through seeding or because the system is in-use.
The goal is to generate a whole bunch of fake-but-plausible DriverCar relationships from those two data sources, in an automated way.
const {
Driver,
Car
} = require('models')
module.exports = {
up: async (queryInterface, Sequelize) => {
// fetch base entities that were created by previous seeders
// these will be used to create seed relationships
const [ drivers , cars ] = await Promise.all([
Driver.findAll({ /* limit ? */ order: Sequelize.fn( 'RANDOM' ) }),
Car.findAll({ /* limit ? */ order: Sequelize.fn( 'RANDOM' ) })
])
const fakeDriverCars = Array(30).fill().map((_, i) => {
// create new tuples that reference drivers & cars,
// and which reflect the schema of the DriverCar table
})
return queryInterface.bulkInsert( 'DriverCar', fakeDriverCars );
},
down: (queryInterface, Sequelize) => {
return queryInterface.bulkDelete('DriverCar', null, {});
}
}
That's a partial implementation. However, it omits some key details, because there are a million ways to skin that cat. Those pieces can all be gathered under the heading "configuration," and we should talk about it now.
When you generate seed data, you usually have requirements like:
I want to create at least a hundred of them, or
I want their properties determined randomly from an acceptable set, or
I want to create a web of relationships shaped exactly like this
You could try to hard-code that stuff into your algorithm, but that's the hard way. What I like to do is declare "configuration" at the top of the seeder, to capture the skeleton of the desired seed data. Then, within the tuple-generation function, I use that config to procedurally generate real rows. That configuration can obviously be expressed however you like. I try to put it all into a single CONFIG object so it all stays together and so I can easily locate all the references within the seeder implementation.
Your configuration will probably imply reasonable limit values for your findAll calls. It will also probably specify all the factors that should be used to calculate the number of seed rows to generate (either by explicitly stating quantity: 30, or through a combinatoric algorithm).
As food for thought, here is an example of a very simple config that I used with this DriverCar system to ensure that I had 2 drivers who each owned one overlapping car (with the specific cars to be chosen randomly at runtime):
const CONFIG = {
ownership: [
[ 'a', 'b', 'c', 'd' ], // driver 1 linked to cars a, b, c, and d
[ 'b' ], // driver 2 linked to car b
[ 'b', 'b' ] // driver 3 has two of the same kind of car
]
};
I actually used those letters, too. At runtime, the seeder implementation would determine that only 3 unique Driver rows and 4 unique Car rows were needed, and apply limit: 3 to Driver.findAll, and limit: 4 to Car.findAll. Then it would assign a real, randomly-chosen Car instance to each unique string. Finally, when generating association tuples, it uses the string to look up the chosen Car from which to pull foreign keys and other values.
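As a sketch of what that might look like in code (the DriverCar column names and the random car assignment here are assumptions, not the exact implementation):

// Sketch: turn CONFIG.ownership into DriverCar rows.
// Assumes drivers/cars were fetched as in the seeder above and that
// DriverCar has driverId / carId / color / purchasedAt columns.
function buildDriverCars(CONFIG, drivers, cars) {
  // map each unique letter ('a', 'b', ...) to a randomly chosen Car
  const letters = [...new Set(CONFIG.ownership.flat())];
  const carByLetter = Object.fromEntries(
    letters.map((letter, i) => [letter, cars[i % cars.length]])
  );

  return CONFIG.ownership.flatMap((ownedLetters, driverIndex) =>
    ownedLetters.map(letter => ({
      driverId: drivers[driverIndex].id,
      carId: carByLetter[letter].id,
      color: 'gray',           // placeholder values
      purchasedAt: new Date()
    }))
  );
}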
There are undoubtedly fancier ways of specifying a template for seed data. Skin that cat however you like. Hopefully this makes it clear how you'd marry your chosen algorithm to your actual sqlz implementation to generate coherent seed data.
Why the above is bad
If you use your sequelize models in migration or seeder files, you will inevitably create a situation in which the application will not build successfully from a clean slate.
How to avoid madness:
Never use seeders, only use migrations
(Anything you can do in a seeder, you can do in a migration. Bear that in mind as I enumerate the problems with seeders, because that means none of these problems gain you anything.)
By default, sequelize does not keep records of which seeders have been run. Yes, you can configure it to keep records, but if the app has already been deployed without that setting, then when you deploy your app with the new setting, it'll still re-run all your seeders one last time. If that's not safe, your app will blow up. My experience is that seed data can't and shouldn't be duplicated: if it doesn't immediately violate uniqueness constraints, it'll create duplicate rows.
Running seeders is a separate command, which you then need to integrate into your startup scripts. It's easy for that to lead to a proliferation of npm scripts that make app startup harder to follow. In one project, I converted the only 2 seeders into migrations, and reduced the number of startup-related npm scripts from 13 to 5.
It can also be hard to make sense of the order in which seeders are run. Remember that migrations and seeders are run by separate commands, which means you can't interleave them efficiently: you have to run all migrations first, then all seeders. As the database changes over time, this leads to the problem I describe next:
Never use your sequelize models in your migrations
When you use a sequelize model to fetch records, it explicitly fetches every column it knows about. So, imagine a migration sequence like this:
M1: create tables Car & Driver
M2: use Car & Driver models to generate seed data
That will work. Fast-forward to a date when you add a new column to Car (say, isElectric). That involves: (1) creating a migration to add the column, and (2) declaring the new column on the sequelize model. Now your migration process looks like this:
M1: create tables Car & Driver
M2: use Car & Driver models to generate seed data
M3: add isElectric to Car
The problem is that your sequelize models always reflect the final schema, without acknowledging the fact that the actual database is built by ordered accretion of mutations. So, in our example, M2 will fail because any built-in selection method (e.g. Car.findOne) will execute a SQL query like:
SELECT
"Car"."make" AS "Car.make",
"Car"."isElectric" AS "Car.isElectric"
FROM
"Car"
Your DB will throw because Car doesn't have an isElectric column when M2 executes.
The problem won't occur in environments that are only one migration behind, but you're boned if you hire a new developer or nuke the database on your local workstation and build the app from scratch.
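To make that concrete, a seed-data migration that never touches the models might look roughly like this (the table and column names are illustrative, not taken from a real schema):

// Sketch of a migration that seeds data without using the sequelize models.
// Only reference columns that exist at this point in the migration history.
module.exports = {
  up: (queryInterface, Sequelize) => {
    return queryInterface.bulkInsert('Car', [
      { make: 'Toyota', model: 'Corolla', createdAt: new Date(), updatedAt: new Date() },
      { make: 'Honda',  model: 'Civic',   createdAt: new Date(), updatedAt: new Date() }
    ]);
  },
  down: (queryInterface, Sequelize) => {
    return queryInterface.bulkDelete('Car', null, {});
  }
};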
Instead of using different seeds for Users and Profiles, you could seed them together in one file using Sequelize's create-with-association feature.
Additionally, when using a series of create() calls you must wrap them in a Promise.all(), because the seeding interface expects a Promise as the return value.
up: function (queryInterface, Sequelize) {
  return Promise.all([
    models.Profile.create({
      data: 'profile stuff',
      users: [{
        name: "name",
        ...
      }, {
        name: 'another user',
        ...
      }]
    }, {
      include: [ models.User ]
    }),
    models.Profile.create({
      data: 'another profile',
      users: [{
        name: "more users",
        ...
      }, {
        name: 'another user',
        ...
      }]
    }, {
      include: [ models.User ]
    })
  ])
}
Not sure if this is really the best solution, but that's how I got around maintaining the foreign keys myself in seeding files.
I"m loading products via an infinite scroll in chunks of 12 at a time.
At times, I may want to sort these by how many followers they have.
Below is how i'm tracking how many followers each product has.
Follows are in a separate collection, because of the 16mb data cap, and the amount of follows should be unlimited.
follow schema:
var FollowSchema = new mongoose.Schema({
user: {
type: mongoose.Schema.ObjectId,
ref: 'User'
},
product: {
type: mongoose.Schema.ObjectId,
ref: 'Product'
},
timestamp: {
type: Date,
default: Date.now
}
});
Product that is followed schema:
var ProductSchema = new mongoose.Schema({
name: {
type: String,
unique: true,
required: true
},
followers: {
type: Number,
default: 0
}
});
Whenever a user follows / unfollows a product, I run this function:
ProductSchema.statics.updateFollowers = function (productId, val) {
return Product
.findOneAndUpdateAsync({
_id: productId
}, {
$inc: {
'followers': val
}
}, {
upsert: true,
'new': true
})
.then(function (updatedProduct) {
return updatedProduct;
})
.catch(function (err) {
console.log('Product follower update err : ', err);
})
};
My questions about this:
1: Is there a chance that the incremented followers value within Product could hit some sort of error, resulting in mismatched / inconsistent data?
2: Would it be better to write an aggregation to count followers for each Product, or would that be too expensive / slow?
Eventually, I'll probably rewrite this in a graphDB, as it seems better suited, but for now -- this is an exercise in mastering MongoDB.
1. If you increment after inserting or decrement after removing, there is a chance of ending up with inconsistent data; for example, the insertion succeeds but the increment fails.
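One way to narrow (not eliminate) that window, assuming Follow and Product are the mongoose models behind the schemas in the question, is to compensate when the counter update fails:

// Sketch: create the follow, then bump the counter; if the counter update
// fails, remove the follow again so the two stay roughly in sync.
function followProduct(userId, productId) {
  return Follow.create({ user: userId, product: productId })
    .then(function (follow) {
      return Product.updateOne({ _id: productId }, { $inc: { followers: 1 } })
        .catch(function (err) {
          // compensate: undo the follow so the count doesn't drift
          return Follow.deleteOne({ _id: follow._id }).then(function () { throw err })
        })
    })
}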
2. Intuitively, aggregation is much more expensive than a find in this case. I did a benchmark to prove it.
First, generate 1,000 users, 1,000 products and 10,000 follows randomly. Then use this code to benchmark:
import timeit
from pymongo import MongoClient

db = MongoClient('mongodb://127.0.0.1/test', tz_aware=True).get_default_database()

def foo():
    result = list(db.products.find().sort('followers', -1).limit(12).skip(12))

def bar():
    result = list(db.follows.aggregate([
        {'$group': {'_id': '$product', 'followers': {'$sum': 1}}},
        {'$sort': {'followers': -1}},
        {'$skip': 12},
        {'$limit': 12}
    ]))

if __name__ == '__main__':
    t = timeit.timeit('foo()', 'from __main__ import foo', number=100)
    print('time: %f' % t)
    t = timeit.timeit('bar()', 'from __main__ import bar', number=100)
    print('time: %f' % t)
Output (find first, then aggregate):
time: 1.230138
time: 3.620147
Creating an index can speed up the find query:
db.products.createIndex({followers: 1})
time: 0.174761
time: 3.604628
And if you need attributes from the product, such as name, you need another O(n) query.
I guess that as the data scales up, aggregation will become much slower. If needed, I can benchmark on larger data.
For number 1, if the only operations on that field are incrementing and decrementing, I think you'd be okay. If you start replicating that data or using it in joins for some reason, you'd run the risk of inconsistent data.
For number 2, I'd recommend you run both scenarios in the mongo shell to test them out. You can also review the individual explain plans for both queries to get an idea of which one would perform better. I'm just guessing, but it seems like the update route would perform well.
Also, the amount of expected data makes a difference. It might initially perform well one way, but after a million records the other route might be the way to go. If you have a test environment, that would be a good thing to check.
1) This relies on the application layer to enforce consistency, and as such there is going to be a chance that you end up with inconsistencies. The questions I would ask are: how important is consistency in this case, and how likely is it that there will be a large inconsistency? My thought is that being off by one follower isn't as important as making your infinite scroll load as fast as possible to improve the user's experience.
2) Probably worth looking at the performance, but if I had to guess I would say this approach is going to be too slow.
Consider we have a salary upgrade for all employees, where the salary increase is not fixed for everyone but depends on some fields in the same employee document. How can I update all the salaries in the MongoDB documents (one per employee) with one command?
Update
Consider that I have the employee id or name and the salary upgrade, and I want to update all the documents with one command.
Sample documents
{
_id : "11111",
salary : (metric_1/metric_2),
name : "Androw",
metric_1 : 12345,
metric_2 : 222,
...
}
{
_id : "22222",
salary : (metric_1/metric_2),
name : "John",
metric_1 : 999,
metric_2 : 223,
...
}
where metric_1 and metric_2 are random factors related to user interactions, and salary is a function of them.
The command below works great and can do the required operation, assuming you have a list of the users' ids or names you want to update (possibly all of them):
db.salaries.aggregate([ { $match: { _id: { $in: [ObjectId("563e1d9d04aa90562201fd5f"), ObjectId("564657f88f71450300e1fe0b")] } } }, { $project: { salary: { $divide: ["$metric_1", "$metric_2"] } } }, { $out: "new_salaries" } ])
The downside of the above command is that you have to write the result into a new collection. If you name the existing collection (salaries in this case) in $out, it will replace all the existing documents with just the newly computed ones, which is unsafe, since other operations might have occurred while the new salaries were being computed.
Better approach
A better thing to do is to combine the aggregation pipelining with mongo's bulk operations to do batch update to our exiting collection. This way:
var salaries_to_update = db.salaries.aggregate([ { $match: { _id: { $in: [ObjectId("563e1d9d04aa90562201fd5f"), ObjectId("564657f88f71450300e1fe0b")] } } }, { $project: { salary: { $divide: ["$metric_1", "$metric_2"] } } } ])
Then we do bulk update operation, which does batch of updates at once without a lot of processing and traffic headache back and forth.
var bulk = db.salaries.initializeUnorderedBulkOp()
salaries_to_update.forEach(function (salary) {
  bulk.find({ _id: salary._id }).updateOne({ $set: { salary: salary.salary } })
})
bulk.execute()
Ordered bulk operations are executed in order (thus the name), halting when there's an error. Unordered bulk operations are executed in no particular order (potentially in parallel), and these operations do not stop when an error occurs.
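For reference, both variants are created on the collection in the same way in the shell:

// Ordered: operations run in sequence and stop at the first error.
var orderedBulk = db.salaries.initializeOrderedBulkOp()

// Unordered: operations may be reordered or parallelized, and errors don't stop the batch.
var unorderedBulk = db.salaries.initializeUnorderedBulkOp()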
Thus, we use unordered bulk update here.
That's all