Sequelize Associations: How to update parent model when creating child?

It seems I have misunderstood Sequelize's .hasMany() and .belongsTo() associations and how to use them in a service. I have two models:
const User = db.sequelize.define("user", {
  uid: { /*...*/ },
  createdQuestions: {
    type: db.DataTypes.ARRAY(db.DataTypes.UUID),
    unique: true,
    allowNull: true,
  },
});
const Question = db.sequelize.define("question", {
  qid: { /*...*/ },
  uid: {
    type: db.DataTypes.TEXT,
  },
});
Given that one user can have many questions and each question belongs to only one user, I have the following associations:
User.hasMany(Question, {
  sourceKey: "createdQuestions",
  foreignKey: "uid",
  constraints: false,
});
Question.belongsTo(User, {
  foreignKey: "uid",
  targetKey: "createdQuestions",
  constraints: false,
});
What I want to achieve is this: after creation of a question object, the qid should reside in the user object under createdQuestions - just as the uid resides in the question object under uid. What I thought Sequelize associations would do for me is save me from individually fetching and updating the user object. Is there a corresponding method? What I have so far is:
const create_question = async (question_data) => {
  const question = { /*... question body containing uid and so forth*/ };
  return new Promise((resolve, rejected) => {
    Question.sync({ alter: true }).then(
      async () =>
        await db.sequelize
          .transaction(async (t) => {
            const created_question = await Question.create(question, {
              transaction: t,
            });
          })
          .then(() => resolve())
          .catch((e) => rejected(e))
    );
  });
};
This however only creates a question object but does not update the user. What am I missing here?

Modelling a One-to-many relationship in SQL
SQL vs NoSQL
In SQL, unlike in NoSQL, every attribute has a fixed data type with a fixed size limit. That's manifested in the SQL command for creating a new table:
CREATE TABLE teachers (
  name VARCHAR(32),
  department VARCHAR(64),
  age INTEGER
);
The reason behind this is to allow us to easily access any attribute from the database by knowing the length of each row. In our case, each row will need the space needed to store:
32 bytes (name) + 64 bytes (department) + 4 bytes (age) = 100 bytes
This is a very powerful feature of relational databases, as it reduces data retrieval to constant time: because each row has a fixed size, we know exactly where each piece of data is located.
One-to-Many Relationship: Case Study
Now, let's say we want to create a one-to-many relation between classes and teachers, where a teacher can give many classes.
A first idea is to store a list of class IDs as a column on the teachers table. But this model is not possible for two main reasons:
It will make us lose our constant-time retrieval since we don't know the size of the list anymore
We fear that the amount of space given to the list attribute won't be enough for future data. Let's say we allocate space needed for 10 classes and we end up with a teacher giving 11 classes. This will push us to recreate our database to increase the column size.
Another way would be to duplicate the teacher's row for every class they give. While this approach fixes the limited column size problem, we would no longer have a single source of truth: the same data would be duplicated and stored multiple times.
That's why, for this one-to-many relationship, we'll store the ID of the teacher inside the classes table.
This way, we still can find all the classes a teacher can teach by running
SELECT *
FROM classes
WHERE teacherID = teacher_id
And we'll avoid all the problems discussed earlier.

Your relation is a one-to-many relation: one User can have multiple Questions. In SQL, this kind of relation is modelled by adding an attribute to Question called userId (or uid, as you did). In Sequelize, this is achieved through hasMany or belongsTo, like this:
User.hasMany(Question)
Question.belongsTo(User, {
  foreignKey: 'userId',
  constraints: false
})
In other words, I don't think you need the createdQuestions attribute on User. Only one foreign key is needed to model the one-to-many relation.
Now, when creating a new question, you just need to set the userId, like this:
createNewQuestion = async (userId, title, body) => {
  const question = await Question.create({
    userId: userId, // or just userId
    title: title,   // or just title
    body: body      // or just body
  })
  return question
}
Remember, we do not store arrays in SQL. Even if we can find a way to do it, it is not what we need; there is almost always a better way.
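To the "is there a corresponding method" part of the question: once the association above is defined, Sequelize generates mixin methods on each instance, so a user's questions are derived from the foreign key instead of being stored on the user row. A sketch (variable names are illustrative):

```javascript
// With User.hasMany(Question) in place, user instances get
// getQuestions / createQuestion / countQuestions mixins:
const user = await User.findByPk(someUserId);

// Creates the question with question.userId already set to this user:
const question = await user.createQuestion({ title: 'How?', body: '...' });

// Later, fetch or count the user's questions instead of reading an array column:
const questions = await user.getQuestions();
const howMany = await user.countQuestions();
```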

Related

How do I properly design Aggregate in DDD, Event-sourcing

Suppose I want to build an e-commerce system. I have two aggregates here: ProductAggregate and UserAggregate. The product aggregate contains productId and price. The user aggregate contains userId and balance. Here's the problem: in event sourcing we should not rely on the read model, since there might be an eventual consistency problem. OK, so I guess we should rely on the command model, but these are two different command models. I read somewhere that an aggregate should only rely on its own state. Let's say the user wants to buy a product: I have to check if he has enough balance, and in order to do that I need to know the price of the product. So the read model is not allowed, and querying another aggregate is not allowed. What options do I have here?
const ProductAggregate = {
  state: {
    productId: "product-1",
    price: 100
  }
}
const UserAggregate = {
  state: {
    userId: "userId-1",
    balance: 50
  },
  handlePurchase: ({ userId, productId }) => {
    // todo I got productId from the client, but how can I retrieve its price ?
    if (this.state.balance < price) {
      throw "Insufficient balance bro."
    }
  }
}
So I thought it must be my bad aggregate design that makes UserAggregate require state from outside its own context. So in this situation, how do I properly design the aggregates for User and Product?
edited:
I have been thinking all day about a solution, and I came up with this approach. Instead of putting the purchase command in the UserAggregate, I put it in the ProductAggregate and call it OrderProductCommand, which is a bit weird for me since the product itself can't create an order, but the user can (it seems to work anyway, I don't even know). With this approach I can now retrieve the price and send another command, DeductBalanceCommand, which deducts that amount of money from the user.
const ProductAggregate = {
  state: {
    productId: "product-1",
    price: 100
  },
  handleOrder: async ({ productId, userId }) => {
    await commandBus.send({
      command: "handleDeduct",
      params: {
        userId: userId,
        amount: this.state.price
      }
    })
      .then(r => eventBus.publish({
        event: "OrderCreated",
        params: {
          productId: productId,
          userId: userId
        }
      }))
      .catch(e => {
        throw "Unable to create order due to " + e.message
      })
  }
}
const UserAggregate = {
  state: {
    userId: "userId-1",
    balance: 50
  },
  handleDeduct: ({ userId, amount }) => {
    if (this.state.balance < amount) {
      throw "Insufficient balance bro."
    }
    eventBus.publish({
      event: "BalanceDeducted",
      params: {
        userId: userId,
        amount: amount
      }
    })
  }
}
Is it fine and correct to use this approach? It feels a bit weird to me, or maybe it's just the way of thinking in the DDD world?
P.S. I added the javascript tag so my code gets syntax highlighting and is easier to read.
First of all, regarding your handle, you're not stupid :)
A few points:
In many situations you can query the read model even though there's eventual consistency. If you reject a command that would have been accepted had a pending update become visible in the read model, that can typically be retried. If you accept a command that would have been rejected, there's often a compensating action that can be applied after the fact (e.g. a delay between ordering a physical product and that product being delivered).
There are a couple of patterns that can be useful. One is the saga pattern where you would model the process of a purchase. Rather than "user A buys product X", you might have an aggregate corresponding to "user A's attempt to purchase product X", which validates and reserves that user A is able to buy X and that X is able to be purchased.
Every write model with an aggregate implies the existence of one sufficiently consistent read model for that aggregate. One can thus define queries or "read-only" commands against the write model. CQRS (IMO) shouldn't be interpreted as "don't query the write model" but "before trying to optimize the write model for reads (whether ease, performance, etc.), give strong consideration to handling that query with a read model": i.e. if you're querying the write model, you give up some of the right to complain about the queries being slow or difficult. Depending on how you're implementing aggregates this option may or may not be easy to do.
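To make the saga idea above concrete, here is a minimal in-memory sketch of "user A's attempt to purchase product X". All names, event shapes, and data are illustrative, not a real framework:

```javascript
// In-memory stores standing in for the two aggregates' command models.
const users = { 'user-1': { balance: 150 } };
const products = { 'product-1': { price: 100 } };

// The saga aggregate models the purchase attempt itself.
function startPurchase(userId, productId) {
  const saga = { userId, productId, state: 'started', events: [] };

  // Step 1: get the price from the product's command model
  // (not from an eventually-consistent read model).
  const price = products[productId].price;
  saga.events.push({ event: 'PriceQuoted', price });

  // Step 2: try to reserve the funds on the user aggregate.
  if (users[userId].balance < price) {
    saga.state = 'rejected';
    saga.events.push({ event: 'PurchaseRejected', reason: 'insufficient balance' });
    return saga;
  }
  users[userId].balance -= price;
  saga.state = 'completed';
  saga.events.push({ event: 'BalanceReserved', amount: price }, { event: 'OrderPlaced' });
  return saga;
}

const result = startPurchase('user-1', 'product-1');
// result.state is 'completed' and the user's balance drops to 50
```

In a real system each step would be a command dispatched to the owning aggregate, and a rejection partway through would trigger compensating events, but the shape of the process is the same.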

GraphQL Unions and Sequelize

I'm having trouble understanding how to retrieve information from a GraphQL Union. I have something in place like this:
const Profile = StudentProfile | TeacherProfile
Then in my resolver I have:
Profile: {
  __resolveType(obj, context, info) {
    if (obj.studentId) {
      return 'StudentProfile'
    } else if (obj.salaryGrade) {
      return 'TeacherProfile'
    }
  },
},
This doesn't throw any errors, but when I run a query like this:
query {
  listUsers {
    id
    firstName
    lastName
    email
    password
    profile {
      __typename
      ... on StudentProfile {
        studentId
      }
      ... on TeacherProfile {
        salaryGrade
      }
    }
  }
}
This returns everything except for profile, which just returns null. I'm using Sequelize to handle my database work, but my understanding of unions was that the resolver would simply look up the relevant type for the ID being queried and return the appropriate details in the query.
If I'm mistaken, how can I get this query to work?
edit:
My list user resolver:
const listUsers = async (root, { filter }, { models }) => {
  const Op = Sequelize.Op
  return models.User.findAll(
    filter
      ? {
          where: {
            [Op.or]: [
              { email: filter },
              { firstName: filter },
              { lastName: filter },
            ],
          },
        }
      : {},
  )
}
User model relations (very simple and has no relation to profiles):
User.associate = function (models) {
  User.belongsTo(models.UserType)
  User.belongsTo(models.UserRole)
}
and my generic user resolvers:
User: {
  async type(type) {
    return type.getUserType()
  },
  async role(role) {
    return role.getUserRole()
  },
},
The easiest way to go about this is to utilize a single table (i.e. single table inheritance).
Create a table that includes columns for all the types. For example, it would include both student_id and salary_grade columns, even though these will be exposed as fields on separate types in your schema.
Add a "type" column that identifies each row's actual type. In practice, it's helpful to name this column __typename (more on that later).
Create a Sequelize model for your table. Again, this model will include all attributes, even if they don't apply to a specific type.
Define your GraphQL types and your interface/union type. You can provide a __resolveType method that returns the appropriate type name based on the "type" field you added. However, if you named this field __typename and populated it with the names of the GraphQL types you are exposing, you can actually skip this step!
You can use your model like normal, utilizing find methods to query your table or creating associations with it. For example, you might add a relationship like User.belongsTo(Profile) and then lazy load it: User.findAll({ include: [Profile] }).
The biggest drawback to this approach is you lose database- and model-level validation. Maybe salary_grade should never be null for a TeacherProfile but you cannot enforce this with a constraint or set the allowNull property for the attribute to false. At best, you can only rely on GraphQL's type system to enforce validation but this is not ideal.
You can take this a step further and create additional Sequelize models for each individual "type". These models would still point to the same table, but each would only include the attributes specific to the fields you're exposing for that type. This way, you could at least enforce "required" attributes at the model level. Then, for example, you'd use your Profile model for querying all profiles, but use TeacherProfile when inserting or updating a teacher profile. This works pretty well; just be mindful that you cannot use the sync method when structuring your models like this -- you'll need to handle migrations manually. You shouldn't use sync in production anyway, so it's not a huge deal, but definitely something to be mindful of.
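As a tiny illustration of the __typename shortcut described above, the resolver can simply echo the column back. The row shapes here are made up, standing in for what the single profiles table would return:

```javascript
// Hypothetical rows as they might come back from the single profiles table;
// the __typename column holds the GraphQL type name for each row.
const rows = [
  { id: 1, __typename: 'StudentProfile', studentId: 'S-100', salaryGrade: null },
  { id: 2, __typename: 'TeacherProfile', studentId: null, salaryGrade: 'G7' },
];

// With that column in place, __resolveType reduces to reading it back:
const resolvers = {
  Profile: {
    __resolveType: (obj) => obj.__typename,
  },
};

const typenames = rows.map((r) => resolvers.Profile.__resolveType(r));
// ['StudentProfile', 'TeacherProfile']
```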

How do I retrieve all records that are NOT in a many to many association using Sequelize

Sequelize gives you the ability to define a many to many association between tables which adds some extra functionality to a Model instance.
I have a Users table and I have defined a self-association on the table like so:
User.belongsToMany(models.User, { through: 'Friends', as: 'friends', foreignKey: 'userId' });
This gives the instance of the User model a couple of extra methods like user.getFriends(). So far so good.
What I want to do is to get all users who aren't friends of our instance. Something like user.getNonFriends(). Would that be possible using Sequelize?
A quick solution I can think of: fetch user A's friends from the database, then query for all users whose IDs are not in that list. Here is an example in code:
const { Op } = require('sequelize');

const friends = await user.getFriends();
const friendIds = friends.map(friend => friend.id);
const nonFriends = await User.findAll({
  where: {
    // exclude the friends and the user himself
    id: { [Op.notIn]: [...friendIds, user.id] }
  }
});
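If you'd rather avoid the extra round trip, the same idea can be pushed into a single query with a subquery. This is a sketch that assumes the through table is named Friends with userId and friendId columns; adjust the names to your actual schema, and use bound parameters rather than string interpolation in real code:

```javascript
const { Op } = require('sequelize');

const nonFriends = await User.findAll({
  where: {
    id: {
      // everyone whose id is not in this user's rows of the join table
      [Op.notIn]: Sequelize.literal(
        `(SELECT "friendId" FROM "Friends" WHERE "userId" = ${user.id})`
      ),
      [Op.ne]: user.id, // and not the user himself
    },
  },
});
```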

How to improve the performance of this MongoDB / Mongoose query?

I have a GET all products endpoint which is taking an extremely long time to return responses:
Product.find(find, function (err, _products) {
  if (err) {
    res.status(400).json({ error: err })
    return
  }
  res.json({ data: _products })
}).sort([['_id', -1]]).populate([
  { path: 'colors', model: 'Color' },
  { path: 'size', model: 'Size' },
  { path: 'price', model: 'Price' }
]).lean()
This query is taking up to 4 seconds, despite there only being 60 documents in the products collection.
This query came from a previous developer, and I'm not so familiar with Mongoose.
What are the performance consequences of sort and populate? I assume populate is to blame here? I am not really sure what populate is doing, so I'm unclear how to either avoid it or index at a DB level to improve performance.
From the Mongoose docs, "Population is the process of automatically replacing the specified paths in the document with document(s) from other collection(s)"
So your ObjectId reference on your model gets replaced by an entire Mongoose document. Doing so on multiple paths in one query will therefore slow down your app. If you want to keep the same code structure, you can use select to specify what fields of the document that should be populated, i.e. { path: 'colors', model: 'Color', select: 'name' }. So instead of returning all the data of the Color document here, you just get the name.
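Putting that together, a rewrite of the endpoint that populates only the fields the response needs might look like this. The route shape and the selected field names are assumptions, since the actual schemas aren't shown:

```javascript
app.get('/products', async (req, res) => {
  try {
    const products = await Product.find(find)
      .sort({ _id: -1 })
      .populate([
        // select keeps each populated document small
        { path: 'colors', model: 'Color', select: 'name' },
        { path: 'size', model: 'Size', select: 'name' },
        { path: 'price', model: 'Price', select: 'amount' }
      ])
      .lean();
    res.json({ data: products });
  } catch (err) {
    res.status(400).json({ error: err });
  }
});
```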
You can also call cursor() to stream query results from MongoDB:
var cursor = Person.find().cursor();
cursor.on('data', function (doc) {
  // Called once for every document
});
cursor.on('close', function () {
  // Called when done
});
You can read more about the cursor function in the Mongoose documentation here.
In general, try to only use populate for specific tasks like getting the name of a color for only one product.
sort will not cause any major performance issues here: sorting on _id can use the default _id index, so it only becomes a concern on much larger collections.
Hope it helps!

Sequelize dynamic seeding

I'm currently seeding data with Sequelize.js and using hard coded values for association IDs. This is not ideal because I really should be able to do this dynamically right? For example, associating users and profiles with a "has one" and "belongs to" association. I don't necessarily want to seed users with a hard coded profileId. I'd rather do that in the profiles seeds after I create profiles. Adding the profileId to a user dynamically once profiles have been created. Is this possible and the normal convention when working with Sequelize.js? Or is it more common to just hard code association IDs when seeding with Sequelize?
Perhaps I'm going about seeding wrong? Should seed files correspond one-to-one with migration files in Sequelize? In Rails, there is usually only one seeds file, which you have the option of breaking out into multiple files if you want.
In general, just looking for guidance and advice here. These are my files:
users.js
// User seeds
'use strict';
module.exports = {
  up: function (queryInterface, Sequelize) {
    var users = [];
    for (let i = 0; i < 10; i++) {
      users.push({
        fname: "Foo",
        lname: "Bar",
        username: `foobar${i}`,
        email: `foobar${i}@gmail.com`,
        profileId: i + 1
      });
    }
    return queryInterface.bulkInsert('Users', users);
  },
  down: function (queryInterface, Sequelize) {
    return queryInterface.bulkDelete('Users', null, {});
  }
};
profiles.js
// Profile seeds
'use strict';
var models = require('./../models');
var User = models.User;
var Profile = models.Profile;
module.exports = {
  up: function (queryInterface, Sequelize) {
    var profiles = [];
    var genders = ['m', 'f'];
    for (let i = 0; i < 10; i++) {
      profiles.push({
        birthday: new Date(),
        gender: genders[Math.round(Math.random())],
        occupation: 'Dev',
        description: 'Cool yo',
        userId: i + 1
      });
    }
    return queryInterface.bulkInsert('Profiles', profiles);
  },
  down: function (queryInterface, Sequelize) {
    return queryInterface.bulkDelete('Profiles', null, {});
  }
};
As you can see I'm just using a hard coded for loop for both (not ideal).
WARNING: after working with sequelize for over a year, I've come to realize that my suggestion is a very bad practice. I'll explain at the bottom.
tl;dr:
never use seeders, only use migrations
never use your sequelize models in migrations, only write explicit SQL
My other suggestion still holds up that you use some "configuration" to drive the generation of seed data. (But that seed data should be inserted via migration.)
vv DO NOT DO THIS vv
Here's another pattern, which I prefer, because I believe it is more flexible and more readily understood. I offer it here as an alternative to the accepted answer (which seems fine to me, btw), in case others find it a better fit for their circumstances.
The strategy is to leverage the sqlz models you've already defined to fetch data that was created by other seeders, use that data to generate whatever new associations you want, and then use bulkInsert to insert the new rows.
In this example, I'm tracking a set of people and the cars they own. My models/tables:
Driver: a real person, who may own one or more real cars
Car: not a specific car, but a type of car that could be owned by someone (i.e. make + model)
DriverCar: a real car owned by a real person, with a color and a year they bought it
We will assume a previous seeder has stocked the database with all known Car types: that information is already available and we don't want to burden users with unnecessary data entry when we can bundle that data in the system. We will also assume there are already Driver rows in there, either through seeding or because the system is in-use.
The goal is to generate a whole bunch of fake-but-plausible DriverCar relationships from those two data sources, in an automated way.
const {
  Driver,
  Car
} = require('models')

module.exports = {
  up: async (queryInterface, Sequelize) => {
    // fetch base entities that were created by previous seeders
    // these will be used to create seed relationships
    const [drivers, cars] = await Promise.all([
      Driver.findAll({ /* limit ? */ order: Sequelize.fn('RANDOM') }),
      Car.findAll({ /* limit ? */ order: Sequelize.fn('RANDOM') })
    ])
    const fakeDriverCars = Array(30).fill().map((_, i) => {
      // create new tuples that reference drivers & cars,
      // and which reflect the schema of the DriverCar table
    })
    return queryInterface.bulkInsert('DriverCar', fakeDriverCars);
  },
  down: (queryInterface, Sequelize) => {
    return queryInterface.bulkDelete('DriverCar', null, {});
  }
}
That's a partial implementation. However, it omits some key details, because there are a million ways to skin that cat. Those pieces can all be gathered under the heading "configuration," and we should talk about it now.
When you generate seed data, you usually have requirements like:
I want to create at least a hundred of them, or
I want their properties determined randomly from an acceptable set, or
I want to create a web of relationships shaped exactly like this
You could try to hard-code that stuff into your algorithm, but that's the hard way. What I like to do is declare "configuration" at the top of the seeder, to capture the skeleton of the desired seed data. Then, within the tuple-generation function, I use that config to procedurally generate real rows. That configuration can obviously be expressed however you like. I try to put it all into a single CONFIG object so it all stays together and so I can easily locate all the references within the seeder implementation.
Your configuration will probably imply reasonable limit values for your findAll calls. It will also probably specify all the factors that should be used to calculate the number of seed rows to generate (either by explicitly stating quantity: 30, or through a combinatoric algorithm).
As food for thought, here is an example of a very simple config that I used with this DriverCar system to ensure that I had 2 drivers who each owned one overlapping car (with the specific cars to be chosen randomly at runtime):
const CONFIG = {
  ownership: [
    [ 'a', 'b', 'c', 'd' ], // driver 1 linked to cars a, b, c, and d
    [ 'b' ],                // driver 2 linked to car b
    [ 'b', 'b' ]            // driver 3 has two of the same kind of car
  ]
};
I actually used those letters, too. At runtime, the seeder implementation would determine that only 3 unique Driver rows and 4 unique Car rows were needed, and apply limit: 3 to Driver.findAll, and limit: 4 to Car.findAll. Then it would assign a real, randomly-chosen Car instance to each unique string. Finally, when generating association tuples, it uses the string to look up the chosen Car from which to pull foreign keys and other values.
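A sketch of that resolution step in plain JavaScript, using stand-in data instead of real findAll results (all names and IDs here are made up):

```javascript
const CONFIG = { ownership: [['a', 'b'], ['b']] };

// Stand-ins for the randomly-fetched model instances:
const drivers = [{ id: 10 }, { id: 11 }]; // one per ownership entry
const cars = [{ id: 100 }, { id: 101 }];  // one per unique letter

// Map each unique letter to a concrete car
const letters = [...new Set(CONFIG.ownership.flat())]; // ['a', 'b']
const carByLetter = Object.fromEntries(letters.map((l, i) => [l, cars[i]]));

// Build the DriverCar tuples for bulkInsert
const driverCars = CONFIG.ownership.flatMap((owned, i) =>
  owned.map(letter => ({
    driverId: drivers[i].id,
    carId: carByLetter[letter].id,
  }))
);
// → [ {driverId:10, carId:100}, {driverId:10, carId:101}, {driverId:11, carId:101} ]
```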
There are undoubtedly fancier ways of specifying a template for seed data. Skin that cat however you like. Hopefully this makes it clear how you'd marry your chosen algorithm to your actual sqlz implementation to generate coherent seed data.
Why the above is bad
If you use your sequelize models in migration or seeder files, you will inevitably create a situation in which the application will not build successfully from a clean slate.
How to avoid madness:
Never use seeders, only use migrations
(Anything you can do in a seeder, you can do in a migration. Bear that in mind as I enumerate the problems with seeders, because that means none of these problems gain you anything.)
By default, sequelize does not keep records of which seeders have been run. Yes, you can configure it to keep records, but if the app has already been deployed without that setting, then when you deploy your app with the new setting, it'll still re-run all your seeders one last time. If that's not safe, your app will blow up. My experience is that seed data can't and shouldn't be duplicated: if it doesn't immediately violate uniqueness constraints, it'll create duplicate rows.
Running seeders is a separate command, which you then need to integrate into your startup scripts. It's easy for that to lead to a proliferation of npm scripts that make app startup harder to follow. In one project, I converted the only 2 seeders into migrations, and reduced the number of startup-related npm scripts from 13 to 5.
It can be hard to make sense of the order in which seeders are run. Remember also that migrations and seeders are run by separate commands, which means you can't interleave them: you have to run all migrations first, then run all seeders. As the database changes over time, you'll run into the problem I describe next.
Never use your sequelize models in your migrations
When you use a sequelize model to fetch records, it explicitly fetches every column it knows about. So, imagine a migration sequence like this:
M1: create tables Car & Driver
M2: use Car & Driver models to generate seed data
That will work. Fast-forward to a date when you add a new column to Car (say, isElectric). That involves: (1) creating a migration to add the column, and (2) declaring the new column on the sequelize model. Now your migration process looks like this:
M1: create tables Car & Driver
M2: use Car & Driver models to generate seed data
M3: add isElectric to Car
The problem is that your sequelize models always reflect the final schema, without acknowledging the fact that the actual database is built by ordered accretion of mutations. So, in our example, M2 will fail because any built-in selection method (e.g. Car.findOne) will execute a SQL query like:
SELECT
"Car"."make" AS "Car.make",
"Car"."isElectric" AS "Car.isElectric"
FROM
"Car"
Your DB will throw because Car doesn't have an isElectric column when M2 executes.
The problem won't occur in environments that are only one migration behind, but you're boned if you hire a new developer or nuke the database on your local workstation and build the app from scratch.
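To make the "explicit SQL only" advice concrete, here is a minimal sketch of seeding from inside a migration with raw SQL, so it only touches columns that exist at that point in the migration history. The table and column names are made up:

```javascript
// A data migration that seeds rows without going through any model.
// Because the SQL is explicit, adding columns to Car later cannot break it.
const migration = {
  up: (queryInterface) =>
    queryInterface.sequelize.query(
      `INSERT INTO "Car" ("make", "model") VALUES ('Toyota', 'Corolla');`
    ),
  down: (queryInterface) =>
    queryInterface.sequelize.query(
      `DELETE FROM "Car" WHERE "make" = 'Toyota' AND "model" = 'Corolla';`
    ),
};

module.exports = migration;
```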
Instead of using different seeds for Users and Profiles, you could seed them together in one file using Sequelize's create-with-association feature.
Additionally, when using a series of create() calls, you must wrap them in a Promise.all(), because the seeding interface expects a Promise as its return value.
up: function (queryInterface, Sequelize) {
  return Promise.all([
    models.Profile.create({
      data: 'profile stuff',
      users: [{
        name: "name",
        ...
      }, {
        name: 'another user',
        ...
      }]
    }, {
      include: [models.User]
    }),
    models.Profile.create({
      data: 'another profile',
      users: [{
        name: "more users",
        ...
      }, {
        name: 'another user',
        ...
      }]
    }, {
      include: [models.User]
    })
  ])
}
Not sure if this is really the best solution, but that's how I got around maintaining foreign keys myself in seed files.
