MongoDB data modelling performance - javascript

I'm trying to work out the best way, in terms of performance cost and redundancy, to build a large document schema in MongoDB. The final JSON returned from my REST API to the app will likely have the same structure as the schema below.
Internally the data will not be used in a many-to-many fashion, which is why I bound it into a single document. Only the id will be used as a reference in other collections.
What do you think: is it better to split it up in a relational way, with multiple collections storing the content inside each deliverable and linked by reference, or to keep everything embedded? (Since NoSQL has no joins, I thought embedding would be faster.)
I'm currently using Mongoose in a Node app.
The Schema:
projectSchema = new Schema({
name: {
type: String,
required: true,
minlength: 3,
maxlength: 50
},
companyId: {
type: mongoose.Types.ObjectId,
ref: 'companies',
required: true
},
deleted: {
type: Number,
enum: [0, 1],
default: 0
},
predictedStartDate: {
type: Date,
default: ""
},
predictedEndDate: {
type: Date,
default: ""
},
realStartDate: {
type: Date,
default: ""
},
realEndDate: {
type: Date,
default: ""
},
//not final version
riskRegister: [{
name: String,
wpId: {
type: mongoose.Types.ObjectId,
ref: 'projects.deliverables.workPackages.id',
required: true
},
probability: String,
impact: String,
riskOwner: String,
response: String,
duration: String,
trigger: String,
status: String,
plannedTimming: String
}],
deliverables: [{
body: String,
workPackages: [{
body: String,
activities: [{
body: String,
tasks: [{
content: String,
properties: [{
dependecies: Array,
risk: {
type: Number,
enum: [0,1],
required: true
},
estimatedTime: {
type: Number,
required: true
},
realTime: {
required: true,
default: 0,
type: Number
},
responsible: {
id: {
type: Number,
default: -1
},
type: {
type: String,
enum: [0, 1], //0 - user, 1 - team
default: -1
}
},
materialCosts: {
type: Number,
default: 0
},
status: {
type: Number,
default: 0
},
approval: {
type: Number,
default: 0
},
startDate: {
type: Date,
default: ""
},
finishDate: {
type: Date,
default: ""
},
endDate: {
type: Date,
default: ""
},
userStartDate: {
type: Date,
default: ""
},
endStartDate: {
type: Date,
default: ""
},
taskNum: {
type: Number,
required: true
},
lessonsLearn: {
insertedAt: {
type: Date,
default: Date.now
},
creatorId: {
type: mongoose.Types.ObjectId,
ref: 'users',
required: true
},
situation: {
type: String,
required: true
},
solution: {
type: String,
required: true
},
attachments: Array
}
}]
}]
}]
}]
}]
})

The only concern I would raise is about deliverables. If a future use case requires CRUD operations on the activities or tasks of a workPackage, note that the MongoDB positional operator $ does not support nested arrays, so you would be forced to load the whole deliverables array, iterate over it in memory, and only then write the updated deliverables back.
My suggestion would be to allow arrays only at the first level of the document. The inner objects (activities and tasks) should be modeled in separate collections. MongoDB 4.0 and later support multi-document transactions, so you can get ACID guarantees for your operations against the database and manipulate all of this information atomically.
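One possible reading of that suggestion is sketched below in plain JavaScript: the nested project document is split into flat documents that each point at their parent by id. The helper name flattenProject, the parent-reference field names (projectId, deliverableId, workPackageId, activityId) and the choice to move workPackages out as well are assumptions made for illustration; the sketch also assumes every embedded subdocument has its own _id, which Mongoose adds to array subdocuments by default.

```javascript
// Hypothetical helper (not from the original answer): splits one nested
// project document into flat documents for separate collections.
// Assumes each embedded subdocument carries its own _id.
function flattenProject(project) {
  const workPackages = [];
  const activities = [];
  const tasks = [];

  const deliverables = (project.deliverables || []).map((deliverable) => {
    for (const wp of deliverable.workPackages || []) {
      workPackages.push({
        _id: wp._id,
        projectId: project._id,
        deliverableId: deliverable._id,
        body: wp.body,
      });
      for (const activity of wp.activities || []) {
        activities.push({
          _id: activity._id,
          workPackageId: wp._id,
          body: activity.body,
        });
        for (const task of activity.tasks || []) {
          tasks.push({
            _id: task._id,
            activityId: activity._id,
            content: task.content,
            properties: task.properties,
          });
        }
      }
    }
    // The deliverable stays embedded in the project, minus its nested array.
    return { _id: deliverable._id, body: deliverable.body };
  });

  return {
    project: { ...project, deliverables },
    workPackages,
    activities,
    tasks,
  };
}
```

With this shape, updating a single task becomes an update on one small document in a tasks collection, and changes that span several collections can be grouped in a transaction as the answer suggests.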

Related

How to use mongoose transactions with updateMany?

I am using the Mongoose updateMany() method and I want it to be part of a transaction. The documentation shows an example with save(), where I can do something like Model.save({session: mySession}), but I don't know how to do the same with, for example, Model.updateMany().
UPDATE:
For example I have two models called SubDomain and Service and they look like this respectively:
SUB-DOMAIN
{
name: {
type: String,
required: true,
},
url: {
type: String,
required: true,
unique: true,
},
services: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "Service",
},
],
user: {
type: mongoose.Schema.Types.ObjectId,
ref: "User",
},
}
SERVICE:
{
name: {
type: String,
required: true,
},
description: {
type: String,
required: true,
},
price: { type: Number },
tags: { type: Array },
packages: [
{
name: { type: String, required: true },
description: { type: String, required: true },
price: { type: Number, required: true },
},
],
map: { type: String },
isHidden: {
type: Boolean,
required: true,
default: false,
},
sortingOrder: { type: Number },
isForDomain: { type: Boolean, required: false, default: false },
isForSubDomain: { type: Boolean, required: false, default: false },
subDomains: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "SubDomain",
},
],
}
Now the main field here is the services field in SubDomain and subDomains field in Service.
The complicated part😅:
Whenever the user wants to create new service, I want to $push that service's _id into the array of services of all the subDomains inside that new service
And for that, I am using the updateMany() like this:
const sess = await mongoose.startSession();
sess.startTransaction();
const newService = new Service({
  _id: new mongoose.Types.ObjectId(),
  subDomains: req.body.subDomains,
  ...foo
});
await SubDomain.updateMany(
  { _id: { $in: req.body.subDomains } },
  { $push: { services: newService._id } }
);
The problem starts here, of course I can do:
newService.save({session: sess})
but how do I keep my SubDomain's updateMany in the same transaction (i.e. sess)?
I know my example is hard to wrap your head around, but I have tried to pick the simplest example rather than copying my exact code, which would have been a lot harder to follow.
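A minimal sketch of one way to do this, assuming a recent Mongoose version: query methods such as updateMany() take an options object as their third argument, and passing { session } there ties the update to the same transaction as save({ session }). The function name createService is hypothetical, and the models and the mongoose instance are passed in as parameters only so the sketch has no hidden imports. Transactions also require MongoDB to be running as a replica set or sharded cluster.

```javascript
// Sketch: save a new Service and push its _id into several SubDomains
// inside one transaction. `Service` and `SubDomain` are the models from
// the question; `createService` itself is a hypothetical name.
async function createService(mongoose, Service, SubDomain, body) {
  const session = await mongoose.startSession();
  try {
    let created;
    // withTransaction commits when the callback resolves and aborts when
    // it throws.
    await session.withTransaction(async () => {
      const service = new Service(body);
      await service.save({ session });
      await SubDomain.updateMany(
        { _id: { $in: body.subDomains } },
        { $push: { services: service._id } },
        { session } // same session, so same transaction as the save above
      );
      created = service;
    });
    return created;
  } finally {
    await session.endSession();
  }
}
```

If you prefer the explicit startTransaction() style from the question, the same { session: sess } option applies, followed by sess.commitTransaction() on success or sess.abortTransaction() on error.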

MongoDB Schema for interview time slot availability

I'm building a website where admins can create interviews by selecting the participants, the interview start time and the end time. I have divided the participants into two groups (collections): Applicants and Team_Members.
I tried creating a third collection called Interviews to keep track of the start and end times of each interview, but I no longer think a third collection is needed.
So far, these are the schemas I have come up with:
const applicantSchema = new Schema({
name: {
type: String,
trim: true,
required: [true, "Name is required"],
},
image: {
type: String,
},
interviews: [
{
start_time: String,
end_time: String,
},
],
});
const interviewerSchema = new Schema({
name: {
type: String,
trim: true,
required: [true, "Name is required"],
},
image: {
type: String,
default: "download.png",
},
interviews: [{
start_time: String,
end_time: String,
}, ],
});
How should I update the interviews property once each new interview is booked? And am I going in the right direction in terms of forming the schemas for the problem required?
You could use the same schema for both. Just add interviewee: { type: Boolean, required: true } and include that criterion when you search.
For the interview start and end times, change the values to Date. That way you can query them with $gt and $lt to make sure you don't double-book a time slot. To mark a slot as booked, simply add another field: booked: { type: Boolean, default: false }
const interviewSchema = new Schema({
  name: {
    type: String,
    trim: true,
    required: [true, "Name is required"],
  },
  image: {
    type: String,
  },
  // true for applicants, false for team members
  interviewee: {
    type: Boolean,
    required: true
  },
  interviews: [
    {
      start_time: {
        type: Date,
        // Pass the function itself: new Date() would be evaluated once,
        // when the schema is defined, and reused for every document.
        default: Date.now
      },
      end_time: {
        type: Date,
        default: Date.now
      },
      booked: {
        type: Boolean,
        default: false
      }
    },
  ],
});
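The double-booking check mentioned above can be sketched as follows. Two time slots overlap when each one starts before the other ends, which is what the $lt / $gt pair expresses. The function names overlaps and conflictFilter are hypothetical; the field names come from the schema above.

```javascript
// Two intervals overlap when each one starts before the other ends.
// Back-to-back slots (one ends exactly when the next starts) do not overlap.
function overlaps(aStart, aEnd, bStart, bEnd) {
  return aStart < bEnd && bStart < aEnd;
}

// Builds a MongoDB filter that matches participants (by _id) who already
// have an interview overlapping the requested [start, end) slot.
function conflictFilter(participantIds, start, end) {
  return {
    _id: { $in: participantIds },
    interviews: {
      $elemMatch: {
        start_time: { $lt: end },
        end_time: { $gt: start },
      },
    },
  };
}
```

If a find() with this filter returns no documents, none of the selected participants is busy, and the new slot can be added to each of them with a $push on interviews.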

How to optimize performance with CREATE, PUT, and DELETE requests on MongoDB?

I have a database named "reviews" that is 9.7GB in size. It has a collection named products. I was able to optimize READ requests with indexing by running the command db.products.ensureIndex({product_name: 1}); When I run db.products.find({product_name:"nobis"}).explain("executionStats"); in the MongoDB shell, the execution time drops from 28334ms to 3301ms.
I have the following 2 questions:
1) How do I use explain("executionStats"); on CREATE, PUT and DELETE requests? For example, I got the error [thread1] TypeError: db.products.insert(...).explain is not a function when I tried the following insert:
db.products.insert({"product_id": 10000002,"product_name": "tissue","review": [{"review_id": 30000001,"user": {"user_id": 30000001,"firstname": "Peter","lastname": "Chen","gender": "Male","nickname": "Superman","email": "hongkongbboy#gmail.com","password": "123"},"opinion": "It's good","text": "It's bad","rating_overall": 3,"doesRecommended": true,"rating_size": "a size too big","rating_width": "Slightly wide","rating_comfort": "Uncomfortable","rating_quality": "What I expected","isHelpful": 23,"isNotHelpful": 17,"created_at": "2007-10-19T09:03:29.967Z","review_photo_path": [{"review_photo_id": 60000001,"review_photo_url": "https://sdcuserphotos.s3.us-west-1.amazonaws.com/741.jpg"}, {"review_photo_id": 60000002,"review_photo_url": "https://sdcuserphotos.s3.us-west-1.amazonaws.com/741.jpg"}]}, {"review_id": 30000002,"user": {"user_id": 30000002,"firstname": "Peter","lastname": "Chen","gender": "Male","nickname": "Superman","email": "hongkongbboy#gmail.com","password": "123"},"opinion": "It's good","text": "It's bad","rating_overall": 3,"doesRecommended": true,"rating_size": "a size too big","rating_width": "Slightly wide","rating_comfort": "Uncomfortable","rating_quality": "What I expected","isHelpful": 23,"isNotHelpful": 17,"created_at": "2007-10-19T09:03:29.967Z","review_photo_path": [{"review_photo_id": 60000003,"review_photo_url": "https://sdcuserphotos.s3.us-west-1.amazonaws.com/741.jpg"}]}]}).explain("executionStats");
2) Is there any performance optimization method I can use for the CREATE, PUT and DELETE requests? For example, when I use Postman to time a DELETE request, the response takes 38.73 seconds.
const deleteReview = (request, response) => {
const id = parseInt(request.params.id);
Model.ProductModel.findOneAndDelete({ "review.review_id": id}, (error, results) => {
if (error) {
response.status(500).send(error);
} else {
response.status(200).send(results);
}
});
};
This is my MongoDB schema:
const mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/reviews', { useNewUrlParser: true, useUnifiedTopology: true, useCreateIndex: true });
const Schema = mongoose.Schema;
const productSchema = new Schema({
product_id: { type: Number, required: true, unique: true },
product_name: { type: String, required: true, unique: true },
review: [{
review_id: { type: Number, required: true, unique: true },
user: {
user_id: { type: Number },
firstname: { type: String },
lastname: { type: String },
gender: { type: String, enum: ['Male', 'Female', 'Other'] },
nickname: { type: String },
email: { type: String, required: true },
password: { type: String, required: true },
},
opinion: { type: String, required: true },
text: { type: String },
rating_overall: { type: Number, min: 1, max: 5, required: true },
doesRecommended: { type: Boolean, required: true },
rating_size: { type: String, enum: ['a size too small', '1/2 a size too small', 'Perfect', '1/2 a size too big', 'a size too big'], required: true },
rating_width: { type: String, enum: ['Too narrow', 'Slightly narrow', 'Perfect', 'Slightly wide', 'Too wide'], required: true },
rating_comfort: { type: String, enum: ['Uncomfortable', 'Slightly uncomfortable', 'Ok', 'Comfortable', 'Perfect'], required: true },
rating_quality: { type: String, enum: ['Poor', 'Below average', 'What I expected', 'Pretty great', 'Perfect'], required: true },
isHelpful: { type: Number, required: true, default: 0 },
isNotHelpful: { type: Number, required: true, default: 0 },
created_at: { type: Date, required: true },
review_photo_path: [{
review_photo_id: { type: Number },
review_photo_url: { type: String }
}]
}]
});
const ProductModel = mongoose.model('product', productSchema);
module.exports = { ProductModel };
If you do not have one, make sure you have an index on review.review_id in your products collection. You're using that field to look up what to delete, so it should be indexed.
I read your deleteReview function as deleting the whole product document that contains the review, not just removing the individual review. Is that what you expect?
You should be able to just $pull the review from the review array to get rid of it.
You can use explain on an update like so:
db.products.explain().update({...}, {...});
See: https://docs.mongodb.com/manual/reference/method/db.collection.explain/
You can explain:
aggregate()
count()
find()
remove()
update()
distinct()
findAndModify()
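As a rough sketch of the $pull suggestion above: the update below removes only the matching review from the review array instead of deleting the whole product. The helper names buildRemoveReview and makeDeleteReview are hypothetical, and the model is injected as a parameter so the sketch is self-contained; in the question's code it would be Model.ProductModel.

```javascript
// Builds the filter and update that remove one review from its product.
// In the shell, the supporting index and the explain call would look like:
//   db.products.createIndex({ "review.review_id": 1 })
//   db.products.explain("executionStats").update(filter, update)
function buildRemoveReview(reviewId) {
  return {
    filter: { 'review.review_id': reviewId },
    update: { $pull: { review: { review_id: reviewId } } },
  };
}

// Express-style handler; ProductModel is passed in (an assumption made so
// the sketch has no hidden imports).
function makeDeleteReview(ProductModel) {
  return async (request, response) => {
    const id = parseInt(request.params.id, 10);
    try {
      const { filter, update } = buildRemoveReview(id);
      const result = await ProductModel.updateOne(filter, update);
      response.status(200).send(result);
    } catch (error) {
      response.status(500).send(error);
    }
  };
}
```

The filter still relies on the review.review_id index to find the product quickly; without it, every write that searches by review id scans the collection.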

How to store history of the documents in Mongoose/MongoDB?

I have the following Schema -
const leadSchema = new Schema(
{
emails: [{ type: Email, default: null }],
name: { type: String },
country: { type: String },
city: { type: String, index: true },
source: {
type: Number,
min: 1,
max: leadConfig.sources.length,
required: true
},
course: { type: Schema.Types.ObjectId, ref: 'courses',required: true},
gender: { type: String, enum: leadConfig.gender },
status: {type: Schema.Types.ObjectId, ref: 'status' },
dob: Date,
parent_name: String,
counselor: { type: Schema.Types.ObjectId, ref: 'users', default: null },
consultant_amount: { type: Number, min: 0, default: 0 },
consultant_amount_paid: { type: Number, min: 0, default: 0 },
loan: { type: Boolean, default: false },
reported: { type: Boolean, default: false },
scholarship: { type: Number, default: 0 },
student_id: { type: Number, default: null },
next_interection_deadline: { type: Date, default: null },
session: { type: Schema.Types.ObjectId, ref: 'session' }
},
{ timestamps: true }
);
module.exports = mongoose.model('leads', leadSchema);
I want to store the update history of all the documents of this collection.
For Example -
If I change the name field of a lead from 'John' to 'Jane' then a record should be saved in a history table with the following schema -
{
_id: (ObjectId),
collectionName: "lead",
column_name: "name",
oldValue: "John",
newValue: "Jane",
updateAt: Date()
}
I googled some plugins like mongoose-diff-history and it serves the purpose well but the only drawback was that it only worked with .save() method and not with mongodb updates methods.
I have been working on this problem for so many days but couldn't find a correct and efficient solution. Any solutions to this problem will be very much appreciated.
Have you looked into the middleware hooks? Usually what you want can be handled there. For example, look into Mongoose hooks: http://mongoosejs.com/docs/middleware.html
You basically have "events" which allow you to intercept records just before "save" etc. and do something (in your case, store or log the change somewhere).
Here is an example from their docs:
var schema = new Schema(..);
schema.pre('save', function(next) {
// do stuff
next();
})
Here is one for the 'update':
schema.pre('update', function() {
this.update({},{ $set: { updatedAt: new Date() } });
});
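Inside such a hook, the history records themselves could be produced by a small diff helper like the one sketched below. The function name buildHistoryRecords is hypothetical; the record shape follows the example in the question. How the "before" and "after" values are obtained is left open here: in a query hook, one option (an assumption, not shown in the original answer) is to load the current document first and compare it with the fields being set.

```javascript
// Hypothetical helper: compares the changed fields against the stored
// document and returns one history record per field whose value differs.
function buildHistoryRecords(collectionName, before, after, now = new Date()) {
  const records = [];
  for (const field of Object.keys(after)) {
    const oldValue = before[field];
    const newValue = after[field];
    // Comparing JSON strings is a simplification that is good enough for
    // plain values, arrays and simple nested objects.
    if (JSON.stringify(oldValue) !== JSON.stringify(newValue)) {
      records.push({
        collectionName,
        column_name: field,
        oldValue,
        newValue,
        updateAt: now,
      });
    }
  }
  return records;
}

// Usage: only the changed field produces a record.
const records = buildHistoryRecords(
  'lead',
  { name: 'John', city: 'Paris' },
  { name: 'Jane', city: 'Paris' }
);
console.log(records.length, records[0].column_name); // prints: 1 name
```

The resulting records can then be inserted into a separate history collection.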

Javascript member variables enclosed with square brackets

I'm trying to debug an issue with Node.JS and Mongoose not saving the data correctly. I have a large object with multiple nested objects and arrays of objects. I'm viewing it in Chrome's devtools and the Node devtools right before saving. It's only a couple of nested objects that have this issue. The member variables in the array of objects are enclosed with square brackets []'s. I discovered this while viewing the data in Node devtools right before saving.
0: Object
[attributeName]: "Grow raccoon tail"
[componentId]: "58918f2c6f92704b0868aa30"
[derivativeName]: ""
[entityId]: "58918f9d6f92704b0868aa3e"
[isStateVariable]: "false"
[name]: "Feather"
[parentId]: "0"
[startValue]: "0"
[variableName]: "tail"
Here is the schema of the problem object
var componentVariableSchema = mongoose.Schema({
componentId: { type: ObjectId },
entityId: { type: ObjectId },
name: String,
attributeName: String,
isStateVariable: { type: Boolean, default: false },
variableName: String,
derivativeName: String,
startValue: { type: Number, default: 0.0 }
},
{
timestamps: true
});
Here is the schema of the parent object
var simulationSchema = mongoose.Schema({
createdDate: { type: Date, default: Date.now },
modifiedDate: { type: Date, default: Date.now },
name: { type: String, default: Date.now },
description: { type: String },
parentId: { type: ObjectId, ref: 'Project' },
// Variables
componentVariables: [componentVariableSchema],
simulationVariables: [simulationVariableSchema],
// Integrator
integratorType: { type: String },
integratorParams: [Number],
// Conditions
startTime: 0,
stopTime: 0,
initialValues: [Number],
// Containers for the code portion of the simulation
initializationCode: { type: String },
preFireCommandCode: { type: String },
staveVariableDerivativesCode: { type: String },
postFireCommandCode: { type: String },
// A simulation can be run multiple times with different sets of results
resultsList: [simulationResultSchema],
createdByUserId: { type: ObjectId, ref: 'User' },
isViewableToOthers: Boolean,
deletedBy: { type: ObjectId },
deletedDate: { type: Date }
},
{
timestamps: true
});
The simulationVariables array has the same problem. The parent object simulationSchema is itself within an array in its parent object, but all of its member variables look correct. I've never seen this before, and I can't figure out what's causing it.
