I have a lot of experience with MySQL and Oracle as relational databases, but I'm very confused about how to properly create collections in MongoDB.
I have read many articles and watched YouTube videos, but I haven't found a real example of how to properly structure two or three collections and the relationships between them (properly, best practice, or whatever...).
For example... Assume we have three collections Users, Comments, and Posts.
What is the right design to use? If Comments are embedded inside Posts, what do I have to do when a user changes his name? Should I run through all comments related to a post in order to update his name in each comment?
If they are referenced instead, how do I fetch data from all three collections (Post->Comment->User)? Aggregation? If so, how will MongoDB behave when the collection grows to, let's say, 100,000 documents?
Well... I hope you've got my point.
I'll be glad if you guys post your comments and thoughts about all this and "clarify" it.
Thanks.
You are still thinking in SQL. There are not three collections here, just one, or two if both the number of posts and the number of comments are huge.
Let's look at Posts as a collection.
A post is created by a user, and so are the comments (by different users).
A user will not change his name often, if at all, and when one does, just run an update.
{
  _id : PostID,
  title: string,
  body: string,
  meta: {...},
  user: {
    id : UserId,
    name: string
  },
  comments: [
    {
      id: CommentId,
      by: {
        id : UserId,
        name: string
      }
    },
    ...
  ]
}
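If a user changes his name, you run two updates: one for the post owner and one for the comments using the positional operator. Note that the plain $ operator only updates the first matching array element, so reaching every comment by that user needs the filtered form $[<identifier>] with arrayFilters (MongoDB 3.6+). Personally, I don't think this will happen often.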
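As a rough sketch of those two updates, the helper below builds the filter, update, and options that could be passed to the Node.js driver's updateMany. The helper name buildRenameUpdates and the posts collection name are assumptions; the field paths follow the document shape above.

```javascript
// Builds the two update specs needed when a user changes their name.
// Field paths follow the post document above; the helper itself is hypothetical.
function buildRenameUpdates(userId, newName) {
  return [
    {
      // 1) Posts owned by the user.
      filter: { "user.id": userId },
      update: { $set: { "user.name": newName } },
      options: {},
    },
    {
      // 2) Every embedded comment written by the user.
      // $[c] plus arrayFilters touches all matching array elements.
      filter: { "comments.by.id": userId },
      update: { $set: { "comments.$[c].by.name": newName } },
      options: { arrayFilters: [{ "c.by.id": userId }] },
    },
  ];
}

// Usage with the Node.js driver (assumes a connected `db` and a `posts` collection):
// for (const { filter, update, options } of buildRenameUpdates(42, "New Name")) {
//   await db.collection("posts").updateMany(filter, update, options);
// }
```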
If the number of posts is huge and they are active with comments and such, then you can think about two collections, one for posts and one for comments, or consider sharding, but not as a first option. 100K is not a very big number at all.
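If comments do move to their own collection, a $lookup stage is one way to read a post together with its comments. This is only a sketch: the comments collection name and the postId field on each comment are assumptions, not something the answer specifies.

```javascript
// Aggregation pipeline joining one post with its comments from a separate collection.
// Assumes each comment document stores the post's _id in a `postId` field.
function postWithCommentsPipeline(postId) {
  return [
    { $match: { _id: postId } },
    {
      $lookup: {
        from: "comments",
        localField: "_id",
        foreignField: "postId",
        as: "comments",
      },
    },
  ];
}

// Usage (assumes a connected `db`):
// const [post] = await db.collection("posts")
//   .aggregate(postWithCommentsPipeline(id)).toArray();
```

An index on the comments' postId field keeps the join from scanning the whole comments collection.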
I think you can embed comments carrying only the user ID inside the Post, then use a separate sub-document inside the post, keyed by user ID, to collect all the users who contributed to the post (comments, etc.). Since the user ID is the key, when you return the post you can fill that sub-document with each user's info. Then on the client side you can use it to recreate the user names for the post and its comments.
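A minimal client-side sketch of that idea, assuming the returned post has a userId for its owner, a comments array whose items carry a userId, and a users map keyed by user ID (all of these field names are made up for illustration):

```javascript
// Post as returned by the server: comments carry only user IDs,
// and `users` maps each contributing user ID to that user's info.
function attachUserNames(post) {
  const nameOf = (id) => (post.users[id] ? post.users[id].name : null);
  return {
    ...post,
    authorName: nameOf(post.userId),
    comments: post.comments.map((c) => ({ ...c, authorName: nameOf(c.userId) })),
  };
}
```

With this layout a name change only touches the user's own record, at the cost of filling the users map on every read.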
I am currently working on a project where we use microservice architecture. I am also somewhat new to this architecture and have had a few concerns. I understand the concept of microservices in general and also how we can have one database per service. This brings me to a point where I get confused on how to pull data from different databases for a particular user.
Scenario
Assuming I have a Users and a Posts service with their schema like this
User
const schema = {
  name: String,
  id: String,
  ...
}
Post
const schema = {
  text: String,
  user: Id // reference to the user who made this post
}
Now on the UI, I want to load a set of posts and the associated users who made them. How do I get a Post alongside the User who made it? I am using MongoDB; how do I populate data that is stored in other databases? I am also using Kafka to handle async operations; how do I leverage Kafka for this use case? Or is there a much better way of doing this? The final response for a Post could be something like this.
{
  text: 'Some random message',
  user: {
    name: 'John Doe',
    id: 1234
  }
}
Also, I know I could make a call to the User service to get the User, then make a call to the Post service to get the Post, and merge both objects together, but is there a better option than this? I am thinking of cases where I want to do multiple lookups for a user, e.g. to get a User and their associated Posts, Messages, etc. How can I handle scenarios like this? Are there any techniques I could leverage?
Thank you in advance!
I think your issue is that your service boundaries are too granular. I would recommend aligning your services to bounded contexts (https://martinfowler.com/bliki/BoundedContext.html). For example, if you have a "blog" service with posts and users, it's quite all right for the blog service to contain both a Mongo and a relational database for the different models.
Then you ask the service "give me posts for a user" and it is responsible for combining that data as part of its logic.
If you MUST keep them separate (which I would not recommend, for the exact problem you are having), then I would keep a lightweight cache of usernames inside the posts service.
Use that to populate the usernames into the post when you return one. You can either update the cache on a regular basis using events, polling, or batches, or just query the user service on a cache miss.
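A minimal sketch of such a cache, assuming user changes arrive as events shaped like { type, id, name } (for instance from a Kafka consumer). The class name, the event types, and the event shape are all hypothetical.

```javascript
// Lightweight username cache kept inside the Posts service (names are hypothetical).
class UserNameCache {
  constructor() {
    this.names = new Map();
  }

  // Call this from whatever delivers user events (e.g. a Kafka consumer).
  applyUserEvent(event) {
    if (event.type === "user.deleted") this.names.delete(event.id);
    else this.names.set(event.id, event.name);
  }

  // Fills in user names from the cache and reports the IDs that were not cached,
  // so the caller can query the user service for just those.
  enrich(posts) {
    const missing = new Set();
    const enriched = posts.map((post) => {
      const name = this.names.get(post.user);
      if (name === undefined) missing.add(post.user);
      return { text: post.text, user: { id: post.user, name: name ?? null } };
    });
    return { posts: enriched, missing: [...missing] };
  }
}
```

The missing list is where the cache-miss fallback would plug in: fetch those users from the user service, feed them back through applyUserEvent, and enrich again.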
When dealing with distributed systems you cannot rely on consistency and synchronous, stable communication like you can in a monolith.
I have a dilemma on how to solve possible redundant data querying.
I am using MongoDB with Apollo server and client. My MongoDB has several collections of data. The main collection consists of IDs pointing to supporting collections.
I am not sure how to map the IDs in my main collection to the documents in the supporting collections to retrieve the actual values. The thing is that I mostly already have the supporting collections' data cached in the Apollo client cache.
Do you think I should only query the IDs in my main collection and map IDs to values on the frontend using cached data? Or should I have a resolver that takes IDs in main collection, makes database queries to supporting collections to get value for each ID and then sends prepared data to frontend?
I appreciate any insight! Thank you.
As always, it depends. I assume that this is your setup, with a main collection.
type OtherDoc {
  id: String
  field: String
}
type MainDoc {
  id: String
  otherDocs(param: String): [OtherDoc]
}
type Query {
  mainDocs: [MainDoc]
}
In such a case, querying for mainDocs { id otherDocs(param: "...") { id field } } is definitely a natural way to get this data. It might be redundant in terms of fetching OtherDoc when different param values result in the same docs. If so, you may think about querying only their IDs and then querying for the separate docs if the client doesn't have them.
I'd say it's a valid solution, but definitely not something you should consider from the beginning. This optimization will reduce the bandwidth used but increase the number of requests. What is more, you don't know when to actually refetch OtherDoc. Well, maybe you do, but you have to think about it and build it, whereas without this optimization you get it out of the box.
A different, more cache-friendly approach may be to change the schema to limit such situations where your data overlaps. This is not always possible due to the business logic, but it is worth considering if it is.
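For the resolver option from the question, a resolver map for the schema above could look roughly like this. In-memory arrays stand in for the MongoDB collections, and both the otherDocIds field and the meaning of param (an exact match on field) are assumptions made for the sketch.

```javascript
// In-memory stand-ins for the two MongoDB collections (sample data is made up).
const otherDocs = [
  { id: "a", field: "x" },
  { id: "b", field: "y" },
];
const mainDocs = [{ id: "m1", otherDocIds: ["a", "b"] }];

const resolvers = {
  Query: {
    mainDocs: () => mainDocs,
  },
  MainDoc: {
    // Resolves the stored IDs to documents; `param` is assumed to filter on `field`.
    otherDocs: (parent, { param }) =>
      otherDocs.filter(
        (doc) =>
          parent.otherDocIds.includes(doc.id) &&
          (param == null || doc.field === param)
      ),
  },
};
```

Because every OtherDoc is returned with its id, a normalizing client cache can still deduplicate documents that arrive through different main docs.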
plnkr
I am trying to traverse through a collection, and update each document respectively.
My UserProfile collection consists of multiple JSON objects of userProfiles. As you can see, each profile has a lot of the same information. The only difference is the personal information. (This is just a test case of hard coded objects. The real data will be in an SQL DB managed by a sysadmin).
What I am trying to do is write a function (replaceTopics) that will take in an array of topics and replace each topic that matches in the collection. So if the system admin makes a change to a topic/s, he will send me the topic/s and I will be checking each document in my userProfile collection to see if that document has the matching topic (by matching topicIDs), if so, I need to replace that entire topic with the editedTopic.
I have tried this but with no luck. You can take a look at my function.
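One way the replaceTopics function described above might look, as a sketch over plain in-memory objects. It assumes each profile has a topics array and each topic a topicID property; the real document shape is not shown here, so treat those names as guesses.

```javascript
// Replaces every topic whose topicID matches one of the edited topics.
// Assumes each profile has a `topics` array and each topic a `topicID`.
function replaceTopics(userProfiles, editedTopics) {
  const editedById = new Map(editedTopics.map((t) => [t.topicID, t]));
  return userProfiles.map((profile) => ({
    ...profile,
    // Keep the original topic when no edited version exists for its ID.
    topics: profile.topics.map((topic) => editedById.get(topic.topicID) ?? topic),
  }));
}
```

The function returns new profile objects instead of mutating the input, which makes it easy to compare before and after.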
So I am working on an application and once a user connects (via soundcloud), the following object:
{userid: userid, username: username, genre: genre, followings: followings}
is pushed into an array:
var users = [];
What I want is that when a new user connects, this new user's profile object is stored in the users array, appended after the previously logged-in users. So basically, the more users that log in, the bigger the array gets, creating a database of users, if that makes sense?
Is there a way of doing this in JavaScript?
Thanks!
The best way to accomplish this is what you somewhat hinted at: a database. There are several databases that you could store locally or in the browser cache; jStorage is one I've heard good things about. A simple search for "javascript database" will probably give you many other good answers.
You could also create a remote DB using SQLite or any other database engine to create your container. Note that you would have to either work with an API or define some sort of content management system so you could perform the CRUD operations on the database.
The layout you provided ( {userid: userid, username: username, genre: genre, followings: followings} ) would work fairly well in a database table. You will have to define what your data types are for each of these fields, probably text or number for the user id, text for username, etc, so you can create the tables with the correct data type.
The followings field seems like it will have more than one entry, i.e. it will be a list or array, so you would probably want to create another table to house those entries and then use a primary key or some other identifier to link to it in your first table.
This question may be of some use to you: How to store a list in a column of a database table
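To illustrate the two-table layout, here is a small sketch that splits one profile object into a user row and one row per following. It assumes followings is an array of IDs; the followedId column name is made up.

```javascript
// Splits a profile into a row for a users table and rows for a followings table.
// Assumes `followings` is an array of IDs; `followedId` is a hypothetical column name.
function toRows(profile) {
  const { followings, ...user } = profile;
  return {
    user,
    // Each row links back to the user through the userid key.
    followings: followings.map((followedId) => ({ userid: profile.userid, followedId })),
  };
}
```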
You should create a database and store this information in it. I don't think there is a way to save users' info with JS alone.
Well, I am creating a network that allows users to create posts and like them.
Asking on Stack Overflow, I've understood how to structure my database:
A collection which includes a document for each post.
A collection which includes a document for each like; each of these documents contains a reference to the post it refers to.
When I want to get ALL likes for a post, I can query the like collection for the reference to that post.
And so far I am OK. But assuming I'll have millions of documents in the like collection, I wondered how I could query and search among them in a reasonable time.
And I was advised to use ensureIndex; in this case, I have to ensure an index on the field which contains the reference to a post.
But when do I have to create this index? Is it enough to create it once (for example, when I set up my database), after which MongoDB will maintain it by default, or do I have to do it during the application's lifetime? Thank you.
But assuming I'll have millions of documents in the like collection, I wondered how I could query and search among them in a reasonable time.
I assume you would most likely want to do a count on the likes as an example?
You can't; instead, you use optimizations to combat this. A count on millions of rows might get a bit slow.
A typical approach in SQL technologies is a counter: you amend the parent row with a sum figure of its children.
The same applies to MongoDB.
You would aggregate important data up to the parent document.
If you actually need to query the likes to show some of the users who liked a post, then you limit those likes. Google+ and other networks tend to limit the number of likes they show to about 1,000.
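A sketch of the counter idea: alongside each like document, the post gets a likeCount field bumped with $inc, so showing the count never needs to scan the like collection. The field names likeCount, postId, and userId and the collection names are assumptions.

```javascript
// Operations to record a like and keep a running counter on the post.
// Field and collection names are hypothetical.
function likeOperations(postId, userId) {
  return {
    insertLike: { postId, userId, createdAt: new Date() },
    bumpCounter: {
      filter: { _id: postId },
      update: { $inc: { likeCount: 1 } },
    },
  };
}

// Usage with the Node.js driver (assumes a connected `db`):
// const ops = likeOperations(postId, userId);
// await db.collection("likes").insertOne(ops.insertLike);
// await db.collection("posts").updateOne(ops.bumpCounter.filter, ops.bumpCounter.update);
//
// Showing a capped list of likers instead of all of them:
// await db.collection("likes").find({ postId }).limit(1000).toArray();
```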
And I was advised to use ensureIndex,
Adding indexes to a database does help with actually searching for documents.
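But when do I have to create this index? Is it enough to create it once
Yes, MongoDB will maintain the index itself. You only need to create it once. (In current MongoDB versions, createIndex replaces ensureIndex, which has been deprecated since 3.0.)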
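For example, a one-time setup step with the Node.js driver might look like the sketch below. The likes collection and the postId field are assumed names; calling createIndex again for an index that already exists with the same definition does nothing, so it is safe to run at every application start.

```javascript
// Creates the index on the like collection's post reference.
// `likes` and `postId` are assumed names for this sketch.
async function setUpLikeIndexes(db) {
  return db.collection("likes").createIndex({ postId: 1 });
}

// Usage (assumes `db` is a connected Db instance from the mongodb driver):
// await setUpLikeIndexes(db);
```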