I'm working on a simple JavaScript Twitter clone that uses Firebase as the backend storage mechanism (JSON). I am familiar with relational databases (SQL) but not with non-relational databases. I am currently trying to work out how to design the structure of the dataset in Firebase, as there are no foreign key relationships or table joins.
The app has three tables: users, tweets, and followers. Users can post tweets, as well as follow other users and see a feed of their tweets. The problem comes when attempting to create the data structure, as I have no idea how I will join the necessary tables. For instance, how will I be able to implement the user-follower functionality?
Here is the ERD that I am using to give myself a starting point:
As I've been trying to wrap my head around this whole JSON thing, this is the closest that I could relate it to a relational database, while still using the Firebase .push() functions to add to the lists in the database (as seen from the Firebase dashboard):
I've seen some people attempting to solve this by "de-normalizing" that data, citing that Firebase doesn't have any query mechanisms. However, these articles are all primarily before 2014, which is when Firebase did add queries. That being said, I don't understand how using the queries could help me, and I think I'm getting stuck with the generated keys.
How should I best structure my data to work with the Firebase JSON lists? I've read some of their documentation but haven't been able to locate anything that uses what I'm looking for and the generated keys.
Should I be attempting to use the .set() method somehow, and using the email addresses as the "primary keys" for the users instead of the generated key? [As mentioned below, this is something I do plan to avoid. Thanks @metame]
Update
Is this more what I should be looking at for the data structure? And then querying by the keys?
users: {
    Hj83Kd83: {
        username: "test",
        password: "2K44Djl3",
        email: "a@b.c"
    },
    J3kk0dK4: {
        username: "another",
        password: "33kj4l5K",
        email: "d@e.f"
    }
}
tweets: {
    Jkk3ld92: {
        userkey: "Hj83Kd83",
        message: "Test message here!"
    },
    K3Ljdi94: {
        userkey: "J3kk0dK4",
        message: "Another message here!"
    }
}
followers: {
    Lk3jdke9: {
        userkey: "Hj83Kd83",
        followerkey: "J3kk0dK4"
    }
}
Let me know if I should include anything else!
Representing relationships in non-relational (NoSQL) databases is generally solved either by embedding documents (NoSQL verbiage for rows) or by document references, as you have done in your example solution.
The MongoDB site has some decent articles that are mostly applicable to all non-relational databases, including "Model One-to-Many Relationships with Document References", which I think is most relevant to your issue.
As for the reference key, it is typically best practice to use the generated IDs, as you have assurance that they are unique.
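To make the document-reference approach concrete, here is a minimal sketch in plain JavaScript, with no Firebase calls, of how the references in the question's proposed structure could be followed to build a feed. The helper name feedFor is hypothetical, and the sketch assumes that a followers entry means "followerkey follows userkey".

```javascript
// Sample data in the shape proposed in the question's update.
const db = {
  users: {
    Hj83Kd83: { username: "test" },
    J3kk0dK4: { username: "another" },
  },
  tweets: {
    Jkk3ld92: { userkey: "Hj83Kd83", message: "Test message here!" },
    K3Ljdi94: { userkey: "J3kk0dK4", message: "Another message here!" },
  },
  followers: {
    // Assumption: J3kk0dK4 follows Hj83Kd83.
    Lk3jdke9: { userkey: "Hj83Kd83", followerkey: "J3kk0dK4" },
  },
};

// Hypothetical helper: build the feed for one user by following references.
function feedFor(db, followerKey) {
  // Step 1: collect the keys of the users that followerKey follows.
  const followed = new Set(
    Object.values(db.followers)
      .filter((f) => f.followerkey === followerKey)
      .map((f) => f.userkey)
  );
  // Step 2: keep tweets written by those users and attach the author's name.
  return Object.entries(db.tweets)
    .filter(([, tweet]) => followed.has(tweet.userkey))
    .map(([key, tweet]) => ({
      key,
      username: db.users[tweet.userkey].username,
      message: tweet.message,
    }));
}
```

Calling feedFor(db, "J3kk0dK4") walks followers, then tweets, then users, which is the "join" done by hand. Against a real database each step would be a query by key rather than a scan of an in-memory object.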
I am currently working on a project where we use microservice architecture. I am also somewhat new to this architecture and have had a few concerns. I understand the concept of microservices in general and also how we can have one database per service. This brings me to a point where I get confused on how to pull data from different databases for a particular user.
Scenario
Assuming I have a Users and a Posts service with their schema like this
User
const schema = {
    name: String,
    id: String,
    ...
}
Post
const schema = {
    text: String,
    user: Id // reference to the user who made this post
}
Now on the UI, I want to load a set of posts along with the users who made them. How do I get a Post alongside the User who made it? I am using MongoDB; how do I populate data that is stored in other databases? I am also using Kafka to handle async operations; how do I leverage Kafka for this use case? Or is there a much better way of doing this? The final response for a Post could be something like this:
{
    text: 'Some random message',
    user: {
        name: 'John Doe',
        id: 1234
    }
}
Also, I know I could call the User service to get the User, then call the Post service to get the Post, and merge both objects together, but is there a better option than this? I am thinking of cases where I want to do multiple lookups for a user, e.g. to get a User and their associated Posts, Messages, etc. How can I handle scenarios like this? Are there any techniques I could leverage?
Thank you in advance!
I think your issue is that your service boundaries are too granular. I would recommend aligning your services to bounded contexts (https://martinfowler.com/bliki/BoundedContext.html). For example, if you have a "blog" service with posts and users, it's quite all right for the blog service to contain both a Mongo and a relational database for the different models.
Then you ask the service "give me posts for a user" and it is responsible for combining that data as part of its logic.
If you MUST keep them separate (which I would not recommend, because of the exact problem you are having), then I would keep a lightweight cache of usernames inside the posts service.
Use that to populate the usernames into a post when you return one. You can either update the cache on a regular basis using events, polling, or batches, or just query the user service on a cache miss.
When dealing with distributed systems you cannot rely on consistency and synchronous, stable communication like you can in a monolith.
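As a rough illustration of the cache idea, the sketch below keeps user names in a Map inside the posts service. All names here (createUserNameCache, withAuthors, the event shape, and the fetchUserName callback that stands in for a call to the user service) are hypothetical; a real service would also need expiry and error handling.

```javascript
// Hypothetical cache kept inside the posts service.
// fetchUserName stands in for a request to the user service.
function createUserNameCache(fetchUserName) {
  const names = new Map();
  return {
    // Apply a "user created/renamed" event, e.g. one consumed from Kafka.
    applyEvent(event) {
      names.set(event.userId, event.name);
    },
    // Return the cached name, asking the user service only on a cache miss.
    async get(userId) {
      if (!names.has(userId)) {
        names.set(userId, await fetchUserName(userId));
      }
      return names.get(userId);
    },
  };
}

// Attach the author's name to each post before returning it to the UI.
async function withAuthors(posts, cache) {
  const result = [];
  for (const post of posts) {
    result.push({
      text: post.text,
      user: { id: post.user, name: await cache.get(post.user) },
    });
  }
  return result;
}
```

The cached names can be stale between events, which is the trade-off the last sentence of the answer points at: the posts service accepts eventual consistency in exchange for not calling the user service on every read.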
I am using Azure Functions and Cosmos DB SQL to create a serverless application with javascript.
I have the following database schema of a user item:
{
    "id": "user_id_2",
    "username": "username_2",
    "pass": "pass_1",
    "feed": [],
    "followed": [
        "username_1",
        "username_3",
        "username_4"
    ],
    "followers": [
        "username_3",
        "username_4"
    ]
}
Currently, when user_1 follows another user, user_2, I update the database document for user_1 with no problem. But now I also need to update the document for user_2, particularly its followers array of strings. How can I do that through an Azure Function with bindings? The only way I came up with is to query the database for the whole document, update it on the client side, and then PUT it back in the database, overwriting the previous document. However, this seems ridiculous...
Cosmos DB does not support partial updates at this moment, so pulling the document, adding the item to the array and then doing the PUT is the only option.
Having said that, the problem with your data design is that followed and followers are unbounded arrays, so the size of a user document can grow unchecked. The bigger the document, the more RUs operations will take.
Please see When not to embed here https://learn.microsoft.com/azure/cosmos-db/modeling-data#when-not-to-embed
I wrote a design doc for social apps back when I worked in creating a social application: https://learn.microsoft.com/azure/cosmos-db/social-media-apps
Ideally, the relationship of A follows B would be a document of its own. You could store them on a Graph account for optimal performance, since what you are really building here is a graph, and the queries you will be running are "Who follows B?" or "Who follows my followers?".
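A minimal in-memory sketch of the "one document per relationship" idea follows. It is plain JavaScript, not Cosmos DB or Gremlin code, and the field names (id, follower, followed) and helper names are hypothetical. The sample edges mirror the followed and followers arrays of the user document in the question.

```javascript
// Each "A follows B" relationship is its own small document.
// Adding a follow creates one document and never rewrites a user document.
function follow(edges, follower, followed) {
  const id = `${follower}->${followed}`;
  if (!edges.some((e) => e.id === id)) {
    edges.push({ id, follower, followed });
  }
  return edges;
}

// "Who follows B?"
function followersOf(edges, user) {
  return edges.filter((e) => e.followed === user).map((e) => e.follower);
}

// "Who follows my followers?"
function followersOfFollowers(edges, user) {
  const result = new Set();
  for (const f of followersOf(edges, user)) {
    for (const ff of followersOf(edges, f)) {
      result.add(ff);
    }
  }
  return [...result];
}

// Sample data taken from the user document in the question.
const edges = [];
follow(edges, "username_2", "username_1");
follow(edges, "username_2", "username_3");
follow(edges, "username_2", "username_4");
follow(edges, "username_3", "username_2");
follow(edges, "username_4", "username_2");
```

With this shape, user_1 following user_2 is a single insert, and the user documents themselves stay at a bounded size. The two query helpers are the kind of traversal a graph store is designed to run efficiently.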
I have extensive experience with MySQL and Oracle as relational DBs, but I'm very confused about how to properly create collections in MongoDB.
I have read many articles and watched YouTube videos, but I haven't found a real example of how to properly create the structure and relationships between two or three collections (properly, best practice, or whatever...).
For example... Assume we have three collections: Users, Comments, and Posts.
What would be the right design to use? If Comments are embedded inside Posts, then what do I have to do when a User changes his name? Should I run through all comments related to a post in order to update his name in the Comments?
If they are referenced, then how do I fetch data from all three collections (Post->Comment->User)... Aggregation? If so, then how will MongoDB behave if the collection grows and reaches, let's say, 100,000 documents...
Well... I hope you've got my point.
I'll be glad if you guys will post your comments and thoughts about all this stuff and "clarify" all this.
Tnx.
You are still thinking in SQL terms. There are not three collections, just one, or two if the number of posts is huge and so is the number of comments.
Let's look at the Posts as a collection.
A post is created by a user, and so are the comments (by different users).
A user will not change his name that often, if at all, and when one does, you just run an update.
{
    _id : PostID,
    title: string,
    body: string,
    meta: {...},
    user: {
        id : UserId,
        name: string
    },
    comments: [
        {
            id: CommentId,
            by: {
                id : UserId,
                name: string
            }
        },
        ...
    ]
}
If a user changes his name, then you run two updates: one for the owner and one for the comments, using the positional operator. Personally, I don't think this will happen often.
If the number of posts is huge and they are active with comments and such, then one can think about 2 collections one for posts and one for comments or consider sharding but not as a first option. 100K is not a very big number at all.
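The two updates described above can be sketched against an in-memory array of posts in the embedded shape shown earlier. This is plain JavaScript meant only to show which fields each update touches; renameUser is a hypothetical helper, not a MongoDB call.

```javascript
// Hypothetical in-memory stand-in for the posts collection.
function renameUser(posts, userId, newName) {
  for (const post of posts) {
    // Update 1: the embedded copy of the name on posts this user owns.
    if (post.user.id === userId) {
      post.user.name = newName;
    }
    // Update 2: the embedded copy on every comment this user wrote.
    for (const comment of post.comments) {
      if (comment.by.id === userId) {
        comment.by.name = newName;
      }
    }
  }
  return posts;
}
```

In MongoDB itself these would be two update commands over the posts collection. One detail worth checking when writing them: the plain positional operator `$` changes only the first matching array element, so a post with several comments by the same user would need the filtered positional operator `$[<identifier>]` together with arrayFilters.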
I think you can embed comments with only the user ID inside the Post, then keep a separate sub-document inside the post, keyed by user ID, that collects all users who contributed to the post (comments etc.). Since the user ID is the key, when you return the post you can populate that sub-document with each user's info. Then on the client side you can use it to attach the user names etc. to the post and its comments.
I am new to jQuery and JSON. I wanted to know whether it is possible to use JSON as a database to store data and retrieve it whenever it is needed. That is, instead of using MySQL, MS SQL, or anything else, is there any way to use only JSON (i.e. a .json file)?
If yes, can you please guide me? If no, can anybody suggest a better way to store data?
And I need to know this for ASP.NET Web Forms.
You could use MongoDB instead of MySQL. MongoDB is based on "JSON" document storage. Example of CRUD with Mongo: insert and find data.
db.inventory.insert({
    item: "ABC1",
    details: {
        model: "14Q3",
        manufacturer: "XYZ Company"
    }
})
db.inventory.find( { item: "ABC1" } )
From mongodb.com
The MongoDB BSON implementation is lightweight, fast and highly
traversable. Like JSON, MongoDB's BSON implementation supports
embedding objects and arrays within other objects and arrays – MongoDB
can even "reach inside" BSON objects to build indexes and match
objects against query expressions on both top-level and nested BSON
keys. This means that MongoDB gives users the ease of use and
flexibility of JSON documents together with the speed and richness of
a lightweight binary format. Read our MongoDB overview to learn more
about these features.
There is also a similar question on SO here
It is definitely possible, and it depends: if it suits your requirements, then go ahead and use MongoDB. To have JSON as your datastore, I would recommend MongoDB, one of the popular NoSQL (Not only SQL) databases.
But if your data has lots of relations between different entities, then MongoDB is not recommended.
Go through this article; it gives a detailed explanation of when to use and when not to use a NoSQL DB:
http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
I am creating a mock app with user creation/auth/friend in a node js learning exercise. Having spent my time mostly at the front end of things, I am a n00b as far as DBs are concerned. I want to create a user database where I want to keep track of user profiles and their connections/friends.
Primary objective is to load/store users connections in the database.
Fetch this information and give it to the user most efficiently in least number of queries.
I'd really appreciate some help with a DB structure I should be using that can accomplish this. I am using mongodb and node.
Off the top of my head: I can store the user's connections in an object in the "connections" field. But this will involve making a lot of queries to fetch connections' details like their "about me" information - which I can also store in the same object as well.
Confused. Would really appreciate some pointers.
Take a look at Mongoose, an object modeling library (ODM) for MongoDB. It has a populate method that grabs referenced documents. Lots of other great stuff too.
You could say
Users.find({}).populate('connections').exec(function(err,users) { ... });
Before populate, each user's array of connections is an array of IDs; after, it's an array of user documents.
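To show what that before/after difference looks like, here is a small in-memory sketch in plain JavaScript. It is not Mongoose itself: populateConnections, the usersById map, and the sample users are hypothetical and only imitate the effect of swapping IDs for documents.

```javascript
// Hypothetical sample data: each user stores only the IDs of its connections.
const usersById = new Map([
  ["u1", { _id: "u1", name: "Ann", about: "Likes hiking", connections: ["u2", "u3"] }],
  ["u2", { _id: "u2", name: "Bob", about: "Plays chess", connections: ["u1"] }],
  ["u3", { _id: "u3", name: "Cy", about: "Bakes bread", connections: [] }],
]);

// Return a copy of the user whose connections are full user documents
// instead of IDs. IDs with no matching document are skipped in this sketch.
function populateConnections(user, usersById) {
  return {
    ...user,
    connections: user.connections
      .map((id) => usersById.get(id))
      .filter((doc) => doc !== undefined),
  };
}

const populated = populateConnections(usersById.get("u1"), usersById);
```

After this, populated.connections holds the connected users' documents, including their "about me" text, while the stored user still holds only IDs. That keeps each profile in one place instead of copying it into every friend's document.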