Context
Hi! I made something like GraphQL, but with just Sequelize. Since Sequelize query options are plain JSON objects, the client could send those options directly (with proper sanitization).
What I have done
Out of curiosity I built it, and it works just fine. Now my doubt is: how bad is this?
Here is an example of the client using this API:
const res = await http.post(APIs.FINDER, {
  model: 'User',
  options: {
    where: {
      id: someId
    },
    attributes: ['name', 'active'],
    include: [
      {
        as: 'zone',
        attributes: ['name']
      }
    ],
    order: [['createdAt', 'DESC']]
  }
});
Nice, right?
Sanitization/Constraints
For sanitization, I would have to (a sketch follows this list):
check that includes have a known depth limit, e.g. no more than 10 nested includes
check that the params are not SQL strings or other injection attempts (Sequelize takes care of that)
disallow Sequelize functions and allow only plain queries
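For illustration, a minimal sketch of such a validator; the helper name, the allowlist, and the depth limit are all hypothetical, not Sequelize APIs:
// Hypothetical allowlist validator for client-supplied query options.
const ALLOWED_KEYS = new Set(['where', 'attributes', 'include', 'order', 'limit', 'as']);
const MAX_INCLUDE_DEPTH = 10; // the known limit mentioned above

function validateOptions(options, depth = 0) {
  if (depth > MAX_INCLUDE_DEPTH) {
    throw new Error('Too many nested includes');
  }
  for (const key of Object.keys(options)) {
    if (!ALLOWED_KEYS.has(key)) {
      // Rejects Sequelize functions, raw fragments and unknown operators.
      throw new Error('Key not allowed: ' + key);
    }
  }
  for (const nested of options.include || []) {
    validateOptions(nested, depth + 1);
  }
  return options;
}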
Questions
With that in mind, I think this could be used in production.
Have I missed something that would rule this idea out for production use? (security/usage/etc.)
Does GraphQL have some specific feature that would make me prefer it over this custom solution?
Would you use it in a production environment? I can't imagine why not.
My thoughts on the questions:
I don't recommend this style of API. It exposes your backend implementation to the public, which makes it difficult to handle every security condition, not to mention business logic and authorization. It would also be hard to improve your performance, because the behavior is tightly coupled to the sequelize package.
You can consider this post: GraphQL Mutation Design: Anemic Mutations. A good GraphQL API should be driven by domain and requirements rather than by data.
NO! I've had a hard time dealing with this API style.
Actually, this is not a good idea. If you are running a one-man full-stack project, it may seem fast at first, but the cost of development will skyrocket until you cannot move on. If you are working as a group, you will notice that the client side is tightly coupled with the server side, which is very bad for development.
On the client side, you only need a finite set of APIs, not an API with infinite possibilities.
On the server side, you can do nothing but hand the data over to Sequelize, which makes it hard to improve performance, to add a logic layer, or to introduce another database system such as Elasticsearch into your codebase.
When it comes to designing an API, you can consider Domain-Driven Design, also known as DDD. It is preferable to expose GET Friends?limit=10 rather than GET { type: 'User', where: ..., include: ..., order: ..., limit: 10 }.
By the way, GraphQL is not just a query language; in essence it is a thin API layer (ref). So don't use it as a JSON database; treat it as an API that focuses on the business need.
For example, here is a User model:
{
  "User": {
    "id": 1,
    "name": "Mary",
    "age": 20,
    "friendIds": [2, 3, 4]
  }
}
But in GraphQL, based on what you need, it may become:
type User {
  id: ID
  name: String
  friends: [User]
  friendCount: Int
  friendsOverAge18: [User]
}
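For example, hypothetical resolvers behind such a requirement-driven type might look like this (the db helper names are made up for illustration):
// Sketch of resolvers for the requirement-driven User type above.
const resolvers = {
  User: {
    friends: (user, args, { db }) => db.getUsersByIds(user.friendIds),
    friendCount: (user) => user.friendIds.length,
    friendsOverAge18: async (user, args, { db }) => {
      const friends = await db.getUsersByIds(user.friendIds);
      return friends.filter((friend) => friend.age > 18);
    }
  }
};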
You can read this great article about how to design a GraphQL API: Shopify/graphql-design-tutorial
Related
I am currently working on a project where we use a microservice architecture. I am somewhat new to this architecture and have a few concerns. I understand the concept of microservices in general, and also how we can have one database per service. This brings me to a point where I get confused about how to pull data from different databases for a particular user.
Scenario
Assuming I have a Users and a Posts service, with schemas like this:
User
const schema = {
  name: String,
  id: String,
  ...
}
Post
const schema = {
  text: String,
  user: Id // reference to the user who made this post
}
Now on the UI, I want to load a set of posts along with the users who made them. How do I get a Post alongside the User who made it? I am using MongoDB; how do I populate data that is stored in other databases? I am also using Kafka to handle async operations; how can I leverage Kafka for this use case? Or is there a much better way of doing this? The final response for a Post could look like this:
{
  text: 'Some random message',
  user: {
    name: 'John Doe',
    id: 1234
  }
}
Also, I know I could call the User service to get the User, then call the Post service to get the Post, and merge the two objects, but is there a better option than that? I am thinking of cases where I want to do multiple lookups for a user, e.g. to get a User and their associated Posts, Messages, etc. How can I handle scenarios like that? Are there any techniques I could leverage for such situations?
Thank you in advance!
I think your issue is that your service boundaries are too granular. I would recommend aligning your services with bounded contexts (https://martinfowler.com/bliki/BoundedContext.html). For example, if you have a "blog" service with posts and users, it's quite all right for the blog service to contain both a Mongo and a relational database for the different models.
Then you ask the service "give me posts for a user" and it is responsible for combining that data as part of its logic.
If you MUST keep them separate (which I would not recommend, for the exact problem you are having), then I would keep a lightweight cache of usernames inside the posts service.
Use that to populate the usernames into a post when you return one. You can either update the cache on a regular basis using events, polling, or batches, or just query the user service on a cache miss.
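A rough sketch of that cache, assuming a hypothetical userServiceClient and an event feed (e.g. from Kafka):
// Lightweight username cache inside the posts service.
const usernameCache = new Map();

// Keep the cache fresh by consuming user events (Kafka, polling, batches...).
function onUserUpdated(event) {
  usernameCache.set(event.userId, event.username);
}

async function getUsername(userId) {
  if (!usernameCache.has(userId)) {
    // Fall back to the user service on a cache miss.
    const user = await userServiceClient.getUser(userId); // hypothetical client
    usernameCache.set(userId, user.name);
  }
  return usernameCache.get(userId);
}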
When dealing with distributed systems you cannot rely on consistency and synchronous, stable communication like you can in a monolith.
I'm new to ArangoDB and javascript (and pretty much everything else too). I'm watching a tutorial that uses Vue, GraphQL, Mongoose and MongoDB. I would like to use a very similar technology stack, except with ArangoDB.
However, when using Arangojs (the javascript driver for ArangoDB), I find myself having to work a lot harder and write a lot more code to accomplish the same thing compared to the tutorial, and this difference comes down to Mongoose. I'm looking for a solution or development approach that will allow easy evolution and maintenance of the code base, above all else.
In particular, when defining a Mongoose schema, we can do things like this:
const UserSchema = new mongoose.Schema({
  username: {
    type: String,
    required: true,
    unique: true,
    trim: true
  },
  //...
  favorites: {
    type: [mongoose.Schema.Types.ObjectId],
    required: true,
    ref: "Post"
  }
});
The ref type provides a lot of convenience, for example. I guess I'm looking for something like Mongoose for ArangoDB and JavaScript, but I don't think such a thing exists. So I'm asking how, at a high level, I can approach my development to achieve a similar level of coding convenience.
For example, can GraphQL's SDL schemas achieve the same thing as a Mongoose Schema, and nearly as easily?
Is there a way to use a GraphQL schema across both the frontend and backend of a javascript application, eliminating javascript typeDefs and models in the process? The goal here, as suggested above, is to prioritize ease of working with the code as features are added and changes made to the project requirements.
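To make the comparison concrete, here is a rough sketch (not a drop-in answer) of how an SDL type plus a resolver could stand in for Mongoose's ref, using arangojs; note that SDL only describes shape and nullability, so rules like unique or trim still have to live elsewhere:
// Sketch: an SDL type playing the role of Mongoose's ref, resolved via AQL.
const typeDefs = `
  type Post {
    _key: ID!
    title: String
  }
  type User {
    _key: ID!
    username: String!
    favorites: [Post]   # analogous to Mongoose's ref: "Post"
  }
`;

const resolvers = {
  User: {
    // favoriteKeys is a hypothetical array of Post keys on the user document.
    favorites: async (user, args, { db }) => {
      const cursor = await db.query(
        'FOR p IN Post FILTER p._key IN @keys RETURN p',
        { keys: user.favoriteKeys }
      );
      return cursor.all();
    }
  }
};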
Update: this looks like a similar question (without an answer):
Are there ways to populate a schema in Joi from another schema (like Mongoose does)? - Stack Overflow
I have the following react-apollo-wrapped GraphQL query:
user(id: 1) {
  name
  friends {
    id
    name
  }
}
As semantically represented, it fetches the user with ID 1, returning their name along with the id and name of each of their friends.
I then render this in a component structure like the following:
graphql(ParentComponent)
-> UserInfo
-> ListOfFriends (with the list of friends passed in)
This is all working for me. However, I wish to be able to refetch only the list of friends for the current user.
I can call this.props.data.refetch() on the parent component and updates will be propagated; however, I'm not sure this is best practice, given that my GraphQL query actually looks more like this,
user(id: 1) {
  name
  foo1
  foo2
  foo3
  foo4
  foo5
  ...
  friends {
    id
    name
  }
}
whilst the only thing I wish to refetch is the list of friends.
What is the best way to architect this cleanly? I'm thinking along the lines of binding an initially skipped GraphQL fetcher to the ListOfFriends component, to be triggered as necessary, but I would like some guidance on how this is best done.
Thanks in advance.
I don't know why your question is downvoted, because I think it is a very valid one. One of GraphQL's selling points is "fetch less and more at once". A client can decide very granularly what it needs from the backend. Deeply nested, graph-like queries that previously required multiple endpoints can now be expressed in a single query, and over-fetching can be avoided at the same time. Now you find yourself with a big query: everything loads at once and there are no n+1 query waterfalls. But you also know that a few fields in that big query change from time to time, and you want to actively update the cache with new data from the server. Apollo offers refetch, but it reloads the whole query, which is exactly the over-fetching that GraphQL was sold as eliminating. Let me offer some solutions:
Premature Optimisation?
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming. - Donald Knuth
Sometimes we try to optimise too much without measuring first. Write it the easy way first and then see if it is really an issue. What exactly is slow? The network? A particular field in the query? The sheer size of the query?
Once you have analysed what exactly is slow, you can start looking into improvements:
Refetch and include/skip directives
Using directives, you can exclude fields from a query depending on variables. The refetch function can specify different variables than the initial query, so you can exclude fields when you refetch:
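A minimal sketch of that idea; the field names come from the question, while the $withExtras flag is hypothetical:
import gql from 'graphql-tag';

const USER_QUERY = gql`
  query User($id: ID!, $withExtras: Boolean!) {
    user(id: $id) {
      name
      foo1 @include(if: $withExtras)
      foo2 @include(if: $withExtras)
      friends {
        id
        name
      }
    }
  }
`;

// Initial render fetches everything: variables={{ id: 1, withExtras: true }}.
// Inside the wrapped component, refetch with the flag flipped to skip
// the expensive fields:
//   this.props.data.refetch({ id: 1, withExtras: false });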
Splitting up Queries
Single-page apps are a great idea. The HTML is generated client side, and the page does not have to make expensive trips to the server to render a new page. But soon SPAs got too big, and code splitting became an issue; now we are basically back to server-side rendering and splitting the app into pages. The same may apply to GraphQL: sometimes queries are too big and should be split. You could split the queries for UserInfo and ListOfFriends; inside the cache the fields will be merged. With query batching, both queries are sent in the same request, and a GraphQL server that implements per-request resource caching correctly (e.g. with DataLoader) will barely notice a difference.
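Sketched with react-apollo, splitting the big query in two (component names assumed from the question):
import { graphql } from 'react-apollo';
import gql from 'graphql-tag';

const USER_INFO_QUERY = gql`
  query UserInfo($id: ID!) {
    user(id: $id) { id name foo1 foo2 }
  }
`;

const FRIENDS_QUERY = gql`
  query Friends($id: ID!) {
    user(id: $id) { id friends { id name } }
  }
`;

// Each component owns its query; ListOfFriends can now refetch just the
// friends without touching UserInfo's fields.
const UserInfoWithData = graphql(USER_INFO_QUERY)(UserInfo);
const ListOfFriendsWithData = graphql(FRIENDS_QUERY)(ListOfFriends);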
Subscriptions
Maybe you are ready to use subscriptions already. Subscriptions push updates from the server for fields that have changed. This way you could subscribe to a user's friends and get updates in real time. The good news is that Apollo Client, Relay, and many server implementations already support subscriptions. The bad news is that they need WebSockets, which usually put different requirements on your technology stack than plain HTTP.
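A sketch of what such a subscription could look like; the friendsChanged field is hypothetical and depends entirely on your server's schema:
const FRIENDS_SUBSCRIPTION = gql`
  subscription OnFriendsChanged($userId: ID!) {
    friendsChanged(userId: $userId) {
      id
      name
    }
  }
`;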
withApollo() -> this.props.client.query
This should only be your last resort! Using react-apollo's withApollo higher-order component, you can directly inject the ApolloClient instance and execute queries with this.props.client.query(). A query such as { user(id: 1) { friends { ... } } } can be used to fetch just the friend list and update the cache, which in turn updates your component. This might look like exactly what you want, but it can haunt you in later stages of the app.
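Roughly, that escape hatch might look like this (component and query names assumed from the question):
import React from 'react';
import { withApollo } from 'react-apollo';
import gql from 'graphql-tag';

const FRIENDS_ONLY = gql`
  query FriendsOnly($id: ID!) {
    user(id: $id) {
      id
      friends { id name }
    }
  }
`;

class ListOfFriends extends React.Component {
  refreshFriends = () => {
    // 'network-only' skips the cache read but still writes the result back,
    // so components watching the big query re-render with fresh friends.
    this.props.client.query({
      query: FRIENDS_ONLY,
      variables: { id: 1 },
      fetchPolicy: 'network-only'
    });
  };

  render() {
    return null; // rendering omitted in this sketch
  }
}

export default withApollo(ListOfFriends);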
I'm working on a simple JavaScript Twitter clone that uses Firebase as the backend storage mechanism (JSON). I am familiar with relational databases (SQL) but not with non-relational databases. I am currently trying to design the structure of the dataset in Firebase, as there are no foreign-key relationships or table joins.
The app has three tables: users, tweets, and followers. Users can post tweets, as well as follow other users and see a feed of their tweets. The problem comes when designing the data structure, as I have no idea how I will join the necessary tables. For instance, how will I implement the user-follower functionality?
I drew an ERD to give myself a starting point.
As I've been trying to wrap my head around this whole JSON thing, the closest I could come to relating it to a relational database, while still using Firebase's .push() function to add to the lists in the database, is a set of lists keyed by generated IDs (as seen in the Firebase dashboard).
I've seen some people attempt to solve this by "de-normalizing" the data, citing that Firebase doesn't have any query mechanisms. However, those articles are mostly from before 2014, when Firebase did add queries. That said, I don't understand how the queries could help me here, and I think I'm getting stuck on the generated keys.
How should I best structure my data to work with the Firebase JSON lists? I've read some of their documentation but haven't been able to locate anything that uses what I'm looking for and the generated keys.
Should I be attempting to use the .set() method somehow, using email addresses as the "primary keys" for the users instead of the generated keys? [As mentioned below, this is something I plan to avoid. Thanks @metame]
Update
Is this more like what I should be looking at for the data structure, querying by the keys?
users: {
  Hj83Kd83: {
    username: "test",
    password: "2K44Djl3",
    email: "a@b.c"
  },
  J3kk0dK4: {
    username: "another",
    password: "33kj4l5K",
    email: "d@e.f"
  }
},
tweets: {
  Jkk3ld92: {
    userkey: "Hj83Kd83",
    message: "Test message here!"
  },
  K3Ljdi94: {
    userkey: "J3kk0dK4",
    message: "Another message here!"
  }
},
followers: {
  Lk3jdke9: {
    userkey: "Hj83Kd83",
    followerkey: "J3kk0dK4"
  }
}
Let me know if I should include anything else!
Representing relationships in non-relational (noSQL) databases is generally solved either by embedding documents (the noSQL equivalent of rows) or by document references, as you have done in your example solution.
The MongoDB site has some decent articles that are mostly applicable to all non-relational databases including Model One-to-Many Relationships with Document References, which I think is most relevant to your issue.
As far as the reference key goes, it is typically best practice to use the generated IDs, as you then have assurance that they are unique.
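For what it's worth, here is a sketch of how Firebase queries can resolve such reference keys against the structure above (classic Realtime Database API; the keys come from the question):
// Fetch all tweets whose userkey references a given user.
firebase.database()
  .ref('tweets')
  .orderByChild('userkey') // add an ".indexOn": "userkey" rule for efficiency
  .equalTo('Hj83Kd83')
  .once('value')
  .then((snapshot) => {
    snapshot.forEach((child) => {
      console.log(child.key, child.val().message);
    });
  });

// The follower relationship resolves the same way:
//   ref('followers').orderByChild('userkey').equalTo('Hj83Kd83')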
I am building a (RESTful) API over HTTP that I want to use from JavaScript.
I find myself writing JavaScript like this:
function getPost(id) {
  $.getJSON("/api/post/" + id, function (data) {
    // Success!
  });
}
There must be a smarter way than hardcoding the API URLs in JavaScript. Maybe something like querying the API itself for what the getPost URL should look like?
function getApiPostUrl() {
  $.getJSON("/api/getpost/");
}
returning something like
url: "/api/post/:id"
which can then be parsed by JavaScript to obtain the URL for actually getting the post with a given id. I like this approach.
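For illustration, the parsing the question describes could be a small helper like this (sketch only):
// Expand a ":param"-style URL template returned by the API.
function expand(template, params) {
  return template.replace(/:([A-Za-z_]+)/g, function (match, name) {
    return encodeURIComponent(params[name]);
  });
}

expand('/api/post/:id', { id: 42 }); // "/api/post/42"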
Is there a standard way of doing this? I am looking for a good approach so I don't have to invent it all myself, if a good solution already exists.
Well, by definition, a RESTful API should contain full URIs (resource identifiers), not just the resources' paths. So your question is really about how you design your whole API.
So, for example, your API could expose http://fqdn/api/posts, which contains a list of all the posts on your site, e.g.:
[ "http://fqdn/api/posts/1",
"http://fqdn/api/posts/2",
"http://fqdn/api/posts/3" ]
and then your JavaScript only iterates over the values in the list, never needing to craft the path for each resource. You only need one known entry point. This is the HATEOAS concept, which uses hypermedia to drive the application's state transitions.
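In code, such a client might look like this sketch; note that it never builds a path by hand:
async function loadPosts() {
  // One known entry point...
  const response = await fetch('http://fqdn/api/posts');
  const postUrls = await response.json();
  // ...then follow the URIs the server handed back.
  for (const url of postUrls) {
    const post = await (await fetch(url)).json();
    console.log(post);
  }
}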
All in all, it's a good idea to think your application through thoroughly (you can use UML tools such as state-machine or sequence diagrams) so that you can cover all your use cases with a simple yet efficient set of sequences defining your API. Then for each sequence it's a good idea to have a single initial state, and you can have a single entry point linking to all the sequences.
Resources:
ACM Article
Restful API Design Second Edition Slides
Restful API design blog
Yes, there are quite a few standard ways of doing this. What you want to look for is "hypermedia APIs" - that is, APIs with embedded hypermedia elements such as link templates like yours, but also plain links and even more advanced actions (forms for APIs).
Here is an example representation using Mason to embed a link template in a response:
{
  "id": 1234,
  "title": "This is issue no. 1",
  "#link-templates": {
    "is:issue-query": {
      "template": "http://issue-tracker.org/mason-demo/issues-query?text={text}&severity={severity}&project={pid}",
      "title": "Search for issues",
      "description": "This is a simple search that does not check attachments.",
      "parameters": [
        {
          "name": "text",
          "description": "Substring search for text in title and description"
        },
        {
          "name": "severity",
          "description": "Issue severity (exact value, 1..5)"
        },
        {
          "name": "pid",
          "description": "Project ID"
        }
      ]
    }
  }
}
The URL template format is standardized in RFC 6570.
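A sketch of expanding that template client-side; this handles only simple {var} expressions, whereas RFC 6570 defines several more operators:
// Minimal {var} expansion (RFC 6570 level 1 only).
function expandTemplate(template, vars) {
  return template.replace(/\{(\w+)\}/g, function (match, name) {
    return encodeURIComponent(vars[name]);
  });
}

const link = body['#link-templates']['is:issue-query']; // body = parsed response above
expandTemplate(link.template, { text: 'crash', severity: 3, pid: 'demo' });
// -> http://issue-tracker.org/mason-demo/issues-query?text=crash&severity=3&project=demo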
Mason is not the only media type available for hypermedia APIs. There are also HAL, Siren, Collection+JSON, and Hydra.
And here is a discussion about the benefits of hypermedia.
Your code clearly violates the self-descriptive messages and the hypermedia as the engine of application state (HATEOAS) constraints of REST's uniform interface.
According to HATEOAS, you should send back hyperlinks, and the client should follow them, so it won't break when the API changes. A hyperlink is not the same as a URL: it contains a URL, an HTTP method, possibly the content type of the body, possibly input fields, and so on.
According to self-descriptive messages, you should add semantics to the data, the links, the input fields, etc., and the client should understand those semantics and behave accordingly. For example, you can add an API-specific "create-post" link relation to a hyperlink so the client understands it is for creating posts. Your client should always rely on these semantics instead of parsing URLs.
URLs are always API specific; semantics are not necessarily, so these constraints decouple the client from the API. The client then won't break on URL changes, or even data-structure changes, because it uses a standard hypermedia format (for example HAL, JSON-LD, Atom, or even HTML) and semantics (probably RDF) to parse the response body.
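For instance, a client built this way locates operations by link relation rather than by URL shape (a HAL-flavoured sketch; the response shape here is assumed, not prescribed):
// The client looks up the hyperlink by its "create-post" relation...
const createPost = response._links['create-post'];

// ...and follows whatever URI (and, in richer formats, method) it carries.
fetch(createPost.href, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ title: 'Hello' })
});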
URLs are always API specific, semantics not necessarily, so these constraints decouple the client from the API. After that the client won't break by URL changes or not even data structure changes, because it will use a standard hypermedia format (for example HAL, JSON-LD, ATOM or even HTML) and semantics (probably RDF) to parse the response body.