I would like to know what you think about the following task. I want to write data from a JSON object into a database, and I would like to separate the SQL logic from the business logic.
I have read that this strategy can perform poorly when the JS file contains a lot of queries.
Which approach is best practice in your opinion? Can you provide a small example?
Your performance question is definitely a 'race your horses' scenario (i.e. test it and see). But in general, if you're going to do this, I'd simply export an object with all your named queries, like so:
module.exports = {
  getAllUsers: "SELECT username, email, displayName FROM users;",
  /* other queries */
};
Your calling code can then just require that file and get what it needs:
const queries = require('./db/queries');
queries.getAllUsers // <-- this is now that string
Performance should be about as good as it gets, since your require cache will ensure the file is only read once, and a key-based lookup in JS is pretty quick, even with a thousand or two entries.
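For what it's worth, here's a minimal sketch of the calling side, assuming the node-postgres (pg) driver; the pool setup and the users table are illustrative:

const { Pool } = require('pg');
const queries = require('./db/queries');

const pool = new Pool(); // connection settings come from the usual PG* env vars

async function listUsers() {
  // queries.getAllUsers is just the exported SQL string
  const result = await pool.query(queries.getAllUsers);
  return result.rows;
}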
I think it is always good practice to separate DB code from business code, and from API code if it exists.
By creating these separate layers, you gain several advantages (a small sketch follows the list):
You can test every layer separately (with unit tests), mocking the other layers. This lets you detect errors quickly when you make changes to your code.
You can easily change your DB connector, or even your database, without impacting your business code (e.g. replace MySQL with MongoDB).
You can change your API or add a new one without changing your business code (e.g. replace REST with GraphQL, or offer both).
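As a minimal sketch of what those layers can look like in Node.js (the file names, the getUser example, and the use of pg and Express are all illustrative assumptions):

// db/users.js - DB layer: the only place that knows about SQL
const pool = require('./pool'); // a pg Pool, assumed to be configured elsewhere
exports.findById = (id) =>
  pool.query('SELECT id, name FROM users WHERE id = $1', [id])
      .then((res) => res.rows[0]);

// domain/users.js - business layer: no SQL, no HTTP
const usersDb = require('../db/users');
exports.getUser = async (id) => {
  const user = await usersDb.findById(id);
  if (!user) throw new Error('User not found');
  return user;
};

// api/users.js - API layer: only translates HTTP to domain calls
const express = require('express');
const { getUser } = require('../domain/users');
const router = express.Router();
router.get('/users/:id', (req, res) =>
  getUser(req.params.id)
    .then((user) => res.json(user))
    .catch(() => res.status(404).end()));
module.exports = router;

Each layer can now be unit-tested by mocking the layer below it, and swapping pg for a MongoDB driver only touches db/users.js.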
If you want to see a project with these layers, we recently published a simple project that lets you create a collaborative newsletter. You can check the backend part, which has a db folder, a domain folder and an api folder. Those are the three layers I was talking about:
Collaborative newsletter
Hope it helps.
In AWS DynamoDB we cannot store more than 400KB of data in a single record [Reference].
Based on suggestions online, I can either compress the data before storing it or upload part of it to an AWS S3 bucket, both of which I am fine with.
But my application (a JavaScript/Express server plus many JS Lambdas/microservices) is large, and adding the above logic would require a heavy rewrite and extensive testing. There is an immediate requirement from a big client that demands >400KB storage in the DB, so is there an alternative way to solve the problem that doesn't require changing my existing code that fetches records from the DB?
I was thinking more along these lines:
My backend makes a DynamoDB call to fetch the record as it does now (we use a mix of vogels and aws-sdk to make DB calls) -> the call is intercepted by a Lambda (or something else) which handles the necessary compression/decompression/S3 work with DynamoDB and returns the data to the backend.
Is the above approach possible, and if so, how can I go about implementing it? If you have a better way, please do tell.
PS. Going forward I will definitely rewrite my codebase to take care of this; what I am asking for is an immediate stopgap solution.
Split the data into multiple items. You’ll have to change a little client code but hopefully you have a data access layer so it’s just a small change in one place. If you don’t have a DAL, from now on always have a DAL. :)
For the payload of a big item, use the regular item as a manifest that points at the segmented items, then batch-get those segmented items (a sketch follows below).
This assumes compression alone isn’t always sufficient. If it is, do that.
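A minimal sketch of that manifest pattern with the plain aws-sdk DocumentClient; the table name, key layout and the 350KB segment size are assumptions, not taken from the question:

const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

const TABLE = 'records';          // hypothetical table with a single 'id' key
const SEGMENT_SIZE = 350 * 1024;  // stay safely under the 400KB item limit

async function putLargeItem(id, payload) {
  const json = JSON.stringify(payload);
  const segmentIds = [];
  // Write each slice of the serialized payload as its own item
  for (let i = 0; i * SEGMENT_SIZE < json.length; i += 1) {
    const segId = `${id}#seg${i}`;
    segmentIds.push(segId);
    await ddb.put({
      TableName: TABLE,
      Item: { id: segId, seg: i, chunk: json.slice(i * SEGMENT_SIZE, (i + 1) * SEGMENT_SIZE) },
    }).promise();
  }
  // The regular item becomes the manifest pointing at the segments
  await ddb.put({ TableName: TABLE, Item: { id, segmentIds } }).promise();
}

async function getLargeItem(id) {
  const manifest = (await ddb.get({ TableName: TABLE, Key: { id } }).promise()).Item;
  // batchGet is limited to 100 keys / 16MB per call; fine for a handful of segments
  const { Responses } = await ddb.batchGet({
    RequestItems: { [TABLE]: { Keys: manifest.segmentIds.map((segId) => ({ id: segId })) } },
  }).promise();
  // batchGet does not guarantee order, so sort the segments back into place
  const chunks = Responses[TABLE].sort((a, b) => a.seg - b.seg);
  return JSON.parse(chunks.map((c) => c.chunk).join(''));
}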
I have the following react-apollo-wrapped GraphQL query:
user(id: 1) {
  name
  friends {
    id
    name
  }
}
As semantically represented, it fetches the user with ID 1, returns their name, and returns the id and name of all of their friends.
I then render this in a component structure like the following:
graphql(ParentComponent)
-> UserInfo
-> ListOfFriends (with the list of friends passed in)
This is all working for me. However, I wish to be able to refetch the list of friends for the current user.
I can do this.props.data.refetch() on the parent component and updates will be propagated; however, I'm not sure this is best practice, given that my GraphQL query looks something more like this:
user(id: 1) {
  name
  foo1
  foo2
  foo3
  foo4
  foo5
  ...
  friends {
    id
    name
  }
}
The only thing I actually wish to refetch is the list of friends.
What is the best way to cleanly architect this? I'm thinking along the lines of binding an initially skipped GraphQL fetcher to the ListOfFriends component, which can be triggered as necessary, but I would like some guidance on how this is best done.
Thanks in advance.
I don't know why your question is downvoted, because I think it is a very valid question to ask. One of GraphQL's selling points is "fetch less and more at once". A client can decide very granularly what it needs from the backend. Deeply nested graph-like queries that previously required multiple endpoints can now be expressed in a single query, and at the same time over-fetching can be avoided. Now you find yourself with a big query: everything loads at once and there are no n+1 query waterfalls. But you know that a few fields in your big query change from time to time, and you want to actively update the cache with new data from the server. Apollo offers the refetch function, but it reloads the whole query, which is exactly the over-fetching that GraphQL was sold as eliminating. Let me offer some solutions:
Premature Optimisation?
The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming. - Donald Knuth
Sometimes we try to optimise too much without measuring first. Write it the easy way first and then see if it is really an issue. What exactly is slow? The network? A particular field in the query? The sheer size of the query?
After you have analysed what exactly is slow, we can start looking into improvements:
Refetch and include/skip directives
Using directives, you can exclude fields from a query depending on variables. The refetch function can specify different variables than the initial query, so you can exclude fields when you refetch.
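A minimal sketch of that approach, assuming react-apollo's graphql HOC and a hypothetical $withDetails variable guarding the heavy fields:

import React from 'react';
import gql from 'graphql-tag';
import { graphql } from 'react-apollo';

const USER_QUERY = gql`
  query User($id: Int!, $withDetails: Boolean!) {
    user(id: $id) {
      name
      foo1 @include(if: $withDetails)
      foo2 @include(if: $withDetails)
      friends {
        id
        name
      }
    }
  }
`;

class ParentComponent extends React.Component {
  refetchFriends = () => {
    // Refetch with the flag off: the @include directives drop foo1/foo2,
    // so only name and friends travel over the wire
    this.props.data.refetch({ id: 1, withDetails: false });
  };
  render() {
    return null; // render UserInfo and ListOfFriends as before
  }
}

export default graphql(USER_QUERY, {
  options: { variables: { id: 1, withDetails: true } },
})(ParentComponent);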
Splitting up Queries
Single-page apps are a great idea: the HTML is generated client side and the page does not have to make expensive trips to the server to render a new page. But soon SPAs got too big and code splitting became an issue, and now we are basically back to server-side rendering and splitting the app into pages. The same might apply to GraphQL: sometimes queries are too big and should be split. You could split up the queries for UserInfo and ListOfFriends (a sketch follows); inside the cache the fields will be merged. With query batching, both queries will be sent in the same request, and a GraphQL server that implements per-request resource caching correctly (e.g. with DataLoader) will barely notice a difference.
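Roughly, the split could look like this (component and field names are taken from the question, variable handling is simplified):

import gql from 'graphql-tag';
import { graphql } from 'react-apollo';

const USER_INFO_QUERY = gql`
  query UserInfo($id: Int!) {
    user(id: $id) { name }
  }
`;

const FRIENDS_QUERY = gql`
  query Friends($id: Int!) {
    user(id: $id) {
      friends { id name }
    }
  }
`;

// UserInfo and ListOfFriends are the components from the question.
// Each one now owns only the data it renders; refetching the friends
// query no longer drags the user's other fields along.
const UserInfoWithData = graphql(USER_INFO_QUERY, {
  options: { variables: { id: 1 } },
})(UserInfo);
const ListOfFriendsWithData = graphql(FRIENDS_QUERY, {
  options: { variables: { id: 1 } },
})(ListOfFriends);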
Subscriptions
Maybe you are ready to use subscriptions already. Subscriptions send updates from the server for fields that have changed. This way you could subscribe to a user's friends and get updates in real time. The good news is that Apollo Client, Relay and many server implementations already offer support for subscriptions. The bad news is that they need WebSockets, which usually put different requirements on your technology stack than pure HTTP.
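For completeness, a sketch of what this could look like with react-apollo's subscribeToMore; the friendsChanged subscription is hypothetical and would have to exist in your server's schema:

import gql from 'graphql-tag';

const FRIENDS_SUBSCRIPTION = gql`
  subscription OnFriendsChanged($userId: Int!) {
    friendsChanged(userId: $userId) {
      id
      name
    }
  }
`;

// Inside the component wrapped by graphql(): merge pushed friend lists
// into the cached query result
this.props.data.subscribeToMore({
  document: FRIENDS_SUBSCRIPTION,
  variables: { userId: 1 },
  updateQuery: (prev, { subscriptionData }) => ({
    ...prev,
    user: { ...prev.user, friends: subscriptionData.data.friendsChanged },
  }),
});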
withApollo() -> this.props.client.query
This should only be your last resort! Using react-apollo's withApollo higher-order component, you can inject the ApolloClient instance directly. You can then execute queries using this.props.client.query(). A query like { user(id: 1) { friends { ... } } } can be used to fetch just the friend list and update the cache, which will trigger an update of your component. This might look like what you want, but it can haunt you in later stages of the app.
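A rough sketch of this escape hatch, reusing a hypothetical FRIENDS_QUERY like the one above:

import React from 'react';
import gql from 'graphql-tag';
import { withApollo } from 'react-apollo';

const FRIENDS_QUERY = gql`
  query Friends($id: Int!) {
    user(id: $id) {
      friends { id name }
    }
  }
`;

class ListOfFriends extends React.Component {
  reloadFriends = () => {
    // Imperative fetch; the result lands in the normalized cache,
    // which re-renders any component watching these fields
    this.props.client.query({
      query: FRIENDS_QUERY,
      variables: { id: 1 },
      fetchPolicy: 'network-only', // bypass the cache so the server is hit
    });
  };
  render() {
    return null; // render the friend list as before
  }
}

export default withApollo(ListOfFriends);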
In my application I receive JSON data in a POST request and store it as raw JSON data in a table. I use PostgreSQL (9.5) and Node.js.
In this example, the data is an array of about 10 quiz questions experienced by a user, which looks like this:
[{"QuestionId":1, "score":1, "answerList":["1"], "startTime":"2015-12-14T11:26:54.505Z", "clickNb":1, "endTime":"2015-12-14T11:26:57.226Z"},
{"QuestionId":2, "score":1, "answerList":["3", "2"], "startTime":"2015-12-14T11:27:54.505Z", "clickNb":1, "endTime":"2015-12-14T11:27:57.226Z"}]
I need to store (temporarily or permanently) several indicators computed by aggregating data from this JSON at quiz level, as I need these indicators to perform other procedures in my database.
Until now I have been computing the indicators with JavaScript functions at the time of handling the POST request, and inserting the values into my table alongside the raw JSON data. I'm wondering whether it would be more performant to have the calculation performed by a stored trigger function in my PostgreSQL DB (knowing that the SQL function would need to retrieve the data from inside the raw JSON).
I have read other posts on this topic, but they were written many years ago and not about Node.js, so I thought people might have some new insight on the pros and cons of using SQL stored procedures vs server-side JavaScript functions.
Edit: I should probably have mentioned that most of my application's logic already lies in PostgreSQL stored procedures and views.
Generally, I would not use that approach, due to the risk of the triggers getting out of sync with the code. In general, the single responsibility principle should be the guide: the DB stores data and the code manipulates it. Unless you have a really pressing business need to break this pattern, I'd advise against it.
Do you have a migration that will recreate the triggers if you wipe the DB and start from scratch? Will you or a coworker fail to realise they are there at a later point when reading the app code, and wonder what is going on? If there is a standardised way to manage the triggers, where the configuration is stored as code with the rest of your app, then maybe it's not a problem. If not, be wary. A small performance gain may well not be worth the potential for lost developer time and shipped bugs.
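To illustrate the "code manipulates it" side, here is a minimal sketch of computing the indicators in Node before inserting, using node-postgres; the indicator choices and the table layout are assumptions based on the question's JSON:

const { Pool } = require('pg');
const pool = new Pool();

// Aggregate quiz-level indicators from the raw answer array
function computeIndicators(answers) {
  const totalScore = answers.reduce((sum, a) => sum + a.score, 0);
  const totalTimeMs = answers.reduce(
    (sum, a) => sum + (new Date(a.endTime) - new Date(a.startTime)), 0);
  return { totalScore, totalTimeMs };
}

async function saveQuiz(userId, answers) {
  const { totalScore, totalTimeMs } = computeIndicators(answers);
  // quiz_results and its columns are a hypothetical layout
  await pool.query(
    'INSERT INTO quiz_results (user_id, raw_data, total_score, total_time_ms) ' +
    'VALUES ($1, $2, $3, $4)',
    [userId, JSON.stringify(answers), totalScore, totalTimeMs]
  );
}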
I currently work somewhere that has gone all-in on SQL functions. We have over a thousand. I'd strongly advise against it.
Having logic split between JavaScript and SQL is a real pain when debugging issues, especially if, like me, you are much more familiar with JS.
The functions are at least all tracked in source control and get updated/created in the DB as part of the deployment process, but this means you have two places to look when trying to follow the code.
I fully agree with the other answer: single responsibility principle, DB for storage, server/app for logic.
I recently followed a tutorial to create a Node.js server connecting to an orchestrate.io database. The problem is that I now want to point the server at a MongoDB database hosted on MongoLab. Currently I am declaring a variable:
var db = require('orchestrate')(APIKEY);
which allows me to retrieve data using something like:
db.get('collection', key)
  .then(function (result) {
    console.log(result.body);
  });
My question is: is there any way I can switch the value of 'db' to point at a MongoLab database without changing the structure of the get request?
I work at Orchestrate and we do not believe in data lock-in. I hope you'll reconsider using our service, but here's some advice if you choose to leave...
It sounds like your code is fairly minimal, so you may be best off recreating your Node server with another tutorial specific to Mongo.
That said, if you are using simple key-value storage, it should be as easy as rewriting the Orchestrate db.get lines as MongoDB find calls. If you've loaded a lot of data, you could export it from Orchestrate, then import it into Mongo (either manually or using another tool).
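For instance, a minimal sketch with the official MongoDB Node driver (3.x); the connection URI, database name, collection and key are placeholders:

const { MongoClient } = require('mongodb');

// MongoLab gives you a URI like this for each database
const uri = 'mongodb://user:pass@ds012345.mongolab.com:12345/mydb';

MongoClient.connect(uri, function (err, client) {
  if (err) throw err;
  const db = client.db('mydb');
  // Rough equivalent of db.get('collection', key) in Orchestrate
  db.collection('collection').findOne({ _id: 'someKey' }, function (err, result) {
    console.log(result);
    client.close();
  });
});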
If you're using some advanced, built-in Orchestrate features, such as full-text search, relation graphing, time-series data, and geographic look-ups, it may take some more effort (and MongoDB experience) to switch. If you'd like these features in a highly scalable database-as-a-service that you don't have to maintain, you know where to find us.
I am playing around with CouchDB to test whether it is "possible" [1] to store scientific data (simulated and experimental raw data plus metadata). A big pro is the schema-less approach of CouchDB: we have to be very flexible with the metadata, as the set of parameters changes very often.
Up to now I have some code to feed raw data, plots (both as attachments), and hierarchical metadata (as JSON) into CouchDB documents, and I have written some prototype JavaScript for filtering and display. But the filtering is done on the client side (a.k.a. the browser): the map function simply returns everything.
How could I change the map function (or push a second one) of a specific _design document with simple browser JS?
I do not think that a temporary view would yield any performance gain...
Thanks for your time and answers.
[1]: of course it is possible, but is it also useful? feasible? reasonable?
[added]
Ah, jquery.couch.js (version 0.9.0) provides a saveDoc() function, which can update the _design document with the new map function.
But I also tried the query function, which uses a temporary view. Okay, "do not use this in the real product, only during development"... but scientific research is steady development, right?
Temporary views get cached, as I noticed, and this works well for ~1000 documents per DB. A second plus: each user (think 1 to 3 of them, so full user management would be quite an overkill) can work with their own temporary view.
Never ever use temporary views. They are really only there for dev and debugging purposes. For more information, see http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views (specifically the bold "NOTE").
And yes, because design documents are really just documents with special powers, you can run your GET/POST/PUT/DELETE methods on them. However, you will usually need admin privileges to do this. So if you are allowing a piece of client-side software to do that, you are making your entire database publicly readable and writable; this may be fine for your application, but it is important to remember.
For example, if you restrict access to your database but put the username and password in client-side JavaScript, then anyone can see that username and password.
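For illustration, a minimal sketch of updating a design document's view with jquery.couch.js; the database name, design doc and view are made up, and admin credentials are required:

// Fetch the current design doc first so we have its _rev, then add a view
$.couch.db('mydb').openDoc('_design/analysis', {
  success: function (doc) {
    doc.views = doc.views || {};
    doc.views.by_parameter = {
      map: 'function (doc) { if (doc.metadata) { emit(doc.metadata.parameter, null); } }'
    };
    $.couch.db('mydb').saveDoc(doc, {
      success: function () { console.log('view saved'); }
    });
  }
});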
Cheers.
I've written some helper functions for jquery.couch and design docs; take a look at:
https://github.com/grischaandreew/jquery.couch.js