I am trying to implement caching in Apollo client with the AWS AppSync JavaScript SDK but I am struggling to understand first the best way to use the cache and second what if any changes I need to make to adapt the Apollo V2 tutorials to work with the AppSync SDK.
With regards to using the cache, I have a list of objects that I get, I then want to view and modify a single object from this list. There are lots of tutorials on how to update something in a list, but I would rather run a second query that gets a single object by its ID so that the page will always work without having to go through the list first.
Is the cache smart enough to know that object X, retrieved through queries Y and Z, is the same object, and will it be updated in both places at the same time? If not, is there any documentation on how to write an update that will update the object in the list and by itself at the same time?
If no documentation exists then I will try and work it out on my own and post the code (because it will most likely not work).
With regards to the second question I have got the application working and querying the API using Amplify for authentication but I am unsure as to how to correctly implement the cache. Do I need to specify the cache when creating the client or does the SDK have a built-in cache? How do I access the cache? Is it just by querying the client as in these tutorials? https://www.apollographql.com/docs/react/advanced/caching.html
I am going to answer your second question first:
With regards to the second question I have got the application working and querying the API using Amplify for authentication but I am unsure as to how to correctly implement the cache. Do I need to specify the cache when creating the client or does the SDK have a built-in cache? How do I access the cache? Is it just by querying the client as in these tutorials?
Ok. So here is where it gets a little hairy -- it looks like AppSync was deployed at a time when the major client libraries for GraphQL (Apollo, Relay, etc.) were getting an overhaul, so AWS actually created a wrapper around the Apollo Client (probably for stable API purposes) and then exposed their own way of doing things. A quick run through the code suggests they have their own proprietary and largely undocumented way of doing things (websockets, their authentication protocols, a redux store, offline functionality, SSR, etc.). Thus, if it is not explicitly explained here or here, you're in uncharted territory.
Fortunately, all of the stuff that they provided (and much, much more) has now been implemented in the underlying Apollo Client in a documented way. Even more fortunately, it looks like the AppSync client forwards most of the actual GraphQL related stuff directly to the internal Apollo Cache and allows you to pass in config options under cacheOptions, so most of the configuration you can do with the Apollo Client you can do with the AppSync Client (more below).
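For example, here is a minimal sketch of instantiating the AppSync client with cache options passed through; the aws-exports import path and the Amplify Auth usage are assumptions based on a typical Amplify setup:

```js
import AWSAppSyncClient from 'aws-appsync';
import { Auth } from 'aws-amplify';
import awsconfig from './aws-exports'; // hypothetical path to your Amplify config

const client = new AWSAppSyncClient({
  url: awsconfig.aws_appsync_graphqlEndpoint,
  region: awsconfig.aws_appsync_region,
  auth: {
    type: awsconfig.aws_appsync_authenticationType,
    jwtToken: async () =>
      (await Auth.currentSession()).getIdToken().getJwtToken(),
  },
  // cacheOptions is forwarded to the internal Apollo InMemoryCache
  cacheOptions: {
    dataIdFromObject: obj => `${obj.__typename}:${obj.id}`,
  },
});
```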
Unfortunately, you cannot access the cache directly with the AppSync client (they've hidden it to make sure their public API remains stable in the fluctuating ecosystem). However, if you really need more control, most of the stuff they have implemented in the AppSync client could easily be replicated in your own instantiation of an Apollo Client wherein you'd unlock full control (you can use the open-source AppSync code as a foundation). Since GraphQL frontends and backends are decoupled, there is no reason why you couldn't use your own Apollo Client to connect with the AppSync server (for a large, serious project, this is what I would do as the Apollo Client is much better documented and under active development).
Is the cache smart enough to know that object X got through queries Y and Z is the same object and will be updated at the same time? If not, is there any documentation on how to write an update that will update the object in the list and by itself at the same time?
This first part pertains to both the Apollo Client and the AppSync client.
Yes! That's one of the great things about the Apollo client - every time you make a query it tries to update the cache. The cache is a normalized key-value store where all of the objects are stored at the top level, with the key being a combination of the __typename and id properties of the object. The Apollo client will automatically add __typename to all of your queries (though you will have to add id to your queries manually - otherwise it falls back to just the query path itself as the key, which is not very robust).
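As a rough illustration (the query, field and type names here are hypothetical), both of the following queries normalize to the same cache entries as long as they select id:

```js
import gql from 'graphql-tag';

// List query: each app in the result is stored under a key like "App:123".
const GET_APPS = gql`
  query GetApps {
    apps {
      id
      name
    }
  }
`;

// Detail query: app(id: "123") normalizes to the same "App:123" entry,
// so updating one updates the other.
const GET_APP = gql`
  query GetApp($id: ID!) {
    app(id: $id) {
      id
      name
    }
  }
`;
```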
The docs provide a very good overview of the mechanism.
Now, you may need to do some more advanced stuff. For example, if your GraphQL schema uses some unique object identifier other than id, you'll have to provide a dataIdFromObject function that maps to it.
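A sketch, assuming your objects expose a uuid field instead of id:

```js
import { InMemoryCache, defaultDataIdFromObject } from 'apollo-cache-inmemory';

const cache = new InMemoryCache({
  dataIdFromObject: object => {
    switch (object.__typename) {
      case 'App': // hypothetical type keyed by uuid rather than id
        return `App:${object.uuid}`;
      default:
        return defaultDataIdFromObject(object); // fall back to __typename + id
    }
  },
});
```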
Additionally, sometimes when making queries it is difficult for the cache to tell whether it already holds the data you are asking for, so it makes a network request anyway. To alleviate this problem, they provide the cache redirect mechanism.
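For instance, a cache redirect can tell the cache that a single-object query is answerable from an object already fetched via the list query (the field and type names are assumptions):

```js
import { InMemoryCache } from 'apollo-cache-inmemory';

const cache = new InMemoryCache({
  cacheRedirects: {
    Query: {
      // app(id: X) can be resolved from the normalized "App:X" entry
      // without hitting the network, if it's already in the cache.
      app: (_, args, { getCacheKey }) =>
        getCacheKey({ __typename: 'App', id: args.id }),
    },
  },
});
```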
Finally, and perhaps most complicated, is how to handle the ordering of items in paginated queries (e.g., anything that is in an ordered list). To do this, you'll have to use the @connection directive. Since this is based on the Relay connection spec, I'd recommend giving that a skim.
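Roughly, the directive gives a paginated field a stable cache key that ignores the pagination arguments (the feed field here is hypothetical):

```js
import gql from 'graphql-tag';

const GET_FEED = gql`
  query GetFeed($offset: Int!, $limit: Int!) {
    feed(offset: $offset, limit: $limit) @connection(key: "feed") {
      id
      title
    }
  }
`;
```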
Bonus: To see the cache in action, I'd recommend the Apollo Client dev tools. It's a little buggy, but it will at least give you some insight into what is actually happening to the cache locally -- this will not work if using AppSync.
So besides the above information which is all about setting up and configuring the cache, you can also control the data and access to the cache during the runtime of your app (if using Apollo Client directly and not the AppSyncClient).
The Direct Cache Access docs specify the available methods. Since most of the updates happen automatically just based on the queries you make, you shouldn't have to use these often. However, one use for them is complicated UI updates. For example, if you make a mutation that deletes an item from a list, instead of requerying for the entire list (which would update the cache, though at the expense of more network data, parsing, and normalization) you could define a custom cache update using readQuery/writeQuery and the update mutation option. This also plays nicely with optimisticResponse, which you should use if you're looking for optimistic UI.
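A sketch of that pattern, reusing the hypothetical client and GET_APPS query from the sketches above (appId is whatever id the user acted on):

```js
import gql from 'graphql-tag';

const DELETE_APP = gql`
  mutation DeleteApp($id: ID!) {
    deleteApp(id: $id) {
      id
    }
  }
`;

client.mutate({
  mutation: DELETE_APP,
  variables: { id: appId },
  // Let the UI update immediately, before the server responds.
  optimisticResponse: {
    deleteApp: { __typename: 'App', id: appId },
  },
  update: (cache, { data: { deleteApp } }) => {
    // Rewrite the cached list instead of refetching it.
    const { apps } = cache.readQuery({ query: GET_APPS });
    cache.writeQuery({
      query: GET_APPS,
      data: { apps: apps.filter(app => app.id !== deleteApp.id) },
    });
  },
});
```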
Additionally, you can choose whether you want to use or bypass the cache (or some more advanced strategy) using one of the fetchPolicy or errorPolicy options.
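For example, to bypass the cache for a single query (again reusing the client and GET_APPS from above):

```js
client.query({
  query: GET_APPS,
  fetchPolicy: 'network-only', // alternatives include cache-first (default), cache-only, cache-and-network
  errorPolicy: 'all',          // keep partial data alongside any GraphQL errors
});
```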
Related
I am using Cloud Firestore in a React Native app and I am trying to reduce the read/writes operations to a minimum. I just thought of using a local DB so that all data fetched from the cloud are saved in the local storage but I would add a snapshot listener to listen for changes whenever the user starts the app.
Is this a good approach for what I am aiming? If not, why? And if yes, do you have any suggestion related to its implementation?
I feel compelled to point out that the other (currently accepted) answer here is flat out incorrect, or at least misleading for a few reasons.
First, Firestore doesn't use HTTP, and the results of queries are never going to be maintained by your typical browser cache. The claims the answer makes about HTTP caching semantics simply do not apply.
Second, the Firestore SDK uses an internal cache, which is enabled by default on Android and iOS, because that cache almost always benefits the end user. Web applications would do well to enable this cache as well; it requires one line of code. This cache is queried when the client is offline, and can be queried directly if cached results are desired.
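For the web SDK, that one line (plus an optional explicit cache read) looks roughly like this; the users collection is just a placeholder:

```js
import firebase from 'firebase/app';
import 'firebase/firestore';

// Enable the SDK's local cache on the web (it's on by default on Android/iOS).
firebase.firestore().enablePersistence().catch(err => {
  // e.g. multiple open tabs or an unsupported browser
  console.warn('Persistence not enabled:', err.code);
});

// Read directly from the cache when cached results are explicitly wanted.
firebase
  .firestore()
  .collection('users') // hypothetical collection
  .get({ source: 'cache' })
  .then(snapshot => console.log(snapshot.size, 'cached docs'));
```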
Third, adding an additional layer of cache or persistence is actually very necessary for applications that must be fully usable offline. Firestore was not designed to be used fully offline, so having a local-first option is necessary for some applications. The additional cache can be synchronized with Firebase as a sort of cloud backup.
All told, the question is technically too broad for Stack Overflow, and it would take some conversation to work out whether it's worthwhile to enable Firestore's cache or add an additional cache on top of that. But the claim that client-side caching is a bad idea is simply not true.
No, it's not a good approach.
Caching data is generally a good idea, however implementing this at the DBMS tier will involve writing a lot of code for a caching mechanism you have yet to define. The reason it's a bad idea is that JavaScript running on a client already has access to a data tier with very well defined caching semantics, implemented in the runtime environment: HTTP.
We are in the process of slowly adding the graphql to our react project and replacing the existing redux integration. So I am trying to understand caching in the apollo and saw two things.
apollo-cache-inmemory (https://www.npmjs.com/package/apollo-cache-inmemory)
apollo-link-state (https://github.com/apollographql/apollo-link-state)
We have a query to fetch the list of apps on the home page, and this list of apps will be used on some other pages. So one option I tried is to call the list-of-apps query in the parent container and use client.readQuery in the child pages, so that the call to the GraphQL server happens only in the container and on the other pages the data is served from the cache. But I saw some posts recommending apollo-link-state in scenarios similar to this. So what is the best method to use here, and when should I use apollo-cache-inmemory versus apollo-link-state?
You shouldn't compare apollo-cache-inmemory directly to apollo-link-state. apollo-cache-inmemory is used to handle caching in Apollo Client; you don't have to write any custom code for it to work (apart from telling Apollo Client to use it). Any data you fetch from the API is cached automatically.
apollo-link-state, however, is meant for client-side state, e.g. the network status of the browser or the currently active tab - state that is usually not sent back to a backend server.
So, you only need to consider whether you need client-side caching or not. In most cases that I saw, a project would eventually end up using both.
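A minimal sketch of the client-side piece, assuming a purely local activeTab field that components would query with the @client directive:

```js
import { withClientState } from 'apollo-link-state';
import { InMemoryCache } from 'apollo-cache-inmemory';

const cache = new InMemoryCache();

const stateLink = withClientState({
  cache,
  defaults: { activeTab: 'home' }, // hypothetical local-only field
  resolvers: {
    Mutation: {
      setActiveTab: (_, { tab }, { cache }) => {
        cache.writeData({ data: { activeTab: tab } });
        return null;
      },
    },
  },
});

// stateLink is then composed with your HTTP link when constructing ApolloClient,
// e.g. ApolloLink.from([stateLink, httpLink]).
```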
I plan to implement a GraphQL API in .NET on IIS and dataLoader API as a Node.js app server. GraphQL will interface to dataLoader to SQL Server.
All applications will be on a single physical server for now, but may possibly be separated in the future if scalability requires.
My reasons for this:
Existing code depends on IIS/COM/DCOM/ActiveX/.NET/ASP/ASPX
Simpler to implement and reason about
Access control (web server doesn't need to see dataLoader code and ACLs can be implemented in dataLoader)
Makes it easier if I get the chance to interface with a different db (redis, mongodb, etc)
I can gradually slice and port parts of the code to allow easier code sharing (with separate Linux servers)
Keeps Node.js (which I like) open for exploration, though I can't fully opt in yet
First off, does this make sense or am I asking for trouble?
Would it make sense to use a binary serialization format between GraphQL and dataLoader? Or perhaps just a simple web service would be simpler?
Am I risking performance problems from more round-tripping? (Question too open-ended? Intuitively it seems like this would scale better eventually)
Is there a need for explicit authentication between GraphQL and dataLoader? Or can I just send session data (with username) through as-is and just let dataLoader trust the username given as context? Maybe pass a token? Are JWT tokens useful here?
GraphQL-dotnet has matured a bit since then and is looking pretty good.
I've since looked into solutions like AWS's API Gateway GraphQL support, and some Azure Functions solutions that support GraphQL.
Some of the techniques and design choices involved in these things were helpful here and there. But due to practical reasons, this never really came to fruition and most of these concerns never became relevant.
I have a bit of conceptual question regarding the structure of users and their documents.
Is it good practice to give each user within CouchDB their own database which holds their documents?
I have read that CouchDB can handle thousands of databases and that it is not that uncommon for each user to have their own database.
Reason:
The reason for asking this question is that I am trying to create a system where a logged-in user can only view their own documents and can't view any other user's documents.
Any suggestions?
Thank you in advance.
It's a rather common scenario to create a CouchDB bucket (DB) for each user. Although there are some drawbacks:
You must keep ddocs in sync in each user bucket, so deployment of ddoc changes across multiple buckets may become a real adventure.
If docs are shared between users in some way, you get doc and view-index dupes in each bucket.
You must block _info requests to avoid user list leak (or you must name buckets using hashes).
In any case, you need some proxy in front of Couch to create and prepare a new bucket on user registration (a rough sketch of that step follows at the end of this answer).
You'd also better protect Couch from running out of capacity when it receives too many requests – this also requires a proxy.
Per-doc read ACL can be implemented using _list functions, but this approach has some drawbacks and it also requires a proxy, at least a web-server, in front of CouchDB. See CouchDb read authentication using lists for more details.
Also you can try to play with CoverCouch which implements a full per-doc read ACL, keeping original CouchDB API untouched, but it’s in very early beta.
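As a rough sketch of the registration-time provisioning step mentioned above (the db-per-user naming convention and the admin-credentialed URL are assumptions):

```js
const fetch = require('node-fetch');

// couchUrl should include admin credentials, e.g. "http://admin:secret@localhost:5984"
async function provisionUserDb(couchUrl, username) {
  // couch_peruser-style naming: "userdb-" + hex(username)
  const dbName = `userdb-${Buffer.from(username).toString('hex')}`;

  // Create the per-user database
  await fetch(`${couchUrl}/${dbName}`, { method: 'PUT' });

  // Restrict members to this user only
  await fetch(`${couchUrl}/${dbName}/_security`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      admins: { names: [], roles: [] },
      members: { names: [username], roles: [] },
    }),
  });
}
```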
This is quite a common use case, especially in mobile environments, where the data for each user is synchronized to the device using one of the Android, iOS or JavaScript (pouchdb) libraries.
So in concept, this is fine but I would still recommend testing thoroughly before going into production.
Note that one downside of multiple databases is that you can't write queries that span multiple databases. There are some workarounds though - for more information see Cloudant: Searching across databases.
Update 17 March 2017:
Please take a look at Cloudant Envoy for more information on this approach.
Database-per-user is a common pattern with CouchDB when there is a requirement for each application user to have their own set of documents which can be synced (e.g. to a mobile device or browser). On the surface, this is a good solution - Cloudant handles a large number of databases within a single installation very well. However ...
Source: https://github.com/cloudant-labs/envoy
The solution is as old as web applications - if you think of a MySQL database, there is nothing in the database to stop user B viewing records belonging to user A - it is all coded in the application layer.
In CouchDB there is likewise no completely secure way to prevent user B from accessing documents written by user A. You would need to code this in your application layer just as before.
Provided you have a web application between CouchDB and the users you have no problem. The issue comes when you allow CouchDB to serve requests directly.
Using multiple databases for multiple users has some important drawbacks:
queries over data in different databases are not possible with the native CouchDB API. Analysis of your website's overall status is pretty much impossible!
maintenance will soon become very hard: think of replicating/compacting thousands of databases each time you want to perform a backup
It depends on your use case, but I think that a nice approach can be:
allow access only through virtual host. This can be achieved using a proxy or much more simply by using a couchdb hosting provider which lets you fine-tune your "domains->path" mapping
use design docs / couchapps, instead of direct document CRUD API, for read/write operations
2.1. using the _rewrite handler to allow only valid requests: in this way you can instantly block access to sensitive handlers like _all_docs, _all_dbs and others
2.2. using _list and _view handlers for read doc/role-based ACLs as described in CouchDb read authentication using list (a rough sketch follows after this list)
2.3. using _update handlers for write doc/role-based ACLs
2.4. using authenticated rewriting rules for read/write role-based ACLs
2.5. a filtered _changes handler is another way of retrieving all of a user's data with a read doc/role-based ACL. Depending on your use case this can simplify your read API as much as possible, letting you concentrate on your update API.
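A rough sketch of the _list-based read ACL from 2.2, assuming each doc carries an owner field and the view is queried with include_docs=true:

```js
// Design-doc _list function: only return rows the requesting user may read.
function (head, req) {
  provides('json', function () {
    var row, rows = [];
    while ((row = getRow())) {
      var doc = row.doc || {};
      if (doc.owner === req.userCtx.name ||
          req.userCtx.roles.indexOf('_admin') !== -1) {
        rows.push(row);
      }
    }
    send(JSON.stringify({ rows: rows }));
  });
}
```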
I am planning to use the Qualtrics REST API in order to get the data collected from a survey. Can I still retain Meteor's reactivity directly through the REST API, or should I save the data from the REST API into MongoDB to enable real-time updates within the app?
Any advice and further reading will be great.
This will sound like a noob question probably but I am just starting off with Meteor and JS as server side code and never used a web api before.
It entirely depends on what you do with the data it returns. Assuming you're either polling periodically or the API has some kind of push service (I've never heard of it before, so I have no idea), you would need to store the data it returns in a reactive data source: probably a Collection or Session variable, depending on how much persistence is required. Any Meteor templates that access these structures have reactivity built in, as documented here.
Obviously, you will probably need to be polling the API at an appropriately regular interval for this set up to work though. Take a look at Meteor.setInterval, or the meteor-cron package, which is probably preferable.
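A server-side sketch of that polling approach; the Qualtrics endpoint, token handling and response shape here are assumptions, the important part is upserting into a collection that client templates subscribe to:

```js
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';
import { HTTP } from 'meteor/http';

export const SurveyResponses = new Mongo.Collection('surveyResponses');

if (Meteor.isServer) {
  Meteor.setInterval(() => {
    const result = HTTP.get(
      'https://yourdatacenter.qualtrics.com/API/v3/surveys/SV_id/responses', // hypothetical endpoint
      { headers: { 'X-API-TOKEN': Meteor.settings.qualtricsToken } }
    );
    (result.data.responses || []).forEach(r =>
      // Upserting keeps subscribed templates reactive as new responses arrive.
      SurveyResponses.upsert({ _id: r.responseId }, { $set: r })
    );
  }, 60 * 1000); // poll every minute
}
```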