I'm wondering if there's any consensus on how best to handle GraphQL field arguments when using DataLoader. The batch function (batchFn) that DataLoader requires receives an Array<key> and returns a Promise that resolves to an Array of values, one per key, and usually one would just call load( parent.id ), where parent is the first parameter of the resolver for a given field. In most cases this is fine, but what if you need to provide arguments to a nested field?
For example, say I have a SQL database with tables for Users, Books, and a relationship table called BooksRead that represents a 1:many relationship between Users and Books.
I might run the following query to see, for all users, what books they have read:
query {
  users {
    id
    first_name
    books_read {
      title
      author {
        name
      }
      year_published
    }
  }
}
Let's say that there's a BooksReadLoader available within the context, such that the resolver for books_read might look like this:
const UserResolvers = {
  books_read: async function getBooksRead( user, args, context ) {
    return await context.loaders.booksRead.load( user.id );
  }
};
The batch load function for the BooksReadLoader would make an async call to a data access layer method, which would run some SQL like:
SELECT B.* FROM Books B INNER JOIN BooksRead BR ON B.id = BR.book_id WHERE BR.user_id IN(?);
We would create some Book instances from the resulting rows, group them by user_id, then return keys.map(fn) so that the right books are assigned to each user_id key in the loader's cache.
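A minimal sketch of such a batch function (getBooksReadRows is a hypothetical data-access helper that runs the SQL above):

async function batchGetBooksRead( userIds ) {
  // One round trip for every key in the batch: WHERE BR.user_id IN (?)
  const rows = await getBooksReadRows( userIds );
  // Group the joined rows per user id, defaulting each key to an empty array.
  const byUser = new Map( userIds.map( id => [ id, [] ] ) );
  for ( const row of rows ) {
    byUser.get( row.user_id ).push( new Book( row ) );
  }
  // DataLoader expects exactly one result per key, in the same order as the keys.
  return userIds.map( id => byUser.get( id ) );
}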
Now suppose I add an argument to books_read, asking for all the books a user has read that were published before 1950:
query {
  users {
    id
    first_name
    books_read(published_before: 1950) {
      title
      author {
        name
      }
      year_published
    }
  }
}
In theory, we could run the same SQL statement, and handle the argument in the resolver:
const UserResolvers = {
  books_read: async function getBooksRead( user, args, context ) {
    const books_read = await context.loaders.booksRead.load( user.id );
    return books_read.filter( function ( book ) {
      return book.year_published < args.published_before;
    });
  }
};
But this isn't ideal, because we're still fetching a potentially huge number of rows from the Books table when maybe only a handful actually satisfy the argument. It would be much better to execute this SQL statement instead:
SELECT B.* FROM Books B INNER JOIN BooksRead BR ON B.id = BR.book_id WHERE BR.user_id IN(?) AND B.year_published < ?;
My question is: does the cacheKeyFn option available via new DataLoader( batchFn[, options] ) allow the field's argument to be passed down to construct a dynamic SQL statement in the data access layer? I've reviewed https://github.com/graphql/dataloader/issues/75 but I'm still unclear whether cacheKeyFn is the way to go. I'm using apollo-server-express. There is this other SO question: Passing down arguments using Facebook's DataLoader, but it has no answers, and I'm having a hard time finding other sources that get into this.
Thanks!
Pass the id and params as a single object to the load function, something like this:
const UserResolvers = {
  books_read: async function getBooksRead( user, args, context ) {
    return context.loaders.booksRead.load({ id: user.id, ...args });
  }
};
Then let the batch load function figure out how to satisfy it in an optimal way.
You'll also want to either memoise the construction of that key object or supply a cacheKeyFn, because otherwise DataLoader's caching won't work properly: by default it compares cache keys by identity, not deep equality, so two structurally equal objects would be treated as different keys.
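For example, a loader along these lines would work (a sketch; fetchBooksRead is a hypothetical data-access helper that builds the IN (?) clause and appends AND B.year_published < ? when the argument is present):

const DataLoader = require('dataloader');

const booksReadLoader = new DataLoader(
  async function batchGetBooksRead( keys ) {
    // Within a single GraphQL request every key normally carries the
    // same arguments, so one dynamic SQL statement can serve the batch.
    const rows = await fetchBooksRead( keys );
    // One result per key, in key order, as DataLoader requires.
    return keys.map( key => rows.filter( row => row.user_id === key.id ) );
  },
  {
    // Serialise object keys so structurally equal keys share a cache entry.
    cacheKeyFn: key => JSON.stringify( key ),
  }
);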
Let's say I have the following code:
db.task(t => {
    return t.none('set search_path to myschema').then(() => {
        return t.any('select * from mytable').then(results => {
            return t.none('set search_path to originalschema').then(() => {
                return results;
            });
        });
    });
});
Could a query outside of db.task(), that happened to run in between of the two search_path changes inside db.task(), actually access the data in 'myschema' instead of 'originalschema'?
No.
SET search_path is a session-based operation, i.e. it applies only to the current connection, which the task allocates exclusively for the entire duration of its execution.
Once the task has finished, it releases the connection back to the pool. At that point, any query that gets that same connection will be working with the alternative schema, unless it is another task that sets the schema again. This gets tricky if you are setting the schema in just one task, and it is generally not recommended.
Here's how it should be instead:
If you want to access a special-case schema inside just one task, best is to specify the schema name explicitly in the query (see the sketch after this list).
If you want to set custom schema(s) dynamically, for the entire app, best is to use option schema of the Initialization Options. This will propagate the schema automatically through all new connections.
If you want to set the schema statically, there are queries for setting the schema permanently.
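For instance, the first option can use pg-promise's SQL Names formatting to inject the schema safely per query, and the second is a one-liner at initialization (schema and table names below are placeholders):

// Option 1: reference the schema explicitly in the query;
// the :name filter safely escapes SQL identifiers.
const rows = await db.any('SELECT * FROM $1:name.$2:name', ['myschema', 'mytable']);

// Option 2: propagate a schema through all new connections,
// via the Initialization Options:
const pgp = require('pg-promise')({ schema: 'myschema' });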
Addition:
And if you have a very special case, whereby you have a task that needs to run reusable queries inside an alternative schema, then you would set the schema in the beginning of the task, and then restore it to the default schema at the end, so any other query that picks up that connection later won't try to use the wrong schema.
Extra:
The example below creates your own task method (I called it taskEx), consistent across the entire protocol, which accepts a new option schema for setting the optional schema inside the task:
const initOptions = {
    extend(obj) {
        obj.taskEx = function () {
            const args = pgp.utils.taskArgs(arguments); // parse arguments
            const {schema} = args.options;
            delete args.options.schema; // to avoid error thrown
            if (schema) {
                return obj.task.call(this, args.options, t => {
                    return t.none('SET search_path to $1:name', [schema])
                        .then(args.cb.bind(t, t));
                });
            }
            return obj.task.apply(this, args);
        };
    }
};
const pgp = require('pg-promise')(initOptions);
So you can use it anywhere in your code:
const schema = 'public';
// or as an array: ['public', 'my_schema'];

db.taskEx({schema}, t => {
    // the schema is already set inside the task
});
Note that the taskEx implementation assumes that the schema is fully dynamic. If it is static, then there is no point re-issuing SET search_path on every task execution; you would want to do it only for fresh connections, based on the following check:
const isFreshConnection = t.ctx.useCount === 0;
However, in that case you would be better off using initialization option schema instead, as explained earlier.
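If you do go the fresh-connection route inside taskEx, the inner callback might look like this (a sketch):

return obj.task.call(this, args.options, t => {
    // Reused connections keep their session settings, so only a fresh
    // connection (useCount === 0) needs search_path set again.
    const init = t.ctx.useCount === 0
        ? t.none('SET search_path to $1:name', [schema])
        : Promise.resolve();
    return init.then(() => args.cb.call(t, t));
});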
I am incredibly new to JavaScript and I do not entirely understand promises. For simple operations like a read or a write, I understand that a promise must resolve before the code can continue, but I am not entirely sure how to deal with multiple promises at once, specifically after calling .get().
My goal is to query documents quite deep within my Firestore db, and the names of documents in subcollections higher up are not known to me, as they are created by the users. Each user will have places and these places will have guests, and my function intends to search through the guests and select them according to a field value. My code so far is this. Is there a better way?
async function getGuests() {
  var results = [];
  var users = await db.collection('users').get();
  users.forEach(async function(doc) {
    var places = await doc.ref.collection('places').get();
    places.forEach(async function(doc2) {
      var guests = await doc2.ref.collection('guests').where(...).get();
      return results.concat(guests);
    });
    return results;
  });
  return results;
}
The hierarchy looks like:
users/{userID}/places/{place_name}/guests/{guest}
Sounds like you want a collection group query instead. What you're doing right now is massively inefficient if the goal is simply to query across all subcollections called "guests".
const querySnapshot = await db.collectionGroup('guests').get();
// iterate querySnapshot to get document contents
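Putting that into the original function, a minimal sketch (the field and value in where() are placeholders for your actual filter):

async function getGuests() {
  const querySnapshot = await db.collectionGroup('guests')
      .where('someField', '==', 'someValue')
      .get();
  // Each doc lives at users/{userID}/places/{place_name}/guests/{guest}.
  return querySnapshot.docs.map(doc => doc.data());
}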
I'm having trouble understanding how to retrieve information from a GraphQL Union. I have something in place like this:
union Profile = StudentProfile | TeacherProfile
Then in my resolver I have:
Profile: {
  __resolveType(obj, context, info) {
    if (obj.studentId) {
      return 'StudentProfile'
    } else if (obj.salaryGrade) {
      return 'TeacherProfile'
    }
  },
},
This doesn't throw any errors, but when I run a query like this:
query {
  listUsers {
    id
    firstName
    lastName
    email
    password
    profile {
      __typename
      ... on StudentProfile {
        studentId
      }
      ... on TeacherProfile {
        salaryGrade
      }
    }
  }
}
This returns everything except for profile, which just returns null. I'm using Sequelize to handle my database work, but my understanding of unions was that they would simply look up the relevant type for the ID being queried and return the appropriate details in the query.
If I'm mistaken, how can I get this query to work?
edit:
My listUsers resolver:
const listUsers = async (root, { filter }, { models }) => {
  const Op = Sequelize.Op
  return models.User.findAll(
    filter
      ? {
          where: {
            [Op.or]: [
              {
                email: filter,
              },
              {
                firstName: filter,
              },
              {
                lastName: filter,
              },
            ],
          },
        }
      : {},
  )
}
User model relations (very simple and has no relation to profiles):
User.associate = function(models) {
  User.belongsTo(models.UserType)
  User.belongsTo(models.UserRole)
}
and my generic user resolvers:
User: {
  async type(type) {
    return type.getUserType()
  },
  async role(role) {
    return role.getUserRole()
  },
},
The easiest way to go about this is to utilize a single table (i.e. single table inheritance).
Create a table that includes columns for all the types. For example, it would include both student_id and salary_grade columns, even though these will be exposed as fields on separate types in your schema.
Add a "type" column that identifies each row's actual type. In practice, it's helpful to name this column __typename (more on that later).
Create a Sequelize model for your table. Again, this model will include all attributes, even if they don't apply to a specific type.
Define your GraphQL types and your interface/union type. You can provide a __resolveType method that returns the appropriate type name based on the "type" field you added. However, if you named this field __typename and populated it with the names of the GraphQL types you are exposing, you can actually skip this step!
You can use your model like normal, utilizing find methods to query your table or creating associations with it. For example, you might add a relationship like User.belongsTo(Profile) and then eagerly load it: User.findAll({ include: [Profile] }).
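Putting those steps together, a minimal sketch (table, column, and model names here are assumptions):

const { DataTypes } = require('sequelize');

// One table backs every profile type; columns that don't apply to a
// given type are simply left null.
const Profile = sequelize.define('Profile', {
  __typename: DataTypes.STRING,  // e.g. 'StudentProfile' or 'TeacherProfile'
  studentId: DataTypes.INTEGER,  // only populated for student rows
  salaryGrade: DataTypes.STRING, // only populated for teacher rows
}, { tableName: 'profiles' });

With __typename populated like that, the default type resolver can read it directly and you don't need to write __resolveType at all.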
The biggest drawback to this approach is you lose database- and model-level validation. Maybe salary_grade should never be null for a TeacherProfile but you cannot enforce this with a constraint or set the allowNull property for the attribute to false. At best, you can only rely on GraphQL's type system to enforce validation but this is not ideal.
You can take this a step further and create additional Sequelize models for each individual "type". These models would still point to the same table, but would only include the attributes specific to the fields you're exposing for each type. This way, you can at least enforce "required" attributes at the model level. Then, for example, you use your Profile model for querying all profiles, but use the TeacherProfile model when inserting or updating a teacher profile. This works pretty well; just be aware that you cannot use the sync method when structuring your models like this -- you'll need to handle migrations manually. You shouldn't use sync in production anyway, so it's not a huge deal, but definitely something to be mindful of.
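For instance, a narrower teacher-only model over the same table might look like this (again a sketch with assumed names):

const TeacherProfile = sequelize.define('TeacherProfile', {
  __typename: { type: DataTypes.STRING, defaultValue: 'TeacherProfile' },
  salaryGrade: { type: DataTypes.STRING, allowNull: false }, // enforced at model level
}, {
  tableName: 'profiles',
  // Keep queries on this model scoped to teacher rows only.
  defaultScope: { where: { __typename: 'TeacherProfile' } },
});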
I have a content model "category" that contains products (also a content model). Now I need to find the category in which a product is linked. For this I have the product's url_name (unique).
I searched the kentico-delivery-sdk (JS) docs for a filter that can reach deep inside an object/linked content model.
categoryByProduct: async (
  _,
  { product, limit = 1, depth, order, language }
) => {
  const query = deliveryClient.items();
  language && query.languageParameter(getKcCodeFromLang(language));
  limit && query.limitParameter(limit);
  depth && query.depthParameter(depth);
  order && query.orderParameter(order);
  query.containsFilter("elements.produkte[].url_name", product);
  const response = await query.getPromise();
  return response.items;
},
With this approach I never get a response from GraphQL. Is this the wrong filter?
Filtering in the Kentico Cloud API does not currently allow you to specify filters on nested properties, and therefore a filter such as elements.produkte[].url_name gives this exception when run against the Delivery API directly:
Operator '[].produkte' was not recognized as a valid operator.
What you are trying to do is a perfectly valid scenario, though currently you will have to make an additional request for your products, filter it, and combine the results of your queries into one final result.
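A rough sketch of that two-step approach (the 'category' type name and the element codenames are assumptions based on the question):

// 1. Find the product by its unique url_name.
const products = await deliveryClient.items()
  .equalsFilter('elements.url_name', product)
  .getPromise();
const codenames = products.items.map(item => item.system.codename);

// 2. Fetch categories whose 'produkte' element links any of those products.
const categories = await deliveryClient.items()
  .type('category')
  .anyFilter('elements.produkte', codenames)
  .getPromise();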
It's possible to query for a parent which has the current item as a subpage with the following query:
let items = (
  await kontent
    .items<IContentItem>()
    .containsFilter('elements.subpages', [currentItem.system.codename])
    .toPromise()
).data.items;
Where currentItem is the one I want to find the parent for.
I want to delete from an articles table using knex, by article_id. This article_id already exists in the comments table as a foreign key.
How can I test that the data has been deleted, and how can I send that confirmation to the user?
I decided to approach this by writing a function that deletes from both tables using a .then. Does this look like I am on the right lines?
exports.deleteArticleById = function (req, res, next) {
  const { article_id } = req.params;
  return connection('comments')
    .where('comments.article_id', article_id)
    .del()
    .returning('*')
    .then((deleted) => {
      console.log(deleted);
      return connection('articles')
        .where('articles.article_id', article_id)
        .del()
        .returning('*');
    })
    .then((article) => {
      console.log(article);
      return res.status(204).send('article deleted');
    })
    .catch(err => next(err));
};
At the moment I am getting the correct data in the logs, but the response is a 500 status, when I think it should be a 204.
Any help would be much appreciated.
What you're trying to do is called a cascading deletion.
These are better handled (and almost always are) at the database level rather than the application level.
It's the job of the DBMS to enforce this kind of referential integrity, assuming you define your schema correctly so that entities are linked together via foreign keys.
In short, you should define your database schema such that when you delete an Article, its associated Comments also get deleted for you.
Here's how I would do it using knex.js migrations:
// Define Article.
db.schema.createTableIfNotExists('article', t => {
  t.increments('article_id').primary()
  t.text('content')
})

// Define Comment.
// Each Comment is associated with an Article (1 - many).
db.schema.createTableIfNotExists('comment', t => {
  t.increments('comment_id').primary() // Add an autoincrement primary key (PK).
  t.integer('article_id').unsigned()   // Add a foreign key (FK)...
    .references('article.article_id')  // ...which references Article PK.
    .onUpdate('CASCADE')               // If Article PK is changed, update FK as well.
    .onDelete('CASCADE')               // If Article is deleted, delete Comment as well.
  t.text('content')
})
So when you run this to delete an Article:
await db('article').where({ article_id: 1 }).del()
All Comments associated with that Article also get deleted, automatically.
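With the cascade in place, the route handler from the question reduces to a single deletion. A sketch (note that a 204 response must not carry a body):

exports.deleteArticleById = function (req, res, next) {
  const { article_id } = req.params;
  return connection('articles')
    .where({ article_id })
    .del() // resolves to the number of deleted rows
    .then((count) => {
      if (count === 0) return res.status(404).send({ msg: 'article not found' });
      return res.sendStatus(204); // comments are gone too, via ON DELETE CASCADE
    })
    .catch(next);
};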
Don't try to perform cascading deletions yourself by writing application code. The DBMS is specifically designed with intricate mechanisms to ensure that deletions always happen in a consistent manner; its purpose is to handle these operations for you. It would be wasteful, complicated, and quite error-prone to attempt to replicate this functionality yourself.