I have a collection that contains several documents.
Every document contains a field named age.
The id of the collection is data.
There are 20 documents.
db.collection('data').orderBy('age')
My question is:
How can I get a certain range of those documents once they are ordered by age?
Example:
After ordering by age:
[doc5, doc12, doc9, doc4, doc1, doc15, doc7, doc14, doc11, doc17,
doc3, doc2, doc13, doc8, doc6, doc18, doc20, doc16, doc19, doc10]
Fourth to Sixth (doc4, doc1, doc15)
Eleventh to Twelfth (doc3, doc2)
Fourteenth (doc8)
Firestore does not offer (actually, no longer offers) client-side queries that let you jump to an offset within the query results. You have to start from the beginning, and manually skip each result that you're not interested in, keeping track of the current index as you go. Yes, this will cost you excess document reads, but you don't have an alternative if you're not able to assign index values of your own for each document.
If you want to perform the query on a backend, you have offset() available, but the skipped documents are still counted as reads, so it is neither an efficient nor a cheap way of skipping unwanted results.
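As a rough illustration of the manual skipping described above, here is a sketch that works on an already-fetched, ordered list of document IDs (the list from the question). The helper name rangeByPosition is made up for this example, and positions are 1-based and inclusive. In a real app the array would come from the docs of the ordered query, and every skipped document would still have been read.

```javascript
// Ordered result from the question, as plain IDs (stand-in for snapshot.docs).
const ordered = [
  'doc5', 'doc12', 'doc9', 'doc4', 'doc1', 'doc15', 'doc7', 'doc14', 'doc11', 'doc17',
  'doc3', 'doc2', 'doc13', 'doc8', 'doc6', 'doc18', 'doc20', 'doc16', 'doc19', 'doc10',
];

// Hypothetical helper: walk the results from the beginning, tracking the
// current index, and keep only positions first..last (1-based, inclusive).
function rangeByPosition(docs, first, last) {
  const picked = [];
  docs.forEach((doc, index) => {
    const position = index + 1;
    if (position >= first && position <= last) picked.push(doc);
  });
  return picked;
}

console.log(rangeByPosition(ordered, 4, 6));   // fourth to sixth
console.log(rangeByPosition(ordered, 11, 12)); // eleventh to twelfth
console.log(rangeByPosition(ordered, 14, 14)); // fourteenth
```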
I would like to create two queries, with pagination option. On the first one I would like to get the first ten records and the second one I would like to get the other all records:
.startAt(0)
.limit(10)
.startAt(9)
.limit(null)
Can anyone confirm that the above code is correct for both conditions?
Firestore does not support index or offset based pagination. Your query will not work with these values.
Please read the documentation on pagination carefully. Pagination requires that you provide a document reference (or field values in that document) that defines the next page to query. This means that your pagination will typically start at the beginning of the query results, then progress through them using the last document you see in the prior page.
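To make the idea of a cursor concrete, here is a small in-memory sketch. It does not call Firestore at all: pageAfterAge is a made-up helper that mimics what orderBy('age').startAfter(lastAge).limit(n) would do, and the people array is invented sample data. The point is only that the next page is defined by a value taken from the last document you saw, not by a position.

```javascript
// Stand-in data, already sorted by age (what orderBy('age') would give you).
const people = [
  { id: 'a', age: 18 }, { id: 'b', age: 21 }, { id: 'c', age: 25 },
  { id: 'd', age: 30 }, { id: 'e', age: 42 },
];

// Hypothetical in-memory equivalent of orderBy('age').startAfter(lastAge).limit(n):
// the cursor is a field value from the last document seen, never a position.
function pageAfterAge(sortedDocs, lastAge, pageSize) {
  const rest = lastAge === null
    ? sortedDocs
    : sortedDocs.filter((doc) => doc.age > lastAge);
  return rest.slice(0, pageSize);
}

const firstPage = pageAfterAge(people, null, 2);
const cursor = firstPage[firstPage.length - 1].age; // value to resume from
const secondPage = pageAfterAge(people, cursor, 2);
console.log(firstPage.map((d) => d.id), secondPage.map((d) => d.id));
```

Note that a cursor on a field value skips every document sharing that value, which is one reason passing the last document itself is usually the safer cursor.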
From CollectionReference:
offset(offset) → {Query}
Specifies the offset of the returned results.
As Doug mentioned, Firestore does not support Index/offset - BUT you can get similar effects using combinations of what it does support.
Firestore has its own internal sort order (usually the document.id), but any query can be sorted with .orderBy(), and the first document will be relative to that sorting - only an orderBy() query has a real concept of a "0" position.
Firestore also allows you to limit the number of documents returned with .limit(n)
.endAt(), .endBefore(), .startAt(), .startAfter() all need either the field values matching the orderBy fields, or a DocumentSnapshot - NOT an index
What I would do is create a Query:
const MyOrderedQuery = FirebaseInstance.collection().orderBy()
Then first execute
MyOrderedQuery.limit(n).get()
or
MyOrderedQuery.limit(n).onSnapshot()
Either way you end up with a QuerySnapshot, which contains an array of DocumentSnapshots. Let's save that array:
let ArrayOfDocumentSnapshots = QuerySnapshot.docs;
Warning, Will Robinson! JavaScript assignment of objects and arrays is by reference,
and even the spread operator only copies shallowly - make sure your code actually
copies the full deep structure or that the reference is kept around!
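A quick way to see what that warning means, using plain objects as stand-ins for DocumentSnapshots (the data here is invented for the demo):

```javascript
// Plain objects standing in for DocumentSnapshots.
const original = [{ id: 'doc1', data: { age: 30 } }];

const sameArray = original;        // assignment: both names point at one array
const shallowCopy = [...original]; // new array, but the same element objects

sameArray.push({ id: 'doc2', data: { age: 41 } });
shallowCopy[0].data.age = 31;

console.log(original.length);      // 2: the push went through the shared reference
console.log(shallowCopy.length);   // 1: the spread made a separate array
console.log(original[0].data.age); // 31: the element object is still shared
```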
Then to get the "rest" of the documents as you ask above, I would do:
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).get()
or
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).onSnapshot()
which will start AFTER the last returned document snapshot of the FIRST query. Note the re-use of the MyOrderedQuery
You can get something like "pagination" by saving the ordered Query as above, then repeatedly using the returned Snapshot and the original query:
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).limit(n).get() // page forward
MyOrderedQuery.endBefore(ArrayOfDocumentSnapshots[0]).limitToLast(n).get() // page back (limitToLast keeps the n documents just before the cursor)
This does make your state management more complex - you have to hold onto the ordered Query, and the last returned QuerySnapshot - but hey, now you're paginating.
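Here is a sketch of that page-forward loop. To keep it runnable without a Firebase project, it uses makeFakeQuery, an in-memory stand-in I made up that only mimics the startAfter(), limit() and get() shape of a query over an already-ordered array. fetchAllPages is also just an illustrative name. Against real Firestore you would pass in the saved ordered query instead.

```javascript
// Hypothetical in-memory stand-in for an ordered query (not part of Firestore).
function makeFakeQuery(docs, start = 0, max = Infinity) {
  return {
    startAfter(doc) { return makeFakeQuery(docs, docs.indexOf(doc) + 1, max); },
    limit(n) { return makeFakeQuery(docs, start, n); },
    async get() { return { docs: docs.slice(start, start + max) }; },
  };
}

// Walk the ordered query page by page, re-using the original query and the
// last document of the previous page as the cursor.
async function fetchAllPages(orderedQuery, pageSize) {
  const pages = [];
  let query = orderedQuery.limit(pageSize);
  while (true) {
    const snapshot = await query.get();
    if (snapshot.docs.length === 0) break;
    pages.push(snapshot.docs);
    const last = snapshot.docs[snapshot.docs.length - 1];
    query = orderedQuery.startAfter(last).limit(pageSize);
  }
  return pages;
}

fetchAllPages(makeFakeQuery(['doc5', 'doc12', 'doc9', 'doc4', 'doc1']), 2)
  .then((pages) => console.log(pages));
```

In a UI you would normally fetch one page per user action rather than loop to the end; the loop is only there to show the cursor hand-off.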
BIG NOTE
This is not terribly efficient - setting up a listener is fairly "expensive" for Firestore, so you don't want to do it often. Depending on your document size(s), you may want to "listen" to larger sections of your collections, and handle more of the paging locally (Redux or whatever) - the Firestore documentation indicates you want your listeners around for at least 30 seconds for efficiency. For some applications, even pages of 10 can be efficient; for others you may need 500 or more stored locally and paged in smaller chunks.
I want to retrieve the last 20 documents in my large collection in an efficient manner.
This SO post offered this performant solution, but it does not answer my question, because my question deals specifically with the _id index:
db.collectionName.find().min(minCriteria).hint(yourIndex).limit(N)
However, my collection just contains the default index (_id). I'm just not sure what min criteria would be - I obviously don't want to hardcode an _id value, as the collection is periodically emptied.
itemsCollection.find().min(<minCriteria>).hint({_id:1}).limit(20)
Is there any way to use min with the _id index? Or is my only option creating a new index?
Yes, you can use min with the _id index, as long as your <minCriteria> only references the _id field.
If your min criteria is on something other than _id, you will need to create an index on that criteria in order to avoid this query being a full collection scan.
The min() cursor method is for establishing a lower bound for the index scan that will service the query. This is probably not what you are looking for to retrieve the most recently added documents.
Assuming each document's _id field contains an ObjectId or some other value that sorts in the order they were inserted, then you can, as noted in the comments, do a reverse sort on _id and limit to the number of documents desired, which can be very efficient.
This query should automatically use the _id index:
db.itemsCollection.find().sort({_id:-1}).limit(20)
The date part of the ObjectId is determined by the system creating the value, which in some cases is a client/application server. This means that clock drift may affect the ordering.
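To show why the reverse sort on _id tends to track insertion time, here is a small sketch that works on ObjectId hex strings in plain JavaScript, without a MongoDB driver. It relies on the first 4 bytes of an ObjectId being a timestamp in seconds. The function names and the sample ids are made up for the illustration, and the ids are assumed to be lowercase hex.

```javascript
// The first 4 bytes (8 hex characters) of an ObjectId are a big-endian
// Unix timestamp in seconds, as set by whichever machine generated the id.
function objectIdToDate(hexId) {
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// In-memory analogue of find().sort({_id: -1}).limit(n) for hex-string ids:
// equal-length lowercase hex strings compare in the same order as their bytes.
function newestIds(hexIds, n) {
  return [...hexIds].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0)).slice(0, n);
}

const ids = [
  '507f1f77bcf86cd799439011',
  '5f50c31e1c9d440000a1b2c3',
  '4d3ed089fb60ab534684b7e9',
];
console.log(newestIds(ids, 2));
console.log(objectIdToDate(ids[0]).toISOString());
```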
If you want to get the documents that were most recently inserted into the collection, you can use natural order:
db.itemsCollection.find().sort({$natural:-1}).limit(20)
This doesn't use an index, but it should still be fairly performant because it will only scan the number of documents you want to return.
I am struggling to find good material on best practices for filtering data using Firebase Firestore. I want to filter my data based on the categories selected by the user. I have a collection of documents stored in my Firestore database, and each document has an array which holds all the appropriate categories for that single document. For the sake of filtering, I'm keeping a local array with the user's preferred categories as well. All I want to do is filter the data based on the user's preferred categories.
firestore categories field
Consider that I have the user's preferred categories stored as an array of strings (["Film", "Music"]). I was planning on using Firestore's 'array-contains' operator like this:
db.collection(collectionName)
.where('categoriesArray', 'array-contains', ["Film", "Music"])
Later I found out that I can't use 'array-contains' against an array itself and after investigating on this issue, I decided to change my data structure as mentioned here.
categories changed to Map
Once I changed the categories from an array to map, I thought I could use multiple where conditions to filter the documents
let query = db.collection(collectionName)
.where(somefield, '==', true)
this.props.data.filterCategories.forEach((val) => {
query = query.where(`categories.${val}`, '==', true);
});
query = query
.orderBy(someOtherField, "desc")
.limit(itemsPerPage)
const snapshot = await query.get()
Now for problem number 2: Firebase requires indexes for compound queries. The categories I have saved within each document are dynamic, and there's no way I can add these indexes in advance. What would be the ideal solution in such cases? Any help would be deeply appreciated.
This is a new feature of the Firebase JavaScript SDK, launched on November 7, 2019:
Version 7.3.0 - November 7, 2019
array-contains-any
"array-contains-any operator to combine up to 10 array-contains clauses on the same field with a logical OR. An array-contains-any query returns documents where the given field is an array that contains one or more of the comparison values"
citiesRef.where('regions', 'array-contains-any',
['west_coast', 'east_coast']);
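Since the operator takes at most 10 values, a user with more preferred categories than that would need several queries. A minimal sketch of the splitting step, assuming a made-up helper called chunk and invented category names (the query line at the end is only a comment, not executed):

```javascript
// Split a list into groups of at most `size` items (10 is the limit quoted above).
function chunk(values, size = 10) {
  const groups = [];
  for (let i = 0; i < values.length; i += size) {
    groups.push(values.slice(i, i + size));
  }
  return groups;
}

const preferred = Array.from({ length: 23 }, (_, i) => `category${i + 1}`);
const groups = chunk(preferred);
console.log(groups.map((g) => g.length)); // [ 10, 10, 3 ]

// Each group could then feed one query, for example:
// db.collection(collectionName).where('categoriesArray', 'array-contains-any', group)
```

The results of those queries would still need merging and de-duplicating on the client.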
Instead of iterating through each category that you wish to query and appending clauses to a single query object, each iteration should be its own independent query. And you can keep the categories in an array.
<document>
- itemId: abc123
- categories: [film, music, television]
If you wish to perform an OR query, you would make n-loops where each loop would query for documents where array-contains that category. Then on your end, you would dedup (remove duplicates) from the results based on the item's identifier. So if you wanted to query film or music, you would make 2 loops where the first iteration queried documents where array-contains film and the second loop queried documents where array-contains music. The results would be placed into the same collection and then you would simply remove all duplicates with the same itemId.
This also does not pose a problem with the composite-index limit because categories is a static field. The real problem comes with pagination because you would need to keep a record of all fetched itemId in case a future page of results returns an item that was already fetched and this would create an O(N^2) scenario (more on big-o notation: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/). And because you're deduping locally, pagination blocks as the user sees them are not guaranteed to be even. If each pagination block is set to 25 documents, for example, some pages may end up displaying 24, some 21, others 14, depending on how many duplicates were removed from each block.
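The merge-and-dedup step could look roughly like this. mergeByItemId is a name I made up, and the result arrays are invented sample data standing in for the documents returned by each per-category query. Using a Map keyed by itemId keeps each duplicate check constant time on average, so the bookkeeping itself need not be quadratic.

```javascript
// Merge several result lists, keeping the first occurrence of each itemId.
function mergeByItemId(resultSets) {
  const seen = new Map();
  for (const results of resultSets) {
    for (const item of results) {
      if (!seen.has(item.itemId)) seen.set(item.itemId, item);
    }
  }
  return [...seen.values()];
}

const filmResults = [{ itemId: 'abc123' }, { itemId: 'def456' }];
const musicResults = [{ itemId: 'abc123' }, { itemId: 'ghi789' }];
console.log(mergeByItemId([filmResults, musicResults]).map((i) => i.itemId));
```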
Are you planning on retrieving documents with the exact category array? Say, your user preference is listed as ["Film", "Music"]. Do you wish to retrieve only those documents with Film AND Music, or do you wish to retrieve documents having Film OR music?
If it's the latter, then maybe you can query for all documents with "Film", then query for all documents with "Music", and merge the results. The drawback here is some redundant document reads whenever a document has both "Film" and "Music" in the categoryArray field.
You can also explore using Algolia to enable full-text search. In this case, you'd probably store the category list as a string maybe separated by commas, then update the whole string when the user changes their preferences.
For the former case, I have not come across a workable solution other than maybe storing it as a concatenated string in alphabetical order. Others might have a more solid solution than mine.
Hope this helps!
Your query includes an orderBy clause. This, in combination with any equality filter, requires that you create an index to support that query. There is no way to avoid this.
If you remove the orderBy, you will be able to have flexible, dynamic filters for equality using the map properties in the document. This is the only way you will be able to have a dynamic filter without creating an index. This of course means that you will have to order and page the query results on the client.
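Ordering and paging on the client could be as simple as the sketch below. orderAndPage is a made-up helper, the createdAt field and the sample documents are invented, and it assumes the whole filtered result set has already been fetched into memory.

```javascript
// Sort a copy of the fetched documents by `field` (descending) and return one page.
function orderAndPage(docs, field, pageIndex, itemsPerPage) {
  const sorted = [...docs].sort((a, b) => {
    if (a[field] === b[field]) return 0;
    return a[field] < b[field] ? 1 : -1;
  });
  const start = pageIndex * itemsPerPage;
  return sorted.slice(start, start + itemsPerPage);
}

const fetched = [
  { id: 'x', createdAt: 3 }, { id: 'y', createdAt: 9 },
  { id: 'z', createdAt: 5 }, { id: 'w', createdAt: 1 },
];
console.log(orderAndPage(fetched, 'createdAt', 0, 2).map((d) => d.id)); // page 0
console.log(orderAndPage(fetched, 'createdAt', 1, 2).map((d) => d.id)); // page 1
```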
I have set up my data in my firebase realtime database as follows:
Key{
Creation date,
Popularity,
Rating,
Author}
Would it be possible to retrieve the answers to the following questions:
What were the top 50 games last month in terms of popularity?
What were the lowest ranked games this week?
What is the highest rated game today?
Answering one would answer the other one probably, but just to be sure I put them all.
All of these require that you consider multiple properties. And since the Firebase Realtime Database can only order/filter over a single property, you can't perform these without modifying the data model. For a good primer, read my answer here: Query based on multiple where clauses in Firebase
In this case, you'd need three extra properties for each node:
a property combining the month and popularity, for your first query. For example: "month_popularity": "201812_125" (if the popularity is 125).
a property combining the week and rating, for your second query. For example: "week_rating": "2018w51_4".
a property combining the day and rating, for your third query. For example: "day_rating": "20181225_4".
With these properties, you can then order on the interval you want, and filter the values for the range you want. For example to get the top 50 games for this month:
ref.orderByChild("month_popularity").startAt("201812_").endAt("201812~").limitToLast(50)
Where the ~ is just a character after _ in ASCII, ensuring that we stop returning results after the ones from this month.
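One thing to watch when building these combined properties: they are compared as strings, so "201812_99" would sort after "201812_125". A sketch of a key builder that zero-pads the number to avoid that is below. The function name and the padding width of 6 are my own assumptions, not something the database requires.

```javascript
// Build "YYYYMM_popularity" with a zero-padded number. The padding width (6)
// is an assumption; without it "201812_99" would sort after "201812_125".
function monthPopularityKey(date, popularity, width = 6) {
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  const padded = String(popularity).padStart(width, '0');
  return `${date.getUTCFullYear()}${month}_${padded}`;
}

const christmas = new Date(Date.UTC(2018, 11, 25));
console.log(monthPopularityKey(christmas, 125)); // 201812_000125
console.log(monthPopularityKey(christmas, 99));  // 201812_000099
```

Keys built this way still fall between "201812_" and "201812~", so the same range filter applies.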
I have a collection with thousands of documents. Is there a way I can query the collection and return the first 500 documents? Then I want to load the next 500 documents (501-1000), and so on.
docDbClient.queryDocuments(collection._self, 'SELECT * FROM d ORDER BY d._ts DESC').toArray(function(error, arr) {});
Since skip and take are not part of the query language today (though marked as "planned" on UserVoice), you'd need to come up with an alternative approach. Cosmos DB has built-in paging (with continuation tokens), which allows you to read a chunk of data at a time. You can specify the maximum item count per page, and then, as you're ready for the next page, perform the next read with the continuation token received from the previous read.
Or you can come up with your own scheme, perhaps based on some specific property you have.
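The general shape of reading with continuation tokens looks something like the sketch below. It deliberately does not use the Cosmos DB SDK: makeFakeReader is a made-up in-memory reader whose "token" is just the next offset as a string, whereas real continuation tokens are opaque values you simply pass back. readAllPages is also only an illustrative name.

```javascript
// Hypothetical in-memory page reader: returns up to maxItemCount items plus a
// continuation token (here just the next offset, as a string) or null when done.
function makeFakeReader(items) {
  return async function readPage(maxItemCount, continuationToken) {
    const start = continuationToken ? Number(continuationToken) : 0;
    const end = start + maxItemCount;
    return {
      items: items.slice(start, end),
      continuationToken: end < items.length ? String(end) : null,
    };
  };
}

// Read page after page, handing back the token from the previous read.
async function readAllPages(readPage, maxItemCount) {
  const pages = [];
  let token = null;
  do {
    const page = await readPage(maxItemCount, token);
    pages.push(page.items);
    token = page.continuationToken;
  } while (token !== null);
  return pages;
}

const readPage = makeFakeReader(Array.from({ length: 1200 }, (_, i) => i + 1));
readAllPages(readPage, 500).then((pages) => {
  console.log(pages.map((p) => p.length)); // [ 500, 500, 200 ]
});
```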