I have set up my data in my firebase realtime database as follows:
Key {
  Creation date,
  Popularity,
  Rating,
  Author
}
Would it be possible to retrieve the answers to the following questions:
what were the top 50 games last month in terms of popularity?
what were the lowest ranked games this week?
what is the highest rated game today?
Answering one would probably answer the others, but I listed them all just to be sure.
All of these require that you consider multiple properties. And since the Firebase Realtime Database can only order/filter over a single property, you can't perform these without modifying the data model. For a good primer, read my answer here: Query based on multiple where clauses in Firebase
In this case, you'd need three extra properties for each node:
a property combining the month and popularity, for your first query. For example: "month_popularity": "201812_125" (if the popularity is 125).
a property combining the week and rating, for your second query. For example: "week_rating": "2018w51_4".
a property combining the day and rating, for your third query. For example: "day_rating": "20181225_4".
With these properties, you can then order on the interval you want, and filter the values for the range you want. For example to get the top 50 games for this month:
ref.orderByChild("month_popularity").startAt("201812_").endAt("201812~")
Where the ~ is just a character after _ in ASCII, ensuring that we stop returning results after the ones from this month.
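To make this concrete, here is a minimal sketch of writing and querying those composite properties, assuming the Firebase web SDK (v8 namespaced API) and a hypothetical games node. Zero-padding the popularity keeps the string order consistent with the numeric order:

// Hypothetical ref; the field names mirror the examples above.
const gamesRef = firebase.database().ref("games");

// Write a game with the composite properties alongside the plain ones.
gamesRef.push({
  name: "Some game",
  popularity: 125,
  rating: 4,
  month_popularity: "201812_" + String(125).padStart(6, "0"),
  week_rating: "2018w51_4",
  day_rating: "20181225_4"
});

// Top 50 most popular games this month: order on the composite property,
// restrict the range to this month, and take the last (highest) 50 children.
gamesRef
  .orderByChild("month_popularity")
  .startAt("201812_")
  .endAt("201812~")
  .limitToLast(50)
  .once("value", (snapshot) => {
    snapshot.forEach((child) => console.log(child.key, child.val()));
  });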
Related
I have a collection, and the collection contains several documents.
Every document contains an age field.
The collection's ID is data.
There are 20 documents.
db.collection('data').orderBy('age')
My question is:
How do I get a certain range of those documents, which are already ordered by age?
Example:
After ordering by age:
[doc5, doc12, doc9, doc4, doc1, doc15, doc7, doc14, doc11, doc17,
doc3, doc2, doc13, doc8, doc6, doc18, doc20, doc16, doc19, doc10]
Fourth to Sixth (doc4, doc1, doc15)
Eleventh to Twelfth (doc3, doc2)
Fourteenth (doc8)
Firestore does not offer (actually, no longer offers) client-side queries that let you jump to an offset within the query results. You have to start from the beginning, and manually skip each result that you're not interested in, keeping track of the current index as you go. Yes, this will cost you excess document reads, but you don't have an alternative if you're not able to assign index values of your own for each document.
If you want to perform the query on a backend, you have offset() available, but the skipped documents are still counted as reads, so it is neither an efficient nor a cheap way of skipping unwanted results.
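As a minimal sketch of the manual skip, assuming the Firestore web SDK and the data collection from the question (the skipped documents are still read and billed):

// Positions are zero-based and inclusive, e.g. (3, 5) for the fourth to sixth documents.
async function getRangeByAge(startIndex, endIndex) {
  const snapshot = await db.collection("data")
    .orderBy("age")
    .limit(endIndex + 1)            // no need to read past the end of the range
    .get();
  return snapshot.docs.slice(startIndex); // drop the documents before the range
}

const docs = await getRangeByAge(3, 5); // doc4, doc1, doc15 in the example ordering

// On a backend (e.g. the Node.js Admin SDK), offset() is available instead,
// though the skipped documents are still billed as reads:
// db.collection("data").orderBy("age").offset(3).limit(3).get()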
I'm trying to create an activation code that should be unique but consists only of specific characters.
This is the solution I built:
function findByActivationId(activationId) {
  return Activation
    .findOne({ activationId })
    .lean()
    .exec();
}
let activationId = buildActivationId();
while (await findByActivationId(activationId)) {
activationId = buildActivationId();
}
This makes too many DB calls. Is there a better way to query MongoDB for this?
Well, the main problem with checking whether a key is unique comes down to how you create the keys in the first place.
Choose the approach that suits you best, to avoid a bunch of problems later.
Your own generated string as a key
You can do this, but it's important to understand a few caveats.
If you want to generate your own key in code and then check that it is unique against everything currently in the database, that can be done: create the key with your algorithm, select all existing keys from the database, and check whether the result set contains the freshly created string.
Problems with this solution
As you can see, you need to select all keys from the database and then compare each one to the freshly created key. That becomes a problem when the database stores a large amount of data: every time, the application has to download that data and compare it to the new key, which can cause noticeable slowdowns.
If you are sure your database will only ever hold a modest number of unique rows, though, this approach is fine to work with.
It is then important to create the keys properly. This is a question of complexity: the more symbols a key is built from, the harder it is to generate two identical ones.
Shall we take a look at this example?
If you create keys from the letters a-z and the digits 1-9,
and the key length is, say, 5, then the key space is 35^5,
which is more than 52 million possibilities.
The same key can still be generated twice, but that is like winning the lottery: highly unlikely.
You then simply check whether the generated key really is unique, and if not (oh come on), repeat.
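For illustration, here is one way a generator like the question's buildActivationId() might look under these assumptions (a-z plus 1-9, length 5); this is just a sketch, not the asker's actual implementation:

function buildActivationId(length = 5) {
  // 26 letters + 9 digits = 35 symbols, so 35^5 ≈ 52 million combinations at length 5.
  const alphabet = "abcdefghijklmnopqrstuvwxyz123456789";
  let id = "";
  for (let i = 0; i < length; i += 1) {
    id += alphabet[Math.floor(Math.random() * alphabet.length)];
  }
  return id;
}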
Other ways
Use MongoDB's _id, which is always unique (see the sketch below)
Use a UNIX timestamp to build a unique key
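As a minimal sketch of the _id option, assuming the Mongoose Activation model from the question (MongoDB guarantees _id uniqueness, so no pre-insert check or extra round trip is needed):

const activation = await Activation.create({ /* activation fields */ });
const activationId = activation._id.toString(); // e.g. "5dcd7a1f9c3b2a0017e4b9d2"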
I want to retrieve the last 20 documents in my large collection in an efficient manner.
This SO post offered the following performant solution, but it does not answer my question, because my question deals specifically with the _id index:
db.collectionName.find().min(minCriteria).hint(yourIndex).limit(N)
However, my collection just contains the default index (_id). I'm just not sure what min criteria would be - I obviously don't want to hardcode an _id value, as the collection is periodically emptied.
itemsCollection.find().min(<minCriteria>).hint({_id:1}).limit(20)
Is there any way to use min with the _id index? Or is my only option creating a new index?
Yes, you can use min with the _id index, as long as your <minCriteria> only references the _id field.
If your min criteria is on something other than _id, you will need to create an index on that criteria in order to avoid this query being a full collection scan.
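For example (mongo shell), assuming a hypothetical createdAt field used as the min criteria:

db.itemsCollection.createIndex({ createdAt: 1 })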
The min() cursor method is for establishing a lower bound for the index scan that will service the query. This is probably not what you are looking for to retrieve the most recently added documents.
Assuming each document's _id field contains an ObjectId or some other value that sorts in the order they were inserted, then you can, as noted in the comments, do a reverse sort on _id and limit to the number of documents desired, which can be very efficient.
This query should automatically use the _id index:
db.itemsCollection.find().sort({_id:-1}).limit(20)
The date part of the ObjectId is determined by the system creating the value, which in some cases is a client/application server. This means that clock drift may affect the ordering.
If you want to get the documents that were most recently inserted into the collection, you can use natural order:
db.itemsCollection.find().sort({$natural:-1}).limit(20)
This doesn't use an index, but it should still be fairly performant because it will only scan the number of documents you want to return.
I am struggling to find good material on best practices for filtering data using Firebase Firestore. I want to filter my data based on the categories selected by the user. I have a collection of documents stored in my Firestore database, and each document has an array holding all the appropriate categories for that single document. For the sake of filtering, I'm keeping a local array with the user's preferred categories as well. All I want to do is filter the data based on the user's preferred categories.
[screenshot: Firestore categories field]
Consider that I have the user's preferred categories stored as an array of strings (["Film", "Music"]). I was planning on using Firestore's array-contains method, like:
db.collection(collectioname)
.where('categoriesArray', 'array-contains', ["Film", "Music"])
Later I found out that I can't use 'array-contains' against an array itself and after investigating on this issue, I decided to change my data structure as mentioned here.
[screenshot: categories changed to a map]
Once I changed the categories from an array to a map, I thought I could use multiple where conditions to filter the documents:
let query = db.collection(collectionName)
  .where(somefield, '==', true);

this.props.data.filterCategories.forEach((val) => {
  query = query.where(`categories.${val}`, '==', true);
});

query = query
  .orderBy(someOtherField, "desc")
  .limit(itemsPerPage);

const snapshot = await query.get();
Now, problem number 2: Firestore requires you to add indexes for compound queries. The categories saved within each document are dynamic, and there's no way I can add these indexes in advance. What would be the ideal solution in such cases? Any help would be deeply appreciated.
This is a new feature of the Firebase JavaScript SDK launched on November 7, 2019:
Version 7.3.0 - November 7, 2019
array-contains-any
"array-contains-any operator to combine up to 10 array-contains clauses on the same field with a logical OR. An array-contains-any query returns documents where the given field is an array that contains one or more of the comparison values"
citiesRef.where('regions', 'array-contains-any',
['west_coast', 'east_coast']);
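Applied to the question's data, a hedged sketch (assuming the original categoriesArray field is kept as a plain array of strings):

const snapshot = await db.collection(collectionName)
  .where('categoriesArray', 'array-contains-any', ["Film", "Music"])
  .limit(itemsPerPage)
  .get();

Note that re-adding the orderBy from the original query would still require a composite index, but only a single one, since categoriesArray is one static field.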
Instead of iterating through each category that you wish to query and appending clauses to a single query object, each iteration should be its own independent query. And you can keep the categories in an array.
<document>
- itemId: abc123
- categories: [film, music, television]
If you wish to perform an OR query, you would make n-loops where each loop would query for documents where array-contains that category. Then on your end, you would dedup (remove duplicates) from the results based on the item's identifier. So if you wanted to query film or music, you would make 2 loops where the first iteration queried documents where array-contains film and the second loop queried documents where array-contains music. The results would be placed into the same collection and then you would simply remove all duplicates with the same itemId.
This also does not pose a problem with the composite-index limit because categories is a static field. The real problem comes with pagination because you would need to keep a record of all fetched itemId in case a future page of results returns an item that was already fetched and this would create an O(N^2) scenario (more on big-o notation: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/). And because you're deduping locally, pagination blocks as the user sees them are not guaranteed to be even. If each pagination block is set to 25 documents, for example, some pages may end up displaying 24, some 21, others 14, depending on how many duplicates were removed from each block.
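A minimal sketch of the per-category queries plus local dedup described above, assuming the document shape shown (itemId plus a categories array) and a hypothetical items collection:

async function queryByCategories(categories) {
  // One independent array-contains query per category, run in parallel.
  const snapshots = await Promise.all(
    categories.map((category) =>
      db.collection("items")
        .where("categories", "array-contains", category)
        .get()
    )
  );

  // Merge the result sets and drop duplicates by itemId.
  const byItemId = new Map();
  snapshots.forEach((snapshot) =>
    snapshot.docs.forEach((doc) => byItemId.set(doc.get("itemId"), doc))
  );
  return [...byItemId.values()];
}

// OR query: film or music.
const results = await queryByCategories(["film", "music"]);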
Are you planning on retrieving documents with the exact category array? Say, your user preference is listed as ["Film", "Music"]. Do you wish to retrieve only those documents with Film AND Music, or do you wish to retrieve documents having Film OR music?
If it's the latter, then maybe you can query for all documents with "Film", then query for all documents with "Music", and merge the results. The drawback here is some redundant document reads when a document has both "Film" and "Music" in the categoryArray field.
You can also explore using Algolia to enable full-text search. In this case, you'd probably store the category list as a string maybe separated by commas, then update the whole string when the user changes their preferences.
For the former case, I have not come across a workable solution, other than maybe storing the categories as a concatenated string in alphabetical order. Others might have a more solid solution than mine.
Hope this helps!
Your query includes an orderBy clause. This, in combination with any equality filter, requires that you create an index to support that query. There is no way to avoid this.
If you remove the orderBy, you will be able to have flexible, dynamic filters for equality using the map properties in the document. This is the only way you will be able to have a dynamic filter without creating an index. This of course means that you will have to order and page the query results on the client.
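A minimal sketch of that index-free variant, reusing the field names from the question's code: equality filters on the map only, no orderBy, with ordering and paging done locally:

let query = db.collection(collectionName).where(somefield, '==', true);
this.props.data.filterCategories.forEach((val) => {
  query = query.where(`categories.${val}`, '==', true);
});

// Note: all matching documents are fetched; ordering and paging happen after the fact.
const snapshot = await query.get();
const page = snapshot.docs
  .map((doc) => ({ id: doc.id, ...doc.data() }))
  .sort((a, b) => b[someOtherField] - a[someOtherField]) // "desc", assuming a numeric field
  .slice(0, itemsPerPage);                               // paging applied locally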
I want to set data with a timestamp priority in the past or future, but not at the current date, and then be able to make queries with startAt and endAt for specific dates (365 days).
The push method is great for assigning unique IDs to data and managing order. Is there any method to generate a unique push-ID-like key, as push() does, for a timestamp in the past or future?
You can attempt to create unique ids similar to what push does, but this seems like a lot of work for little gain when there are built in tools in Firebase to order data. The simplest answer is to set a priority on each record using the server timestamp.
ref.push({ ...data..., ".priority": Firebase.ServerValue.TIMESTAMP });
To set one in the future or past, specify the timestamp manually.
ref.push({ ...data..., ".priority": timeInTheFuture });
.info/serverTimeOffset may also be helpful here for handling latency.
To create push ids, you would do something similar to the following:
Get the current timestamp and pad it to a fixed length (e.g. 16 characters)
Append a random series of digits, such as a random number or hash, also padded to a fixed length
Your entry will now look something like this: 000128198239:KHFDBWYBEFIWFE
You now have a lexicographically sortable id based on a timestamp, which is unique
Here's a helpful discussion on sorting numbers lexicographically
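Putting those steps together, here is a minimal sketch of such a key generator, assuming millisecond timestamps and a random numeric suffix (the exact widths and separator are just illustrative):

function timestampPushId(timestampMs) {
  const time = String(timestampMs).padStart(16, "0");                         // sortable, fixed-width prefix
  const random = String(Math.floor(Math.random() * 1e12)).padStart(12, "0");  // uniqueness suffix
  return `${time}:${random}`;
}

// A key for one year from now:
const key = timestampPushId(Date.now() + 365 * 24 * 60 * 60 * 1000);
// Keys sort lexicographically by timestamp, so startAt/endAt over a date range works.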