Firestore Comparing two collection documents is very slow - javascript

I'm currently uploading the phone contacts of my mobile users to a Firestore database.
On average a user has ~500 phone numbers, and I expect to have ~1000 users.
My structure is as follows:
--- users_contacts (collection)
--- uid (document)
--- contacts (subcollection)
--- phoneNumberOfContact (document)
--- phoneNumberOfContact: true
I also have another collection where I store general phone numbers that I want to compare a specific user's contacts with.
It holds about 50,000 phone numbers, each as a document. This will grow considerably later, maybe to 1 million.
--- db_contacts (collection)
--- phoneNumber (document)
--- phoneNumber: true
I'm trying to find the numbers that a specific, known uid has in common with the db_contacts collection, i.e. how many numbers of a known uid exist in db_contacts.
My Cloud Function works as follows:
Fetch all the phone numbers of the user's contacts
At first I wanted to fetch only the ids of the user's documents, since the id is the phone number, hoping it would make the process faster. But it seems that is not possible in JavaScript, as snapshotChanges() does not exist.
Loop through the fetched contacts and check whether each contact exists in db_contacts. Since I already know the reference path to check, this should be fast
Return all the common contacts
If there were an alternative to snapshotChanges() in JavaScript, my script would run much faster. Is my way of thinking correct?
What I did so far:
exports.findCommonNumbers = functions.https.onCall((data, context) => {
  return new Promise((resolve, reject) => {
    fetchCommonNumbers().then((commonNumbers) => {
      resolve(commonNumbers);
    }).catch((err) => {
      reject("Error Occured");
    });
  });
});
async function fetchCommonNumbers() {
  var commonNumbers = [];
  let contactsReference = admin.firestore().collection("user_contacts").doc("iNaYVsDCg3PWsDu67h75xZ9v2vh1").collection("contacts");
  const dbContactReference = admin.firestore().collection('db_contacts');
  let userContacts = await contactsReference.get();
  userContacts = userContacts.docs;
  for(var i in userContacts){
    var userContact = userContacts[i];
    const DocumentID = userContact.id;
    //Check if Document exist
    const dbContact = await dbContactReference.doc(DocumentID).get();
    if (dbContact.exists) {
      commonNumbers.push(DocumentID);
    }
  }
  return Promise.resolve(commonNumbers);
}
The function findCommonNumbers is taking 60 seconds to execute. It has to be much faster. How can I make it faster?

When you're looking for documents in common, you're fetching one, waiting for it to come back, fetching the next, waiting for it... I haven't used async/await before, but I'd do something like:
Promise.all(userContacts.map(userContact => {
  const DocumentID = userContact.id;
  //Check if Document exists
  return dbContactReference.doc(DocumentID).get().then(dbContact => {
    if (dbContact.exists) {
      commonNumbers.push(DocumentID);
    }
  });
}));
Sorry for the code fragment and mistakes; I'm on mobile. This should request them all at once.
Edit: to return after something other than all have returned:
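new Promise((resolve, reject) => {
  var returned = 0;
  userContacts.forEach(userContact => {
    const DocumentID = userContact.id;
    //Check if Document exists
    dbContactReference.doc(DocumentID).get().then(dbContact => {
      if (dbContact.exists) commonNumbers.push(DocumentID);
      returned++;
      if (returned == userContacts.length || commonNumbers.length >= 5) {
        resolve(commonNumbers);
      }
    }).catch(reject);
  });
});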
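A possible middle ground between the one-at-a-time loop and firing every lookup at once is to check the contacts in fixed-size chunks, with each chunk running in parallel. This is only a sketch: the helper name `findCommonIds`, the `exists` callback and the chunk size of 100 are assumptions for illustration, and the in-memory `dbContacts` set merely stands in for the db_contacts collection.

```javascript
// Hypothetical helper: checks ids in chunks so that at most `chunkSize`
// lookups are in flight at once. `exists(id)` must return a Promise<boolean>.
async function findCommonIds(ids, exists, chunkSize = 100) {
  const common = [];
  for (let start = 0; start < ids.length; start += chunkSize) {
    const chunk = ids.slice(start, start + chunkSize);
    // All lookups of one chunk run concurrently; chunks run one after another.
    const flags = await Promise.all(chunk.map((id) => exists(id)));
    chunk.forEach((id, i) => {
      if (flags[i]) common.push(id);
    });
  }
  return common;
}

// Stand-in for the db_contacts collection (assumption, for demonstration only).
const dbContacts = new Set(['+100', '+300']);
const fakeExists = (id) => Promise.resolve(dbContacts.has(id));

findCommonIds(['+100', '+200', '+300'], fakeExists, 2)
  .then((common) => console.log(common));
```

In the Cloud Function, `exists` would presumably wrap the lookup already used above, e.g. `id => dbContactReference.doc(id).get().then(d => d.exists)`.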


IndexedDB update/put not returning updated object in the expected state

I have an IndexedDB database that I've created with two object stores: 'games' and 'plays' (in reference to football). I am letting IDB create the keys for each store via 'autoincrement'. The 'games' store can have multiple games and, likewise, there will be multiple plays for each game. Later, I export these stores via JSON to PHP and attempt to correlate the plays that took place in game 1 (for example) with that game, and so on. I am using a 'foreign key'-like value (a gameID attribute) in the plays store to indicate that the play goes with a certain game. However, upon JSON export of the two stores, I have found that the 'games' store does not have its key value exported, and therefore I cannot reliably connect a play (which has a reference to 'gameID') to a particular game (which does not contain the reference within its structure).
So I thought the answer would be simple: create a value called 'gameID' within the 'game' store and, once I have that id, update the record in the store with the gameID value.
The problem is that I've written IDB 'update' or 'put' code which seems to be 'successful', yet when I get the game in question later, the value is not correct. My updates are not changing the data structures as I would expect to see them in Chrome Developer Tools. Below is an example of what I am talking about:
[Screenshot: object in question in Chrome Developer Tools]
Above you can see the issue graphically, and I'm not sure what is happening. In the areas marked "A" and "C", the updated values are listed (I do a similar update later to mark a game 'complete' at the end of a game). However, the actual data structure in IDB (indicated with "B") shows the old values that I thought I had updated successfully. So I'm not at all sure how to read this structure in Chrome Developer Tools, which seems to report the updates that were made separately from the object itself.
I've tried doing this update both by passing the gameID in question and via a cursor.
function putGameID (conn, thisGameID) {
  return new Promise((resolve, reject) => {
    const tx = conn.transaction(['gamesList'], 'readwrite');
    const gameStore = tx.objectStore(['gamesList']);
    const gameRequest = gameStore.get(thisGameID);
    gameRequest.onsuccess = () => {
      const game = gameRequest.result;
      game.gameID = thisGameID;
      const updateGameRequest = gameStore.put(game);
      updateGameRequest.onsuccess = () => {
        console.log("Successfully updated this game ID.");
        resolve(game);
      };
    };
  });
}
It appears the record was updated, just not in the manner I would expect.
I've also attempted this using a cursor update, to similar effect:
function putGameID (conn, thisGameID) {
  return new Promise((resolve, reject) => {
    const tx = conn.transaction(['gamesList'], 'readwrite');
    const gameStore = tx.objectStore(['gamesList']);
    gameStore.openCursor().onsuccess = function(event) {
      const cursor = event.target.result;
      if (cursor) {
        if (!cursor.value.gameID) {
          const updatedGame = cursor.value;
          updatedGame.gameID = thisGameID;
          const request = cursor.update(updatedGame);
          request.onsuccess = () => resolve(updatedGame);
        }
      }
    };
  });
}
Can someone help me understand:
(1) How do I read the structure in Chrome Developer Tools? Why are the updated values not part of the object's structure?
and ...
(2) How can I modify my approach to get the results that I wish to achieve?
As requested, this is the code that originally creates the two object stores; it is called upon entry into the form:
async function idbConnect(name, version) {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(name, DBVersion);
    request.onupgradeneeded = function(event) {
      //if (!request.objectStoreNames.contains('gamesList')) {
      console.log('Doing indexeddb upgrade');
      db = request.result;
      /*Create the two stores - plays and games. */
      playObjectStore = db.createObjectStore('playsList',{keyPath: "id", autoIncrement:true});
      gameObjectStore = db.createObjectStore('gamesList',{keyPath: "gameID", autoIncrement:true});
      /* Create indexes */
      playObjectStore.createIndex("playIDIdx","id", {unique:false});
      playObjectStore.createIndex("gamePlayIDIdx","gameID", {unique:false});
      playObjectStore.createIndex("playCreatedDateIdx","createdDate", {unique:false});
      gameObjectStore.createIndex("gameIDIdx","gameID", {unique:true});
      gameObjectStore.createIndex("gameCreatedDateIdx","createdDate", {unique:false});
      //return db;
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
    request.onblocked = () => { console.log('blocked'); };
  });
}
This code makes the call to add the game:
try {
  conn = await idbConnect(DBName,DBVersion);
  game = await addGameIDB(conn);
  // Understand what is going on in the line below.
  //Saving the game ID to populate.
  globalGameID = game.gameID;
  // Here is where I'm attempting to update the gameID....
  await putGameID(conn, globalGameID);
} catch (err) {
  console.error(err);
}
Once the stores are created, the following code adds a game:
function addGameIDB(conn) {
  return new Promise((resolve, reject) => {
    // some irrelevant stuff to format dates, etc....
    let newGame = [
      {
        gameID: null, // What I'd like to populate....
        gameDate: thisGameDate,
        gameTime: thisGameTime,
        team1Name: thisTeamOne,
        team2Name: thisTeamTwo,
        gameCompleted: false,
        createdDate: d
      }
    ];
    db = conn.transaction('gamesList','readwrite');
    let gameStore = db.objectStore('gamesList');
    let gameRequest = gameStore.add(newGame);
    gameRequest.onsuccess = (ev) => {
      console.log('Successfully inserted a game object');
      const newGameRequest = gameStore.get(gameRequest.result);
      newGameRequest.onsuccess = () => {
        resolve(newGameRequest.result);
      };
    };
    gameRequest.onerror = (err) => {
      console.log('error attempting to insert game object' + err);
      reject(err);
    };
  });
}
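One thing that may be worth checking, offered as a hedged observation rather than a confirmed diagnosis: addGameIDB opens newGame with `[`, which suggests the game is stored as an array wrapping the game object. In plain JavaScript, a property assigned to an array is kept apart from the array's elements and is dropped by JSON.stringify, which would be consistent with updated values showing up beside, rather than inside, the stored structure. The sketch below only demonstrates that language behaviour; the sample values are made up and nothing here touches IndexedDB.

```javascript
// Illustrative values only (assumption); this does not use IndexedDB.
const asArray = [{ gameID: null, team1Name: 'A', team2Name: 'B' }];
asArray.gameID = 7; // lands on the array itself, not on the wrapped game object
const arrayJson = JSON.stringify(asArray);

const asObject = { gameID: null, team1Name: 'A', team2Name: 'B' };
asObject.gameID = 7; // updates the record itself
const objectJson = JSON.stringify(asObject);

console.log(arrayJson);  // the array's own gameID property is not serialized
console.log(objectJson); // the plain object carries the new gameID
```

If that is what is happening, storing the plain object (without the surrounding brackets) would let the key path and later updates land on the game record itself.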

How to retrieve all documents in a collection and see if a document exists in Firebase Firestore?

This is how my schema looks
Current Implementation:
for (let i=0; i<data.length; i++) {
  var ifPresent = db.collection("Safes-Hardware").doc(data[i]['Mac address Check']);
  ifPresent.get()
  .then(async (doc)=>{
    if (!doc.exists) {
      // Do stuff
    } else {
      //Do stuff
    }
  });
}
return { message: "Success is within the palm of our hands." }
Even though this code does the job, I'm doing a lookup for every item in the array, and this sometimes results in a socket hang-up.
So I'm thinking of getting all the documents in the collection in one go, storing them locally, and checking locally whether a document exists instead of querying the database every time.
How do I implement this?
You can just use collection("Safes-Hardware").get().then() and save the data locally:
let collection = []
db.collection("Safes-Hardware").get().then(function(querySnapshot) {
  collection = querySnapshot.docs.map(doc => ({
    id: doc.id,
    ...doc.data()
  }))
})
Then you can search collection for what you want, maybe like this:
data.forEach( doc => {
  let x = collection.find(v => v.id === doc['Mac address Check'])
  if (x) {
    //it exists
  } else {
    // not exists
  }
})
But take care: you are trading bandwidth for the number of requests, and the lookup is an O(n^2) operation on the client side.
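The quadratic cost mentioned above comes from scanning the whole local array for every incoming row. A minimal sketch of one way around it, assuming the document ids have already been fetched once, is to put them in a `Set` so each membership test is constant time. The helper name `splitByExistence` and the sample MAC addresses are hypothetical.

```javascript
// Hypothetical helper: splits `rows` into those whose `key` value is a known
// document id and those whose value is not.
function splitByExistence(existingIds, rows, key) {
  const known = new Set(existingIds); // O(1) membership test per row
  const present = [];
  const missing = [];
  for (const row of rows) {
    (known.has(row[key]) ? present : missing).push(row);
  }
  return { present, missing };
}

// Sample data (assumption): ids as they might come back from the collection.
const result = splitByExistence(
  ['aa:01', 'aa:02'],
  [{ 'Mac address Check': 'aa:01' }, { 'Mac address Check': 'ff:ff' }],
  'Mac address Check'
);
console.log(result.present.length, result.missing.length);
```

With the snapshot from the answer above, the first argument could be built as `querySnapshot.docs.map(d => d.id)`.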

Creating objects in Model.js, NodeJs

In my model.js (using mongoose) , I am initially creating 40 objects in model.js which are to be used in the entire program. No other function in any file creates more objects but only updates the existing ones.
My model.js
var TicketSchema = mongoose.model('Tickets', TicketSchema);
for(let i = 1;i<=40;i++) {
  var new_ticket = new TicketSchema({ticket_number:i});
  new_ticket.save(function(err, ticket) { if (err) console.log(err); });
}
The problem is that I noticed there were many more than 40 objects after some time. I wanted to know whether model.js runs more than once during execution, or whether this is just due to repeatedly calling npm run start and then closing the server.
Also, is there a better way of initially creating objects which are to be used for the entire program?
It will create 40 new documents every time you start the server. You can use this function to avoid creating them if the records already exist, by checking the count.
const TicketModel = mongoose.model('Tickets', TicketSchema);
const insertTicketNumber = async () => {
  try {
    const count = await TicketModel.countDocuments({});
    if (count) return;
    await TicketModel.create(
      [...Array(40).keys()]
        .map(i => i + 1)
        .map(number => ({ ticket_number: number }))
    );
  } catch (error) {
    console.log(error);
  }
};
insertTicketNumber();
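The seed-once idea above can be tried without a database. The sketch below separates building the 40 ticket objects from the "only insert when empty" check; `buildTickets`, `seedTicketsOnce` and the in-memory `createFakeModel` are hypothetical names introduced here, and the fake model only mimics the `countDocuments` and `insertMany` methods that a Mongoose model exposes.

```javascript
// Hypothetical helper: builds [{ ticket_number: 1 }, ..., { ticket_number: count }].
function buildTickets(count) {
  return Array.from({ length: count }, (_, i) => ({ ticket_number: i + 1 }));
}

// Seeds only when the collection is empty. `model` is anything exposing
// countDocuments() and insertMany(), for example a Mongoose model.
async function seedTicketsOnce(model, count) {
  const existing = await model.countDocuments({});
  if (existing > 0) return 0; // already seeded on an earlier start
  const docs = buildTickets(count);
  await model.insertMany(docs);
  return docs.length;
}

// In-memory stand-in for a model, used only to exercise the helper.
function createFakeModel() {
  const rows = [];
  return {
    rows,
    countDocuments: async () => rows.length,
    insertMany: async (docs) => { rows.push(...docs); return docs; },
  };
}

const fakeModel = createFakeModel();
seedTicketsOnce(fakeModel, 40)
  .then(() => seedTicketsOnce(fakeModel, 40)) // second "server start" inserts nothing
  .then(() => console.log(fakeModel.rows.length));
```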

How many requests can Node-Express fire off at once?

I have a script that is pulling 25,000 records from AWS Athena, which is basically a PrestoDB relational SQL database. Let's say that I'm generating a request for each one of these records, which means I have to make 25,000 requests to Athena; then, when the data comes back, I have to make 25,000 requests to my Redis cluster.
What would be the ideal number of requests to make at one time from Node to Athena?
The reason I ask is that I tried to do this by creating an array of 25,000 promises and then calling Promise.all(promiseArray) on it, but the app just hung forever.
So I decided instead to fire off one at a time, using recursion to splice the first record out and then pass the remaining records to the calling function after the promise has resolved.
The problem with this is that it takes forever. I took about an hour's break and came back, and there were still 23,000 records remaining.
I tried to google how many requests Node and Athena can handle at once, but I came up with nothing. I'm hoping someone might know something about this and be able to share it with me.
Thank you.
As a side note, what I would like to do differently is, instead of sending one request at a time, send 4, 5, 6, 7 or 8 at a time, depending on how fast they execute.
Also, how would a Node cluster affect the performance of something like this?
Here is my code just for reference:
exports.storeDomainTrends = () => {
return new Promise((resolve, reject)=>{
athenaClient.execute(`SELECT DISTINCT the_column from "the_db"."the_table"`,
(err, data) => {
var getAndStoreDomainData = (records) => {
return new Promise((resolve, reject) => {
var record = records.splice(0, 1)[0]
SUM(field) as field
FROM "the_db"."the_table"
WHERE the_field IN ('Month') AND the_field = '`+ record.domain_name +`'
GROUP BY the_field, the_field, the_field
`, (err, domainTrend) => {
if(err) {
redisClient.set(('Some String' + domainTrend[0].domain_name), JSON.stringify(domainTrend))
.then(res => {
Using the lib your code could look something like this:
const Fail = function(reason){this.reason=reason;};
const isFail = x=>(x&&x.constructor)===Fail;
const distinctDomains = () =>
new Promise(
`SELECT DISTINCT domain_name from "endpoint_dm"."bd_mb3_global_endpoints"`,
? reject(err)
: resolve(data)
const domainDetails = domain_name =>
new Promise(
SUM(endpoint_count) as endpoint_count
FROM "endpoint_dm"."bd_mb3_global_endpoints"
WHERE agg_type IN ('Month') AND domain_name = '${domain_name}'
GROUP BY timeframe_end_date, agg_type, domain_name`,
(err, domainTrend) =>
? reject(err)
: resolve(domainTrend)
const redisSet = keyValue =>
new Promise(
? reject(err)
: resolve(res)
const process = batchSize => limitFn => resolveValue => domains =>
.map(//map domains to promises
//maximum 5 active connections
//the redis client documentation makes no sense whatsoever
//no mention of a callback
//mentions a callback, since we need the return value
//and best to do it async we will use callback to promise
`Endpoint Profiles - Checkin Trend by Domain - Monthly - ${domainTrend[0].domain_name}`,
//here is where things get unpredictable, set is documented as
// a synchronous function returning "OK" or a function that
// takes a callback but no mention of what that callback receives
// as response, you should try with one or two records to
// finish this by reverse engineering because documentation
// fails 100% here and can not be relied upon.
console.log("bad documentation of redis client... reply is:",redisReply);
? domain
: Promise.reject(`Redis reply not OK:${redisReply}`)
.catch(//catch failed, save error and domain of failed item
new Fail([e,domain])
console.log(`got ${batchSize} results`);
const left = domains.slice(batchSize);
if(left.length===0){//nothing left
return resolveValue.concat(results);
//recursively call process until done
return process(batchSize)(limitFn)(resolveValue.concat(results))(left)
const max5 = lib.throttle(5);//max 5 active connections to athena
distinctDomains()//you may want to limit the results to 50 for testing
//you may want to limit batch size to 10 for testing
.then(process(1000)(max5)([]))//we have 25000 domains here
results=>{//have 25000 results
const successes = results.filter(x=>!isFail(x));
//array of failed items, a failed item has a .reason property
// that is an array of 2 items: [the error, domain]
const failed = results.filter(isFail);
You should figure out what the Redis client does; I tried to work it out from the documentation, but I may as well ask my goldfish. Once you've reverse-engineered the client's behavior, it is best to try a small batch size first to see if there are any errors. You have to import lib to use it; you can find it here.
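For readers without access to the lib used above, the "at most N requests in flight" idea can be sketched with a small worker pool built on native promises. This is not the answer's `lib.throttle`; the helper name `mapWithLimit` and the shape of the failure marker are assumptions made for illustration. Failures are recorded per item instead of rejecting the whole run, in the same spirit as the `Fail` objects above.

```javascript
// Hypothetical helper: runs `worker` over `items` with at most `limit` calls
// in flight. A failed item becomes { failed: true, reason } in the results.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  async function run() {
    while (next < items.length) {
      const i = next++; // safe: no await between the check and the increment
      try {
        results[i] = await worker(items[i]);
      } catch (err) {
        results[i] = { failed: true, reason: err };
      }
    }
  }

  // Start `limit` runners; each pulls the next item as soon as it is free.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}

// Demonstration with a fake asynchronous worker (stands in for an Athena query).
const fakeQuery = (n) => new Promise((resolve) => setTimeout(() => resolve(n * 2), 5));
mapWithLimit([1, 2, 3, 4, 5], 2, fakeQuery).then((out) => console.log(out));
```

The limit (5 in the answer, 4 to 8 in the question) would still need to be tuned against whatever Athena and Redis tolerate in practice.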
I was able to take what Kevin B said and find a much quicker way to query the data. I changed the query so that I could get the trend for all domains from Athena in one go. I ordered it by domain_name and then consumed it as a Node stream, so that I could separate each domain name out into its own JSON as the data was coming in.
Anyway, this is what I ended up with.
exports.storeDomainTrends = () => {
  return new Promise((resolve, reject)=>{
    var streamObj = athenaClient.execute(`
      SELECT field,
      SUM(field) AS field
      FROM "db"."table"
      WHERE field IN ('Month')
      GROUP BY field, field, field
      ORDER BY field desc`).toStream();
    var data = [];
    streamObj.on('data', (record)=>{
      if (!data.length || record.field === data[0].field){
        data.push(record)
      } else if (data[0].field !== record.field){
        redisClient.set(('Key'), JSON.stringify(data))
        data = [record]
      }
    })
    streamObj.on('end', resolve);
    streamObj.on('error', reject);
  })
}
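The grouping step in the stream handler above can be isolated and tried on plain arrays. The sketch below assumes records arrive sorted by a key and hands each completed group to a callback; `createGrouper` and the sample `domain_name` values are hypothetical. It also includes an explicit `flush()`, because in a scheme like this the final group is only written out if something emits it when the stream ends.

```javascript
// Hypothetical helper: groups records that arrive sorted by `keyOf` and hands
// each complete group to `onGroup`. Call flush() once the stream has ended.
function createGrouper(keyOf, onGroup) {
  let group = [];
  return {
    push(record) {
      // A change of key means the previous group is complete.
      if (group.length && keyOf(group[0]) !== keyOf(record)) {
        onGroup(group);
        group = [];
      }
      group.push(record);
    },
    flush() {
      if (group.length) onGroup(group);
      group = [];
    },
  };
}

// Demonstration with an in-memory "stream" (sample data is an assumption).
const stored = [];
const grouper = createGrouper((r) => r.domain_name, (g) => stored.push(g));
[
  { domain_name: 'a.example', n: 1 },
  { domain_name: 'a.example', n: 2 },
  { domain_name: 'b.example', n: 3 },
].forEach((r) => grouper.push(r));
grouper.flush(); // would be called from the stream's 'end' handler
console.log(stored.length);
```

In the stream version, `onGroup` would presumably be the `redisClient.set(...)` call and `flush()` would run in the 'end' handler before resolving.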

How to use "q" module for refactoring mongoose code?

I'm using mongoose to insert some data into mongodb. The code looks like:
var mongoose = require('mongoose');
var conn = mongoose.connection;
// insert users
conn.collection('users').insert([{/*user1*/},{/*user2*/}], function(err, docs) {
  var user1 = docs[0], user2 = docs[1];
  // insert channels
  conn.collection('channels').insert([{userId:user1._id},{userId:user2._id}], function(err, docs) {
    var channel1 = docs[0], channel2 = docs[1];
    // insert articles
    conn.collection('articles').insert([{userId:user1._id,channelId:channel1._id},{}], function(err, docs) {
      var article1 = docs[0], article2 = docs[1];
    });
  });
});
You can see there are a lot of nested callbacks there, so I'm trying to use q to refactor it.
I hope the code will look like:
Q.fcall(step1).then(step2).then(step3).then(step4)
.then(function (value4) {
  // Do something with value4
}, function (error) {
  // Handle any error from step1 through step4
})
.done();
But I don't know how to do it.
You'll want to use Q.nfcall, documented in the README and the Wiki. All Mongoose methods are Node-style. I'll also use .spread instead of manually destructuring .then.
var Q = require('q');
var mongoose = require('mongoose');
var conn = mongoose.connection;
var users = conn.collection('users');
var channels = conn.collection('channels');
var articles = conn.collection('articles');
function getInsertedArticles() {
  return Q.nfcall(users.insert.bind(users), [{/*user1*/},{/*user2*/}]).spread(function (user1, user2) {
    return Q.nfcall(channels.insert.bind(channels), [{userId:user1._id},{userId:user2._id}]).spread(function (channel1, channel2) {
      return Q.nfcall(articles.insert.bind(articles), [{userId:user1._id,channelId:channel1._id},{}]);
    });
  });
}
getInsertedArticles()
  .spread(function (article1, article2) {
    // you only get here if all three of the above steps succeeded
  })
  .fail(function (error) {
    // you get here if any of the above three steps failed
  });
In practice, you will rarely want to use .spread, since you usually are inserting an array that you don't know the size of. In that case the code can look more like this (here I also illustrate Q.nbind).
Comparing this with the original is not quite fair, because your original has no error handling. A corrected Node-style version of the original would look like this:
var mongoose = require('mongoose');
var conn = mongoose.connection;
function getInsertedArticles(cb) {
  // insert users
  conn.collection('users').insert([{/*user1*/},{/*user2*/}], function(err, docs) {
    if (err) {
      cb(err);
      return;
    }
    var user1 = docs[0], user2 = docs[1];
    // insert channels
    conn.collection('channels').insert([{userId:user1._id},{userId:user2._id}], function(err, docs) {
      if (err) {
        cb(err);
        return;
      }
      var channel1 = docs[0], channel2 = docs[1];
      // insert articles
      conn.collection('articles').insert([{userId:user1._id,channelId:channel1._id},{}], function(err, docs) {
        if (err) {
          cb(err);
          return;
        }
        var article1 = docs[0], article2 = docs[1];
        cb(null, [article1, article2]);
      });
    });
  });
}
getInsertedArticles(function (err, articles) {
  if (err) {
    // you get here if any of the three steps failed.
    // `articles` is `undefined`.
  } else {
    // you get here if all three succeeded.
    // `err` is null.
  }
});
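For comparison, the same three-step flow can be sketched without Q, using Node's built-in `util.promisify` and `async`/`await` in place of `Q.nfcall` and `.spread`. This is a swapped-in technique, not what the answer above uses. The `fakeInsert` function is a stand-in invented here for a Node-style `collection.insert`, so the example runs without MongoDB; with a real collection one would presumably write `promisify(users.insert.bind(users))`.

```javascript
const { promisify } = require('util');

// Stand-in for collection.insert (assumption): Node-style callback that
// returns copies of the documents with a fake _id assigned.
function fakeInsert(prefix) {
  let seq = 0;
  return (docs, cb) =>
    setImmediate(() => cb(null, docs.map((d) => ({ ...d, _id: `${prefix}${++seq}` }))));
}

const insertUsers = promisify(fakeInsert('u'));
const insertChannels = promisify(fakeInsert('c'));
const insertArticles = promisify(fakeInsert('a'));

// Each step waits for the previous one; any error rejects the returned promise,
// so there is a single place to handle failures.
async function getInsertedArticles() {
  const users = await insertUsers([{}, {}]);
  const channels = await insertChannels(users.map((u) => ({ userId: u._id })));
  return insertArticles(channels.map((c) => ({ userId: c.userId, channelId: c._id })));
}

getInsertedArticles()
  .then((articles) => console.log(articles))
  .catch((err) => console.error(err));
```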
With the alternative deferred promise implementation, you may do it as follows:
var mongoose = require('mongoose');
var conn = mongoose.connection;
// Setup 'pinsert', promise version of 'insert' method
var promisify = require('deferred').promisify
mongoose.Collection.prototype.pinsert = promisify(mongoose.Collection.prototype.insert);
var user1, user2;
// insert users
conn.collection('users').pinsert([{/*user1*/},{/*user2*/}])
// insert channels
.then(function (users) {
  user1 = users[0]; user2 = users[1];
  return conn.collection('channels').pinsert([{userId:user1._id},{userId:user2._id}]);
})
// insert articles
.match(function (channel1, channel2) {
  return conn.collection('articles').pinsert([{userId:user1._id,channelId:channel1._id},{}]);
})
.done(function (articles) {
  // Do something with articles
}, function (err) {
  // Handle any error that might have occurred on the way
});
Consider Model.save instead of Collection.insert (quite the same in our case).
You don't need to use Q; you can wrap the save method yourself and return a Mongoose Promise directly.
First create a utility method to wrap the save function. It's not very clean, but something like:
//Utility function (put it in a better place)
var saveInPromise = function (model) {
  var promise = new mongoose.Promise();
  model.save(function (err, result) {
    promise.resolve(err, result);
  });
  return promise;
}
Then you can use it instead of save to chain your promises:
var User = mongoose.model('User');
var Channel = mongoose.model('Channel');
var Article = mongoose.model('Article');
//Step 1
var user = new User({data: 'value'});
saveInPromise(user).then(function () {
  //Step 2
  var channel = new Channel({user: user.id})
  return saveInPromise(channel);
}).then(function (channel) {
  //Step 3
  var article = new Article({channel: channel.id})
  return saveInPromise(article);
}, function (err) {
  //A single place to handle your errors
});
I guess that's the kind of simplicity we are looking for, right? Of course, the utility function could be implemented with better integration with Mongoose.
Let me know what you think about that.
By the way, there is an issue about that exact problem on the Mongoose GitHub:
Add 'promise' return value to model save operation
I hope it will be solved soon. I think it is taking some time because they are thinking of switching from mpromise to Q: See here and then here.
Two years later, this question just popped up in my RSS client ...
Things have moved on somewhat since May 2012 and we might choose to solve this one in a different way now. More specifically, the Javascript community has become "reduce-aware" since the decision to include Array.prototype.reduce (and other Array methods) in ECMAScript5. Array.prototype.reduce was always (and still is) available as a polyfill but was little appreciated by many of us at that time. Those who were running ahead of the curve may demur on this point, of course.
The problem posed in the question appears to be formulaic, with rules as follows :
The objects in the array passed as the first param to conn.collection(table).insert() build as follows (where N corresponds to the object's index in an array):
[ {}, ... ]
[ {userId:userN._id}, ... ]
[ {userId:userN._id, channelId:channelN._id}, ... ]
table names (in order) are: users, channels, articles.
the corresponding object properties are: user, channel, article (i.e. the table names without the pluralizing 's').
A general pattern (from this article by Taoofcode) for making asynchronous calls in series is:
function workMyCollection(arr) {
  return arr.reduce(function(promise, item) {
    return promise.then(function(result) {
      return doSomethingAsyncWithResult(item, result);
    });
  }, q());
}
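The same series pattern can be tried with native promises, which makes the ordering guarantee easy to observe. In this sketch `Promise.resolve(initial)` plays the role of the `q()` seed; `runInSeries` and the timer-based step are hypothetical names used only for the demonstration.

```javascript
// Hypothetical helper: runs `step(item, previousResult)` for each item, strictly
// one after another, and resolves with the last step's result.
function runInSeries(items, step, initial) {
  return items.reduce(
    (promise, item) => promise.then((result) => step(item, result)),
    Promise.resolve(initial) // seed: an already-resolved promise
  );
}

// Demonstration: each step finishes asynchronously, yet the order is preserved.
const order = [];
const done = runInSeries(
  ['user', 'channel', 'article'],
  (t, count) =>
    new Promise((resolve) =>
      setTimeout(() => {
        order.push(t + 's');
        resolve(count + 1);
      }, 5)
    ),
  0
);
done.then((count) => console.log(order, count));
```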
With quite light adaptation, this pattern can be made to orchestrate the required sequencing :
function cascadeInsert(tables, n) {
  // tables: array of unpluralised table names
  // n: number of users to insert.
  // returns promise of completion|error
  var ids = []; // this outer array is available to the inner functions (to be read and written to).
  for(var i=0; i<n; i++) { ids.push({}); } //initialize the ids array with n plain objects.
  return tables.reduce(function (promise, t) {
    return promise.then(function () {
      return insert(ids, t + 's');
    }).then(function (docs) {
      // record the new ids only after table `t` has been inserted, so that e.g. `userId` comes from `users`
      for(var i=0; i<ids.length; i++) {
        if(!docs[i]) throw (new Error(t + ": returned documents list does not match the request"));//or simply `continue;` to be error tolerant (if acceptable server-side).
        ids[i][t+'Id'] = docs[i]._id; //progressively add properties to the `ids` objects
      }
    });
  }, Q());
}
Lastly, here's the promise-returning worker function, insert() :
function insert(ids, t) {
  // ids: array of plain objects with properties as defined by the rules
  // t: table name.
  // returns promise of docs
  var dfrd = Q.defer();
  conn.collection(t).insert(ids, function(err, docs) {
    (err) ? dfrd.reject(err) : dfrd.resolve(docs);
  });
  return dfrd.promise;
}
Thus, you can pass the actual table/property names and the number of users to insert as parameters to cascadeInsert.
cascadeInsert( ['user', 'channel', 'article'], 2 ).then(function () {
  // you get here if everything was successful
}).catch(function (err) {
  // you get here if anything failed
});
This works nicely because the tables in the question all have regular plurals (user => users, channel => channels). If any of them was irregular (eg stimulus => stimuli, child => children), then we would need to rethink - (and probably implement a lookup hash). In any case, the adaptation would be fairly trivial.
Today we have mongoose-q as well. A plugin to mongoose that gives you stuff like execQ and saveQ which return Q promises.
