Using findOne in a loop takes too long in Node.js - javascript

I'm using Node.js with MongoDB, and I'm also using Monk for db access. I have the code below:
console.time("start");
collection.findOne({ name: "jason" }, function(err, document) {
    // "friends" is an array containing the ids of the user's friends
    for (var i = 0; i < document.friends.length; i++) {
        collection.findOne({ id: document.friends[i] }, function(err, doc) {
            console.log(doc.name);
        });
    }
});
console.log("The file was saved!");
console.timeEnd("start");
I have two questions regarding this code:
I see the execution time and the "The file was saved!" string first, and only then do the names of the friends appear in the console. Why is that? Shouldn't I see the names first and then the execution time? Is it because of the async nature of Node.js?
Names print very slowly in the console, roughly one name every two seconds. Why is it so slow? Is there a way to make the process faster?
EDIT:
Is it a good idea to break the friends list into smaller pieces and fetch those pieces asynchronously? Would that make the process faster?
EDIT 2:
I changed my code to this:
collection.find({ id: { "$in": document.friends } }).then(function(err, doc) {
    console.log(doc.name);
    if (err) {
        return console.log(err);
    }
});
This doesn't give an error, but it doesn't print anything either.
Thanks in advance.

Answer for question 1:
Yes, you are right: it is because of the async nature of Node.js.
Node.js provides mechanisms (callbacks, promises) to sequence work like this; otherwise you can track completion yourself by setting a flag.
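For example, here is a minimal sketch (reusing the collection variable from the question): move console.timeEnd inside the callback, so the timer stops only after the query has actually returned.
console.time("start");
collection.findOne({ name: "jason" }, function(err, document) {
    if (err) return console.error(err);
    console.log(document.name);
    console.timeEnd("start"); // fires after the query returns, so it measures the real duration
});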
Answer for question 2:
You can use $in instead of calling findOne per friend; it is easier and faster.
e.g. .find({ "fieldx": { "$in": arr } })
arr: here you pass the whole array of ids at once.
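As a minimal sketch of how that could look with the question's data (assuming Monk's callback style and the same collection and friends fields):
collection.findOne({ name: "jason" }, function(err, document) {
    if (err || !document) return console.error(err);
    // one query fetches every friend, instead of one findOne per id
    collection.find({ id: { $in: document.friends } }, function(err, docs) {
        if (err) return console.error(err);
        docs.forEach(function(doc) {
            console.log(doc.name);
        });
    });
});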

Yes, it's because of JavaScript's async nature.
Because you call the db from inside the for loop, JavaScript does not wait for the response and continues execution, so it prints "The file was saved!" first.
About your question 2:
It makes a db call for every friend, so it is no surprise that it takes some time; that's why it takes a second or two per friend.
console.time("start");
collection.findOne({ name: "jason" }, function(err, document) {
    // "friends" is an array containing the ids of the user's friends
    for (var i = 0; i < document.friends.length; i++) {
        console.log("InsideforLoop Calling " + i + " friend");
        collection.findOne({ id: document.friends[i] }, function(err, doc) {
            console.log(doc.name);
        });
        console.log("Terminating " + i + "-----");
    }
});
console.log("The file was saved!");
console.timeEnd("start");
This should make the async and db behaviour clearer.
As you will see, all the synchronous logs print in order:
InsideforLoop Calling 0 friend
Terminating 0 -----
and so on. Meanwhile each
console.log(doc.name);
is printed asynchronously, whenever its query returns.
Added
collection.findOne({ name: "jason" }, function(err, document) {
    // you can do this
    collection.find({ id: { $in: document.friends } }, function(err, docs) {
        console.log(docs);
    });
});
Find All Details in one call

collection.aggregate([
    {
        $match: {
            id: { "$in": document.friends }
        }
    }
]).exec(function(e, d) {
    if (!e) {
        console.log(d);
        // your code when the data came back successfully
    } else {
        // your code when you got an error
    }
});

collection.findOne({ name: "jason" }, function(err, document) {
    if (document != undefined) {
        collection.find({ id: { "$in": document.friends } }).then(function(docs) {
            // with Monk, then() receives the documents themselves; errors go to catch()
            docs.forEach(function(doc) {
                console.log(doc.name);
            });
        }).catch(function(err) {
            return console.log(err);
        });
    }
});

Answer to 1: Yes, it is because Node is async. The part that logs the names runs only when the first findOne returns, whereas "The file was saved!" is logged straight away.

Related

increment counter in node js mongodb

I want to count how many times my server js file has started, for my website using the MongoDB driver with Angular js.
I want to store a variable named counter with an initial value of 0 and then increment that value each time the server runs. My code is below. As you can see, my code doesn't actually update the field in the db, just the variable.
Besides that, the whole code I wrote seems like bad practice. I basically have a document with {id:<>,count:0} and I loop through all the count fields that are greater than -1 (i.e. all integers), although I only have one count field.
Isn't there a simple way to persist/get this one value from the db?
How can I update the field inside the db itself, using something like $inc, in the easiest way possible?
Thanks
MongoClient.connect(url, function(err, db) {
    assert.equal(null, err);
    if (err) {
        console.log(err);
    } else {
        console.log("Connected correctly to DB.");
        var dbusers = db.collection('users');
        var cursor = dbusers.find({ "count": { $gt: -1 } });
        cursor.each(function(err, doc) {
            assert.equal(err, null);
            if (doc != null) {
                doc.count = doc.count + 1;
            }
        });
    }
    db.close();
});
Try this:
MongoClient.connect(url, function(err, db) {
    if (err) {
        console.log(err);
        return db.close();
    }
    console.log("Connected correctly to DB.");
    // update a record in the collection
    db.collection('users').update(
        // find the record with name "MyServer"
        { name: "MyServer" },
        // increment its property called "ran" by 1
        { $inc: { ran: 1 } },
        function(err, result) {
            if (err) console.log(err);
            // close only after the update has returned
            db.close();
        }
    );
});
This should be enough to get you started. It sounds like you're trying to do something like:
get all the objects in the collection that have a property 'count' greater than -1
increase each one's value by 1
save it back to the collection.
The step you're missing is step 3. Done your way, you'd have to do a bulk update; the example I gave updates a single record. A sketch of the bulk version follows.
See the MongoDB documentation for $inc, and the documentation for bulk updates.
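For example, a minimal sketch of the bulk variant (assuming the same db handle and 'users' collection from the question; the classic driver's update() with the multi flag applies the increment to every match in one round-trip):
db.collection('users').update(
    { count: { $gt: -1 } },  // match every counter document
    { $inc: { count: 1 } },  // let the database do the increment itself
    { multi: true },         // apply to all matches, not just the first
    function(err, result) {
        if (err) console.log(err);
        db.close();
    }
);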

Bulk insert in MongoDB using mongoose

I currently have a collection in MongoDB, say "Collection1".
I have the following array of objects that need to be inserted into MongoDB. I am using the Mongoose API. For now, I iterate through the array and insert each object into Mongo.
This is OK for now, but will become a problem when the data gets too big.
I need a way to insert the data into MongoDB in bulk, without repetition.
I am not sure how to do this; I could not find a bulk option in Mongoose.
My code below
My code below
myData = [Obj1, Obj2, Obj3.......]
myData.forEach(function(ele) {
    //console.log(ele)
    saveToMongo(ele);
});

function saveToMongo(obj) {
    (new Collection1(obj)).save(function(err, response) {
        if (err) {
            // console.log('Error while inserting: ' + obj.name + " " + err);
        } else {
            // console.log('Data successfully inserted');
        }
    });
    return Collection1(obj);
}
You might want to use the insertMany() method here if you're using Mongoose version 4.4.x or greater. It essentially uses Model.collection.insertMany() under the hood, and the driver may parallelize batches of >= 1000 docs for you.
myData = [Obj1, Obj2, Obj3.......];
Collection1.insertMany(myData, function(error, docs) {});
or using Promises for better error handling
Collection1.insertMany(myData)
.then(function(docs) {
// do something with docs
})
.catch(function(err) {
// error handling here
});
It works by creating a batch of documents, calling .validate() on them in parallel, and then calling the underlying driver's insertMany() on the result of each doc's toObject({ virtuals: false }).
Although insertMany() doesn't trigger pre-save hooks, it has better performance because it makes only one round-trip to the server rather than one per document.
For Mongoose versions ~3.8.8, ~3.8.22, and 4.x, which support MongoDB Server >= 2.6.x, you could use the Bulk API as follows:
var bulk = Collection1.collection.initializeOrderedBulkOp(),
    counter = 0;

myData.forEach(function(doc) {
    bulk.insert(doc);
    counter++;
    if (counter % 500 == 0) {
        bulk.execute(function(err, r) {
            // do something with the result
            bulk = Collection1.collection.initializeOrderedBulkOp();
            counter = 0;
        });
    }
});

// Flush any docs still in the queue that didn't reach the 500 boundary
if (counter > 0) {
    bulk.execute(function(err, result) {
        // do something with the result here
    });
}
You can also pass an array of objects to the Mongoose model's create() function:
var Collection1 = mongoose.model('Collection1');
Collection1.create(myData, function(err) {
    if (err) // ...
});

How to elegantly detect when all mongo inserts have completed

I've written a basic Node app (my first) to insert many csv rows into Mongo (the items array in the code below). Once all items have been inserted, the db connection should be closed and the program exited.
The issue I've been working on is figuring out when to close the db connection once all inserts have returned a result. I've gotten it working by counting all of the insert result callbacks, but to me this feels clunky. I know one improvement I could make is to batch the inserts by passing an array to the insert function, but I'll still need my code to be aware of when all inserts have completed (assuming it would be bad to insert 100k items in one query). Is there a better way (my code feels hacky) to do this?
Hack part...
function (err, result) {
    queryCompletedCount++;
    if (err) console.log(err);
    // Not sure about doing it this way:
    // close the db once all queries have returned a result
    if (queryCompletedCount === items.length) {
        db.close();
        console.log("Finish inserting data: " + new Date());
    }
}
Full insert code
MongoClient.connect(dbConnectionURL, function(err, db) {
    if (err) {
        console.log("Error connecting to DB: " + err);
    } else {
        var productCollection = db.collection('products');
        console.log("Connected to DB");
        console.log("Start inserting data: " + new Date());
        var queryCompletedCount = 0;
        for (var i = 0; i < items.length; i++) {
            productCollection.insert([{
                manufacturerCode: null,
                name: items[i].name,
                description: null
            }], function(err, result) {
                queryCompletedCount++;
                if (err) console.log(err);
                // Not sure about doing it this way:
                // close the db once all queries have returned a result
                if (queryCompletedCount === items.length) {
                    db.close();
                    console.log("Finish inserting data: " + new Date());
                }
            });
        }
    }
});
What do you think about solving this with the async module, like this:
var async = require('async');

async.eachSeries(items, function(item, next) {
    productCollection.insert([{
        manufacturerCode: null,
        name: item.name,
        description: null
    }], function(err, result) {
        if (err) {
            return next(err);
        }
        next();
    });
}, function() {
    // this will be called after all insertions have completed
    db.close();
    console.log("Finish inserting data: " + new Date());
});
What you need here is MongoDB's Write Concern, configured in the strictest way.
There are two levels of Write Concern. The first is the write mode, in which case the query returns only once the result has been written to the configured number of mongo instances. In your case I suppose there is a single instance, but for the future you may configure it as "w": "majority". The second level is the journal concern, where by setting "j": 1 your query returns only when the data has been written to the journal.
So in your case the best Write Concern configuration might be { "w": "majority", "j": 1 }. Just add it as an argument of your insert statement; a sketch follows.
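A minimal sketch of how that might look inside the question's insert loop (option names as accepted by the classic MongoDB Node.js driver; adjust to your driver version):
productCollection.insert([{
    manufacturerCode: null,
    name: items[i].name,
    description: null
}], { w: "majority", j: 1 }, function(err, result) {
    if (err) console.log(err);
    // this callback fires only once the write satisfies the concern
});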

Best method to string together variety of DB calls in Node js

I basically need to make about 3 calls to get the data for a JSON object. It's basically a JSON array of JSON objects that have some attributes, one of which is an array of other values selected using a second query; each of those in turn has an array inside, selected with another db call.
I tried using async.concatSeries so that I can dig down into the bottom call and put together all the information I collected for one root JSON object, but that's creating a lot of unexpected behaviour.
Example of JSON
[
    {
        "item": "firstDbCall",
        "children": [
            {
                "name": "itemDiscoveredWithSecondDBCall",
                "children": ["itemsDiscoveredwith3rdDBCall"]
            }
        ]
    }
]
This is really difficult using node.js. I really need to figure out how to do this properly, since I have to do many of these for different purposes.
EDIT
This is the code I have. There's some strange behaviour with async.concatSeries: the results callback gets called multiple times, once after each of the per-array functions finishes, so I had to put a check in place. I know it's very messy code, but I've been just putting band-aids all over it for the past 2 hours to make it work.
console.log("GET USERS HAREDQARE INFO _--__--_-_-_-_-_____");
var query = "select driveGroupId from tasks, driveInformation where agentId = '"
+ req.params.agentId + "' and driveInformation.taskId = tasks.id order by driveInformation.taskId desc;";
connection.query(query, function(err, rows) {
if (rows === undefined) {
res.json([]);
return;
}
if(rows.length<1) { res.send("[]"); return;}
var driveGroupId = rows[0].driveGroupId;
var physicalQuery = "select * from drives where driveGroupId = " + driveGroupId + ";";
connection.query(physicalQuery, function(err, rows) {
console.log("ROWSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS");
console.log(rows);
async.concatSeries(rows, function(row, cb) {
console.log("-------------------------------SINGLE ROW-------------------------------------");
console.log(row);
if(row.hasLogicalDrives != 0) {
console.log("HAS LOGICAL DRIVES");
console.log(row.id);
var query = "select id, name from logicalDrives where driveId = " + row.id;
connection.query(query, function(error, drives) {
console.log("QUERY RETURNED");
console.log(drives);
parseDriveInfo(row.name, row.searchable, drives, cb);
});
}
else
var driveInfo = { "driveName" : row.name, "searchable" : row.searchable};
console.log("NO SUB ITEMS");
cb(null, driveInfo);
}, function(err, results) {
console.log("GEETTTTINGHERE");
console.log(results);
if(results.length == rows.length) {
console.log("RESULTS FOR THE DRIVE SEARCH");
console.log(results);
var response = {"id": req.params.agentId};
response.driveList = results;
console.log("RESPONSE");
console.log(response);
res.json(response);
}
});
});
});
};
parseDriveInfo = function(driveName, searchable, drives, cb) {
async.concatSeries(drives, function(drive,callback) {
console.log("SERIES 2");
console.log(drive);
console.log("END OF DRIVE INFO");
var query = "select name from supportedSearchTypes where logicalDriveId = " + drive.id;
connection.query(query, function(error, searchTypes) {
drive.searchTypes = searchTypes;
var driveInfo = { "driveName" :driveName,
"searchable" : searchable,
"logicalDrives" : drive
};
callback(null, driveInfo);
});
}, function (err, results) {
console.log("THIS IS ISISIS ISISISSISISISISISISISISISIS");
console.log(results);
if(results.length === drives.length) {
console.log("GOTHERE");
cb(null, results);
}
});
}
Getting good enough with async to use exactly the right combination of methods under the right circumstances takes a fair amount of experience. Most likely your case in particular can be handled with async.waterfall, since it is query1, then query2(dataFoundByQuery1), then query3(dataFoundByQuery2); a minimal sketch follows. But depending on the circumstances you need to mix and match async methods appropriately, and sometimes have two levels: for example, a "big picture" async.waterfall where some of the steps in the waterfall do async.parallel or async.series as needed. I've never used async.concat, and given your needs I think you have chosen the wrong method. The workhorses are async.each, async.eachSeries, async.waterfall, and async.map, at least for the web app and DB query use cases I mostly encounter, so make sure you really understand those before exploring the more specific convenience methods.
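To make the waterfall shape concrete, here is a minimal sketch; firstQuery, query2FromRows, and query3FromRows are hypothetical placeholders (helpers that build the next SQL string from the previous step's rows), not part of your code:
var async = require('async');

async.waterfall([
    function(cb) {
        connection.query(firstQuery, function(err, rows) { cb(err, rows); });
    },
    function(rows1, cb) {
        // hypothetical helper: builds the second query from the first result
        connection.query(query2FromRows(rows1), function(err, rows) { cb(err, rows); });
    },
    function(rows2, cb) {
        // hypothetical helper: builds the third query from the second result
        connection.query(query3FromRows(rows2), function(err, rows) { cb(err, rows); });
    }
], function(err, finalRows) {
    if (err) return console.log(err);
    // all three queries have finished in order; assemble your JSON here
});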
EDIT: This is a more in-depth example based on the connection library you seem to be using. Please note, some of this is JavaScript pseudocode. Things like adding objects to the resultsArray are clearly not complete; the only thing I took time to make sure was correct is the "flow of logic" as it pertains to callbacks. Everything else is for you to implement. In order to support multiple calls to the same callback function and maintain state from call to call, the best way is to wrap the set of callbacks in a closure. This allows the callbacks to share some state with the main event loop, and lets you pass arguments to the callbacks without actually passing them as arguments, much like class variables in C++, or even globals in JavaScript, but without polluting the global scope :)
function queryDataBase(query) {
    // Wrap the whole query in a function so the callbacks can share some
    // variables with similar scope. This is called a closure.
    var rowCounter = 0;
    var dataRowsFromStep2;
    var resultsArray = [];
    connection.query(query, dataBaseQueryStep2);

    function dataBaseQueryStep2(err, rows) {
        // do something with err and rows
        dataRowsFromStep2 = rows;
        // Always zero the first time. Might need to double check rows isn't empty!
        var query = getQueryFromRow(dataRowsFromStep2[rowCounter++]);
        connection.query(query, dataBaseQueryStep3);
    }

    function dataBaseQueryStep3(err, rows) {
        // do something with err and rows
        if (rowCounter < dataRowsFromStep2.length) {
            resultsArray.push(rows); // Probably needs to be more interesting, but you get the idea
            // since this is within the same closure, rowCounter maintains its state
            var query = getQueryFromRow(dataRowsFromStep2[rowCounter++]);
            // recursively query, using dataBaseQueryStep3 as its own callback,
            // until we run out of rows to call it on
            connection.query(query, dataBaseQueryStep3);
        } else {
            // when the if statement fails we have no more rows to run queries on,
            // so return to the main program flow
            returnToMainProgramLogic(resultsArray);
        }
    }
}

function returnToMainProgramLogic(results) {
    // continue running your program here
}
function returnToMainProgramLogic(results) {
//continue running your program here
}
I personally like the above logic better than the syntax async produces... I believe the heart of your problem lies in your nested calls to async, and the fact that async itself runs the series of functions asynchronously, but in order (confusing, I know). If you write your program like this, you won't have to worry about it!
I would strongly suggest using sequelize.js. It provides a really powerful ORM that lets you chain queries together. It also lets you load your data directly into js objects, write dynamic SQL, and connect to many different databases. Picture ActiveRecord from the Ruby world, for Node; a hypothetical sketch follows.
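For flavour, a hypothetical Sequelize sketch (model and field names are invented, not taken from your schema; the promise style depends on your Sequelize version):
var Sequelize = require('sequelize');
var sequelize = new Sequelize('database', 'user', 'password');

// invented models standing in for your tables
var Drive = sequelize.define('drive', { name: Sequelize.STRING });
var LogicalDrive = sequelize.define('logicalDrive', { name: Sequelize.STRING });
Drive.hasMany(LogicalDrive);

// one call loads each drive with its logical drives already nested,
// replacing the hand-rolled nested queries
Drive.findAll({ include: [LogicalDrive] }).then(function(drives) {
    // drives[0].logicalDrives is populated here
});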

nodejs and mongodb (mongojs): Trying to query and update database within a for loop

I'm writing a multiplayer game (mongojs, nodejs) and trying to figure out how to update user stats based on the outcome of the game. I already have the code written to compute all the post-game stats. The problem comes when I try to update the users' stats in a for loop. Here's what I've got:
//Game Stats
var tempgame = {
    gameid: 1234,
    stats: [
        {
            score: 25,
            user: 'user1'
        },
        {
            score: 25,
            user: 'user2'
        }
    ]
};

for (i = 0; i < tempgame.stats.length; i++) {
    db.users.find({ username: tempgame.stats[i].user }, function(err, res) {
        if (err != null) {
            //handle errors here.
        } else {
            var userstats = res[0].stats;
            if (tempgame.stats[i].score > userstats.bestscore) { //this is where it chokes
                userstats.bestscore = tempgame.stats[i].score;
            }
            //code here to pass back new manipulated stats
        }
    });
}
Everything works fine until I try to use the tempgame object within the callback function. It says "cannot read property 'score' of undefined". Is this just a scoping issue?
Also, I was thinking it could be an issue with the callback function itself: maybe the loop increments before the callback even runs. But even in that case the score should be there; it would just be pulling from the wrong array index. That's what led me to believe it may just be a scope issue.
Any help would be greatly appreciated.
You've been tripped up by the notorious "defining functions inside a loop" problem.
Use "forEach" instead:
tempgame.stats.forEach(function(stat) {
    db.users.find({ username: stat.user }, function(err, res) {
        if (err != null) {
            //handle errors here.
        } else {
            var userstats = res[0].stats;
            if (stat.score > userstats.bestscore) { //this is where it chokes
                userstats.bestscore = stat.score;
            }
            //code here to pass back new manipulated stats
        }
    });
});
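If you prefer to keep the for loop, the classic alternative is to capture the current element with an immediately invoked function; a minimal sketch of the same fix:
for (var i = 0; i < tempgame.stats.length; i++) {
    (function(stat) { // freezes the current element for this iteration's callback
        db.users.find({ username: stat.user }, function(err, res) {
            if (err != null) {
                //handle errors here.
            } else {
                var userstats = res[0].stats;
                if (stat.score > userstats.bestscore) {
                    userstats.bestscore = stat.score;
                }
            }
        });
    })(tempgame.stats[i]);
}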
Part of your problem is as mjhm stated in his answer to your question, and as you suspected: the i variable changes before the callback is invoked.
The other half of your problem is that your database calls have not returned yet. Due to the asynchronous nature of NodeJS, your loop finishes before your database calls complete. Additionally, your database calls do not necessarily come back in the same order you issued them. What you need is some sort of flow control like async.js. Using async.map will let you make all the db calls in parallel and get them back as an array of values you can use, after all the calls have completed.
async.map(tempgame.stats, function(stat, callback) {
    db.users.find({ username: stat.user }, function(err, res) {
        if (err != null) {
            callback(err);
        } else {
            callback(null, res[0].stats);
        }
    });
}, function(err, stats) {
    if (err) {
        //handle errors
    } else {
        stats.forEach(function(stat) {
            //do something with your array of stats
            //this won't be called until all database calls have been completed
        });
    }
});
In addition to the above, if you want to return results back to the application, see:
http://nodeblog.tumblr.com/post/60922749945/nodejs-async-db-query-inside-for-loop
