I am trying to write a nodejs program that queries github for a list of repos (via a Node wrapper for the github API: https://www.npmjs.com/package/github) and retrieves the git clone url in an array, which I then wish to sort alphabetically.
Because the calls are asynchronous, I am not sure how to wait until all of the requests have returned.
Here is the loop in question. repoArray is an array of repos in [username/reponame] format.
var urls = [];
for (var i=0; i < repoArray.length; i++) {
var components = repoArray[i].split('/');
github.repos.get({
user: components[0],
repo: components[1]
}, function(err, res) {
urls.push(res.ssh_url);
});
}
// do a case-insensitive sort
urls.sort(function(a,b) {
return a.localeCompare(b, 'en', {'sensitivity': 'base'});
});
console.log("urls: " + urls);
Basically, since the github.repos.get() calls in the loop are all asynchronous and callback-based, by the time the code reaches urls.sort() and then console.log(), only some (or none) of the github.repos.get() calls have completed.
I am not that familiar with promises or deferreds, but is that the way to go? How could I refactor that loop so that urls.sort() is called only after all the requests from the loop are complete?
The Async library is meant for exactly these scenarios, and is usually what people tend to use for these problems. It can help you execute asynchronous tasks in parallel and execute a callback when they all finish, using async.each.
var async = require('async');
var urls = [];
//make each HTTP request
function process(repo,callback){
var components = repo.split('/');
github.repos.get({
user: components[0],
repo: components[1]
}, function(err, res) {
if(err){
// call callback(err) if there is an error
return callback(err);
}
else{
urls.push(res.ssh_url);
// call callback(null) if it was a success,
return callback(null);
}
});
}
// this will iterate over repoArray and pass each repo to the 'process' function.
// if any of the calls to 'process' result in an error,
// the final callback will be immediately called with an error object
async.each(repoArray,process,function(error){
if(error){
console.error('uh-oh: '+error)
return;
}
else{
// do a case-insensitive sort
urls.sort(function(a,b) {
return a.localeCompare(b, 'en', {'sensitivity': 'base'});
});
console.log("urls: " + urls);
}
});
edit: since you are sorting them at the end, the urls will be in order.
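Since the question also asks about promises, here is a rough sketch of the same flow with the built-in Promise.all. It is only an illustration: getRepo and fakeRepos are hypothetical stand-ins for github.repos.get() so that the snippet runs without network access, and getSshUrl and getSortedUrls are names made up for this example.

```javascript
// Hypothetical stand-in for github.repos.get(): same (options, callback) shape,
// but it answers from a local table after a short random delay.
var fakeRepos = {
  'bob/Zeta': 'git@github.com:bob/Zeta.git',
  'alice/alpha': 'git@github.com:alice/alpha.git',
  'carol/beta': 'git@github.com:carol/beta.git'
};

function getRepo(options, callback) {
  var key = options.user + '/' + options.repo;
  setTimeout(function () {
    if (!fakeRepos[key]) {
      return callback(new Error('unknown repo: ' + key));
    }
    callback(null, { ssh_url: fakeRepos[key] });
  }, Math.random() * 20);
}

// Wrap one callback-style request in a promise.
function getSshUrl(repo) {
  var components = repo.split('/');
  return new Promise(function (resolve, reject) {
    getRepo({ user: components[0], repo: components[1] }, function (err, res) {
      if (err) { return reject(err); }
      resolve(res.ssh_url);
    });
  });
}

// Resolves with all URLs, sorted case-insensitively, once every request is done.
function getSortedUrls(repoArray) {
  return Promise.all(repoArray.map(getSshUrl)).then(function (urls) {
    return urls.sort(function (a, b) {
      return a.localeCompare(b, 'en', { sensitivity: 'base' });
    });
  });
}

getSortedUrls(['bob/Zeta', 'alice/alpha', 'carol/beta']).then(function (urls) {
  console.log('urls: ' + urls);
}).catch(function (error) {
  console.error('uh-oh: ' + error);
});
```

With the real wrapper you would swap getRepo for github.repos.get; the sorting step only runs after every promise has resolved, and the first failure goes straight to the catch handler.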
I am new to JavaScript async programming and I have a basic issue: I have a piece of code that makes two separate DB calls based on request body params.
These are the two methods, each of which makes a DB call and returns a Promise:
validateExam
validateUserExists
I want to store the results from the async calls in the myExam variable and then return it in the response.
getExam: function(req, res) {
var myExam = {};
var coupon = req.body.coupon;
var email = req.body.email;
async.series([
function(callback) {
validateExam(coupon)
.then(function(success) {
callback(null, success);
});
},
function(callback) {
validateUserExists(email)
.then(function(result) {
callback(null, result);
})
}
], function(error, results) {
myExam.res = results;
});
res.json({
"status": 400,
"message": myExam
});
},
You can't return an asynchronously retrieved value from your function. Your function returns BEFORE the async operation is even done. Instead, you need to communicate the return value back to the caller, either via a returned promise or by passing in a callback that you call when the async operation is done. For more details on that, see: How do I return the response from an asynchronous call?
In addition, using the async library to manage two promise operations is very odd. Promises have all the tools built in themselves to manage asynchronous operations so if your core operations are already returning promises, you should just use those promises directly and not involve the async library.
Looking at your code, it appears that validating the exam and validating the user are independent operations, so you can run them in parallel and use Promise.all() to know when both promises are done.
You can do something like this:
getExam: function(req, res) {
var coupon = req.body.coupon;
var email = req.body.email;
Promise.all([validateExam(coupon), validateUserExists(email)]).then(function(results) {
// results is a two element array that contains the two validation results
// send your response here based on the results array (not clear to me exactly what you want here)
res.json(...);
}).catch(function(err) {
// return some sort of error response here
res.status(500).json(...);
});
},
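For something you can run as-is, here is a small sketch of the same idea. The two validators are hypothetical stubs (the real ones hit the DB), and buildExamResponse, the status value and the shape of the message are assumptions made for the example, not something taken from the question.

```javascript
// Hypothetical stand-ins for the two promise-returning DB calls in the question.
function validateExam(coupon) {
  return Promise.resolve({ coupon: coupon, valid: coupon === 'ABC123' });
}
function validateUserExists(email) {
  return Promise.resolve({ email: email, exists: true });
}

// Returns a promise for the response body instead of trying to
// return the asynchronously retrieved value itself.
function buildExamResponse(body) {
  return Promise.all([validateExam(body.coupon), validateUserExists(body.email)])
    .then(function (results) {
      // results[0] comes from validateExam, results[1] from validateUserExists
      return { status: 200, message: { exam: results[0], user: results[1] } };
    });
}

buildExamResponse({ coupon: 'ABC123', email: 'a@example.com' }).then(function (payload) {
  console.log(JSON.stringify(payload));
});
```

Inside a route handler you would call res.json(payload) in the .then() and send an error response in a .catch(), exactly as in the outline above.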
I have some code that I can't get my head around. I am trying to return an array of objects using a callback. I have a function that returns the values and then pushes them into an array, but I can't access this array outside of the function. I am doing something stupid here but can't tell what (I am very new to Node.js).
for (var index in res.response.result) {
var marketArray = [];
(function () {
var market = res.response.result[index];
createOrUpdateMarket(market, eventObj , function (err, marketObj) {
marketArray.push(marketObj)
console.log('The Array is %s',marketArray.length) //Returns The Array is 1.2.3..etc
});
console.log('The Array is %s',marketArray.length) // Returns The Array is 0
})();
}
You have multiple issues going on here. A core issue is to gain an understanding of how asynchronous responses work and which code executes when. But, in addition to that you also have to learn how to manage multiple async responses in a loop and how to know when all the responses are done and how to get the results in order and what tools can best be used in node.js to do that.
Your core issue is a matter of timing. The createOrUpdateMarket() function is probably asynchronous. That means that it starts its operation when the function is called, then calls its callback sometime in the future. Meanwhile the rest of your code continues to run. Thus, you are trying to access the array BEFORE the callback has been called.
Because you cannot know exactly when that callback will be called, the only place you can reliably use the callback data is inside the callback or in something that is called from within the callback.
You can read more about the details of the async/callback issue here: Why is my variable unaltered after I modify it inside of a function? - Asynchronous code reference
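A tiny sketch that makes the timing visible. fakeAsyncOperation is a hypothetical stand-in for an asynchronous function like createOrUpdateMarket(); it just calls its callback on a later turn of the event loop.

```javascript
// Hypothetical stand-in for an asynchronous function such as createOrUpdateMarket():
// it calls its callback on a later turn of the event loop.
function fakeAsyncOperation(value, callback) {
  setTimeout(function () {
    callback(null, value);
  }, 0);
}

var order = [];   // records which line ran when
var results = [];

fakeAsyncOperation('market-1', function (err, marketObj) {
  results.push(marketObj);
  order.push('inside callback, length = ' + results.length);
  console.log(order);
});

// This line runs first, before the callback above has been called.
order.push('after the call, length = ' + results.length);
```

The entry recorded after the call sees an empty array, and only the entry recorded inside the callback sees the pushed value.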
To know when a whole series of these createOrUpdateMarket() operations is done, you have to write code specifically to track their completion; you cannot rely on a simple for loop. The modern way to do that is to use promises, which offer tools for managing the timing of one or more asynchronous operations.
In addition, if you want to accumulate results from your for loop in marketArray, you have to declare and initialize that before your for loop, not inside your for loop. Here are several solutions:
Manually Coded Solution
var len = res.response.result.length;
var marketArray = new Array(len), cntr = 0;
for (var index = 0; index < len; index++) {
(function(i) {
createOrUpdateMarket(res.response.result[i], eventObj , function (err, marketObj) {
++cntr;
if (err) {
// need error handling here
}
marketArray[i] = marketObj;
// if last response has just finished
if (cntr === len) {
// here the marketArray is fully populated and all responses are done
// put your code to process the marketArray here
}
});
})(index);
}
Standard Promises Built Into Node.js
// make a version of createOrUpdateMarket that returns a promise
function createOrUpdateMarketAsync(a, b) {
return new Promise(function(resolve, reject) {
createOrUpdateMarket(a, b, function(err, marketObj) {
if (err) {
reject(err);
return;
}
resolve(marketObj);
});
});
}
var promises = [];
for (var i = 0; i < res.response.result.length; i++) {
promises.push(createOrUpdateMarketAsync(res.response.result[i], eventObj));
}
Promise.all(promises).then(function(marketArray) {
// all results done here, results in marketArray
}, function(err) {
// an error occurred
});
Enhanced Promises with the Bluebird Promise library
The bluebird promise library offers Promise.map() which will iterate over your array of data and produce an array of asynchronously obtained results.
// make a version of createOrUpdateMarket that returns a promise
var Promise = require('bluebird');
var createOrUpdateMarketAsync = Promise.promisify(createOrUpdateMarket);
// iterate the res.response.result array and run an operation on each item
Promise.map(res.response.result, function(item) {
return createOrUpdateMarketAsync(item, eventObj);
}).then(function(marketArray) {
// all results done here, results in marketArray
}, function(err) {
// an error occurred
});
Async Library
You can also use the async library to help manage multiple async operations. In this case, you can use async.map() which will create an array of results.
var async = require('async');
async.map(res.response.result, function(item, done) {
createOrUpdateMarket(item, eventObj, function(err, marketObj) {
if (err) {
done(err);
} else {
// first argument is the error slot, so pass null on success
done(null, marketObj);
}
});
}, function(err, results) {
if (err) {
// an error occurred
} else {
// results array contains all the async results
}
});
I have this code:
var queue = [];
var allParserd = [];
_.each(webs, function (web) {
queue.push(function () {
WebsitesUtils.parseWebsite(web, function (err, parsed) {
allParserd.push(parsed);
});
});
});
Promise.all(queue).then(function (data) {
console.log(allParserd);
});
Basically I need to fetch all my webs and return the result only after every one has been parsed. The parseWebsite function returns the correct data when called directly, but written this way it is never called and allParsed comes back as an empty array. I'm sure I'm missing something; I only started using promises a few days ago.
If you need more information, just tell me.
P.S.
I want all the functions to start at the same time; I don't want to wait for each response before moving on to the next.
Tagged with Bluebird so let's use it:
First, let's convert your callback API to promises:
Promise.promisifyAll(WebsitesUtils);
Now, let's use .map to map every item in webs to the result of parsing it with parseWebsite:
Promise.map(webs, function(item){
return WebsitesUtils.parseWebsiteAsync(item); // note the suffix
}).then(function(results){
// all the results are here.
}).catch(function(err){
// handle any errors
});
As you can see - this is trivial to do with Bluebird.
Promise.all doesn't take a queue of functions to execute. It expects an array of promises which represent the results of the many concurrently running (still pending) requests.
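A quick way to see this for yourself (a minimal sketch using only the built-in Promise; the contents of queue and the names called and check are made up for the demo):

```javascript
var called = false;

// An array of plain functions, like the `queue` in the question.
var queue = [function () { called = true; }];

var check = Promise.all(queue).then(function (data) {
  // Promise.all treats non-promise values as already-resolved results,
  // so `data` is just the original functions and none of them ever ran.
  console.log(typeof data[0], called);
  return { type: typeof data[0], called: called };
});
```

The .then() handler fires right away with the functions themselves, which is why the array filled inside them is still empty.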
The first step is to have a function that actually returns a promise, instead of only executing a callback. We can write one by hand:
function parseWebsite(web) {
return new Promise(function(fulfill, reject) {
WebsitesUtils.parseWebsite(web, function (err, parsed) {
if (err)
reject(err);
else
fulfill(parsed);
});
});
}
or simply use promisification that does this generically:
var parseWebsite = Promise.promisify(WebsitesUtils.parseWebsite, WebsitesUtils);
Now we can go to construct our array of promises by calling that function for each site:
var promises = [];
_.each(webs, function (web) {
promises.push(parseWebsite(web));
});
or just
var promises = _.map(webs, parseWebsite);
so that in the end we can use Promise.all, and get back our allParsed array (which even is in the same order as webs was!):
Promise.all(promises).then(function(allParsed) {
console.log(allParsed);
});
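If you want to convince yourself about the ordering, here is a small sketch. The parseWebsite below is a hypothetical stub whose delay is passed in, so items can finish in a different order than they were started; the site names are made up.

```javascript
// Hypothetical stand-in for a promisified parseWebsite(): the delay is passed in
// so that later items can finish before earlier ones.
function parseWebsite(web) {
  return new Promise(function (fulfill) {
    setTimeout(function () {
      fulfill('parsed ' + web.name);
    }, web.delay);
  });
}

var webs = [
  { name: 'slow.example', delay: 30 },
  { name: 'fast.example', delay: 5 },
  { name: 'medium.example', delay: 15 }
];

var finished = []; // completion order, for comparison

var allParsedPromise = Promise.all(webs.map(function (web) {
  return parseWebsite(web).then(function (parsed) {
    finished.push(web.name);
    return parsed;
  });
})).then(function (allParsed) {
  console.log('completion order:', finished);
  console.log('result order:', allParsed);
  return allParsed;
});
```

The result array lines up with webs index by index, regardless of which request came back first.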
Bluebird even provides a shortcut function so you don't need promises:
Promise.map(webs, parseWebsite).then(function(allParsed) {
console.log(allParsed);
});
Here's how you might do it with async:
var async = require('async');
var webs = ...
async.map(webs, function(web, callback) {
WebsitesUtils.parseWebsite(web, callback);
}, function(err, results) {
if (err) throw err; // TODO: handle errors better
// `results` contains all parsed results
});
and if parseWebsite() isn't a prototype method dependent on WebsitesUtils then you could simplify it further:
async.map(webs, WebsitesUtils.parseWebsite, function(err, results) {
if (err) throw err; // TODO: handle errors better
// `results` contains all parsed results
});
var request = require('request'),
requests = [],
values = [],
request("url1", function());
function() {
.....
for (x in list){
requests.push(requestFunction(x));
}
}
requestFunction(x){
request("url2", function (e,r,b) {
....
return function(callback) {
values[i] = b
}
});
}
async.parallel(requests, function (allResults) {
// values array is ready at this point
// the data should also be available in the allResults array
console.log(values);
});
I am new to Node. The issue is that the first request needs to be made in order to populate the requests array of callbacks, but async.parallel runs before the requests array is full, and it needs to run all of the callbacks. Where do I move the async.parallel call so that it runs after the requests array is full?
Asynchronous programming is all about chaining blocks. This allows node to efficiently run its event queue, while ensuring that your steps are done in order. For example, here's a query from a web app I wrote:
app.get("/admin", isLoggedIn, isVerified, isAdmin, function (req, res) {
User.count({}, function (err, users) {
if (err) throw err;
User.count({"verified.isVerified" : true}, function (err2, verifiedUsers) {
if (err2) throw err2;
Course.count({}, function (err3, courses) {
// and this continues on and on — the admin page
// has a lot of information on all the documents in the database
})
})
})
})
Notice how I chained the function calls inside one another. Course.count({}, ...) is only called once User.count({"verified.isVerified" : true}, ...) has called back. This means the I/O never blocks and the /admin page is never rendered without the required information.
You didn't really give enough information regarding your problem (so there might be a better way to fix it), but I think you could, for now, do this:
var request = require('request'),
async = require('async'),
requests = [],
values = [],
length; // a counter to store the number of times to run the loop
request("url1", function() {
length = Object.keys(list).length;
// referring to the list below;
// make sure list is available at this level of scope
for (var x in list){
requests.push(requestFunction(x));
length--;
if (length == 0) {
async.parallel(requests, function (err, allResults) {
console.log(values); // prints the values array
});
}
}
});
// returns a task function that async.parallel calls later with its callback
function requestFunction(x) {
return function (callback) {
request("url2", function (e,r,b) {
values[x] = b;
callback(e, b); // tell async.parallel that this task is done
});
};
}
I am assuming that requestFunction() takes a while to load, which is why async.parallel is running before the for (var x in list) loop finishes. To force async.parallel to run after the loop finishes, you'll need a counter.
var length = Object.keys(list).length;
This returns the number of keys in the list associative array (aka object). Now, every time you run through the for loop, you decrement length. When length == 0, you then run your async.parallel process.
edit: You could also write the requests.push() part as:
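requests.push(
(function (key) {
// the returned function is the task; async.parallel supplies the callback
return function (callback) {
request("url2", function (e,r,b) {
values[key] = b;
callback(e, b);
});
};
})(x)
);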
I think it's redundant to store b in both values and requests, but I have kept it as you had it.
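To make the shape of the tasks concrete: async.parallel expects an array of functions that each take a callback, and it calls the final callback once all of them have reported back. The sketch below imitates that contract with a small hand-written runParallel helper (hypothetical, not the async library itself) and a fakeRequest stub in place of request, so it runs without any dependencies; makeTask and the contents of list are made up as well.

```javascript
// Hypothetical stand-in for request(url, callback).
function fakeRequest(url, callback) {
  setTimeout(function () {
    callback(null, { statusCode: 200 }, 'body of ' + url);
  }, 5);
}

// Minimal imitation of the async.parallel contract: run every task,
// collect results by index, call done once (on first error or when all finish).
function runParallel(tasks, done) {
  var results = new Array(tasks.length);
  var pending = tasks.length;
  var failed = false;
  if (pending === 0) { return done(null, results); }
  tasks.forEach(function (task, i) {
    task(function (err, result) {
      if (failed) { return; }
      if (err) { failed = true; return done(err); }
      results[i] = result;
      if (--pending === 0) { done(null, results); }
    });
  });
}

// Each task is a function that starts its request only when it is called.
function makeTask(url) {
  return function (callback) {
    fakeRequest(url, function (e, r, b) {
      callback(e, b);
    });
  };
}

var list = { a: 'url2/a', b: 'url2/b' };
var requests = Object.keys(list).map(function (key) {
  return makeTask(list[key]);
});

runParallel(requests, function (err, allResults) {
  console.log(allResults);
});
```

The important part is that pushing a task does not start the request; nothing runs until the whole array is handed over.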
I have a function that returns an array of items from MongoDB:
var getBooks = function(callback){
Comment.distinct("doc", function(err, docs){
callback(docs);
});
};
Now, for each of the items returned in docs, I'd like to execute another mongoose query, gather the count for specific fields, gather them all in a counts object, and finally pass that on to res.render:
getBooks(function(docs){
var counts = {};
docs.forEach(function(entry){
getAllCount(entry, ...){};
});
});
If I put res.render after the forEach loop, it will execute before the count queries have finished. However, if I include it in the loop, it will execute for each entry. What is the proper way of doing this?
I'd recommend using the popular Node.js package async. It's far easier than doing the counting by hand, and it covers the error handling that the manual approaches in the other answers would eventually need.
In particular, I'd suggest considering each (reference):
getBooks(function(docs){
var counts = {};
async.each(docs, function(doc, callback){
getAllCount(doc, ...);
// call the `callback` with an error if one occurred, or
// empty params if everything was OK.
// store the value for each doc in counts
}, function(err) {
// all are complete (or an error occurred)
// you can access counts here
res.render(...);
});
});
or you could use map (reference):
getBooks(function(docs){
async.map(docs, function(doc, transformed){
getAllCount(doc, ...);
// call transformed(null, theCount);
// for each document (or transformed(err); if there was an error);
}, function(err, results) {
// all are complete (or an error occurred)
// you can access results here, which contains the count value
// returned by calling: transformed(null, ###) in the map function
res.render(...);
});
});
If there are too many simultaneous requests, you could use the mapLimit or eachLimit function to limit the amount of simultaneous asynchronous mongoose requests.
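To give an idea of what such a limit does, here is a hand-written sketch (not the async library's implementation). mapWithLimit is a made-up helper, and getAllCount is a hypothetical stub that also tracks how many calls are in flight at once; the docs values are invented.

```javascript
// Hypothetical stand-in for getAllCount(doc, callback); it also records
// how many calls were in flight at the same time.
var active = 0;
var maxActive = 0;
function getAllCount(doc, callback) {
  active++;
  maxActive = Math.max(maxActive, active);
  setTimeout(function () {
    active--;
    callback(null, doc.length);
  }, 5);
}

// Hand-written sketch of a limited map: at most `limit` calls in flight.
function mapWithLimit(items, limit, iterator, done) {
  var results = new Array(items.length);
  var next = 0;      // index of the next item to start
  var finished = 0;  // number of items that have called back
  var failed = false;

  function startNext() {
    if (failed || next >= items.length) { return; }
    var i = next++;
    iterator(items[i], function (err, value) {
      if (failed) { return; }
      if (err) { failed = true; return done(err); }
      results[i] = value;
      finished++;
      if (finished === items.length) { return done(null, results); }
      startNext(); // a slot is free, start another item
    });
  }

  if (items.length === 0) { return done(null, results); }
  for (var k = 0; k < limit && k < items.length; k++) { startNext(); }
}

var docs = ['moby', 'emma', 'ulysses', 'dune', 'hamlet'];
var outcome = null;
mapWithLimit(docs, 2, getAllCount, function (err, counts) {
  outcome = { counts: counts, maxActive: maxActive };
  console.log(counts, 'max in flight:', maxActive);
});
```

Each finished call frees a slot and starts the next item, so the number of simultaneous queries never goes above the limit while the results still land at the right index.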
forEach probably isn't your best bet here, unless you want all of your calls to getAllCount happening in parallel (maybe you do, I don't know — or for that matter, Node is still single-threaded by default, isn't it?). Instead, just keeping an index and repeating the call for each entry in docs until you're done seems better. E.g.:
getBooks(function(docs){
var counts = {},
index = 0,
entry;
loop();
function loop() {
if (index < docs.length) {
entry = docs[index++];
getAllCount(entry, gotCount);
}
else {
// Done, call `res.render` with the result
}
}
function gotCount(count) {
// ...store the count, it relates to `entry`...
// And loop
loop();
}
});
If you want the calls to happen in parallel (or if you can rely on this working in the single thread), just remember how many are outstanding so you know when you're done:
// Assumes `docs` is not sparse
getBooks(function(docs){
var counts = {},
received = 0,
outstanding;
outstanding = docs.length;
docs.forEach(function(entry){
getAllCount(entry, function(count) {
// ...store the count; this callback still sees its own `entry`, but with
// overlapping calls the replies can arrive in any order...
// Done?
--outstanding;
if (outstanding === 0) {
// Yup, call `res.render` with the result
}
});
});
});
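Here is a runnable version of that counting idea, as a sketch only: getAllCount is a hypothetical stub that replies with the length of the entry after a delay (shorter entries reply later, so the replies arrive out of order), and countAll is a name made up for the example.

```javascript
// Hypothetical stand-in for getAllCount(entry, callback); replies out of order.
function getAllCount(entry, callback) {
  setTimeout(function () {
    callback(entry.length);
  }, 20 - entry.length);
}

function countAll(docs, done) {
  var counts = {};
  var outstanding = docs.length;
  if (outstanding === 0) { return done(counts); }
  docs.forEach(function (entry) {
    getAllCount(entry, function (count) {
      // `entry` here is the one this particular call was started with
      counts[entry] = count;
      if (--outstanding === 0) {
        done(counts); // the last reply has arrived
      }
    });
  });
}

countAll(['moby', 'ulysses', 'dune'], function (counts) {
  console.log(counts);
});
```

In the real handler, done is where res.render would go.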
In other words, the callback of getAllCount for the first item must start getAllCount for the second item, and so on.
There are two ways: you can use a library such as async: https://github.com/caolan/async
Or build the callback chain yourself. It's fun to write the first time.
edit
The goal is a mechanism that behaves as if we had written:
getAllCountFor(1, function(err1, result1) {
getAllCountFor(2, function(err2, result2) {
...
getAllCountFor(N, function(errN, resultN) {
// res.render and all the rest
});
});
});
And that is what you would build with async, using one of its sequential helpers (for example async.series).
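If you want to build the chain yourself, a minimal sketch could look like this. getAllCountFor here is a hypothetical stub that just multiplies the id by ten, and countInSequence is a made-up helper name.

```javascript
// Hypothetical stand-in for getAllCountFor(id, callback).
function getAllCountFor(id, callback) {
  setTimeout(function () {
    callback(null, id * 10);
  }, 1);
}

// Runs getAllCountFor for each id, one after another, then calls done.
function countInSequence(ids, done) {
  var results = [];
  function step(i) {
    if (i === ids.length) { return done(null, results); }
    getAllCountFor(ids[i], function (err, result) {
      if (err) { return done(err); }
      results.push(result);
      step(i + 1); // only now start the next call
    });
  }
  step(0);
}

countInSequence([1, 2, 3], function (err, results) {
  console.log(results);
});
```

Each call is started from inside the previous callback, which is the nested structure above written as a loop; the final callback is the place for res.render.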