Infinite execution of tasks for nodejs application - javascript

Suppose I have code like this:
function execute() {
var tasks = buildListOfTasks();
// ...
}
buildListOfTasks creates an array of functions. The functions are async and might issue HTTP requests and/or perform db operations.
If the task list is empty, or once all tasks have been executed, I need to repeat the same execute routine again. And again, in a sort of "infinite loop". So it's a daemon-like application.
I understand how to accomplish that in the sync world, but I'm a bit confused about how to make it work in the node.js async world.

Use async.js and its queue object.
function runTask(task, callback) {
//dispatch a single asynchronous task to do some real work
task(callback);
}
//the 10 means allow up to 10 in parallel, then start queueing
var queue = async.queue(runTask, 10);
//Check for work to do and enqueue it
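function refillQueue() {
var tasks = buildListOfTasks();
//drain only fires after queued work completes, so poll again when there is nothing to enqueue
if (tasks.length === 0) {
setTimeout(refillQueue, 100);
return;
}
tasks.forEach(function (task) {
queue.push(task);
});
}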
//queue will call this whenever all pending work is completed
//so wait 100ms and check again for more arriving work
//(newer async releases register the handler with queue.drain(fn) instead of assignment)
queue.drain = function() {
setTimeout(refillQueue, 100);
};
//start things off initially
refillQueue();

If you're already familiar with libraries like async, you can use execute() itself as the final callback to restart the tasks:
function execute(err) {
if (!err) {
async.series(buildListOfTasks(), execute);
} else {
// ...
}
}

I think you have to use async.js, probably the parallel function. https://github.com/caolan/async#parallel
In the global callback, just call execute to make a recursive call.
async.parallel(tasks,
function(err, results){
if(!err) execute();
}
);
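For reference, the same restart-when-done loop can be sketched in plain Node without any library. This is only a minimal sketch under stated assumptions: buildListOfTasks here is a hypothetical stand-in that hands out a few batches of callback-style tasks, runInSeries is an illustrative helper, and maxCycles exists only so the demo terminates (a real daemon would leave it out).

```javascript
// Hypothetical stand-in for the question's buildListOfTasks(): returns an
// array of callback-style tasks, or an empty array when there is no work.
var pending = [['a', 'b'], [], ['c']];
var done = [];

function buildListOfTasks() {
  var names = pending.shift() || [];
  return names.map(function (name) {
    return function (callback) {
      // Simulate async work such as an HTTP request or a db call.
      setTimeout(function () {
        done.push(name);
        callback(null);
      }, 5);
    };
  });
}

// Run the tasks one after another, then call finished(err).
function runInSeries(tasks, finished) {
  var index = 0;
  function next(err) {
    if (err || index === tasks.length) {
      return finished(err || null);
    }
    tasks[index++](next);
  }
  next(null);
}

// One cycle: build the list, run it, then schedule the next cycle.
// setTimeout keeps an empty list from turning into a busy loop and keeps
// the call stack flat. maxCycles is an assumption made for the demo only.
function execute(maxCycles, onStop) {
  if (maxCycles === 0) {
    return onStop();
  }
  runInSeries(buildListOfTasks(), function (err) {
    if (err) {
      console.error('task failed:', err);
    }
    setTimeout(function () {
      execute(maxCycles - 1, onStop);
    }, 10);
  });
}

execute(3, function () {
  console.log(done.join(','));
});
```

Scheduling the next cycle through setTimeout rather than calling execute directly matters for the empty-list case: without the delay, an empty task list would restart immediately and spin the CPU.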

Related

node.js async request with timeout?

Is it possible, in node.js, to make an asynchronous call that times out if it takes too long (or doesn't complete) and triggers a default callback?
The details:
I have a node.js server that receives a request and then makes multiple requests asynchronously behind the scenes, before responding. The basic issue is covered by an existing question, but some of these calls are considered 'nice to have'. What I mean is that if we get the response back, then it enhances the response to the client, but if they take too long to respond it is better to respond to the client in a timely manner than with those responses.
At the same time, this approach would protect against services that simply aren't completing or are failing, while allowing the main thread of operation to respond.
You can think of this in the same way as a Google search that has one core set of results, but provides extra responses based on other behind the scenes queries.
If it's simple, just use setTimeout:
app.get('/', function (req, res) {
var result = {};
// populate object
http.get('http://www.google.com/index.html', (res) => {
result.property = response;
return res.send(result);
});
// if we havent returned within a second, return without data
setTimeout(function(){
return res.send(result);
}, 1000);
});
Edit: as mentioned by peteb, I forgot to check whether we had already sent the response. This can be done by checking res.headersSent or by maintaining a 'sent' flag yourself. I also noticed the res variable was being shadowed by the inner callback's parameter.
app.get('/', function (req, res) {
var result = {};
// populate object
http.get('http://www.google.com/index.html', (httpResponse) => {
result.property = httpResponse;
if(!res.headersSent){
res.send(result);
}
});
// if we havent returned within a second, return without data
setTimeout(function(){
if(!res.headersSent){
res.send(result);
}
}, 1000);
});
Check this example of timeout callback https://github.com/jakubknejzlik/node-timeout-callback/blob/master/index.js
You could modify it to run an action when the time is up, or simply catch the error.
You can try using a timeout. For example using the setTimeout() method:
Set up a timeout handler: var timeOutX = setTimeout(function…
Inside the handler, set that variable to null: timeOutX = null (to indicate that the timeout has fired)
Then execute your callback function with one argument (error handling): callback({error:'The async request timed out'});
Pass the delay as the second argument to setTimeout, for example 3 seconds
Something like this:
var timeOutX = setTimeout(function() {
timeOutX = null;
yourCallbackFunction({error:'The async request timed out'});
}, 3000);
With that set, you can call your async function and, in its callback, check that your timeout handler hasn't fired yet.
Finally, before you run your callback function, you must clear the scheduled timeout handler using the clearTimeout() method.
Something like this:
yourAsyncFunction(yourArguments, function() {
if (timeOutX) {
clearTimeout(timeOutX);
yourCallbackFunction();
}
});
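The steps above can be folded into one reusable wrapper. The following is a minimal sketch, not an established API: withTimeout and slowEcho are hypothetical names, and it assumes the wrapped function takes a single argument followed by a Node-style callback(err, result).

```javascript
// Wrap a callback-style async function so that its callback fires exactly
// once: with the real result, or with a timeout error after `ms` milliseconds.
// withTimeout is a hypothetical helper, not part of Node or any library.
function withTimeout(asyncFn, ms) {
  return function (arg, callback) {
    var finished = false;
    var timer = setTimeout(function () {
      finished = true;
      callback(new Error('The async request timed out'));
    }, ms);

    asyncFn(arg, function (err, result) {
      if (finished) {
        return; // the timeout already answered; drop the late result
      }
      finished = true;
      clearTimeout(timer);
      callback(err, result);
    });
  };
}

// Hypothetical slow operation that answers after `delay` milliseconds.
function slowEcho(delay, callback) {
  setTimeout(function () {
    callback(null, 'answered after ' + delay + 'ms');
  }, delay);
}

var outcomes = [];
var guarded = withTimeout(slowEcho, 50);

// Finishes well inside the 50ms budget.
guarded(10, function (err, result) {
  outcomes.push(err ? err.message : result);
});
// Takes 200ms, so the timeout error is delivered instead.
guarded(200, function (err, result) {
  outcomes.push(err ? err.message : result);
});
```

Using a boolean flag alongside clearTimeout covers both orderings: a late result is ignored after the timeout has fired, and the timer is cancelled when the result arrives first.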

Synchronous run of included function

I have a question regarding synchronous run of included function. For example, I have the following code.
partitions(dataset, num_of_folds, function(train, test, fold) {
train_and_test(train,test, function(err, results)
{
})
})
where partitions runs num_of_folds times, and for each different fold it returns different train and test sets. I want to run each iteration of partitions only when train_and_test has finished. How can I do that?
Given your previous question: Optimum async flow for cross validation in node.js, I understand that partition is your own function, which splits the dataset into several parts, and then runs the callback (basically train_and_test here) for each of those parts.
Your issue now is that train_and_test is asynchronous, but you want to wait for each invocation to finish (which is signalled by its own callback being called, I assume) before running the next one.
One trivial solution is to change your code to keep state, and run the next invocation from the callback. For instance:
exports.partitions = function(dataset, numOfPartitions, callback) {
var testSetCount = dataset.length / numOfPartitions;
var iPartition=0;
var iteration = function() {
if (iPartition<numOfPartitions)
{
var testSetStart = iPartition*testSetCount;
var partition = exports.partition(dataset, testSetStart, testSetCount);
// advance first, in case the callback invokes iteration synchronously
var fold = iPartition++;
callback(partition.train, partition.test, fold, iteration);
}
};
iteration();
};
You'll then need to pass the additional callback down to your asynchronous function:
partitions(dataset, num_of_folds, function(train, test, fold, callback) {
train_and_test(train,test, function(err, results)
{
callback();
})
});
Note that I haven't tested any of the code above.
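To make the pattern concrete, here is a small self-contained sketch of the same idea. Everything in it is an assumption made for illustration: trainAndTest is a stand-in that just waits a few milliseconds, the fold split uses simple array slicing, and the extra onDone parameter is not part of the original partitions signature.

```javascript
// Hypothetical stand-in for train_and_test: finishes after a short delay.
function trainAndTest(train, test, callback) {
  setTimeout(function () {
    callback(null, { trainSize: train.length, testSize: test.length });
  }, 5);
}

// Call onFold(train, test, fold, next) for each fold in turn. The next fold
// is only prepared after the current one calls next(); onDone fires at the end.
function partitions(dataset, numOfFolds, onFold, onDone) {
  var foldSize = Math.floor(dataset.length / numOfFolds);
  var fold = 0;
  function iteration() {
    if (fold === numOfFolds) {
      return onDone();
    }
    var start = fold * foldSize;
    var test = dataset.slice(start, start + foldSize);
    var train = dataset.slice(0, start).concat(dataset.slice(start + foldSize));
    // fold++ passes the current index and advances before onFold runs.
    onFold(train, test, fold++, iteration);
  }
  iteration();
}

var log = [];
partitions([1, 2, 3, 4, 5, 6], 3, function (train, test, fold, next) {
  log.push('start ' + fold);
  trainAndTest(train, test, function (err, results) {
    log.push('end ' + fold);
    next();
  });
}, function () {
  console.log(log.join(', '));
});
```

Each "start" entry in the log is followed by its matching "end" entry before the next fold begins, which is the ordering the question asks for.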
I want to run every iteration of partitions only when train_and_test is finished. How to do so?
That's entirely up to how partitions calls the callback you're giving it. If partitions calls the callback normally, then it will wait for the callback to return before proceeding. But if it calls it asynchronously, for instance via nextTick or setTimeout or similar, then the only way you'd be able to tell it to wait would be if it provides a means of telling it that.
If I read your question another way: If train_and_test is asynchronous, and partitions isn't designed to deal with asynchronous callbacks, you can't make it wait.

node.js and asynchronous programming paradigm

This question might be a duplicate. I am a noob to node.js and the asynchronous programming paradigm. I have searched Google and seen a lot of examples on this, but I am still a bit confused.
OK, from my Google search, what I understand is that all callbacks are handled asynchronously.
For example, let's take the readFile function from the node.js API:
fs.readFile(filename, [options], callback) // callback here will be handled asynchronously
fs.readFileSync(filename, [options])
var fs = require('fs');
fs.readFile('async-try.js', 'utf8', function(err, data){
console.log(data);
});
console.log("hii");
The above code will first print hii, then it will print the content of the file.
So, my questions are:
Are all callbacks handled asynchronously?
The below code is not asynchronous, why and how do I make it?
function compute(callback){
for(var i =0; i < 1000 ; i++){}
callback(i);
}
function print(num){
console.log("value of i is:" + num);
}
compute(print);
console.log("hii");
Are all callbacks handled asynchronously?
Not necessarily. Generally, they are, because in NodeJS their very goal is to resume execution of a function (a continuation) after a long running task finishes (typically, IO operations). However, you wrote yourself a synchronous callback, so as you can see they're not always asynchronous.
The below code is not asynchronous, why and how do I make it?
If you want your callback to be called asynchronously, you have to tell Node to execute it "when it has time to do so". In other words, you defer execution of your callback for later, when Node will have finished the ongoing execution.
function compute(callback){
for (var i = 0; i < 1000; i++);
// Defer execution for later
process.nextTick(function () { callback(i); });
}
Output:
hii
value of i is:1000
For more information on how asynchronous callbacks work, please read this blog post that explains how process.nextTick works.
No, that is a regular function call.
A callback will not be asynchronous unless it is forced to be. A good way to do this is by calling it within a setTimeout of 0 milliseconds, e.g.
setTimeout(function() {
// Am now asynchronous
}, 0);
Generally callbacks are made asynchronous when the calling function involves making a new request on the server (e.g. Opening a new file) and it doesn't make sense to halt execution whilst waiting for it to complete.
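The difference between a direct call, process.nextTick and setTimeout can be observed with a short script. This is a sketch for illustration; the order array and its labels are not from the original answers.

```javascript
var order = [];

function compute(callback) {
  for (var i = 0; i < 1000; i++) {}
  callback(i); // plain function call: runs before compute returns
}

compute(function () { order.push('direct call'); });
setTimeout(function () { order.push('setTimeout 0'); }, 0);
process.nextTick(function () { order.push('nextTick'); });
order.push('end of script');

// Report once the deferred callbacks have had a chance to run.
setTimeout(function () {
  console.log(order.join(' -> '));
}, 20);
```

The direct call is recorded before the script reaches its last line, while both deferred callbacks run only after the synchronous code has finished, with the nextTick callback ahead of the timer.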
The below code is not asynchronous, why and how do I make it?
function compute(callback){
for(var i =0; i < 1000 ; i++){}
callback(i);
}
I'm going to assume your code is trying to say, "I need to do something 1000 times then use my callback when everything is complete".
Even your for loop won't work here, because imagine this:
function compute(callback){
for(var i =0; i < 1000 ; i++){
DatabaseModel.save( function (err, result) {
// ^^^^^^ or whatever, Some async function here.
console.log("I am called when the record is saved!!");
});
}
callback(i);
}
In this case your for loop will start the save calls but not wait around for them to be completed. Because the save callbacks only run after the current synchronous code has finished, your example would produce output like
value of i is:1000
hii
I am called when the record is saved!!
...
For your compute method to only call the callback when everything is truly complete (all 1000 records have been saved in the database), I would look into the async Node package, which can do this easily for you and provides patterns for many async problems you'll face in Node.
So, you could rewrite your compute function to be like:
function compute(callback){
var count = 0
async.whilst(
function() { return count < 1000 },
function(callback_for_async_module) {
DatabaseModel.save( function (err, result) {
console.log("I am called when the record is saved!!");
// increment before signalling completion so the next test sees the new count
count++;
callback_for_async_module(err);
});
},
function(err) {
// called once the test function returns false (after 1000 saves) or on error
callback(count);
}
);
console.log("Out of compute method!");
}
Note that your compute function's callback parameter will get called sometime after console.log("Out of compute method"). This function is now asynchronous: the rest of the application does not wait around for compute to complete.
You can wrap every callback call in a timeout of one millisecond. That way the callbacks run only once all synchronous work is done and the thread is free; Node then works through the queue of pending timeouts.

How to end on first async parallel task completion in Node?

I have a list of tasks that I want to run in parallel using https://github.com/caolan/async.
I want the program to proceed (probably through a callback) after the first of these parallel tasks is complete, not all of them. So I don't think the naive
async.parallel([task1, task2], callback)
works for me.
Alternatively I could spawn two tasks and cancel the incomplete one, but I can't figure out how to do that using async either.
Thanks!
-Charlie
Parallel Race
You can get async to invoke the final callback early by passing it a truthy "error" value that isn't actually an error.
I've put together an example that uses -1 as an error code. In the final callback I check the error value and if it's not -1 then it's an actual error. If the error value is -1 then we'll have a valid value in results. At that point, we just need to remove extra elements from results of the other async functions that have not completed yet.
In the below example I've used the request module to pull html pages and the underscore module to filter the results in the final callback.
var async = require('async');
var request = require('request');
var _ = require('underscore');
exports.parallel = function(req, res) {
async.parallel([
/* Grab Google.jp */
function(callback) {
request("http://google.jp", function(err, response, body) {
if(err) { console.log(err); callback(true); return; }
callback(-1,"google.jp");
});
},
/* Grab Google.com */
function(callback) {
request("http://google.com", function(err, response, body) {
if(err) { console.log(err); callback(true); return; }
callback(-1,"google.com");
});
}
],
/* callback handler */
function(err, results) {
/* Actual error */
if(err && err!=-1) {
console.log(err);
return;
}
/* First data */
if(err===-1) {
/*
* async#parallel returns a list, one element per parallel function.
* Functions that haven't finished yet are in the list as undefined.
* use underscore to easily filter the one result.
*/
var one = _.filter(results, function(x) {
return x !== undefined;
})[0];
console.log(results);
console.log(one);
res.send(one);
}
}
);
};
Remaining Function Results
When you set up async#parallel to work like this, you won't have access to the results of the other asynchronous functions. If you're only interested in the first one to respond, this isn't a problem. However, you will not be able to cancel the other requests. That's most likely not a problem, but it might be a consideration.
The async.parallel documentation says:
If any of the functions pass an error to its callback, the main callback is immediately called with the value of the error.
So you could return an error object from all of your parallel functors, and the first one to finish would jump you to the completion callback. Perhaps even your own special error class, so you can tell the difference between an actual error and a "hey I won" error.
Having said that, you would still have your parallel functions running, potentially waiting for callbacks to complete or whatever. Perhaps you could use async.parallelLimit to make sure you're not firing off too many tasks in parallel?
Having said all that, it's possible you are better served by trying another method from the async library for this task - firing off parallel tasks then having these tasks race each other may not be the best idea.
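One such alternative, outside the async library altogether, is the built-in Promise.race. The sketch below swaps in that technique rather than showing anything async.js provides; makeTask and firstToFinish are hypothetical names, and the tasks are assumed to return promises instead of taking callbacks.

```javascript
// Hypothetical task factory: resolves with `name` after `delay` milliseconds.
function makeTask(name, delay) {
  return function () {
    return new Promise(function (resolve) {
      setTimeout(function () { resolve(name); }, delay);
    });
  };
}

// Start every task at once and settle as soon as the first one settles.
// The slower tasks keep running; Promise.race does not cancel them.
function firstToFinish(tasks) {
  return Promise.race(tasks.map(function (task) { return task(); }));
}

var winner = firstToFinish([makeTask('slow', 60), makeTask('fast', 10)]);
winner.then(function (name) {
  console.log('first result:', name);
});
```

As with the error-code trick, the losing tasks are not cancelled. Note also that Promise.race settles on the first task to finish either way, so an early rejection wins the race too.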

Javascript - waiting for a number of asynchronous callbacks to return?

What's the best way/library for handling multiple asynchronous callbacks? Right now, I have something like this:
_.each(stuff, function(thing){
async(thing, callback);
});
I need to execute some code after the callback has been fired for each element in stuff.
What's the cleanest way to do this? I'm open to using libraries.
Since you're already using Underscore you might look at _.after. It does exactly what you're asking for. From the docs:
after _.after(count, function)
Creates a version of the function that will only be run after first being called count times. Useful for grouping asynchronous responses, where you want to be sure that all the async calls have finished, before proceeding.
There is a great library called Async.js that helps solve problems like this with many async and flow-control helpers. It provides several forEach functions that can help you run callbacks for every item in an array/object.
Check out:
https://github.com/caolan/async#forEach
// will print 1,2,3,4,5,6,7,all done
var arr = [1,2,3,4,5,6,7];
function doSomething(item, done) {
setTimeout(function() {
console.log(item);
done(); // call this when you're done with whatever you're doing
}, 50);
}
async.forEach(arr, doSomething, function(err) {
console.log("all done");
});
I recommend https://github.com/caolan/async for this. You can use async.parallel to do this.
function stuffDoer(thing) {
return function (callback) {
//Do stuff here with thing
callback(null, thing);
}
}
var work = _.map(stuff, stuffDoer)
async.parallel(work, function (error, results) {
//error will be defined if anything passed an error to the callback
//results will be an array, in the same order as work, of whatever values
//the worker functions passed to the callback
});
async.parallel() / async.series should suit your requirement. You can provide a final callback that gets executed when all the REST calls succeed.
async.parallel([
function(){ ... },
function(){ ... }
], callback);
async.series([
function(){ ... },
function(){ ... }
], callback);
Have a counter, say async_count. Increase it by one every time you start a request (inside your loop), and have the callback reduce it by one and check whether zero has been reached. If so, all the callbacks have returned.
EDIT: Although, if I were the one writing this, I would chain the requests rather than running them in parallel - in other words, I have a queue of requests and have the callback check the queue for the next request to make.
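The counter approach can be sketched in a few lines of plain JavaScript. The names forEachThenDone and doubleLater are hypothetical and exist only for this illustration; the worker is assumed to take an item and a Node-style callback(err, value).

```javascript
// Hypothetical async operation: reports `thing * 2` after a short delay.
function doubleLater(thing, callback) {
  setTimeout(function () { callback(null, thing * 2); }, 5);
}

// Start one call per item and invoke done(results) once every callback has
// fired. `remaining` plays the role of the async_count counter.
function forEachThenDone(items, worker, done) {
  var remaining = items.length;
  var results = [];
  if (remaining === 0) {
    return done(results);
  }
  items.forEach(function (item, index) {
    worker(item, function (err, value) {
      // Store by index so results line up with items regardless of timing.
      results[index] = err ? undefined : value;
      remaining -= 1;
      if (remaining === 0) {
        done(results);
      }
    });
  });
}

var finalResults = null;
forEachThenDone([1, 2, 3], doubleLater, function (results) {
  finalResults = results;
  console.log('all done', results);
});
```

Initialising the counter to the full item count before starting any work avoids the case where a worker that calls back synchronously would bring the counter to zero too early.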
See my response to a similar question:
Coordinating parallel execution in node.js
My fork() function maintains the counter internally and automatically.
