Async each function - javascript

I'm trying to understand the async each(coll, iteratee, callback) function, which executes a function in parallel for each item of an array. From the async docs I understand that callback will be executed only once (after the iteratee function has been executed for every item of the array).
And in the case of an error in the iteratee function, calling callback('some error message') will immediately call the callback function with the error message.
Below is an example from the async docs for each function
each(coll, iteratee, callback)
// assuming openFiles is an array of file names
async.each(openFiles, function(file, callback) {
// Perform operation on file here.
console.log('Processing file ' + file);
if( file.length > 32 ) {
console.log('This file name is too long');
callback('File name too long');
} else {
// Do work to process file here
console.log('File processed');
callback();
}
}, function(err) {
// if any of the file processing produced an error, err would equal that error
if( err ) {
// One of the iterations produced an error.
// All processing will now stop.
console.log('A file failed to process');
} else {
console.log('All files have been processed successfully');
}
});
What I'm not able to understand is what calling callback() without an argument does. It looks very strange to me that we call callback() with no argument when there is no error in the iteratee function. What does calling callback() or callback(null) do when there are no errors?
Can't we just remove those callback() or callback(null) calls, since we actually mean to call the callback only once (after the iteratee function has been executed for all the elements of the array) rather than once for each item of the array?

What does calling callback() or callback(null) do in case of no errors.
Calling callback with no arguments or with null signals to async.each that the iteratee function finished executing on that item (file in the case of the example). When all of the iteratee functions have called their respective callbacks, or one of them passes an error to its callback, async.each will call the original callback function passed to it.
To elaborate on that a little, async.js is designed to handle asynchronous functions. The benefit (or issue, depending on how you look at it) of an asynchronous function is that the caller cannot tell in advance when it will finish executing. The way to deal with this is to pass the asynchronous function another function, a callback, to execute when it is finished. The asynchronous function can pass any errors it encounters, or any data it retrieves, to the original calling function through the callback function. For example, fs.readFile passes the read file data, and any errors, through the callback function it is passed.
Can't we just remove those callback() or callback(null), when we actually mean to call the callback only once (when iteratee function is executed for all the elements of the array) rather than for each item of the array.
No, because async.js has to assume the iteratee function is asynchronous, and therefore it has to wait for it to call its callback. The callback passed to async.each is only called once.
The confusion may be caused by the variable names. An individual callback function is only meant to be called once. The callback function passed to async.each is not the same callback passed to the iteratee. Each time iteratee is invoked with a value in coll, it is passed a new callback function. That call of iteratee is only supposed to call the passed callback once (async will throw an error otherwise). This allows async to track whether a call to the iteratee function has called its callback, and wait for the rest to call their respective callbacks. Once all of the callback functions have been called, async.each knows all of the asynchronous iteratee function calls have finished executing, and that it can call the original callback passed to it.
This is one of the difficult aspects of creating docs. They have to be concise enough that a developer can get information from them quickly, and also include enough detail to explain the concept or the function. It's a hard balance to achieve sometimes.

Calling callback with no arguments increments a counter inside the .each function. When that counter reaches the number of items, .each calls your final callback. Without those calls, it would never know when all the items had completed.
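To make the counter idea concrete, here is a minimal sketch of an each-style helper. It is not the async.js source; eachSketch and its internal variables are hypothetical names, and the real library handles more cases (iterables, objects, concurrency limits).

```javascript
// Simplified sketch of an each-style helper. This is NOT the async.js source;
// eachSketch and its internals are hypothetical names used for illustration.
function eachSketch(coll, iteratee, finalCallback) {
  let completed = 0;    // how many per-item callbacks reported success
  let finished = false; // true once finalCallback has been called

  if (coll.length === 0) {
    return finalCallback(null);
  }

  coll.forEach(function (item) {
    let called = false; // each item gets its own one-shot callback

    iteratee(item, function itemCallback(err) {
      if (called) {
        throw new Error('Callback was already called.');
      }
      called = true;
      if (finished) {
        return; // an earlier error already ended the run
      }
      if (err) {
        finished = true;
        return finalCallback(err);
      }
      completed += 1;
      if (completed === coll.length) {
        finished = true;
        finalCallback(null);
      }
    });
  });
}

// Usage: the final callback fires once, after both items report back.
eachSketch(['a.txt', 'b.txt'], function (file, callback) {
  setTimeout(callback, 10); // pretend to do asynchronous work, then report success
}, function (err) {
  console.log(err ? 'A file failed to process' : 'All files have been processed successfully');
});
```

If an iteratee never calls its callback, completed never reaches coll.length, so the final callback is never invoked. That is why the callback() calls cannot simply be removed.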

Related

How do functions like fs.writeFile() callback arguments work?

I'm learning JavaScript at the moment, specifically Node JS. And during a basic tutorial I notice that some functions have callbacks with predefined arguments. For example
const fs = require('fs');
const path = require('path');

fs.writeFile(
    path.join(__dirname, 'test', 'hello.txt'),
    "Hello World",
    err => {
        if (err) throw err;
        console.log('File Created...');
    }
);
I want to know where was "err" defined as a ErrNoException type.
Thanks!
You can write similar functions yourself. A callback is simply a function (F) that is passed to another function (A) as a parameter. Function A executes some tasks and then calls function F, which it received as a parameter.
For example:
function getData(url, callback) {
fetch(url)
.then(res => callback(null, res))
.catch(err => callback(err, null))
}
In this sample function, we make a fetch request and wait until either a response or an error is produced. If a response arrives, we pass it as the callback's second argument and pass null as the first; if an error occurs, we pass the error as the first argument and null as the second.
Now, while calling the function,
getData('http://stackoverflow.com', (err, res) => {
if(err)
return console.log(err)
console.log(res)
})
I want to know where was "err" defined as a ErrNoException type.
In this type of interface, you pass a callback function and fs.writeFile() determines what arguments it will pass that callback when it calls it. It is up to you to declare your callback function with appropriately named arguments that match what the caller will be passing it. So, YOU must declare the callback with an appropriately named argument. Since JavaScript is not a statically typed language, you don't specify what the type of that argument is. The type of the argument is determined by the code that calls the callback. In this case that is the fs.writeFile() implementation. To use the callback appropriately, you have to read the documentation for fs.writeFile() so that you know how to use the argument to the callback.
So, in this specific case, fs.writeFile() requires a callback function, and that callback function will be passed one argument, which will either be null (if there was no error) or an Error object that explains what the error was. You must learn this by either reading the code for fs.writeFile() or reading some documentation for it.
It's important to realize that when you define your callback function, all you are doing is providing a name for the arguments you expect the callback to be called with. You can then use that name in your code to reference those arguments. The actual data that is passed to that callback when it is called will be supplied by the fs.writeFile() function when it calls the callback.
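As a rough illustration of that point, the sketch below uses a hypothetical stand-in for an API like fs.writeFile(). saveTextSketch is an invented name, not a Node function; it only shows that the callee chooses the values and the caller chooses the parameter names.

```javascript
// Hypothetical stand-in for an API like fs.writeFile(); saveTextSketch is not a real Node function.
function saveTextSketch(text, callback) {
  // The callee decides what the callback receives: here, one argument that is
  // either null (success) or an Error object (failure).
  const err = typeof text === 'string' ? null : new TypeError('text must be a string');
  setImmediate(callback, err); // invoke later, as an asynchronous API would
}

// The parameter name is the caller's choice; only its position matters.
saveTextSketch('Hello World', function (problem) {
  if (problem) throw problem;
  console.log('File Created...');
});

// Same API, different parameter name, failure case.
saveTextSketch(42, function (e) {
  console.log('Got an error object:', e instanceof TypeError);
});
```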
It's unclear exactly what you mean by "defined as a ErrNoException type". Nodejs has an internal function (not available for external use) called errNoException() and you can see the source here. It uses that function internally for creating an Error object with some common properties such as an error code and a stack trace.

Saving a value through JavaScript's request method

I run into an issue when trying to use the request method in javascript where I can't save a value. I'll run a block of code like:
let savedData;
request({
url: url,
json: true
}, function (err, resp, body) {
if (err) {
return;
}
savedData = body.data;
});
console.log(savedData);
I know that request doesn't block or something, so I think it's run after the console.log or something like that? I just need to know how I can save the desired data for use later in the method.
Your code is working correctly, you're just neglecting the fact that the callback provided as a second parameter to request() is executed asynchronously.
At the time when your console.log() is executed, the network request may or may not have successfully returned the value yet.
Further Explanation
Take a look at the documentation for the request() function.
It states that the function call takes the following signature,
request(options, callback);
In JavaScript, a callback does exactly what the name implies: it calls back by executing the provided function after it has done what it first needs to do.
This asynchronous behavior is especially prominent in making networking requests, since you wouldn't want your program to freeze and wait for the network request to retrieve or send what you've requested.
Example
function callback() {
console.log('I finished doing my asynchronous stuff!');
console.log('Just calling you back as I promised.');
}
console.log('Running some asynchronous code!');
request({...options}, callback);
console.log("Hi! I'm being called since I'm the next line of code; and the callback will be called when it's ready.");
Output
Running some asynchronous code!
Hi! I'm being called since I'm the next line of code; and the callback
will be called when it's ready.
I finished doing my asynchronous stuff!
Just calling you back as I promised.
You'll either need to do the rest of the work in the callback of the request function, or use a promise. Anything outside of that callback is going to execute before savedData ever shows up.
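A minimal sketch of the promise approach follows. It does not use the request library: fakeRequest is a hypothetical stand-in that responds after a short delay with an invented body, and fetchData and main are illustrative names.

```javascript
// Hypothetical stand-in for request(); it "responds" asynchronously with a made-up body.
function fakeRequest(options, callback) {
  setTimeout(function () {
    callback(null, { statusCode: 200 }, { data: 'payload for ' + options.url });
  }, 10);
}

// Wrap the callback-style API in a promise so the result can be awaited.
function fetchData(url) {
  return new Promise(function (resolve, reject) {
    fakeRequest({ url: url, json: true }, function (err, resp, body) {
      if (err) {
        return reject(err);
      }
      resolve(body.data);
    });
  });
}

async function main() {
  const savedData = await fetchData('http://example.com');
  // This line runs only after the response has arrived.
  console.log(savedData);
  return savedData;
}

main();
```

Code that needs savedData goes after the await (or inside the callback), never on the line right after the request call.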

When is setImmediate required in node callback

I would like to clarify - when is it correct to use setImmediate for a node callback.
The examples/articles I've studied argue it is best to use setImmediate to ensure a callback is asynchronous. The most common example is where a value might exist in a "cache" e.g.
const getData = function(id,callback) {
const cacheValue = cache[id];
if (cacheValue) {
// DON'T DO THIS ...
return callback(null,cacheValue);
// DO THIS ...
return setImmediate(callback,null,cacheValue);
}
return queryDB(id,function(err,result){
if (err){
return callback(err);
}
return callback(null,result);
});
};
Here's where it gets confusing. Most node examples of error handling in callbacks never seem to call setImmediate. From my example above:
if (err) {
return callback(err);
}
instead of:
if (err) {
return setImmediate(callback,err);
}
I read some arguments that say that setImmediate is not necessary in this case and indeed can impact performance. Why is this? Is this example not the same as the example of accessing the cache?
Is it better to be consistent and always use setImmediate? In which case then, why not do the following:
const getData = function(id,callback) {
const cacheValue = cache[id];
if (cacheValue) {
return setImmediate(callback,null,cacheValue);
}
return queryDB(id,function(err,result){
if (err){
return setImmediate(callback,err);
}
return setImmediate(callback,null,result);
});
};
Quick answer
If you're calling the callback synchronously (before the host function has returned), you should use setImmediate(). If you are calling the callback asynchronously (after the function has returned), you do not need it and can call the callback directly.
Longer Answer
When you have an interface that accepts a callback and that callback is at least sometimes called asynchronously (meaning some indefinite time in the future after your function has returned), then it is a good practice to always call it asynchronously, even if the result is known immediately. This is so that the caller of your function and user of the callback will always see a consistent asynchronous interface.
As you seem aware of, a classic example of this would be with a cached result. If the result is in the cache, then the result is known immediately.
There is no law of Javascript-land that one MUST always call the callback asynchronously. Code may work OK if the callback is sometimes called synchronously and sometimes called asynchronously, but it is more subject to bugs caused by how the callback is used. If the callback is always called asynchronously, then the caller is less likely to accidentally create a bug by how they use the callback.
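The difference a caller can observe is sketched below. The cache contents, the 'fetched value' placeholder, and the function names are assumptions made for illustration; setImmediate stands in for a real database query.

```javascript
const cache = { a: 'cached value' }; // hypothetical in-memory cache

// Sometimes synchronous (cache hit), sometimes asynchronous (cache miss).
function getDataInconsistent(id, callback) {
  if (cache[id]) {
    return callback(null, cache[id]);
  }
  setImmediate(callback, null, 'fetched value'); // stands in for a database query
}

// Always asynchronous, whether or not the value is cached.
function getDataConsistent(id, callback) {
  if (cache[id]) {
    return setImmediate(callback, null, cache[id]);
  }
  setImmediate(callback, null, 'fetched value');
}

// Records whether the callback ran before or after the line following the call.
function traceOrder(getData, id) {
  const order = [];
  getData(id, function () { order.push('callback'); });
  order.push('after call');
  return order; // may still receive 'callback' later
}

const inconsistentOrder = traceOrder(getDataInconsistent, 'a');
const consistentOrder = traceOrder(getDataConsistent, 'a');

setImmediate(function () {
  console.log(inconsistentOrder); // [ 'callback', 'after call' ]
  console.log(consistentOrder);   // [ 'after call', 'callback' ]
});
```

With the inconsistent version, any setup the caller does after the call (assigning a variable, attaching a listener) has not happened yet when the callback runs on a cache hit, but has happened on a cache miss. That is the kind of bug the always-asynchronous rule avoids.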
In the case of:
if (err) {
return callback(err);
}
My guess is that this is already in an asynchronous location. For example:
function someFunction(someUrl, callback) {
    // request() calls back with (err, response, body), error first
    request(someUrl, function(err, response, body) {
        // this is already inside an async response
        if (err) {
            callback(err);
        } else {
            callback(null, body);
        }
    });
}
I read some arguments that say that setImmediate is not necessary in this case and indeed can impact performance, why is this? Is this example not the same as the example of accessing the cache.
In this case, the if (err) is already in an asynchronous callback part of the code. The host function has already returned, so calling the callback here without setImmediate() already happens asynchronously. There is no need for an additional setImmediate().
The only time the setImmediate() is needed is when you are still in the synchronous body of the function and calling the callback there would call it before the host function returns, thus making the callback synchronous instead of asynchronous.
Summary
When is setImmediate required in node callback
So, to summarize: you should use setImmediate(callback) when you are in code that is executing synchronously before the host function has returned. You do not need to use setImmediate(callback) when the code you are in is already in an asynchronous callback and the host function has already returned.
FYI, this is one reason to exclusively use promises in your asynchronous programming interfaces (instead of plain callbacks) because promises already handle this for you automatically. They guarantee that a .then() handler on a resolved promise will always be called asynchronously, even if the promise is resolved synchronously.
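A small sketch of that guarantee, using only standard promises (the variable names are illustrative):

```javascript
const order = [];

// The promise is already resolved when .then() is attached...
const alreadyResolved = Promise.resolve('cached value');
alreadyResolved.then(function (value) {
  order.push('then handler: ' + value);
});

// ...yet this line still runs before the handler does.
order.push('code after .then()');

setImmediate(function () {
  console.log(order); // [ 'code after .then()', 'then handler: cached value' ]
});
```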

d3.queue never triggering .await function

I've run into a problem using d3-queue. This is my code:
var dataQueue = d3.queue();
dataQueue.defer(collectData,ISBNs,locations)
.await(processData);
Where collectData is a function that does several API calls (a large number of them to the Google Books API).
Now the problem is that the processData function is never called. I know for a fact that the collectData function runs properly, since I put a print statement just before the return statement, along with several other print statements along the way.
You are not passing your data correctly between the deferred task collectData and the final processData. The documentation has it as follows (emphasis mine):
queue.defer(task[, arguments…])
Adds the specified asynchronous task callback to the queue, with any optional arguments. The task is a function that will be called when the task should start. It is passed the specified optional arguments and an additional callback as the last argument; the callback must be invoked by the task when it finishes. The task must invoke the callback with two arguments: the error, if any, and the result of the task.
Thus, to pass the result of the deferred task to the function processData, your function collectData() has to be something like this:
function collectData(ISBNs, locations, callback) {
var error = null; // The error, if any
var data = { }; // The actual data to pass on
// ...collect your data...
// Pass the collected data (and the error) by invoking the callback.
callback(error, data);
}
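Since collectData makes several API calls, it has to wait for all of them before invoking the callback; returning from the function is not enough. The sketch below shows one way that could look. lookupBook is a hypothetical stand-in for a single Google Books API call, the result shape is invented, and locations is left unused because the question does not say how it is used.

```javascript
// Hypothetical async lookup standing in for one Google Books API call.
function lookupBook(isbn, done) {
  setTimeout(function () {
    done(null, { isbn: isbn, title: 'Title for ' + isbn });
  }, 5);
}

function collectData(ISBNs, locations, callback) {
  const data = [];
  let pending = ISBNs.length; // lookups that have not reported back yet
  let failed = false;

  if (pending === 0) {
    return setImmediate(callback, null, data);
  }

  ISBNs.forEach(function (isbn, i) {
    lookupBook(isbn, function (err, book) {
      if (failed) {
        return; // the callback was already invoked with an error
      }
      if (err) {
        failed = true;
        return callback(err);
      }
      data[i] = book;
      pending -= 1;
      if (pending === 0) {
        callback(null, data); // invoked exactly once, after the last lookup
      }
    });
  });
}

// With d3-queue this would be wired up as in the question:
// d3.queue().defer(collectData, ISBNs, locations).await(processData);
collectData(['111', '222'], [], function processData(error, data) {
  if (error) throw error;
  console.log(data.length + ' books collected');
});
```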

Node.js callback

I can't seem to grasp the concept of a callback. I haven't worked with them before so bear with me. To get my feet wet, I'm trying to log in to Twitter with zombie.js.
Here is an example:
var Browser = require("zombie");
var browser = new Browser({ debug: true})
browser.visit("https://mobile.twitter.com/session/new", function (callback) {
browser.fill("username", "xxxxx");
browser.fill("password", "xxxxx");
browser.pressButton("Sign in", function (err, success) {
if(err){
console.log(browser.text('.message'));
console.log('There has been a error: ' + err);
}
else{
console.log('Worked!');
}
});
});
At the browser.pressButton part, it will determine if I have been able to successfully login or not, depending on if .message contains the text "Typing on your phone stinks, we know! Double-check your username and password and try again."
However, I don't understand how it determines to fire the callback err. If .message isn't present in the html, then I would like to trigger the success callback to move onto the next function.
The convention Zombie seems to use for callbacks comes from node.js where the first argument is an error object, which should be null on success, and any subsequent arguments are for the success case. If you define a callback, the library you are using (Zombie in this case) will execute your callback function when their async operation is complete. When your callback is invoked it means "OK, an operation has completed and you can now process the result as you see fit". Your code needs to look at that first argument to decide if the operation was a success or failure.
When you accept a callback function as an argument and then perform some (possibly asynchronous) operation, the callback is the way for you to tell the calling library you are done, and again use that first argument to distinguish errors from success.
Part of your confusion is probably coming from the fact that your function signature for the callback to browser.visit is wrong. You need to name that first argument to clearly indicate it's an error like this:
browser.visit("https://mobile.twitter.com/session/new", function (error, browser) {
So in the body of that anonymous callback function, if zombie couldn't load that page, the error argument will have info about the error. If the page did load correctly, error will be null and the browser 2nd argument can be used to further tell zombie to do more stuff on the page. This is how Zombie says "I'm done with the visit operation, time for you to handle the results."
visit doesn't pass a callback argument, the anonymous function you pass as an argument to visit IS THE CALLBACK. You could code it like this to clarify (although nobody does)
browser.visit("https://mobile.twitter.com/session/new", function callback(error, browser) {
So that's a callback when a library needs to tell you it is done. You don't invoke it. The library invokes it and you put code inside it.
On the other hand, when your code does async operations, you need to invoke the callback that you received as a function argument appropriately to tell your caller that you are done and whether it was a success or a failure. In this case, you don't do any of your own async code, so there's no need for you to invoke any callback functions.
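For the second role, where your own code receives a callback and must invoke it, a minimal sketch could look like this. checkLoginResult is a hypothetical helper, not part of Zombie; it assumes an empty .message text means the login succeeded.

```javascript
// Hypothetical helper, not part of Zombie: reports success or failure based on
// the text found in the page's .message element (empty string means no message).
function checkLoginResult(messageText, callback) {
  setImmediate(function () {
    if (messageText) {
      // Error-first convention: failure goes in the first argument.
      return callback(new Error('Login failed: ' + messageText));
    }
    callback(null, 'logged in'); // null error signals success
  });
}

// The caller inspects the first argument to tell failure from success.
checkLoginResult('', function (err, result) {
  if (err) {
    return console.log('There has been an error: ' + err.message);
  }
  console.log('Worked! Result: ' + result);
});
```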
