async vs. Q generally
I'm learning Node.js development, and trying to wrap my brain around strategies for managing asynchronous "callback hell". The two main strategies I've explored are Caolan McMahon's async module, and Kris Kowal's promise-based Q module.
Like many other people, I'm still struggling to understand when you should use one vs. the other. However, generally speaking I have found promises and Q-based code to be slightly more intuitive, so I have been moving in that direction.
Mapping/Concatenating collections generally
However, I'm still stuck using the async module's functions for managing collections. Coming from a Java and Python background, most of the time when I work with a collection, the logic looks like this:
Initialize a new empty collection, in which to store results.
Perform a for-each loop with the old collection, applying some logic to each element and pushing its result into the new empty collection.
When the for-each loop ends, proceed to use the new collection.
In client-side JavaScript, I've grown accustomed to using jQuery's map() function... passing in that step #2 logic, and getting the step #3 result as a return value. Feels like the same basic approach.
Mapping/Concatenating collections with async and Q
The Node-side async module has similar map and concat functions, but they don't return the concatenated result back at the original scope level. You must instead descend into the callback hell to use the result. Example:
var deferred = Q.defer();
...
var entries = [???]; // some array of objects with "id" attributes
async.concat(entries, function (entry, callback) {
callback(null, entry.id);
}, function (err, ids) {
// We now have the "ids" array, holding the "id" attributes of all items in the "entries" array.
...
// Optionaly, perhaps do some sorting or other post-processing on "ids".
...
deferred.resolve(ids);
});
...
return deferred.promise;
Since my other functions are becoming promise-based, I have this code returning a promise object so it can be easily included in a then() chain.
Do I really need both?
The ultimate question that I'm struggling to articulate is: do I really need both async and Q in the code example above? I'm learning how to replace the async module's control flow with Q-style promise chains generally... but it hasn't yet "clicked" for me how to do mapping or concatenation of collections with a promise-based approach. Alternatively, I'd like to understand why you can't, or why it's not a good idea.
If async and Q are meant to work together as I am using them in the example above, then so be it. But I would prefer not to require the extra library dependency if I could cleanly use Q alone.
(Sorry if I'm missing something outrageously obvious. The asynchronous event-driven model is a very different world, and my head is still swimming.)
Do I really need both?
No. Mapping asynchronous iterators over a collection is quite simple with promises, but it requires two steps instead of one function call. First, the collection is mapped to an array of promises for the parallel iteration. Then, those promises are fed into Q.all to make one promise for the mapped collection. In contrast to async, the order of the result is guaranteed.
var entries = […]; // some array of objects with "id" attributes
var promises = entries.map(function(object) {
return asyncPromiseReturingFunction(object);
}); // the anonymous wrapper might be omitted
return Q.all(promises);
For concat, you would have to append a
.then(function(results) {
return Array.prototype.concat.apply([], results);
});
Related
I am not sure whether the Array.sort(callback) is synchronous or asyncronous. But I have used the Array.sort(callback) to sort the date (updateddate which is stored in db as string type) in the following code snippet. Do i need to include await for Array.sort(callback) to ensure that the rest of the code is executed only when the array sorting is completed. Is it right way to use sorting above the synchronous code? Should I write the rest of the code inside the callback of data.sort?
modify_data(data,likesData){
data.sort(function(a,b){
return new Date(b.updateddate) - new Date(a.updateddate);
})
var nomination_group_id = _.groupBy(data,"submissionid")
var likes_group
var refined_arr = [];
var likesData = likesData
_.each(nomination_group_id,function(eachObj){
var mapObj = new Map()
mapObj.set('category',eachObj[0] ? eachObj[0].question : " ")
mapObj.set('submitter',eachObj[0] ? eachObj[0].email : " ")
refined_arr.push([ ...mapObj.values() ])
})
return refined_arr
}
I am not sure whether the Array.sort(callback) is synchronous or asyncronous
Its safe, and its synchronous.
Array.sort(sortFunction) argument sortFunction is synchronously applied on each element of the array. Also, Javascript is single-threaded.
const names = ['adelaine','ben', 'diana', 'carl']
const namesSortedAlphabetically = names.sort() // default is alphabetical sort
console.log('ALPHA SORT', namesSortedAlphabetically)
const namesSortedByLength = names.sort((a, b) => a.length > b.length) // custom sort
console.log('CUSTOM SORT', namesSortedByLength)
Do i need to include await for Array.sort(callback)
Nope, since Array.sort is synchronous and doesn't return a Promise.
It's synchronous. Callback itself doesn't mean owner function is asynchronous.
Callback is just for deciding how to sort. So you can put your next logic outside of callback safely.
It is safe & synchronous, because it is not a call back, it is just a comparator (compare function) see here
It is not calling this function at the end of doing something, it is using that function. Hope it calrifies
.sort() is a function from the JavaScript language, and it is determined to be synchronous by the ECMAScript specification.
(Functions of the node.js framework are ususally asynchronous by default, but .sort() is coming from JavaScript language itself, not from the node.js framework.)
It is very likely that you want to use it asynchronously, when it comes to any production level application. You can:
just wrap it around an asynchronous function scope, or
write your own asynchronous implementation, or
you can find several ready-to-use implementations for this issue, for example as an npm package.
You are right, you need to pay attention that the code which is counting on the sorting to be done needs to be executed after sorting. If you have chosen the first option in the 3 options before, you can:
put stuff inside one and only callback (likely minor things), or
build up a callback chain with .then()
use the await keyword as you have mentioned.
Please note that I didn't want to confuse you with so many solutions, but it's necessary to be aware of them. They will become second nature over time. If all of this feels a little confusing at this moment, I'd suggest you to start with the basic .sort() wrapped around an asynchronous function, and chained with .then() functions.
EDIT
As my peer mentioned above, .sort() itself does not accept a callback function, but a sorting function instead. That is correct. In order to use .sort() asynchronously, you have to wrap it around your own asynchronous function, as I've suggested.
Modify input is a bad idea and you don't need await to make sure data.sort is synchronous, just declare a new variable to hold the result of sort like:
var sorted_data = data.sort(function(a,b){
return new Date(b.updateddate) - new Date(a.updateddate);
})
var nomination_group_id = _.groupBy(sorted_data, "submissionid")
I have an async function that looks something like this:
const myMethod = async () => {
await Promise.all(
someIds.map((id) => {
someMethodThatReturnsPromise(id);
});
);
doSomethingElseAfterPromisesResolve();
};
This function contains a bug because it uses curly braces in its map function but doesn't return each promise. The Promise.all consumes the undefined return value and silently proceeds to the next statement, without waiting. The problem can be corrected by using parentheses instead of braces, or by including an explicit return statement.
My question is, how can I test this? I know I can use Promise.resolve() or Promise.reject() to simulate different states of the promise, and mock the return values of the inner method, but that doesn't really cover the problem. Outside a full blown integration test, how can I prevent the above error with a test?
Well, the issue is not that Promise.all() accepts null, it doesnt. What it accepts is arrays of the kind of [null] or [undefined] (which are 2 different things, actually)
As I mentioned in my comments, testing Promise.all() is not something I would do, it's third party code, you should be testing your own, so I think having a linter check for such bugs is a far superior solution
That being said, you are entitled to your opinions, I'll merely point out a possibility for achieving what you want. That is: monkey patching
You could overwrite the Promise.all() like so:
let originalAll = Promise.all;
Promise.all = (promises) => {
// you could check several other things
// but this covers what you wanted, right?
let isArrayWithBlanks = promises.includes(null) || promises.includes(undefined);
return isArrayWithBlanks
? Promise.reject("NO!")
: originalAll(promises);
};
Now you can easily write a test given you use this monkey patched Promise.all throughout your project. You decide where to place that code
Hope this helps
I would stub both the someMethodThatReturnsPromise and doSomethingElseAfterPromisesResolve functions, returning any value from both.
Then ensure that someIds has multiple values.
Your assertions could then be:
someMethodThatReturnsPromise is called once for each item in the someIds array
doSomethingElseAfterPromisesResolve is called
I'm writing a nodejs thing, and trying the Pacta promise library for fun. Pacta's interface is "algebraic," but I don't have any experience with that paradigm.
I'd like to know what is the "Pacta way" to accomplish the same thing as
$.when.apply(undefined, arrayOfThings)
.then(function onceAllThingsAreResolved(thing1Val, thing2Val, ...) {
// code that executes once all things have been coerced to settled promises
// and which receives ordered resolution values, either as
// separate args or as a single array arg
}
That is, given an array, an iterator function that returns a promise, and a callback function, I'd like to map the iterator onto the array and provide an array of the resolution values (or rejection reasons) to the callback once all the promises have been settled.
If there isn't an idiomatically algebraic way to express this, I'd be just as interested to know that.
EDIT: updated use of $.when to properly accommodate an array, per #Bergi.
Pacta's interface is "algebraic," but I don't have any experience with that paradigm.
ADTs are type theory constructs that represent nested data types, like a Promise for Integer. They are heavily used in functional programming, a flavour where you always know the types of your expressions and values. There are no intransparent, implicit type coercions, but only explicit ones.
This is completely contrary to jQuery's approach, where $.when() and .then() do completely different things based on the types (and number) of its arguments. Therefore, translating your code is a bit complicated. Admittedly, Pacta doesn't have the most useful implementation, so we have to use some own helper functions to do this.
Assume you have an array of (multiple) promises, and your then callback takes the arguments and returns a non-promise value:
arrayOfPromises.reduce(function(arr, val) {
return arr.append(val);
}, Promise.of([])).spread(function (…args) {
// code that executes once all promises have been fulfilled
// and which receives the resolution values as separate args
});
If your callback does not take multiple arguments, use map instead of spread:
arrayOfPromises.reduce(function(arrp, valp) {
return arrp.append(valp);
}, Promise.of([])).map(function (arr) {
// code that executes once all promises have been fulfilled
// and which receives the resolution values as an array
});
If your callback does return a promise, use chain instead of map:
arrayOfPromises.reduce(function(arr, val) {
return arr.append(val);
}, Promise.of([])).chain(function (arr) {
// code that executes once all promises have been fulfilled
// and which receives the resolution values as an array
});
If you don't know what it returns, use then instead of chain. If you don't know what it returns and want to get multiple arguments, use .spread(…).then(identity).
If your array contains promises mixed with plain values, use the following:
arrayOfThings.reduce(function(arrp, val) {
var p = new Promise();
Promise.resolve(p, val);
return arrp.append(p);
}, Promise.of([])).…
If your array contains only a single or no (non-thenable) value, use
Promise.of(arrayOfThings[0]).…
If your array contains anything else, even $.when would not do what you expect.
Of course, promises that resolve with multiple values are not supported at all - use arrays instead. Also, your callback will only be called when all promises are fulfilled, not when they're settled, just as jQuery does this.
I'm new to programming. I recently started with Javascript and in one of the topics there was create an array of functions. My question is what are those useful for? I didn't get the idea behind. Can someone help me understand?
Update: to make the question more clear I will use an example a colleague shared. Let's say we have this:
var twoDimensionalImageData = ...
var operations = [
function(pixel) { blur(pixel); },
function(pixel) { invert(pixel); },
function(pixel) { reflect(pixel); }
];
foreach(var pixel in twoDimensionalImageData)
foreach(var func in operations)
func( pixel );
Can this be achieved without the use of functions in array? Or can this be achieved without the use of function(pixel) in operations array? If yes I would like to understand why the function in array can be better than normal functions. What's the benefit of it?
I can see a possible use for an array-of-functions if you're wanting to massage data; rather than using function currying and composition, just apply a series of functions to the data, like macro steps. This might be useful in imaging applications, think of Photoshop's "Actions" feature.
var twoDimensionalImageData = ...
var operations = [
function(pixel) { blur(pixel); },
function(pixel) { invert(pixel); },
function(pixel) { reflect(pixel); }
];
foreach(var pixel in twoDimensionalImageData)
foreach(var func in operations)
func( pixel );
You could use a list of functions as a list of callbacks or have functions as listeners (instead of objects) of an Observer pattern.
Observer:
This is a very famous software design pattern. It consists of having one main object or, say, Entity, that lots of other objects are interested into. When this main Entity changes its attribute or something happens to it, it tells whoever is interested (or listening) that something happened (and what did).
List of callbacks:
Could be useful when, say, you made an Ajax request (Asynchronous Javascript and XML) to update your news feed and then you want to also execute many other steps. All you have to do if you have this list of functions is iterate through it and call them. (Yes, you could call them in a single function, but keeping them in an array would give it a lot of flexibility).
In both cases, it's very easy to "know" what functions you have to call :)
I started using Twisted in a project that require asynchronous programming and the docs are pretty good.
So my question is, is a Deferred in Twisted the same as a Promise in Javascript? If not, what are the differences?
The answer to your question is both Yes and No depending on why you're asking.
Yes:
Both a Twisted Deferred and a Javascript Promise implement a mechanism for queuing synchronous blocks of code to be run in a given order while being decoupled from other synchronous blocks of code.
No:
So Javascript's Promise is actually more similar to Python's Future, and the airy-fairy way to explain this is to talk about the Promise and the Resolver being combined to make a Deferred, and to state that this affects what you can do with the callbacks.
This is all very well and good in that it's accurate, however it doesn't really make anything any clearer, and without typing thousands of words where I'm almost guaranteed to make a mistake, I'm probably better quoting someone who knows a little something about Python.
Guido van Rossum on Deferreds:
Here's my attempt to explain Deferred's big ideas (and there are a lot
of them) to advanced Python users with no previous Twisted experience.
I also assume you have thought about asynchronous calls before. Just
to annoy Glyph, I am using a 5-star system to indicate the importance
of ideas, where 1 star is "good idea but pretty obvious" and 5 stars
is "brilliant".
I am showing a lot of code snippets, because some ideas are just best
expressed that way -- but I intentionally leave out lots of details,
and sometimes I show code that has bugs, if fixing them would reduce
understanding the idea behind the code. (I will point out such bugs.)
I am using Python 3.
Notes specifically for Glyph: (a) Consider this a draft for a blog
post. I'd be more than happy to take corrections and suggestions for
improvements. (b) This does not mean I am going to change Tulip to a
more Deferred-like model; but that's for a different thread.
Idea 1: Return a special object instead of taking a callback argument
When designing APIs that produce results asynchronously, you find that
you need a system for callbacks. Usually the first design that comes
to mind is to pass in a callback function that will be called when the
async operation is complete. I've even seen designs where if you don't
pass in a callback the operation is synchronous -- that's bad enough
I'd give it zero stars. But even the one-star version pollutes all
APIs with extra arguments that have to be passed around tediously.
Twisted's first big idea then is that it's better to return a special
object to which the caller can add a callback after receiving it. I
give this three stars because from it sprout so many of the other good
ideas. It is of course similar to the idea underlying the Futures and
Promises found in many languages and libraries, e.g. Python's
concurrent.futures (PEP 3148, closely following Java Futures, both of
which are meant for a threaded world) and now Tulip (PEP 3156, using a
similar design adapted for thread-less async operation).
Idea 2: Pass results from callback to callback
I think it's best to show some code first:
class Deferred:
def __init__(self):
self.callbacks = []
def addCallback(self, callback):
self.callbacks.append(callback) # Bug here
def callback(self, result):
for cb in self.callbacks:
result = cb(result)
The most interesting bits are the last two lines: the result of each
callback is passed to the next. This is different from how things work
in concurrent.futures and Tulip, where the result (once set) is fixed
as an attribute of the Future. Here the result can be modified by each
callback.
This enables a new pattern when one function returning a Deferred
calls another one and transforms its result, and this is what earns
this idea three stars. For example, suppose we have an async function
that reads a set of bookmarks, and we want to write an async function
that calls this and then sorts the bookmarks. Instead of inventing a
mechanism whereby one async function can wait for another (which we
will do later anyway :-), the second async function can simply add a
new callback to the Deferred returned by the first one:
def read_bookmarks_sorted():
d = read_bookmarks()
d.addCallback(sorted)
return d
The Deferred returned by this function represents a sorted list of
bookmarks. If its caller wants to print those bookmarks, it must add
another callback:
d = read_bookmarks_sorted()
d.addCallback(print)
In a world where async results are represented by Futures, this same
example would require two separate Futures: one returned by
read_bookmarks() representing the unsorted list, and a separate Future
returned by read_bookmarks_sorted() representing the sorted list.
There is one non-obvious bug in this version of the class: if
addCallback() is called after the Deferred has already fired (i.e. its
callback() method was called) then the callback added by addCallback()
will never be called. It's easy enough to fix this, but tedious, and
you can look it up in the Twisted source code. I'll carry this bug
through successive examples -- just pretend that you live in a world
where the result is never ready too soon. There are other problems
with this design too, but I'd rather call the solutions improvements
than bugfixes.
Aside: Twisted's poor choices of terminology
I don't know why, but, starting with the project's own name, Twisted
often rubs me the wrong way with its choice of names for things. For
example, I really like the guideline that class names should be nouns.
But 'Deferred' is an adjective, and not just any adjective, it's a
verb's past participle (and an overly long one at that :-). And why is
it in a module named twisted.internet?
Then there is 'callback', which is used for two related but distinct
purposes: it is the preferred term used for a function that will be
called when a result is ready, but it is also the name of the method
you call to "fire" the Deferred, i.e. set the (initial) result.
Don't get me started on the neologism/portmanteau that is 'errback',
which leads us to...
Idea 3: Integrated error handling
This idea gets only two stars (which I'm sure will disappoint many
Twisted fans) because it confused me a lot. I've also noted that the
Twisted docs have some trouble explaining how it works -- In this case
particularly I found that reading the code was more helpful than the
docs.
The basic idea is simple enough: what if the promise of firing the
Deferred with a result can't be fulfilled? When we write
d = pod_bay_doors.open()
d.addCallback(lambda _: pod.launch())
how is HAL 9000 supposed to say "I'm sorry, Dave. I'm afraid I can't
do that" ?
And even if we don't care for that answer, what should we do if one of
the callbacks raises an exception?
Twisted's solution is to bifurcate each callback into a callback and
an 'errback'. But that's not all -- in order to deal with exceptions
raised by callbacks, it also introduces a new class, 'Failure'. I'd
actually like to introduce the latter first, without introducing
errbacks:
class Failure:
def __init__(self):
self.exception = sys.exc_info()[1]
(By the way, great class name. And I mean this, I'm not being
sarcastic.)
Now we can rewrite the callback() method as follows:
def callback(self, result):
for cb in self.callbacks:
try:
result = cb(result)
except:
result = Failure()
This in itself I'd give two stars; the callback can use
isinstance(result, Failure) to tell regular results apart from
failures.
By the way, in Python 3 it might be possible to do away with the
separate Failure class encapsulating exceptions, and just use the
built-in BaseException class. From reading the comments in the code,
Twisted's Failure class mostly exists so that it can hold all the
information returned by sys.exc_info(), i.e. exception class/type,
exception instance, and traceback but in Python 3, exception objects
already hold a reference to the traceback.There is some debug stuff
that Twisted's Failure class does which standard exceptions don't, but
still, I think most reasons for introducing a separate class have been
addressed.
But let's not forget about the errbacks. We change the list of
callbacks to a list of pairs of callback functions, and we rewrite the
callback() method again, as follows:
def callback(self, result):
for (cb, eb) in self.callbacks:
if isinstance(result, Failure):
cb = eb # Use errback
try:
result = cb(result)
except:
result = Failure()
For convenience we also add an errback() method:
def errback(self, fail=None):
if fail is None:
fail = Failure()
self.callback(fail)
(The real errback() function has a few more special cases, it can be
called with either an exception or a Failure as argument, and the
Failure class takes an optional exception argument to prevent it from
using sys.exc_info(). But none of that is essential and it makes the
code snippets more complicated.)
In order to ensure that self.callbacks is a list of pairs we must also
update addCallback() (it still doesn't work right when called after
the Deferred has fired):
def addCallback(self, callback, errback=None):
if errback is None:
errback = lambda r: r
self.callbacks.append((callback, errback))
If this is called with just a callback function, the errback will be a
dummy that passes the result (i.e. a Failure instance) through
unchanged. This preserves the error condition for a subsequent error
handler. To make it easy to add an error handler without also handling
a regular resullt, we add addErrback(), as follows:
def addErrback(self, errback):
self.addCallback(lambda r: r, errback)
Here, the callback half of the pair will pass the (non-Failure) result
through unchanged to the next callback.
If you want the full motivation, read Twisted's Introduction to
Deferreds; I'll just end by noting that an errback and substitute a
regular result for a Failure just by returning a non-Failure value
(including None).
Before I move on to the next idea, let me point out that there are
more niceties in the real Deferred class. For example, you can specify
additional arguments to be passed to the callback and errback. But in
a pinch you can do this with lambdas, so I'm leaving it out, because
the extra code for doing the administration doesn't elucidate the
basic ideas.
Idea 4: Chaining Deferreds
This is a five-star idea! Sometimes it really is necessary for a
callback to wait for an additional async event before it can produce
the desired result. For example, suppose we have two basic async
operations, read_bookmarks() and sync_bookmarks(), and we want a
combined operation. If this was synchronous code, we could write:
def sync_and_read_bookmarks():
sync_bookmarks()
return read_bookmarks()
But how do we write this if all operations return Deferreds? With the
idea of chaining, we can do it as follows:
def sync_and_read_bookmarks():
d = sync_bookmarks()
d.addCallback(lambda unused_result: read_bookmarks())
return d
The lambda is needed because all callbacks are called with a result
value, but read_bookmarks() takes no arguments.