How to indicate "doneness" in an async operation - javascript

Given this code which counts the files in the given directory and subdirectories, how do I go about indicating that the operation is done?
function walkDirs(dirPath) {
var fs = require('fs'),
path = require('path'),
events = require('events'),
count = 0,
emitter = new events.EventEmitter();
function walkDir(dirPath) {
function readDirCallback(err, entries) {
for (var idx in entries) {
var fullPath = path.join(dirPath, entries[idx]);
(function statHandler(fullPath) {
fs.stat(fullPath, function statEach(err, stats) {
if (stats) {
if (stats.isDirectory()) {
walkDir(fullPath);
} else if (stats.isFile()) {
count += 1;
emitter.emit('counted', count, fullPath);
}
}
});
})(fullPath);
}
}
fs.readdir(dirPath, readDirCallback);
}
walkDir(dirPath);
return emitter;
}
var walker = walkDirs('C:');
I've tried specifically,
firing an event to indicate "doneness" at a place I thought appropriate, specifically after fs.readdir(dirPath, readDirCallback) call.
modifying statHandler() to return the count added. (I realized that this is effectively no different from incrementing count inside that function.)
Both of these failed, because when checked, the value of count is 0. I've determined that I'm not waiting until the operation (counting the files) is done. Obviously, I need to fire a callback or event when done to get the right count.
I know the code is successfully counting, because when attaching a debugger, the count value is as expected.
At this point, I'm fairly certain that I have no idea how to proceed. Specifically:
How do I implement indicating "doneness" in an asynchronous operation?

Asynchronous JavaScript functions generally call callback functions when they are finished. In this case, a done event would be appropriate if you have other events. Promises are now often preferred over just callbacks, but you do need to understand callbacks first.
Since readdir is asynchronous, execution reaching the next line doesn't mean it's finished. That's what's confusing about asynchronous code compared to synchronous code. If I were you, I would use the debugger and step through some really simple (simpler than this) async examples. It does take a while to get used to and is tricky.
For walking directories see https://www.npmjs.com/package/walk . You don't want to re-invent the wheel here. Always be sure to Google and/or search on npmjs for existing modules.
Once you are sure you really understand callbacks and asynchronous code then you can move on to the async module, and after that, promises with bluebird, ES6 promises, etc. Down the line something like this https://www.npmjs.com/package/co may be useful.
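A minimal sketch of the "done event" idea, assuming a counter of in-flight operations (the `pending` variable, the `finishOne` helper and the `done` event name are illustrative and not part of the original code). Every `readdir` or `stat` call that is started increments the counter, every callback decrements it, and the walk is finished when the counter drops back to zero:

```javascript
const fs = require('fs');
const path = require('path');
const { EventEmitter } = require('events');

function walkDirs(rootPath) {
  const emitter = new EventEmitter();
  let count = 0;
  let pending = 0; // async operations started but not yet finished

  function finishOne() {
    pending -= 1;
    if (pending === 0) {
      emitter.emit('done', count);
    }
  }

  function walkDir(dirPath) {
    pending += 1; // one readdir in flight
    fs.readdir(dirPath, function (err, entries) {
      if (!err) {
        entries.forEach(function (entry) {
          const fullPath = path.join(dirPath, entry);
          pending += 1; // one stat in flight
          fs.stat(fullPath, function (statErr, stats) {
            if (!statErr && stats.isDirectory()) {
              walkDir(fullPath); // increments pending before this stat is released
            } else if (!statErr && stats.isFile()) {
              count += 1;
              emitter.emit('counted', count, fullPath);
            }
            finishOne(); // this stat is finished
          });
        });
      }
      finishOne(); // this readdir is finished
    });
  }

  walkDir(rootPath);
  return emitter;
}
```

A caller would attach a listener such as `walkDirs('.').on('done', function (total) { console.log(total); })`. Because the callbacks only run after the current call stack has unwound, a listener attached right after `walkDirs()` returns is registered in time. Errors are silently skipped in this sketch.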

You can achieve this by using promises. In this example I chose Q.
npm install q
You resolve your promise whenever you consider the async function to be done by calling the .resolve() function. When the promise is resolved, it calls the success callback passed to the .then() function of the promise object returned by walkDirs. The callbacks passed to .then() are triggered when the promise is resolved or rejected; if you reject the promise, the error callback is called instead.
var q = require('q');
function walkDirs(dirPath) {
var deffered = q.defer();
var fs = require('fs'),
path = require('path'),
events = require('events'),
count = 0,
emitter = new events.EventEmitter();
function walkDir(dirPath) {
function readDirCallback(err, entries) {
for (var idx in entries) {
var fullPath = path.join(dirPath, entries[idx]);
(function statHandler(fullPath) {
fs.stat(fullPath, function statEach(err, stats) {
if (stats) {
if (stats.isDirectory()) {
walkDir(fullPath);
} else if (stats.isFile()) {
count += 1;
emitter.emit('counted', count, fullPath);
deffered.resolve(emitter); // resolve promise
}
}
});
})(fullPath);
}
}
fs.readdir(dirPath, readDirCallback);
}
walkDir(dirPath);
return deffered.promise; // the promise belongs to the deferred object, not to the q module
}
walkDirs('C:')
.then(success, error); // can also take an error callback if the promise is rejected
function success(data) {
// data = resolved data.
console.log("is successfully done");
}
function error(err) {
console.log("is errorly done");
}
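Note that a promise settles only once, so resolving it inside statEach reports completion as soon as the first file is counted. A minimal sketch of resolving only after the whole tree has been walked, assuming Node's built-in `fs.promises` API in place of Q (the `countFiles` name is illustrative):

```javascript
const fs = require('fs');
const path = require('path');

// Resolves with the number of regular files under dirPath.
async function countFiles(dirPath) {
  // withFileTypes returns Dirent objects, so no separate stat call is needed.
  const entries = await fs.promises.readdir(dirPath, { withFileTypes: true });
  const counts = await Promise.all(entries.map(function (entry) {
    if (entry.isDirectory()) {
      return countFiles(path.join(dirPath, entry.name));
    }
    return entry.isFile() ? 1 : 0;
  }));
  return counts.reduce(function (sum, n) { return sum + n; }, 0);
}
```

The promise returned by `countFiles` fulfills only after every nested `Promise.all` has fulfilled, which is what "done" means here. It would be consumed as `countFiles('.').then(success, error)`.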

Related

How to code an async stream generator in javascript without a buffer

We want to consume stream data events from a generator into a transformation/transduction pipeline.
One form of a generator is the following:
var generator = async function* () {
var value
while(true){
value = await new Promise((resolve)=>{
value = ... // some functionality
resolve(value)
})
yield value
}
}
Assume there is a data stream that is producing values. There is a factory function available that takes a handler function
async function makeStream(handler) {
...
}
The handler function will receive a payload representing a value produced by the stream.
var handler = function(payload){
}
For the stream we need to provide a callback. For the generator we need to resolve the promise when the handler is invoked. We want to be able to write code something like the following:
function makeCallbackGenerator() {
var handler
var generator = async function*() {
while(true){
var value = await new Promise((resolve)=>{
handler = function(payload){
resolve(payload)
}
})
yield value
}
}
return {
handler,
generator
}
}
This is a factory function which produces
a generator
a callback
The callback is needed to be passed into the stream. The generator is required to pass into the transformation pipeline.
But this declaration isn't right. We don't want to define the function on each iteration of the promise; instead we want to use the function to resolve the promise on each iteration.
The fundamental problem is how to define the callback within the promise so that the stream can be coupled to the generator.
A work around is to use a buffer between the stream and the generator.
function makeCallbackGenerator() {
var buffer = []
var handler = function(payload){
buffer.push(payload)
return 1//consumed
}
var start = async function* () {
while(true){
if(buffer.length>0){
var next = buffer.pop()
debug("Generator yield")
yield next
}else {
await new Promise((resolve)=>{
setTimeout(()=>resolve(),1000)
})
}
}
}
return {
handler, start
}
}
Is there a simpler way to achieve this without a buffer?
I had the same question: is it possible to simplify this to a solution without a buffer and a generator?
I finally found what is, for me, a much better solution using an async function in the async branch of the Node-Media-Server project.
In v1.2.7 it is implemented with a generator function.
I tested the performance of the RTSP server that I built with both the async-function and the generator-function solutions and got the same results: as I remember, several Gbit/s and 1000 concurrent clients per CPU core.
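One way to couple the callback to the generator without a polling buffer is to store only the `resolve` function of the promise the generator is currently awaiting. The following is a sketch under a stated assumption: a payload that arrives while the consumer is not waiting is dropped, and the handler reports this by returning 0. The `waiting` variable is an illustrative name.

```javascript
function makeCallbackGenerator() {
  let waiting = null; // resolve function of the promise the generator is awaiting

  // Pass this to the stream. Returns 1 if the payload was handed to the
  // generator, 0 if no consumer was waiting and the payload was dropped.
  function handler(payload) {
    if (waiting === null) {
      return 0;
    }
    const resolve = waiting;
    waiting = null;
    resolve(payload);
    return 1;
  }

  async function* generator() {
    while (true) {
      // Only the resolver changes per iteration; handler is defined once.
      yield await new Promise(function (resolve) {
        waiting = resolve;
      });
    }
  }

  return { handler, generator };
}
```

The handler is defined once and the generator publishes a fresh resolver on each iteration, which avoids redefining the callback inside the promise. If dropping values is not acceptable, some form of queue between the stream and the generator is still needed.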

Avoiding javascript callback and promise hell

I have many asynchronous methods to execute, and my program flow can change a lot depending on what each method returns. The logic below is one example. I could not write it in an easy-to-read way using Promises. How would you write it?
Ps: more complex flows are welcome.
Ps2: is_business is a predefined flag where we say whether we are writing a "business user" or a "person user".
begin transaction
update users
if updated
if is_business
update_business
if not updated
insert business
end if
else
delete business
end if
else
if upsert
insert user
if is_business
insert business
end if
end if
end if
commit transaction
The nice thing about promises is that they make a simple analogy between synchronous code and asynchronous code. To illustrate (using the Q library):
Synchronous:
var thisReturnsAValue = function() {
var result = mySynchronousFunction();
if(result) {
return getOneValue();
} else {
return getAnotherValue();
}
};
try {
var value = thisReturnsAValue();
console.log(value);
} catch(err) {
console.error(err);
}
Asynchronous:
var Q = require('q');
var thisReturnsAPromiseForAValue = function() {
return Q.fcall(function() { // Q.fcall resolves to whatever the function returns; Q.Promise would ignore the return value
return myAsynchronousFunction().then(function(result) {
if(result) {
// Even getOneValue() would work here, because a non-promise
// value is automatically cast to a pre-resolved promise
return getOneValueAsynchronously();
} else {
return getAnotherValueAsynchronously();
}
});
});
};
thisReturnsAPromiseForAValue().then(function(value) {
console.log(value);
}, function(err) {
console.error(err);
});
You just need to get used to the idea that return values are always accessed as arguments to then-callbacks, and that chaining promises equates to composing function calls (f(g(h(x)))) or otherwise executing functions in sequence (var x2 = h(x); var x3 = g(x2);). That's essentially it! Things get a little tricky when you introduce branches, but you can figure out what to do from these first principles. Because then-callbacks accept promises as return values, you can mutate a value you got asynchronously by returning another promise for an asynchronous operation which resolves to a new value based on the old one, and the parent promise will not resolve until the new one resolves! And, of course, you can return these promises from within if-else branches.
The other really nice thing illustrated in the example above is that promises (at least ones that are compliant with Promises/A+) handle exceptions in an equally analogous way. The first error "raised" bypasses the non-error callbacks and bubbles up to the first available error callback, much like a try-catch block.
For what it's worth, I think trying to mimic this behavior using hand-crafted Node.js-style callbacks and the async library is its own special kind of hell :).
Following these guidelines your code would become (assuming all functions are async and return promises):
beginTransaction().then(function() {
// beginTransaction() has run
return updateUsers(); // resolves the boolean value `updated`
}).then(function(updated) {
// updateUsers() has "returned" `updated`
if(updated) {
if(isBusiness) {
return updateBusiness().then(function(updated) {
if(!updated) {
return insertBusiness();
}
// It's okay if we don't return anything -- it will
// result in a promise which immediately resolves to
// `undefined`, which is a no-op, just like a missing
// else-branch
});
} else {
return deleteBusiness();
}
} else {
if(upsert) {
return insertUser().then(function() {
if(isBusiness) {
return insertBusiness();
}
});
}
}
}).then(function() {
return commitTransaction();
}).done(function() {
console.log('all done!');
}, function(err) {
console.error(err);
});
The solution is a mix of @mooiamaduck's answer and @Kevin's comment.
Using promises, ES6 generators and the co library makes the code much clearer. I found a good example while reading the documentation of a PostgreSQL library for Node (pg). In the example below, pool.connect and client.query are asynchronous operations that return promises. We can easily add an if/else after getting result and then perform more async operations while keeping the code looking synchronous.
co(function * () {
var client = yield pool.connect()
try {
yield client.query('BEGIN')
var result = yield client.query('SELECT $1::text as name', ['foo'])
yield client.query('INSERT INTO something(name) VALUES($1)', [result.rows[0].name])
yield client.query('COMMIT')
client.release()
} catch(e) {
// pass truthy value to release to destroy the client
// instead of returning it to the pool
// the pool will create a new client next time
// this will also roll back the transaction within postgres
client.release(true)
}
})
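With `async`/`await` the original flow can be written in the same shape as the pseudocode, without `co`. This is a sketch, assuming every helper returns a promise. The helper names mirror the pseudocode and are hypothetical; they are passed in on a `db` object so the example is self-contained.

```javascript
// db is a hypothetical object whose methods each return a promise.
async function saveUser(db, isBusiness, upsert) {
  await db.beginTransaction();
  const updated = await db.updateUsers();
  if (updated) {
    if (isBusiness) {
      const businessUpdated = await db.updateBusiness();
      if (!businessUpdated) {
        await db.insertBusiness();
      }
    } else {
      await db.deleteBusiness();
    }
  } else if (upsert) {
    await db.insertUser();
    if (isBusiness) {
      await db.insertBusiness();
    }
  }
  await db.commitTransaction();
}
```

Rollback is left out of the sketch: if any step rejects, the promise returned by `saveUser` rejects and the caller decides what to do.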

Promise factories not working in Nodejs

I need to perform some async tasks in Node.js. In this case, I need to iterate through all levels of a JSON object. For that reason, I need to "iterate" over that object synchronously, that is, in order.
I'm doing tests with this code, which is a simple example adapted from this site:
var fnlist = [ doFirstThing, doSecondThing, doThirdThing, lastThing];
// Promise returning functions to execute
function doFirstThing(){ return Promise.resolve(1); }
function doSecondThing(res){ return Promise.resolve(res + 1); }
function doThirdThing(res){ return Promise.resolve(res + 2); }
function lastThing(res){ console.log("result:", res); }
// Execute a list of Promise return functions in series
function pseries(req,json,list) {
var p = Promise.resolve();
return doFirstThing()
.then((value) => {
console.log('value');
console.log(value);
return doSecondThing(value).then((value2) => {
console.log('value2');
console.log(value2);
});
});
}
router.get('/', function(req, res, next) {
var thisArray = json[0].array;
for(var i = 0;i < thisArray.length; i++){
pseries(req,json,fnlist);
}
});
Console output is:
1
value
1
value
1
value2
2
value2
2
value2
2
This is still not right, because I need this kind of flow:
value
1
value2
2
value
1
value2
2
value
1
value2
2
I know I need to use promise factories so that the promises are not executed as soon as they are created, but it does not seem to be working. I know I can't use .all because I need to use some data from one promise in the next one.
Any ideas? Thanks!
You have started multiple independent promise chains in your for loop (each call to pseries() is a separate promise chain). As such, you cannot control the sequencing of the separate promise chains. If you want to control one chain vs. another, then you will have to link them (e.g. chain them together) so the ordering is explicit rather than left to chance.
The output you see is not surprising because the first thing your for loop does is register a bunch of .then() handlers. Because the promises are already resolved for those, the .then() handlers are all queued to run as soon as your for loop is done (.then() handlers are ALWAYS queued to run asynchronously). The for loop finishes and then the first crop of .then() handlers all run. The process of running them schedules three more .then() handlers. Those are then queued and they run when the first crop of .then() handlers is all done. While I explained the likely logic for why you get the order you see, this is not guaranteed. These are async operations and the only thing you know is that they complete some uncertain time in the future. If you want explicit order, you have to force that through explicit synchronization of your promises.
You can sequence an iteration through an array in a known order like this using a fairly common design pattern with array.reduce():
router.get('/', function(req, res, next) {
var thisArray = json[0].array;
thisArray.reduce(function(p, item) {
return p.then(function() {
return pseries(req,json,fnlist);
});
}, Promise.resolve()).then(function(result) {
// all done here
}, function(err) {
// error here
});
});
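The same pattern in a self-contained form, as a sketch; `runInSeries` and `task` are illustrative names and not part of the question's code:

```javascript
// Runs task(item) for each item in array order. Each call starts only
// after the promise returned by the previous call has fulfilled.
function runInSeries(items, task) {
  return items.reduce(function (chain, item) {
    return chain.then(function () {
      return task(item);
    });
  }, Promise.resolve());
}
```

Calling `runInSeries(thisArray, function () { return pseries(req, json, fnlist); })` would give the value/value2 pairs in the order the question asks for. This relies on `pseries` returning its promise chain, which the version in the question does.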
Try chaining all your promises using forEach:
var sequence = Promise.resolve();
// Loop through our chapter urls
story.chapterUrls.forEach(function(chapterUrl) {
// Add these actions to the end of the sequence
sequence = sequence.then(function() {
return getJSON(chapterUrl);
}).then(function(chapter) {
addHtmlToPage(chapter.html);
});
});
For more complex combinations, see this page:
http://www.html5rocks.com/en/tutorials/es6/promises/#toc-parallelism-sequencing

Node.JS How to set a variable outside the current scope

I have some code that I can't get my head around. I am trying to return an array of objects using a callback. I have a function that returns the values and pushes them into an array, but I can't access the array outside of the function. I am doing something stupid here but can't tell what (I am very new to Node.js).
for (var index in res.response.result) {
var marketArray = [];
(function () {
var market = res.response.result[index];
createOrUpdateMarket(market, eventObj , function (err, marketObj) {
marketArray.push(marketObj)
console.log('The Array is %s',marketArray.length) //Returns The Array is 1.2.3..etc
});
console.log('The Array is %s',marketArray.length) // Returns The Array is 0
})();
}
You have multiple issues going on here. A core issue is to gain an understanding of how asynchronous responses work and which code executes when. But, in addition to that you also have to learn how to manage multiple async responses in a loop and how to know when all the responses are done and how to get the results in order and what tools can best be used in node.js to do that.
Your core issue is a matter of timing. The createOrUpdateMarket() function is probably asynchronous. That means that it starts its operation when the function is called, then calls its callback sometime in the future. Meanwhile the rest of your code continues to run. Thus, you are trying to access the array BEFORE the callback has been called.
Because you cannot know exactly when that callback will be called, the only place you can reliably use the callback data is inside the callback or in something that is called from within the callback.
You can read more about the details of the async/callback issue here: Why is my variable unaltered after I modify it inside of a function? - Asynchronous code reference
To know when a whole series of these createOrUpdateMarket() operations is done, you have to write code specifically to track their completion; you cannot rely on a simple for loop. The modern way to do that is to use promises, which offer tools for managing the timing of one or more asynchronous operations.
In addition, if you want to accumulate results from your for loop in marketArray, you have to declare and initialize that before your for loop, not inside your for loop. Here are several solutions:
Manually Coded Solution
var len = res.response.result.length;
var marketArray = new Array(len), cntr = 0;
for (var index = 0; index < len; index++) {
(function(i) {
createOrUpdateMarket(res.response.result[i], eventObj , function (err, marketObj) {
++cntr;
if (err) {
// need error handling here
}
marketArray[i] = marketObj;
// if last response has just finished
if (cntr === len) {
// here the marketArray is fully populated and all responses are done
// put your code to process the marketArray here
}
});
})(index);
}
Standard Promises Built Into Node.js
// make a version of createOrUpdateMarket that returns a promise
function createOrUpdateMarketAsync(a, b) {
return new Promise(function(resolve, reject) {
createOrUpdateMarket(a, b, function(err, marketObj) {
if (err) {
reject(err);
return;
}
resolve(marketObj);
});
});
}
var promises = [];
for (var i = 0; i < res.response.result.length; i++) {
promises.push(createOrUpdateMarketAsync(res.response.result[i], eventObj));
}
Promise.all(promises).then(function(marketArray) {
// all results done here, results in marketArray
}, function(err) {
// an error occurred
});
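On Node 8 and later, the hand-written wrapper can be replaced by `util.promisify`, which works for any function whose last argument is an `(err, value)` callback. The following is a sketch with a stand-in `createOrUpdateMarket`; the stub body and the `saveMarkets` name are assumptions made so that the example runs on its own.

```javascript
const util = require('util');

// Stand-in for the real function: calls back asynchronously with a result object.
function createOrUpdateMarket(market, eventObj, callback) {
  setImmediate(function () {
    callback(null, { market: market, event: eventObj });
  });
}

const createOrUpdateMarketAsync = util.promisify(createOrUpdateMarket);

// Starts all operations at once; Promise.all keeps results in input order
// and rejects as soon as any single operation fails.
function saveMarkets(markets, eventObj) {
  return Promise.all(markets.map(function (market) {
    return createOrUpdateMarketAsync(market, eventObj);
  }));
}
```

The array that `saveMarkets` resolves with plays the role of `marketArray`, and it is only available inside the `.then()` handler (or after an `await`).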
Enhanced Promises with the Bluebird Promise library
The bluebird promise library offers Promise.map() which will iterate over your array of data and produce an array of asynchronously obtained results.
// make a version of createOrUpdateMarket that returns a promise
var Promise = require('bluebird');
var createOrUpdateMarketAsync = Promise.promisify(createOrUpdateMarket);
// iterate the res.response.result array and run an operation on each item
Promise.map(res.response.result, function(item) {
return createOrUpdateMarketAsync(item, eventObj);
}).then(function(marketArray) {
// all results done here, results in marketArray
}, function(err) {
// an error occurred
});
Async Library
You can also use the async library to help manage multiple async operations. In this case, you can use async.map() which will create an array of results.
var async = require('async');
async.map(res.response.result, function(item, done) {
createOrUpdateMarket(item, eventObj, function(err, marketObj) {
if (err) {
done(err);
} else {
// the first argument is the error slot, so pass null on success
done(null, marketObj);
}
});
}, function(err, results) {
if (err) {
// an error occurred
} else {
// results array contains all the async results
}
});

Node.js Asynchronous Library Comparison - Q vs Async

I have used kriskowal's Q library for a project (web scraper / human-activity simulator) and have become acquainted with promises, returning them and resolving/rejecting them, and the library's basic asynchronous control flow methods and error-throwing/catching mechanisms have proven essential.
I have encountered some issues though. My promise.then calls and my callbacks have the uncanny tendency to form pyramids. Sometimes it's for scoping reasons, other times it's to guarantee a certain order of events. (I suppose I might be able to fix some of these problems by refactoring, but going forward I want to avoid "callback hell" altogether.)
Also, debugging is very frustrating. I spend a lot of time console.log-ing my way to the source of errors and bugs; after I finally find them I will start throwing errors there and catching them somewhere else with promise.finally, but the process of locating the errors in the first place is arduous.
Also, in my project, order matters. I need to do pretty much everything sequentially. Oftentimes I find myself generating arrays of functions that return promises and then chaining them to each other using Array.prototype.reduce, which I don't think I should have to do.
Here is an example of one of my methods that uses this reduction technique:
removeItem: function (itemId) {
var removeRegexp = new RegExp('\\/stock\\.php\\?remove=' + itemId);
return this.getPage('/stock.php')
.then(function (webpage) {
var
pageCount = 5,
promiseFunctions = [],
promiseSequence;
// Create an array of promise-yielding functions that can run sequentially.
_.times(pageCount, function (i) {
var promiseFunction = function () {
var
promise,
path;
if (i === 0) {
promise = Q(webpage);
} else {
path = '/stock.php?p=' + i;
promise = this.getPage(path);
}
return promise.then(function (webpage) {
var
removeMatch = webpage.match(removeRegexp),
removePath;
if (removeMatch !== null) {
removePath = removeMatch[0];
return this.getPage(removePath)
.delay(1000)
// Stop calling subsequent promises.
.thenResolve(true);
}
// Don't stop calling subsequent promises.
return false;
}.bind(this));
}.bind(this);
promiseFunctions.push(promiseFunction);
}, this);
// Resolve the promises sequentially but stop early if the item is found.
promiseSequence = promiseFunctions.reduce(function (soFar, promiseFunction, index) {
return soFar.then(function (stop) {
if (stop) {
return true;
} else {
return Q.delay(1000).then(promiseFunction);
}
});
}, Q());
return promiseSequence;
}.bind(this))
.fail(function (onRejected) {
console.log(onRejected);
});
},
I have other methods that do basically the same thing but which are suffering from much worse indentation woes.
I'm considering refactoring my project using caolan's async library. It seems similar to Q, but I want to know exactly how they differ. The impression I am getting is that async is more "callback-centric" while Q is "promise-centric".
Question: Given my problems and project requirements, what would I gain and/or lose by using async over Q? How do the libraries compare? (Particularly in terms of executing series of tasks sequentially and debugging/error-handling?)
Both libraries are good. I have discovered that they serve separate purposes and can be used in tandem.
Q provides the developer with promise objects, which are future representations of values. Useful for time travelling.
Async provides the developer with asynchronous versions of control structures and aggregate operations.
An example from one attempt at a linter implementation demonstrates a potential unity among libraries:
function lint(files, callback) {
// Function which returns a promise.
var getMerged = merger('.jslintrc'),
// Result objects to invoke callback with.
results = [];
async.each(files, function (file, callback) {
fs.exists(file, function (exists) {
// Future representation of the file's contents.
var contentsPromise,
// Future representation of JSLINT options from .jslintrc files.
optionPromise;
if (!exists) {
callback();
return;
}
contentsPromise = q.nfcall(fs.readFile, file, 'utf8');
optionPromise = getMerged(path.dirname(file));
// Parallelize IO operations.
q.all([contentsPromise, optionPromise])
.spread(function (contents, option) {
var success = JSLINT(contents, option),
errors,
fileResults;
if (!success) {
errors = JSLINT.data().errors;
fileResults = errors.reduce(function (soFar, error) {
if (error === null) {
return soFar;
}
return soFar.concat({
file: file,
error: error
});
}, []);
results = results.concat(fileResults);
}
process.nextTick(callback);
})
.catch(function (error) {
process.nextTick(function () {
callback(error);
});
})
.done();
});
}, function (error) {
results = results.sort(function (a, b) {
return a.file.charCodeAt(0) - b.file.charCodeAt(0);
});
callback(error, results);
});
}
I want to do something potentially-blocking for each file. So async.each is the obvious choice. I can parallelize related operations per-iteration with q.all and reuse my option values if they apply to 2 or more files.
Here, Async and Q each influence the control flow of the program, and Q represents values resolving to file contents sometime in the future. The libraries work well together. One does not need to "choose one over the other".
Callback pyramids in your code can be simplified using promise composition and javascript lexical scoping.
removeItem: function (itemId) {
var removeRegexp = new RegExp('\\/stock\\.php\\?remove=' + itemId);
var found = false
var promise = this.getPage('/stock.php')
_.times(5, (i) => {
promise = promise.then((webpage) => {
if (found) return true
var removeMatch = webpage.match(removeRegexp)
// assign to the outer `found`; redeclaring it with `var` here would shadow it
found = removeMatch !== null
// parenthesize (i + 1) so it is added numerically, not concatenated as a string
var nextPage = found ? removeMatch[0] : '/stock.php?p=' + (i + 1)
return Q.delay(1000).then(() => this.getPage(nextPage))
})
})
return promise.fail(console.log.bind(console))
},
IMHO async should not be used in new JavaScript code. Promises are more composable and allow for much more intuitive code.
The primary reason why node did not use promises was because of performance concerns which have largely been addressed very well by libraries like Bluebird and Q.
As async/await syntax becomes more mainstream, promises will pave the way for code that looks very similar to synchronous code.
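A sketch of what the paging loop might look like with `async`/`await`, assuming a `getPage(path)` function that returns a promise for the page's HTML. It is passed in as a parameter here so that the example is self-contained, and the one-second delays from the original are omitted.

```javascript
// Resolves to true if the item's remove link was found and requested,
// false if none of the pages contained it.
async function removeItem(getPage, itemId, pageCount) {
  const removeRegexp = new RegExp('\\/stock\\.php\\?remove=' + itemId);
  for (let i = 0; i < pageCount; i++) {
    const pagePath = i === 0 ? '/stock.php' : '/stock.php?p=' + i;
    const webpage = await getPage(pagePath);
    const removeMatch = webpage.match(removeRegexp);
    if (removeMatch !== null) {
      await getPage(removeMatch[0]); // request the removal link
      return true; // stop early: later pages are never fetched
    }
  }
  return false;
}
```

Sequencing and early exit become an ordinary `for` loop with `return`, and a rejected `getPage` promise surfaces as an exception that a surrounding `try`/`catch` can handle.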
While this is still not an actual answer to my question (Q vs async), regarding my problem, I've found Selenium / WebDriverJs to be a viable solution.
driver.get('http://www.google.com');
driver.findElement(webdriver.By.name('q')).sendKeys('webdriver');
driver.findElement(webdriver.By.name('btnG')).click();
driver.wait(function() {
return driver.getTitle().then(function(title) {
return title === 'webdriver - Google Search';
});
}, 1000);
WebDriver uses a queue to execute promises sequentially, which helps immensely with controlling indentation. Its promises are also compatible with Q's.
Creating a sequence of promises is no longer an issue. A simple for-loop will do.
As for stopping early in a sequence, don't do this. Instead of using a sequence, use an asynchronous-while design and branch.
