node, async programming, callback hell - javascript

I'm trying to understand callbacks and async programming, but I'm having a bit of trouble.
Here's some pseudocode:
var lines = [];
var arrayOfFeedUrls = [url1,url2,...];
function scrape(url){
http.get(url, function(res) {
res.pipe(new FeedParser([options]))
.on('readable', function () {
var stream = this, item;
while (item=stream.read()) {
line = item.title;
lines.push(line);
}
});
});
}
for (i in arrayOfFeedUrls){
scrape(arrayOfFeedUrls[i]);
}
console.log(lines.length);
It obviously logs 0, as the scrape function is executed asynchronously. I understand that much, but I've tried many intricate approaches and can't figure out how to write it properly. Any help/explanation would be greatly appreciated. I've read (and I'm still reading) a lot of tutorials and examples, but I think the only way for me to get it is to write some code myself. If I solve this I'll post the answer.

You may want to check this article for an introduction to Node; it might help you understand async programming in Node a little better.
As far as async programming goes, async is a very popular module in Node's userland which helps you write asynchronous code with much less boilerplate. For instance (untested pseudo-code):
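function scrape (done) {
// url is assumed to be defined in an enclosing scope.
// http.get passes only the response to its callback, so adapt it to (err, res)
http.get(url, function (res) { done(null, res); }).on('error', done);
}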
function parse (res, done) {
var lines = [];
res.pipe(new FeedParser([options]))
.on('readable', function () {
var stream = this, item;
while (item=stream.read()) {
line = item.title;
lines.push(line);
}
})
.on('end', function () {
done(null, lines);
});
}
function done (err, lines) {
if (err) { throw err; }
console.log(lines.length);
}
async.waterfall([scrape, parse], done);

This depends on whether you want to scrape all urls in parallel or in series.
If you were to do it in series, you should think of it like this:
Start with the first url. Scrape. In the callback, scrape the next url. In that callback, scrape the next url, and so on.
This will give you the notorious callback hell you are talking about, but that is the principle at least. That is where libraries like async remove a lot of the headache.

When programming async calls in this manner, functions and instructions that you want to chain onto the end, such as console.log(lines.length);, must also be callbacks. So for instance, try something like this:
var lines = [];
var arrayOfFeedUrls = [url1,url2,...];
var finished = 0; // number of feeds that have been fully parsed
function scrape(url){
http.get(url, function(res) {
res.pipe(new FeedParser([options]))
.on('readable', function () {
var stream = this, item;
while (item=stream.read()) {
lines.push(item.title);
}
})
.on('end', done); // call done once per feed, after its last item
});
}
for (var i = 0; i < arrayOfFeedUrls.length; i++){
scrape(arrayOfFeedUrls[i]);
}
function done () {
finished++;
// log only after every feed has ended
if (finished == arrayOfFeedUrls.length) {
console.log(lines.length);
}
}
You may also want to look into promises, an alternative programming style to callbacks, which aims to avoid callback hell.
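To make the promise suggestion concrete, here is a minimal sketch. It assumes a hypothetical fetchTitles helper that stands in for the http.get and FeedParser pipeline (it just fakes two titles per feed after a short delay), so only the Promise.all part is the point.

```javascript
// Hypothetical stand-in for "fetch a feed and collect its titles".
function fetchTitles(url) {
  return new Promise((resolve) => {
    setTimeout(() => resolve([url + '#1', url + '#2']), 5);
  });
}

const feedUrls = ['feedA', 'feedB', 'feedC'];

// Promise.all resolves once every promise has resolved, keeping input order.
const allLines = Promise.all(feedUrls.map(fetchTitles))
  .then((perFeed) => [].concat(...perFeed)); // flatten the per-feed arrays

allLines.then((lines) => console.log(lines.length)); // prints 6
```

The count is logged inside then(), so it only runs after all three fake feeds have finished.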

Have to admit that I'm very new to node.js and struggling to grok the callback stuff. In my limited experience, adding one more parameter to the callback function may be the trick. The hard question is, which parameter?
In your example, if the function scrape had an extra boolean "lastOne", then it could call console.log(lines) itself. Or, if it understood that a null url meant to stop. However, I don't think even this works, as I'm not sure everything will get done in order. If the 2nd URL takes forever, the last one may complete first, right??? (You could try it). In other words, I still don't know which parameter to add. Sorry...
What seems more reliable is to set a counter to urls.length, and for scrape() to decrement it each time. When the counter reaches 0, it knows that the entire process is done and it should log (or do whatever) with the result. I'm not 100% sure where to declare this counter. Coming from Java I still have little idea what is a static global, what is an instance, whatever...
Now, a true-blue node.jser would pass a function to doWhatever as an extra parameter to scrape(), so that you can do something other than console.log(). :-) But I'd settle for the check for zero.
To elaborate slightly, add a callWhenDone parameter to scrape(), and add (somewhere in all that nesting!!!)
if (--counter <= 0)
callWhenDone (lines);
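A rough sketch of the counter idea, under some assumptions: fetchTitles is a hypothetical stand-in for the real http.get and FeedParser code, and scrapeAll is just a name chosen here. The counter lives inside scrapeAll, which is one possible answer to where to declare it.

```javascript
// Hypothetical stand-in for the http.get + FeedParser pipeline.
function fetchTitles(url, callback) {
  setTimeout(() => callback(null, [url + ' title 1', url + ' title 2']), Math.random() * 10);
}

function scrapeAll(urls, callWhenDone) {
  const lines = [];
  let counter = urls.length; // feeds still in flight
  if (counter === 0) return callWhenDone(lines);
  urls.forEach((url) => {
    fetchTitles(url, (err, titles) => {
      if (!err) lines.push(...titles);
      if (--counter === 0) callWhenDone(lines); // last feed finished
    });
  });
}

scrapeAll(['feedA', 'feedB', 'feedC'], (lines) => {
  console.log(lines.length); // prints 6
});
```

Because the feeds finish in any order, the order of the lines is not guaranteed, only that all of them are there when callWhenDone runs.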

Ok, so here's how I've solved the problem. Feel free to comment and tell me if it's right.
var lines = [];
var arrayOfFeedUrls = [url1,url2,...];
function scrapeFeeds(array){
var url = array.shift();
http.get(url, function(res) {
res.pipe(new FeedParser([options]))
.on('readable', function () {
var stream = this, item;
while (item=stream.read()) {
line = item.title;
lines.push(line);
}
}).on('end', function () {
// once this feed has ended, move on to the next url, if any
if(array.length){
scrapeFeeds(array);
}
});
});
}
scrapeFeeds(arrayOfFeedUrls);
Thanks for all the answers. I'm looking into async in more depth, as I've got more complicated stuff to do. Let me know what you think of my code, it's always useful.

Related

How to make JavaScript wait for all functions to finish before returning a result

output = true;
if($("#password-field").css('display') != 'none') {
if(!($("#verificationCode").val())) {
output = false;
$("#code-error").html("required");
}
var codeverify = function(){
var code = document.getElementById("verificationCode").value;
coderesult
.confirm(code)
.then( function(result) {
if (result.user.uid) {
let phoneNumber = result.user.phoneNumber;
//alert(output);
alert("Verification done");
console.log(phoneNumber);
} else {
alert(output);
$("#code-error").html("no user");
output = false;
}
})
.catch(function(error) {
output = false;
$("#code-error").html("wrong");
alert(error.message);
});
}();
}
return output;
When I run this code everything works fine, but it returns output as true before the codeverify function has finished its check, even if codeverify() sets it to false.
PS. I am using wizard form.
This comes down to how you write JavaScript code. I have found that when my procedures get out of sync, it usually means I did something wrong in an earlier step, and this is usually only fixed by refactoring.
Remember that JavaScript does not behave the same as other languages.
What I can see from your procedure is that you are trying to do many things in one go.
Although I do not have a solution, I have a suggestion: consider each action that you want your procedure to execute. Declare a separate function for each of these steps, even if the function only has one line to execute.
If there are dependencies, make sure they can be resolved by parameterization.
And lastly, think pure functions. Try to structure every function to receive something and return something.
Another tip: have your procedure select elements and hold them in variables until they are required. Work out which elements the procedure needs and which of them are already in the DOM when execution starts, and assign those to variables before you begin. If elements that will be needed later are added during execution, select them immediately after they are placed in the DOM. That way everything the procedure needs is already at hand, and it does not have to look elements up on the fly.
Good Luck and happy coding.
Your coderesult.confirm(code) uses a promise (then and catch), so I assume it is asynchronous. It is worth reading up on what async means.
One important aspect of JS behavior is that JS does not wait for an async call: it carries on with the rest of the code while the async operation is still pending.
Sample:
console.log(1)
setTimeout(()=>{
console.log(2,"i suppose at position 2 but i'm at the last. This is because setTimeout is async function")
},1000)
console.log(3)
In your case, your codeverify function has async code (.confirm()) in it, so JS will process the code after codeverify (return output) before codeverify has fully completed.
Since your output was set to true at the beginning, the last line, return output, returns true before codeverify has completed; this is why you always get true. If you change the first line to output = undefined, I believe you will see undefined as the result.
To solve this, one way is to wrap the entire codeverify in a Promise.
function codeverify(){
return new Promise((resolve,reject)=>{
var code = document.getElementById("verificationCode").value;
coderesult.confirm(code).then( function(result) {
if (result.user.uid) {
let phoneNumber = result.user.phoneNumber;
//alert(output);
alert("Verification done");
console.log(phoneNumber);
output = true
resolve(output)
} else {
alert(output);
$("#code-error").html("no user");
output = false;
resolve(output)
}
}).catch(function(error) {
output = false;
$("#code-error").html("wrong");
alert(error.message);
reject(output) //you can reject(error.message) so you can pass error message back.
});
})
}
You can execute the function like this:
codeverify().then(successData=>{
console.log(successData) //from what you resolve
}).catch(failedData=>{
console.log(failedData) //from what you reject
})
Lastly, take some time to study what asynchronous means and how Promises relate to async code, because it is everywhere in JS.
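The same idea can also be written with async/await. This is only a sketch: fakeConfirm is a made-up stand-in for coderesult.confirm() (it accepts the code "123456" and rejects anything else), and validateCode is a hypothetical name. The point is that the caller gets a promise for true/false instead of a value that is returned too early.

```javascript
// Hypothetical stand-in for coderesult.confirm(): resolves for "123456", rejects otherwise.
function fakeConfirm(code) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (code === '123456') resolve({ user: { uid: 'u1', phoneNumber: '+10000000000' } });
      else reject(new Error('wrong code'));
    }, 10);
  });
}

// Resolves to true or false only after the asynchronous check has finished.
async function validateCode(code) {
  if (!code) return false; // nothing entered
  try {
    const result = await fakeConfirm(code);
    return Boolean(result.user && result.user.uid);
  } catch (error) {
    return false; // a rejected confirmation counts as invalid
  }
}

validateCode('123456').then((ok) => console.log('valid?', ok)); // prints "valid? true"
```

Whatever calls validateCode (for example the wizard's "next" handler) has to wait for the promise as well, either with await or with then().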

How to use do/while Statement in an async function in JavaScript?

I have an async function that checks if the id already exists in a table.
async function generateIdentifier () {
try {
let exists;
let id;
do {
id = someRandomStringGenerator();
const email = await Database.find({id});
if (email.length > 0) {
exists = true;
} else {
exists = false;
}
} while (exists);
return id;
} catch (e) {
throw e;
}
}
With the code above, the find method returns an array. If the array is empty, no matching id was found. When an id is found, the function should generate a new one, repeating until the id is unique.
This works, but performance-wise, are there better options for doing things like this?
I suggest you use a callback function as below. I used an API call to stand in for your Database request, and the condition loops until the string contains the character w. With that it should work.
function loadData(callback){
$.ajax({url: "https://helloacm.com/api/random/?n=10", success: function(response){
callback(response);
}});
}
function checkData(response){
if(response.includes("w")){
console.log(response, "good");
} else {
console.log(response, "bad");
loadData(checkData);
}
}
function register(){
loadData(checkData);
}
register();
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
Yes, this works. The entire selling point of async/await is that you can make promise-based code look like regular imperative constructs (such as while loops) just by adding the await keyword whenever you call out to another async function.
Performance wise, you could obviously benefit from generating the random ID on the server so that you always get an ID which is known to be unique in a single call. This is probably not a problem in practice as having more than 1 collision is likely to be very rare if the space of IDs is sufficiently large.
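Since collisions should be rare, one optional refinement is to cap the number of attempts so a bug in the generator cannot loop forever. The sketch below is an assumption-laden illustration: Database here is a hypothetical in-memory stand-in whose find() resolves to an array, and makeId and maxAttempts are parameters added for the example.

```javascript
// Hypothetical in-memory stand-in for Database.find(): resolves to an array of matches.
const takenIds = new Set(['aaa', 'bbb']);
const Database = {
  find: async ({ id }) => (takenIds.has(id) ? [{ id }] : []),
};

// Tries candidate ids until one is unused, giving up after maxAttempts.
async function generateIdentifier(makeId, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const id = makeId();
    const matches = await Database.find({ id });
    if (matches.length === 0) return id; // free id found
  }
  throw new Error('could not find a free id after ' + maxAttempts + ' attempts');
}

// Usage with a deterministic generator so the outcome is predictable.
const candidates = ['aaa', 'bbb', 'ccc'];
generateIdentifier(() => candidates.shift()).then((id) => console.log(id)); // prints "ccc"
```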
An infinite do/while is useful when several different cases need to change the loop condition. Your case is simple: if a record is found, return; otherwise do it again. However, a function's name should describe what it does. In your case it is called register, but it actually retrieves a random record. Performance? Well, you are not really saving much here. You have a couple of async calls, and the function waits at each one until it resolves. Example without do/while: https://stackblitz.com/edit/js-bgyman

Why does node-debug always break at _tickCallback function?

It's kind of a stupid question, but I haven't been able to figure it out for 2 hours and can't find any answer on Google.
I'm trying to debug my controller by putting a breakpoint in my save function, on the line var profile = req.body:
function save(collectionName) {
return function (req, res, next) {
var profile = req.body,
query = {};
...
...
};
}
However, the app always breaks inside _tickCallback function placed in node.js file:
// Run callbacks that have no domain.
// Using domains will cause this to be overridden.
function _tickCallback() {
var callback, threw, tock;
scheduleMicrotasks();
while (tickInfo[kIndex] < tickInfo[kLength]) {
tock = nextTickQueue[tickInfo[kIndex]++];
callback = tock.callback;
threw = true;
try {
callback();
threw = false;
} finally {
if (threw)
tickDone();
}
if (1e4 < tickInfo[kIndex])
tickDone();
}
tickDone();
}
So, I tried to step over until it went out of the function, however, it also resumed the application without going back to my break point. Any help would be really appreciated.
I think this situation happens when you use the node-debug command with node 0.12.*.
This is a nodejs bug: https://github.com/joyent/node/issues/25266
As a workaround you can use the debugger statement (with node-inspector >=0.10.1; I recommend node-inspector 0.11.0), or use io.js.

How can I make node wait? or perhaps a different solution?

I am using https://github.com/gpittarelli/node-ssq to query a bunch of TF2 game servers to find out if they are on, and if so, how many players are inside.
Once I find a server that is on and has less than 6 players in it, I want to use that server's Database ID to insert into somewhere else.
Code looks like this:
for (var i = 0;i < servers.length;i++) {
ssq.info(""+servers[i].ip, servers[i].port, function (err, data) {
serverData = deepCopy(data);
serverError = deepCopy(err);
});
if (!serverError) {
if (serverData.numplayers < 6){
//its ok
currentServer = servers[i].id;
i = 99999;
}
}
else {
if (i == servers.length-1){
currentServer = 666;
}
}
}
And then right after that I insert into the database with https://github.com/felixge/node-mysql .
If I put a console.log(serverData) in there, the info shows up in the console AFTER the insert into the DB and a couple of other things have run.
So how do I "stop" node, or should I be looking at this / doing this differently?
Update:
A simple solution here is to just move your if statements inside the callback function:
servers.forEach(function (server, i) {
// forEach gives each callback its own server and i; with a plain for loop and var,
// i would already equal servers.length by the time any callback runs
ssq.info(""+server.ip, server.port, function (err, data) {
serverData = deepCopy(data);
serverError = deepCopy(err);
// moving inside the function, so we have access to defined serverData and serverError
if (!serverError) {
if (serverData.numplayers < 6){
//its ok
currentServer = server.id;
/* add an additional function here, if necessary */
}
}
else {
if (i == servers.length-1){
currentServer = 666;
/* add an additional function here, if necessary */
}
}
});
// serverData and serverError are undefined outside of the function
// because node executes these lines without waiting to see if ``ssq.info``
// has finished.
});
Any additional functions within the callback to ssq.info will have access to variables defined within that function. Do be careful with nesting too many anonymous functions.
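If the goal is to stop at the first suitable server, one option is to query the servers one at a time and only move on when the previous answer has arrived. The following is a sketch, not the node-ssq API: fakeInfo is a hypothetical stand-in for ssq.info that calls back with (err, data), and findServer is a name invented for the example.

```javascript
// Hypothetical stand-in for ssq.info(ip, port, callback): calls back asynchronously
// with (err, data) in the usual Node style.
function fakeInfo(ip, port, callback) {
  const table = {
    '10.0.0.1': { err: new Error('timeout') },
    '10.0.0.2': { data: { numplayers: 12 } },
    '10.0.0.3': { data: { numplayers: 3 } },
  };
  const entry = table[ip] || { err: new Error('unknown server') };
  setTimeout(() => callback(entry.err || null, entry.data), 5);
}

// Checks servers one at a time and reports the first one that is up with < 6 players.
function findServer(servers, done, index = 0) {
  if (index >= servers.length) return done(null, null); // nothing suitable
  const server = servers[index];
  fakeInfo(server.ip, server.port, (err, data) => {
    if (!err && data.numplayers < 6) return done(null, server.id);
    findServer(servers, done, index + 1); // try the next server
  });
}

const servers = [
  { id: 1, ip: '10.0.0.1', port: 27015 },
  { id: 2, ip: '10.0.0.2', port: 27015 },
  { id: 3, ip: '10.0.0.3', port: 27015 },
];

findServer(servers, (err, id) => {
  console.log('chosen server id:', id); // the database insert would go here
});
```

The database insert belongs inside the final callback, because that is the first place where the chosen id is known.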
Original (nodesque) Answer
If ssq.info is an asynchronous function (which it seems to be), Node is going to immediately execute it and move on, only dealing with the callback function (which you passed as the last parameter) when ssq.info has finished. That is why your console.log statement is going to execute immediately. This is the beauty/terror of node's asynchronous nature : )
You can use setTimeout to make Node wait, but a fixed delay is only a guess at how long ssq.info will take.
The better solution, in my opinion, would be to make use of Node's EventEmitters, to:
Watch for an event (in this case, when a player leaves a server)
Check if the number of players is less than 6
If so, execute your query function (using a callback)
A good primer on this is: Mixu's Node Book - Control Flow. Also, see this SO post.
You should use a callback:
connection.query('INSERT INTO table', function(err, rows, fields) {
if (err) throw err;
//entry should be inserted here.
});
Also, the http://www.sequelizejs.com/ library is a bit more mature; it could be an implementation problem with node-mysql.

Which is a better way of writing callbacks?

Just by looking at what I've written, I can see that one is much smaller, so in terms of code golf Option 2 is the better bet, but as far as which is cleaner, I prefer Option 1. I would really love the community's input on this.
Option 1
something_async({
success: function(data) {
console.log(data);
},
error: function(error) {
console.log(error);
}
});
Option 2
something_async(function(error,data){
if(error){
console.log(error);
}else{
console.log(data);
}
});
They are not exactly the same. Option 2 will still log the (data), whereas Option 1 will only log data on success. (Edit: At least it was that way before you changed the code)
That said, Option 1 is more readable. Programming is not / should not be a competition to see who can write the fewest lines that do the most things. The goal should always be to create maintainable, extendable (if necessary) code --- in my humble opinion.
Many people will find option #1 easier to read and to maintain: two different callback functions for two different purposes. It is commonly used by Promise libraries, where two arguments are passed. Of course, the question Multiple arguments vs. options object is independent of that (while the object is useful in jQuery.ajax, it doesn't make sense for promise.then).
However, option #2 is the Node.js convention (see also NodeGuide) and is used in many libraries influenced by it, for example the well-known async.js. This convention is debatable, though; the top Google results I found are WekeRoad: NodeJS Callback Conventions and Stack Overflow: What is the suggested callback style for Node.js libraries?.
The reason for the single callback function with an error argument is that it always reminds the developer to handle errors, which is especially important in server-side applications. Many beginners with client-side ajax functions forget about error handling, for example, and then ask themselves why the success callback doesn't get invoked. On the other hand, promises with then-chaining rely on error callbacks being optional, propagating errors to the next level; of course they still need to be caught there.
In all honesty, I prefer to take them one step further, into Promises/Futures/Deferreds/etc...
Or (/and) go into a "custom event" queue, using a Moderator (or an observer/pub-sub, if there is good reason for one particular object to be the source for data).
This isn't a 100% of the time thing. Sometimes, you just need a single callback. However, if you have multiple views which need to react to a change (in model data, or to visualize user-interaction), then a single callback with a bunch of hard-coded results isn't appropriate.
moderator.listen("my-model:timeline_update", myView.update);
moderator.listen("ui:data_request", myModel.request);
button.onclick = function () { moderator.notify("ui:data_request", button.value); }
Things are now much less dependent upon one big callback and you can mix and match and reuse code.
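The Moderator itself is not shown here, so the following is only a guess at a minimal version, written to match the listen/notify calls above: a plain object that maps message names to arrays of callbacks. The message names and listeners in the usage part are made up for the example.

```javascript
// Hypothetical minimal Moderator: a map from message name to listener callbacks.
function Moderator() {
  const listeners = {};
  return {
    // register a callback for a message name
    listen(msg, callback) {
      (listeners[msg] = listeners[msg] || []).push(callback);
    },
    // call every callback registered for that message name, in order
    notify(msg, data) {
      (listeners[msg] || []).forEach((callback) => callback(data));
    },
  };
}

const moderator = Moderator();
const seen = [];
moderator.listen('ui:data_request', (value) => seen.push('model got ' + value));
moderator.listen('ui:data_request', (value) => seen.push('logger got ' + value));
moderator.notify('ui:data_request', 42);
console.log(seen); // [ 'model got 42', 'logger got 42' ]
```

Neither listener knows about the other or about who sent the message, which is what makes the pieces easy to mix and reuse.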
If you want to hide the moderator, you can make it a part of your objects:
var A = function () {
var sys = null,
notify = function (msg, data) {
if (sys && sys.notify) { sys.notify(msg, data); }
},
listen = function (msg, callback) {
if (sys && sys.listen) { sys.listen(msg, callback); }
},
attach = function (messenger) { sys = messenger; };
return {
attach : attach
/* ... */
};
},
B = function () { /* ... */ },
shell = Moderator(),
a = A(),
b = B();
a.attach(shell);
b.attach(shell);
a.listen("do something", a.method.bind(a));
b.notify("do something", b.property);
If this looks a little familiar, it's similar behaviour to, say Backbone.js (except that they extend() the behaviour onto objects, and others will bind, where my example has simplified wrappers to show what's going on).
Promises would be the other big win for usable, maintainable and easy-to-read code (as long as people know what a "promise" is: basically it passes around an object which holds the callback subscriptions).
// using jQuery's "Deferred"
var ImageLoader = function () {
var cache = {},
public_function = function (url) {
if (cache[url]) { return cache[url].promise(); }
var img = new Image(),
loading = $.Deferred(),
promise = loading.promise();
img.onload = function () { loading.resolve(img); };
img.onerror = function () { loading.reject("error"); };
img.src = url;
cache[url] = loading;
return promise;
};
return public_function;
};
// returns promises
var loadImage = ImageLoader(),
myImg = loadImage("//site.com/img.jpg");
myImg.done( lightbox.showImg );
myImg.done( function (img) { console.log(img.width); } );
Or
var blog_comments = [ /* ... */ ],
comments = BlogComments();
blog_comments.forEach(function (comment) {
var el = makeComment(comment.author, comment.text),
img = loadImage(comment.img);
img.done(el.showAvatar);
comments.add(el);
});
All of the cruft there is to show how powerful promises can be.
Look at the .forEach call there.
I'm using Image loading instead of AJAX, because it might seem a little more obvious in this case:
I can load hundreds of blog comments. If the same user makes multiple posts, the image is cached, and if not, I don't have to wait for images to load, or write nested callbacks. Images load in any order, but still appear in the right spots.
This is 100% applicable to AJAX calls, as well.
Promises have proven to be the way to go as far as async is concerned, and libraries like bluebird embrace node-style callbacks (using the (err, value) signature). So it seems beneficial to utilize node-style callbacks.
But the examples in the question can easily be converted into either format with the functions below. (untested)
function mapToNodeStyleCallback(callback) {
return {
success: function(data) {
return callback(null, data)
},
error: function(error) {
return callback(error)
}
}
}
function alterNodeStyleCallback(propertyFuncs) {
return function () {
var args = Array.prototype.slice.call(arguments)
var err = args.shift()
if (err) return propertyFuncs.error.apply(null, [err])
return propertyFuncs.success.apply(null, args)
}
}
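As an illustration of why the (err, value) signature is convenient, Node's built-in util.promisify can turn such a function into one that returns a promise. In this sketch somethingAsync is a hypothetical node-style function written just for the example; only util.promisify is a real Node API.

```javascript
const util = require('util');

// Hypothetical node-style function: calls back with (err, value).
function somethingAsync(input, callback) {
  setTimeout(() => {
    if (input < 0) return callback(new Error('negative input'));
    callback(null, input * 2);
  }, 5);
}

// util.promisify wraps a function whose last argument is an (err, value) callback.
const somethingAsyncP = util.promisify(somethingAsync);

somethingAsyncP(21)
  .then((data) => console.log(data)) // prints 42
  .catch((error) => console.log(error.message));
```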
