I am a beginner in Node.js and was wondering if someone could help me out.
Winston allows you to pass in a callback which is executed when all transports have been logged - could someone explain what this means as I am slightly lost in the context of callbacks and Winston?
From https://www.npmjs.com/package/winston#events-and-callbacks-in-winston I am shown an example which looks like this:
logger.info('CHILL WINSTON!', { seriously: true }, function (err, level, msg, meta) {
// [msg] and [meta] have now been logged at [level] to **every** transport.
});
Great... however I have several logger.info calls across my program, and I was wondering what I should put into the callback. Also, do I need to do this for every logger.info, or can I put all the logs into one function?
I was thinking of adding all of the log calls into an array and then using async.parallel so they all get logged at the same time. Good or bad idea?
The main aim is to log everything before my program continues with other tasks.
Explanation of the code above in callback and winston context would be greatly appreciated!
Winston allows you to pass in a callback which is executed when all transports have been logged
This means that if you have a logger that handles more than one transport (for instance, console and file), the callback will be executed only after the messages have been logged on all of them (in this case, on both the console and the file).
An I/O operation on a file will always take longer than just outputting a message on the console. Winston makes sure that the callback will be triggered, not at the end of the first transport logging, but at the end of the last one of them (that is, the one that takes longest).
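To make this concrete, here is a minimal sketch in plain JavaScript of the mechanism being described (this is not winston's actual implementation; logToAll and the fake transports are made-up names): a counter fires one callback only after every transport has finished, however slow the slowest one is.

```javascript
// A sketch (not winston internals): call `done` only once every transport
// has finished writing, however long each one takes.
function logToAll(transports, message, done) {
  let finished = 0;
  transports.forEach(transport => {
    transport(message, () => {
      // count completions; fire the callback only after the slowest transport
      if (++finished === transports.length) done();
    });
  });
}

// two fake transports: a fast one (like the console) and a slow one (like a file)
const consoleLike = (msg, cb) => setImmediate(cb);
const fileLike = (msg, cb) => setTimeout(cb, 50);

logToAll([consoleLike, fileLike], 'CHILL WINSTON!', () => {
  console.log('message logged to every transport');
});
```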
You don't need to use a callback for every logger.info, but in this case it can help you make sure everything has been logged before continuing with the other tasks:
var winston = require('winston');
winston.add(winston.transports.File, { filename: './somefile.log' });
winston.level = 'debug';

const tasks = [
  x => { console.log('task1'); x(); },
  x => { console.log('task2'); x(); },
  x => { console.log('task3'); x(); }
];
let taskID = 0;
let complete = 0;

tasks.forEach(task => {
  task(() => winston.debug('CHILL WINSTON!', `logging task${++taskID}`, waitForIt));
});

function waitForIt() {
  // Executed every time a logger has logged all of its transports
  if (++complete === tasks.length) nowGo();
}

function nowGo() {
  // Now all loggers have logged all of their transports
  winston.log('debug', 'All tasks complete. Moving on!');
}
Sure... you probably won't define tasks that way, but this is just to show one way you could launch all the tasks in parallel and wait until everything has been logged before continuing with other tasks.
Just to explain the example code:
The const tasks is an array of functions, where each one accepts a function x as a parameter, first performs the task at hand (in this case a simple console.log('task1')) and then executes the function received as a parameter, x();
The function passed as parameter to each one of those functions in the array is the () => winston.debug('CHILL WINSTON!',`logging task${++taskID}`, waitForIt)
The waitForIt, the third parameter in this winston.debug call, is the actual callback (the winston callback you inquired about).
Now, taskID counts the tasks that have been launched, while complete counts the loggers that have finished logging.
Being async, one could launch them as 1, 2, 3, but their loggers could end in a 1, 3, 2 sequence, for all we know. But since all of them will trigger the waitForIt callback once they're done, we just count how many have finished, then call the nowGo function when they all are done.
Compare it to
var winston = require('winston');
var logger = new winston.Logger({
  level: 'debug',
  transports: [
    new (winston.transports.Console)(),
    new (winston.transports.File)({ filename: './somefile.log' })
  ]
});

const tasks = [
  x => { console.log('task1'); x(); },
  x => { console.log('task2'); x(); },
  x => { console.log('task3'); x(); }
];
let taskID = 0;
let complete = 0;

tasks.forEach(task => {
  task(() => logger.debug('CHILL WINSTON!', `logging task${++taskID}`, (taskID === tasks.length) ? nowGo : null));
});

logger.on('logging', () => console.log(`# of complete loggers: ${++complete}`));

function nowGo() {
  // Stop listening to the logging event
  logger.removeAllListeners('logging');
  // Now all loggers have logged all of their transports
  logger.debug('All tasks complete. Moving on!');
}
In this case, the nowGo would be the callback, and it would be added only to the third logger.debug call. But if the second logger finished later than the third, it would have continued without waiting for the second one to finish logging.
In such a simple example it won't make a difference, since all of them finish equally fast, but I hope it's enough to get the concept across.
While at it, let me recommend the book Node.js Design Patterns by Mario Casciaro for more advanced async flow sequencing patterns. It also has a great EventEmitter vs callback comparison.
Hope this helped ;)
I am using npm ws module (or actually the wrapper called isomorphic-ws) for websocket connection.
NPM Module: isomorphic-ws
I use it to receive some array data from a websocket++ server running on the same PC. This data is then processed and displayed as a series of charts.
Now the problem is that the handling itself takes a very long time. I use one message to calculate 16 charts and for each of them I need to calculate a lot of logarithms and other slow operations and all that in JS. Well, the whole refresh operation takes about 20 seconds.
Now I could actually live with that, but the problem is that when I get a new request, it is processed only after the whole message handler is finished. And if I get several requests in the meantime, all of them will be processed in the order they came in. So the requests queue up and the current state gets more and more outdated as time goes on...
I would like to have a way of detecting that there is another message waiting to be processed. If that is the case, I could just stop the current handler at any time and start over... So when using npm ws, is there a way of telling that there is another message waiting to be processed?
Thanks
You need to create some sort of cancelable job wrapper. It's hard to give a concrete suggestion without seeing your code. But it could be something like this.
const processArray = array => {
  let canceled = false;
  const promise = new Promise((resolve, reject) => {
    const results = [];
    // do something with the array
    for (let i = 0; i < array.length; i++) {
      // check on each iteration if the job has been canceled
      if (canceled) return reject({ reason: 'canceled' });
      results.push(doSomething(array[i])); // doSomething: your actual per-item work
    }
    resolve(results);
  });
  return {
    cancel: () => {
      canceled = true;
    },
    promise
  };
};
const job = processArray(hugeArray) // hugeArray: an array with millions of entries
// handle the success
job.promise.then(result => console.log(result))
// Cancel the job
job.cancel()
I'm sure there are libraries to serve this exact purpose. But I just wanted to give a basic example of how it could be done.
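One caveat worth noting: in the example above the for loop runs synchronously inside the Promise executor, so cancel() can only take effect before the loop starts. For cancellation to work mid-run, the job has to yield back to the event loop periodically. Here is a hedged sketch of that idea (processArrayInChunks, work and chunkSize are made-up names):

```javascript
// A sketch of a cancelable job that yields between chunks, so cancel()
// (and incoming websocket messages) can be handled mid-run.
const processArrayInChunks = (array, work, chunkSize = 1000) => {
  let canceled = false;
  const promise = (async () => {
    const results = [];
    for (let i = 0; i < array.length; i++) {
      // the flag is re-checked on every iteration
      if (canceled) throw new Error('canceled');
      results.push(work(array[i]));
      // yield to the event loop after each chunk
      if (i % chunkSize === chunkSize - 1) {
        await new Promise(resolve => setImmediate(resolve));
      }
    }
    return results;
  })();
  return { promise, cancel: () => { canceled = true; } };
};

const job = processArrayInChunks([1, 2, 3], x => x * 2);
job.promise.then(result => console.log(result)); // resolves with the doubled array
```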
Threading-wise, what's the difference between web workers and functions declared as
async function xxx()
{
}
?
I am aware web workers are executed on separate threads, but what about async functions? Are such functions threaded in the same way as a function executed through setInterval is, or are they subject to yet another different kind of threading?
async functions are just syntactic sugar around Promises, and Promises are wrappers around callbacks.
// v await is just syntactic sugar
// v Promises are just wrappers
// v functions taking callbacks are actually the source for the asynchronous behavior
await new Promise(resolve => setTimeout(resolve));
Now, a callback could be invoked immediately by the code, e.g. if you .filter an array, or the engine could store the callback internally somewhere. Then, when a specific event occurs, it executes the callback. One could say that these are asynchronous callbacks, and those are usually the ones we wrap into Promises and await.
To make sure that two callbacks do not run at the same time (which would make concurrent modifications possible, and those cause a lot of trouble), an event does not get processed immediately when it occurs; instead a Job (a callback with its arguments) gets placed into a Job Queue. Whenever the JavaScript Agent (= thread²) finishes executing the current job, it looks into that queue for the next job to process¹.
Therefore one could say that an async function is just a way to express a continuous series of jobs.
async function getPage() {
  // the first job starts fetching the webpage
  const response = await fetch("https://stackoverflow.com"); // a callback gets registered under the hood; at some point an event gets triggered
  // the second job starts parsing the content
  const result = await response.json(); // again, callback and event under the hood
  // the third job logs the result
  console.log(result);
}

// the same series of jobs can also be found here:
fetch("https://stackoverflow.com") // first job
  .then(response => response.json()) // second job / callback
  .then(result => console.log(result)); // third job / callback
Although two jobs cannot run in parallel on one agent (= thread), the job of one async function might run between the jobs of another. Therefore, two async functions can run concurrently.
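This concurrency can be observed directly. In the following sketch (the names a, b, tick and log are made up), each async function runs synchronously up to its first await, and then the agent alternates between the remaining jobs of the two functions:

```javascript
// Two async functions interleaving their jobs on a single thread.
const log = [];
const tick = () => new Promise(resolve => setImmediate(resolve));

async function a() { log.push('a1'); await tick(); log.push('a2'); }
async function b() { log.push('b1'); await tick(); log.push('b2'); }

// neither function blocks the other: the jobs interleave as a1, b1, a2, b2
Promise.all([a(), b()]).then(() => console.log(log.join(' '))); // prints "a1 b1 a2 b2"
```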
Now who does produce these asynchronous events? That depends on what you are awaiting in the async function (or rather: what callback you registered). If it is a timer (setTimeout), an internal timer is set and the JS-thread continues with other jobs until the timer is done and then it executes the callback passed. Some of them, especially in the Node.js environment (fetch, fs.readFile) will start another thread internally. You only hand over some arguments and receive the results when the thread is done (through an event).
To get real parallelism, that is, running two jobs at the same time, multiple agents are needed. WebWorkers are exactly that: agents. The code in a WebWorker therefore runs independently (it has its own job queue and executor).
Agents can communicate with each other via events, and you can react to those events with callbacks. For sure you can await actions from another agent too, if you wrap the callbacks into Promises:
const workerDone = new Promise(res => window.onmessage = res);

(async function () {
  const result = await workerDone;
  // ...
})();
TL;DR:
JS <---> callbacks / promises <--> internal Thread / Webworker
¹ There are other terms coined for this behavior, such as event loop / queue and others. The term Job is specified by ECMA262.
² How the engine implements agents is up to the engine, though as one agent may only execute one Job at a time, it very much makes sense to have one thread per agent.
In contrast to WebWorkers, async functions are never guaranteed to be executed on a separate thread.
They just don't block the whole thread until their response arrives. You can think of them as being registered as waiting for a result, let other code execute and when their response comes through they get executed; hence the name asynchronous programming.
This is achieved through a message queue, which is a list of messages to be processed. Each message has an associated function which gets called in order to handle the message.
Doing this:
setTimeout(() => {
  console.log('foo')
}, 1000)
will simply add the callback function (that logs to the console) to the message queue. When its 1000 ms timer elapses, the message is popped from the message queue and executed.
While the timer is ticking, other code is free to execute. This is what gives the illusion of multithreading.
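A tiny sketch of that behaviour: even a 0 ms timer callback is only taken from the queue after the currently running code has finished its job.

```javascript
const order = [];

setTimeout(() => order.push('timer callback'), 0); // queued, not run immediately
order.push('synchronous code');                    // runs first, in the current job

setTimeout(() => console.log(order.join(' -> ')), 10); // prints "synchronous code -> timer callback"
```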
The setTimeout example above uses callbacks. Promises and async/await work the same way at a lower level: they piggyback on that message-queue concept but are just syntactically different.
Workers are also accessed through asynchronous code (i.e. Promises); however, Workers are a solution to CPU-intensive tasks which would block the thread that the JS code runs on, even if the CPU-intensive function is invoked asynchronously.
So if you have a CPU-intensive function like renderThread(duration), and you do:
new Promise((v, x) => setTimeout(_ => (renderThread(500), v(1)), 0))
  .then(v => console.log(v));
new Promise((v, x) => setTimeout(_ => (renderThread(100), v(2)), 0))
  .then(v => console.log(v));
Even if the second one takes less time to complete, it will only be invoked after the first one releases the CPU thread. So we will get first 1 and then 2 on the console.
However, had these two functions been run on separate Workers, the outcome we would expect is 2 and then 1, as they could then run concurrently and the second one would finish and post its message back earlier.
So for basic IO operations, standard single-threaded asynchronous code is very efficient, and the need for Workers arises from tasks which are CPU-intensive and can be segmented (assigned to multiple Workers at once), such as FFT and whatnot.
Async functions have nothing to do with web workers or node child processes - unlike those, they are not a solution for parallel processing on multiple threads.
An async function is just1 syntactic sugar for a function returning a chain of promise then() calls.
async function example() {
  await delay(1000);
  console.log("waited.");
}
is just the same as
function example() {
  return Promise.resolve(delay(1000)).then(() => {
    console.log("waited.");
  });
}
These two are virtually indistinguishable in their behaviour. The semantics of await are specified in terms of promises, and every async function does return a promise for its result.
1: The syntactic sugar gets a bit more elaborate in the presence of control structures such as if/else or loops which are much harder to express as a linear promise chain, but it's still conceptually the same.
Are such functions threaded in the same way as a function executed through setInterval is?
Yes, the asynchronous parts of async functions run as (promise) callbacks on the standard event loop. The delay in the example above would be implemented with the normal setTimeout, wrapped in a promise for easy consumption:
function delay(t) {
  return new Promise(resolve => {
    setTimeout(resolve, t);
  });
}
I want to add my own answer to my question, with the understanding I gathered through all the other people's answers:
Ultimately, all but web workers are glorified callbacks. Code in async functions, functions called through promises, functions called through setInterval and such, all get executed in the main thread with a mechanism akin to context switching. No parallelism exists at all.
True parallel execution with all its advantages and pitfalls, pertains to webworkers and webworkers alone.
(pity - I thought with "async functions" we finally got streamlined and "inline" threading)
Here is a way to call standard functions as workers, enabling true parallelism. It's an unholy hack written in blood with help from satan, and probably there are a ton of browser quirks that can break it, but as far as I can tell it works.
[constraints: the function header has to be as simple as function f(a,b,c) and if there's any result, it has to go through a return statement]
function Async(func, params, callback)
{
    // ACQUIRE ORIGINAL FUNCTION'S CODE
    var text = func.toString();

    // EXTRACT ARGUMENTS
    var args = text.slice(text.indexOf("(") + 1, text.indexOf(")"));
    args = args.split(",").map(arg => arg.trim());

    // ALTER FUNCTION'S CODE:
    // 1) DECLARE ARGUMENTS AS VARIABLES
    // 2) REPLACE RETURN STATEMENTS WITH THREAD POSTMESSAGE AND TERMINATION
    var body = text.slice(text.indexOf("{") + 1, text.lastIndexOf("}"));
    for(var i = 0, c = params.length; i < c; i++)
        body = "var " + args[i] + " = " + JSON.stringify(params[i]) + ";" + body;
    body = body + " self.close();";
    body = body.replace(/return\s+([^;]*);/g, 'self.postMessage($1); self.close();');

    // CREATE THE WORKER FROM FUNCTION'S ALTERED CODE
    var code = URL.createObjectURL(new Blob([body], {type: "text/javascript"}));
    var thread = new Worker(code);

    // WHEN THE WORKER SENDS BACK A RESULT, CALL BACK AND TERMINATE THE THREAD
    thread.onmessage = function(result)
    {
        if(callback) callback(result.data);
        thread.terminate();
    };
}
So, assuming you have this potentially cpu intensive function...
function HeavyWorkload(nx, ny)
{
    var data = [];
    for(var x = 0; x < nx; x++)
    {
        data[x] = [];
        for(var y = 0; y < ny; y++)
        {
            data[x][y] = Math.random();
        }
    }
    return data;
}
...you can now call it like this:
Async(HeavyWorkload, [1000, 1000],
    function(result)
    {
        console.log(result);
    }
);
I have a redux saga setup where for some reason my channel blocks when I'm trying to take from it and I can't work out why.
I have a PubSub mechanism which subscribes to an event and when received calls this function:
const sendRoundEnd = (msg, data) => {
  console.log('putting round end')
  roundEndChannel.put({
    type: RUN_ENDED,
    data: data.payload
  })
}
I have a watcher for this channel defined like this:
function* watchRoundEndChannel() {
  while (true) {
    console.log('before take')
    const action = yield take(roundEndChannel)
    console.log('after take')
    yield put(action)
  }
}
And I have a reducer setup which is listening for the put of RUN_ENDED like this:
case RUN_ENDED:
  console.log(action)
  return {
    ...state,
    isRunning: false,
    roundResult: action.data
  }
Finally, I have a roundEndChannel const within the file (but not within the functions) and I export the following function as part of an array which is fed into yield all[]:
takeEvery(roundEndChannel, watchRoundEndChannel)
So if my understanding is right, when I get the msg from my pubsub I should first hit sendRoundEnd which puts to roundEndChannel which should in turn put the RUN_ENDED action.
What's weird however is that when I run these functions and receive the message from the pubsub, the following is logged:
putting round end
before take
I never get to the after take which suggests to me that the channel doesn't have anything in it, but I'm pretty sure that isn't the case as it should have been put to in the event handler of the pubsub immediately prior.
It feels like I'm missing something simple here, does anyone have any ideas (or ways I can examine the channel at different points to see what's in there?)
Arg, managed to fix this. The problem was that I had exported watchRoundEndChannel wrapped in a takeEvery, which was snatching up my pushed events.
I exported the function as fork(watchRoundEndChannel) instead, and things work as I expected.
I've written code in Node.js and my data is on Firebase. The problem I'm facing is that my code never exits. I've done it like this one: Link
The problem is that the Firebase reference/listener never becomes null and therefore my function never exits. I tried using firebase.database().goOffline() but it didn't work.
On my local machine I forcefully stopped the process using process.exit(0), but when I deployed my code to AWS Lambda, it doesn't return any response/callback and exits (giving the error message "Process exited before completing request").
I also added a wait of 5-10 seconds after invoking the callback in Lambda and then forcefully exited the process, but that didn't help either.
How to fix this issue? Please help.
You're going through a crisis that every new Lambda user has gone through.
As suggested, you can use context.done for stopping.
However, this is not recommended, as it is only possible due to historic runtime versions of Node.js.
why this timeout happens?
Your lambda may get to the last line of your code and still keep running. Well, it is actually waiting for something: for the event loop to be empty.
what this means?
In nodejs, when you make an async operation and register a callback function to be executed once the operation is done, the registration sort of happens in the event loop.
In one line: it's the event loop that knows which callback function to execute when an async operation ends. But that's a story for another thread :)
back to Lambda
Given the above information, it follows that lambda does not halt before the event loop is empty, as halting earlier would mean some follow-up procedure would not execute after an async operation returns.
What if you still need to halt the execution manually, regardless of the event loop status?
At the beginning of the function, execute:
context.callbackWaitsForEmptyEventLoop = false
And then use the third parameter you get in the handler signature, which is the callback.
the callback parameter
It is a function which you call when you want to end the execution.
If you call it with no parameters, or with null as the first parameter and text as the second parameter, it is considered a successful invocation.
To fail the lambda execution, you can call the callback function with some non-null value as the first parameter.
Add this line at the beginning of your handler function and then you should be able to use the callback without issue:
function handler (event, context, callback) {
context.callbackWaitsForEmptyEventLoop = false // Add this line
}
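For completeness, here is a sketch of a full handler combining that line with the callback convention described above (the result shape and the messages are assumptions, not your actual code):

```javascript
// A sketch of a Node.js Lambda handler using the callback parameter.
function handler(event, context, callback) {
  // don't wait for lingering handles (e.g. Firebase listeners) before returning
  context.callbackWaitsForEmptyEventLoop = false;
  try {
    const result = { statusCode: 200, body: 'ok' };
    callback(null, result); // success: first argument is null
  } catch (err) {
    callback(err); // failure: first argument is non-null
  }
}
```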
Setting callbackWaitsForEmptyEventLoop to false should only be your last resort if nothing else works for you, as this might introduce worse bugs than the problem you're trying to solve here.
This is what I do instead to ensure every call has firebase initialized, and deleted before exiting.
// handlerWithFirebase.js
const admin = require("firebase-admin");
const config = require("./config.json");

function initialize() {
  return admin.initializeApp({
    credential: admin.credential.cert(config),
    databaseURL: "https://<your_app>.firebaseio.com",
  });
}

function handlerWithFirebase(func) {
  return (event, context, callback) => {
    const firebase = initialize();
    let _callback = (error, result) => {
      firebase.delete();
      callback(error, result);
    };
    // passing firebase into your handler here is optional
    func(event, context, _callback, firebase /* optional */);
  };
}

module.exports = handlerWithFirebase;
And then in my lambda handler code
// myHandler.js
const handlerWithFirebase = require("../firebase/handler");

module.exports.handler = handlerWithFirebase(
  (event, context, callback, firebase) => {
    ...
  });
Calling the callback function and then process.exit(0) didn't help in my case. The goOffline() method of Firebase didn't help either.
I fixed the issue by calling context.done(error, response) (instead of the callback method). Now my code is working.
Still, if anyone has a better solution, kindly post it here. It may help someone else :)
I've been learning about continuation passing style, particularly the asynchronous version as implemented in javascript, where a function takes another function as a final argument and creates an asynchronous call to it, passing the return value to this second function.
However, I can't quite see how continuation-passing does anything more than recreate pipes (as in unix commandline pipes) or streams:
replace('somestring','somepattern', filter(str, console.log));
vs
echo 'somestring' | replace 'somepattern' | filter | console.log
Except that the piping is much, much cleaner. With piping, it seems obvious that the data is passed on, and simultaneously execution is passed to the receiving program. In fact with piping, I expect the stream of data to be able to continue to pass down the pipe, whereas in CPS I expect a serial process.
It is imaginable, perhaps, that CPS could be extended to continuous piping if a comms object and update method was passed along with the data, rather than a complete handover and return.
Am I missing something? Is CPS different (better?) in some important way?
To be clear, I mean continuation-passing, where one function passes execution to another, not just plain callbacks. CPS appears to imply passing the return value of a function to another function, and then quitting.
UNIX pipes vs async javascript
There is a big fundamental difference between the way unix pipes behave vs the async CPS code you link to.
Mainly that the pipe blocks execution until the entire chain is completed whereas your async CPS example will return right after the first async call is made, and will only execute your callback when it is completed. (When the timeout wait is completed, in your example.)
Take a look at this example. I will use the Fetch API and Promises to demonstrate async behavior instead of setTimeout to make it more realistic. Imagine that the first function f1() is responsible for calling some webservice and parsing the result as a json. This is "piped" into f2() that processes the result.
CPS style:
function f2(json) {
  // do some parsing
}

function f1(param, next) {
  return fetch(param).then(response => response.json()).then(json => next(json));
}

// you call it like this:
f1("https://service.url", f2);
You can write something that syntactically looks like a pipe if you move the call to f2 out of f1, but that will do exactly the same as above:
function f1(param) {
  return fetch(param).then(response => response.json());
}

// you call it like this:
f1("https://service.url").then(f2);
But this still will not block. You cannot do this task using blocking mechanisms in JavaScript; there is simply no mechanism to block on a Promise. (Well, in this case you could use a synchronous XMLHttpRequest, but that's not the point here.)
CPS vs piping
The difference between the above two methods is who has the control to decide whether to call the next step, and with exactly what parameters: the caller (latter example) or the called function (CPS).
A good example where CPS comes very handy is middleware. Think about a caching middleware for example in a processing pipeline. Simplified example:
function cachingMiddleware(request, next) {
  if (someCache.containsKey(request.url)) {
    return someCache[request.url];
  }
  return next(request);
}
The middleware executes some logic, checks if the cache is still valid:
If it is not, then next is called, which then will proceed on with the processing pipeline.
If it is valid then the cached value is returned, skipping the next execution.
Continuation Passing Style at application level
Instead of comparing at an expression/function-block level, factoring Continuation Passing Style at an application level can provide an avenue for flow control advantages through its "continuation" function (a.k.a. callback function). Lets take Express.js for example:
Each express middleware takes a rather similar CPS function signature:
const middleware = (req, res, next) => {
  /* middleware's logic */
  next();
}

const customErrorHandler = (error, req, res, next) => {
  /* custom error handling logic */
};
next is express's native callback function.
Correction: The next() function is not a part of the Node.js or Express API, but is the third argument that is passed to the middleware function. The next() function could be named anything, but by convention it is always named “next”
req and res are naming conventions for HTTP request and HTTP response respectively.
A route handler in Express.JS would be made up of one or more middleware functions. Express.js will pass each of them the req, res objects with changes made by the preceding middleware to the next, and an identical next callback.
app.get('/get', middleware1, middleware2, /*...*/ , middlewareN, customErrorHandler)
The next callback function serves:
As a middleware's continuation:
Calling next() passes the execution flow to the next middleware function. In this case it fulfils its role as a continuation.
Also as a route interceptor:
Calling next('Custom error message') bypasses all subsequent middlewares and passes the execution control to customErrorHandler for error handling. This makes 'cancellation' possible in the middle of the route!
Calling next('route') bypasses subsequent middlewares and passes control to the next matching route, e.g. /get/part.
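To make the continuation role of next concrete, here is a toy dispatcher (an illustration of the pattern, not Express's actual implementation): each middleware decides whether the chain proceeds by calling, or not calling, next.

```javascript
// Minimal middleware runner: next() is the continuation to the rest of the chain.
function run(middlewares, req) {
  const dispatch = i => {
    if (i < middlewares.length) middlewares[i](req, () => dispatch(i + 1));
  };
  dispatch(0);
}

const trace = [];
run([
  (req, next) => { trace.push('auth'); next(); },
  (req, next) => { trace.push('cache'); /* no next(): the chain stops here */ },
  (req, next) => { trace.push('handler'); next(); }
], {});

console.log(trace.join(' -> ')); // prints "auth -> cache" - the handler never runs
```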
Imitating Pipe in JS
There is a TC39 proposal for pipe, but until it is accepted we'll have to imitate pipe's behaviour manually. Nesting CPS functions can potentially lead to callback hell, so here is my attempt at cleaner code:
Assuming that we want to compute the sentence 'The fox jumps over the moon' by replacing parts of a starter string (e.g. props):
const props = " The [ANIMAL] [ACTION] over the [OBJECT] "
The functions that replace the different parts of the string are sequenced in an array:
const insertFox = s => s.replace(/\[ANIMAL\]/g, 'fox')
const insertJump = s => s.replace(/\[ACTION\]/g, 'jumps')
const insertMoon = s => s.replace(/\[OBJECT\]/g, 'moon')
const trim = s => s.trim()
const modifiers = [insertFox, insertJump, insertMoon, trim]
We can achieve a synchronous, non-streaming, pipe behaviour with reduce.
const pipeJS = (chain, callBack) => seed =>
  callBack(chain.reduce((acc, next) => next(acc), seed))

const callback = o => console.log(o)
pipeJS(modifiers, callback)(props) //-> 'The fox jumps over the moon'
And here is the asynchronous version of pipeJS;
const pipeJSAsync = chain => async seed =>
  await chain.reduce(async (acc, next) => next(await acc), seed)
const callbackAsync = o => console.log(o)
pipeJSAsync(modifiers)(props).then(callbackAsync) //-> 'The fox jumps over the moon'
Hope this helps!