I want to understand the event loop better. I have read documents, articles, and the Node.js API docs. Almost all of them single out timers:
setImmediate():
setImmediate(callback[, arg][, ...])
To schedule the "immediate" execution of callback after I/O events callbacks and before setTimeout and setInterval.
process.nextTick():
process.nextTick(callback[, arg][, ...])
This is not a simple alias to setTimeout(fn, 0), it's much more
efficient. It runs before any additional I/O events (including
timers) fire in subsequent ticks of the event loop.
Why? What is so exceptional about timer functions in Node.js in the context of Event Loop?
These all relate to having really fine grained control over the asynchronous execution of the callback.
The nextTick() callback runs soonest after the point where it is called. It got its name from the event loop "tick": one full turn of the event loop, in which the different sorts of events (timers, I/O, network) are each processed once. The name is confusing today because its semantics have changed across Node versions: the nextTick() callback is now executed in the same tick as the calling code, right after the current operation completes. Calling nextTick() again from inside a nextTick() callback appends to the same queue, so those callbacks also run in the same tick.
setImmediate() gives the second closest execution point. It is similar to setTimeout(fn, 0), but its callbacks run in a dedicated phase of the event loop right after I/O polling, so when it is scheduled from within an I/O callback it always runs before any setTimeout() and setInterval() timers. (When both are scheduled from the main module, the relative order of setImmediate() and setTimeout(fn, 0) is not guaranteed.) You get a sort of fast lane of asynchronous execution. Calling setImmediate() from inside a setImmediate() callback defers the new callback to the next tick; it is not executed in the same tick, in contrast to how nextTick() callbacks are. This setImmediate() behavior was originally the semantics of nextTick(), hence the name nextTick().
setTimeout() and setInterval() then work as expected, having the third closest execution point (or later if the timeouts are long).
Put simply, when scheduled from within an I/O callback, the order in which they are called is: nextTick() --> setImmediate() --> setTimeout(fn, 0)
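As a rough illustration of the nesting behavior described above, here is a small sketch. The helper name demoNesting and the labels are made up for this example; it only records the order in which the callbacks run, and the order in the trailing comment is what I would expect on current Node versions.

```javascript
// Hypothetical helper: records the order in which nested callbacks run.
function demoNesting() {
  return new Promise((resolve) => {
    const order = [];

    setImmediate(() => {
      order.push("immediate A");
      // Scheduled from inside an immediate: deferred to the next loop iteration.
      setImmediate(() => {
        order.push("immediate A (nested)");
        resolve(order);
      });
    });
    setImmediate(() => order.push("immediate B"));

    process.nextTick(() => {
      order.push("nextTick 1");
      // Scheduled from inside a nextTick callback: still runs in the same tick.
      process.nextTick(() => order.push("nextTick 2 (nested)"));
    });

    order.push("sync");
  });
}

demoNesting().then((order) => console.log(order.join(" -> ")));
// sync -> nextTick 1 -> nextTick 2 (nested) -> immediate A -> immediate B -> immediate A (nested)
```

The nested nextTick() runs before any immediate, while the nested setImmediate() has to wait until "immediate B" (queued earlier for the same iteration) has run.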
So I kind of understand the JS event loop, but still have a few questions. Here is my scenario and a few questions.
So let's say I have these functions:
function1 - reads an absolutely huge file.
function2 - console.log("Hey");
function3 - console.log("What's up");
The way I understand this, and correct me if I'm wrong, is that function1, function2, and function3 would be added to the queue. Then function1 would be added to the call stack, followed by the next two functions.
Now the part where I'm confused: because the first function is going to take an extremely long time, what happens to it? Does it get pushed somewhere else so that the next two functions are executed? I think the answer is that the only way it gets pushed somewhere else, so that you can continue running, is to make it an asynchronous function. And the way you make it an asynchronous function is by using either a callback function or promises. If this is the case, how does it know that this is an asynchronous function? And where does it get pushed to so that the other two functions can be executed, since they are relatively simple?
I think I answered the question myself but I keep confusing myself so if somebody could explain in extremely simple terms that would be great and sorry for the extremely stupid question.
Ordinary function calls are not pushed on the event queue, they're just executed synchronously.
Certain built-in functions initiate asynchronous operations. For instance, setTimeout() creates a timer that will execute the function asynchronously at a future time. fetch() starts an AJAX request, and returns a promise that will resolve when the response is received. addEventListener() creates a listener that will call the function when the specified event occurs on an element.
In all these cases, what effectively happens is that the callback function is added to the event queue when the corresponding condition is reached.
When one of these functions is called, it runs to completion. When the function returns, the event loop pulls the next item off the event queue and runs its callback, and so on.
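To make the "runs to completion" point concrete, here is a minimal sketch using stand-ins for the three functions in the question. The helper name runToCompletionDemo is hypothetical, and the busy loop is only an assumption standing in for "long synchronous work".

```javascript
// Hypothetical stand-ins for function1, function2 and function3 from the question.
function runToCompletionDemo() {
  return new Promise((resolve) => {
    const log = [];

    const function1 = () => {
      // Simulate long synchronous work with a busy loop; nothing else can run meanwhile.
      const start = Date.now();
      while (Date.now() - start < 50) { /* busy wait */ }
      log.push("function1 done");
    };
    const function2 = () => log.push("Hey");
    const function3 = () => log.push("What's up");

    // Scheduled first, but its callback waits in the queue until the
    // synchronous calls below have all run to completion.
    setTimeout(() => {
      log.push("timer callback");
      resolve(log);
    }, 0);

    function1();
    function2();
    function3();
  });
}

runToCompletionDemo().then((log) => console.log(log.join(" -> ")));
// function1 done -> Hey -> What's up -> timer callback
```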
If the event queue is empty, the event loop just idles until something is added. So when you're just staring at a simple web page, nothing may happen until you click on something that has an event listener; then its listener function will run.
In interactive applications like web pages, we try to avoid writing functions that take a long time to run to completion, because they block the user interface (other asynchronous actions can't interrupt them). So if you're going to read a large file, you use an API that reads it incrementally, calling an event listener for each block. That will allow other functions to run between the processing of each block.
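In Node.js, one way to read a file incrementally is a read stream. The following is only a sketch: the helper name readInChunks, the temporary file name, and the chunk sizes are assumptions chosen for the example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: reads a file block by block and counts what it saw.
function readInChunks(filePath, chunkSize) {
  return new Promise((resolve, reject) => {
    let chunks = 0;
    let bytes = 0;
    const stream = fs.createReadStream(filePath, { highWaterMark: chunkSize });
    // The "data" listener is called once per block; other callbacks can run in between.
    stream.on("data", (chunk) => {
      chunks += 1;
      bytes += chunk.length;
    });
    stream.on("end", () => resolve({ chunks, bytes }));
    stream.on("error", reject);
  });
}

// Demo on a throwaway 256 KiB file in the temp directory.
const demoFile = path.join(os.tmpdir(), `event-loop-demo-${process.pid}.txt`);
fs.writeFileSync(demoFile, "x".repeat(256 * 1024));
readInChunks(demoFile, 64 * 1024)
  .then(({ chunks, bytes }) => console.log(`read ${bytes} bytes in ${chunks} chunks`))
  .finally(() => fs.unlinkSync(demoFile));
```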
There's nothing specific that identifies asynchronous functions; it's just part of the definition of each function. You can't say that any function that has a callback argument is asynchronous, because functions like Array.prototype.forEach() are synchronous. And promises don't make something asynchronous -- you can create a promise that resolves synchronously, although there's not usually a point to it (but you might do this as a stub when the caller expects a promise in the general case). The keyword async before a function definition just wraps its return value in a promise; it doesn't actually make it run asynchronously.
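A short sketch of these three cases follows; the helper name demoSyncVsAsync and the labels are hypothetical. One nuance it also shows: even when a promise is resolved synchronously, its then() callback is still deferred until the current synchronous code has finished.

```javascript
// Hypothetical helper: records what runs synchronously and what is deferred.
function demoSyncVsAsync() {
  const order = [];

  // A callback argument does not imply asynchrony: forEach calls it synchronously.
  [1, 2].forEach((n) => order.push(`forEach ${n}`));

  // The promise executor runs synchronously and resolves right away...
  new Promise((resolve) => {
    order.push("promise executor");
    resolve();
  }).then(() => order.push("then callback")); // ...but the then() callback is deferred.

  // An async function body runs synchronously until its first await.
  async function greet() {
    order.push("async function body");
  }
  greet();

  order.push("end of synchronous code");

  // Resolves after the pending then() callback above has run.
  return Promise.resolve().then(() => order);
}

demoSyncVsAsync().then((order) => console.log(order.join(" -> ")));
// forEach 1 -> forEach 2 -> promise executor -> async function body -> end of synchronous code -> then callback
```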
I'm learning how the event loop works in Node.js, and I'm doing some exercises, but I'm confused about the following, as explained below.
const fs = require("fs");
setTimeout(() => console.log("Timer 1"), 0);
setImmediate(() => console.log("Immediate 1"));
fs.readFile("test-file-with-1-million-lines.txt", () => {
  console.log("I/O");
  setTimeout(() => console.log("Timer 2"), 0);
  setTimeout(() => console.log("Timer 3"), 3000);
  setImmediate(() => console.log("Immediate 2"));
});
console.log("Hello");
I expected to see the following output:
Hello
Timer 1
Immediate 1
I/O
Timer 2
Immediate 2
Timer 3
but I get the following output:
Hello
Timer 1
Immediate 1
I/O
Immediate 2
Timer 2
Timer 3
Would you please clarify for me how these lines are executed, step by step?
First off, I should mention that if you really want asynchronous operation A to be processed in a specific order with relation to asynchronous operation B, you should probably write your code such that it guarantees that without relying on the details of exactly what gets to run first. But, that said, I have run into issues where one type of asynchronous operation can "hog" the event loop and starve other types of events and it can be useful to understand what's really going on inside if/when that happens.
Broken down to its core, your question is really about why Immediate2 logs before Timer2 when scheduled from within an I/O callback, but not when scheduled from top-level code. That looks inconsistent.
This has to do with where the event loop is in its cycle through various checks it is doing when the setTimeout() and setImmediate() are called (when they are scheduled). It is somewhat explained here: https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/#setimmediate-vs-settimeout.
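If you look at the somewhat simplified diagram of the event loop in the above article, the phases run in this order and then wrap around: timers, pending callbacks, idle/prepare, poll, check, close callbacks.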
You can see that there are a number of different parts to the event loop cycle. setTimeout() is served by the "timers" block at the top of the diagram. setImmediate() is served in the "check" block near the bottom of the diagram. File I/O is served in the "poll" block in the middle.
So, if you schedule both a setImmediate(fn1) and a setTimeout(fn2, 0) from within a file I/O callback (which is your case for Immediate2 and Timer2), then the event loop happens to be in the poll phase when these two are scheduled. The next phase of the event loop is the "check" phase, so the setImmediate(fn1) gets processed. Then, after the "check" phase and the "close callbacks" phase, it cycles back around to the "timers" phase and you get the setTimeout(fn2, 0).
If, on the other hand, you call those same two setImmediate() and setTimeout() from code that runs from a different phase of the event loop, then the timer might get processed first before the setImmediate() - it will depend upon exactly where that code was executed from in the event loop cycle.
This structure of the event loop is why some people describe setImmediate() as "it runs right after I/O" because it's positioned in the loop to be processed right after the "poll" phase. If you are in the middle of processing some file I/O in an I/O callback and you want something to run as soon as the stack unwinds, you can use setImmediate() to accomplish that. It will always run after the current I/O callback finishes, but before timers.
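Here is a minimal sketch of that case. The helper name demoInsideIoCallback is hypothetical, and fs.stat(".") is just an assumption standing in for any asynchronous file operation; only the phase in which its callback runs matters.

```javascript
const fs = require("fs");

// Hypothetical helper: schedules a timer and an immediate from inside an I/O callback.
function demoInsideIoCallback() {
  return new Promise((resolve) => {
    const order = [];
    // The callback of an asynchronous fs call runs in the poll phase.
    fs.stat(".", () => {
      order.push("I/O");
      setTimeout(() => {
        order.push("Timeout");
        resolve(order);
      }, 0);
      // The check phase comes right after poll, so this runs before the timer.
      setImmediate(() => order.push("Immediate"));
    });
  });
}

demoInsideIoCallback().then((order) => console.log(order.join(" -> ")));
// I/O -> Immediate -> Timeout
```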
Note: Missing from this simplified description is promises which have their own special treatment. Promises are considered microtasks and they have their own queue. They get to run a lot more often. Starting with node v11, they get to run in every phase of the event loop. So, if you have three pending timers that are ready to run and you get to the timer phase of the event loop and call the callback for the first pending timer and in that timer callback, you resolve a promise, then as soon as that timer callback returns back to the system, then it will serve that resolved promise. So, microtasks (such as promises and process.nextTick()) get served (if waiting to run) between every operation in the event loop, not just between phases of the event loop, but even between pending events in the same phase. You can read more about these specifics and the changes in node v11 here: New Changes to the Timers and Microtasks in Node v11.0.0 and above.
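The timer scenario from this note can be sketched as follows, assuming Node v11 or later (on older versions both timers would run before the nextTick and promise callbacks). The helper name demoMicrotasksBetweenTimers is made up for the example.

```javascript
// Hypothetical helper: two timers are due at the same time; the first one
// queues a nextTick callback and a promise reaction.
function demoMicrotasksBetweenTimers() {
  return new Promise((resolve) => {
    const order = [];

    setTimeout(() => {
      order.push("timer 1");
      process.nextTick(() => order.push("nextTick from timer 1"));
      Promise.resolve().then(() => order.push("promise from timer 1"));
    }, 0);

    setTimeout(() => {
      order.push("timer 2");
      resolve(order);
    }, 0);
  });
}

demoMicrotasksBetweenTimers().then((order) => console.log(order.join(" -> ")));
// On Node v11+:
// timer 1 -> nextTick from timer 1 -> promise from timer 1 -> timer 2
```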
I believe this was done to improve the performance of promise-related code as promises became more of a central part of the Node.js architecture for asynchronous operations, and there is also some standards-related work in this area to make this consistent across different JS environments.
Here's another reference that covers part of this:
Nodejs Event Loop - interaction with top-level code
The reason for this output is the asynchronous nature of javascript.
You schedule the first two outputs with a sort of timeout whose delay is 0; this still makes them wait a tick.
Next you have the file read, which takes a while to finish and thus delays the execution of the functions in the callback.
The first console.log within the callback fires as soon as the callback is executed, and the rest of the callback is scheduled like the first part of your code.
Lastly, the console.log at the bottom gets executed first because there is no delay for it and it does not need to wait until the next tick.
As some added help, check out this video.
https://youtu.be/cCOL7MC4Pl0
The presenter gives an amazing talk on the event loop. I think it is a great resource.
While this is particularly for the browser, many aspects are shared in Node.
I am pretty new to the JS event loop. Could anyone give me a brief walkthrough of how the JS engine runs this:
function start(){
  setTimeout(function(){
    console.log("Timeout")
  }, 0)
  setImmediate(function(){
    console.log("Immediate")
  })
  process.nextTick(function(){
    console.log("next tick")
  })
}
The result is:
next tick
Timeout
Immediate
I thought that when the JS engine runs this,
it parses the script, sets the timer, sets the immediate, sets the nextTick queue, then goes into the POLL stage and checks whether anything is queued there (nothing in this case).
Before moving to the CHECK stage, it runs the nextTick queue and prints "next tick".
It moves to the CHECK stage, runs the immediate queue, and prints "Immediate".
It loops back to the TIMER stage and prints "Timeout".
My confusion is why "Timeout" prints before "Immediate".
PS,
After I change the timeout delay from 0 to 3 or more, the order is:
next tick
Immediate
Timeout
But this still does not explain the previous order.
Is there anything I missed?
setImmediate queues the function behind whatever I/O event callbacks are already in the event queue. So, in this case, the setTimeout callback is already in the queue.
I think you should read https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/ again. It has the answers to all your questions. I am quoting the following lines from that document:
setImmediate() vs setTimeout()
setImmediate and setTimeout() are similar, but behave in different ways depending on when they are called.
setImmediate() is designed to execute a script once the current poll phase completes.
setTimeout() schedules a script to be run after a minimum threshold in ms has elapsed.
The order in which the timers are executed will vary depending on the context in which they are called.
Understanding process.nextTick()
process.nextTick() is not technically part of the event loop. Instead, the nextTickQueue will be processed after the current operation completes, regardless of the current phase of the event loop.
Looking back at our diagram, any time you call process.nextTick() in a given phase, all callbacks passed to process.nextTick() will be resolved before the event loop continues. This can create some bad situations because it allows you to "starve" your I/O by making recursive process.nextTick() calls, which prevents the event loop from reaching the poll phase.
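The starvation effect from the quoted passage can be shown with a bounded version of the recursion. This is only a sketch; the helper name demoNextTickStarvation and the round count are assumptions, and the recursion is capped so the example terminates.

```javascript
// Hypothetical helper: a timer that is due almost immediately cannot fire
// until the recursive nextTick chain has finished.
function demoNextTickStarvation(rounds) {
  return new Promise((resolve) => {
    let ticks = 0;

    // Reports how many nextTick callbacks ran before the timer got its turn.
    setTimeout(() => resolve(ticks), 0);

    function spin() {
      ticks += 1;
      // Bounded here; an unbounded recursion would keep the loop from ever moving on.
      if (ticks < rounds) process.nextTick(spin);
    }
    process.nextTick(spin);
  });
}

demoNextTickStarvation(100000).then((ticks) =>
  console.log(`timer fired after ${ticks} nextTick callbacks`)
);
// timer fired after 100000 nextTick callbacks
```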
I was going through node docs for event loop and I got very confused.
It says -
timers: this phase executes callbacks scheduled by setTimeout() and setInterval().
I/O callbacks: executes almost all callbacks with the exception of close callbacks, the ones scheduled by timers, and setImmediate().
idle, prepare: only used internally.
poll: retrieve new I/O events; node will block here when appropriate.
check: setImmediate() callbacks are invoked here.
close callbacks: e.g. socket.on('close', ...).
Then, in the detailed description of the poll phase, they say that it executes scripts for timers and also processes I/O events in the poll queue. My confusion is that we already have a timers phase and an I/O callbacks phase for those callbacks, so what work is done by the poll phase? It also says that the thread may sleep in the poll phase, but I don't understand that properly.
My questions are:
Why is the poll phase executing scripts for timers and I/O when we already have the timers and I/O callbacks phases?
Is it that the poll phase executes callbacks on behalf of the timers and I/O callbacks phases, and those phases are only for internal processing, with no callbacks executed in them?
Where do promises fit in this loop? Earlier I thought that promises could be treated simply as callbacks, but in this video the speaker says that promises go into an internal event loop, without going into detail.
I am very confused at this point. Any help will be appreciated.
The poll phase boils down to an asynchronous I/O wait. Libuv will use different APIs depending on the OS but they all generally have the same pattern. I'm going to use select() as an example.
The poll is basically a system call like this:
select(maxNumberOfIO, readableIOList, writableIOList, errorIOList, timeout);
This function blocks. If no timeout value is specified it blocks forever.
The result is that node.js will not be able to execute any javascript as long as there is no I/O activity. This obviously makes it impossible to execute time-based callbacks like setTimeout() or setInterval().
Therefore, what node needs to do before calling such a function is to calculate what value to pass as the timeout. It generally does this by going through the list of all timers and figuring out the shortest amount of time it can wait for I/O (until the next nearest timer), and uses that as the timeout value. It basically processes all the timers, not to execute their callbacks but to figure out the wait time.
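The calculation described above can be sketched roughly like this. This is not libuv's actual code: the function name computePollTimeout, the plain array of expiry times, and the -1 "block until I/O arrives" sentinel are all assumptions made for illustration.

```javascript
// Hypothetical sketch (not libuv's implementation): choose the timeout to
// pass to the blocking poll call.
// timerExpiries: absolute expiry times in ms; now: current time in ms.
function computePollTimeout(timerExpiries, now) {
  // No timers pending: assumed sentinel meaning "block until I/O arrives".
  if (timerExpiries.length === 0) return -1;
  // Wait no longer than the nearest timer; never pass a negative wait.
  const nearest = Math.min(...timerExpiries);
  return Math.max(0, nearest - now);
}

console.log(computePollTimeout([1500, 1200, 2000], 1000)); // 200
console.log(computePollTimeout([900], 1000)); // 0 (timer already due, do not block)
console.log(computePollTimeout([], 1000)); // -1
```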
Node.js has 5 major phases.
1) timers phase.
2) pending callbacks phase.
3) poll phase.
4) check phase (setImmediate).
5) close callbacks phase.
Answer to your questions.
1) The callbacks for the timers and check phases are executed in their respective phases, not in the poll phase.
2) All the I/O-related callbacks and others are executed in the poll phase. The pending callbacks phase is only for system-level callbacks such as TCP errors, which are none of our concern.
3) After each phase (and, since Node v11, after each individual callback), Node.js runs an internal loop that drains all the process.nextTick() callbacks, and another smaller loop that executes the callbacks of resolved promises, i.e. Promise.resolve().then() callbacks.
I was just reading about that myself. As far as the timers are concerned the documentation about the event loop gives a decent answer in the form of an example. Say a setTimeout timer is set to trigger after 100ms but an I/O process is in progress (in the polling phase) and requires more than 100ms to execute, say 150ms. Once it is finished the polling phase will then wrap back to the timer phase and execute the setTimeout later than the expected 100ms, at 150ms.
Hope that helps answer how the polling phase relates to the timer phase. In essence the polling phase, as I understand it, can 'make the decision' to run the timer phase again if necessary.
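A simplified sketch of that delay follows. It is an assumption-laden stand-in: a synchronous busy loop plays the role of the slow I/O work, and the helper name demoDelayedTimer and its parameters are made up. The measured time varies from run to run, so no exact output is shown.

```javascript
// Hypothetical helper: asks for a timer after timerMs, then keeps the thread
// busy for busyMs. The timer cannot fire until the busy work has finished.
function demoDelayedTimer(timerMs, busyMs) {
  return new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), timerMs);
    // Busy loop standing in for work that takes longer than the timer's delay.
    while (Date.now() - start < busyMs) { /* busy wait */ }
  });
}

demoDelayedTimer(100, 150).then((elapsed) =>
  console.log(`timer asked for 100ms, fired after about ${elapsed}ms`)
);
```

The delay passed to setTimeout() is a minimum threshold, not a guaranteed firing time.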
From what I see, if an event in Node takes a "long time" to be dispatched, Node creates some kind of "queue of events", and they are triggered as soon as possible, one by one.
How long can this queue be?
While this may seem like a simple question, it is actually a rather complex problem; unfortunately, there's no simple number that anyone can give you.
First: wall time doesn't really play a part in anything here. All events are dispatched in the same fashion, whether or not things are taking "a long time." In other words, all events pass through a "queue."
Second: there is no single queue. There are many places where different kinds of events can be dispatched into JS. (The following assumes you know what a tick is.)
There are the things you (or the libraries you use) pass to process.nextTick(). They are called at the end of the current tick until the nextTick queue is empty.
There are the things you (or the libraries you use) pass to setImmediate(). They are called at the start of the next tick. (This means that nextTick tasks can add things to the current tick indefinitely, preventing other operations from happening whereas setImmediate tasks can only add things to the queue for the next tick.)
I/O events are handled by libuv via epoll/kqueue/IOCP on Linux/Mac/Windows respectively. When the OS notifies libuv that I/O has happened, it in turn invokes the appropriate handler in JS. A given tick of the event loop may process zero or more I/O events; if a tick takes a long time, I/O events will queue in an operating system queue.
Signals sent by the OS.
Native code (C/C++) executed on a separate thread may invoke JS functions. This is usually accomplished through the libuv work queue.
Since there are many places where work may be queued, it is not easy to answer "how many items are currently queued", much less what the absolute limit of those queues is. Essentially, the hard limit for the size of your task queues is available RAM.
In practice, your app will:
Hit V8 heap constraints
For I/O, max out the number of allowable open file descriptors.
...well before the size of any queue becomes problematic.
If you're just interested in whether or not your app is under heavy load, toobusy may be of interest -- it times each tick of the event loop to determine whether or not your app is spending an unusual amount of time processing each tick (which may indicate that your task queues are very large).
Handlers for a specific event are called synchronously (in the order they were added) as soon as the event is emitted, they are not delayed at all.
The total number of event handlers is limited only by v8 and/or the amount of available RAM.
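The synchronous dispatch of handlers can be checked with a few lines. The helper name demoSynchronousEmit and the "ping" event name are hypothetical.

```javascript
const { EventEmitter } = require("events");

// Hypothetical helper: shows that emit() calls its handlers synchronously,
// in the order they were added, before emit() returns.
function demoSynchronousEmit() {
  const order = [];
  const emitter = new EventEmitter();

  emitter.on("ping", () => order.push("handler 1"));
  emitter.on("ping", () => order.push("handler 2"));

  order.push("before emit");
  emitter.emit("ping");
  order.push("after emit");
  return order;
}

console.log(demoSynchronousEmit().join(" -> "));
// before emit -> handler 1 -> handler 2 -> after emit
```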
I believe you're talking about operations that can take an undefined amount of time to complete, such as an http request or filesystem access.
Node gives you a method to complete these types of operations asynchronously, meaning that you can tell node, or a 3rd party library, to start an operation, and then call some code (a function that you define) to inform you when the operation is complete. This can be done through event listeners, or callback functions, both of which have their own limitations.
With event listeners, the maximum number of listeners you can have depends on the maximum array size of your environment. In the case of node.js the JavaScript engine is V8, but according to this post there is a maximum set out by the 5th edition of the ECMAScript standard of ~4 billion elements, which is a limit that you should never reach.
With callbacks, the limitation you have is the maximum call stack size, meaning how deep your functions can call each other. For instance, you can have a callback calling a callback calling a callback calling another callback, and so on. The call stack size dictates how many callbacks calling callbacks you can have. Note that the call stack size can be a limitation with event listeners as well, as they're essentially callbacks that can be executed multiple times.
And these are the limitations with each.