It is to my understanding that the node event loop will continue to handle requests until the event loop is empty, at which point it will look the the event queue to complete the blocking I/O requests.
My question is.. What happens if the event loop never becomes empty? Not due to bad code (i.e. never ending loop) but due to consistent client requests (thinking something like google that gets never ending requests)?
I realize there is a possibility I am misunderstanding a fundamental aspect of how client requests are handled by a server.
There are actually several different phases of the event loop (timers, I/O, check events, pending callbacks, etc...) and they are checked in a circular order. In addition some things (like promises and other microtasks) go to the front of the line no matter what phase of the event loop is currently processing.
It is possible that a never ending set of one type of event can block the event queue from serving other types of events. That would be a design/implementation problem that needs to be prevented.
You can read a bit more about the different types of things in the event loop here: https://developer.ibm.com/tutorials/learn-nodejs-the-event-loop/ and https://www.geeksforgeeks.org/node-js-event-loop/ and https://snyk.io/blog/nodejs-how-even-quick-async-functions-can-block-the-event-loop-starve-io.
While it is possible to overload the event loop such that is doesn't get out of one phase of the event loop, it's not very common because of the way most processing in nodejs consists of multiple events which gives other things a chance to interleave. For example, processing an incoming http request consists of connecting, reading, writing, closing, etc... and the processing of that event may involve other types of events. So, it can happen that you overload one type of event (I've done it only once in my coding and that was because of poorly written communication between the main thread and a bunch of WorkerThreads that was easily fixed to not have the problem once I realized what the problem was.
Related
Conceptually just having one queue for jobs seems to be sufficient for most use-cases.
What are the reasons for having multiple queues and distinguishing those into "microtasks" and (macro)"tasks"?
Having multiple (macro) task queues allows for task prioritization.
For instance, an User Agent (UA) can choose to prioritize a user event (e.g a click event) over a network event, even if the latter was actually registered by the system first, because the former is probably more visible to the user and requires lower latency.
In HTML this is allowed by the first step of the Event Loop's Processing model, which states that the UA must choose the next task to execute from one of its task queues.
(note: HTML specs do not require that there are multiple queues, but they do define multiple task sources to guarantee the execution order of similar tasks).
Now, why have yet an other beast called microtask queue?
We can find a few early discussions about "how" it should be implemented, but I didn't dig as far as to find who proposed this idea the first and for what use case.
However from the discussions I found, we can see that a few proposals needing such a mechanism were on the road:
Mutation Observers
Now-deprecated ES Object.observe()
At-that-time-incoming ES Promises,
Also-incoming-at-that-time-and-I-don't-know-why-it's-cited ES WeakRefs
HTML Custom Elements callback
The first two being also the first to be implemented in browsers, we can probably say that their use case was the main reason for implementing this new kind of queue.
Both actually did similar things: they listened for changes on an object (or the DOM tree), and coalesced all the changes that occurred during a job into a single event (not to be read as Event).
One could argue that this event could have been queued in the next (macro) task, with the highest priority, except that the Event Loop was already a bit complex, and every jobs did not necessarily come from tasks.
For instance, the rendering is actually a part of every Event Loop iteration, except that most of the time it early exits because it wasn't the time to render.
So if you do your DOM modifications during a rendering frame, you could have the modifications rendered, and only after the whole rendering took place you'd get the callback.
Since the main use case for observers is to act on the observed changes before they trigger performance heavy side-effects, I guess you can see how it was necessary to have a mean to insert our callback right after the job which has done the modifications.
PS: At that time I didn't know much about the Event Loop, I was far from specs matters and I may have put some anachronisms in this answer.
When Node.js starts, it initializes the event loop, processes the provided input script which may make async API calls, schedule timers, or call process.nextTick(), then begins processing the event loop.
There are seven phases and each phase has its own event queue which is based on FIFO.
So application makes a request event, event demultiplexer gathers those requests and pushes to respective event queues.
For example, If my code makes two reqeusts one is setTimeOut() and another is some API Call, demultiplexer will push the first one in timer queue and other in poll queue.
But events are there, and loop watches over those queues and events, on completion in pushes the registered callback to the callstack where it is processed.
My question is,
1). Who handles events in event queue to OS?
2). Does event loop polls for event completion in each event queue or does OS notifies back?
3). Where and who decides whether to call native asyncrhonous API or handle over to a thread pool?
I am very verge of understanding this, I have been strugling a lot to grasp the concepts. There are a lot of false information about node.js event loop and how it handles asynchronous calls using one thread.
Please answer this questions if possible. Below are the references where I could get some better insight from.
https://github.com/nodejs/nodejs.org/blob/master/locale/en/docs/guides/event-loop-timers-and-nexttick.md
https://dev.to/lunaticmonk/understanding-the-node-js-event-loop-phases-and-how-it-executes-the-javascript-code-1j9
how does reactor pattern work in Node.js?
https://www.youtube.com/watch?v=PNa9OMajw9w&t=3s
Who handles events in event queue to OS?
How OS events work depends upon the specific type of event. Disk I/O works one way and Networking works a different way. So, you can't ask about OS events generically - you need to ask about a specific type of event.
Does event loop polls for event completion in each event queue or does OS notifies back?
It depends. Timers for example are built into the event loop and the head of the timer list is checked to see if it's time has come in each timer through the event loop. File I/O is handled by a thread pool and when a disk operation completes, the thread inserts a completion event into the appropriate queue directly so the event loop will just find it there the next time through the event loop.
Where and who decides whether to call native asynchronous API or handle over to a thread pool?
This was up to the designers of nodejs and libuv and varies for each type of operation. The design is baked into nodejs and you can't yourself change it. Nodejs generally uses libuv for cross platform OS access so, in most cases, it's up to the libuv design for how it handles different types of OS calls. In general, if all the OSes that nodejs runs on offer a non-blocking, asynchronous mechanism, then libuv and nodejs will use it (like for networking). If they don't (or it's problematic to make them all work similarly), then libuv will build their own abstraction (as with file I/O and a thread pool).
You do not need to know the details of how this works to program asynchronously in nodejs. You make a call and get a callback (or resolved promise) when its done, regardless of how it works internally. For example, nodejs offers some asynchronous crypto APIs. They happen to be implemented using a thread pool, but you don't need to know that in order to use them.
From what I see, if an event in Node take a "long time" to be dispatched, Node creates some kind of "queue of events", and they are triggered as soon as possible, one by one.
How long can this queue be?
While this may seem like a simple question, it is actually a rather complex problem; unfortunately, there's no simple number that anyone can give you.
First: wall time doesn't really play a part in anything here. All events are dispatched in the same fashion, whether or not things are taking "a long time." In other words, all events pass through a "queue."
Second: there is no single queue. There are many places where different kinds of events can be dispatched into JS. (The following assumes you know what a tick is.)
There are the things you (or the libraries you use) pass to process.nextTick(). They are called at the end of the current tick until the nextTick queue is empty.
There are the things you (or the libraries you use) pass to setImmediate(). They are called at the start of the next tick. (This means that nextTick tasks can add things to the current tick indefinitely, preventing other operations from happening whereas setImmediate tasks can only add things to the queue for the next tick.)
I/O events are handled by libuv via epoll/kqueue/IOCP on Linux/Mac/Windows respectively. When the OS notifies libuv that I/O has happened, it in turn invokes the appropriate handler in JS. A given tick of the event loop may process zero or more I/O events; if a tick takes a long time, I/O events will queue in an operating system queue.
Signals sent by the OS.
Native code (C/C++) executed on a separate thread may invoke JS functions. This is usually accomplished through the libuv work queue.
Since there are many places where work may be queued, it is not easy to answer "how many items are currently queued", much less what the absolute limit of those queues are. Essentially, the hard limit for the size of your task queues is available RAM.
In practice, your app will:
Hit V8 heap constraints
For I/O, max out the number of allowable open file descriptors.
...well before the size of any queue becomes problematic.
If you're just interested in whether or not your app under heavy load, toobusy may be of interest -- it times each tick of the event loop to determine whether or not your app is spending an unusual amount of time processing each tick (which may indicate that your task queues are very large).
Handlers for a specific event are called synchronously (in the order they were added) as soon as the event is emitted, they are not delayed at all.
The total number of event handlers is limited only by v8 and/or the amount of available RAM.
I believe you're talking about operations that can take an undefined amount of time to complete, such as an http request or filesystem access.
Node gives you a method to complete these types of operations asynchronously, meaning that you can tell node, or a 3rd party library, to start an operation, and then call some code (a function that you define) to inform you when the operation is complete. This can be done through event listeners, or callback functions, both of which have their own limitations.
With event listeners the maximum amount of listeners you can have is dependent on the maximum array size of your environment. In the case of node.js the javascript engine is v8, but according to this post there is a maximum set out by the 5th ECMA standard of ~4billion elements, which is a limit that you shouldn't ever overcome.
With callbacks the limitation you have is the max call stack size, meaning how deep your functions can call each other. For instance you can have a callback calling a callback calling a callback calling another callback, etc etc. The call stack size dictates how may callbacks calling callbacks you can have. Note that the call stack size can be a limitation with event listeners as well as they're essentially callbacks that can be executed multiple times.
And these are the limitations with each.
I'm a little bit confused about how browsers handle JavaScript events.
Let's say I have two event handlers attached to buttons A and B. Both event handlers take exactly the same time to complete. If I click on button A first and button B next, is it true that the event handler for the button A is always executed first (because the event loop is a FIFO queue), but when they finish is completely unpredictable? If so, what actually determines this order?
Yes. The order of event handlers executed is guaranteed, and in practice they will not overlap.
This is the beauty of the event loop as a concurrency model. You don't have to think about threading issues like deadlocks, livelocks and race conditions most of the time (though not always).
Order of execution is simple and JavaScript in the browser is single threaded most of the time and in practice you do not have to worry about order of execution of things.
However the fact order of mouse events is guaranteed has hardly anything has to do with JavaScript. This is not a part of the JavaScript language but a part of something called the DOM API, the DOM (document object model) is how JavaScript interacts with your browser and the HTML you write.
Things called Host Objects are defined in the JavaScript specification as external objects JS in the browser works with, and their behavior in this case is specified in the DOM API.
Whether or not the order DOM events are registered is guaranteed is not a part of JavaScript but a part of that API. More specifically, it is defined right here. So to your question: Yes, order of event execution is certain except for control keys (like (control alt delete)) which can mess order of evaluation up.
The Javascript engine is single threaded. All of your event handlers happen sequentially; the click handler for A will be called, and finish before the handler for B ever starts. You can see this by sleep()ing in one handler, and verifying that the second handler will not start until the first has finished.
Note that setTimeout is not valid for this test, because it essentially registers a function with the engine to be called back at a later time. setTimeout returns immediately.
This fiddle should demonstrate this behavior.
Well the commands are indeed in a FIFO when executed by javascript. However, the handlers may take different amounts of time to send you the result. In this case the response from handler B may come back earlier and response from handler A may come later.
I have been looking into Node.JS and all the documentation and blogs talk about how it uses an event-loop rather than a per-request model.
I am having some confusion understanding the difference. I feel like I am 80% there understanding it but not fully getting it yet.
A threaded model will spawn a new thread for every request. This means that you get quite some overhead in terms of computation and memory. An event loop runs in a single thread, which means you don't get the overhead.
The result of this is that you must change your programming model. Because all these different things are happening in the same thread, you cannot block. This means you cannot wait for something to happen because that would block the whole thread. Instead you define a callback that is called once the action is complete. This is usually referred to as non-blocking I/O.
Pseudo example for blocking I/O:
row = db_query('SELECT * FROM some_table');
print(row);
Pseudo example for non-blocking I/O:
db_query('SELECT * FROM some_table', function (row) {
print(row);
});
This example uses lambdas (anonymous functions) like they are used in JavaScript all the time. JS makes heavy use of events, and that's exactly what callbacks are about. Once the action is complete, an event is fired which triggers the callback. This is why it is often referred to as an evented model or also asynchronous model.
The implementation of this model uses a loop that processes and fires these events. That's why it is called an event queue or event loop.
Prominent examples of event queue frameworks include:
EventMachine (Ruby)
Tornado (Python)
node.js (V8 server-side JavaScript)
Think of incoming requests or callbacks as events, that are enqueued and processed.
That is exactly the same what is done in most of the GUI systems. The system can't know when a user will click a button or do some interaction. But when he does, the event will propagated to the event loop, which is basically a loop that checks for new events in the queue and process them.
The advantage is, that you don't have to wait for results for yourself. Instead, you register callback functions that are executed when the event is triggered. This allows the framework to handle I/O stuff and you can easily rely on it's internal efficiency when dealing with long-taking actions instead of blocking processes by yourself.
In short, everythings runs in parallel but your code. There will never be two fragments of callback functions running concurrently – the event loop is a single thread. The processes that execute stuff externally and finally propagate events however can be distributed in multiple threads/processes.
An evented loop allows you to handle the time it takes to talk to the hard drive or network. take this list of time:
Source | CPU Cycles
L1 | 3 Cycles
L2 | 14 Cycles
RAM | 250 Cycles
Disk | 41,000,000 Cycles
Network| 240,000,000 Cycles
That time you're running curl in PHP is just wasting CPU.