I've seen many people saying that Promise.all can't achieve parallelism, since node/javascript runs on a single-threaded environment. However, if i, for instance, wrap 5 promises inside a Promise.all, in which every single one of the promises resolves after 3 seconds (a simple setTimeout promise), how come Promise.all resolves all of them in 3 seconds instead of something like 15 seconds (5 x 3 sec each)?
See the example below:
function await3seconds () {
return new Promise(function(res) {
setTimeout(() => res(), 3000)
})
}
console.time("Promise.all finished in::")
Promise.all([await3seconds(), await3seconds(), await3seconds(), await3seconds(), await3seconds()])
.then(() => {
console.timeEnd("Promise.all finished in::")
})
It logs:
Promise.all finished in::: 3.016s
How is this behavior possible without parallelism? Concurrent execution wouldn't be able to proccess all of these promises in 3 seconds either.
It's particularly useful to understand what this line of code actually does:
Promise.all([await3seconds(), await3seconds(), await3seconds(), await3seconds(), await3seconds()]).then(...)
That is fundamentally the same as:
const p1 = await3seconds();
const p2 = await3seconds();
const p3 = await3seconds();
const p4 = await3seconds();
const p5 = await3seconds();
Promise.all([p1, p2, p3, p4, p5]).then(...)
What I'm trying to show here is that ALL your functions are executed serially one after the other in the order declared and they have all returned BEFORE Promise.all() is even executed.
So, some conclusions from that:
Promise.all() didn't "run" anything. It accepts an array of promises and it justs monitors all those promises, collects their results in order and notifies you (via .then() or await) when they are all done or tells you when the first one rejects.
Your functions are already executed and have returned a promise BEFORE Promise.all() even runs. So, Promise.all() doesn't determine anything about how those functions run.
If the functions you were calling were blocking, the first would run to completion before the second was even called. Again, this has nothing to do with Promise.all() before the functions are all executed before Promise.all() is even called.
In your particular example, your functions each start a timer and immediately return. So, you essentially start 5 timers within ms of each other that are all set to fire in 3 seconds. setTimeout() is non-blocking. It tells the system to create a timer and gives the system a callback to call when that timer fires and then IMMEDIATELY returns. Sometime later, when the event loop is free, the timer will fire and the callback will get called. So, that's why all the timers are set at once and all fire at about the same time. If you wanted them to each be spread out by 3 seconds apart, you'd have to write this code differently, either to set increasing times for each timer or to not start the 2nd timer until the first one fires and so on.
So, what Promise.all() allows you to do is to monitor multiple asynchronous operations that are, by themselves (independent of Promise.all()) capable of running asynchronously. Nodejs itself, nothing to do with Promise.all(), has the ability to run multiple asynchronous operations in parallel. For example, you can make multiple http requests or make multiple read requests from the file system and nodejs will run those in parallel. They will all be "in flight" at the same time.
So, Promise.all() isn't enabling parallelism of asynchronous operations. That capability is built into the asynchronous operations themselves and how they interact with nodejs and how they are implemented. Promise.all() allows you to track multiple asynchronous operations and know when they are all done, get their results in order and/or know when there was an error in one of the operations.
If you're curious how timers work in nodejs, they are managed by libuv which is a cross platform library that nodejs uses for managing the event loop, timers, file system access, networking and a whole bunch of things.
Inside of libuv, it manages a sorted list of pending timers. The timers are sorted by their next time to fire so the soonest timer to fire is at the start of the list.
The event loop within nodejs goes in a cycle to check for a bunch of different things and one of those things is to see if the current head of the timer list has reached its firing time. If so, it removes that timer from the list, grabs the callback associated with that timer and calls it.
Other types of asynchronous operations such as file system access work completely differently. The asynchronous file operations in the fs module, actually use a native code thread pool. So, when you request an asynchronous file operation, it actually grabs a thread from the thread pool, gives it the job for the particular file operation you requested and sends the thread on its way. That native code thread, then runs independently from the Javascript interpreter which is free to go do other things. At some future time when the thread finishes the file operation, it calls a thread safe interface of the event loop to add a file completion event to a queue. When whatever Javascript is currently executing finishes and returns control back to the event loop, one of the things the event loop will do is check if there are any file system completion events waiting to be executed. If so, it will remove it from the queue and call the callback associated with it.
So, while individual asynchronous operation can themselves run in parallel if designed appropriately (usually backed by native code), signaling completion or errors to your Javascript all runs through the event loop and the interpreter is only running one piece of Javascript at a time.
Note: this is ignoring for the purposes of this discussion, the WorkerThread capability which actually fires up a whole different interpreter and can run multiple sets of Javascript at the same time. But, each individual interpreter still runs your Javascript single threaded and is still coordinated with the outside world through the event loop.
First, Promise.all has nothing to do with JS code running in parallel. You can think of Promise.all as code organizer, it puts code together and wait for response.
What's responsible for the code to run in a way that "looks like" it's parallel is the event-based nature of JS. So in your case:
Promise.all([await3seconds(), await3seconds(),
await3seconds(), await3seconds(), await3seconds()])
.then(() => {
console.timeEnd("Promise.all finished in::")
})
Let's say that each function inside Promise.all is called a1 : a5, What will happen is:
The Event loop will take a1 : a5 and put them in the "Callback Queue/Task Queue" sequentially (one after the other), But it will not take too much time, because it's just putting it in there, not executing anything.
The timer will start immediately after each function is put by the Event Loop in the "Callback Queue/Task Queue" (so there will be a minor delay between the start of each one).
Whenever a timer finishes, the Event loop will take the related callback function and put it in the "Call Stack" to be executed.
Promise.all will resolve after the last function is popped out of the "Call Stack".
As you can see in here
Promise.all finished in::: 3.016s
The 0.16s delay is a combination between the time the Event loop took to put those callback functions sequentially in the "Callback Queue/Task Queue" + the time each function took to execute the code inside it after their timer has finished.
So the code is not executed in parallel, it's still sequential, but the event-based nature of JS is used to mimic the behavior of parallelism.
Look at this article to relate more to what I am trying to say.
Synchronous vs Asynchronous JavaScript
No they are not executed in parallel but you can conceptualize them this way. This is just how the event queue works. If each promise contained a heavy compute task, they would still be executed one at a time -
function await3seconds(id) {
return new Promise(res => {
console.log(id, Date.now())
setTimeout(_=> {
console.log(id, Date.now())
res()
}, 3000)
})
}
console.time("Promise.all finished in::")
Promise.all([await3seconds(1), await3seconds(2), await3seconds(3), await3seconds(4), await3seconds(5)])
.then(() => {
console.timeEnd("Promise.all finished in::")
})
time
1
2
3
4
5
1621997095724
start
⌛
⌛
⌛
⌛
1621997095725
↓
start
⌛
⌛
⌛
1621997095725
↓
↓
start
⌛
⌛
1621997095725
↓
↓
↓
start
⌛
1621997095726
↓
↓
↓
↓
start
1621997098738
end
↓
↓
↓
↓
1621997098740
✓
end
↓
↓
↓
1621997098740
✓
✓
end
↓
↓
1621997098741
✓
✓
✓
end
↓
1621997098742
✓
✓
✓
✓
end
In this related Q&A we build a batch processing Pool that emulates threads. Check it out if that kind of thing interests you!
Related
I'm learning the mechanism of Event-Loop in Node.js, and I'm doing some exercises, but have some confusions as explained bellow.
const fs = require("fs");
setTimeout(() => console.log("Timer 1"), 0);
setImmediate(() => console.log("Immediate 1"));
fs.readFile("test-file-with-1-million-lines.txt", () => {
console.log("I/O");
setTimeout(() => console.log("Timer 2"), 0);
setTimeout(() => console.log("Timer 3"), 3000);
setImmediate(() => console.log("Immediate 2"));
});
console.log("Hello");
I expected to see the following output:
Hello
Timer 1
Immediate 1
I/O
Timer 2
Immediate 2
Timer 3
but I get the following output:
Hello
Timer 1
Immediate 1
I/O
Immediate 2
Timer 2
Timer 3
Would you please clarify for me how are these lines executed step by step.
First off, I should mention that if you really want asynchronous operation A to be processed in a specific order with relation to asynchronous operation B, you should probably write your code such that it guarantees that without relying on the details of exactly what gets to run first. But, that said, I have run into issues where one type of asynchronous operation can "hog" the event loop and starve other types of events and it can be useful to understand what's really going on inside if/when that happens.
Broken down to its core, your question is really about why Immediate2 logs before Timer2 when scheduled from within an I/O callback, but not when called from top level code? Thus it is inconsistent.
This has to do with where the event loop is in its cycle through various checks it is doing when the setTimeout() and setImmediate() are called (when they are scheduled). It is somewhat explained here: https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/#setimmediate-vs-settimeout.
If you look at this somewhat simplified diagram of the event loop (from the above article):
You can see that there are a number of different parts to the event loop cycle. setTimeout() is served by the "timers" block at the top of the diagram. setImmediate() is served in the "check" block near the bottom of the diagram. File I/O is served in the "poll" block in the middle.
So, if you schedule both a setImmediate(fn1) and a setTimeout(fn2, 0) from within a file I/O callback (which is your case for Intermediate2 and Timer2), then the event loop processing happens to be in the poll phase when these two are scheduled. So, the next phase of the event loop is the "check" phase and the setImmediate(fn1) gets processed. Then, after the "check" phase and the "close callbacks" phase, then it cycles back around to the "timers" phase and you get the setTimeout(fn2,0).
If, on the other hand, you call those same two setImmediate() and setTimeout() from code that runs from a different phase of the event loop, then the timer might get processed first before the setImmediate() - it will depend upon exactly where that code was executed from in the event loop cycle.
This structure of the event loop is why some people describe setImmediate() as "it runs right after I/O" because it's positioned in the loop to be processed right after the "poll" phase. If you are in the middle of processing some file I/O in an I/O callback and you want something to run as soon as the stack unwinds, you can use setImmediate() to accomplish that. It will always run after the current I/O callback finishes, but before timers.
Note: Missing from this simplified description is promises which have their own special treatment. Promises are considered microtasks and they have their own queue. They get to run a lot more often. Starting with node v11, they get to run in every phase of the event loop. So, if you have three pending timers that are ready to run and you get to the timer phase of the event loop and call the callback for the first pending timer and in that timer callback, you resolve a promise, then as soon as that timer callback returns back to the system, then it will serve that resolved promise. So, microtasks (such as promises and process.nextTick()) get served (if waiting to run) between every operation in the event loop, not just between phases of the event loop, but even between pending events in the same phase. You can read more about these specifics and the changes in node v11 here: New Changes to the Timers and Microtasks in Node v11.0.0 and above.
I believe this was done to improve the performance of promise-related code as promises became more of a central part of the nodejs architecture for asynchronous operations and there is also some standards-related work in this area too to make this consistent across different JS envrionments.
Here's another reference that covers part of this:
Nodejs Event Loop - interaction with top-level code
The reason for this output is the asynchronous nature of javascript.
You set the first 2 outputs in a sort of timeout with the execution time to be 0 this makes them still wait a tick.
Next you have the file read which takes a while to be finished and thus delays the execution of the functions in the callback
The first console.log within the callback is fired as soon as the callback is executed and the rest within the callback follows the first part of your code
Lastly you have the console.log at the bottom which gets executed at first because there is no delay for it and it does not need to wait till the next tick.
As some added help, check out this video.
https://youtu.be/cCOL7MC4Pl0
The presenter gives an amazing talk on the event loop. I think it is a great resource.
While this is particularly for the browser, many aspects are shared in Node.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I'm struggling to grasp the concept of asynchronousness. Is the following roughly correct about asynchronous operations?
A problem can occur if a piece of code takes a long time to complete. This is because i) it stops code below from running and it might be nice to run this whilst the hard code loads in the background. And ii) Indeed, JS might try to execute the code below before the hard code’s finished. If the code below relies on the hard code, that’s a problem.
A solution is: if an operation takes a long time to complete, you want to process it in a separate thread while the original thread is processed. Just make sure the main thread doesn't reference things that the asynchronous operation returns. JS employs event-ques for this solution. Asynchronous operations are executed in an event-que which executes after the main-thread.
But even the event-que can suffer from the same problem as the main-thread. If fetch1, which is positioned above fetch2, takes a long time to return a promise, and fetch2 doesn’t, JS might start executing fetch2 before executing fetch1. This is where Promise.all is useful because it won't proceed with the next step in the asynchronous operation until both both fetch1 & fetch2’s promises are resolved.
On a separate note, I’ve read when chaining .then, that counts as one asynchronous operation so we can always guarantee that the subsequent .thenwill only execute when the .then before it has executed|resolved its promise.
Almost correct but not quite.
If you are talking about "asynchronousness" the word in the English language then it just means that things can happen out of order. This concept is used in a lot of languages including multithreading in Java and C/C++.
If you are talking about the specific concept of asynchronousness as it relates to node.js or asynchronous I/O in C/C++ then you do have some misunderstandings in how this works at the low level.
A problem can occur if a piece of code takes a long time to complete. This is because i) it stops code below from running and it might be nice to run this whilst the hard code loads in the background. And ii) Indeed, JS might try to execute the code below before the hard code’s finished. If the code below relies on the hard code, that’s a problem.
When talking about javascript or asynchronous I/O in C/C++ (where javascript got its asynchronousness from) this is not true.
What actually happens is that waiting for something to happen may take a long time to complete. Instead of waiting why not tell the OS to execute some code (your callback) once that thing happens.
At the OS level most modern operating systems have API that let you tell it to wake your process up when something happens. That thing may be a keyboard event, a mouse event, an I/O event (from disk or network), a system reconfiguration event (eg. changing monitor resolution) etc.
Most traditional languages implement blocking I/O. What happens is that when you try to read something form disk or network your process goes to sleep immediately and the OS will wake it up again when the data arrives:
Traditional blocking I/O
time
│
├────── your code doing stuff ..
├────── read_data_from_disk() ───────────────────┐
┆ ▼
: OS puts process to sleep
.
. other programs running ..
.
: data arrives ..
┆ OS wakes up your process
├────── read_data_from_disk() ◀──────────────────┘
├────── your program resume doing stuff ..
▼
This means that your program can only wait for one thing at a time. Which means that most of the time your program is not using the CPU. The traditional solution to listen to more events is multithreading. Each thread will seperately block on their events but your program can spawn a new thread for each event it is interested in.
It turns out that naive multithreading where each thread waits for one event is slow. Also it ends up consuming a lot of RAM especially for scripting languages. So this is not what javascript does.
Note: Historically the fact that javascript uses a single thread instead of multithreading is a bit of an accident. It was just the result of decisions made by the team that added progressive JPEG rendering and GIF animations to early browsers. But by happy coincidence this is exactly what makes things like node.js fast.
What javascript does instead is wait for multiple events instead of waiting for a single event. All modern OSes have API that lets you wait for multiple events. They range from queue/kqueue on BSD and Mac OSX to poll/epoll on Linux to overlapped I/O on Windows to the cross-platform POSIX select() system call.
The way javascript handles external events is something like the following:
Non-blocking I/O (also known as asynchronous I/O)
time
│
├────── your code doing stuff ..
├────── read_data_from_disk(read_callback) ───▶ javascript stores
│ your callback and
├────── your code doing other stuff .. remember your request
│
├────── wait_for_mouse_click(click_callback) ─▶ javascript stores
│ your callback and
├────── your code doing other stuff .. remember your request
│
├────── your finish doing stuff.
┆ end of script ─────────────▶ javascript now is free to process
┆ pending requests (this is called
┆ "entering the event loop").
┆ Javascript tells the OS about all the
: events it is interested in and waits..
. │
. └───┐
. ▼
. OS puts process to sleep
.
. other programs running ..
.
. data arrives ..
. OS wakes up your process
. │
. ┌───┘
: ▼
┆ Javascript checks which callback it needs to call
┆ to handle the event. It calls your callback.
├────── read_callback() ◀────────────────────┘
├────── your program resume executing read_callback
▼
The main difference is that synchronous multithreaded code waits for one event per thread. Asynchronous code either single threaded like javascript or multi threaded like Nginx or Apache wait for multiple events per thread.
Note: Node.js handles disk I/O in separate threads but all network I/O are processed in the main thread. This is mainly because asynchronous disk I/O APIs are incompatible across Windows and Linux/Unix. However it is possible to do disk I/O in the main thread. The Tcl language is one example that does asynchronous disk I/O in the main thread.
A solution is: if an operation takes a long time to complete, you want to process it in a separate thread while the original thread is processed.
This is not what happens with asynchronous operations in javascript with the exception of web workers (or worker threads in Node.js). In the case of web workers then yes, you are executing code in a different thread.
But even the event-que can suffer from the same problem as the main-thread. If fetch1, which is positioned above fetch2, takes a long time to return a promise, and fetch2 doesn’t, JS might start executing fetch2 before executing fetch1
This is not what is happening. What you are doing is as follows:
fetch(url_1).then(fetch1); // tell js to call fetch1 when this completes
fetch(url_2).then(fetch2); // tell js to call fetch2 when this completes
It is not that js "might" start executing. What happens with the code above is both fetches are executed synchronously. That is, the first fetch strictly happens before the second fetch.
However, all the above code does is tell javascript to call the functions fetch1 and fetch2 back at some later time. This is an important lesson to remember. The code above does not execute the fetch1 and fetch2 functions (the callbacks). All you are doing is tell javascript to call them when the data arrives.
If you do the following:
fetch(url_1).then(fetch1); // tell js to call fetch1 when this completes
fetch(url_2).then(fetch2); // tell js to call fetch2 when this completes
while (1) {
console.log('wait');
}
Then the fetch1 and fetch2 will never get executed.
I'll pause here to let you ponder on that.
Remember how asynchronous I/O is handled. All I/O (often called asynchronous) function calls don't actually cause the I/O to be accessed immediately. All they do is just remind javascript that you want something (a mouse click, a network request, a timeout etc.) and you want javascript to execute your function later when that thing completes. Asynchronous I/O are only processed at the end of your script when there is no more code to execute.
This does mean that you cannot use an infinite while loop in a javascript program. Not because javascript does not support it but there is a built-in while loop that surrounds your entire program: this big while loop is called the event loop.
On a separate note, I’ve read when chaining .then, that counts as one asynchronous operation.
Yes, this is by design to avoid confusing people on when promises are processed.
If you are interested at how the OS handles all this without further creating threads you may be interested in my answers to these related questions:
Is there any other way to implement a "listening" function without an infinite while loop?
node js - what happens to incoming events during callback excution
TLDR
If nothing else I'd like you to understand two things:
Javascript is a strictly synchronous programming language. Each statement in your code is executed strictly sequentially.
Asynchronous code in all languages (yes, including C/C++ and Java and Python etc.) will call your callback at any later time. Your callback will not be called immediately. Asynchronousness is a function-call level concept.
It's not that javascript is anything special when it comes to asynchronousness*. It's just that most javascript libraries are asynchronous by default (though you can also write asynchronous code in any other language but their libraries are normally synchronous by default).
*Note: of course, things like async/await does make javascript more capable of handling asynchronous code.
Side note: Promises are nothing special. It is just a design pattern. It is not something built-in to javascript syntax. It is just that newer versions of javascript comes with Promises as part of its standard library. You could have always used promises even with very old versions of javascript and in other languages (Java8 and above for example call have promises in their standard library but call them Futures).
This page explains Node.js event loop very well (since it's by Node.js):
https://nodejs.dev/learn/the-nodejs-event-loop
A problem can occur if a piece of code takes a long time to complete.
This is because i) it stops code below from running and it might be
nice to run this whilst the hard code loads in the background. And ii)
Indeed, JS might try to execute the code below before the hard code’s
finished. If the code below relies on the hard code, that’s a problem.
Yes, though to reword: having a user wait for code to resolve is bad user experience rather than a runtime error. Still, a runtime error would indeed occur if a code block was dependent on a variable that returned undefined because it had yet to resolve.
A solution is: if an operation takes a long time to complete, you want
to process it in a separate thread while the original thread is
processed. Just make sure the main thread doesn't reference things
that the asynchronous operation returns. JS employs event-ques for
this solution. Asynchronous operations are executed in an event-que
which executes after the main-thread.
Yes, but it is important to point out these other threads are occurring in the browser or on an api server, not within your JS script. Also, async functions are still called in the main call stack, but their resolve is placed in either the job queue or message queue.
But even the event-que can suffer from the same problem as the
main-thread. If fetch1, which is positioned above fetch2, takes a long
time to return a promise, and fetch2 doesn’t, JS might start executing
fetch2 before executing fetch1. This is where Promise.all is useful
because it won't proceed with the next step in the asynchronous
operation until both both fetch1 & fetch2’s promises are resolved.
Yes, a promise's resolve is executed as soon as the current function in the call stack resolves. If one promise resolves before another, its resolve will executed first event if it was called second. Also note that Promise.all doesn't change the resolve time, but rather returns the resolves together in the order their promise was executed.
On a separate note, I’ve read when chaining .then, that counts as one
asynchronous operation so we can always guarantee that the subsequent
.thenwill only execute when the .then before it has executed|resolved
its promise.
Yes, though the newer and cleaner syntax is async await:
function A () {
return new Promise((resolve, reject)=> setTimeout(()=> resolve("done"),2000))
}
async function B () {
try {
console.log("waiting...");
const result = await A();
console.log(result);
} catch (e) {
console.log(e);
}
}
B();
And below shows the Node.js callstack in action:
function A () {
return new Promise((resolve, reject)=> resolve(console.log("A")))
}
function B () {
console.log("B");
C();
}
function C () {
console.log("C");
}
function D () {
setTimeout(()=> console.log("D"),0);
}
B();
D();
A();
B gets called, C gets called, B resolves, D gets called, setTimeout gets called but its resolve is moved to the message queue, D resolves, A gets called, the promise gets called and immediately resolves, A resolves, the call stack completes, the message queue is accessed
I have this ultra minimal Node.js server:
http.createServer(function (req, res) {
var path = url.parse(req.url).pathname;
if (path === "/") {
res.writeHead(200, { 'Content-Type': 'text/plain' });
res.write('Hello World');
res.end('\n');
} else {
if (!handler.handle(path, req, res)) {
res.writeHead(404);
res.write("404");
res.end();
}
}
}).listen(port);
After doing some benchmarks I saw considerable degradation of performances under high concurrency (> 10000 concurrent connections). I started to dig deeper on Node.js concurrency and the more I search, the more I am confused...
I created a minimal example, out of the http paradigm in order to try understand things a bit better:
function sleep(duration) {
return new Promise(resolve => setTimeout(resolve, duration));
}
var t0 = performance.now();
async function init() {
await Promise.all([sleep(1000), sleep(1000), sleep(1000)]);
var t1 = performance.now();
console.log("Execution took " + (t1 - t0) + " milliseconds.")
}
init()
// Execution took 1000.299999985145 milliseconds.
From what I understand, Javascript operates on a single thread. That being said, I can't wrap my head around it acting like this:
| 1 second |
Thread #One >>>>>>>>>>>>>>
Thread #One >>>>>>>>>>>>>>
Thread #One >>>>>>>>>>>>>>
... obviously this doesn't makes sense.
So, is this a terminology problem around thread Vs worker with something like this:
| 1 second |
Thread #One (spawns 3 workers)
Worker #One >>>>>>>>>>>>>>
Worker #Two >>>>>>>>>>>>>>
Worker #Three >>>>>>>>>>>>>>
???
How is Node.js single-threaded but able to process three functions in parallel ?? If I am right with the parallel workers, is http spawning multiple workers for each incoming connections ?
A thread in a JavaScript program works by servicing an task (event) / job queue. Conceptually, it's a loop: Pick up a job from the queue, run that job to completion, pick up the next job, run that job to completion.
With that in mind, your example with promises works like this:
Running Node.js with your input file parses the code and queues a job to run the top-level code in the script.
The main thread picks up that job and runs your top-level code, which:
Creates some functions
Does var t0 = performance.now();
Calls sleep(1000), which
Creates a promise
Sets a timer to do a callback in roughly 1000ms
Returns the promise
Calls sleep(1000) twice more. Now there are three promises, and three timer callbacks scheduled for roughly the same time.
Awaits Promise.all on those promises. This saves the state of the async function and returns a promise (which nothing uses, because nothing is using the return value of init).
Now the job to run the top-level code is complete.
The thread is now idling, waiting for a job to perform (an I/O completion, timer, etc.).
After about a second of idling, a job is queued to call the first timer callback. How this happens is implementation specific:
It might be that the main thread, as part of its loop checking the queue for work, also checks the list of timers to see if any of them are due to be run.
It might be that the timer system has its own non-JavaScript thread that does those checks (or gets a callback from an OS mechanism) and queues the job when a timer callback needs to run.
In the case of I/O (since timers are a stand-in for I/O in your example), Node.js uses a separate thread or threads to process I/O completions from the operating system and queue jobs for them in the JavaScript job queue.
The JavaScript thread picks up the job to call the timer callback. It does that, which fulfills that first sleep promise.
Fulfilling a promise queues a "microtask" (or "promise job") to call the promise's fulfillment handler (then handler) rather than a "macrotask" (standard job). Modern JavaScript engines handle promise jobs at the end of the standard job that queues them (even if there's another standard job already in the main queue waiting to be done — promise jobs jump the main queue). So:
Fulfilling the first sleep promise queues a promise job.
At the end of the standard job for the timer callback, the thread picks the promise job and runs it.
The promise job triggers a call to a fulfillment handler within Promise.all's logic that stores the fulfillment value and checks to see if all of the promises it's waiting for are settled. In this case, they aren't (two are still outstanding), so there's nothing further to do and the promise job is complete.
The thread goes back to the main queue.
Almost immediately, there's a job in the queue to call the next timer callback (or possibly two jobs to call both of them).
The thread picks up the first one and does it, which fulfills the second sleep promise, queuing a promise job to run its fulfillment handler
It picks up the queued promise job and calls the fulfillment hander in Promise.all's logic, which stores the fulfillment value and checks if all the promises are settled. They aren't, so it just returns.
It picks up the next standard job and does it, which fulfills the third sleep promise, queuing a promise job for its fulfillment handler.
It picks up that promise job, running the fulfillment handler in Promise.all's logic, which stores the fulfillment result, sees that all of the promises are settled, and queues a promise job to run its fulfillment handler.
It picks up that new promise job, which advances the state of the async init function:
The init function does var t1 = performance.now();, and shows the difference between t1 and t0, which is roughly one second.
There's only one JavaScript thread involved (in that code). It's running a loop, servicing jobs in the queue. There may be a separate thread for the timers, or the main thread may just be checking the timer list between jobs. (I'd have to dig into the Node.js code to know which, but I suspect the latter because I suspect they're using OS-specific scheduling features.) If it were I/O completions instead of timers, I'm fairly sure those are handled by a separate non-JavaScript thread which responds to completions from the OS and queues jobs for the JavaScript thread.
I am a bit confused about when the event loop spins in the browser.
The questions are:
Does a task and the pending microtasks, happen in the same loop iteration/turn/tick?
Which are the actual conditions that need to be met in order for the loop to turn?
Are these conditions the same in node.js event loop? - I don't know if this is a stupid question.
Let's say we have a webpage and in the front end we have JavaScript code that schedules a task and waits for a promise (which is a microtask). Is the execution of the promise considered to happen in the same turn of the event loop as the task, or in different iterations?
I currently assume that they all happen in the same iteration. Since if betting otherwise, in the case of microtasks executing while mid-task, that would mean that the task would require multiple loop iterations in order to fully complete. Which seems wired to me. Would it be correct to also say that the update rendering part, that may occur after each task, happens in the same loop turn?
Thank you in advance!
------------------------------------------------------------------------------------
I know I am supposed to add a comment, but it is going to be a long one, and I also need to write code, so I am editing the question and asking for clarification here.
#T.J. Crowder Thank you so much for your time and detailed explanation!
I had indeed misread "microtasks are processed after callbacks (as long as no other JavaScript is mid-execution)" in this great article and had gotten a bit confused.
I also had questions about the 4ms setTimout for which I couldn't find information about, so thanks for that info also.
One last thing, though... If we were to mark the loop ticks between the example code, where would we put them (assuming console.logs do not exist)?
Suppose we have a function named exampleCode, having the following body:
setTimeout(setTimeoutCallback, 0);
Promise.resolve().then(promiseCallback);
For the above code, my guess would be...
Just before executing exampleCode (macro)task:
first loop tick
setTimeoutCallback (macro)task scheduling
Promise.then microtask scheduling
promiseCallback execution
second loop tick
setTimeoutCallback execution
third loop tick
Or is there an additional loop tick between between Promise.then microtask scheduling and promiseCallback execution ?
Thank you in advance once again!
Does a task and the pending microtasks, happen in the same loop iteration/turn/tick?
The task occurs, then when it ends, any pending microtasks it scheduled are run.
Which are the actual conditions that need to be met in order for the loop to turn?
It's not clear what you mean by this. It may be easier to think in terms of jobs and a job queue (which is the ECMAScript spec's terminology): If there is a pending job and the thread servicing that queue is not doing something else, it picks up the job and runs it to completion.
Are these conditions the same in node.js event loop?
Close enough, yes.
Let's say we have a webpage and in the front end we have JavaScript code that schedules a task and waits for a promise (which is a microtask). Is the execution of the promise considered to happen in the same turn of the event loop as the task, or in different iterations?
In a browser (and in Node), it happens after the task completes, when the task's microtasks (if any) are run, before the next queued task/job gets picked up.
For instance:
// This code is run in a task/job
console.log("Scheduling (macro)task/job");
setTimeout(() => {
console.log("timeout callback ran");
}, 0);
console.log("Scheduling microtask/job");
Promise.resolve().then(() => {
console.log("promise then callback ran");
});
console.log("main task complete");
On a compliant browser (and Node), that will output:
Scheduling (macro)task/job
Scheduling microtask/job
main task complete
promise then callback ran
timeout callback ran
...because the microtask ran when the main task completed, before the next macrotask ran.
(Note that setTimeout(..., 0) will indeed schedule the timer to run immediately on compliant browsers provided it's not a nested timeout; more here. You'll see people saying there is no "setTimeout 0", but that's outdated information. It's only clamped to 4ms if the timer's nesting level is > 5.)
More to explore:
MDN Concurrency Model and Event Loop
ECMAScript Spec: Jobs and Job Queues
WHAT-WG "HTML5" Spec: Event Loops, Processing Model, Timers, and Timer Initialization Steps
Re the code and question in the edit/comment:
setTimeout(setTimeoutCallback, 0);
Promise.resolve().then(promiseCallback);
Your guess looks pretty good. Here's how I'd describe it:
Schedule the task to run that script
(When thread is next free)
Pick up the next task (from #1 above)
Run that task:
Create the task for the timer callback
In parallel, queue the task when the time comes
Queue a microtask for the promise then callback
End of task
Microtask check
Run the then callback
(When thread is next free)
Pick up the next task (the one for the timer callback)
Run the task
End of task
Microtask check (none to do in this case)
The processing model text explicitly calls out that the task ends prior to the microtask check, but I don't think that's observable in any real way.
I am thinking about it and this is what I came up with:
Let's see this code below:
console.clear();
console.log("a");
setTimeout(function(){console.log("b");},1000);
console.log("c");
setTimeout(function(){console.log("d");},0);
A request comes in, and JS engine starts executing the code above step by step. The first two calls are sync calls. But when it comes to setTimeout method, it becomes an async execution. But JS immediately returns from it and continue executing, which is called Non-Blocking or Async. And it continues working on other etc.
The results of this execution is the following:
a c d b
So basically the second setTimeout got finished first and its callback function gets executed earlier than the first one and that makes sense.
We are talking about single-threaded application here. JS Engine keeps executing this and unless it finishes the first request, it won't go to second one. But the good thing is that it won't wait for blocking operations like setTimeout to resolve so it will be faster because it accepts the new incoming requests.
But my questions arise around the following items:
#1: If we are talking about a single-threaded application, then what mechanism processes setTimeouts while the JS engine accepts more requests and executes them? How does the single thread continue working on other requests? What works on setTimeout while other requests keep coming in and get executed.
#2: If these setTimeout functions get executed behind the scenes while more requests are coming in and being executed, what carries out the async executions behind the scenes? What is this thing that we talk about called the EventLoop?
#3: But shouldn't the whole method be put in the EventLoop so that the whole thing gets executed and the callback method gets called? This is what I understand when talking about callback functions:
function downloadFile(filePath, callback)
{
blah.downloadFile(filePath);
callback();
}
But in this case, how does the JS Engine know if it is an async function so that it can put the callback in the EventLoop? Perhaps something like the async keyword in C# or some sort of an attribute which indicates the method JS Engine will take on is an async method and should be treated accordingly.
#4: But an article says quite contrary to what I was guessing on how things might be working:
The Event Loop is a queue of callback functions. When an async
function executes, the callback function is pushed into the queue. The
JavaScript engine doesn't start processing the event loop until the
code after an async function has executed.
#5: And there is this image here which might be helpful but the first explanation in the image is saying exactly the same thing mentioned in question number 4:
So my question here is to get some clarifications about the items listed above?
1: If we are talking about a single-threaded application, then what processes setTimeouts while JS engine accepts more requests and executes them? Isn't that single thread will continue working on other requests? Then who is going to keep working on setTimeout while other requests keep coming and get executed.
There's only 1 thread in the node process that will actually execute your program's JavaScript. However, within node itself, there are actually several threads handling operation of the event loop mechanism, and this includes a pool of IO threads and a handful of others. The key is the number of these threads does not correspond to the number of concurrent connections being handled like they would in a thread-per-connection concurrency model.
Now about "executing setTimeouts", when you invoke setTimeout, all node does is basically update a data structure of functions to be executed at a time in the future. It basically has a bunch of queues of stuff that needs doing and every "tick" of the event loop it selects one, removes it from the queue, and runs it.
A key thing to understand is that node relies on the OS for most of the heavy lifting. So incoming network requests are actually tracked by the OS itself and when node is ready to handle one it just uses a system call to ask the OS for a network request with data ready to be processed. So much of the IO "work" node does is either "Hey OS, got a network connection with data ready to read?" or "Hey OS, any of my outstanding filesystem calls have data ready?". Based upon its internal algorithm and event loop engine design, node will select one "tick" of JavaScript to execute, run it, then repeat the process all over again. That's what is meant by the event loop. Node is basically at all times determining "what's the next little bit of JavaScript I should run?", then running it. This factors in which IO the OS has completed, and things that have been queued up in JavaScript via calls to setTimeout or process.nextTick.
2: If these setTimeout will get executed behind the scenes while more requests are coming and in and being executed, the thing carry out the async executions behind the scenes is that the one we are talking about EventLoop?
No JavaScript gets executed behind the scenes. All the JavaScript in your program runs front and center, one at a time. What happens behind the scenes is the OS handles IO and node waits for that to be ready and node manages its queue of javascript waiting to execute.
3: How can JS Engine know if it is an async function so that it can put it in the EventLoop?
There is a fixed set of functions in node core that are async because they make system calls and node knows which these are because they have to call the OS or C++. Basically all network and filesystem IO as well as child process interactions will be asynchronous and the ONLY way JavaScript can get node to run something asynchronously is by invoking one of the async functions provided by the node core library. Even if you are using an npm package that defines it's own API, in order to yield the event loop, eventually that npm package's code will call one of node core's async functions and that's when node knows the tick is complete and it can start the event loop algorithm again.
4 The Event Loop is a queue of callback functions. When an async function executes, the callback function is pushed into the queue. The JavaScript engine doesn't start processing the event loop until the code after an async function has executed.
Yes, this is true, but it's misleading. The key thing is the normal pattern is:
//Let's say this code is running in tick 1
fs.readFile("/home/barney/colors.txt", function (error, data) {
//The code inside this callback function will absolutely NOT run in tick 1
//It will run in some tick >= 2
});
//This code will absolutely also run in tick 1
//HOWEVER, typically there's not much else to do here,
//so at some point soon after queueing up some async IO, this tick
//will have nothing useful to do so it will just end because the IO result
//is necessary before anything useful can be done
So yes, you could totally block the event loop by just counting Fibonacci numbers synchronously all in memory all in the same tick, and yes that would totally freeze up your program. It's cooperative concurrency. Every tick of JavaScript must yield the event loop within some reasonable amount of time or the overall architecture fails.
Don't think the host process to be single-threaded, they are not. What is single-threaded is the portion of the host process that execute your javascript code.
Except for background workers, but these complicate the scenario...
So, all your js code run in the same thread, and there's no possibility that you get two different portions of your js code to run concurrently (so, you get not concurrency nigthmare to manage).
The js code that is executing is the last code that the host process picked up from the event loop.
In your code you can basically do two things: run synchronous instructions, and schedule functions to be executed in future, when some events happens.
Here is my mental representation (beware: it's just that, I don't know the browser implementation details!) of your example code:
console.clear(); //exec sync
console.log("a"); //exec sync
setTimeout( //schedule inAWhile to be executed at now +1 s
function inAWhile(){
console.log("b");
},1000);
console.log("c"); //exec sync
setTimeout(
function justNow(){ //schedule justNow to be executed just now
console.log("d");
},0);
While your code is running, another thread in the host process keep track of all system events that are occurring (clicks on UI, files read, networks packets received etc.)
When your code completes, it is removed from the event loop, and the host process return to checking it, to see if there are more code to run. The event loop contains two event handler more: one to be executed now (the justNow function), and another within a second (the inAWhile function).
The host process now try to match all events happened to see if there handlers registered for them.
It found that the event that justNow is waiting for has happened, so it start to run its code. When justNow function exit, it check the event loop another time, searhcing for handlers on events. Supposing that 1 s has passed, it run the inAWhile function, and so on....
The Event Loop has one simple job - to monitor the Call Stack, the Callback Queue and Micro task queue. If the Call Stack is empty, the Event Loop will take the first event from the micro task queue then from the callback queue and will push it to the Call Stack, which effectively runs it. Such an iteration is called a tick in the Event Loop.
As most developers know, that Javascript is single threaded, means two statements in javascript can not be executed in parallel which is correct. Execution happens line by line, which means each javascript statements are synchronous and blocking. But there is a way to run your code asynchronously, if you use setTimeout() function, a Web API given by the browser, which makes sure that your code executes after specified time (in millisecond).
Example:
console.log("Start");
setTimeout(function cbT(){
console.log("Set time out");
},5000);
fetch("http://developerstips.com/").then(function cbF(){
console.log("Call back from developerstips");
});
// Millions of line code
// for example it will take 10000 millisecond to execute
console.log("End");
setTimeout takes a callback function as first parameter, and time in millisecond as second parameter.
After the execution of above statement in browser console it will print
Start
End
Call back from developerstips
Set time out
Note: Your asynchronous code runs after all the synchronous code is done executing.
Understand How the code execution line by line
JS engine execute the 1st line and will print "Start" in console
In the 2nd line it sees the setTimeout function named cbT, and JS engine pushes the cbT function to callBack queue.
After this the pointer will directly jump to line no.7 and there it will see promise and JS engine push the cbF function to microtask queue.
Then it will execute Millions of line code and end it will print "End"
After the main thread end of execution the event loop will first check the micro task queue and then call back queue. In our case it takes cbF function from the micro task queue and pushes it into the call stack then it will pick cbT funcion from the call back queue and push into the call stack.
JavaScript is high-level, single-threaded language, interpreted language. This means that it needs an interpreter which converts the JS code to a machine code. interpreter means engine. V8 engines for chrome and webkit for safari. Every engine contains memory, call stack, event loop, timer, web API, events, etc.
Event loop: microtasks and macrotasks
The event loop concept is very simple. There’s an endless loop, where the JavaScript engine waits for tasks, executes them and then sleeps, waiting for more tasks
Tasks are set – the engine handles them – then waits for more tasks (while sleeping and consuming close to zero CPU). It may happen that a task comes while the engine is busy, then it’s enqueued. The tasks form a queue, so-called “macrotask queue”
Microtasks come solely from our code. They are usually created by promises: an execution of .then/catch/finally handler becomes a microtask. Microtasks are used “under the cover” of await as well, as it’s another form of promise handling. Immediately after every macrotask, the engine executes all tasks from microtask queue, prior to running any other macrotasks or rendering or anything else.