Native javascript is single threaded and synchronous. Only a few objects can run asynchronously and get added to the callback queue such as HTTP requests, timers and events. These asynchronous objects are a result of the actual javascript environment and not javascript itself. setTimeout() seems to be the go to for examples for asynchronous code. That function gets moved to the web API container and then eventually the callback queue. There doesn't seem to be a way to write asynchronous code in javascript that doesn't involve using objects that get moved to the Web API container. I can write my own custom objects with callbacks, but the most that will do is structure it to run in the proper order. You can never write javascript code that runs in parallel that doesn't rely on those objects.
This is my understanding of how it works, if I am wrong please correct me.
setTimeout and setInterval are just a convenient way to trigger some asynchronous behavior for an example. It is part of standard javascript library in all implementations, and not just part of the browser environment.
But all other sources of async code depend on some external process. When making an HTTP request, your javascript thread tells the browser to make a request (what headers, what url, etc). The browser, according to its own compiled internals, then formats the request, sends it, waits for a response, and eventually adds an item to javascript's event loop to be processed next time the event loop runs. File system access and database queries are two other common examples of async code that depends on external processes (the OS, and a database, respectively)
How javascript handles async code in a single threaded process is all down to this event loop. This event loop in psuedo code is basically this:
while (queue.waitForMessage()) {
queue.processNextMessage();
}
setTimeout tells the environment to pop something into that queue at some point of the future. But the processing of that queue is single threaded. Only one event message can be handled at the same time, but any number can be added to that queue.
You can get true concurrency with workers, but this basically adds a new javascript process that is itself single threaded, and has a method of communicating with messages to and from the main javascript process.
Related
I've strong background in a JS frameworks. Javascript is single threaded language and during the execution of code when it encounters any async task, the event loop plays the important role there.
Now I've been into Python/Django and things are going good but I'm very curious about how python handle any asynchronous tasks whenever I need to make a database query in django, I simple use the queryset method like
def func(id):
do_somthing()
user = User.objects.get(id=id)
do_somthing_with_user()
func()
another_func()
I've used the queryset method that fetch something from database and is a asynchronous function without await keyword. what will be the behaviour of python there, is it like block the execution below until the queryset response, is it continue the execution for the rest of code on another thread.
Also whenever we need to make an external request from python, we uses requests library method without await keyword,
r = request.get("https://api.jsonplaceholder.com/users")
# will the below code executes when the response of the request arrive?
do_somthing()
.
.
.
do_another_thing()
does python block the execution of code until the response arrives?
In JS the async task is handled by event loop while the rest of the code executes normally
console.log("first")
fetch("https://api.jsonplaceholder.com/users").then(() => console.log("third"))
console.log("second")
the logs in console will be
first
second
third
How Python Handles these type things under the hood?
The await command needs to be used specifically when you are running an asynchronous function that calls synchronous functions. This synchronous function is ran as a coroutine and has an object name coroutine when debugging. To run a coroutine, it must be scheduled on the event loop. Using the
await keyword releases control back to the event loop. Instead of awaiting a coroutine, you can also setup a task to schedule a coroutine to run at a certain time using the asyncio package. See documentation
asyncio.create_task(coro, *, name=None, context=None)
Python does not block execution but rather ensures that it happens at the correct time using CPU parallelism. Using a method like request.get() should not create any forthcoming errors because you are accessing values over the internet and not locally. Network requests move, essentially at the speed of light, where as database requests on the server are limited by the speed of the SSD or HDD. This was more of an issue on old drives where you are literally waiting for a disk to spin, but the idea of properly managing these local resources is still a issue that has to be solved in order for the asynchronous application to work. When making calls to the sql database, make sure to use the decorator #database_sync_to_async above the regular 'synchronous' functions that your asynchronous function calls.
from channels.db import database_sync_to_async
For more information check out the django documentation on asynchronous support.
I know how node.js executes code asynchronously without blocking the main thread of execution by using the event-loop for scheduling the asynchronous tasks, but I'm not clear on how the main thread actually decides to put aside a piece of code for asynchronous execution.
(Basically what indicates that this piece of code should be executed asynchronously and not synchronously, what are the differentiating factors?)
And also, what are the asynchronous and synchronous APIs provided by node?
There is a mistake in your assumption when you ask:
what indicates that this piece of code should be executed asynchronously and not synchronously
The mistake is thinking that some code are executed asynchronously. This is not true.
Javascript (including node.js) executes all code synchronously. There is no part of your code that is executed asynchronously.
So at first glance that is the answer to your question: there is no such thing as asynchronous code execution.
Wait, what?
But what about all the asynchronous stuff?
Like I said, node.js (and javascript in general) executes all code synchronously. However javascript is able to wait for certain events asynchronously. There are no asynchronous code execution, however there is asynchronous waiting.
What's the difference between code execution and waiting?
Let's look an an example. For the sake of clarity I'll use pseudocode in a fake made-up language to remove any confusion from javascript syntax. Let's say we want to read from a file. This fake language supports both synchronous and asynchronous waiting:
Example1. Synchronously wait for the drive to return bytes form the file
data = readSync('filename.txt');
// the line above will pause the execution of code until all the
// bytes have been read
Example2. Asynchronously wait for the drive to return bytes from the file
// Since it is asynchronous we don't want the read function to
// pause the execution of code. Therefore we cannot return the
// data. We need a mechanism to accept the returned value.
// However, unlike javascript, this fake language does not support
// first-class functions. You cannot pass functions as arguments
// to other functions. However, like Java and C++ we can pass
// objects to functions
class MyFileReaderHandler inherits FileReaderHandler {
method callback (data) {
// process file data in here!
}
}
myHandler = new MyFileReaderHandler()
asyncRead('filename.txt', myHandler);
// The function above does not wait for the file read to complete
// but instead returns immediately and allows other code to execute.
// At some point in the future when it finishes reading all data
// it will call the myHandler.callback() function passing it the
// bytes form the file.
As you can see, asynchronous I/O is not special to javascript. It has existed long before javascript in C++ and even C libraries dealing with file I/O, network I/O and GUI programming. In fact it has existed even before C. You can do this kind of logic in assembly (and indeed that's how people design operating systems).
What's special about javascript is that due to it's functional nature (first-class functions) the syntax for passing some code you want to be executed in the future is simpler:
asyncRead('filename.txt', (data) => {
// process data in here
});
or in modern javascript can even be made to look like synchronous code:
async function main () {
data = await asyncReadPromise('filename.txt');
}
main();
What's the difference between waiting and executing code. Don't you need code to check on events?
Actually, you need exactly 0% CPU time to wait for events. You just need to execute some code to register some interrupt handlers and the CPU hardware (not software) will call your interrupt handler when the interrupt occurs. And all manner of hardware are designed to trigger interrupts: keyboards, hard disk, network cards, USB devices, PCI devices etc.
Disk and network I/O are even more efficient because they also use DMA. These are hardware memory readers/writers that can copy large chunks (kilobytes/megabytes) of memory from one place (eg. hard disk) to another place (eg. RAM). The CPU actually only needs to set up the DMA controller and then is free to do something else. Once the DMA controller completes the transfer it will trigger an interrupt which will execute some device driver code which will inform the OS that some I/O event has completed which will inform node.js which will execute your callback or fulfill your Promise.
All the above use dedicated hardware instead of executing instructions on the CPU to transfer data. Thus waiting for data takes 0% CPU time.
So the parallelism in node.js has more to do with how many PCIe lanes your CPU supports than how many CPU cores it has.
You can, if you want to, execute javascript asynchronously
Like any other language, modern javascript has multi-threading support in the form of webworkers in the browser and worker_threads in node.js. But this is regular multi-threading like any other language where you deliberately start another thread to execute your code asynchronously.
When Node.js starts, it initializes the event loop, processes the provided input script which may make async API calls, schedule timers, or call process.nextTick(), then begins processing the event loop.
There are seven phases and each phase has its own event queue which is based on FIFO.
So application makes a request event, event demultiplexer gathers those requests and pushes to respective event queues.
For example, If my code makes two reqeusts one is setTimeOut() and another is some API Call, demultiplexer will push the first one in timer queue and other in poll queue.
But events are there, and loop watches over those queues and events, on completion in pushes the registered callback to the callstack where it is processed.
My question is,
1). Who handles events in event queue to OS?
2). Does event loop polls for event completion in each event queue or does OS notifies back?
3). Where and who decides whether to call native asyncrhonous API or handle over to a thread pool?
I am very verge of understanding this, I have been strugling a lot to grasp the concepts. There are a lot of false information about node.js event loop and how it handles asynchronous calls using one thread.
Please answer this questions if possible. Below are the references where I could get some better insight from.
https://github.com/nodejs/nodejs.org/blob/master/locale/en/docs/guides/event-loop-timers-and-nexttick.md
https://dev.to/lunaticmonk/understanding-the-node-js-event-loop-phases-and-how-it-executes-the-javascript-code-1j9
how does reactor pattern work in Node.js?
https://www.youtube.com/watch?v=PNa9OMajw9w&t=3s
Who handles events in event queue to OS?
How OS events work depends upon the specific type of event. Disk I/O works one way and Networking works a different way. So, you can't ask about OS events generically - you need to ask about a specific type of event.
Does event loop polls for event completion in each event queue or does OS notifies back?
It depends. Timers for example are built into the event loop and the head of the timer list is checked to see if it's time has come in each timer through the event loop. File I/O is handled by a thread pool and when a disk operation completes, the thread inserts a completion event into the appropriate queue directly so the event loop will just find it there the next time through the event loop.
Where and who decides whether to call native asynchronous API or handle over to a thread pool?
This was up to the designers of nodejs and libuv and varies for each type of operation. The design is baked into nodejs and you can't yourself change it. Nodejs generally uses libuv for cross platform OS access so, in most cases, it's up to the libuv design for how it handles different types of OS calls. In general, if all the OSes that nodejs runs on offer a non-blocking, asynchronous mechanism, then libuv and nodejs will use it (like for networking). If they don't (or it's problematic to make them all work similarly), then libuv will build their own abstraction (as with file I/O and a thread pool).
You do not need to know the details of how this works to program asynchronously in nodejs. You make a call and get a callback (or resolved promise) when its done, regardless of how it works internally. For example, nodejs offers some asynchronous crypto APIs. They happen to be implemented using a thread pool, but you don't need to know that in order to use them.
I just read an article about the event loop in JavaScript.
I found two contradictive phrases and I would be glad if someone could clarify.
A downside of this model is that if a message takes too long to
complete, the web application is unable to process user interactions
like click or scroll. The browser mitigates this with the "a script is
taking too long to run" dialog
A very interesting property of the event loop model is that
JavaScript, unlike a lot of other languages, never blocks. Handling
I/O is typically performed via events and callbacks, so when the
application is waiting for an IndexedDB query to return or an XHR
request to return, it can still process other things like user input
So, when is the first one true and when is the second one true?
"A very interesting property of the event loop model is that
JavaScript, unlike a lot of other languages, never blocks.
This is misleading. Without clever programming, JavaScript would always block the UI thread, because runtime logic always blocks the UI, by design. At a smooth sixty frames a second, that means your application logic must always cooperatively yield control (or simply complete execution) within about 16 milliseconds, otherwise your UI will freeze or stutter.
Because of this, most JavaScript APIs that might take a long time (eg. network requests) are designed in such a way to use techniques (eg callbacks, promises) to circumvent this problem, so that they do not block the event loop, avoiding the UI becoming unresponsive.
Put another way: host environments (eg a Web browser or a Node.js runtime instance) are specifically designed to enable the use of an event-based programming model (originally inspired by programming environments like Hypercard on the Mac) whereby the host environment can be asked to perform a long-running task (eg run a timer), without blocking the main thread of execution, and for your program to be notified later, via an "event" when the long-running task is complete, enabling your program to pick-up where it left-off.
Both are correct, even though I agree it is somewhat wrongly expressed.
So by points:
It's true that if a synchronous task takes too long to complete, the event loop "gets stuck" there and then all other queued tasks can't run till it finishes.
Here it is talking about asynchronous tasks so even though an HTTP request, an I/O request or whatever that is async takes too long to process, all the synchronous tasks can keep doing their job, like processing user input
There are two types of code inside Javascript
Synchronous (it's like going one by one).
Asynchronous (it's like skipping for the future)
Synchronous code
You want to find the prime number from 1 to 10000000 with synchronous code you will write a function and that function will perform the calculation and finds out the prime number in the given range but what will happen with synchronous code. The javascript engine is not able to do any task until that task gets finished.
Asynchronous Code
If you wrap the same code inside a callback or more friendly with the SetTimeout method the javascript put that function inside the event queue and perform the other operation when a certain time came the timeout method fires callback certainly when there is nothing inside the call stack, it will ask event loop to pass the first thing which is inside the event queue. So this more about finding an idle time to perform the heavy operation.
Use javascript workers to perform heavy mathematics tasks not
SetTimeout because eventually, it will block the engine when the
function is inside the call stack.
I am thinking about it and this is what I came up with:
Let's see this code below:
console.clear();
console.log("a");
setTimeout(function(){console.log("b");},1000);
console.log("c");
setTimeout(function(){console.log("d");},0);
A request comes in, and JS engine starts executing the code above step by step. The first two calls are sync calls. But when it comes to setTimeout method, it becomes an async execution. But JS immediately returns from it and continue executing, which is called Non-Blocking or Async. And it continues working on other etc.
The results of this execution is the following:
a c d b
So basically the second setTimeout got finished first and its callback function gets executed earlier than the first one and that makes sense.
We are talking about single-threaded application here. JS Engine keeps executing this and unless it finishes the first request, it won't go to second one. But the good thing is that it won't wait for blocking operations like setTimeout to resolve so it will be faster because it accepts the new incoming requests.
But my questions arise around the following items:
#1: If we are talking about a single-threaded application, then what mechanism processes setTimeouts while the JS engine accepts more requests and executes them? How does the single thread continue working on other requests? What works on setTimeout while other requests keep coming in and get executed.
#2: If these setTimeout functions get executed behind the scenes while more requests are coming in and being executed, what carries out the async executions behind the scenes? What is this thing that we talk about called the EventLoop?
#3: But shouldn't the whole method be put in the EventLoop so that the whole thing gets executed and the callback method gets called? This is what I understand when talking about callback functions:
function downloadFile(filePath, callback)
{
blah.downloadFile(filePath);
callback();
}
But in this case, how does the JS Engine know if it is an async function so that it can put the callback in the EventLoop? Perhaps something like the async keyword in C# or some sort of an attribute which indicates the method JS Engine will take on is an async method and should be treated accordingly.
#4: But an article says quite contrary to what I was guessing on how things might be working:
The Event Loop is a queue of callback functions. When an async
function executes, the callback function is pushed into the queue. The
JavaScript engine doesn't start processing the event loop until the
code after an async function has executed.
#5: And there is this image here which might be helpful but the first explanation in the image is saying exactly the same thing mentioned in question number 4:
So my question here is to get some clarifications about the items listed above?
1: If we are talking about a single-threaded application, then what processes setTimeouts while JS engine accepts more requests and executes them? Isn't that single thread will continue working on other requests? Then who is going to keep working on setTimeout while other requests keep coming and get executed.
There's only 1 thread in the node process that will actually execute your program's JavaScript. However, within node itself, there are actually several threads handling operation of the event loop mechanism, and this includes a pool of IO threads and a handful of others. The key is the number of these threads does not correspond to the number of concurrent connections being handled like they would in a thread-per-connection concurrency model.
Now about "executing setTimeouts", when you invoke setTimeout, all node does is basically update a data structure of functions to be executed at a time in the future. It basically has a bunch of queues of stuff that needs doing and every "tick" of the event loop it selects one, removes it from the queue, and runs it.
A key thing to understand is that node relies on the OS for most of the heavy lifting. So incoming network requests are actually tracked by the OS itself and when node is ready to handle one it just uses a system call to ask the OS for a network request with data ready to be processed. So much of the IO "work" node does is either "Hey OS, got a network connection with data ready to read?" or "Hey OS, any of my outstanding filesystem calls have data ready?". Based upon its internal algorithm and event loop engine design, node will select one "tick" of JavaScript to execute, run it, then repeat the process all over again. That's what is meant by the event loop. Node is basically at all times determining "what's the next little bit of JavaScript I should run?", then running it. This factors in which IO the OS has completed, and things that have been queued up in JavaScript via calls to setTimeout or process.nextTick.
2: If these setTimeout will get executed behind the scenes while more requests are coming and in and being executed, the thing carry out the async executions behind the scenes is that the one we are talking about EventLoop?
No JavaScript gets executed behind the scenes. All the JavaScript in your program runs front and center, one at a time. What happens behind the scenes is the OS handles IO and node waits for that to be ready and node manages its queue of javascript waiting to execute.
3: How can JS Engine know if it is an async function so that it can put it in the EventLoop?
There is a fixed set of functions in node core that are async because they make system calls and node knows which these are because they have to call the OS or C++. Basically all network and filesystem IO as well as child process interactions will be asynchronous and the ONLY way JavaScript can get node to run something asynchronously is by invoking one of the async functions provided by the node core library. Even if you are using an npm package that defines it's own API, in order to yield the event loop, eventually that npm package's code will call one of node core's async functions and that's when node knows the tick is complete and it can start the event loop algorithm again.
4 The Event Loop is a queue of callback functions. When an async function executes, the callback function is pushed into the queue. The JavaScript engine doesn't start processing the event loop until the code after an async function has executed.
Yes, this is true, but it's misleading. The key thing is the normal pattern is:
//Let's say this code is running in tick 1
fs.readFile("/home/barney/colors.txt", function (error, data) {
//The code inside this callback function will absolutely NOT run in tick 1
//It will run in some tick >= 2
});
//This code will absolutely also run in tick 1
//HOWEVER, typically there's not much else to do here,
//so at some point soon after queueing up some async IO, this tick
//will have nothing useful to do so it will just end because the IO result
//is necessary before anything useful can be done
So yes, you could totally block the event loop by just counting Fibonacci numbers synchronously all in memory all in the same tick, and yes that would totally freeze up your program. It's cooperative concurrency. Every tick of JavaScript must yield the event loop within some reasonable amount of time or the overall architecture fails.
Don't think the host process to be single-threaded, they are not. What is single-threaded is the portion of the host process that execute your javascript code.
Except for background workers, but these complicate the scenario...
So, all your js code run in the same thread, and there's no possibility that you get two different portions of your js code to run concurrently (so, you get not concurrency nigthmare to manage).
The js code that is executing is the last code that the host process picked up from the event loop.
In your code you can basically do two things: run synchronous instructions, and schedule functions to be executed in future, when some events happens.
Here is my mental representation (beware: it's just that, I don't know the browser implementation details!) of your example code:
console.clear(); //exec sync
console.log("a"); //exec sync
setTimeout( //schedule inAWhile to be executed at now +1 s
function inAWhile(){
console.log("b");
},1000);
console.log("c"); //exec sync
setTimeout(
function justNow(){ //schedule justNow to be executed just now
console.log("d");
},0);
While your code is running, another thread in the host process keep track of all system events that are occurring (clicks on UI, files read, networks packets received etc.)
When your code completes, it is removed from the event loop, and the host process return to checking it, to see if there are more code to run. The event loop contains two event handler more: one to be executed now (the justNow function), and another within a second (the inAWhile function).
The host process now try to match all events happened to see if there handlers registered for them.
It found that the event that justNow is waiting for has happened, so it start to run its code. When justNow function exit, it check the event loop another time, searhcing for handlers on events. Supposing that 1 s has passed, it run the inAWhile function, and so on....
The Event Loop has one simple job - to monitor the Call Stack, the Callback Queue and Micro task queue. If the Call Stack is empty, the Event Loop will take the first event from the micro task queue then from the callback queue and will push it to the Call Stack, which effectively runs it. Such an iteration is called a tick in the Event Loop.
As most developers know, that Javascript is single threaded, means two statements in javascript can not be executed in parallel which is correct. Execution happens line by line, which means each javascript statements are synchronous and blocking. But there is a way to run your code asynchronously, if you use setTimeout() function, a Web API given by the browser, which makes sure that your code executes after specified time (in millisecond).
Example:
console.log("Start");
setTimeout(function cbT(){
console.log("Set time out");
},5000);
fetch("http://developerstips.com/").then(function cbF(){
console.log("Call back from developerstips");
});
// Millions of line code
// for example it will take 10000 millisecond to execute
console.log("End");
setTimeout takes a callback function as first parameter, and time in millisecond as second parameter.
After the execution of above statement in browser console it will print
Start
End
Call back from developerstips
Set time out
Note: Your asynchronous code runs after all the synchronous code is done executing.
Understand How the code execution line by line
JS engine execute the 1st line and will print "Start" in console
In the 2nd line it sees the setTimeout function named cbT, and JS engine pushes the cbT function to callBack queue.
After this the pointer will directly jump to line no.7 and there it will see promise and JS engine push the cbF function to microtask queue.
Then it will execute Millions of line code and end it will print "End"
After the main thread end of execution the event loop will first check the micro task queue and then call back queue. In our case it takes cbF function from the micro task queue and pushes it into the call stack then it will pick cbT funcion from the call back queue and push into the call stack.
JavaScript is high-level, single-threaded language, interpreted language. This means that it needs an interpreter which converts the JS code to a machine code. interpreter means engine. V8 engines for chrome and webkit for safari. Every engine contains memory, call stack, event loop, timer, web API, events, etc.
Event loop: microtasks and macrotasks
The event loop concept is very simple. There’s an endless loop, where the JavaScript engine waits for tasks, executes them and then sleeps, waiting for more tasks
Tasks are set – the engine handles them – then waits for more tasks (while sleeping and consuming close to zero CPU). It may happen that a task comes while the engine is busy, then it’s enqueued. The tasks form a queue, so-called “macrotask queue”
Microtasks come solely from our code. They are usually created by promises: an execution of .then/catch/finally handler becomes a microtask. Microtasks are used “under the cover” of await as well, as it’s another form of promise handling. Immediately after every macrotask, the engine executes all tasks from microtask queue, prior to running any other macrotasks or rendering or anything else.