I have a question on the following code below (Source: https://blog.risingstack.com/node-js-at-scale-understanding-node-js-event-loop/):
'use strict'
const express = require('express')
const superagent = require('superagent')
const app = express()
app.get('/', sendWeatherOfRandomCity)
function sendWeatherOfRandomCity (request, response) {
getWeatherOfRandomCity(request, response)
sayHi()
}
const CITIES = [
'london',
'newyork',
'paris',
'budapest',
'warsaw',
'rome',
'madrid',
'moscow',
'beijing',
'capetown',
]
function getWeatherOfRandomCity (request, response) {
const city = CITIES[Math.floor(Math.random() * CITIES.length)]
superagent.get(`wttr.in/${city}`)
.end((err, res) => {
if (err) {
console.log('O snap')
return response.status(500).send('There was an error getting the weather, try looking out the window')
}
const responseText = res.text
response.send(responseText)
console.log('Got the weather')
})
console.log('Fetching the weather, please be patient')
}
function sayHi () {
console.log('Hi')
}
app.listen(3000);
I have these questions:
When superagent.get(`wttr.in/${city}`) in the getWeatherOfRandomCity method makes a web request (to http://wttr.in/sf, for example), that request will be placed on the task queue and not on the main call stack, correct?
If there are methods on the main call stack (i.e. the main call stack isn't empty), the end handler attached to superagent.get(`wttr.in/${city}`).end(...) (which will be pushed to the task queue) will not get called until the main call stack is empty, correct? In other words, on every tick of the event loop, it will take one item from the task queue?
Let's say two requests to localhost:3000/ come in one after another. The first request will push sendWeatherOfRandomCity on the stack, then getWeatherOfRandomCity on the stack, then the web request superagent.get(`wttr.in/${city}`).end(...) will be placed on the background queue, then console.log('Fetching the weather, please be patient') runs, then getWeatherOfRandomCity will pop off the stack, and finally sayHi() will be pushed on the stack, print "Hi" and pop off the stack. After that, the end handler attached to superagent.get(`wttr.in/${city}`).end(...) will be called from the task queue, since the main call stack will be empty. Now, when the second request comes in, it will push all of the same things as the first request onto the main call stack. Will the end handler from the first request (still in the task queue) run first, or will the functions pushed onto the main call stack by the second web request run first?
When you make an HTTP request, libuv sees that you are attempting a network request. Neither libuv nor Node implements the low-level operations involved in a network request. Instead, libuv delegates the request to the underlying operating system.
It is the kernel, the essential part of the operating system, that does the real network work. libuv issues the request and then waits for the operating system to signal that a response has come back. Because libuv delegates the work, the operating system itself decides whether to create a new thread or, more generally, how to handle the whole process of making the request. Each OS has a different mechanism for this: epoll on Linux, kqueue on macOS, and I/O completion ports on Windows (queried with GetQueuedCompletionStatusEx).
There are 6 phases in the event loop, and one of them is the I/O poll phase. The phases run in a fixed order, and timers always come first. When a timer expires (or an event completes), its callback is placed on the event queue. The event loop then checks whether the call stack is available. The call stack is where functions go to execute. You can do one thing at a time, and the call stack enforces that: only the function on top of the call stack is the thing being executed. There is no way to execute two things at the same time in the JavaScript runtime.
If the call stack is empty, meaning the main() function has been removed, the event loop pushes the timer callback onto the call stack and your function is executed. None of the async callbacks are ever going to run before the main function is done.
So when the loop reaches the I/O poll phase, which handles incoming data and connections, the callback that handles your fetched response is executed by the same path the timer callback followed.
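The ordering described above can be checked without any network access. The following is a minimal sketch, not the article's code: `fakeRequest` is a hypothetical stand-in for `superagent.get(...).end(...)` that completes through a timer, and `order` only records what ran when.

```javascript
// Records the order in which the pieces run; no network needed.
const order = [];

// Hypothetical stand-in for superagent.get(...).end(callback):
// the callback fires on a later event loop iteration.
function fakeRequest(callback) {
  setTimeout(() => callback(null, { text: 'sunny' }), 10);
}

function handler() {
  fakeRequest(() => order.push('Got the weather'));
  order.push('Fetching the weather'); // runs before the callback
  order.push('Hi');                   // sayHi() equivalent, still synchronous
}

handler();
order.push('main script done');

// Resolves after the fake request has completed.
const done = new Promise((resolve) => setTimeout(() => resolve(order), 50));
done.then((result) => console.log(result));
```

The callback is recorded last, after both the synchronous part of the handler and the rest of the main script.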
JavaScript is single-threaded, and Node.js uses an asynchronous event-driven design pattern, which means that multiple actions are taken at the same time while executing a program.
With this in mind, I have some pseudocode:
myFunction() // main flow
var httpCallMade = false // a global variable
async myFunction() {
const someData = await callDB() // LINE 1 network call
renderMethod() // LINE 2 flow1
}
redisPubSubEventHandler() { // a method that is called from redis subscription asynchronously somewhere from a background task in the program
renderMethod() // LINE 3 flow2
}
renderMethod(){
if(!httpCallMade) {
httpCallMade = true //set a global flag
const res = makeHTTPCall() // an asynchronous network call. returns a promise.
} // I want to ensure that this block is "synchronized" and is not accessible by flow1 and flow2 simultaneously!
}
myFunction() is called in the main thread, while redisPubSubEventHandler() is called asynchronously from a background task in the program. Both flows end up calling renderMethod(). The idea is to ensure that makeHTTPCall() (inside renderMethod) can only be called once.
Is it guaranteed that renderMethod() would never be executed in parallel by LINE2 and LINE3 at the same time? My understanding is that as soon as renderMethod() is executed - event loop will not allow anything else to happen in server - which guarantees that it is only executed once at a given time (even if it had a network call without await).
Is this understanding correct?
If not, how do I synchronize/lock entry to renderMethod?
Javascript is single-threaded. Therefore, unless you are deliberately using threads (eg. worker_threads in node.js) no function in the current thread can be executed by two parallel threads at the same time.
This explains why javascript has no mutex or semaphore capability - because generally it is not needed (note: you can still have race conditions because asynchronous code may be executed in a sequence you did not expect).
There is a common confusion that asynchronous code means parallel (multi-threaded) code execution. It can, but most of the time, when a system is labeled as asynchronous, non-blocking or event-oriented INSTEAD of multi-threaded, it means that the system is single-threaded.
In this case asynchronous means parallel WAIT. Not parallel code execution. Code is always executed sequentially - only, due to the ability of waiting in parallel you may not always know the sequence the code is executed in.
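A small sketch of "parallel wait": the three waits below overlap, and the completion order differs from the order in which they were started. The `wait` helper and the labels are made up for illustration.

```javascript
// Completion order is recorded here as each timer fires.
const completed = [];

// Hypothetical helper: resolves with its label after `ms` milliseconds.
function wait(label, ms) {
  return new Promise((resolve) => {
    setTimeout(() => {
      completed.push(label);
      resolve(label);
    }, ms);
  });
}

// All three waits start immediately and overlap; no code runs in parallel,
// only the waiting does.
const allDone = Promise.all([wait('slow', 60), wait('fast', 10), wait('medium', 30)])
  .then((startedOrder) => ({ startedOrder, completed }));

allDone.then((result) => console.log(result));
```

`startedOrder` keeps the order of the calls, while `completed` reflects the order in which the waits finished.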
There are parts of JavaScript that execute in a separate thread. Modern browsers execute each tab and iframe in its own thread (but each tab or iframe is itself single-threaded). Script cannot cross tabs, windows or iframes, so this is a non-issue. Script may access objects inside iframes, but this is done via an API and the script itself cannot execute in the foreign iframe.
Node.js and some browsers also do DNS queries in a separate thread because there is no standardized cross-platform non-blocking API for DNS queries. But this is C code and not your JavaScript code. Your only interaction with this kind of multi-threading is when you pass a URL to fetch() or XMLHttpRequest().
Node.js also implements file I/O, zip compression and cryptographic functions in separate threads, but again this is C code, not your JavaScript code. All results from these separate threads are returned to you asynchronously via the event loop, so by the time your JavaScript code processes the result we are back to executing sequentially in the main thread.
Finally, both Node.js and browsers have worker APIs (web workers for browsers and worker threads for Node.js). However, both of these APIs use message passing to transfer data (messages are copied by default; memory is shared only when you explicitly use something like a SharedArrayBuffer), so functions are still protected from having their variables overwritten by another thread.
In your code, both myFunction() and redisPubSubEventHandler() run in the main thread. It works like this:
myFunction() is called, it returns immediately when it encounters the await.
a bunch of functions are declared and compiled.
we reach the end of your script:
// I want to ensure that this method is "synchronized" and is not called by flow1 and flow2 simultaneously!
}
<----- we reach here
now that we have reached the end of script we enter the event loop...
either the callDB or the redis event completes, our process gets woken up
the event loop figures out which handler to call based on what event happened
either the await returns and calls renderMethod(), or redisPubSubEventHandler() gets executed and calls renderMethod()
In either case both your renderMethod() calls will execute on the main thread. Thus it is impossible for renderMethod() to run in parallel.
It is possible for renderMethod() to be half executed and another call to renderMethod() happens IF it contains the await keyword. This is because the first call is suspended at the await allowing the interpreter to call renderMethod() again before the first call completes. But note that even in this case you are only in trouble if you have an await between if.. and httpCallMade = true.
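The difference can be sketched as follows. This is an illustration under assumptions, not your actual code: `delay` is a hypothetical helper, and incrementing a counter stands in for `makeHTTPCall()`.

```javascript
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Counts how often the "HTTP call" would have been made in each variant.
const calls = { unsafe: 0, safe: 0 };
let unsafeFlag = false;
let safeFlag = false;

// UNSAFE: there is an await between the check and setting the flag,
// so a second caller can pass the check while the first is suspended.
async function renderUnsafe() {
  if (!unsafeFlag) {
    await delay(1);
    unsafeFlag = true;
    calls.unsafe += 1; // stands in for makeHTTPCall()
  }
}

// SAFE: the flag is set synchronously, right after the check.
async function renderSafe() {
  if (!safeFlag) {
    safeFlag = true;
    await delay(1);
    calls.safe += 1; // stands in for makeHTTPCall()
  }
}

const raceDemo = Promise.all([
  renderUnsafe(), renderUnsafe(),
  renderSafe(), renderSafe(),
]).then(() => calls);

raceDemo.then((result) => console.log(result)); // { unsafe: 2, safe: 1 }
```

Both variants run on one thread; only the placement of the `await` relative to the flag assignment changes the outcome.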
You need to differentiate between synchronous and asynchronous, and single- and multi-threaded.
JavaScript is single-threaded so no two lines of the same execution context can run at the same time.
But JavaScript allows asynchronous code execution (await/async), so the code in an execution context does not have to run in the order it appears in the source. Different parts of the code can be executed interleaved (not overlapped), which could be called "running in parallel", although I think that is misleading.
event-driven design pattern, which means that multiple actions are taken at the same time while executing a program.
There are certain actions that can happen at the same time, like IO, multiprocessing (WebWorkers), but that is (with respect to JavaScript Code execution) not multi-threaded.
Is it guaranteed that renderMethod() would never be executed in parallel by LINE2 and LINE3 at the same time?
Depends on what you define as parallel at the same time.
Parts of the logic you describe in renderMethod() will run interleaved (because you make the request asynchronously), so renderMethod(){ if(!httpCallMade) { could be executed multiple times before you get the response (not the Promise) back from makeHTTPCall, but the code lines will never be executed at the same time.
My understanding is that as soon as renderMethod() is executed - event loop will not allow anything else to happen in server - which guarantees that it is only executed once at a given time (even if it had a network call without await).
The problem here is that you somehow need to get the data from your async response.
Therefore you either need to mark your function as async and use const res = await makeHTTPCall(), which allows code interleaving at the point of the await, or use .then(…) with a callback, which will be executed asynchronously at a later point (after you have left the function).
But between the beginning of the function and the first await (or the .then callback), no interleaving can take place.
So your httpCallMade = true prevents another makeHTTPCall from starting before the currently running one is finished, under the assumption that you set httpCallMade back to false only when the request is finished (in the .then callback, or after the await).
// I want to ensure that this method is "synchronized" and is not called by flow1 and flow2 simultaneously!
As soon as you get a result in an asynchronous way, you can't go back to synchronous code execution. So you need a guard like httpCallMade to prevent the logic described in renderMethod from running multiple times interleaved.
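One possible shape for such a guard is sketched below. It is a variation on the boolean flag, not the only way to do it: the variable `inFlight` (a name chosen here) holds the pending promise instead of `true`, so later callers can share the result, and it is cleared once the request has finished. `makeHTTPCall` is a fake that resolves after a short timer.

```javascript
let inFlight = null; // plays the role of httpCallMade, but keeps the pending promise
let httpCalls = 0;

// Fake request: resolves after 10 ms with a label identifying the call.
function makeHTTPCall() {
  httpCalls += 1;
  const id = httpCalls;
  return new Promise((resolve) => setTimeout(() => resolve(`response #${id}`), 10));
}

function renderMethod() {
  if (!inFlight) {
    // Check and assignment happen synchronously, so no other caller can
    // slip in between them. The guard is reset when the request settles.
    inFlight = makeHTTPCall().finally(() => { inFlight = null; });
  }
  return inFlight;
}

// Two overlapping callers share one request; a later caller triggers a new one.
const guardDemo = Promise.all([renderMethod(), renderMethod()])
  .then(async (firstRound) => {
    const secondRound = await renderMethod();
    return { firstRound, secondRound, httpCalls };
  });

guardDemo.then((result) => console.log(result));
```

If the call should happen only once for the lifetime of the process, the reset in `finally` can simply be left out.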
Your question really comes down to:
Given this code:
var flag = false;
function f() {
if (!flag) {
flag = true;
console.log("hello");
}
}
and considering that flag is not modified anywhere else, and many different, asynchronous events may call this function f...:
Can "hello" be printed twice?
The answer is no: if this runs on an ECMAScript compliant JS engine, then the call stack must be empty first before the next job is pulled from an event/job queue. Asynchronous tasks/reactions are pushed on an event queue. They don't execute before the currently executing JavaScript has run to completion, i.e. up until the call stack is empty. So they never interrupt running JavaScript code pre-emptively.
This is true even if these asynchronous tasks/events/jobs are scheduled by other threads, lower-level non-JS code,...etc. They all must wait their turn to be consumed by the JS engine. And this will happen one after the other.
For more information, see the ECMAScript specification on "Job". For instance 8.4 Jobs and Host Operations to Enqueue Jobs:
A Job is an abstract closure with no parameters that initiates an ECMAScript computation when no other ECMAScript computation is currently in progress.
[...]
Only one Job may be actively undergoing evaluation at any point in time.
Once evaluation of a Job starts, it must run to completion before evaluation of any other Job starts.
For example, promises generate such jobs -- See 25.6.1.3.2 Promise Resolve Functions:
When a promise resolve function is called with argument resolution, the following steps are taken:
[...]
Perform HostEnqueuePromiseJob(job.[[Job]], job.[[Realm]]).
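To make this concrete, here is a small Node.js sketch that schedules `f` from several different asynchronous sources. The `printed` array replaces `console.log` so the result can be inspected; everything else follows the snippet above.

```javascript
let flag = false;
const printed = [];

function f() {
  if (!flag) {
    flag = true;
    printed.push('hello');
  }
}

// Schedule f from several asynchronous sources (Node.js APIs).
setTimeout(f, 0);
setImmediate(f);
Promise.resolve().then(f);
queueMicrotask(f);
process.nextTick(f);

// Inspect the result once all of the above have had a chance to run.
const settled = new Promise((resolve) => setTimeout(() => resolve(printed), 20));
settled.then((result) => console.log(result)); // [ 'hello' ]
```

Each scheduled call runs to completion before the next one starts, so only the first one finds `flag` still false.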
It sounds like you want to do something like a 'debounce', where any event will cause makeHttpCall() to execute, but it should only be executing once at a time, and it should execute again after the last call if another event has occurred while it was executing. So like this:
DB Call is made, and makeHttpCall() should execute
While makeHttpCall() is executing, you get a redis pub/sub event that should execute makeHttpCall() again, but that is delayed because it is already executing
Still before the first call is done, another DB call is made and requires makeHttpCall() to execute again. But even though you have received two events, you only need to have it called one time to update something with the most recent information you have.
The first call to makeHttpCall() finishes, but since there have been two events, you need to make a call again.
const makeHttpCall = () => new Promise(resolve => {
// resolve after 2 seconds
setTimeout(resolve, 2000);
});
// returns a function to call that will call your function
const createDebouncer = (fn) => {
let eventCounter = 0;
let inProgress = false;
const execute = () => {
if (inProgress) {
eventCounter++;
console.log('execute() called, but call is in progress.');
console.log(`There are now ${eventCounter} events since last call.`);
return;
}
console.log(`Executing... There have been ${eventCounter} events.`);
eventCounter = 0;
inProgress = true;
fn()
.then(() => {
console.log('async function call completed!');
inProgress = false;
if (eventCounter > 0) {
// make another call if there are pending events since the last call
execute();
}
});
}
return execute;
}
let debouncer = createDebouncer(makeHttpCall);
document.getElementById('buttonDoEvent').addEventListener('click', () => {
debouncer();
});
<button id="buttonDoEvent">Do Event</button>
I have a:
socketio server
array of values
async function which iterates the array and deletes them
Here is the code
// import
const SocketIo = require('socket.io')
// setup
const Server = new SocketIo.Server()
const arrayOfValues = ['some', 'values', 'here']
// server welcome message
Server.on('connection', async Socket=>{
iterateValues() // server no longer responds, even if function is async
Socket.emit('welcome', 'hi')
})
// the async function
async function iterateValues(){
while(true){
if( arrayOfValues[0] ) arrayOfValues.splice(0, 1)
}
}
Whenever the client connects to my server, "iterateValues()" must be invoked and asynchronously remove the items from array. The problem is when "iterateValues()" is invoked anywhere, it halts the script, and the server no longer responds.
I think I misunderstood what async functions actually are. How can I create a function that can run an infinite while loop without halting the thread of other methods?
I think I misunderstood what async functions actually are.
I am not a back-end guy, but I will give it a shot. Node.js still runs your JavaScript in a single thread, so an infinite loop will block the whole program. Asynchronous doesn't mean that the provided callback runs in a separate thread. If Node.js were multithreaded, you could, for example, have an infinite loop in one thread that wouldn't block the others.
Asynchronous means that the provided callback won't be run immediately, rather delayed, but when it is run it is still within that one thread.
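One way to keep a long-running loop from starving everything else is to hand control back to the event loop on every iteration. The sketch below assumes that approach; `nextTurn` and the `shouldStop` parameter are names invented here, and the stop condition exists only so the demo terminates.

```javascript
const arrayOfValues = ['some', 'values', 'here'];
const events = [];

// Resolves on a later event loop iteration, giving other callbacks a turn.
const nextTurn = () => new Promise((resolve) => setImmediate(resolve));

async function iterateValues(shouldStop) {
  while (!shouldStop()) {
    if (arrayOfValues.length > 0) {
      events.push(`removed ${arrayOfValues.shift()}`);
    }
    await nextTurn(); // without this await the loop would never yield
  }
}

// A timer stands in for "other work" such as incoming socket events.
let stop = false;
setTimeout(() => {
  events.push('timer fired');
  stop = true;
}, 20);

const loopDone = iterateValues(() => stop).then(() => events);
loopDone.then((result) => console.log(result));
```

The timer callback gets to run even though the loop is still active, because each `await nextTurn()` returns control to the event loop. Polling like this still burns CPU, so reacting to an event when the array changes is usually preferable.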
I am using this code in a for loop. My goal is that the second request is executed only once I have the response to the first request, then the third, and so on.
queueScheduler.schedule(() => {
this.ajaxcall().subscribe((res) => {
console.log('blah blah blah', res);
});
});
Now, if my loop has 10 requests, then while the 2nd request is in progress, 8 requests are in the queue. How can I know how many requests are in the queue?
How do I check the pending requests?
You are scheduling creation of ajaxcall's, but creating and subscribing to the observable is synchronous (the subscription callback will be called asynchronously, but you don't refer to the queueScheduler there).
Therefore, queueScheduler runs the scheduled task synchronously and queueScheduler.actions is an empty queue.
The queueScheduler only queues up tasks if you call queueScheduler inside of a task run by the queueScheduler.
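In RxJS itself, an operator such as concatMap is the usual way to run inner requests one after another. To illustrate the underlying idea of "one at a time, with a visible backlog" independently of RxJS, here is a plain JavaScript sketch. `createSequentialQueue`, `pending` and `fakeAjax` are all hypothetical names, and this is not how queueScheduler works internally.

```javascript
// A tiny sequential queue: runs one task at a time and exposes the backlog size.
function createSequentialQueue() {
  const waiting = [];
  let running = false;

  async function drain() {
    running = true;
    while (waiting.length > 0) {
      const { task, resolve, reject } = waiting.shift();
      try {
        resolve(await task()); // the next task starts only after this one settles
      } catch (err) {
        reject(err);
      }
    }
    running = false;
  }

  return {
    add(task) {
      const result = new Promise((resolve, reject) => {
        waiting.push({ task, resolve, reject });
      });
      if (!running) drain();
      return result;
    },
    get pending() {
      return waiting.length; // tasks not yet started
    },
  };
}

// Fake request: resolves with its id after 5 ms.
const fakeAjax = (id) => () => new Promise((resolve) => setTimeout(() => resolve(id), 5));

const queue = createSequentialQueue();
const requests = [];
for (let id = 1; id <= 10; id++) {
  requests.push(queue.add(fakeAjax(id)));
}

const pendingAfterScheduling = queue.pending; // 9: the first request is already running
const queueDone = Promise.all(requests);
queueDone.then((results) => console.log(pendingAfterScheduling, results));
```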
My application is using socket.io, from what I gather, socket.io executes asynchronously. Most of the time this is not a problem, however there is a particular case where 2 users in my app may call the same socket endpoint at the same time, and this causes issues.
What I would rather have is for each socket endpoint to wait until the one before it finishes executing, before it gets executed. If they run asynchronously, I get unexpected results.
On the server I have the following...
// Establish a connection with a WebSocket.
io.on("connection", socket => {
socket.on("add_song", async (data) => {
PlaylistHandler.add_song(io, socket, data);
});
...
...
add_song gets called at the same time by two different io connections (2 different users). I don't want the function PlaylistHandler.add_song to run in parallel for each so I tried using async/await...
await PlaylistHandler.add_song(io, socket, data);
That didn't solve anything, I suspect because there are two different io connections making the call.
Is there any way to make the socket call itself run sequentially rather than in parallel?
await doesn't block the event loop, so it indeed doesn't matter here. Both await PlaylistHandler.add_song calls will start in their respective listeners, and their asynchronous steps can interleave.
Your best/easiest shot is to set a variable calculating = true at the beginning of your add_song, and to postpone any add_song if we are already doing one.
Hope this snippet will inspire you into achieving a workable solution:
let calculating = false;
function add_song(io, socket, data){
if(calculating){
setTimeout(function(){
add_song(io, socket, data)
}, 500); //depends on how often you want to check, reduce/increase timeout depending on how time-sensitive checking should be
return;
}
calculating = true;
//Do all your usual add_song processing
//After final operation of add_song
calculating = false;
}
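An alternative to polling with setTimeout is to chain every call onto the previous one, so each add_song starts only after the one before it has settled. This is a sketch under assumptions: `runExclusive` is a helper invented here, and `addSong` is a fake that only logs and waits.

```javascript
// The tail of the chain; each new task is appended after it.
let tail = Promise.resolve();

function runExclusive(task) {
  const result = tail.then(() => task());
  // Swallow errors on the chain itself so one failure does not block later
  // tasks; the caller still sees the rejection through `result`.
  tail = result.catch(() => {});
  return result;
}

const log = [];
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Fake add_song: records when it starts and ends.
async function addSong(user, ms) {
  log.push(`start ${user}`);
  await delay(ms);
  log.push(`end ${user}`);
}

// Both calls arrive "at the same time", but bob waits for alice to finish.
const serialized = Promise.all([
  runExclusive(() => addSong('alice', 20)),
  runExclusive(() => addSong('bob', 5)),
]).then(() => log);

serialized.then((result) => console.log(result));
```

In the socket handler this would look like `runExclusive(() => PlaylistHandler.add_song(io, socket, data))`, assuming add_song returns a promise that settles when its work is finished.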
I am quite confused about why is my promise blocking the node app requests.
Here is my simplified code:
var express = require('express');
var SomeModule = require('somemodule');
var someModule = new SomeModule(); // the module exports a constructor
var app = express();
app.get('/', function (req, res) {
res.status(200).send('Main');
});
app.get('/status', function (req, res) {
res.status(200).send('Status');
});
// Init Promise
someModule.doSomething({}).then(function(){},function(){}, function(progress){
console.log(progress);
});
var server = app.listen(3000, function () {
var host = server.address().address;
var port = server.address().port;
console.log('Example app listening at http://%s:%s in %s environment',host, port, app.get('env'));
});
And the module:
var q = require('q');
function SomeModule(){
this.doSomething = function(){
return q.Promise(function(resolve, reject, notify){
for (var i=0;i<10000;i++){
notify('Progress '+i);
}
resolve();
});
}
}
module.exports = SomeModule;
Obviously this is very simplified. The promise function does some work that takes anywhere from 5 to 30 minutes and has to run only when server starts up.
There is NO async operation in that promise function. Its just a lot of data processing, loops etc.
I want to be able to make requests right away, though. So what I expect is that when I run the server, I can go right away to 127.0.0.1:3000 and see Main, and the same for any other requests.
Eventually I want to see the progress of that task by accessing /status, but I'm sure I can make that work once the server works as expected.
At the moment, when I open / it just hangs until the promise job finishes.
Obviously I'm doing something wrong...
If your task is IO-bound, use Node's asynchronous I/O APIs so the event loop stays free. If your task is CPU-bound, asynchronous calls won't offer much performance-wise. In that case you need to delegate the task to another process. An example solution would be to spawn a child process, do the work there and pipe the results back to the parent process when done.
See nodejs.org/api/child_process.html for more.
If your application needs to do this often then forking lots of child processes quickly becomes a resource hog - each time you fork, a new V8 process will be loaded into memory. In this case it is probably better to use one of the multiprocessing modules like Node's own Cluster. This module offers easy creation and communication between master-worker processes and can remove a lot of complexity from your code.
See also a related question: Node.js - Sending a big object to child_process is slow
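A minimal sketch of delegating CPU-bound work to a child process follows. To keep it self-contained, the child's code is passed inline with `node -e`; in a real project you would more likely fork a separate script file. The names `childSource`, `runHeavyTaskInChild` and `ticks` are made up for this example.

```javascript
const { execFile } = require('child_process');

// Code executed in the child process: a CPU-bound loop whose result is
// written to stdout.
const childSource = `
  let sum = 0;
  for (let i = 0; i < 1e7; i++) sum += i;
  process.stdout.write(String(sum));
`;

function runHeavyTaskInChild() {
  return new Promise((resolve, reject) => {
    // process.execPath is the path of the currently running node binary.
    execFile(process.execPath, ['-e', childSource], (err, stdout) => {
      if (err) reject(err);
      else resolve(Number(stdout));
    });
  });
}

// The parent's event loop stays free while the child works:
// this interval keeps ticking in the meantime.
let ticks = 0;
const ticker = setInterval(() => { ticks += 1; }, 5);

const heavyResult = runHeavyTaskInChild().finally(() => clearInterval(ticker));
heavyResult.then((sum) => console.log(`sum = ${sum}, parent ticks = ${ticks}`));
```

An Express server in the parent would keep answering requests in the same way the interval keeps firing here.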
The main thread of Javascript in node.js is single threaded. So, if you do some giant loop that is processor bound, then that will hog the one thread and no other JS will run in node.js until that one operation is done.
So, when you call:
someModule.doSomething()
and that is all synchronous, then it does not return until it is done executing and thus the lines of code following that don't execute until the doSomething() method returns. And, just so you understand, the use of promises with synchronous CPU-hogging code does not help your cause at all. If it's synchronous and CPU bound, it's just going to take a long time to run before anything else can run.
If there is I/O involved in the loop (like disk I/O or network I/O), then there are opportunities to use async I/O operations and make the code non-blocking. But if not, and it's just a lot of CPU work, then it will block until done and no other code will run.
Your opportunities for changing this are:
Run the CPU consuming code in another process. Either create a separate program that you run as a child process that you can pass input to and get output from or create a separate server that you can then make async requests to.
Break the blocking work into chunks where you execute 100ms chunks of work at a time, then yield the processor back to the event loop (using something like setTimeout()) to allow other things in the event queue to be serviced and run before you pick up and run the next chunk of work. You can see Best way to iterate over an array without blocking the UI for ideas on how to chunk synchronous work.
As an example, you could chunk your current loop. This runs up to 100ms of cycles and then breaks execution to give other things a chance to run. You can set the cycle time to whatever you want.
function SomeModule(){
this.doSomething = function(){
return q.Promise(function(resolve, reject, notify){
var cntr = 0, numIterations = 10000, timePerSlice = 100;
function run() {
if (cntr < numIterations) {
var start = Date.now();
while (Date.now() - start < timePerSlice && cntr < numIterations) {
notify('Progress '+cntr);
++cntr;
}
// give some other things a chance to run and then call us again
// setImmediate() is also an option here, but setTimeout() gives all
// other operations a chance to run alongside this operation
setTimeout(run, 10);
} else {
resolve();
}
}
run();
});
}
}