Node.js httpserver.listen method ambiguity


I have been working with Node.js and I am wondering what exactly the listen method does in terms of the event loop. If I had a long-running request, would the server stop listening, since it can only do one thing at a time?
var http = require('http');
function handleRequest(request, response) {
    response.end('Some Response at ' + request.url);
}
var server = http.createServer(handleRequest);
server.listen(8083, function() {
    console.log('Listening...');
});
Is server.listen listening to some event?

You can think of server.listen() as starting your web server so that it is actually listening for incoming requests at the TCP level. From the node.js http documentation for .listen():
Begin accepting connections on the specified port and hostname.
The callback passed to server.listen() is optional. It is only called once to indicate that the server has been successfully started and is now listening for incoming requests. It is not what is called on every new incoming request. The callback passed to .createServer() is what is called for every new incoming request.
Multiple incoming requests can be in process at the same time, though due to the single-threaded nature of node.js, only one request is actually executing JS code at any given moment.
But, a long running request is generally idle most of the time (e.g. waiting for database I/O or disk I/O or network I/O) so other requests can be processed and run during that idle time. This is the async nature of node.js and why it is important to use asynchronous I/O programming with node.js rather than synchronous I/O processing because asynchronous I/O allows other requests to run during the time when node.js is just waiting for I/O.
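As a rough illustration of that idle time, here is a minimal sketch (not from the original answer). The /slow route, the 200 ms timer standing in for database I/O, and the use of port 0 on 127.0.0.1 are all assumptions made for the demo. The script requests /slow first and /fast second, records the order in which the responses come back, and then closes the server.

```javascript
const http = require('http');

// '/slow' waits 200 ms (a stand-in for database or disk I/O);
// every other URL is answered immediately.
function handleRequest(request, response) {
  if (request.url === '/slow') {
    setTimeout(() => response.end('slow done'), 200);
  } else {
    response.end('fast done');
  }
}

const finished = []; // paths, in the order their responses completed

// Small helper for the demo: GET a path and call onDone when finished.
function get(port, path, onDone) {
  http.get({ host: '127.0.0.1', port, path, agent: false }, (res) => {
    res.resume(); // discard the body
    res.on('end', () => {
      finished.push(path);
      onDone();
    });
  }).on('error', onDone);
}

const server = http.createServer(handleRequest);
server.on('error', (err) => console.error('could not start:', err.message));
server.listen(0, '127.0.0.1', () => {
  const { port } = server.address();
  let pending = 2;
  const onDone = () => {
    pending -= 1;
    if (pending === 0) {
      console.log('responses completed in this order:', finished.join(', '));
      server.close(); // let the script exit
    }
  };
  get(port, '/slow', onDone); // sent first
  get(port, '/fast', onDone); // sent second, not stuck behind /slow
});
```

The handler for /slow returns right after scheduling its timer, which is what frees the JavaScript thread to pick up the /fast request.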

Yes, it basically binds an event listener to that port; similar to how event listeners work in your own code. Going more in depth would involve sockets, etc...
https://nodejs.org/api/net.html#net_server_listen_port_host_backlog_callback
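To make the "event" part concrete, here is a small sketch (my own, not from the answers). The callback you pass to listen() is registered as a listener for the server's 'listening' event, so you can subscribe to that event directly, and to 'error' for the case where the port cannot be bound. Port 0 (let the OS pick a free port) and the immediate server.close() are assumptions so that the demo exits on its own.

```javascript
const http = require('http');

const server = http.createServer((request, response) => {
  response.end('Some Response at ' + request.url);
});

// Emitted if the port cannot be bound (for example EADDRINUSE).
server.on('error', (err) => {
  console.error('Could not start listening:', err.code);
});

// Emitted once, when the server is bound and accepting connections.
server.once('listening', () => {
  console.log('Listening on port', server.address().port);
  server.close(); // stop again so this demo script can exit
});

server.listen(0); // 0 = any free port (demo assumption)
```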

The other answers are essentially correct, but I wanted to add more detail.
When you call createServer, the handler you pass in is what gets called on every incoming HTTP connection. But that is merely setting that up: it does not actually start the server or start listening for those connections. That doesn't happen until you call listen.
The (optional) callback for listen is just what gets called when the server has successfully started and is now listening for connections. Most of the time, it's simply used to log to the console that the server is started. You could also use it to record server start time for uptime monitoring. That callback is NOT invoked for every HTTP request: only once on server startup.
You don't even have to supply the callback for listen; it works fine without it. Here are some common variations (note that it's a good practice to let the port be specified by an environment variable, usually PORT; if that environment variable isn't set, there is a default):
// all in one line, no startup message
var server = http.createServer(handler).listen(process.env.PORT || 8083);
// two lines, no startup message
var server = http.createServer(handler); // server NOT started
server.listen(process.env.PORT || 8083); // server started, no confirmation
// most typical variation
var server = http.createServer(handler);
server.listen(process.env.PORT || 8083, function() {
    // server started, startup confirmed - note that this only gets called once
    console.log('server started at ' + Date.now());
});

Related

What node js returns to the client when it waits for the response?

I am new to Node.js, but I am familiar with JavaScript's asynchronous model in general. I come from a Grails/servlet background. In a servlet, when a request is sent to the server, everything is synchronous: it computes the result and sends it back to the client. If the result takes a long time, we run it in a thread and store it somewhere to retrieve later; the response is not held open, and another request is made to get the result.
In Node.js, however, my understanding so far is that the response waits until it has been computed by some asynchronous callbacks.
Now my assumption is, nodejs must return something to the client because the javascript callstack doesn't wait for the return. But NO, the proper response is sent to the client.
Now my question is how the client waits until it gets the response from the callbacks or some promises?
Here is an example:
var express = require('express');
var {mongoose} = require('./db/mongoose');
var {User} = require('./models/user');
var app = express();
app.get('/users', (req, res) => {
    User.find().then((result) => {
        // The response is computed inside then(), which runs later.
        // How does nodejs wait for this result?
        res.send(result);
    }, (e) => {
    });
});
app.listen(3000);

Now my assumption is, nodejs must return something to the client because the javascript callstack doesn't wait for the return.
It doesn't.
Now my question is how the client waits until it gets the response from the callbacks or some promises?
It just … waits. It doesn't need to be told to wait. It knows that sending a message over the network won't get an instant response.
If too much time passes before it gets a response, then it will timeout and give up.
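The "wait, then give up" behaviour can be sketched in a few lines. This is only an illustration under my own assumptions: withTimeout is a hypothetical helper, and the 50 ms promise stands in for a server that answers late. Real HTTP clients have their own timeout settings.

```javascript
// Hypothetical helper: wait for `promise`, but give up after `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`gave up after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for a server that sends its response 50 ms after the request.
const slowResponse = new Promise((resolve) => {
  setTimeout(() => resolve('response from server'), 50);
});

withTimeout(slowResponse, 1000).then(
  (body) => console.log('client received:', body),
  (err) => console.error('client error:', err.message)
);
```

Nothing has to be sent to the client in the meantime; it simply has a pending request until either the response or the timeout arrives.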

The client simply waits for the response; Node.js does not do anything special to make it wait. If you want to block the user interface while the request is pending (for example with a loading indicator), you can use a client-side library for that.
If the client has to wait too long, the request either fails with an error (if something went wrong on the server side) or times out.

How does single-threaded Node.js handles requests concurrently?

I am currently studying the Node.js platform in depth. As we know, Node.js is single-threaded, and if it executes a blocking operation (for example fs.readFileSync), the thread has to wait for that operation to finish. I decided to run an experiment: I created a server that responds to each request with a huge amount of data from a file.
const { createServer } = require('http');
const fs = require('fs');
const server = createServer();
server.on('request', (req, res) => {
    let data;
    data = fs.readFileSync('./big.file');
    res.end(data);
});
server.listen(8000);
I also launched 5 terminals in order to send parallel requests to the server. I expected that while one request was being handled, the others would have to wait for the first request's blocking operation to finish. However, the other 4 requests were answered concurrently. Why does this happen?

What you're likely seeing is one of two things. Either some asynchronous part of the implementation inside res.end() is doing the actual sending of your large amount of data, or all the data is being sent very quickly and serially but the clients can't process it fast enough to show it serially. Because the clients are each in their own separate process, they "appear" to show it arriving concurrently just because they're too slow to reveal the actual arrival sequence.
One would have to use a network sniffer to see which of these is actually occurring or run some different tests or put some logging inside the implementation of res.end() or tap into some logging inside the client's TCP stack to determine the actual order of packet arrival among the different requests.
If you have one server and it has one request handler that is doing synchronous I/O, then you will not get multiple requests processed concurrently. If you believe that is happening, then you will have to document exactly how you measured that or concluded that (so we can help you clear up your misunderstanding) because that is not how node.js works when using blocking, synchronous I/O such as fs.readFileSync().
node.js runs your JS as single threaded and when you use blocking, synchronous I/O, it blocks that one single thread of Javascript. That's why you should never use synchronous I/O in a server, except perhaps in startup code that only runs once during startup.
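A quick way to see that blocking for yourself is the sketch below (my own illustration). blockFor is a hypothetical busy-wait helper that stands in for a long synchronous call such as fs.readFileSync() on a big file. A timer is scheduled for 0 ms, but it cannot fire until the synchronous work has finished.

```javascript
const start = Date.now();
let timerDelay = null; // filled in when the timer callback finally runs

// Ask for a callback "as soon as possible".
setTimeout(() => {
  timerDelay = Date.now() - start;
  console.log(`timer asked for 0 ms, actually ran after ${timerDelay} ms`);
}, 0);

// Hypothetical stand-in for blocking, synchronous work.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy wait: the single JavaScript thread can do nothing else here
  }
}

blockFor(200); // nothing else (timers, requests, callbacks) runs meanwhile
```

In a server, every pending request is in the same position as that timer while a synchronous call is running.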
What is clear is that fs.readFileSync('./big.file') is synchronous so your second request will not get started processing until the first fs.readFileSync() is done. And, calling it on the same file over and over again will be very fast (OS disk caching).
But, res.end(data) is non-blocking, asynchronous. res is a stream and you're giving the stream some data to process. It will send out as much as it can over the socket, but if it gets flow controlled by TCP, it will pause until there's more room to send on the socket. How much that happens depends upon all sorts of things about your computer, its configuration and the network link to the client.
So, what could be happening is this sequence of events:
1. First request arrives and does fs.readFileSync() and calls res.end(data). That starts sending data to the client, but returns before it is done because of TCP flow control. This sends node.js back to its event loop.
2. Second request arrives and does fs.readFileSync() and calls res.end(data). That starts sending data to the client, but returns before it is done because of TCP flow control. This sends node.js back to its event loop.
3. At this point, the event loop might start processing the third or fourth requests, or it might service some more events (from inside the implementation of res.end() or the writeStream from the first request) to keep sending more data. If it does service those events, it could give the appearance (from the client point of view) of true concurrency of the different requests.
Also, the client could be causing it to appear sequenced. Each client is reading a different buffered socket and if they are all in different terminals, then they are multi-tasked. So, if there is more data on each client's socket than it can read and display immediately (which is probably the case), then each client will read some, display some, read some more, display some more, etc... If the delay between sending each client's response on your server is smaller than the delay in reading and displaying on the client, then the clients (which are each in their own separate processes) are able to run concurrently.
When you are using asynchronous I/O such as fs.readFile(), then properly written node.js Javascript code can have many requests "in flight" at the same time. They don't actually run concurrently at exactly the same time, but one can run, do some work, launch an asynchronous operation, then give way to let another request run. With properly written asynchronous I/O, there can be an appearance from the outside world of concurrent processing, even though it's more akin to sharing of the single thread whenever a request handler is waiting for an asynchronous I/O request to finish. But, the server code you show is not this cooperative, asynchronous I/O.
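For comparison, a cooperative version of the handler from the question might look like the sketch below. It keeps the './big.file' path from the question. The 500 response for a read error is my own assumption, and the listen() call is commented out so the snippet does not hold a port open.

```javascript
const fs = require('fs');
const http = require('http');

// Non-blocking variant: start the read, return immediately,
// and send the response later from the callback.
function handleRequest(req, res) {
  fs.readFile('./big.file', (err, data) => {
    if (err) {
      res.statusCode = 500; // assumption: report read failures as 500
      res.end('could not read file');
      return;
    }
    res.end(data);
  });
  // Control goes back to the event loop here, before the file has been read.
}

const server = http.createServer(handleRequest);
// server.listen(8000); // uncomment to serve on the port used in the question
```

While one request is waiting for its file read, the JavaScript thread is free to start the handler for the next one.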

This may not be directly related to your question, but I think it is useful.
You can use a stream instead of reading the full file into memory, for example:
const { createServer } = require('http');
const fs = require('fs');
const server = createServer();
server.on('request', (req, res) => {
    const readStream = fs.createReadStream('./big.file'); // Here we create the stream.
    readStream.pipe(res); // Here we pipe the readable stream to the res writable stream.
});
server.listen(8000);
The point of doing this is:
Looks nicer.
You don't store the full file in RAM.
This works better because it is non-blocking, and since the res object is already a stream, the data will be transferred in chunks.
Ok so streams = chunked
Why not read chunks from the file and send them in real time, instead of reading a really big file and dividing it into chunks afterwards?
Also, why is this really important on a real production server?
Because every time a request is received, your code loads that big file into RAM. Add to that the fact that requests are concurrent, so you expect to serve multiple files at the same time. So let's do the most advanced math my poor education allows:
1 request for a 1gb file = 1gb in ram
2 requests for a 1gb file = 2gb in ram
etc
That clearly doesn't scale nicely right?
Streams allow you to decouple that data from the current state of the function (inside that scope), so in simple terms it's going to be roughly (with the default fs.createReadStream chunk size of 64kb):
1 request for 1gb file = 64kb in ram
2 requests for 1gb file = 128kb in ram
etc
And also, the OS is already passing a stream to node (fs), so it works with streams end to end.
Hope it helps :D.
PS: Never use sync (blocking) operations inside async (non-blocking) operations.
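If you want to check the chunking yourself, a sketch like the following may help. The Node.js documentation lists 64 KiB as the default highWaterMark for fs.createReadStream. Reading process.execPath (the node binary) and stopping after the first 1 MiB are my own choices, made only so the demo has a file that certainly exists.

```javascript
const fs = require('fs');

// process.execPath is used only because it is a file that certainly exists;
// `end` limits the demo to the first 1 MiB of it.
const readStream = fs.createReadStream(process.execPath, { end: 1024 * 1024 - 1 });

// Threshold at which the stream stops reading ahead until data is consumed.
const bufferLimit = readStream.readableHighWaterMark;

let chunks = 0;
let largestChunk = 0;
readStream.on('data', (chunk) => {
  chunks += 1;
  largestChunk = Math.max(largestChunk, chunk.length);
});
readStream.on('end', () => {
  console.log(`buffer threshold: ${bufferLimit} bytes`);
  console.log(`received ${chunks} chunks, largest was ${largestChunk} bytes`);
});
readStream.on('error', (err) => console.error('read failed:', err.message));
```

The memory held per request is tied to that threshold (plus whatever the response socket is buffering), not to the size of the file.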

Requests handling inside Event Pool using NodeJS

I have read about the difference between the multi-threaded mechanism and the NodeJS single-threaded mechanism here. I know little about threading concepts.
My question
The above article says that all the Non Blocking I/O is handled using single thread in Event loop.
I have read through questions posted in this forum, but they only give an overview of how the single thread works and not the deeper mechanism. They say something like:
Starts processing the Client Request
If that Client Request Does Not requires any Blocking IO Operations, then process everything, prepare response and send it back to client.
If there are like 2 or more Non Blocking requests in Event Queue, Event loop takes each requests and processes it.
First request enter Event Pool and starts processing and does not wait or hold till the response and meanwhile request 2 enters and starts processing without wait.
Now,since the 2nd request has taken the thread for processing (and all request is handled using single thread) , currently what is handling the 1st request process, If there is thread sharing , how is it happening ?
Is the first request process released when handling 2nd request and later comes back to 1st request ? if so how is it happening in thread perspective ?
how does single thread processes 2 or more request concurrently as basically thread will be assigned to a request until all it's process is finished
and how is single thread handled for both Input and Output operation at same time ?
is there any topic i am missing to read so that i'm getting this single thread event loop mechanism ?

First off, "single threaded" applies only to one thread running your Javascript. Node.js has other native threads for implementing some of the functions in the built-in library. For example, file I/O uses a thread pool in order to implement asynchronous file I/O. But, what's most important to understanding how your own Javascript runs is that there is only one thread of your Javascript.
Let's imagine that you have a simple web server like this:
const http = require('http');
const fs = require('fs');

// Read the requested file asynchronously and send it (or a 404) to the client.
function sendFile(res, filename) {
    if (filename.startsWith("/")) {
        filename = filename.slice(1) + ".html"; // "/1" -> "1.html"
    }
    fs.readFile(filename, (err, data) => {
        if (err) {
            res.writeHead(404);
            res.end("not found");
        } else {
            res.writeHead(200, {'Content-Type': 'text/html'});
            res.write(data);
            res.end();
        }
    });
}
const server = http.createServer((req, res) => {
    if (req.url === "/1" || req.url === "/2" || req.url === "/3") {
        sendFile(res, req.url);
    } else {
        res.writeHead(404);
        res.end("not found");
    }
});
server.listen(80);
This web server responds to requests for three URLs /1, /2 and /3.
Now imagine that three separate clients each request one of those URLs. Here's the sequence of events:
1. Client A requests http://myserver.com/1
2. Client B requests http://myserver.com/2
3. Client C requests http://myserver.com/3
4. Server receives incoming connection from client A, establishes the connection, client sends the request for /1 and the server reads and parses that request.
5. While the server is busy reading the request from client A, the requests from both client B and client C arrive.
6. The TCP stack handles incoming connections at the OS level (using other threads, i.e. kernel-level threads).
7. Notifications of the arriving connections are put in the node.js event queue. Because the node.js server is busy running Javascript for the client A connection, those two events sit in the event queue for now.
8. At the same time as those other connections are arriving, the node.js server is starting to run the request handler for /1. It finds a match in the first if statement and calls sendFile(res, "/1").
9. sendFile() calls fs.readFile() and then returns. Because fs.readFile() is asynchronous, that file operation is started, but is handed over to the I/O system inside of node.js and then the function immediately returns. When sendFile() returns, it goes back to the http server request handler which also then returns. At this point, there's nothing else for this request to do. Control has been returned back to the interpreter to decide what to do next.
10. The node.js interpreter checks the event queue to see if there is anything in there to process. It finds the incoming request from client B and that request starts processing. This request goes through the same steps 8 and 9 until it returns with another fs.readFile() operation initiated.
11. Then step 10 is repeated for the incoming request from client C.
12. Then, some short time later, one of the three fs.readFile() operations that were previously initiated completes and places a completion callback into the Javascript event queue. As soon as the Javascript interpreter has nothing else to do, it finds that event in the event queue and begins to process it. This calls the callback that was passed to fs.readFile() with the two parameters that that function expects and the code in the callback starts to execute.
13. Assuming the fs.readFile() operation was successful, it calls res.writeHead(), then res.write(), then res.end(). Those three calls all send data to the underlying OS TCP stack where it is then sent back to the client.
14. After res.end() returns, control is returned back to the interpreter and it checks the event queue for the next event. If another fs.readFile() callback is already in the event queue, then it is pulled out of the event queue and processed like the previous one. If the event queue is empty, then the interpreter waits until something is put in the event queue.
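The ordering in the sequence above can be reproduced without any network at all. The following is only a simulation under my own assumptions: fakeReadFile is a hypothetical stand-in for fs.readFile() that completes 10 ms later, and handleRequest is called directly for clients A, B and C instead of being driven by real connections.

```javascript
const log = [];
let remaining = 3;

// Hypothetical stand-in for fs.readFile(): completes later via the event queue.
function fakeReadFile(filename, callback) {
  setTimeout(() => callback(null, `<contents of ${filename}>`), 10);
}

function handleRequest(client, url) {
  log.push(`${client}: handler starts for ${url}`);
  fakeReadFile(url.slice(1) + '.html', (err, data) => {
    // Runs only after all three handlers have returned.
    log.push(`${client}: callback runs, sends ${data}`);
    remaining -= 1;
    if (remaining === 0) {
      console.log(log.join('\n'));
    }
  });
  log.push(`${client}: handler returns`);
}

handleRequest('A', '/1');
handleRequest('B', '/2');
handleRequest('C', '/3');
```

Each handler starts and returns before any completion callback runs, which is the "return control, get back in line later" pattern the steps describe.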
If there are like 2 or more Non Blocking requests in Event Queue, Event loop takes each requests and processes it.
node.js only runs one at a time. But, the key is that asynchronous code in the request handler allows the handler to return control back to the system so that other events can be processed while that first request is waiting for its asynchronous operation to complete. This is a form of cooperative, non-pre-emptive multi-tasking. It's not multiple threads of Javascript. The first request handler actually starts an asynchronous operation and then returns (as if it was done). When it returns, the next event in the queue can start processing. At some later time when the asynchronous operation completes, it will insert its own event into the event queue and it will get back in line to use the single thread of Javascript again.
First request enter Event Pool and starts processing and does not wait or hold till the response and meanwhile request 2 enters and starts processing without wait.
Most of this has already been described above. If the Javascript thread is busy when request 2 enters the event queue, that request will sit in the event queue until the Javascript thread is no longer busy. It may have to wait a short period of time. But, it won't have to wait until request 1 is done, only until request 1 returns control back to the system and is, itself, waiting for some asynchronous operation to complete.
Now,since the 2nd request has taken the thread for processing (and all request is handled using single thread) , currently what is handling the 1st request process, If there is thread sharing , how is it happening ?
While the 2nd request is using the Javascript thread, the 1st request is not running any Javascript. Its native code asynchronous operations may be running in the background (all asynchronous operations require some native code), but there is only one piece of Javascript running at any given time. So if the 2nd request is running some Javascript, then the first request is either waiting for its asynchronous operation to finish, or that operation has already finished and an event is sitting in the event queue waiting for the 2nd request to be done so that event can get processed.
Is the first request process released when handling 2nd request and later comes back to 1st request ? if so how is it happening in thread perspective ?
This all works through the event queue. 1st request runs until it returns. 2nd request runs until it returns. When async operation from 1st request completes it inserts an item in the event queue. When the JS interpreter is free, it pulls that event from the event queue and runs it. There may be threads involved in native code implementations of asynchronous operations, but there is still only one thread of Javascript.
how does single thread processes 2 or more request concurrently as basically thread will be assigned to a request until all it's process is finished
It never actually runs multiple pieces of Javascript concurrently. The Javascript from each different operation runs until it returns control back to the interpreter. Asynchronous operations (such as file I/O or networking operations) can run concurrently and those are managed by native code, sometimes using additional threads and sometimes not. File I/O uses a thread pool to implement non-blocking, asynchronous file I/O. Networking uses OS event notifications (select, epoll, etc...), not threads.
and how is single thread handled for both Input and Output operation at same time ?
It doesn't in your Javascript. It would typically read, then write. It doesn't do both "at the same time". Now, the TCP stack may be doing some actual parallel work inside the OS, but that's all managed by the OS and even that probably gets serialized at the network interface at some point. In short, requests are handled by a single Javascript thread, whereas the input/output operations themselves are managed at the OS level.
is there any topic i am missing to read so that i'm getting this single thread event loop mechanism ?
Read everything you can find about the Javascript event queue. Here are some references to get you started:
How does JavaScript handle AJAX responses in the background?
Where is the node.js event queue?
Node.js server with multiple concurrent requests, how does it work?
Asynchronous process handler in node
How does a single thread handle asynchronous code in Node.js?

How to run child process in Mean Stack

I have a MEAN application which uses nodejs, angularjs and expressjs.
Here I have called my server from the angular controller as below
Angular Controller.js
$http.post('/sample', $scope.sample).then(function (response) {
    // ...
});
and in Server.js as below
app.post('/sample', userController.postsample);
Here I am doing my operation with mongodb in that post sample from above code.
Here I got stuck on how to do the calculation part: I have a big calculation which takes some time (assume 1 hour) to complete. From the client side I will trigger that calculation from my Angular controller.
My problem is that the calculation should run separately, so that other UIs and operations on other pages are not interrupted.
I have seen child_process in nodejs, but I don't understand how to trigger or exec a child process from the controller, and whether other pages remain accessible while the request is being handled in app.post.
EDIT:
I have planned to spawn a child_process, but I have another problem following on from the above.
Let's say the application has 3 users and 2 of them are accessing the application at the same time.
My case is: the first person triggers the child_process (call it the first operation) and it is still running when the second person needs to trigger the process as well (call it the 2nd operation), since he also needs the calculation.
My questions are:
What happens if another person runs the spawn command? Does it hang, wait in a queue, or do both execute in parallel?
If the 2nd operation is queued, when will it start?
If the 2nd operation is queued, how can I know how many operations are in the queue at a given point in time?
Can anyone help to solve.

Note: the question was edited - see updates below.
You have a few options for doing this.
The most straightforward way would be to spawn the child process from your Express controller that would return the response to the client once the calculation is done, but if it takes so long then you may have problems with socket timeouts etc. This will not block your server or the client (if you don't use "Sync" function on the server and synchronous AJAX on the client) but you will have problems with the connection hanging for so long.
Another option would be to use WebSocket or Socket.io for those requests. The client could post a message to the server that it wants some computation to get started and the server could spawn the child process, do other things and when the child returns just send the message to the client. The disadvantage of that is a new way of communication but at least there would be no problems with timeouts.
To see how to combine WebSocket or Socket.io with Express, see this answer that has examples for both WebSocket and Socket.io - it's very simple actually:
Differences between socket.io and websockets
Either way, to spawn a child process you can use:
spawn
exec
execFile
fork
from the core child_process module. Just make sure to never use any functions with "Sync" in their name for what you want to do because those would block your server from serving other requests for the entire time of waiting for the child to finish - which may be an hour in your case, but even if it would be a second it could still ruin the concurrency completely.
See the docs:
https://nodejs.org/api/child_process.html
Update
Some update for the edited question. Consider this example shell script:
#!/bin/sh
sleep 5
date -Is
It waits for 5 seconds and prints the current time. Now consider this example Node app:
let child_process = require('child_process');
let app = require('express')();
app.get('/test', (req, res) => {
    child_process.execFile('./script.sh', (err, data) => {
        if (err) {
            return res.status(500).send('Error');
        }
        res.send(data);
    });
});
app.listen(3344, () => console.log('Listening on 3344'));
Or using ES2017 syntax:
let child_process = require('mz/child_process');
let app = require('express')();
app.get('/test', async (req, res) => {
    try {
        res.send((await child_process.execFile('./script.sh'))[0]);
    } catch (err) {
        res.status(500).send('Error');
    }
});
app.listen(3344, () => console.log('Listening on 3344'));
It runs that shell script for requests on GET /test and returns the result.
Now start three requests at the same time:
curl localhost:3344/test & curl localhost:3344/test & curl localhost:3344/test &
and see what happens. If the returned times differ by 5 seconds and you get one response after another at 5-second intervals, then the operations are queued. If you get all responses at the same time with more or less the same timestamp, then they are all run in parallel.
Sometimes it's best to make an experiment like this to see what happens.
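The same experiment can also be run without Express or a shell script. The sketch below is my own variation: each "operation" is a child node process that just sleeps for 500 ms, and the running counter and runJob helper are hypothetical names. The counter also shows one simple way to know how many operations are in progress at a given moment.

```javascript
const { execFile } = require('child_process');

const started = Date.now();
let running = 0; // how many child processes are currently in progress

// Hypothetical job: a child node process that just waits for 500 ms.
function runJob(name) {
  running += 1;
  execFile(process.execPath, ['-e', 'setTimeout(() => {}, 500)'], (err) => {
    running -= 1;
    const elapsed = Date.now() - started;
    console.log(`${name} finished after ${elapsed} ms${err ? ' (failed)' : ''}`);
  });
}

runJob('first operation');
runJob('second operation'); // started while the first one is still running

console.log('operations in progress:', running);
```

If both jobs report a similar elapsed time rather than the second taking twice as long, they ran side by side instead of being queued.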

Node.js SIGTERM implementation considerations

I have a stateless node.js application. It is an API implemented with express which connects to a mongoDB database. Each request is completely independent from other requests (ergo stateless).
I would like to handle SIGTERM in order to shut down gracefully, but I do not know what I should take into consideration.
What I know for sure is that I should close my db connection. But, if I just do that:
process.on('SIGTERM', function () {
    server.close(function () { // Stops express
        db.close(false, function() { // Closes database connection
            process.exit(0);
        });
    });
});
Can I assure that no request is being interrupted by doing that? If not, how do I know if a request is being made and how do I wait for it to finish? Should I stop listening for requests during this time? If so, how?
Thanks in advance for the answers.

Assuming that server is an http.Server instance, calling .close() will stop the server from accepting new connections (see the documentation), but existing requests will continue to run until they're done.
The callback is called only once all existing connections have ended, so it's safe to assume that at that point you can close the database connection (there won't be any requests using it anymore).
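Put together, the ordering could be sketched like this. gracefulShutdown is a hypothetical helper name, and db here is assumed to be any object with a close(callback) method, which is simpler than the db.close(false, callback) call in the question. The exit function is passed in so the ordering is easy to follow and to try out with stand-in objects.

```javascript
// server: anything with close(callback), such as an http.Server
// db: assumed to expose close(callback) (hypothetical interface)
// exit: normally process.exit
function gracefulShutdown(server, db, exit) {
  // 1. Stop accepting new connections; the callback runs once the
  //    existing connections have ended.
  server.close((err) => {
    // 2. Only now release the database connection.
    db.close(() => {
      // 3. Exit, with a non-zero code if the server reported an error.
      exit(err ? 1 : 0);
    });
  });
}

// Wiring, assuming `server` and `db` exist in your application:
// process.on('SIGTERM', () => gracefulShutdown(server, db, process.exit));
```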
