Blocking requests not running simultaneously on PM2 - javascript

In my Express app, I have defined two endpoints: one for an is-server-up check and one for simulating a blocking operation.
app.use('/status', (req, res) => {
  res.sendStatus(200);
});

app.use('/p', (req, res) => {
  const { logger } = req;
  logger.info({ message: 'Start' });
  let i = 0;
  const max = 10 ** 10;
  while (i < max) {
    i += 1;
  }
  res.send(`${i}`);
  logger.info({ message: 'End' });
});
I am using winston for logging and PM2 for clustering, started with the following command:
$ pm2 start bin/httpServer.js -i 0
It has launched 4 instances.
Now, when I visit the routes /p, /p, /status in that order in different tabs, with around a 1-second delay between request 1 and request 2 and between request 2 and request 3, I expected the responses for request 1 and request 2 to arrive after some time, roughly 1 second apart, and the response for request 3 to come instantly.
Actual: the response for request 3 did come instantly, but something odd happened with requests 1 and 2. Request 2 didn't even start until request 1 had completed. Here are the logs I got. You can see the timestamps for the end of request 1 and the start of request 2.
{"message":"Start","requestId":"5c1f85bd-94d9-4333-8a87-30f3b3885d9c","level":"info","timestamp":"2020-12-28 07:34:48"}
{"message":"End","requestId":"5c1f85bd-94d9-4333-8a87-30f3b3885d9c","level":"info","timestamp":"2020-12-28 07:35:03"}
{"message":"Start","requestId":"f1f86f68-1ddf-47b1-ae62-f75c7aa7a58d","level":"info","timestamp":"2020-12-28 07:35:03"}
{"message":"End","requestId":"f1f86f68-1ddf-47b1-ae62-f75c7aa7a58d","level":"info","timestamp":"2020-12-28 07:35:17"}
Why did request 1 and request 2 not start at (roughly) the same time, with the 1-second delay, of course? And if they are running sequentially, why did request 3 respond instantly instead of waiting for requests 1 and 2 to complete?

That's because the Connection header in the response is keep-alive, which your Node server sends by default. The connection is therefore reused when you use a browser (curl can also simulate the reused-connection situation). That means multiple requests are served by the same instance within a certain time window, even though you have multiple Node instances.
Note: you can see that time window in the response headers, for example Keep-Alive: timeout=5.
If you use a browser, open the network tab to see the response headers.
If you use curl, add the -v option to see the response headers.
You can try running multiple separate curl commands at the same time in a terminal. Separate curl commands mean the connection will not be reused, so you'll get the results you expected. You can also add a console.log("status test") in the /status route handler and then use pm2 logs to see which instance served the request, in the following format (these logs were produced by accessing the endpoint with a browser).
0|server | status test
0|server | status test
The leading 0 identifies the first instance. You will see that it is always the same instance serving the requests when you access the endpoint with a browser. But if you use curl, you'll find the number keeps changing, which means every request is served by a different Node instance.
In my test I sent two requests at the same time with curl from the terminal, and different Node instances served them, so the start and end times printed by console.log were the same. In this example I have 8 instances, hence 8 event loops, so I can handle 8 long-running (synchronous) requests at the same time.
You can also use curl to simulate the keep-alive situation; then you'll see the requests are served by the same Node instance.
curl http://localhost:8080/status http://localhost:8080/status -v -H "Connection: keep-alive"
Or use Connection: close to see the requests served by different Node instances.
curl http://localhost:8080/status http://localhost:8080/status -v -H "Connection: close"
You can see the difference there.
If you want to close the connection on the server side, you can use the following code:
res.setHeader("Connection", "close")
This is my test code.
const express = require("express");
const app = express();
const port = 8080;

app.use('/status', (req, res) => {
  console.log("status tests");
  res.sendStatus(200);
});

app.use('/p', (req, res) => {
  console.log(new Date() + " start");
  let i = 0;
  const max = 10 ** 10;
  while (i < max) {
    i += 1;
  }
  res.send(`${i}`);
  console.log(new Date() + " end");
});

app.listen(port, () => {
  return console.log(`server is listening on ${port}`);
});
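If you want the log lines themselves to show which worker handled a request (instead of relying on the 0|server prefix from pm2 logs), a small variation of the /status handler can print the process id. This is only a sketch: process.pid always works, while NODE_APP_INSTANCE is an environment variable PM2 usually sets per instance in cluster mode, so treat that part as an assumption for your setup.

// Variant of the /status handler that also logs which worker served the request.
app.use('/status', (req, res) => {
  // NODE_APP_INSTANCE is assumed to be set by PM2; process.pid works regardless.
  const instance = process.env.NODE_APP_INSTANCE || 'n/a';
  console.log(`status test (pid=${process.pid}, instance=${instance})`);
  res.sendStatus(200);
});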

Related

Why is request.on data firing with a delay on NodeJS?

There is a simple web server that accepts data. Sample code below.
The idea is to track in real time how much data has arrived at the server and immediately inform the client about it. If I send a small amount of data, everything works well, but if I send more than a certain size, the 'data' event on the server fires with a huge delay. I can see that data has been transferring for 5 seconds already, but the 'data' event has not yet been triggered.
The 'data' event seems to fire only once the data has been uploaded completely to the server, which is why it works fine with small payloads (~2–20 MB) but not with big ones (50–200 MB).
Or maybe it is due to some kind of buffering?
Do you have any suggestions as to why the 'data' event is triggered with a delay, and how to fix it?
const express = require('express');
const app = express();
const port = 3000;

// PUBLIC API
// upload file
app.post('/upload', function (request, response) {
  request.on('data', chunk => {
    // message appears with delay
    console.log('upload on data', chunk.length);
    // send message to the client about chunk.length
  });
  response.send({
    message: `Got a POST request ${request.headers['content-length']}`
  });
});

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
TLDR:
The delay that you are experiencing is probably the Queueing from resource scheduling in the browser.
The Test
I did some tests with Express and found that it uses the http module to handle requests/responses, so I used a raw http server listener to test this scenario, which shows the same behaviour.
Backend code
This code, based on Node's anatomy-of-an-HTTP-transaction sample, creates an http server and logs the time in three situations:
When a request was received
When the first data event fires
When the end event fires
const http = require('http');

var firstByte = null;

var server = http.createServer((request, response) => {
  const { headers, method, url } = request;
  let body = [];
  request.on('error', (err) => {
  }).on('data', (chunk) => {
    if (!firstByte) {
      firstByte = Date.now();
      console.log('received first byte at: ' + Date.now());
    }
  }).on('end', () => {
    console.log('end receive data at: ' + Date.now());
    // body = Buffer.concat(body).toString();
    // At this point, we have the headers, method, url and body, and can now
    // do whatever we need to in order to respond to this request.
    if (url === '/') {
      response.statusCode = 200;
      response.setHeader('Content-Type', 'text/html');
      response.write('<h1>Hello World</h1>');
    }
    firstByte = null;
    response.end();
  });
  console.log('received a request at: ' + Date.now());
});

server.listen(8083);
Frontend code (snippet from devtools)
This code fires an upload to /upload with some array data. I originally filled the array with random bytes, but removing that had no effect on my timing log, so the upload content here is just an array of zeros.
console.log('building data');
var view = new Uint32Array(new Array(5 * 1024 * 1024));

console.log('start sending at: ' + Date.now());
fetch("/upload", {
  body: view,
  method: "post"
}).then(async response => {
  const text = await response.text();
  console.log('got response: ' + text);
});
Running the backend code and then the frontend code, I get the following logs.
Log capture (screenshots, not reproduced here): the backend and frontend logs, and the time differences between backend and frontend.
Results
Looking at the screenshots, I see two differences between the logs:
The first, and most important, is the difference between the frontend fetch start and the backend receiving the request: I got 1613 ms, which is "close" to the 1430 ms of Resource Scheduling in the network timing tab. I think there are more things happening between the frontend fetch call and the Node backend event, so I can't compare the times directly:
log.backendReceivedRequest - log.frontEndStart
1613
The second is the difference between the backend receiving the first and the last chunk of data, which was 578 ms, close to the Request sent time (585 ms) in the network timing tab:
log.backendReceivedAllData - log.backendReceivedFirstData
578
I also changed the frontend code to send different sizes of data, and the network timing tab still matches the log.
The thing that remains unknown to me is: why is Google Chrome queueing my fetch when I'm not running any other requests and not using the server's/host's bandwidth? I read the conditions for queueing but didn't find the reason; maybe it is allocating the resources on disk, but I'm not sure: https://developer.chrome.com/docs/devtools/network/reference/#timing-explanation
References:
https://nodejs.org/es/docs/guides/anatomy-of-an-http-transaction/
https://developer.chrome.com/docs/devtools/network/reference/#timing-explanation
I found the problem. It was in the nginx config. Nginx was set up as a reverse proxy, and proxy request buffering is enabled by default, so nginx first reads the whole request body and only then forwards it to Node.js; that's why I saw the delay.
https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_request_buffering
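For reference, a minimal sketch of that fix, assuming a typical reverse-proxy location block (the location path and upstream address are placeholders, not from the original answer):

# Hypothetical reverse-proxy block: with request buffering off, nginx streams
# the body to Node.js as it arrives instead of reading it all first.
location /upload {
    proxy_request_buffering off;
    proxy_http_version 1.1;
    proxy_pass http://127.0.0.1:3000;
}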

Concurrency in Express

I would like to set up an API using express that can either multithread or multiprocess requests.
For instance, below is an api that sleeps 5 seconds before sending a response. If I call it quickly 3 times, the first response will take 5 seconds, the second will take 10, and the third will take 15, indicating the requests were handled sequentially.
How do I architect an application that can handle the requests concurrently?
const express = require('express')
const app = express()
const port = 4000

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

app.get('/', (req, res) => {
  sleep(5000).then(() => {
    res.send('Hello World!')
  })
})

app.listen(port, () => console.log(`Example app listening on port ${port}!`))
Edit: request -> response
If I call it quickly 3 times, the first response will take 5 seconds, the second will take 10, and the third will take 15, indicating the requests were handled sequentially.
That's only because your browser is serializing the requests, because they're all requesting the same resource. On the Node.js/Express side, those requests are independent of one another. If they were sent from three separate clients one right after another, they'd each get a response roughly five seconds later (not after 5, 10, and 15 seconds).
For instance, I updated your code to output the date/time of the response:
res.send('Hello World! ' + new Date().toISOString())
...and then opened http://localhost:4000 in three separate browsers as quickly as I could (I don't appear to be all that quick :-)). The times on the responses were:
16:15:58.819Z
16:16:00.361Z
16:16:01.164Z
As you can see, they aren't five seconds apart.
But if I do that in three windows in the same browser, they get serialized:
16:17:13.933Z
16:17:18.938Z
16:17:23.942Z
If I further update your code so that it's handling three different endpoints:
function handler(req, res) {
  sleep(5000).then(() => {
    res.send('Hello World! ' + new Date().toISOString())
  })
}

app.get('/a', handler);
app.get('/b', handler);
app.get('/c', handler);
Then even on the same browser, requests for /a, /b, and /c are not serialized.
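For convenience, here is a self-contained version of that modified example (combining the sleep helper, the timestamped response, and the three endpoints), so it can be run directly; the port is taken from the question:

const express = require('express')
const app = express()
const port = 4000

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Shared handler: include the time so it is easy to see whether responses are serialized.
function handler(req, res) {
  sleep(5000).then(() => {
    res.send('Hello World! ' + new Date().toISOString())
  })
}

// Three distinct endpoints: the browser will not serialize requests to different
// URLs the way it serializes repeated requests to the same URL.
app.get('/a', handler);
app.get('/b', handler);
app.get('/c', handler);

app.listen(port, () => console.log(`Example app listening on port ${port}!`))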

How to send 102 Processing in Express?

I am setting up a new HTTP server to execute a long command and return the response from that shell command to the client.
I run v4.17.1 of Express. Requests from clients repeatedly time out when running this command. (I app.use(cors()), if that makes any difference.)
app.get("/dl", (req, res) => {
  require("child_process").exec("command -url".concat(req.query.url), (err, stdout, stderr) => {
    if (err || stderr) return res.status(500).send(`err: ${err.message}, stderr: ${stderr}`);
    res.status(200).send(stdout);
  });
});
Browsers just time out when I try to run this command because it takes A LONG TIME. If I can't use 102 Processing, that's fine; I would just like another solution. Thanks!
I'd suggest not using an HTTP 102. You can read more about why: https://softwareengineering.stackexchange.com/a/316211/79958
I'd also STRONGLY recommend against your current logic using a query parameter. Someone could inject commands that would be executed on the server.
"If I can't use 102 Processing..."
Don't use 102 Processing as it is designed specifically for WebDAV. Please check RFC2518 for detail information.
"I would like another solution"
You can return 200 OK for GET /dl once the HTTP request is received and the child process is launched, indicating: "Hey, client, I've received your request and started the job successfully":
app.get("/dl", (req, res) => {
  require("child_process").exec("command -url".concat(req.query.url));
  res.status(200).end();
});
Then, in the child process callback, save the execution result somewhere (in a file, in a DB, etc.), mapping the query url to the result:
query url A --> child process result A
query url B --> child process result B
query url C --> child process failed information
On the client side, after receiving 200 OK for the GET /dl request, start polling: send a request to the server every 5 seconds (or whatever interval you need), with the previously submitted query url as a parameter, trying to look up its result in the above mapping (a sketch of this setup follows the list below). Then:
If the result is found in the mapping, the client has what it wants and stops polling.
If nothing is found in the mapping, the client keeps polling after another 5 seconds.
If failure information is found, or polling times out, the client gives up, stops polling, and displays the error message.
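A minimal sketch of this pattern, assuming an in-memory Map keyed by the query url and a hypothetical /dl/result endpoint for the polling (both names are illustrative, not part of the original answer):

const express = require("express");
const { exec } = require("child_process");

const app = express();
const results = new Map(); // query url -> { status, output }, kept in memory for illustration

app.get("/dl", (req, res) => {
  const url = req.query.url;
  results.set(url, { status: "running" });
  // NOTE: concatenating user input into a shell command is unsafe (see the
  // injection warning above); a real implementation must validate or escape it.
  exec("command -url".concat(url), (err, stdout, stderr) => {
    if (err || stderr) {
      results.set(url, { status: "failed", output: stderr || err.message });
    } else {
      results.set(url, { status: "done", output: stdout });
    }
  });
  res.status(200).end(); // job accepted and started
});

// Polling endpoint: the client calls this every few seconds with the same url.
app.get("/dl/result", (req, res) => {
  const entry = results.get(req.query.url);
  if (!entry) return res.status(404).end();                      // unknown url
  if (entry.status === "running") return res.status(202).end();  // not finished yet
  res.status(200).json(entry);                                   // done or failed
});

app.listen(8080);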

Nodejs http.request strange delay

I have found a strange delay in the http.request function. Here is my code:
var express = require('express');
var http = require('http');

var app = express();
app.set('port', process.env.PORT || 3000);

app.get('/aaa', function (req, res) {
  setTimeout(function () {
    res.json({"a": 1});
  }, 500);
});

app.get('/bbb', function (req, res) {
  var options = {
    host: '127.0.0.1',
    port: 3000,
    path: '/aaa',
    method: 'GET'
  };
  var request = http.request(options, function (result) {
    result.on("data", function () {
    });
    res.json({"b": 2});
  });
  request.on('error', function () {
    res.json({"b": 2});
  });
  request.end();
});

http.createServer(app).listen(app.get('port'), function () {
});
A client calls /bbb, its handler then calls /aaa, and within 500 ms the result comes back to the client.
I tried to measure response time in different situations using Apache Bench:
1) 1000 requests with 1 concurrent request.
Average response time: 500ms
2) 1000 requests with 50 concurrent requests.
Average response time: 5000ms
3) 1000 requests with 100 concurrent requests.
Average response time: 10000ms
Why is the response time growing? It's fine when I call /aaa directly.
This is not unusual behaviour. The HTTP client used in the handler for /bbb (http.request) is limited to 5 concurrent sockets per host. In other words, it can only make 5 HTTP requests in parallel. You can find a reference to this here in the documentation.
Just to confirm you're hitting the limit, run your tests with 5 and then 6 concurrent requests. You'll see (as I did) that the average response time jumps at 6 concurrent requests, because the 6th concurrent request is queued until one of the 5 preceding requests to /aaa has completed.
To answer your question about why response time grows: the more concurrency you add to your benchmark, the more the average response time goes up, because each request has to wait for more queued requests to finish before it can get a socket.
You can increase the number of concurrent sockets your HTTP client can handle by modifying the default agent like this:
var http = require("http");
http.globalAgent.maxSockets = 10;
You can also circumvent pooling altogether by passing agent:false to http.get like so:
http.get({hostname: 'localhost', port: 80, path: '/', agent: false}, function (res) {
  // Do stuff
})
Update (8th Feb 2015)
An important change regarding this answer has come up in Node v 0.12.0.
maxSockets are no longer limited to 5. The default is now set to
Infinity with the developer and the operating system given control
over how many simultaneous connections an application can keep open to
a given host.
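On newer Node versions you can also configure pooling per request by creating a dedicated Agent instead of touching the global one; a small sketch (the limit of 10 is an arbitrary example):

var http = require("http");

// Dedicated agent for calls to /aaa; maxSockets caps how many parallel
// connections this agent will open to a given host.
var aaaAgent = new http.Agent({ keepAlive: true, maxSockets: 10 });

http.get({ hostname: '127.0.0.1', port: 3000, path: '/aaa', agent: aaaAgent }, function (res) {
  res.on('data', function () {});
});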
I had the same issue, and it was resolved by keeping the request very simple, using http.get as below:
var req = http.get(requestUrl)
req.end();

Node.js domain cluster worker disconnect

Looking at the example given on the Node.js domain doc page (http://nodejs.org/api/domain.html), the recommended way to restart a worker when using cluster is to call disconnect() in the worker and listen for the disconnect event in the master. However, if you just copy/paste the example given, you will notice that the disconnect() call does not shut down the current worker:
What happens here is:
try {
  var killtimer = setTimeout(function() {
    process.exit(1);
  }, 30000);
  killtimer.unref();
  server.close();
  cluster.worker.disconnect();
  res.statusCode = 500;
  res.setHeader('content-type', 'text/plain');
  res.end('Oops, there was a problem!\n');
} catch (er2) {
  console.error('Error sending 500!', er2.stack);
}
I do a GET request to /error
A timer is started: in 30 s the process will be killed if it hasn't exited already
The http server is shut down
The worker is disconnected (but still alive)
The 500 page is displayed
I do a second GET request to /error (before the 30 s are up)
A new timer is started
The server is already closed => it throws an error
The error is caught in the "catch" block and no response is sent back to the client, so on the client side the page keeps waiting without any message.
In my opinion, it would be better to just kill the worker and listen for the 'exit' event in the master to fork again. This way, the 500 error is always sent when an error occurs:
try {
  var killtimer = setTimeout(function() {
    process.exit(1);
  }, 30000);
  killtimer.unref();
  server.close();
  res.statusCode = 500;
  res.setHeader('content-type', 'text/plain');
  res.end('Oops, there was a problem!\n');
  cluster.worker.kill();
} catch (er2) {
  console.error('Error sending 500!', er2);
}
I'm not sure about the side effects of using kill instead of disconnect, but it seems disconnect waits for the server to close; however, that does not appear to be working (at least not the way it should).
I would just like some feedback on this. There could be a good reason this example is written the way it is that I've missed.
Thanks
EDIT:
I've just checked with curl, and it works well.
However, I was previously testing with Chrome, and it seems that after receiving the 500 response, Chrome sends a second request BEFORE the server has actually finished closing.
In that case, the server is closing but not yet closed (which means the worker is also disconnecting without being disconnected), so the second request is handled by the same worker as before. As a result:
It prevents the server from finishing its close
When the second server.close(); line is evaluated, it throws an exception because the server is not running
All following requests trigger the same exception until the killtimer callback is called
I figured it out: when the server is closing and receives a request at the same time, it stops its closing process.
So it still accepts connections, but can no longer be closed.
Even without cluster, this simple example illustrates this:
var PORT = 8080;
var domain = require('domain');

var server = require('http').createServer(function(req, res) {
  var d = domain.create();
  d.on('error', function(er) {
    try {
      var killtimer = setTimeout(function() {
        process.exit(1);
      }, 30000);
      killtimer.unref();
      console.log('Trying to close the server');
      server.close(function() {
        console.log('server is closed!');
      });
      console.log('The server should not now accepts new requests, it should be in "closing state"');
      res.statusCode = 500;
      res.setHeader('content-type', 'text/plain');
      res.end('Oops, there was a problem!\n');
    } catch (er2) {
      console.error('Error sending 500!', er2);
    }
  });
  d.add(req);
  d.add(res);
  d.run(function() {
    console.log('New request at: %s', req.url);
    // error
    setTimeout(function() {
      flerb.bark();
    });
  });
});

server.listen(PORT);
Just run:
curl http://127.0.0.1:8080/ http://127.0.0.1:8080/
Output:
New request at: /
Trying to close the server
The server should not now accepts new requests, it should be in "closing state"
New request at: /
Trying to close the server
Error sending 500! [Error: Not running]
Now single request:
curl http://127.0.0.1:8080/
Output:
New request at: /
Trying to close the server
The server should not now accepts new requests, it should be in "closing state"
server is closed!
So with Chrome making one more request (for the favicon, for example), the server is not able to shut down.
For now I'll keep using worker.kill(), which makes the worker exit without waiting for the server to stop.
I ran into the same problem around 6 months ago; sadly I don't have any code to demonstrate it, as it was from my previous job. I solved it by explicitly sending a message to the worker and calling disconnect at the same time. Disconnect prevents the worker from taking on new work, and in my case, since I was tracking all the work the worker was doing (it was for an upload service that had long-running uploads), I was able to wait until all of it had finished and then exit with 0. A rough sketch of that approach is below.
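A minimal sketch of that approach, assuming the master sends a 'shutdown' message and the worker tracks its in-flight requests with a simple counter (all names here are illustrative, not taken from the original answer):

// Worker side: track in-flight requests and exit cleanly when told to shut down.
const cluster = require('cluster');
const express = require('express');

const app = express();
let inFlight = 0;
let shuttingDown = false;

app.use((req, res, next) => {
  inFlight += 1;
  res.on('finish', () => {
    inFlight -= 1;
    // If we are draining and the last request just finished, exit with 0.
    if (shuttingDown && inFlight === 0) process.exit(0);
  });
  next();
});

app.get('/upload', (req, res) => {
  // ... long-running upload work ...
  res.sendStatus(200);
});

app.listen(8080);

// The master would send something like worker.send('shutdown') and fork a
// replacement once this worker exits.
process.on('message', (msg) => {
  if (msg === 'shutdown') {
    shuttingDown = true;
    // disconnect() stops new work: it closes the worker's servers and then the IPC channel.
    cluster.worker.disconnect();
    if (inFlight === 0) process.exit(0);
  }
});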
