Node.js - Why are some of my callbacks not executing asynchronously?

Noob question on using callbacks as a control flow pattern with Node and the http class. Based on my understanding of the event loop, all code is blocking, while I/O is non-blocking and uses callbacks. Here's a simple http server and a pseudo REST function:
// Require
var http = require("http");
// Class
function REST() {};
// Methods
REST.prototype.resolve = function(request,response,callback) {
// Pseudo rest function
function callREST(request, callback) {
if (request.url == '/test/slow') {
setTimeout(function(){callback('time is 30 seconds')},30000);
} else if (request.url == '/test/foo') {
callback('bar');
}
}
// Call pseudo rest
callREST(request, callback);
}
// Class
function HTTPServer() {};
// Methods
HTTPServer.prototype.start = function() {
http.createServer(function (request, response) {
// Listeners
request.resume();
request.on("end", function () {
// Execute only if this is not a favicon request
var faviconCheck = request.url.indexOf("favicon");
if (faviconCheck < 0) {
//Print
console.log('incoming validated HTTP request: ' + request.url);
//Instantiate and execute on new REST object
var rest = new REST();
rest.resolve(request,response,function(responseMsg) {
var contentType = {'Content-Type': 'text/plain'};
response.writeHead(200, contentType); // Write response header
response.end(responseMsg); // Send response and end
console.log(request.url + ' response sent and ended');
});
} else {
response.end();
}
});
}).listen(8080);
// Print to console
console.log('HTTPServer running on 8080. PID is ' + process.pid);
}
// Process
// Create http server instance
var httpServer = new HTTPServer();
// Start
httpServer.start();
If I open up a browser and hit the server with "/test/slow" in one tab then "/test/foo" in another, I get the following behavior: "foo" responds with "bar" immediately and then, 30 seconds later, "slow" responds with "time is 30 seconds". This is what I was expecting.
But if I open up 3 tabs in a browser and hit the server with "/test/slow" successively in each tab, "slow" is being processed and responds serially/synchronously so that the 3 responses appear at 30 second intervals. I was expecting the responses right after each other if they were being processed asynchronously.
What am I doing wrong?
Thank you for your thoughts.

This is actually not the server's fault. Your browser is opening a single connection and re-using it between the requests, but one request can't begin until the previous finishes. You can see this a couple of ways:
Look in the network tab of the Chrome dev tools - the entry for the longest one will show the request in the blocking state until the first two finish.
Try opening the slow page in different browsers (or one each in normal and incognito windows) - this prevents sharing connections.
Thus, this will only happen if the same browser window is making multiple requests to the same server. Also, note that XHR (AJAX) requests will open separate connections so they can be performed in parallel. In the real world, this won't be a problem.
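One way to check the server independently of browser behavior is to issue the requests from a small Node client. The sketch below is illustrative only: it assumes a stand-in server with a 200 ms delay instead of the question's 30 seconds, and the helper names (startSlowServer, get, measureParallel) are hypothetical.

```javascript
const http = require('http');

// Stand-in for the question's slow route; 200 ms instead of 30 s (assumption, to keep the demo short).
const DELAY_MS = 200;

// Starts a server on a free local port that answers every request after DELAY_MS.
function startSlowServer() {
  return new Promise((resolve) => {
    const server = http.createServer((request, response) => {
      request.resume();
      setTimeout(() => response.end('done'), DELAY_MS);
    });
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}

// Issues one GET and resolves when the response body has been fully read.
function get(port) {
  return new Promise((resolve, reject) => {
    // agent: false gives each request its own connection instead of a pooled one.
    http.get({ host: '127.0.0.1', port, path: '/test/slow', agent: false }, (res) => {
      res.resume();
      res.on('end', resolve);
    }).on('error', reject);
  });
}

// Fires `count` requests at once and resolves with the total elapsed time in ms.
async function measureParallel(count) {
  const server = await startSlowServer();
  const start = Date.now();
  await Promise.all(Array.from({ length: count }, () => get(server.address().port)));
  const elapsed = Date.now() - start;
  server.close();
  return elapsed;
}

measureParallel(3).then((ms) => console.log('3 parallel requests took ' + ms + ' ms'));
```

If the server handles the requests concurrently, the total should be close to one delay rather than three.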

Related

Why is request.on data firing with a delay on NodeJS?

There is a simple web server that accepts data. Sample code below.
The idea is to track in real time how much data has reached the server and immediately inform the client about it. If you send a small amount of data, everything works well, but if you send more than some size X, the 'data' event on the server is triggered with a huge delay. I can see that data has already been transferring for 5 seconds, but the 'data' event has not been triggered.
The 'data' event seems to be triggered only once the data has been uploaded completely to the server, which is why it works fine with small uploads (~2..20 MB) but not with big ones (50..200 MB).
Or maybe it is due to some kind of buffering?
Do you have any suggestions as to why the 'data' event is triggered with a delay and how to fix it?
const express = require('express');
const app = express();
const port = 3000;
// PUBLIC API
// upload file
app.post('/upload', function (request, response) {
request.on('data', chunk => {
// message appears with delay
console.log('upload on data', chunk.length);
// send message to the client about chunk.length
});
response.send({
message: `Got a POST request ${request.headers['content-length']}`
});
});
app.listen(port, () => {
console.log(`Example app listening at http://localhost:${port}`);
});

TLDR:
The delay you are experiencing is probably the Queueing phase of the browser's Resource Scheduling.
The Test
I did some tests with express and found that it uses http to handle requests and responses, so I used a raw http server listener to test this scenario, which shows the same behavior.
Backend code
This code, based on the Node.js "Anatomy of an HTTP Transaction" guide, creates an http server and logs the time in 3 situations:
When a request was received
When the first data event fires
When the end event fires
const http = require('http');
var firstByte = null;
var server = http.createServer((request, response) => {
const { headers, method, url } = request;
let body = [];
request.on('error', (err) => {
}).on('data', (chunk) => {
if (!firstByte) {
firstByte = Date.now();
console.log('received first byte at: ' + Date.now());
}
}).on('end', () => {
console.log('end receive data at: ' + Date.now());
// body = Buffer.concat(body).toString();
// At this point, we have the headers, method, url and body, and can now
// do whatever we need to in order to respond to this request.
if (url === '/') {
response.statusCode = 200;
response.setHeader('Content-Type', 'text/html');
response.write('<h1>Hello World</h1>');
}
firstByte = null;
response.end();
});
console.log('received a request at: ' + Date.now());
});
server.listen(8083);
Frontend code (snippet from devtools)
This code fires an upload to /upload with some array data. I originally filled the array with random bytes, but removing them had no effect on my timing log, so the upload content is now just an array of 0's.
console.log('building data');
var view = new Uint32Array(new Array(5 * 1024 * 1024));
console.log('start sending at: ' + Date.now());
fetch("/upload", {
body: view,
method: "post"
}).then(async response => {
const text = await response.text();
console.log('got response: ' + text);
});
Now running the backend code and then running the frontend code I get some log.
Log capture (screenshots): the backend and frontend logs, and the time differences between backend and frontend.
Results
Looking at the screenshots, I see two differences between the logs:
The first, and most important, is the difference between the frontend fetch start and the backend receiving the request. I got 1613ms, which is "close" to the Resource Scheduling value (1430ms) in the network timing tab. I think more things happen between the frontend fetch call and the node backend event, so I can't directly compare the times:
log.backendReceivedRequest - log.frontEndStart
1613
The second is the time between receiving the first and the last data on the backend, which was 578ms, close to Request sent (585ms) in the network timing tab:
log.backendReceivedAllData - log.backendReceivedFirstData
578
I also changed the frontend code to send different sizes of data and the network timing tab still matches the log
What remains unknown to me is why Google Chrome is queueing my fetch, since I'm not running any other requests and not using the bandwidth of the server/host. I read the conditions for Queueing but did not find the reason; maybe it is allocating resources on disk, but I'm not sure: https://developer.chrome.com/docs/devtools/network/reference/#timing-explanation
References:
https://nodejs.org/es/docs/guides/anatomy-of-an-http-transaction/
https://developer.chrome.com/docs/devtools/network/reference/#timing-explanation

I found the problem. It was in the nginx config. Nginx was set up as a reverse proxy. By default, proxy request buffering is enabled, so nginx first reads the whole request body and only then forwards it to Node.js; that's why I saw the delay.
https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_request_buffering
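For the progress tracking the question asks about, a handler could count bytes as the 'data' events arrive and answer only after 'end'. This is a minimal sketch, not the asker's code: trackUpload and its onProgress callback are hypothetical names, and the express wiring in the comment assumes the `app` object from the question.

```javascript
// Counts bytes as they arrive and resolves with the total once the body has ended.
// `request` is any readable stream, such as the request object of an HTTP handler.
function trackUpload(request, onProgress) {
  return new Promise((resolve, reject) => {
    let received = 0;
    request.on('data', (chunk) => {
      received += chunk.length;
      onProgress(received); // called once per chunk with the running total
    });
    request.on('end', () => resolve(received));
    request.on('error', reject);
  });
}

// Possible wiring into the route from the question (express `app` assumed):
// app.post('/upload', (request, response) => {
//   trackUpload(request, (n) => console.log('received so far', n))
//     .then((total) => response.send({ message: 'Got ' + total + ' bytes' }));
// });
```

Note that the response is sent only after the whole body has been read, unlike the question's handler, which calls response.send before any data has arrived.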

How are Objects with Functions Handled in Node.js?

I am currently using Node.js to handle the back-end of my website but I am unsure of how Websockets/Objects are handled together.
This is a template I am using as an example of my main class. (Sends web-requests to a specific page)
class ViewClass {
constructor(URL, views) {
this.link = URL;
this.views = views;
this.make_requests();
}
make_requests() {
try {
const XMLHttpRequest = require("xmlhttprequest").XMLHttpRequest;
const xhr = new XMLHttpRequest();
let link = this.link;
let views = this.views;
for (let index = 1; index < views + 1; index++) {
xhr.open("GET", link, false);
xhr.onload = function (e) {
if (xhr.readyState === 4) {
if (xhr.status === 200) {
console.log("View: " + index + " Sent Successfully!");
} else {
console.error("View: " + index + " Failed!");
}
}
};
xhr.send(null);
}
} catch (error) {
console.log(error.message);
}
}
}
This is my Main Websocket File (Stripped for simplicity)
server.on('connection', function (socket) {
console.log("Welcomed Connection from: " + socket.remoteAddress);
socket.on('close', function (resp) {
console.log(`[${GetDate(3)}] Bye!`);
});
socket.on('data', function (buf) {
// Take Views/URL from Front-end.
// Initialise a new Object from ViewClass and let it run until finished.
});
});
Let's say I receive data from the WebSocket and that data creates a new ViewClass object and starts running immediately. Will that Now Running code block the input/output of the Node.js Server? Or will it be handled in the background?
If there is any information I can provide to make it clearer let me know as I am extremely new to Websocket/Js and I am more than likely missing information.

Your ViewClass code is launching views XMLHttpRequests and then doing nothing, but waiting for responses to come back. Because a regular XMLHttpRequest is asynchronous (if you don't pass false for the async flag), the server is free to do other things while the code is waiting for the XMLHttpRequest responses.
Will that Now Running code block the input/output of the Node.js Server?
No, because this is asynchronous code, it will not block the input/output of the server.
Or will it be handled in the background?
Responses themselves are not handled in the background. Nodejs runs your Javascript in a single thread (assuming there are no WorkerThreads being used which are not being used here). But, waiting for a networking response is asynchronous and is handled by native code in the event loop in the background. So, while your code is doing nothing but waiting for an event to occur, nodejs and your server is free to respond to other incoming events (such as other incoming requests).
Emergency Edit:
This code:
xhr.open("GET", link, false);
Is attempting a SYNCHRONOUS XMLHttpRequest. That's a horrible thing to do in a node.js server. That WILL block all other activity. Change the false to true to allow the xhr request to be asynchronous.
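As an illustration of the non-blocking variant, the sketch below sends each view as its own asynchronous request. It swaps in Node's built-in http module rather than the xmlhttprequest package from the question, and sendView and sendViews are hypothetical helper names.

```javascript
const http = require('http');

// Hypothetical helper: resolves with the status code of one GET request.
function sendView(link) {
  return new Promise((resolve, reject) => {
    http.get(link, (res) => {
      res.resume(); // discard the body; only the status matters here
      res.on('end', () => resolve(res.statusCode));
    }).on('error', reject);
  });
}

// Starts all requests without waiting for one another.
// The event loop stays free to serve other connections while responses are pending.
function sendViews(link, views) {
  const pending = [];
  for (let index = 1; index <= views; index++) {
    pending.push(sendView(link));
  }
  return Promise.all(pending);
}
```

Each request gets its own request object here, so one request cannot overwrite or abort another that is still in flight.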

Make HTTP requests without opening a new connection

If a browser opens a connection to a remote server, is it possible to access that same connection via Javascript?
I have a small Ethernet module on my network that I program sort of like this (pseudocode):
private var socket
while(true) {
if(socket is disconnected) {
open socket
listen on socket (port 80)
}
if(connection interrupt) {
connect socket
}
if(data receive interrupt) {
serve
}
if(disconnection interrupt) {
disconnect socket
}
}
The point is that it listens on one socket for HTTP requests and serves them.
In my web browser, I can connect to the device, making an HTTP GET request for some HTML/JS that I've written, and it works. A connection is opened on the socket and the files come back as HTTP responses.
Now I want to click a button on the webpage and have the browser send an HTTP POST request over that same connection. In my Javascript, I have (edited and formatted for clarity):
// This function sends an HTTP request
function http(type, url, data, callbacks) {
// make a new HTTP request
var request = new XMLHttpRequest();
// open a connection to the URL
request.open(type, url + (data ? "?" + data : ""));
// add headers
if(type == "POST")
request.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
// register callbacks for the request
Object.keys(callbacks).forEach(function(callback) {
request[callback] = function() {
callbacks[callback](request.responseText);
};
});
// send and return the request
request.send();
return request;
}
// Here is where I call the function
http("POST", // use POST method
"http://192.168.1.99", // IP address of the network device
dataToSend, // the data that needs to be sent
{ // callbacks
onloadend: function(data) {
console.log("success. got: " + data); // print 'success' when the request is done
},
onerror: function(data) {
console.log("There was an error."); // print 'error' when it fails
console.log(data);
}
}
);
The issue here is that this opens a new connection to the device, but I want to use the same socket that the browser is already connected to. Is this possible, and if so, how?

There is no application control inside the browser to decide if a new connection is used for the next request or if an existing connection is used. In fact, it is perfectly normal that the browser will use multiple connections in parallel to the same server and your server has to be able to deal with this.
Since your server architecture seems to be only able to deal with one connection at a time you either would need to change the architecture to handle multiple parallel connections or to make sure that you only need to handle a single connection at a time. The latter could be achieved by not supporting HTTP keep-alive, i.e. by closing the connection immediately after each response. This way a new request will result in a new connection (which is not what you wanted according to your question) but your server will also be able to handle this new connection (which is what you likely ultimately need) since the previous one was closed.
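The "no keep-alive" approach can be illustrated with a Node server, even though the device in the question is not running Node. In this minimal sketch, createOneShotServer is a hypothetical name; the server marks every response with Connection: close so that the client has to open a fresh connection for the next request.

```javascript
const http = require('http');

// Returns a server that handles one request per connection.
function createOneShotServer() {
  return http.createServer((request, response) => {
    request.resume();
    // Tell the client not to reuse this connection; Node closes the socket after the response.
    response.setHeader('Connection', 'close');
    response.end('ok');
  });
}
```

On an embedded device the equivalent would be to send the same header and then close the socket after writing the response.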

Node JS Request Library elapsedTime value

I'm new to Node and am having some difficulties with getting the Request library to return an accurate response time.
I have read the thread at nodejs request library, get the response time and can see that the request library should be able to return an "elapsed time" for the request.
I am using it in the following way :
request.get({
url : 'http://example.com',
time : true
},function(err, response){
console.log('Request time in ms', response.elapsedTime);
});
The response.elapsedTime result is in the region of 500-800ms, however I can see the request is actually taking closer to 5000ms.
I am testing this against an uncached nginx page which takes roughly 5 seconds to render the page when profiling via a browser (Chrome).
Here is an example of the timing within Chrome (although the server is under load hence the 10s)
Chrome Profiling example
It looks to me like this isn't actually timing the full start to finish of the request but is timing something else. It might be the time taken to download the page once the server starts streaming it.
If this is the case, how can I get the actual start to finish time that this request has taken ? The time I need is from making the request to receiving the entire body and headers.
I am running the request like this with listofURLs being an array of urls to request:
for (var i = 0; i < listofURLs.length; i++) {
collectSingleURL(listofURLs[i].url.toString(),
function (rData) {
console.log(rData['url']+" - "+rData['responseTime']);
});
}
function collectSingleURL(urlToCall, cb) {
var https = require('https');
var http = require('http');
https.globalAgent.maxSockets = 5;
http.globalAgent.maxSockets = 5;
var request = require('request');
var start = Date.now();
// Make the request
request.get({
"url": urlToCall,
"time": true,
headers: {"Connection": "keep-alive"}
}, function (error, response, body) {
//Check for error
if (error) {
var result = {
"errorDetected": "Yes",
"errorMsg": error,
"url": urlToCall
};
console.log('Error in collectSingleURL:', error);
// response is undefined when the request itself fails, so report the error and stop here
cb(result);
return;
}
// All Good - pass the relevant data back to the callback
var result = {
"url": urlToCall,
"timeDate": response.headers['date'],
"responseCode": response.statusCode,
"responseMessage": response.statusMessage,
"cacheStatus": response.headers['x-magento-cache-debug'],
"fullHeaders": response.headers,
"bodyHTML": body,
"responseTime" : Date.now() - start
};
cb(result);
//console.log (cb);
});
}

You are missing a key point: it takes 5 seconds to render the page, not just to download it.
The request module of node is not a full browser; it makes a simple HTTP request. So when you request, for example, www.stackoverflow.com, it will only load the basic HTML returned by the page. It will not load the JS files, CSS files, images etc.
The browser, on the other hand, will load all of that after the basic HTML of the page is loaded (some parts will load before the page has finished loading, together with the page).
Take a look at the network profiling of stackoverflow below: the render finishes at ~1.6 seconds, but the basic HTML page (the upper bar) has finished loading at around 0.5 seconds. So if you use request to fetch a web page, it is actually only loading the HTML, meaning "the upper bar".

Just time it yourself:
var start = Date.now()
request.get({
url : 'http://example.com'
}, function (err, response) {
console.log('Request time in ms', Date.now() - start);
});
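To see where the time goes, it can help to record both when the response headers arrive and when the body ends. The sketch below uses Node's built-in http module instead of the request library, and timeRequest is a hypothetical helper name.

```javascript
const http = require('http');

// Hypothetical helper: times a GET from the moment it is issued.
// Resolves with the time until headers arrived, the time until the body ended, and the body size.
function timeRequest(url) {
  return new Promise((resolve, reject) => {
    const start = Date.now();
    http.get(url, (res) => {
      const headersMs = Date.now() - start; // response headers have arrived
      let bytes = 0;
      res.on('data', (chunk) => { bytes += chunk.length; });
      res.on('end', () => resolve({ headersMs, totalMs: Date.now() - start, bytes }));
    }).on('error', reject);
  });
}
```

A large gap between headersMs and totalMs points at the body download; a large headersMs points at the server taking time before it starts to answer.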

Lots of parallel http requests in node.js

I've created a node.js script that scans the network for available HTTP pages, so there are a lot of connections I want to run in parallel, but it seems that some of the requests wait for previous ones to complete.
Following is the code fragment:
var reply = { };
reply.started = new Date().getTime();
var req = http.request(options, function(res) {
reply.status = res.statusCode;
reply.rawHeaders = res.headers;
reply.headers = JSON.stringify(res.headers);
reply.body = '';
res.setEncoding('utf8');
res.on('data', function (chunk) {
reply.body += chunk;
});
res.on('end', function () {
reply.finished = new Date().getTime();
reply.time = reply.finished - reply.started;
callback(reply);
});
});
req.on('error', function(e) {
if(e.message == 'socket hang up') {
return;
}
errCallback(e.message);
});
req.end();
This code performs only 10-20 requests per second, but I need 500-1k requests per second. Every queued request is made to a different HTTP server.
I've tried to do something like that, but it didn't help:
http.globalAgent.maxSockets = 500;

Something else must be going on with your code. Node can comfortably handle 1k+ requests per second.
I tested with the following simple code:
var http = require('http');
var results = [];
var j=0;
// Make 1000 parallel requests:
for (var i=0;i<1000;i++) {
http.request({
host:'127.0.0.1',
path:'/'
},function(res){
results.push(res.statusCode);
j++;
if (j==i) { // last request
console.log(JSON.stringify(results));
}
}).end();
}
To purely test what node is capable of, and not my home broadband connection, the code requests from a local Nginx server. I also avoid console.log until all the requests have returned because it is implemented as a synchronous function (to avoid losing debugging messages when a program crashes).
Running the code using time I get the following results:
real 0m1.093s
user 0m0.595s
sys 0m0.154s
That's 1.093 seconds for 1000 requests which makes it very close to 1k requests per second.
The simple code above will generate OS errors if you try to make a lot of requests (like 10000 or more) because node will happily try to open all those sockets in the for loop (remember: the requests don't start until the for loop ends, they are only created). You mentioned that your solution also runs into the same errors. To avoid this you should limit the number of parallel requests you make.
The simplest way of limiting the number of parallel requests is to use one of the Limit functions from the async.js library:
var http = require('http');
var async = require('async');
var requests = [];
// Build a large list of requests:
for (var i=0;i<10000;i++) {
requests.push(function(callback){
http.request({
host:'127.0.0.1',
path:'/'
},function(res){
callback(null,res.statusCode);
}).end()
});
}
// Make the requests, 100 at a time
async.parallelLimit(requests, 100,function(err, results){
console.log(JSON.stringify(results));
});
Running this with time on my machine I get:
real 0m8.882s
user 0m4.036s
sys 0m1.569s
So that's 10k requests in around 9 seconds, or roughly 1.1k/s.
Look at the functions available from async.js.
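The same limiting idea can be sketched without a library by running a fixed number of workers that pull tasks from a shared list. This is an illustrative alternative, not how async.js is implemented; runLimited is a hypothetical name, and each task is assumed to be a function returning a promise.

```javascript
// Runs promise-returning task functions with at most `limit` in flight at once.
// Resolves with the results in the same order as `tasks`.
function runLimited(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started yet

  // Each worker keeps taking the next unstarted task until none are left.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workers = [];
  for (let i = 0; i < Math.min(limit, tasks.length); i++) {
    workers.push(worker());
  }
  return Promise.all(workers).then(() => results);
}
```

If any task rejects, the returned promise rejects as well, so real scanning code would probably want each task to catch its own errors and return a marker value instead.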

I've found a solution that works for me. It is not very good, but it works:
var childProcess = require('child_process');
I'm using curl:
childProcess.exec('curl --max-time 20 --connect-timeout 10 -iSs "' + options.url + '"', function (error, stdout, stderr) { });
This allows me to run 800-1000 curl processes simultaneously. Of course, this solution has its weaknesses, like requiring lots of open file descriptors, but it works.
I've tried node-curl bindings, but that was very slow too.
