I'm creating a reverse HTTP proxy using Node.js for fun. The code is pretty simple at the moment. It listens on 127.0.0.1:8080 for HTTP requests and forwards them to hostname.com; responses from hostname.com are then forwarded back to the client. Nothing fancy is done yet, such as rewriting redirect headers. The code is as follows:
var http = require('http');

var server = http.createServer(function(request, response) {
  var proxy = http.createClient(8080, 'hostname.com');
  var proxyRequest = proxy.request(request.method, request.url, request.headers);

  proxyRequest.on('response', function(proxyResponse) {
    proxyResponse.on('data', function(chunk) {
      response.write(chunk, 'binary');
    });
    proxyResponse.on('end', function() {
      response.end();
    });
    response.writeHead(proxyResponse.statusCode, proxyResponse.headers);
  });

  request.on('data', function(chunk) {
    proxyRequest.write(chunk, 'binary');
  });
  request.on('end', function() {
    proxyRequest.end();
  });

  proxyRequest.on('close', function(err) {
    if (err) {
      console.log('close error: ' + err + ' for ' + request.url);
    }
  });
});

server.listen(8080);

server.on('clientError', function(exception) {
  console.log('boo a clientError occurred :(');
});
All appears to work well until I browse to a page that requires many additional resources (such as images) to be fetched. Naturally the browser will generate a number of GET requests to the reverse proxy to fetch these additional resources.
When I do browse to such a page, some of the http.ServerRequests for the additional resources never receive responses. If I reload the page it almost always succeeds, because all the resources that were successfully fetched on the first attempt are now cached (hence the browser doesn't try to GET them again), so the browser only needs to grab the few missing ones.
At a guess I would imagine I'm hitting some kind of connection limit, although I'm not sure. Any help would be greatly appreciated!
If you set up Wireshark on the proxy, you'll almost certainly see what's happening. (Note that you may need a second machine for this, because some TCP/IP stacks don't provide anything that Wireshark can listen on for loopback traffic - see this)
I'm almost certain that the problem(s) you are running into here are all down to the Connection: header - proxies MUST parse this header and handle it correctly. At a guess, I would say your code is handling the first request in a Connection: keep-alive stream and ignoring the rest. As a proxy, you are supposed to parse and remove/replace this header, and any associated headers (in this case the Keep-Alive: header), before forwarding the request to the server.
If you want to build an HTTP/1.1 proxy, it's very important that you read RFC 2616 and adhere to the many, many rules that it places on proxy behaviour. The particular problem you are running into here is documented in section 14.10.
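For illustration only, here is a rough sketch of that header handling, using http.request rather than the deprecated http.createClient and assuming the upstream listens on port 80. It strips the standard hop-by-hop request headers before forwarding; a complete proxy would also remove any header named in the incoming Connection: header's value and filter the response headers the same way.

var http = require('http');

// Hop-by-hop headers a proxy must not blindly forward (see RFC 2616 section 14.10).
var hopByHop = ['connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
                'te', 'trailers', 'transfer-encoding', 'upgrade'];

http.createServer(function(request, response) {
  // Copy the incoming headers, dropping the hop-by-hop ones.
  var headers = {};
  Object.keys(request.headers).forEach(function(name) {
    if (hopByHop.indexOf(name.toLowerCase()) === -1) {
      headers[name] = request.headers[name];
    }
  });

  var proxyRequest = http.request({
    host: 'hostname.com',   // upstream host from the question
    port: 80,               // assumed; change to wherever the upstream really listens
    method: request.method,
    path: request.url,
    headers: headers
  }, function(proxyResponse) {
    response.writeHead(proxyResponse.statusCode, proxyResponse.headers);
    proxyResponse.pipe(response);
  });

  request.pipe(proxyRequest);
}).listen(8080);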
Related
I am writing a bare-minimal Node.js server to understand how it works. From what I can tell, this line:
res.writeHead(200);
does nothing. If I remove it I get the exact same behavior from the browser. Does it have any purpose? Is it OK to remove it?
// https://nodejs.dev/learn/the-nodejs-http-module
const http = require('http');

const server = http.createServer(handler);
server.listen(3000, () => {
  console.log('node: http server: listening on port: ' + 3000);
});

function handler(req, res) {
  res.writeHead(200);
  res.end('hello world\n');
}
Is it somehow related to HTTP headers?
The default HTTP status is 200, so you do not have to tell the response object that the status is 200; if you don't set it, the response will automatically be 200. You can remove res.writeHead(200); and similarly, you don't need the Express version of that, which would be res.status(200).
The other thing that res.writeHead(200) does is cause the headers to be written out in the response stream at that point. You also don't need to call that yourself, because when you do res.send(...), the headers will automatically be sent first (if they haven't already been sent). In fact, the res.headersSent property keeps track of whether the headers have been sent yet. If not, they are sent as soon as you call any method that starts sending the body (like res.send() or res.write(), etc...).
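As a minimal sketch with the plain http module (not Express, and with arbitrarily chosen ports), both servers below produce the same 200 response; the second one just lets Node fill in the default status when the body is first written:

const http = require('http');

// Explicit status line.
http.createServer((req, res) => {
  res.writeHead(200);
  res.end('hello world\n');
}).listen(3001);

// No writeHead(): the status defaults to 200 and the headers are flushed
// automatically when res.end() starts sending the body.
http.createServer((req, res) => {
  console.log(res.headersSent); // false - nothing sent yet
  res.end('hello world\n');
  console.log(res.headersSent); // true - headers went out with the body
}).listen(3002);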
Is it OK to remove it?
Yes, in this context it is OK to remove it.
New to Node.js, I understand that createReadStream() is better for performance than readFile(), because createReadStream() reads and writes data in chunks while readFile() first reads the whole content. Thus, if the file is large, readFile() might take longer before the data can be processed further. So I chose to create the server using the createReadStream() function, as follows.
// Create a server with fs.createReadStream(), better performance and less memory usage.
var http = require('http');
var url = require('url');
var fs = require('fs');

http.createServer(function (request, response) {
  // Parse the request containing the file name.
  var pathname = url.parse(request.url).pathname;

  // Create a readable stream.
  var readerStream = fs.createReadStream(pathname.substr(1));

  // Set the encoding to be UTF8.
  readerStream.setEncoding('UTF8');

  // Handle stream events --> data, end and error
  readerStream.on('data', function(chunk) {
    // Page found
    // HTTP Status: 200 : OK
    // Content Type: text/html
    response.writeHead(200, {'Content-type': 'text/html'});
    // Write the content of the file to the response body.
    response.write(chunk);
    console.log('Page is being streamed...');
  });

  readerStream.on('end', function() {
    console.log('Page is streamed and emitted successfully.');
  });

  readerStream.on('error', function(err) {
    // HTTP Status: 404 : NOT FOUND
    // Content Type: text/html
    response.writeHead(404, {'Content-type': 'text/html'});
    console.log('Page streaming error: ' + err);
  });

  console.log('Code ends!');
}).listen(8081);

// Console will print the message
console.log('Server running at http://127.0.0.1:8081/');
My .html or .txt file contains three short lines of text. After starting my server, I visit my web page by going to http://127.0.0.1:8081/index.html. Everything works fine, and the content of index.html is displayed in the browser.
But on the browser tab, the loader icon keeps spinning, as if the page is still loading, for about a minute.
Is that normal with a Node.js server? Does the icon just keep spinning but cost the server nothing? Or am I missing something, and the icon is not supposed to keep spinning?
It doesn't look like you are ending your response. The browser probably thinks the request isn't finished and thus continues to "load".
If you look at the Network tab in the developer console you might see the request hasn't finished.
You should be calling response.end()
This method signals to the server that all of the response headers and body have been sent; that server should consider this message complete. The method, response.end(), MUST be called on each response.
I believe you should be calling response.end() in both the readerStream.on('end') and readerStream.on('error') callbacks, after you write the head. This will tell the browser the request is finished and it can stop the loading action.
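Concretely, a sketch of how those two callbacks might look with the response ended (the rest of the question's code is unchanged):

readerStream.on('end', function() {
  console.log('Page is streamed and emitted successfully.');
  response.end(); // tells the browser the response is complete
});

readerStream.on('error', function(err) {
  // HTTP Status: 404 : NOT FOUND
  response.writeHead(404, {'Content-type': 'text/html'});
  console.log('Page streaming error: ' + err);
  response.end(); // end here too, or the tab keeps spinning on errors
});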
I send JSON requests one by one to the Node.js server. After the 6th request, the server can't reply to the client immediately; it takes a while (15 seconds or a bit more) before it sends back a 200 OK. Each request writes a JSON value into MongoDB, and response time matters for these REST calls. How can I find the cause of the delay? (Which tool or script could help me?) My server-side code is like this:
var controlPathDatabaseSave = "/save";

app.use('/', function(req, res) {
  console.log("req body app use", req.body);
  var str = req.path;

  if (str.localeCompare(controlPathDatabaseSave) == 0) {
    console.log("controlPathDatabaseSave");
    mongoDbHandleSave(req.body);
    res.setHeader('Content-Type', 'application/json');
    res.write('Message taken: \n');
    res.write('Everything all right with database saving');
    res.send("OK");
    console.log("response body", res.body);
  }
});
My client-side code is as below:
function saveDatabaseData()
{
  console.log("saveDatabaseData");
  var oReq = new XMLHttpRequest();
  oReq.open("POST", "http://192.168.80.143:2800/save", true);
  oReq.setRequestHeader("Content-type", "application/json;charset=UTF-8");
  oReq.onreadystatechange = function() { // Call a function when the state changes.
    if (oReq.readyState == 4 && oReq.status == 200) {
      console.log("http responseText", oReq.responseText);
    }
  };
  oReq.send(JSON.stringify({links: links, nodes: nodes}));
}
-- MongoDB save code
function mongoDbHandleSave(reqParam) {
  // Connect to the db
  MongoClient.connect(MongoDBURL, function(err, db) {
    if (!err) {
      console.log("We are connected in accordance with saving");
    } else {
      return console.dir(err);
    }

    /*
    db.createCollection('user', {strict:true}, function(err, collection) {
      if (err)
        return console.dir(err);
    });
    */

    var collection = db.collection('user');
    // when saving into database only use req.body. Skip JSON.stringify() function
    var doc = reqParam;
    collection.update(doc, doc, {upsert: true});
  });
}
You can see my REST calls in the Google Chrome developer tools. (The first six calls get 200 OK; the last one stays in a pending state.)
--Client output
--Server output
Thanks in advance,
Since it looks like these are Ajax requests from a browser, each browser has a limit on the number of simultaneous connections it will allow to the same host. Browsers have varied that setting over time, but it is likely in the 4-6 range. So, if you are trying to run 6 simultaneous ajax calls to the same host, then you may be running into that limit. What the browser does is hold off on sending the latest ones until the first ones finish (thus avoiding sending too many at once).
The general idea here is to protect servers from getting beat up too much by one single client and thus allow the load to be shared across many clients more fairly. Of course, if your server has nothing else to do, it doesn't really need protecting from a few more connections, but this isn't an interactive system; it's just hard-wired to a limit.
If there are any other requests in process (loading images or scripts or CSS stylesheets) to the same origin, those will count toward the limit too.
If you run this in Chrome and you open the Network tab of the debugger, you can actually see on the timeline exactly when a given request was sent and when its response was received. This should show you immediately whether the later requests are being held up at the browser or at the server.
Here's an article on the topic: Maximum concurrent connections to the same domain for browsers.
Also, keep in mind that, depending upon what your requests do on the server and how the server is structured, there may be a maximum number of server requests that can be efficiently processed at once. For example, if you had a blocking, threaded server that was configured with one thread for each of four CPUs, then once the server has four requests going at once, it may have to queue the fifth request until the first one is done, causing it to be delayed more than the others.
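If you also want to confirm this from the server side, a crude sketch is to timestamp each request as it arrives and when its response finishes. The middleware below is hypothetical and assumes the Express app from the question (drop it in before the other app.use() calls):

// Hypothetical timing middleware: logs when each request arrives and how long it took.
app.use(function(req, res, next) {
  var start = Date.now();
  console.log(new Date().toISOString(), 'received', req.method, req.path);
  res.on('finish', function() {
    console.log(new Date().toISOString(), 'finished', req.method, req.path,
                'in', (Date.now() - start) + 'ms');
  });
  next();
});

If a request only shows up here long after the browser issued it, the browser was queuing it; if it shows up promptly but takes a long time to finish, the delay is on the server.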
If you look at the answer by Casey Chu (answered Nov 30 '10) in this question: How do you extract POST data in Node.js?
You will see that he is responding to 'data' events to construct the body of the request. Reproducing the code here:
var qs = require('querystring');

function (request, response) {
  if (request.method == 'POST') {
    var body = '';
    request.on('data', function (data) {
      body += data;
      // Too much POST data, kill the connection!
      if (body.length > 1e6)
        request.connection.destroy();
    });
    request.on('end', function () {
      var post = qs.parse(body);
      // use post['blah'], etc.
    });
  }
}
Suppose I don't care about POST requests, and hence never check whether a request is a POST or create a 'data' event handler. Is there a risk that someone can block my thread by sending a really large POST request? For example, instead of the above code, what if I just did:
function hearStory(request, response) {
  response.writeHead(200, {"Content-Type": "text/plain"});
  response.write("Cool story bro!");
  response.end();
}
What happens to really large POST requests then? Does the server just ignore the body? Is there any risk to this approach? GET requests, including their headers, must be less than 80 kB, so it seems like a simple way to avoid flooding my server.
Hopefully these kinds of attacks can be detected and averted before they ever get to your server, via a firewall or something else. You shouldn't handle DOS attacks with the server itself. However, if they've gotten to your server with malicious intent, there needs to be a way to handle it. If you intend on handling POST requests, the code you're referring to will help.
If you just want to avoid POST requests altogether and not listen for them, as in the second code snippet, you could do something like the following.
function denyPost(req, res) {
  if (req.method == 'POST') {
    console.log('POST denied...'); // this is optional.
    req.connection.destroy(); // this kills the connection.
  }
}
Of course, this won't work if you plan on handling POST requests somehow. But again, DOS attacks need to be handled before they ever get to your server. If they've gotten there, they've already won.
I'm trying to do a simple connection (request - response) from the JavaScript code on a web page to a server in Node.js.
I have tried to make the request as follows:
var request = new XMLHttpRequest();
request.open('GET', 'http://localhost:4444/', false);
request.send();

if (request.status === 200) {
  console.log(request.responseText);
}
Running this code, I get an error in Firebug.
I have continued searching and found that this method only works for GET requests on the same domain. To make cross-domain requests we must use other strategies.
I found a jQuery method, and it seems that I'm on the right track:
$.get(
  'http://localhost:4444/',
  function(data) {
    alert("success");
    // Do anything with "data"
  }
);
In this case I get the same response without the error.
It seems to work, but the "alert" message is never shown! What is happening? What am I doing wrong?
The Node.js server code is:
var http = require("http");
http.createServer(function(request, response) {
response.writeHead(200, {"Content-Type": "text/html"});
response.write("Response");
response.end();
}).listen(4444);
So you're running into cross domain issues. You have a few options:
1) Since you're using Node, use socket.io. It's cross-domain compliant.
On the client:
<script src="Full path to where socket.io is held on your server/socket.io.js"></script>
<script>
  var socket = io.connect();

  socket.on('some_callback', function(data) {
    // receive data
  });

  socket.emit('some_other_callback', {'data': value}); // send data
</script>
Server:
var io = require('socket.io').listen(server);

// define interactions with client
io.sockets.on('connection', function(socket) {
  // send data to client
  socket.emit('some_callback', {'data': value});

  // receive client data
  socket.on('some_other_callback', function(data) {
    // do something
  });
});
2) Since you just want to use GET you can use JSONP
$.getJSON('url_to_your_domain.com/?callback=?&other_data=something',
  function(data) {
    // do something
  }
);
Here we pass your normal GET params as well as callback=?. You will return the following from your server:
var url = require('url');

// inside your request handler:
var r = url.parse(req.url, true);
// wrap the JSON payload in the callback name the client supplied
res.end(r.query.callback + '(' + JSON.stringify(someJson) + ')'); // someJson: whatever data you want to return
3) If you don't care about supporting all browsers, you can use CORS:
You can see a much better example than I would be able to write here.
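For completeness, a minimal sketch of the CORS route on a plain Node server, assuming you are happy to allow any origin; the client-side XMLHttpRequest/$.get code from the question can then stay as it is:

var http = require("http");

http.createServer(function(request, response) {
  response.writeHead(200, {
    "Content-Type": "text/html",
    // allow any origin to read this response; lock this down in real code
    "Access-Control-Allow-Origin": "*"
  });
  response.write("Response");
  response.end();
}).listen(4444);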
Cross domain ajax requires special support from your server.
Either CORS: http://en.wikipedia.org/wiki/Cross-origin_resource_sharing
Which not all browsers support yet. It involves special headers in both the request and response that tell the browser that one domain is allowed to communicate with the other, and for what data.
Or JSONP: http://en.wikipedia.org/wiki/JSONP
Which will work anywhere, but has some implementation limitations. It involves the server wrapping the response in a JavaScript function callback that will execute and pass in the data you want.
Either way, the server needs to be setup for each of these approaches.
I think your problem is the Same Origin Policy: your browser must get the web page from the Node.js instance itself.
Otherwise, you must use something like CORS. There is also a good question on SO: Ways to circumvent the same-origin policy.