I'm trying to use streams in Node.js to basically build a running buffer of HTTP data until some processing is done, but I'm struggling with the specifics of streams. Some pseudocode will probably help:
var server = http.createServer(function(request, response) {
    // Create a buffer stream to hold data generated by the asynchronous process
    // to be piped to the response after headers and other obvious response data
    var buffer = new http.ServerResponse();

    // Start the computation of the full response as soon as possible, passing
    // in the buffer stream to hold returned data until headers are written
    beginAsyncProcess(request, buffer);

    // Send headers and other static data while waiting for the full response
    // to be generated by 'beginAsyncProcess'
    sendObviousData(response, function() {
        // Once obvious data is written (unfortunately HTTP and Node.js have
        // certain requirements for the order data must be written in) then pipe
        // the stream with the data from 'beginAsyncProcess' into the response
        buffer.pipe(response);
    });
});
Most of this is almost legitimate code, but it doesn't work. The basic issue is figuring out a way to take advantage of the asynchronous nature of Node.js when there are certain order requirements associated with HTTP requests, namely that headers must always be written first.
While I would definitely appreciate any answers with little hacks to get around the order problem without directly addressing streams, I wanted to use the opportunity to get to know them better. There are plenty of similar situations, but this scenario is more to open the can of worms than anything else.
Let's make use of callbacks and streams in Node.js along with the .pause() / .resume() stream functions:
var server = http.createServer(function(request, response) {
    // Handle the request first, then..

    var body = new Stream(); // <-- you can implement stream.Duplex for read / write operations

    body.on('open', function() {
        body.pause();
        // API generates data
        // body.write( generated data ) <-- write to the stream
        body.resume();
    });

    var firstPartOfThePage = getHTMLSomeHow();

    response.writeHead(200, { 'Content-Type': 'text/html' });

    response.write(firstPartOfThePage, function() { // <-- callback after sending first part, our body already being processed
        body.pipe(response); // <-- This should fire after being resumed
        body.on('end', function() {
            response.end(); // <-- end the response
        });
    });
});
Check this: http://codewinds.com/blog/2013-08-31-nodejs-duplex-streams.html for custom duplex stream creation.
Note: this is still pseudocode.
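If you want something closer to runnable code, a built-in stream.PassThrough can stand in for the hand-rolled buffer stream. A minimal sketch, where beginAsyncProcess and sendObviousData are made-up placeholders standing in for the asker's functions:

var http = require('http');
var stream = require('stream');

// Hypothetical async producer: writes its output into the supplied stream.
function beginAsyncProcess(request, sink) {
    setTimeout(function() {
        sink.write('<p>slowly computed body</p>');
        sink.end();
    }, 500);
}

// Hypothetical "obvious data" writer: sends headers and static markup first.
function sendObviousData(response, done) {
    response.writeHead(200, { 'Content-Type': 'text/html' });
    response.write('<html><body>', done);
}

var server = http.createServer(function(request, response) {
    // A PassThrough buffers whatever is written to it until something reads it,
    // so the async process can start producing data before the headers go out.
    var buffer = new stream.PassThrough();
    beginAsyncProcess(request, buffer);

    sendObviousData(response, function() {
        // Headers and static data are out; drain the buffer into the response.
        buffer.pipe(response, { end: false });
        buffer.on('end', function() {
            response.end('</body></html>');
        });
    });
});

server.listen(3000);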
I have a node.js process that uses a large number of client requests to pull information from a website. I am using the request package (https://www.npmjs.com/package/request) since, as it says: "It supports HTTPS and follows redirects by default."
My problem is that after a certain period of time, the requests begin to hang. I haven't been able to determine if this is because the server is returning an infinite data stream, or if something else is going on. I've set the timeout, but after some number of successful requests, some of them eventually get stuck and never complete.
var options = { url: 'some url', timeout: 60000 };

request(options, function (err, response, body) {
    // process
});
My questions are: can I shut down a connection after a certain amount of data has been received using this library, and can I stop the request from hanging? Do I need to use the http/https libraries and handle the redirects and protocol switching myself in order to get the kind of control I need? If I do, is there a standardized practice for that?
Edit: Also, if I stop the process and restart it, the requests pick right back up and start working, so I don't think the problem is related to the server or the machine the code is running on.
Note that with request(options, callback), the callback is only fired once the request has fully completed, and there is no way to break off the request partway through.
You should listen on the 'data' event instead:
var request = require('request');

var stream = request(options);
var len = 0;

stream.on('data', function(data) {
    // TODO: process your data here

    // abort the request once more than 1000 bytes have been received
    len += Buffer.byteLength(data);
    if (len > 1000) {
        stream.abort();
    }
});
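To guard against the hang itself (rather than just an oversized response), one approach is to pair the library's timeout option with your own deadline timer and call .abort() when it fires. A rough sketch, assuming a 90-second hard limit on the whole response:

var request = require('request');

var options = { url: 'some url', timeout: 60000 };
var req = request(options);

// Hard deadline: if the response has not finished within 90 seconds,
// abort it even if data is still trickling in.
var deadline = setTimeout(function() {
    req.abort();
}, 90000);

req.on('data', function(chunk) {
    // process chunks as they arrive
});

req.on('end', function() {
    clearTimeout(deadline);
});

req.on('error', function(err) {
    // aborted or failed requests surface here
    clearTimeout(deadline);
});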
I might be out of my depth, but I really need something to work. I think a write/read stream will solve both my issues, but I don't quite understand the syntax or what's required for it to work.
I read the stream handbook and thought I understood some of the basics, but when I try to apply it to my situation, it seems to break down.
Currently I have this as the crux of my information.
function readDataTop(x) {
    console.log("Read " + x[6] + " and Sent Cached Top Half");
    jf.readFile("loadedreports/top" + x[6], 'utf8', function(err, data) {
        resT = data;
    });
};
I'm using the jsonfile plugin for Node, which basically shortens fs.write and fs.read and makes them easier to use instead of constantly wrapping the writes and reads in try/catch blocks.
Anyway, I want to implement a stream here, but I am unsure of what would happen on my Express end and how the object would be received.
I assume that, since it's a stream, Express won't do anything with the object until it receives it? Or would I have to write a callback to make sure that, when my function is called, the stream is complete before Express sends the object off to fulfill the AJAX request?
app.get('/:report/top', function(req, res) {
    readDataTop(global[req.params.report]);
    res.header("Content-Type", "application/json; charset=utf-8");
    res.header("Cache-Control", "max-age=3600");
    res.json(resT);
    resT = 0;
});
I am hoping that if I change the read part to a stream it will alleviate two problems. The first is sometimes receiving partial JSON files when the browser makes the AJAX call, due to the read speed of larger JSON objects. (This might be the callback issue I need to solve, but a stream should make it more consistent.)
Secondly, when I load this Node app, it needs to run 30+ file writes while it gets the data from my DB. The goal was to disconnect the browser from the DB side so Node acts as the DB by reading and writing. This is because an old SQL server is already being bombarded by a lot of requests (stale data isn't an issue).
Any help on the syntax here?
Is there a tutorial I can see in code of someone piping a response into a write stream? (The mssql Node module I use puts the SQL response into an object, and I need it in JSON format.)
function getDataTop(x) {
    var connection = new sql.Connection(config, function(err) {
        var request = new sql.Request(connection);
        request.query(x[0], function(err, topres) {
            jf.writeFile("loadedreports/top" + x[6], topres, function(err) {
                if (err) {
                    console.log(err);
                } else {
                    console.log(x[6] + " top half was saved!");
                }
            });
        });
    });
};
Your problem is that you're not waiting for the file to load before sending the response. Use a callback:
function readDataTop(x, cb) {
    console.log('Read ' + x[6] + ' and Sent Cached Top Half');
    jf.readFile('loadedreports/top' + x[6], 'utf8', cb);
}

// ...

app.get('/:report/top', function(req, res) {
    // you should really avoid using globals like this ...
    readDataTop(global[req.params.report], function(err, obj) {
        // setting the content-type is automatically done by `res.json()`
        // cache the data here in-memory if you need to and check for its existence
        // before `readDataTop`
        res.header('Cache-Control', 'max-age=3600');
        res.json(obj);
    });
});
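If you later want to avoid holding the whole report in memory at all, you could also stream the cached JSON file straight into the response. A minimal sketch, assuming the cached file name matches the report parameter and already contains valid JSON:

var fs = require('fs');

app.get('/:report/top', function(req, res) {
    res.header('Content-Type', 'application/json; charset=utf-8');
    res.header('Cache-Control', 'max-age=3600');

    // Pipe the cached file directly to the response so large reports
    // never have to be parsed into a single in-memory object here.
    var fileStream = fs.createReadStream('loadedreports/top' + req.params.report);
    fileStream.on('error', function(err) {
        res.status(500).end();
    });
    fileStream.pipe(res);
});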
Sockets, unlike HTTP, don't have anything like req and res; it is always like:
client.on('data', function(data)...
The event fires whenever there is data on the stream.
Now I want to do server-to-server communication. I'm writing a game where I am going to have a main server, and this main server communicates with the game's desktop client.
One server is a world server and the other is a login server. The client connects directly to the world server, and if the data is login data, then the world server passes it to the login server.
But I can't wrap my head around how to do this in Node. As a former web dev I can only think of:
login.send(dataToSendToOtherServer, function(responseOfOtherServer) {
    if (responseOfOtherServer === 1)
        client.write(thisDataIsGoingToTheDesktopClient)
})
So how can I do something like this for the sockets in node.js?
I tried something like:
Client.prototype.send = function(data, cb) {
    // convert json to string
    var obj = JSON.stringify(data)
    this.client.write(obj)

    // wait for the response of this request
    this.client.on('data', function(req) {
        var request = JSON.parse(req)
        // return response as callback
        if (data.type === request.type) cb(request)
    })
}
But with this, every call to send registers another 'data' listener that is never removed, so with every request the response callback fires one extra time.
Since you're dealing with plain TCP/IP, you need to come up with your own higher-level protocol to specify things like how to determine when a message is complete (since TCP offers no guarantee it will all arrive in one gulp). Common ways of dealing with this are:
Fixed-length messages: buffer up received data until it's the right length.
Prefixing each message with a length count: buffer up received data until the specified length has been reached.
Designating some character or sequence as an end-of-message indicator: buffer up received data until it ends with that sequence/character.
In your case, you could buffer up received data until JSON.parse succeeds on the accumulated data, assuming each message consists of legal JSON.
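A minimal sketch of the delimiter approach (a newline as the end-of-message indicator), which also fixes the "+1" problem by registering the 'data' listener once per connection instead of once per send. It assumes the login server answers requests in the same order they were sent:

var net = require('net');

function Client(host, port) {
    this.pending = [];              // callbacks waiting for a response, in order
    this.buffer = '';
    this.client = net.connect(port, host);

    var self = this;
    // Register the 'data' listener ONCE, not once per send().
    this.client.on('data', function(chunk) {
        self.buffer += chunk.toString('utf8');

        // Split out every complete, newline-terminated message.
        var lines = self.buffer.split('\n');
        self.buffer = lines.pop();  // keep the trailing partial message, if any

        lines.forEach(function(line) {
            if (!line) return;
            var message = JSON.parse(line);
            var cb = self.pending.shift();
            if (cb) cb(message);
        });
    });
}

Client.prototype.send = function(data, cb) {
    this.pending.push(cb);
    // Terminate each message with a newline so the other side knows where it ends.
    this.client.write(JSON.stringify(data) + '\n');
};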
Currently I have a problem displaying 'chunks' of responses that I am sending from my Web Service Node.js server (localhost:3000) to a simulated client running on a Node.js server (localhost:3001).
Edit: the current implementation just uses Angular's $http as the transport, without WebSockets.
The logic goes as follows:
1. Create an array of 'Cities' on the client side and POST it (from the AngularJS controller) to the Web Service located at localhost:3000/getMatrix:
$http({
    method: 'POST',
    url: 'http://localhost:3000/getMatrix',
    data: cityArray
}).
success(function (data, status, headers, config) {
    // binding of $scope variables

    // calling a local MongoDB to store each data item received
    for (var key in data) {
        $http.post('/saveRoutes', data[key])
            .success(function (data, status) {
                // Data stored
            })
            .error(function (data, status) {
                // error prints in console
            });
    }
}).
error(function (data, status, headers, config) {
    alert("Something went wrong!!");
});
2. The Web Service then runs through its process to build a matrix of 'Cities' (e.g. if it was passed 5 cities, it would return a JSON matrix of 5x5 [25 items]). But the catch is that it passes back the data in 'chunks' thanks to Node's response.write(data).
Side note: Node.js automatically sets 'Transfer-Encoding': 'chunked' in the header.
// * Other code before (routing/variable creation/etc.) *

res.set({
    'Content-Type': 'application/json; charset=utf-8'
});
res.write("[\n");

// * Other code to process loops and pass arguments *

// query.findOne to MongoDB and if there are no errors
res.write(JSON.stringify(docs) + ",\n");

// * insert more code to run loops to write more chunks *

// at the end of all loops
res.end("]");
// Final JSON looks like this:
[
{ *data* : *data* },
{ *data* : *data* },
......
{ *data* : *data* }
]
Currently the problem is not that the 'chunked' response is not reaching its destination, but that I do not know of a way to start processing the data as soon as the chunks come in.
This is a problem since I am trying to do a matrix of 250x250 and waiting for the full response overloads Angular's ability to display the results as it tries to do it all at once (thus blowing up the page).
This is also a problem since I am trying to save the response to MongoDB and it can only handle a certain size of data before it is 'too large' for MongoDB to process.
I have tried looking into Angular's $q and the promise/defer API, but I am a bit confused on how to implement it and have not found a way to start processing data chunks as they come in.
This question on SO about dealing with chunks did not seem to help much either.
Any help or tips on trying to display chunked data as it comes back to AngularJS would be greatly appreciated.
If the responses could be informative code snippets demonstrating the technique, I would greatly appreciate it since seeing an example helps me learn more than a 'text' description.
-- Thanks
No example because I am not sure what you are using in terms of transport code, or whether you have a WebSocket available:
$http does not invoke any of its callbacks until a success code is passed back at the end of the request; it listens for onreadystatechange with a 200-like value.
If you want to do a stream like this, one option is to keep $http but wrap it in a transport layer that makes multiple $http calls, each of which ends and returns a success header.
The other option is to use WebSockets: instead of calling $http, emit an event on the socket.
Then, to get the chunks back to the client, have the server emit each chunk as a new event on the backend, and have the front end listen for that event and process each one.
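For the WebSocket route, a minimal sketch with socket.io; the event names ('getMatrix', 'matrixChunk', 'matrixDone') and computeRouteFor are made up for illustration. Server side:

var io = require('socket.io')(server); // attach to the existing HTTP server

io.on('connection', function(socket) {
    socket.on('getMatrix', function(cityArray) {
        // Instead of res.write(), emit one event per computed chunk.
        cityArray.forEach(function(city) {
            var docs = computeRouteFor(city); // hypothetical per-city computation
            socket.emit('matrixChunk', docs);
        });
        socket.emit('matrixDone');
    });
});

And on the Angular side, each chunk can be displayed and saved as it arrives instead of waiting for the whole matrix:

var socket = io('http://localhost:3000');
socket.emit('getMatrix', cityArray);

socket.on('matrixChunk', function(chunk) {
    $scope.$apply(function() {
        $scope.routes.push(chunk); // render incrementally
    });
    $http.post('/saveRoutes', chunk); // save each chunk to MongoDB as it arrives
});

socket.on('matrixDone', function() {
    console.log('matrix complete');
});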
I'm requesting a remote file using an https.request in node.js. I'm not interested in receiving the whole file, I just want what's in the first chunk.
var req = https.request(options, function (res) {
    res.setEncoding('utf8');
    res.on('data', function (d) {
        console.log(d);
        res.pause(); // I want this to end instead of pausing
    });
});
I want to stop receiving the response altogether after the first chunk, but I don't see any close or end methods, only pause and resume. My worry using pause is that a reference to this response will be hanging around indefinitely.
Any ideas?
Pop this in a file and run it. You might have to adjust it for your local Google if you see a 301 redirect answer from Google (which is sent as a single chunk, I believe).
var http = require('http');

var req = http.get("http://www.google.co.za/", function(res) {
    res.setEncoding('utf8');
    res.on('data', function(chunk) {
        console.log(chunk.length);
        res.destroy(); // After one run, comment this out to compare.
    });
});
To see that res.destroy() really works, comment it out, and the response object will keep emitting events until it closes itself (at which point node will exit this script).
I also experimented with res.emit('end'); instead of destroy(), but during one of my test runs it still fired a few additional chunk callbacks. destroy() seems to be a more immediate "end".
The docs for the destroy method are here: http://nodejs.org/api/stream.html#stream_stream_destroy
But you should start reading here: http://nodejs.org/api/http.html#http_http_clientresponse (which states that the response object implements the readable stream interface.)