I'm requesting a remote file using https.request in Node.js. I'm not interested in receiving the whole file; I just want what's in the first chunk.
var req = https.request(options, function (res) {
  res.setEncoding('utf8');
  res.on('data', function (d) {
    console.log(d);
    res.pause(); // I want this to end instead of pausing
  });
});
req.end(); // the request isn't actually sent until end() is called
I want to stop receiving the response altogether after the first chunk, but I don't see any close or end methods, only pause and resume. My worry with using pause is that a reference to this response will hang around indefinitely.
Any ideas?
Pop this in a file and run it. You might have to adjust the URL to your local Google domain if you see a 301 redirect response from Google (which is sent as a single chunk, I believe).
var http = require('http');

var req = http.get("http://www.google.co.za/", function(res) {
  res.setEncoding('utf8');
  res.on('data', function(chunk) {
    console.log(chunk.length);
    res.destroy(); // After one run, comment this out.
  });
});
To see that res.destroy() really works, comment it out: the response object will then keep emitting events until it closes itself (at which point node will exit this script).
I also experimented with res.emit('end'); instead of destroy(), but during one of my test runs it still fired a few additional chunk callbacks. destroy() seems to be a more immediate "end".
The docs for the destroy method are here: http://nodejs.org/api/stream.html#stream_stream_destroy
But you should start reading here: http://nodejs.org/api/http.html#http_http_clientresponse (which states that the response object implements the readable stream interface.)
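Applied to the original https.request example, a minimal sketch might look like this (the options object is just a placeholder; substitute the host and path you are actually requesting):

var https = require('https');

// Placeholder options -- fill in your own host and path.
var options = { host: 'example.com', path: '/some/large/file' };

var req = https.request(options, function (res) {
  res.setEncoding('utf8');
  res.on('data', function (d) {
    console.log(d);  // only the first chunk is of interest
    res.destroy();   // tear the response down instead of pausing it
  });
});
// Depending on the Node version, destroying the response mid-stream may cause
// the request to emit an 'error' (e.g. ECONNRESET), so swallow it here.
req.on('error', function () {});
req.end();

Since destroy() tears down the underlying socket, no further 'data' or 'end' events should arrive for that response, so nothing is left hanging around.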
I'm new to Node.js. I understand that createReadStream() performs better than readFile(), because createReadStream() reads and writes data in chunks while readFile() first reads the whole content into memory. Thus, if the file is large, readFile() might take longer before the data can be processed further. So I chose to create a server using createReadStream(), as follows.
// Create a server with fs.createReadStream(), better performance and less memory usage.
var http = require('http');
var url = require('url');
var fs = require('fs');

http.createServer(function (request, response) {
  // Parse the request containing the file name
  var pathname = url.parse(request.url).pathname;

  // Create a readable stream.
  var readerStream = fs.createReadStream(pathname.substr(1));

  // Set the encoding to be UTF8.
  readerStream.setEncoding('UTF8');

  // Handle stream events --> data, end and error
  readerStream.on('data', function(chunk) {
    // Page found
    // HTTP Status: 200 : OK
    // Content Type: text/html
    response.writeHead(200, {'Content-type': 'text/html'});
    // Write the content of the file to the response body.
    response.write(chunk);
    console.log('Page is being streamed...');
  });

  readerStream.on('end', function() {
    console.log('Page is streamed and emitted successfully.');
  });

  readerStream.on('error', function(err) {
    // HTTP Status: 404 : NOT FOUND
    // Content Type: text/html
    response.writeHead(404, {'Content-type': 'text/html'});
    console.log('Page streaming error: ' + err);
  });

  console.log('Code ends!');
}).listen(8081);

// Console will print the message
console.log('Server running at http://127.0.0.1:8081/');
My .html or .txt file contains three short lines of text. After starting my server I visit my web page by going to http://127.0.0.1:8081/index.html. Everything works fine and the content of index.html is echoed in the browser.
But the loading icon on the browser tab keeps spinning, as if the page is still loading, for about a minute.
Is that normal with a Node.js server? Does the icon just keep spinning without costing the server anything? Or am I missing something, and the icon isn't supposed to keep spinning?
It doesn't look like you are ending your response. The browser probably thinks the request isn't finished and thus continues to "load".
If you look at the Network tab in the developer console you might see the request hasn't finished.
You should be calling response.end():
This method signals to the server that all of the response headers and body have been sent; that server should consider this message complete. The method, response.end(), MUST be called on each response.
I believe you should be calling response.end() in both the readerStream.on('end') and readerStream.on('error') callbacks, after you write the head, as sketched below. This will tell the browser the request is finished, and it can stop the loading action.
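A minimal sketch of those two handlers with response.end() added (the rest of the server stays the same):

readerStream.on('end', function() {
  response.end(); // tells the browser the response is complete
  console.log('Page is streamed and emitted successfully.');
});

readerStream.on('error', function(err) {
  response.writeHead(404, {'Content-type': 'text/html'});
  response.end(); // end the response on errors too, or the browser keeps waiting
  console.log('Page streaming error: ' + err);
});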
I'm working on a project that involves a lot of large files where I only need to extract the HTTP header rather than loading the entirety of the file itself, so I'm using the request module to extract the HTTP header immediately and then abort the request, since I don't need the entirety of the file. Alas, my current structure has me assigning the request object and then using a listener for the response, as in the example below.
const req = request(url);
req.on('response', function(res) {
  if (res.statusCode !== 200) {
    req.abort();
    return this.emit('error', new Error('Bad status code'));
  }
  if (res.headers.hasOwnProperty(headProp)) {
    parseFunc(res.headers);
    req.abort();
  }
});
Ideally, I'd like, if possible, to use Promises to handle the request, like:

const req = request.getAsync(url);
req.on('response', function(res) {
  // whatever logic
});
req.then(parseFunc(res.headers));
But listener events don't really work, since the request object itself hasn't been saved to anything. Additionally, a then chained onto request.getAsync() only seems to execute after the whole file has been parsed, which can take 10-11 seconds, versus the 250 ms-1 s it takes when exiting upon the abort.
So, in short: can I get the functionality I desire while avoiding callbacks?
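One way to sketch this, sticking with the plain request module rather than getAsync(), is to wrap the 'response' event in a Promise by hand and abort as soon as the headers arrive; headProp and parseFunc below are the same placeholders used above:

// A rough sketch, not a drop-in solution: resolve with the headers as soon as
// they arrive, then abort so the body is never downloaded.
function getHeadersAsync(url) {
  return new Promise(function (resolve, reject) {
    const req = request(url);
    req.on('response', function (res) {
      req.abort(); // we only wanted the headers
      if (res.statusCode !== 200) {
        return reject(new Error('Bad status code'));
      }
      resolve(res.headers);
    });
    req.on('error', reject);
  });
}

getHeadersAsync(url)
  .then(function (headers) {
    if (headers.hasOwnProperty(headProp)) {
      parseFunc(headers);
    }
  })
  .catch(console.error);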
In-browser JavaScript is pathetically broken in that the only way to make cross-domain requests is using script tags and JSONP. To make this useful, I'm trying to write a Node.js server that, given a callback name and an address, loads the page at the address, wraps it in a call to the callback, and serves the result. However, I know next to nothing about Node.js. If the server's response is loaded from a script tag, it would result in actually loading a web page. Currently, I'm writing the request as localhost:8000/callback/address, so a script tag might be <script src="localhost:8000/alert/https://www.google.com" type="text/javascript"></script>. Here is my code for the server:
var http = require("http");
var request = require("request");
var server = http.createServer(function(req, res){
req.on("end", function(){
console.log("alive");
var url = req.url;
var i = url.indexOf("/", 1);
request(url.substring(i + 1), function(err, ret, body){
res.writeHead(200);
res.write(url.substring(1, i) + "(\"" + body + "\");");
res.end();
});
});
});
server.listen(8000);
Why does this stay loading for a very long time but never actually load? Judging by my console.log() calls, it seems the req.on("end") callback is never even called.
If you don't care about any request data, you could just add req.resume(); after you add your end event handler.
The reason it's getting "stuck" is that, since node v0.10, streams start out in a paused state, so you need to unpause them by reading from them in some way; req.resume(); accomplishes this. Once there is nothing left in the request stream (and there may be nothing at all), the end event will be emitted.
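Applied to the server above, only one line changes; a minimal sketch:

var server = http.createServer(function(req, res){
  req.on("end", function(){
    var url = req.url;
    var i = url.indexOf("/", 1);
    request(url.substring(i + 1), function(err, ret, body){
      res.writeHead(200);
      res.write(url.substring(1, i) + "(\"" + body + "\");");
      res.end();
    });
  });
  req.resume(); // drain the (probably empty) request body so 'end' can fire
});
server.listen(8000);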
I have a node.js process that uses a large number of client requests to pull information from a website. I am using the request package (https://www.npmjs.com/package/request) since, as it says: "It supports HTTPS and follows redirects by default."
My problem is that after a certain period of time, the requests begin to hang. I haven't been able to determine if this is because the server is returning an infinite data stream, or if something else is going on. I've set the timeout, but after some number of successful requests, some of them eventually get stuck and never complete.
var options = { url: 'some url', timeout: 60000 };
request(options, function (err, response, body) {
  // process
});
My questions are: can I shut down a connection after a certain amount of data has been received using this library, and can I stop the request from hanging? Do I need to use the http/https libraries and handle the redirects and protocol switching myself in order to get the kind of control I need? If so, is there a standardized practice for that?
Edit: Also, if I stop the process and restart it, they pick right back up and start working, so I don't think it is related to the server or the machine the code is running on.
Note that with request(options, callback), the callback is only fired once the request has completed, so there is no way to break off the request from there.
You should listen for the data event instead:
var request = require('request');

var stream = request(options);
var len = 0;

stream.on('data', function(data) {
  // TODO process your data here

  // break the stream if len > 1000
  len += Buffer.byteLength(data);
  if (len > 1000) {
    stream.abort();
  }
});
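For the hanging case, where no data arrives at all, one option (an assumption on my part, not something the request module does for you beyond its timeout option) is to combine the length check with a hard deadline and abort either way; the 30-second value below is just an example:

var stream = request(options);
var len = 0;

// Give up completely if the request is still open after 30 seconds.
var deadline = setTimeout(function () {
  stream.abort();
}, 30000);

stream.on('data', function (data) {
  len += Buffer.byteLength(data);
  if (len > 1000) {
    stream.abort(); // enough data received; shut the connection down
  }
});

stream.on('end', function () { clearTimeout(deadline); });
stream.on('error', function () { clearTimeout(deadline); });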
I'm trying to use streams in Node.js to basically build a running buffer of HTTP data until some processing is done, but I'm struggling with the specifics of streams. Some pseudocode will probably help:
var server = http.createServer(function(request, response) {
  // Create a buffer stream to hold data generated by the asynchronous process
  // to be piped to the response after headers and other obvious response data
  var buffer = new http.ServerResponse();

  // Start the computation of the full response as soon as possible, passing
  // in the buffer stream to hold returned data until headers are written
  beginAsyncProcess(request, buffer);

  // Send headers and other static data while waiting for the full response
  // to be generated by 'beginAsyncProcess'
  sendObviousData(response, function() {
    // Once obvious data is written (unfortunately HTTP and Node.js have
    // certain requirements for the order data must be written in) then pipe
    // the stream with the data from 'beginAsyncProcess' into the response
    buffer.pipe(response);
  });
});
Most of this is almost legitimate code, but it doesn't work. The basic issue is figuring out a way to take advantage of the asynchronous nature of Node.js when there are certain order requirements associated with HTTP requests, namely that headers must always be written first.
While I would definitely appreciate any answers with little hacks to get around the ordering problem without directly addressing streams, I wanted to use the opportunity to get to know streams better. There are plenty of similar situations, but this scenario is meant more to open the can of worms than anything else.
Let's make use of callbacks and streams in Node.js, along with the .pause() / .resume() stream functions:
var server = http.createServer(function(request, response) {
  // Handle the request first, then..

  var body = new Stream(); // <-- you can implement stream.Duplex for read / write operations

  body.on('open', function(){
    body.pause();
    // API generates data
    // body.write( generated data ) <-- write to the stream
    body.resume();
  });

  var firstPartOfThePage = getHTMLSomeHow();

  response.writeHead(200, { 'Content-Type': 'text/html' });

  response.write(firstPartOfThePage, function(){ // <-- callback after sending the first part; our body is already being processed
    body.pipe( response ); // <-- This should fire after being resumed
    body.on('end', function(){
      response.end(); // <-- end the response
    });
  });
});
Check this: http://codewinds.com/blog/2013-08-31-nodejs-duplex-streams.html for custom duplex stream creation.
Note: this is still pseudocode.
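A more concrete sketch of the same idea, using stream.PassThrough as the buffer (beginAsyncProcess and getHTMLSomeHow are still the same placeholders as in the question):

var http = require('http');
var stream = require('stream');

var server = http.createServer(function(request, response) {
  // PassThrough buffers everything written to it until it is piped somewhere.
  var body = new stream.PassThrough();

  // Start the slow work immediately; it should write into 'body' as data
  // becomes ready and call body.end() when it is done.
  beginAsyncProcess(request, body);

  // Send the headers and the "obvious" first part of the page right away.
  response.writeHead(200, { 'Content-Type': 'text/html' });
  response.write(getHTMLSomeHow());

  // Piping flushes whatever has been buffered so far, forwards the rest as it
  // arrives, and ends the response when 'body' ends.
  body.pipe(response);
});

server.listen(8000);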