I have this route:
app.get("/api/current_user", (req, res) => {
//This function takes 3~ seconds to finish
someObj.logOn(data => {
someObj.setData(data);
});
//This will return before function finishes
return res.send(someObj.data);
});
Here is the .logOn() function (simplified):
logOn(_callback) {
//has some data
var info = {};
//returns data in callback
_callback(info);
}
Question: Is there some way to wait for the function to finish before returning? This function does not deal with promises, so I cannot use async/await. I couldn't find any good answers, and anything about waiting dealt with either promises or setTimeout, neither of which would work, right?
Note: If I put the return statement inside the callback right under someObj.setData(data); I will get an error like this:
can't set headers after they are sent
This error occurs not on the initial route load, but after I refresh one more time.
Use a callback, and send the response from inside it. I changed res.send to res.end so that no headers are set; it seems that something is written to the response in the functions we can't see.
app.get("/api/current_user", (req, res) => {
//This function takes 3~ seconds to finish
someObj.logOn(data => {
someObj.setData(data);
res.end(JSON.stringify(data));
});
});
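If you would rather use async/await here, a callback-only function can be wrapped in a Promise by hand. This is a minimal sketch, assuming logOn calls its callback exactly once and never reports an error; the someObj below is a hypothetical stand-in for the real object, and logOnAsync and currentUser are names made up for the example:

```javascript
// Hypothetical stand-in for the asker's object; only the callback shape matters.
const someObj = {
  data: null,
  logOn(callback) {
    // Simulates slow work that reports its result through a callback.
    setTimeout(() => callback({ user: 'demo' }), 10);
  },
  setData(data) {
    this.data = data;
  },
};

// Wrap the callback API in a Promise so it can be awaited.
function logOnAsync(obj) {
  return new Promise((resolve) => obj.logOn(resolve));
}

// Resolves only after logOn has delivered its data.
async function currentUser() {
  const data = await logOnAsync(someObj);
  someObj.setData(data);
  return someObj.data;
}
```

Inside an async route handler you would then await currentUser() and call res.send once with the result, so the response is never sent before the data exists.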
I'm experimenting with Puppeteer Cluster and I just don't understand how to use queuing properly. Can it only be used for calls where you don't wait for a response? I'm using Artillery to fire a bunch of requests simultaneously, but they all fail, whereas only some fail when I call execute directly.
I've taken the code straight from the examples and replaced execute with queue which I expected to work, except the code doesn't wait for the result. Is there a way to achieve this anyway?
So this works:
const screen = await cluster.execute(req.query.url);
But this breaks:
const screen = await cluster.queue(req.query.url);
Here's the full example with queue:
const express = require('express');
const app = express();
const { Cluster } = require('puppeteer-cluster');
(async () => {
const cluster = await Cluster.launch({
concurrency: Cluster.CONCURRENCY_CONTEXT,
maxConcurrency: 2,
});
await cluster.task(async ({ page, data: url }) => {
// make a screenshot
await page.goto('http://' + url);
const screen = await page.screenshot();
return screen;
});
// setup server
app.get('/', async function (req, res) {
if (!req.query.url) {
return res.end('Please specify url like this: ?url=example.com');
}
try {
const screen = await cluster.queue(req.query.url);
// respond with image
res.writeHead(200, {
'Content-Type': 'image/jpg',
'Content-Length': screen.length //variable is undefined here
});
res.end(screen);
} catch (err) {
// catch error
res.end('Error: ' + err.message);
}
});
app.listen(3000, function () {
console.log('Screenshot server listening on port 3000.');
});
})();
What am I doing wrong here? I'd really like to use queuing because without it every incoming request appears to slow down all the other ones.
Author of puppeteer-cluster here.
Quote from the docs:
cluster.queue(..): [...] Be aware that this function only returns a Promise for backward compatibility reasons. This function does not run asynchronously and will immediately return.
cluster.execute(...): [...] Works like Cluster.queue, just that this function returns a Promise which will be resolved after the task is executed. In case an error happens during the execution, this function will reject the Promise with the thrown error. There will be no "taskerror" event fired.
When to use which function:
Use cluster.queue if you want to queue a large number of jobs (e.g. list of URLs). The task function needs to take care of storing the results by printing them to console or storing them into a database.
Use cluster.execute if your task function returns a result. This will still queue the job, so this is like calling queue in addition to waiting for the job to finish. In this scenario, there is most often a "idling cluster" present which is used when a request hits the server (like in your example code).
So, you definitely want to use cluster.execute, as you want to wait for the result of the task function. The reason you do not see any errors is (as quoted above) that errors from cluster.queue are emitted via a taskerror event, whereas cluster.execute errors are thrown directly (the Promise is rejected). Most likely your jobs fail in both cases, but the failure is only visible with cluster.execute.
I'm trying to figure out the best/fastest way to send the response from Express.js and then do logging or other long-running actions on the server, without delaying the response to the client.
I have the following code, but I see that the response to the client is sent only after the loop has finished. I thought the response would be sent right away because I call res.send(html) and only then call longAction.
function longAction () {
for (let i = 0; i < 1000000000; i++) {}
console.log('Finish');
}
function myfunction (req, res) {
res.render(path.join(MYPATH, 'index.response.html'), {title: 'My Title'}, (err, html) => {
if (err) {
re.status(500).json({'error':'Internal Server Error. Error Rendering HTML'});
}
else {
res.send(html);
longAction();
}
});
}
router.post('/getIndex', myfunction);
What is the best way to send the response and then run my long/heavy actions?
Or what am I missing?
I'm trying to figure out the best/fastest way to send the response from Express.js and then do logging or other long-running actions on the server, without delaying the response to the client.
The best way to do this is to only call longAction() when express tells you that the response has been sent. Since the response object is a stream, you can use the finish event on that stream to know when all data from the stream has been flushed to the underlying OS.
From the writable stream documentation:
The 'finish' event is emitted after the stream.end() method has been called, and all data has been flushed to the underlying system.
Here's how you could use that in your specific code:
function myfunction (req, res) {
res.render(path.join(MYPATH, 'index.response.html'), {title: 'My Title'}, (err, html) => {
if (err) {
res.status(500).json({'error':'Internal Server Error. Error Rendering HTML'});
}
else {
res.on('finish', () => {
longAction();
});
res.send(html);
}
});
}
For a little more explanation of the finish event, you can start by looking at the Express code for res.send() and see that it ends up calling res.end() to actually send the data. If you then look at the documentation for .end() on a writable stream, it says this:
Calling the writable.end() method signals that no more data will be written to the Writable. The optional chunk and encoding arguments allow one final additional chunk of data to be written immediately before closing the stream. If provided, the optional callback function is attached as a listener for the 'finish' event.
So, since Express doesn't expose access to the callback that .end() offers, we just listen to the finish event ourselves to be notified when the stream is done sending its last bit of data.
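The ordering described above can be observed on a plain Writable stream, without a server. This is only a sketch of the stream contract, not of Express itself; demoFinishOrder and the in-memory sink are made up for the example:

```javascript
const { Writable } = require('stream');

// Resolves with the order in which the steps happened.
function demoFinishOrder() {
  return new Promise((resolve) => {
    const log = [];

    // An in-memory sink standing in for the response stream.
    const sink = new Writable({
      write(chunk, encoding, callback) {
        log.push('wrote ' + chunk.toString());
        callback();
      },
    });

    // Fires only after end() was called and all data has been flushed.
    sink.on('finish', () => {
      log.push('finish listener');
      resolve(log);
    });

    sink.end('last chunk');
    log.push('end() returned');
  });
}
```

The finish listener runs after end() has already returned, which is why work placed in that listener cannot delay the write itself.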
Note, there is also a typo in your code where re.status(500) should be res.status(500).
Use setImmediate:
var test = function(){
for (let i = 0; i < 1000000000; i++) {}
console.log('Finish');
}
router.get("/a", function(req, res, next){
setImmediate(test);
return res.status(200).json({});
});
Your long function will be executed at the end of the current event loop cycle: after any I/O callbacks in the current cycle (in this sample, the first I/O operation is the response write triggered by res.status(200).json({})) and before any timers scheduled for the next cycle. Keep in mind that the function still blocks the event loop while it runs, so other requests wait until it finishes.
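If the long loop itself is the problem, one option is to split it into chunks and yield back to the event loop between chunks with setImmediate. This is a rough sketch rather than a drop-in fix; countInChunks and its parameters are hypothetical, and the counting loop stands in for the real work:

```javascript
// Counts up to `total` in slices of `chunkSize`, yielding to the event loop
// between slices so that pending I/O can be handled in between.
function countInChunks(total, chunkSize) {
  return new Promise((resolve) => {
    let done = 0;
    let chunks = 0;

    function runChunk() {
      const end = Math.min(done + chunkSize, total);
      while (done < end) {
        done++; // stand-in for one unit of real work
      }
      chunks++;
      if (done < total) {
        setImmediate(runChunk); // schedule the next slice
      } else {
        resolve({ done, chunks });
      }
    }

    setImmediate(runChunk);
  });
}
```

For truly CPU-heavy work a worker thread or a separate process is usually a better fit, since chunking only spreads the cost out and does not remove it.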
Thanks for the answers:
After checking, I think the best approach is listening to the finish event:
res.on('finish', () => {
// Do other stuff after the response has been sent
});
I'm new to Node.js, so I'm still wrapping my head around asynchronous functions and callbacks. My struggle now is how to return a response after reading data from a file in an asynchronous operation.
My understanding is that sending a response works like this (and this works for me):
app.get('/search', function (req, res) {
res.send("request received");
});
However, now I want to read a file, perform some operations on the data, and then return the results in a response. If the operations I wanted to perform on the data were simple, I could do something like this -- perform them inline, and maintain access to the res object because it's still within scope.
app.get('/search', function (req, res) {
fs.readFile("data.txt", function(err, data) {
result = process(data.toString());
res.send(result);
});
});
However, the file operations I need to perform are long and complicated enough that I've separated them out into their own function in a separate file. As a result, my code looks more like this:
app.get('/search', function (req, res) {
searcher.do_search(req.query);
// ??? Now what ???
});
I need to call res.send in order to send the result. However, I can't call it directly in the function above, because do_search completes asynchronously. And I can't call it in the callback to do_search because the res object isn't in scope there.
Can somebody help me understand the proper way to handle this in Node.js?
To access a variable in a different function, when there isn't a shared scope, pass it as an argument.
You could just pass req and res, and then read the query and call send from within the function.
For the purposes of separation of concerns, you might be better off passing a callback instead.
Then do_search only needs to know about performing a query and then running a function. That makes it more generic (and thus reusable).
searcher.do_search(req.query, function (data) {
res.send(...);
});
function do_search(query, callback) {
callback(...);
}
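To make the callback version more concrete, here is a sketch of what such a search function might look like using Node's error-first callback convention. The signature, the file path parameter and the "lines containing a term" logic are all assumptions for illustration, not your actual search:

```javascript
const fs = require('fs');

// Hypothetical search: passes the lines of `file` that contain `term`
// to the callback. It knows nothing about req or res.
function doSearch(file, term, callback) {
  fs.readFile(file, 'utf8', (err, text) => {
    if (err) {
      return callback(err); // report the failure to the caller
    }
    const matches = text.split('\n').filter((line) => line.includes(term));
    callback(null, matches);
  });
}
```

The route handler would pass a callback that calls res.send with the matches (or sends an error status when err is set), so res stays in the handler's scope and never leaks into the search module.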
The existing answers are perfectly valid. You can also use the async/await keywords, available since ES2017, provided do_search returns a promise. Using your own function:
app.get('/search', async (req, res, next) => {
try {
const answer = await searcher.do_search(req.query);
res.send(answer);
}
catch(error) {
return next(error);
}
});
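If do_search only offers a callback, util.promisify can turn it into something awaitable, as long as the callback follows the error-first convention. A minimal sketch under that assumption; the do_search body, the q field and handleSearch are invented for the example:

```javascript
const util = require('util');

// Hypothetical error-first callback version of do_search.
function do_search(query, callback) {
  setImmediate(() => {
    if (!query || !query.q) {
      return callback(new Error('missing q parameter'));
    }
    callback(null, 'results for ' + query.q);
  });
}

// Promise-returning wrapper; rejects when the callback receives an error.
const doSearchAsync = util.promisify(do_search);

// Mirrors the try/catch structure of the route above without needing Express.
async function handleSearch(query) {
  try {
    return { status: 200, body: await doSearchAsync(query) };
  } catch (error) {
    return { status: 400, body: error.message };
  }
}
```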
Is it possible, in node.js, to make an asynchronous call that times out if it takes too long (or doesn't complete) and triggers a default callback?
The details:
I have a node.js server that receives a request and then makes multiple requests asynchronously behind the scenes, before responding. The basic issue is covered by an existing question, but some of these calls are considered 'nice to have'. What I mean is that if we get the response back, then it enhances the response to the client, but if they take too long to respond it is better to respond to the client in a timely manner than with those responses.
At the same time this approach would allow to protect against services that simply aren't completing or failing, while allowing the main thread of operation to respond.
You can think of this in the same way as a Google search that has one core set of results, but provides extra responses based on other behind the scenes queries.
If it's simple, just use setTimeout:
app.get('/', function (req, res) {
var result = {};
// populate object
http.get('http://www.google.com/index.html', (res) => {
result.property = response;
return res.send(result);
});
// if we haven't returned within a second, return without data
setTimeout(function(){
return res.send(result);
}, 1000);
});
Edit: as mentioned by peteb, I forgot to check whether we had already sent the response. This can be done by checking res.headersSent or by maintaining a 'sent' flag yourself. I also noticed that the res variable was being shadowed by the inner callback's parameter.
app.get('/', function (req, res) {
var result = {};
// populate object
http.get('http://www.google.com/index.html', (httpResponse) => {
result.property = httpResponse;
if(!res.headersSent){
res.send(result);
}
});
// if we haven't returned within a second, return without data
setTimeout(function(){
if(!res.headersSent){
res.send(result);
}
}, 1000);
});
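If the "nice to have" calls return promises, the same idea can be written with Promise.race, which also makes it easy to cancel the timer once the real answer arrives. This is a sketch, not part of the code above; withTimeout and its fallback parameter are hypothetical names:

```javascript
// Resolves with the promise's value, or with `fallback` if the promise
// has not settled within `ms` milliseconds.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A handler could await withTimeout(extraLookup, 1000, null) for each optional call and send the response once, which avoids the double-send check entirely.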
Check this example of a timeout callback: https://github.com/jakubknejzlik/node-timeout-callback/blob/master/index.js
You could modify it to run an action when the timeout fires, or simply catch the error.
You can try using a timeout, for example with the setTimeout() method:
Set up a timeout handler: var timeOutX = setTimeout(function…
Inside the handler, set that variable to null: timeOutX = null (to indicate that the timeout has fired)
Then execute your callback function with one argument (error handling): callback({error:'The async request timed out'});
Pass the delay for your timeout as the second argument, for example 3 seconds
Something like this:
var timeOutX = setTimeout(function() {
timeOutX = null;
yourCallbackFunction({error:'The async request timed out'});
}, 3000);
With that set, you can then call your async function, and in its callback check that your timeout handler hasn't fired yet.
Finally, before you run your callback function, you must clear that scheduled timeout handler using the clearTimeout() method.
Something like this:
yourAsyncFunction(yourArguments, function() {
if (timeOutX) {
clearTimeout(timeOutX);
yourCallbackFunction();
}
});
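Put together, the steps above could be packaged into one helper. This is a sketch under the assumption that the async function takes an error-first callback; callWithTimeout is a made-up name, and it passes an Error object rather than the {error: ...} literal used above:

```javascript
// Calls asyncFn and guarantees that `callback` runs exactly once:
// either with asyncFn's result or with a timeout error.
function callWithTimeout(asyncFn, ms, callback) {
  let timer = setTimeout(() => {
    timer = null; // mark the timeout as fired
    callback(new Error('The async request timed out'));
  }, ms);

  asyncFn((err, result) => {
    if (timer === null) {
      return; // timed out or already answered, ignore this result
    }
    clearTimeout(timer);
    timer = null;
    callback(err, result);
  });
}
```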
//Handling get request from the client
app.get('/watch', function(req, res) {
//A for loop goes here which makes multiple requests to an external API and stores the result in a variable data
console.log(data); //Gives an empty value!!
res.write(data); //Sends an empty value
});
Now when I try to log the data variable DURING the loop, its value is as expected. However, an empty value is sent to the client. I am pretty sure this is because the for loop takes a while to execute and Node, being non-blocking, moves on to the next part of the code. Is there a workaround for this, or is something fundamentally wrong with the design of my code?
EDIT: Posting the for loop as requested
for(var num in data.items)
{
url ='/Product';
options={
host:"api.xyz.com",
path:url
};
http.get(options,function(response){
var responseData = "";
response.setEncoding('utf8');
//stream the data into the response
response.on('data', function (chunk) {
responseData+=chunk.toString();
});
//responseData=JSON.parse(responseData);
//write the data at the end
response.on('end', function(){
body=JSON.parse(responseData);
var discount=body.product[0].styles[0].percentOff.replace('%','');
if(discount>=20)
{
discounted_json.disc_items.push({"percentOff":discount});
}
});
});
}
When you want to call multiple asynchronous functions in order, you should call the first one, call the next one in its callback, and so on. The code would look like:
asyncFunction1(args, function () {
asyncFunction2(args, function () {
asyncFunction3(args, function () {
// ...
})
})
});
Using this approach, you may end up with an ugly hard-to-maintain piece of code.
There are various ways to achieve the same functionality without nesting callbacks, like using async.js or node-fibers.
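Another option, not mentioned above, is to use promises: when the requests do not depend on each other, Promise.all can start them all at once and continue only when every result is in. A sketch under stated assumptions; fetchDiscount is a hypothetical stand-in for one http.get call to the product API, and the item shape is invented:

```javascript
// Hypothetical stand-in for one request to the product API.
function fetchDiscount(item) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ id: item.id, percentOff: item.percentOff }), 5);
  });
}

// Starts all requests in parallel and resolves once all have finished,
// keeping only products discounted by at least `minPercent`.
async function collectDiscounts(items, minPercent) {
  const products = await Promise.all(items.map(fetchDiscount));
  return products.filter((product) => product.percentOff >= minPercent);
}
```

The route handler would await collectDiscounts(...) and only then write the response, so the data is never empty when it is sent.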