I am trying to download a report that is generated daily on the first request to the report's endpoint.
When the report is being created, the endpoint returns an HTTP 202.
I have the following code to handle some errors and redirects, and to "sleep" for 60 seconds before trying the endpoint again. Unfortunately, the second console.log, which should tell me the download completed, is never called, even though the file does download successfully and the file stream closes.
// Main function
run()
async function run() {
await getReport()
await processReport()
}
async function getReport() {
console.log(`Downloading ${reportFileName}`)
await downloadFile(url, reportFileName)
console.log(`Downloaded ${reportFileName} successfully.`) // This is never called?
}
async function downloadFile (url, targetFile) {
return await new Promise((resolve, reject) => {
https.get(url, async response => {
const code = response.statusCode ?? 0
if (code >= 400) {
return reject(new Error(response.statusMessage))
}
// handle redirects
if (code > 300 && code < 400 && !!response.headers.location) {
resolve(downloadFile(response.headers.location, targetFile))
return
}
// handle file creation pending
if (code == 202) {
console.log(`Report: ${reportFileName} is still being generated, trying again in ${timeToSleepMs/1000} seconds...`)
await sleep(timeToSleepMs)
resolve(downloadFile(url, targetFile))
return
}
// make download directory regardless of if it exists
fs.mkdirSync(outputPath, { recursive: true }, (err) => {
if (error) throw error;
});
// save the file to disk
const fileWriter = fs
.createWriteStream(`${outputPath}/${targetFile}`)
.on('finish', () => {
resolve({})
})
response.pipe(fileWriter)
}).on('error', error => {
reject(error)
})
})
}
Finally my sleep function:
let timeToSleepMs = (60 * 1000)
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
I'm pretty sure this has to do with some sort of async issue because that always seems to be my issue with Node, but I'm not sure how to handle it. I just want to fetch a file and download it locally, retrying if I get an HTTP 202. If there's a better way, please let me know!
tl;dr - How do I properly handle waiting for an HTTP 202 response to turn into an HTTP 200 when the file is generated, then continue executing code after the file is downloaded?
await downloadFile(url, reportFileName) invokes a recursive function and waits for the promise from the outermost invocation to resolve. But if the function calls itself recursively in
await sleep(timeToSleepMs)
return downloadFile(url, targetFile)
this outermost promise is never resolved.
Replace the two lines above with
await sleep(timeToSleepMs)
resolve(downloadFile(url, targetFile))
return
then the resolution of the outermost promise is the resolution of the recursively invoked second-outermost promise, and so on.
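A minimal sketch of why this works, using a hypothetical fakeRequest function and short timers in place of the real HTTP call (none of these names come from the question). Resolving a promise with another promise makes the outer one adopt the inner one's eventual result, whereas a value returned from the callback is simply discarded by whoever called it:

```javascript
// Hypothetical stand-in for the HTTP request: answers 202 twice, then 200.
let calls = 0;
function fakeRequest(callback) {
  calls += 1;
  const code = calls < 3 ? 202 : 200;
  setTimeout(() => callback(code), 10);
}

function waitUntilReady() {
  return new Promise((resolve) => {
    fakeRequest((code) => {
      if (code === 202) {
        // Resolving with a promise chains the outer promise to the retry.
        // A plain `return waitUntilReady()` here would be ignored by
        // fakeRequest, leaving the outer promise pending forever.
        resolve(waitUntilReady());
        return;
      }
      resolve(code);
    });
  });
}

waitUntilReady().then((code) => {
  console.log(`Finished with status ${code} after ${calls} requests`);
});
```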
Heiko already identified the problem: when you return from the callback, you never resolve the promise. To avoid such mistakes in general, it is recommended not to mix the promisification of a function with the business logic. Use async/await only in the latter, and do not pass async functions as callbacks. In your case, that would be
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
function httpsGet(url) {
return new Promise((resolve, reject) => {
https.get(url, resolve).on('error', reject);
});
}
function writeFile(path, readableStream) {
return new Promise((resolve, reject) => {
const fileWriter = fs
.createWriteStream(path)
.on('finish', resolve)
.on('error', reject);
readableStream.pipe(fileWriter);
});
}
Then you can easily write a straightforward function without any callbacks:
async function downloadFile (url, targetFile) {
const response = await httpsGet(url);
const code = response.statusCode ?? 0
if (code >= 400) {
throw new Error(response.statusMessage);
}
// handle redirects
if (code > 300 && code < 400 && !!response.headers.location) {
return downloadFile(response.headers.location, targetFile)
}
// handle file creation pending
if (code == 202) {
console.log(`Report: ${reportFileName} is still being generated, trying again in ${timeToSleepMs/1000} seconds...`)
await sleep(timeToSleepMs)
return downloadFile(url, targetFile)
}
// make download directory regardless of if it exists
fs.mkdirSync(outputPath, { recursive: true });
// save the file to disk
await writeFile(`${outputPath}/${targetFile}`, response);
}
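As a possible alternative to a hand-written writeFile wrapper, Node's stream/promises module (Node 15 and later) exports a promise-returning pipeline that rejects if either stream errors. A minimal sketch, where saveStream and the temporary file name are hypothetical and a Readable.from stream stands in for the HTTP response:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical helper: write any readable stream to a file and resolve
// once the write stream has finished, or reject on any stream error.
async function saveStream(readableStream, filePath) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  await pipeline(readableStream, fs.createWriteStream(filePath));
  return filePath;
}

// Usage: a Readable built from an array stands in for the HTTP response.
const demoFile = path.join(os.tmpdir(), 'report-demo', 'report.txt');
saveStream(Readable.from(['line 1\n', 'line 2\n']), demoFile)
  .then((savedPath) => console.log(`Saved ${savedPath}`))
  .catch((error) => console.error('Download failed:', error));
```

Inside downloadFile, the last step would then become something like await saveStream(response, `${outputPath}/${targetFile}`).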
Related
I have an atypical use case for the cypress test runner, where I need to start the server from within the cypress.
I'm doing that by defining the before:spec hook in cypress plugins/index.js like so:
module.exports = (on, config) => {
on('before:spec', async(spec) => {
// the promise will be awaited before the runner continues with the spec
return new Promise((resolve, reject) => {
startServer();
// keep checking that the url accessible, when it is: resolve(null)
while (true) {
getStatus(function(statusCode) {
if (statusCode === 200)
break
})
};
resolve(null)
I'm struggling to implement this while loop that is supposed to keep checking if the url is accessible before fulfilling the before:spec promise.
I have the following function for checking the url:
function getStatus (callback) {
const options = {
hostname: 'localhost',
port: 8080,
path: '/',
method: 'GET'
}
const req = http.request(options, res => {
console.log(`statusCode: ${res.statusCode}`)
callback(res.statusCode)
})
req.on('error', error => {
console.error("ERROR",error)
})
req.end()
};
Any help implementing that loop, or other suggestions on how to check the url before fulfilling the before:spec promise, would be appreciated.
Ideally your startServer function should return a promise, and in the before:spec hook you simply await startServer();. Or it should at least accept a callback that is called when the server initialisation is complete. But let's assume that is not possible, so here is another solution for the given code:
function getStatus() {
return new Promise((resolve, reject) => {
const options = {
hostname: 'localhost',
port: 8080,
path: '/',
method: 'GET'
}
const req = http.request(options, res => {
console.log(`statusCode: ${res.statusCode}`)
resolve(res.statusCode);
})
req.on('error', error => {
console.error("ERROR", error);
reject(error);
})
req.end()
});
};
module.exports = (on, config) => {
on('before:spec', async (spec) => {
// the promise will be awaited before the runner continues with the spec
startServer();
// keep checking that the url accessible, when it is: resolve(null)
while (await getStatus() !== 200) {
await (new Promise(resolve => setTimeout(resolve, 50)));
}
});
}
Your original attempt with the while loop had serious flaws: you can't break out of a loop from inside a callback like that, and the loop flooded your server with requests.
There is only one strange part in the current one, await (new Promise(resolve => setTimeout(resolve, 50))); . This is simply to prevent flooding: it waits 50 ms whenever the service was not yet ready. If you know your service is slower to start, feel free to adjust this, but much lower values don't make much sense. It is not even strictly necessary, as the condition in the while loop ensures that only one request is running at a time. But I felt a bit safer this way; it is pointless to hit the server too often while it is still warming up.
Also note that in req.on('error') you may want to resolve(500) instead of rejecting. As written, a refused connection rejects the promise, so await getStatus() throws and the loop ends. I don't know whether your server can accept connections immediately; that depends on the implementation of startServer.
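A minimal sketch of a polling loop that treats a failed request as "not ready yet" and gives up after a deadline. The helper name waitForStatus200, its options, and the fakeCheck stand-in are assumptions for illustration, not part of Cypress or of the code above:

```javascript
// Hypothetical helper: poll `check` until it resolves to 200 or the deadline passes.
// A rejected check (for example ECONNREFUSED while the server is still starting)
// counts as "not ready yet" instead of aborting the loop.
async function waitForStatus200(check, { intervalMs = 50, timeoutMs = 30000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const status = await check().catch(() => null);
    if (status === 200) return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Server did not return 200 within ${timeoutMs} ms`);
}

// Usage with a stand-in check that fails twice, then succeeds.
let attempts = 0;
const fakeCheck = async () => {
  attempts += 1;
  if (attempts < 3) throw new Error('connection refused');
  return 200;
};

waitForStatus200(fakeCheck, { intervalMs: 10, timeoutMs: 1000 })
  .then(() => console.log(`Ready after ${attempts} attempts`))
  .catch((error) => console.error(error.message));
```

In the hook this would be called as await waitForStatus200(getStatus), using the promise-returning getStatus. The deadline keeps the test run from hanging forever if the server never comes up.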
I've got an async function that launches a NodeJS worker thread like so:
encode : async (config) => {
if (isMainThread) {
const encode_worker = new Worker(`./service-encode.js`, { workerData: config });
encode_worker.on('message', (transcode_data) => {
log.info("%o", transcode_data);
return transcode_data;
});
encode_worker.on('error', (err) => { log.error(err)});
encode_worker.on('exit', (code) => {
if (code !== 0)
throw new Error(`Encoding stopped with exit code [ ${code} ]`);
console.log("* * * EXITED ENCODER WORKER * * *")
});
}
},
In the service-encode.js file I've got the following code, which uses async functions. Note that I am using postMessage to signal that it is done.
var transcoder = require('./transcoder');
const {Worker, isMainThread, parentPort, workerData} = require('worker_threads');
console.log("* * * STARTING ENCODE THREAD * * *\n");
console.log(workerData);
transcoder.run(workerData)
.then((results) => {
transcode_data = results;
parentPort.postMessage(transcode_data);
})
.catch(err => { throw err });
Then, I use the following example code but the code in the 'message' event from above fires off immediately. That is, it doesn't seem to wait until it's done:
encode(conf).then((encode_data) => { console.log("Encode Data :", encode_data); });
The encode function works fine, but the console.log statement executes immediately when calling encode() function — also the encode_data var is undefined. Since the return statement in the encode is in the message event, shouldn't the promise of the async function be resolved at that time?
So, NOTHING about the code inside your async function supports promises. You can't just throw random asynchronous (but not promise-based) code inside an async function and expect anything to work. An async function will work just fine with promise-based asynchronous functions that you await. Otherwise, it knows nothing about your asynchronous operations in there. That's why calling encode() returns immediately without waiting for anything to complete.
In addition, return transcode_data is inside an asynchronous callback. Returning inside that callback just goes back into the system code that called the callback and is dutifully ignored. You're not returning anything there from the async function itself. You're returning to that callback.
Since your operation is not promise-based, to solve this you will have to make it promise-based by wrapping it in a promise and then manually resolving or rejecting that promise when needed with the proper values. You can do that like this:
encode: (config) => {
if (isMainThread) {
return new Promise((resolve, reject) => {
const encode_worker = new Worker(`./service-encode.js`, { workerData: config });
encode_worker.on('message', (transcode_data) => {
log.info("%o", transcode_data);
resolve(transcode_data);
});
encode_worker.on('error', (err) => {
log.error(err)
reject(err);
});
encode_worker.on('exit', (code) => {
if (code !== 0) {
reject(new Error(`Encoding stopped with exit code [ ${code} ]`));
}
console.log("* * * EXITED ENCODER WORKER * * *")
});
});
} else {
// should return a promise from all calling paths
return Promise.reject(new Error("Can only call encode() from main thread"));
}
},
FYI, this code assumes that the "result" you're looking for here from the promise is the transcode_data you get with the first message from this worker.
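One gap worth noting: if the worker exits with code 0 without ever posting a message, the promise above stays pending. A minimal sketch of one way to cover that case, assuming a hypothetical runWorker helper; the inline worker source passed with eval: true only stands in for service-encode.js:

```javascript
const { Worker } = require('worker_threads');

// Hypothetical helper: resolve with the first message, reject on error
// or if the worker exits before sending anything.
function runWorker(source, workerData) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      // Has no effect if the promise was already settled by 'message' or 'error'.
      reject(new Error(`Worker exited with code ${code} before posting a result`));
    });
  });
}

// Stand-in worker: doubles the number it receives.
const demoSource = `
  const { parentPort, workerData } = require('worker_threads');
  parentPort.postMessage({ doubled: workerData.value * 2 });
`;

runWorker(demoSource, { value: 21 })
  .then((result) => console.log('Worker result:', result))
  .catch((error) => console.error('Worker failed:', error.message));
```

Because a promise can only settle once, rejecting unconditionally in the 'exit' handler is safe: it only matters when neither 'message' nor 'error' fired first.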
Desperately trying to write a sync version of https://www.npmjs.com/package/node-firebird#reading-blobs-aasynchronous
Basically I need to (a)wait twice:
1. for the callback function to execute so that the eventEmitter is available
2. for the "end" event to occur
and then return the Buffer.
My code (JS/TS mix for now) currently does 2, but not 1: readBLOB returns undefined, and Buffer.concat(buffers) is only called later:
function readBLOB(callback: any): Buffer {
return callback(async (err, _, eventEmitter) => {
let buffers = []
if (err)
throw err
eventEmitter.on('data', chunk => {
buffers.push(chunk);
});
return await eventEmitter.once('end', function (e) {
return Buffer.concat(buffers)
})
})
}
Sorry to ask one more time (yes, I checked a lot of other questions and tried a lot of things...), but how can I make this work (simply...)?
(the function that calls the callback is fetch_blob_async in https://github.com/hgourvest/node-firebird/blob/master/lib/index.js#L4261 , just in case...)
There are a few mistakes here, like returning the result of the callback function, which is, I guess, undefined, or returning something inside a callback function, which makes no sense.
Also, async / await has no effect here. async / await is only useful if you want to wait until some Promise resolves, but you have no Promise in your code at all.
What you need is new Promise:
function readBLOB(callback) {
return new Promise((resolve, reject) => {
callback((err, _, eventEmitter) => {
let buffers = [];
if (err) return reject(err);
eventEmitter.on("data", chunk => {
buffers.push(chunk);
});
eventEmitter.once("end", function(e) {
resolve(Buffer.concat(buffers));
});
});
});
}
Simple as that. You resolve with your Buffer and reject if an error occurs.
Now you can use it like:
readBLOB(cb).then(data => {
console.log(data);
})
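With a promise in place, the caller can also use await, which is probably the closest thing to the "sync version" asked for. A minimal sketch, where collectChunks is a hypothetical helper and a plain EventEmitter stands in for the emitter that the library passes to the callback; the 'error' listener is an assumption about how a failing stream would report problems:

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: gather 'data' chunks from an emitter into one Buffer.
function collectChunks(emitter) {
  return new Promise((resolve, reject) => {
    const buffers = [];
    emitter.on('data', (chunk) => buffers.push(chunk));
    emitter.once('end', () => resolve(Buffer.concat(buffers)));
    emitter.once('error', reject);
  });
}

// Usage inside an async function, with a fake emitter in place of the blob stream.
async function main() {
  const fakeBlob = new EventEmitter();
  const pending = collectChunks(fakeBlob);
  fakeBlob.emit('data', Buffer.from('hello '));
  fakeBlob.emit('data', Buffer.from('world'));
  fakeBlob.emit('end');
  const data = await pending;
  console.log(data.toString());
}

main().catch(console.error);
```

The function still returns a promise rather than a Buffer; await only makes the calling code read sequentially.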
Let's say I have a function hello written like this:
const functions = require("firebase-functions");
const wait3secs = () => {
return new Promise((resolve, reject) => {
setTimeout(() => {
console.log('3 secs job complete');
resolve('done 3 secs');
}, 3000);
});
}
const wait2secs = () => {
return new Promise((resolve, reject) => {
setTimeout(() => {
console.log('2 secs job complete');
resolve('done 2 secs');
}, 2000);
});
}
exports.hello = functions.https.onRequest((req, res) => {
wait3secs(); // Unhandled, this uses 3 seconds to complete.
return wait2secs().then(data => {
res.send(data); // Send response after 2 secs.
});
});
The question is: did I write the above implementation correctly (considering the unhandled promise)? And if yes, is wait3secs guaranteed to run (asynchronously) in Firebase functions until the end, even after the response has been sent?
I have searched in Firebase (here) but haven't found a specific answer to my question.
According to the documentation:
Terminate HTTP functions with res.redirect(), res.send(), or res.end().
What this is saying is that when you call res.send(), the function will be terminated. You should not expect any async work to complete after that - the function will be shut down.
If you need to do more work after sending the response, you will have to arrange to trigger another function to run in the background, such as a pubsub function. That function will need to return a promise that resolves only after all the async work is complete, so that it also does not get shut down prematurely.
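If the extra work is short, one simple alternative is to wait for all of it before sending the response. A minimal sketch, assuming a plain handler function (hypothetical name helloHandler) that would be passed to functions.https.onRequest; the fakeRes object only imitates the send method of the real response, and the delays are shortened:

```javascript
// Short timer-based jobs standing in for wait3secs and wait2secs.
const job = (label, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(`done ${label}`), ms));

// Hypothetical handler: both jobs finish before the response is sent,
// so nothing is left running when the function is terminated.
async function helloHandler(req, res) {
  const [slow, fast] = await Promise.all([job('slow job', 30), job('fast job', 20)]);
  res.send({ slow, fast });
}

// Usage with a minimal stand-in for the response object.
const fakeRes = { send: (body) => console.log('Response:', body) };
helloHandler({}, fakeRes).catch(console.error);
```

The trade-off is that the caller waits for the slowest job, so this only suits work that is quick enough to keep the response time acceptable.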
I am trying to unzip a file first and then wait for the unzipping to complete before I loop through each file and upload it to an S3 bucket. The first function, unzipPromise, runs fine and everything gets unzipped into the proper directory, but uploadS3Promise does not run at all. I am not getting any errors; the process just unzips the file and never touches the uploadS3Promise function.
function unzipInput(file, client, project_number, oldpath, newpath) {
path = `zips/${client}/${project_number}/${file}/`;
function unzipPromise() {
return new Promise((resolve, reject) => {
fse.mkdirsSync(path);
fs.rename(oldpath, newpath, err => {
if (err) {
throw err;
}
});
fs.createReadStream(newpath).pipe(unzip.Extract({ path }));
});
}
function uploadS3Promise() {
console.log("running");
return new Promise((resolve, reject) => {
// fs.unlinkSync(newpath);
fs.readdirSync(newpath).forEach(file => {
uploadToS3(file, client, project_number, path);
console.log(file, "test");
});
if (err) reject(err);
else resolve("success");
});
}
// New code with async:
(async () => {
try {
await unzipPromise();
await uploadS3Promise();
} catch (e) {
console.error(e);
}
})();
}
uploadS3Promise doesn't run because the code is still waiting for unzipPromise to complete. Execution will not continue past the await until that promise is resolved or rejected.
So in your code ...
function unzipPromise(){
...
resolve(...)
...
}
On an unrelated note, I think it would be more readable not to end function names in "Promise". Just call them unzip and uploadS3. We don't usually name functions after their return types; we never say intIndexOf, for example.
You are supposed to call resolve once the unzip is done, or reject if an error occurred.
And since streams are EventEmitters, you can listen to their events and react to them:
const stream = fs.createReadStream(newpath).pipe(unzip.Extract({ path }))
stream
.on('error', (err) => reject(err))
.on('finish', () => resolve())
Personally, I'd use a .then to break the process down a bit.
unzipPromise().then(res => {
console.log('resolved promise 1');
return uploadS3Promise();
}, rej => {
console.log('rejected promise 1, sorry!');
});
Also, note that unzipInput itself never returns a promise, so its callers cannot wait for it to finish.
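Putting these points together, a minimal sketch of how the outer function could return a promise that settles only after every step has finished. The name unzipAndUpload and the three injected step functions are hypothetical; timers stand in for the real unzip and S3 calls:

```javascript
// Hypothetical restructuring: each step is passed in as a promise-returning
// function, and the outer async function returns the combined promise.
async function unzipAndUpload({ unzip, listFiles, upload }) {
  await unzip(); // must resolve only when extraction has finished
  const files = await listFiles();
  // Upload one file at a time; Promise.all(files.map(upload)) would run them in parallel.
  for (const file of files) {
    await upload(file);
  }
  return files.length;
}

// Usage with timer-based stand-ins for the real steps.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const uploaded = [];
unzipAndUpload({
  unzip: () => delay(10),
  listFiles: async () => ['a.txt', 'b.txt'],
  upload: async (file) => {
    await delay(5);
    uploaded.push(file);
  },
})
  .then((count) => console.log(`Uploaded ${count} files:`, uploaded))
  .catch((error) => console.error('Processing failed:', error));
```

Any error thrown or rejected in a step propagates to the single catch at the call site.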