NodeJS request times out when sending an archiver file - javascript

I have a NextJS app where I have to generate loads of QR codes, 300, 400, or 500 at a time. I want to put them inside a zip and let the user download them. Here's the code that puts the codes into a zip using the archiver library:
const archive = archiver.create("zip", {});
let index = 1;
for (const code of codes) {
// @ts-expect-error
const qrCode = new QRCodeStyling({
nodeCanvas: canvas,
width: 300,
height: 300,
data: code.url
});
archive.append(
await qrCode.download({
name: "testName",
extension: "png",
skipDownload: true,
buffer: true,
}),
{
name: `${code.name}-${index}.png`,
},
);
index++;
}
console.log("before finalizing");
await archive.finalize();
console.log("after finalizing");
return archive;
The request takes about 10 seconds on a local machine, and in production it times out every time. Sometimes the program doesn't respond at all; it just hangs after logging "before finalizing".
This is how I'm sending my codes to the frontend:
res.setHeader("Content-Disposition", "attachment");
return res.status(200).send(await exportCodes());
Please note that these zip files are usually around 3-5 MB, so I don't think size is the problem.

Each call to qrCode.download(...) creates a promise, which you await before passing the result to archive.append(...). Only after that does the next QR code start downloading. If downloading one QR code takes 100 ms, then 300 codes take about 30,000 ms, or 30 seconds. Of course it times out.
You have to parallelize the downloads. Create all the promises first, await them together, then archive.append(...) the results one by one:
interface QrCodeDownloaded {
name: string;
index: number;
// I don't know the type of result here, sorry; must be something like Buffer | Stream | string
result: any;
}
const archive = archiver.create("zip", {});
const downloads: Promise<QrCodeDownloaded>[] = [];
// We have to use classic for-loop here, because we want to have
// an independent `index` variable for each iteration
for (let index = 0; index < codes.length; index++) {
// counting from 0 here
const code = codes[index];
const qrCode = new QRCodeStyling({
nodeCanvas: canvas,
width: 300,
height: 300,
data: code.url
});
const download = qrCode.download({
name: "testName",
extension: "png",
skipDownload: true,
buffer: true,
});
// We use .then() here because we want to hold on to the code names and
// indexes without iterating through the array a second time. We can't use
// `await` here, because that would negate the whole point: we want to
// start the next download without waiting for the current one
downloads.push(download.then((result) => ({
result,
name: code.name,
index: index + 1, // counting from 1 here
})));
}
// This is the most important line, it does the parallelizing
const results = await Promise.all(downloads);
for (const { result, name, index } of results) {
archive.append(result, { name: `${name}-${index}.png` });
}
console.log("before finalizing");
await archive.finalize();
console.log("after finalizing");
return archive;
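To make the timing argument concrete, here is a minimal sketch that compares awaiting one task at a time with Promise.all. The fakeDownload function and its 30 ms delay are assumptions standing in for qrCode.download(...); nothing here uses the real QR or archiver libraries.

```javascript
// Hypothetical stand-in for qrCode.download(...): resolves after `ms` milliseconds.
function fakeDownload(name, ms) {
  return new Promise((resolve) => setTimeout(() => resolve(`${name}.png`), ms));
}

// One at a time: total time is roughly the sum of all delays.
async function downloadSequentially(names, ms) {
  const results = [];
  for (const name of names) {
    results.push(await fakeDownload(name, ms));
  }
  return results;
}

// All at once: total time is roughly the longest single delay.
function downloadInParallel(names, ms) {
  return Promise.all(names.map((name) => fakeDownload(name, ms)));
}

async function compare() {
  const names = ["a", "b", "c", "d", "e"];

  let start = Date.now();
  const sequential = await downloadSequentially(names, 30);
  const sequentialMs = Date.now() - start;

  start = Date.now();
  const parallel = await downloadInParallel(names, 30);
  const parallelMs = Date.now() - start;

  // Promise.all keeps the input order, so both arrays match.
  return { sequential, parallel, sequentialMs, parallelMs };
}

compare().then(({ sequentialMs, parallelMs }) => {
  console.log(`sequential: ${sequentialMs} ms, parallel: ${parallelMs} ms`);
});
```

The speed-up only applies when the work is genuinely asynchronous (I/O or timers). If generating each image is CPU-bound, the work still runs on a single thread and Promise.all alone will not shorten it much.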

How can more than 2 parallel API requests change the database without skipping one or more of them?

I have an endpoint that uploads images and updates a database table.
I send 3 requests to this endpoint at the same time. The problem appears whenever I send more than 2 API requests.
The first request to reach the endpoint uploads its images and updates the database table successfully.
The second request uploads its images, sees the database changes of the first request, and updates the database table successfully.
The third request uploads its images, doesn't see the database changes of the second request, and updates the database table successfully.
As a result, only the database changes of the first and third requests apply. The changes of the second request are either not applied or are overwritten somehow.
I use the pg npm package.
Is the problem in my code or in the pg package?
How can this problem be solved?
Controller:
@UseStaffPermissionsGuards('upsert', 'VehicleCondition')
@ApiBody({ type: VehiclePhotoConditionInfoImageDTO })
@ApiResponse({ status: 201 })
@Post(':id/photos/:photoConditionId/image')
@ApiConsumes('multipart/form-data')
@UseInterceptors(FilesInterceptor('images'), FilesToBodyInterceptor)
async upsertImages(
@Param('id') vehicleId: string,
@Param('photoConditionId') photoConditionId: string,
@Body() vehiclePhotoConditionInfoImages: VehiclePhotoConditionInfoImageDTO,
): Promise<void> {
return this.vehiclePhotoConditionService.upsertImages(
vehicleId,
photoConditionId,
vehiclePhotoConditionInfoImages,
);
}
Service:
async upsertImages(
vehicleId: string,
vehiclePhotoConditionId: string,
vehiclePhotoConditionImage: VehiclePhotoConditionInfoImageDTO,
): Promise<void> {
await this.isVehicleExist(vehicleId);
const vehiclePhotoCondition = await this.getOne(vehicleId, vehiclePhotoConditionId);
if (!vehiclePhotoCondition) {
throw new BadRequestException(
`The vehicle photo condition ${vehiclePhotoConditionId} is not found`,
);
}
const imageKeys = await this.handleImages(vehiclePhotoConditionId, vehiclePhotoConditionImage);
const updatedVehiclePhotoConditions = vehiclePhotoCondition.info.map((data) => {
if (data.vehiclePart === vehiclePhotoConditionImage.vehiclePart) {
data.uploadedImagesKeys.push(...imageKeys);
}
return data;
});
const query = sql
.update('vehicle_photo_condition', {
info: JSON.stringify(updatedVehiclePhotoConditions),
updated_at: sql('now()'),
})
.where({ id: vehiclePhotoConditionId });
await this.db.query(query.toParams());
}
I solved the problem and am posting the corrected code.
Here is the explanation:
In the previous code, I updated the inner jsonb array of objects in application code. Because that code takes some time and Node.js is asynchronous, an earlier request can take longer than requests that arrive later, and whichever request writes last overwrites the others' changes. This causes data inconsistency.
Here is the previous code:
const updatedVehiclePhotoConditions = vehiclePhotoCondition.info.map((data) => {
if (data.vehiclePart === vehiclePhotoConditionImage.vehiclePart) {
data.uploadedImagesKeys.push(...imageKeys);
}
return data;
});
In the current code, I update the inner jsonb array of objects inside the database and let the database do the operation, so the data inconsistency no longer occurs.
Here is current code:
const query = {
text: `
UPDATE vehicle_photo_condition
SET info = s.json_array
FROM (
SELECT
jsonb_agg(
CASE WHEN obj ->> 'vehiclePart' = $1 THEN
jsonb_set(obj, '{uploadedImagesKeys}', $2)
ELSE obj END
) as json_array
FROM vehicle_photo_condition, jsonb_array_elements(info) obj WHERE id = $3
) s WHERE id = $3`,
values: [
vehiclePhotoConditionImage.vehiclePart,
JSON.stringify(imageKeys),
vehiclePhotoConditionId,
],
};
async upsertImages(
vehicleId: string,
vehiclePhotoConditionId: string,
vehiclePhotoConditionImage: VehiclePhotoConditionInfoImageDTO,
): Promise<void> {
const vehiclePhotoCondition = await this.getOne(vehicleId, vehiclePhotoConditionId);
if (!vehiclePhotoCondition) {
throw new BadRequestException(
`The vehicle photo condition ${vehiclePhotoConditionId} is not found`,
);
}
const imageKeys = await this.handleImages(vehiclePhotoConditionId, vehiclePhotoConditionImage);
const query = {
text: `
UPDATE vehicle_photo_condition
SET info = s.json_array
FROM (
SELECT
jsonb_agg(
CASE WHEN obj ->> 'vehiclePart' = $1 THEN
jsonb_set(obj, '{uploadedImagesKeys}', $2)
ELSE obj END
) as json_array
FROM vehicle_photo_condition, jsonb_array_elements(info) obj WHERE id = $3
) s WHERE id = $3`,
values: [
vehiclePhotoConditionImage.vehiclePart,
JSON.stringify(imageKeys),
vehiclePhotoConditionId,
],
};
await this.db.query(query);
}
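The lost update described in the explanation can be reproduced without a database. The sketch below is only an illustration under stated assumptions: makeRow, unsafeAppend, and atomicAppend are hypothetical names, and a plain object stands in for the table row. It does not use the pg API.

```javascript
// In-memory stand-in for one database row (hypothetical; not the pg API).
function makeRow() {
  return { info: [] };
}

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Read-modify-write: read a snapshot, do slow work, then write the whole value back.
async function unsafeAppend(row, key, workMs) {
  const snapshot = [...row.info]; // read
  await delay(workMs);            // slow work, e.g. uploading images
  snapshot.push(key);             // modify the (possibly stale) copy
  row.info = snapshot;            // write back, discarding newer changes
}

// Atomic variant: do the slow work first, then change the current value in one step.
async function atomicAppend(row, key, workMs) {
  await delay(workMs);
  row.info = [...row.info, key];
}

async function runLostUpdateDemo() {
  const unsafeRow = makeRow();
  await Promise.all([
    unsafeAppend(unsafeRow, "img-1", 30),
    unsafeAppend(unsafeRow, "img-2", 10),
    unsafeAppend(unsafeRow, "img-3", 20),
  ]);

  const atomicRow = makeRow();
  await Promise.all([
    atomicAppend(atomicRow, "img-1", 30),
    atomicAppend(atomicRow, "img-2", 10),
    atomicAppend(atomicRow, "img-3", 20),
  ]);

  // All three unsafe writers started from the same empty snapshot,
  // so only the last writer's key survives. The atomic row keeps all three.
  return { unsafe: unsafeRow.info, atomic: atomicRow.info };
}

runLostUpdateDemo().then((result) => console.log(result));
```

One thing to double-check in the SQL above: jsonb_set(obj, '{uploadedImagesKeys}', $2) replaces the array with $2, whereas the earlier push(...imageKeys) appended to it. If appending is the intent, concatenating with the existing value, for example (obj -> 'uploadedImagesKeys') || $2::jsonb as the new value, is one option. I have not run that against this schema.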

Dynamic filename in Winston dailyrotate for Promtail/Loki/Grafana

My Node.js application writes logs with Winston. Promtail then picks these logs up, Loki saves them to S3, and they are processed in a Grafana dashboard.
I want Winston to rotate the logs every 30 minutes with DailyRotateFile. While a log is still being appended to, it should be stored in my folder "/home/gad-web/gad-logs". Once it is rotated, I want to move it to "/home/gad-web/gad-logs-rotated", which is the folder Promtail will be watching.
I want to use dynamic filenames for the different logs being written, so that I can easily assign static labels to each file separately in Promtail, rather than having to process each log line and assign a dynamic label to every line in one large file.
My file logger.mjs looks like this (formats, levels, and other irrelevant details are left out):
const logDir = '/home/gad-web/gad-logs'
const logDirRotated = '/home/gad-web/gad-logs-rotated'
let winstonGdprProofFormat = winston.format.combine(...)
let winstonDailyRotateFileTransport = new winston.transports.DailyRotateFile({
frequency: '30m',
format: winstonGdprProofFormat,
filename: `${logDir}/all-gdpr-proof-%DATE%.log`,
datePattern: 'YYYY-MM-DD HH-mm',
})
// Move the file to another location after it is rotated, so it can be picked up by Promtail
winstonDailyRotateFileTransport.on('rotate', function (oldFilenamePath, newFilenamePath) {
let pathToMoveTo = `${logDirRotated}/${path.basename(oldFilenamePath)}`
fs.rename(oldFilenamePath, pathToMoveTo, function (err) {
if (err) throw err
})
})
let winstonTransports = []
if (process.env.environment !== 'local') {
winstonTransports.push(winstonConsoleTransport)
winstonTransports.push(winstonDailyRotateFileTransport)
} else {
winstonTransports.push(winstonConsoleWithColorsTransport)
}
const logger = winston.createLogger({
level: process.env.environment !== 'local' ? 'info' : 'debug',
levels: winstonLevels,
transports: winstonTransports,
})
export function log (obj) {
let { level, requestId, method, uri, msg, time, data } = obj
if (!level) {
level = 'info'
}
logger.log({
level: level,
requestId: requestId,
method: method,
uri: uri,
msg: msg,
time: time,
data: data,
})
}
It is being called in files that write logs like this:
import { log } from '../config/logger.mjs'
...
function writeRequestLog (start, request, requestId) {
let end = new Date().getTime()
let diff = end - start
log({ level: 'info', requestId: requestId, method: request.method, uri: request.path, msg: null, time: `${diff}ms`, data: JSON.stringify(request.query) })
}
Since the file is imported directly, it is executed immediately, and winstonDailyRotateFileTransport is created with ${logDir}/all-gdpr-proof-%DATE%.log as the filename. How can I avoid instantiating the transport with a fixed filename, so that I get log files rotated every 30 minutes for a set of dynamically named files?
I tried creating a class in JS, but I quickly got into trouble because of the .on('rotate', ...) handler defined for winstonDailyRotateFileTransport, and I'm also not sure what other implications creating a class might have (since this logger will be used many times in my code).

How to suppress console output in Tesseract.js?

Tesseract.js seems to print to the console with every call to .recognize(), even with no option parameters attached.
It seems possible to quiet the output with the Tesseract CLI by using the "quiet" flag, but I can't find anything like that for Tesseract.js.
I've scanned through the parameters that could be passed to "options" as found on the Tesseract.js repository:
https://github.com/naptha/tesseract.js/blob/master/docs/tesseract_parameters.md
I've tried setting everything that has to do with "DEBUG" to 0, and I've tried sending the output to a "debug_file" parameter, but nothing I do seems to change the console output.
Here's a basic example with no parameters on the "options" object:
const fs = require('fs');
const Tesseract = require('tesseract.js');
const image = fs.readFileSync('path/to/image.jpg');
const options = {};
Tesseract.recognize(image, options)
.finally((resultOrError) => {
Tesseract.terminate();
}
);
I would expect there to be no output at all here, but instead this gets printed:
pre-main prep time: 76 ms
{ text: '',
html: '<div class=\'ocr_page\' id=\'page_1\' title=\'image ""; bbox 0 0 600 80; ppageno 0\'>\n</div>\n',
confidence: 0,
blocks: [],
psm: 'SINGLE_BLOCK',
oem: 'DEFAULT',
version: '3.04.00',
paragraphs: [],
lines: [],
words: [],
symbols: [] }
UPDATE
Okay, okay. It's early in the morning, I could have tried a little harder here. It looks like Tesseract.js automatically dumps everything to the console if you don't make calls to .catch() and .then(). With the example below, most of the console output disappears.
const fs = require('fs');
const Tesseract = require('tesseract.js');
const image = fs.readFileSync('path/to/image.jpg');
const options = {};
const doSomethingWithResult = (result) => { result };
const doSomethingWithError = (error) => { error };
Tesseract.recognize(image, options)
.then(result => doSomethingWithResult(result))
.catch(err => doSomethingWithError(err))
.finally((resultOrError) => {
Tesseract.terminate();
}
);
Now, only this gets printed to the console:
pre-main prep time: 66 ms
I'd still like a way to suppress this, so I'm going to leave the question unanswered for now. I hope someone can chime in with a suggestion.

nodejs recursively call same api and write to excel file sequentially

I need to call an API recursively using request-promise and, after getting each result, write it to an Excel file. A sample API response is given below:
{
"totalRecords": 9524,
"size": 20,
"currentPage": 1,
"totalPages": 477,
"result": [{
"name": "john doe",
"dob": "1999-11-11"
},
{
"name": "john1 doe1",
"dob": "1989-12-12"
}
]
}
Now I want to call this API n times, where n is equal to totalPages. After each call, I want to write the response result to the Excel file.
First write the page 1 result to the Excel file, then append the page 2 result, and so on.
I have written some sample code, given below:
function callAPI(pageNo) {
var options = {
url: "http://example.com/getData?pageNo="+pageNo,
method: 'GET',
headers: {
'Content-Type': 'application/json'
},
json: true
}
return request(options)
}
callAPI(1).then(function (res) {
// Write res.result to excel file
}).catch(function (err) {
// Handle error here
})
But I am having trouble calling the API recursively while keeping the writes sequential: the page 1 result should be written to the Excel file first, then the page 2 result appended, and so on.
Is there a code sample showing how to achieve this in Node.js?
You want to do something like this:
function getAllPages() {
function getNextPage(pageNo) {
return callAPI(pageNo).then(response => {
let needNextPage = pageNo < response.totalPages; // false once the last page is reached
if (pageNo === 1) {
// write to file
} else {
// append to file
}
if (needNextPage) {
return getNextPage(pageNo+1);
} else {
return undefined;
}
});
}
return getNextPage(1);
}
Obviously, needNextPage has to become false to stop the recursion when you're done, for example once pageNo reaches response.totalPages.
So you want to do 477 requests in sequence? How long do you want to wait for this to finish? Even in parallel, this would still take too long for me.
Best: write an API that can return a batch of pages at once, reducing the number of requests to the backend. Maybe something like http://example.com/getData?pages=1-100, and let it return an array, maybe like
[
{
"totalRecords": 9524,
"currentPage": 1,
"totalPages": 477,
"result": [...]
},
{
"totalRecords": 9524,
"currentPage": 2,
"totalPages": 477,
"result": [...]
},
...
]
or more compact
{
"totalRecords": 9524,
"totalPages": 477,
"pages": [
{
"currentPage": 1,
"result": [...]
},
{
"currentPage": 2,
"result": [...]
},
...
]
}
Sidenote: writing the size of the results array into the JSON is unnecessary. This value can easily be determined from data.result.length.
But back to your question.
IMO, all you need to run in sequence is adding the pages to the sheet. The requests can be done in parallel. That already saves you a lot of overall runtime for the whole task.
callAPI(1).then(firstPage => {
  let {currentPage, totalPages} = firstPage;
  // `previous` ensures that the pages are written in sequence,
  // even if some later requests finish sooner than earlier ones.
  let previous = Promise.resolve(firstPage).then(writePageToExcel);
  while(++currentPage <= totalPages){
    // make the next request in parallel
    let p = callAPI(currentPage);
    // execute `writePageToExcel` in sequence
    // as soon as all previous ones have finished
    previous = previous.then(() => p.then(writePageToExcel));
  }
  return previous;
})
.then(() => console.log("work done"));
Or you wait for all pages to be loaded before you write them to Excel:
callAPI(1).then(firstPage => {
  let {currentPage, totalPages} = firstPage;
  let promises = [firstPage];
  // `<=` so that the last page is requested as well
  while(++currentPage <= totalPages)
    promises.push(callAPI(currentPage));
  // wait for all requests to finish
  return Promise.all(promises);
})
// write all pages to excel
.then(writePagesToExcel)
.then(() => console.log("work done"));
Or you could batch the requests:
callAPI(1).then(firstPage => {
  const batchSize = 16;
  let {currentPage, totalPages} = firstPage;
  return Promise.resolve([ firstPage ])
    .then(writePagesToExcel)
    .then(function nextBatch(){
      // `>=` avoids writing an empty batch after the last page
      if(currentPage >= totalPages) return;
      // load a batch of pages in parallel
      let batch = [];
      for(let i=0; i<batchSize && ++currentPage <= totalPages; ++i){
        batch[i] = callAPI(currentPage);
      }
      // when the batch is done ...
      return Promise.all(batch)
        // ... write it to the excel sheet ...
        .then(writePagesToExcel)
        // ... and process the next batch
        .then(nextBatch);
    });
})
.then(() => console.log("work done"));
But don't forget to add the error handling. Since I'm not sure how you'd want to handle errors with the approaches I've posted, I didn't include the error-handling here.
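As one possible way to add that error handling to the first variant (parallel requests, sequential writes), here is a sketch with a simulated endpoint. The names fakeCallAPI, fetchAndWriteAll, and the failOn parameter are assumptions for illustration only; a real version would call the actual API and Excel writer instead.

```javascript
// Hypothetical stand-in for the paginated endpoint.
// Later pages answer sooner, so responses arrive out of order.
const TOTAL_PAGES = 5;

function fakeCallAPI(pageNo, failOn = null) {
  const ms = (TOTAL_PAGES - pageNo + 1) * 5;
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (pageNo === failOn) {
        reject(new Error(`page ${pageNo} failed`));
      } else {
        resolve({ currentPage: pageNo, totalPages: TOTAL_PAGES, result: [] });
      }
    }, ms);
  });
}

function fetchAndWriteAll(failOn = null) {
  const written = []; // records the order in which pages are "written"
  const writePage = (page) => {
    written.push(page.currentPage);
  };

  return fakeCallAPI(1, failOn)
    .then((firstPage) => {
      let { currentPage, totalPages } = firstPage;
      let previous = Promise.resolve(firstPage).then(writePage);
      while (++currentPage <= totalPages) {
        const p = fakeCallAPI(currentPage, failOn);
        // Mark an early rejection as handled; the chain below still observes it.
        p.catch(() => {});
        previous = previous.then(() => p).then(writePage);
      }
      return previous;
    })
    .then(() => ({ ok: true, written }))
    // A single catch at the end sees the first failure; later pages are not written.
    .catch((error) => ({ ok: false, written, error: error.message }));
}

fetchAndWriteAll().then((summary) => console.log(summary));
fetchAndWriteAll(3).then((summary) => console.log(summary));
```

With this shape, the pages written before the failure stay written, so the caller has to decide whether to retry from the failed page or start over.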
Edit:
can u pls modify batch requests, getting some error, where you are assigning totalPages it's not right why the totalPages should equal to firstPage
let {currentPage, totalPages} = firstPage;
//is just a shorthand for
let currentPage = firstPage.currentPage, totalPages = firstPage.totalPages;
//what JS version are you targeting?
This first request, callAPI(1).then(firstPage => ...), is primarily there to determine currentPage and totalPages, as you provide these properties in the returned JSON. Now that I know these two, I can initiate as many requests in parallel as I want. I don't have to wait for any of them to finish to determine which page I am at and whether there are more pages to load.
and why you are writing return Promise.resolve([ firstPage ])
To save me some trouble and checking, as I don't know anything about how you'd implement writePagesToExcel.
I return Promise.resolve(...) so I can do .then(writePagesToExcel). This solves two problems for me:
I don't have to care whether writePagesToExcel returns synchronously or returns a promise, and I can always follow up with another .then(...)
I don't need to care whether writePagesToExcel may throw. In case of any error, it all ends up in the promise chain and can be taken care of there.
So ultimately I save myself a few checks by simply wrapping firstPage back up in a promise and continuing with .then(...). Considering the amount of data you're processing here, IMO this isn't too much overhead for getting rid of some potential pitfalls.
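Both points can be seen in a small sketch. The three writer functions below are hypothetical stand-ins for writePagesToExcel: one returns a value synchronously, one returns a promise, and one throws.

```javascript
// Hypothetical writers standing in for writePagesToExcel.
const syncWriter = (pages) => pages.length;
const asyncWriter = (pages) => Promise.resolve(pages.length);
const throwingWriter = () => {
  throw new Error("disk full");
};

function writeVia(writer, pages) {
  // Running the writer inside .then() turns plain return values, promises,
  // and thrown errors alike into the state of the promise chain.
  return Promise.resolve(pages)
    .then(writer)
    .then(
      (count) => `wrote ${count}`,
      (error) => `failed: ${error.message}`
    );
}

function runWriterDemo() {
  return Promise.all([
    writeVia(syncWriter, [1, 2]),
    writeVia(asyncWriter, [1, 2, 3]),
    writeVia(throwingWriter, []),
  ]);
}

runWriterDemo().then((outcomes) => console.log(outcomes));
// prints [ 'wrote 2', 'wrote 3', 'failed: disk full' ]
```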
why you are passing array like in resolve
To stay consistent in each example. In this example, I named the function that processes the data writePagesToExcel (plural), which should indicate that it deals with multiple pages (an array of them); I thought that this would be clear in that context.
Since I still need this separate call at the beginning to get firstPage, and I didn't want to complicate the logic in nextBatch just to concatenate this first page with the first batch, I treat [firstPage] as a separate "batch", write it to Excel, and continue with nextBatch.
function callAPI(pageNo) {
  var options = {
    url: "http://example.com/getData?pageNo="+pageNo,
    method: 'GET',
    headers: {
      'Content-Type': 'application/json'
    },
    json: true
  }
  return request(options)
}
// Placeholder: must return a promise that resolves once the page is written.
function writeToExcel(res) {
  console.log(res);
  return Promise.resolve();
}
callAPI(1).then(function (firstPage) {
  if (!firstPage) return;
  // Start the chain with the first page, then add one step per remaining page,
  // so each page is requested and written only after the previous one is done.
  var chain = writeToExcel(firstPage);
  for (let pageNo = firstPage.currentPage + 1; pageNo <= firstPage.totalPages; pageNo++) {
    chain = chain.then(() => callAPI(pageNo)).then(function (res) {
      if (res) {
        return writeToExcel(res);
      }
    });
  }
  return chain;
}).catch(function (err) {
  // Handle error here
})

Firebase Real Time Database Structure for File Upload

I am working on my first Firebase project using AngularFire2. Below is the overall design of my learning project.
Users upload photos, which are stored in Firebase Storage as images.
The uploaded photos are listed on the homepage, sorted by timestamp.
Below is the structure I started with. But I find it difficult to do joins. I should be able to get the user details for each upload and to sort uploads by timestamp.
User:
- Name
- Email
- Avatar
Uploads:
- ImageURL
- User ID
- Time
I read a few blogs about de-normalising the data structure. For my scenario, how can I best re-model my database structure?
An example of creating some sample data in the proposed structure would be great for my understanding.
Once the image upload is done, I call the code below to create an entry in the database.
addUpload(image: any): firebase.Promise<any> {
return firebase.database().ref('/userUploads').push({
user: firebase.auth().currentUser.uid,
image: image,
time: new Date().getTime()
});
}
I am trying to join the 2 entities as below. I am not sure how I can do it efficiently and correctly.
getUploads(): any {
const rootDef = this.db.database.ref();
const uploads = rootDef.child('userUploads').orderByChild('time');
uploads.on('child_added',snap => {
let userRef =rootDef.child('userProfile/' + snap.child('user').val());
userRef.once('value').then(userSnap => {
???? HOW TO HANDLE HERE
});
});
return ?????;
}
I would like to get a final list having all upload details and its corresponding user data for each upload.
This type of join will always be tricky if you write it from scratch. But I'll try to walk you through it. I'm using this JSON for my answer:
{
uploads: {
Upload1: {
uid: "uid1",
url: "https://firebase.com"
},
Upload2: {
uid: "uid2",
url: "https://google.com"
}
},
users: {
uid1: {
name: "Purus"
},
uid2: {
name: "Frank"
}
}
}
We're taking a three-step approach here:
Load the data from uploads
Load the users for that data from users
Join the user data to the upload data
1. Load the data from uploads
Your code is trying to return a value. Since the data is loaded from Firebase asynchronously, it won't be available yet when your return statement executes. That gives you two options:
Pass in a callback to getUploads() that you then call when the data has loaded.
Return a promise from getUploads() that resolves when the data has loaded.
I'm going to use promises here, since the code is already difficult enough.
function getUploads() {
return ref.child("uploads").once("value").then((snap) => {
return snap.val();
});
}
This should be fairly readable: we load all uploads and, once they are loaded, we return the value.
getUploads().then((uploads) => console.log(uploads));
Will print:
{
Upload1 {
uid: "uid1",
url: "https://firebase.com"
},
Upload2 {
uid: "uid2",
url: "https://google.com"
}
}
2. Load the users for that data from users
Now in the next step, we're going to be loading the user for each upload. For this step we're not returning the uploads anymore, just the user node for each upload:
function getUploads() {
return ref.child("uploads").once("value").then((snap) => {
var promises = [];
snap.forEach((uploadSnap) => {
promises.push(
ref.child("users").child(uploadSnap.val().uid).once("value")
);
});
return Promise.all(promises).then((userSnaps) => {
return userSnaps.map((userSnap) => userSnap.val());
});
});
}
You can see that we loop over the uploads and create a promise for loading the user for that upload. Then we return Promise.all(), which ensures its then() only gets called once all users are loaded.
Now calling
getUploads().then((uploads) => console.log(uploads));
Prints:
[{
name: "Purus"
}, {
name: "Frank"
}]
So we get an array of users, one for each upload. Note that if the same user had posted multiple uploads, you'd get that user multiple times in this array. In a real production app you'd want to de-duplicate the users. But this answer is already getting long enough, so I'm leaving that as an exercise for the reader...
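For readers who want a starting point for that exercise, here is one possible sketch. It caches the lookup promise per uid and joins by uid instead of by array position. The users object and loadUser function are hypothetical stand-ins for ref.child("users").child(uid).once("value"), not Firebase calls.

```javascript
// Hypothetical data and loader standing in for the Firebase user lookup.
const users = { uid1: { name: "Purus" }, uid2: { name: "Frank" } };
let lookups = 0; // counts how many times a user is actually loaded

function loadUser(uid) {
  lookups++;
  return Promise.resolve(users[uid]);
}

function joinUploadsWithUsers(uploads) {
  const cache = new Map(); // uid -> promise of the user

  // Start one lookup per distinct uid; repeated uids reuse the same promise.
  const getUser = (uid) => {
    if (!cache.has(uid)) {
      cache.set(uid, loadUser(uid));
    }
    return cache.get(uid);
  };

  // Join by uid rather than by index, so de-duplication cannot misalign the data.
  return Promise.all(
    uploads.map((upload) =>
      getUser(upload.uid).then((user) => ({ ...upload, username: user.name }))
    )
  );
}

const sampleUploads = [
  { uid: "uid1", url: "https://firebase.com" },
  { uid: "uid2", url: "https://google.com" },
  { uid: "uid1", url: "https://example.com" },
];

joinUploadsWithUsers(sampleUploads).then((joined) => {
  console.log(joined, `lookups: ${lookups}`);
});
```

Caching the promise rather than the resolved value means that two uploads by the same user share one lookup even when the second is requested before the first has finished.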
3. Join the user data to the upload data
The final step is to take the data from the two previous steps and join it together.
function getUploads() {
return ref.child("uploads").once("value").then((snap) => {
var promises = [];
snap.forEach((uploadSnap) => {
promises.push(
ref.child("users").child(uploadSnap.val().uid).once("value")
);
});
return Promise.all(promises).then((userSnaps) => {
var uploads = [];
var i=0;
snap.forEach((uploadSnap) => {
var upload = uploadSnap.val();
upload.username = userSnaps[i++].val().name;
uploads.push(upload);
});
return uploads;
});
});
}
You'll see we added a then() to the Promise.all() call, which gets invoked after all users have loaded. At that point we have both the users and their uploads, so we can join them together. And since we loaded the users in the same order as the uploads, we can just join them by their index (i). Once you de-duplicate the users this will be a bit trickier.
Now if you call the code with:
getUploads().then((uploads) => console.log(uploads));
It prints:
[{
uid: "uid1",
url: "https://firebase.com",
username: "Purus"
}, {
uid: "uid2",
url: "https://google.com",
username: "Frank"
}]
The array of uploads with the name of the user who created that upload.
The working code for each step is in https://jsbin.com/noyemof/edit?js,console
I did the following based on Frank's answer and it works. I am not sure if this is the best way to deal with a large amount of data.
getUploads() {
  const rootDef = this.db.database.ref();
  const uploadsRef = rootDef.child('userUploads').orderByChild('time');
  const userRef = rootDef.child("userProfile");
  return uploadsRef.once("value").then((uploadSnaps) => {
    const promises = [];
    uploadSnaps.forEach((uploadSnap) => {
      const upload = uploadSnap.val();
      promises.push(
        userRef.child(upload.user).once("value").then((userSnap) => {
          upload.displayName = userSnap.val().displayName;
          upload.avatar = userSnap.val().avatar;
          return upload;
        })
      );
    });
    // Resolve only after every user lookup has finished. Resolving earlier
    // would hand the caller an array that is still empty at that moment.
    return Promise.all(promises);
  });
}
