429 Too Many Requests - Angular 7 - on multiple file upload - javascript

I have a problem when I try to upload more than a few hundred files at the same time.
The API accepts only one file per request, so I have to call the service once per file. Right now I have this:
onFilePaymentSelect(event): void {
  if (event.target.files.length > 0) {
    this.paymentFiles = event.target.files[0];
  }
  let i = 0;
  let save = 0;
  const numFiles = event.target.files.length;
  let processed = 0;
  if (event.target.files.length > 0) {
    while (event.target.files[i]) {
      const file = event.target.files[i];
      const formData = new FormData();
      formData.append('file', file);
      this.payrollsService.sendFilesPaymentName(formData).subscribe(
        (response) => {
          processed++;
          const added = response.status_message === 'File saved';
          if (added) {
            save++;
          }
          this.payList.push({ filename: file.name, message: response.status_message, added });
        });
      i++;
    }
  }
}
So in effect I have a while loop that fires one request per file, but with a high number of files I get "429 Too Many Requests". Is there any way I can improve this?

Working with observables will make this task easier to reason about than imperative code.
A browser usually lets you make 6 requests in parallel and queues the rest. But we don't want the browser to manage that queue for us (and in a Node environment, for example, we wouldn't have it at all).
What we want: to upload a lot of files, queued and uploaded as efficiently as possible by running 5 requests in parallel at all times (keeping 1 connection free for other requests in the app).
In order to demo that, let's build some mocks first:
import { of } from 'rxjs';
import { delay } from 'rxjs/operators';

function randomInteger(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

const mockPayrollsService = {
  sendFilesPaymentName: (file: File) => {
    return of(file).pipe(
      // simulate 500ms to 1.5s of network latency from the server
      delay(randomInteger(500, 1500))
    );
  }
};

// array containing 50 mocked files
const files: File[] = Array.from({ length: 50 })
  .fill(null)
  .map(() => new File([], ""));
I think the code above is self-explanatory: we generate mocks so we can see how the core of the code actually runs without needing access to your real application.
Now, the main part:
import { from } from 'rxjs';
import { map, mergeAll, scan, tap } from 'rxjs/operators';

const NUMBER_OF_PARALLEL_CALLS = 5;

const onFilePaymentSelect = (files: File[]) => {
  const uploadQueue$ = from(files).pipe(
    map(file => mockPayrollsService.sendFilesPaymentName(file)),
    mergeAll(NUMBER_OF_PARALLEL_CALLS)
  );

  uploadQueue$
    .pipe(
      scan(nbUploadedFiles => nbUploadedFiles + 1, 0),
      tap(nbUploadedFiles =>
        console.log(`${nbUploadedFiles}/${files.length} file(s) uploaded`)
      ),
      tap({ complete: () => console.log("All files have been uploaded") })
    )
    .subscribe();
};
onFilePaymentSelect(files);
We use from to emit the files one by one into an observable.
Using map, we prepare the request for one file (but because we don't subscribe to it and the observable is cold, the request is only prepared, not triggered!).
We then use mergeAll to run a pool of calls; since mergeAll takes the concurrency as an argument, we can say "please run a maximum of 5 calls at the same time".
Finally, we use scan for display purposes only (to count the number of files that have been uploaded successfully). A sketch applying this back to the original component follows after the demo link.
Here's a live demo: https://stackblitz.com/edit/rxjs-zuwy33?file=index.ts
Open up the console to see that we're not uploading them all at once.
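Applied to the original component, the same pattern might look like this. This is a sketch only: it assumes the payrollsService, payList, and response shape from the question.

import { from } from 'rxjs';
import { map, mergeAll } from 'rxjs/operators';

// inside the component class:
onFilePaymentSelect(event): void {
  const files: File[] = Array.from(event.target.files);
  from(files)
    .pipe(
      // prepare (but don't trigger) one upload request per file
      map(file => {
        const formData = new FormData();
        formData.append('file', file);
        return this.payrollsService.sendFilesPaymentName(formData).pipe(
          map(response => ({
            filename: file.name,
            message: response.status_message,
            added: response.status_message === 'File saved'
          }))
        );
      }),
      // run at most 5 uploads in flight at any time
      mergeAll(5)
    )
    .subscribe(result => this.payList.push(result));
}

Note that errors aren't handled here: one failed upload would error the whole stream, so in practice each inner observable should get its own catchError.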

Related

Universal Sentence Encoder tensorflowjs optimize performance using webworker

I am using the following code to initiate a Web Worker, which creates embeddings using the Universal Sentence Encoder:
const initEmbeddingWorker = (filePath) => {
  let worker = new Worker(filePath);
  worker.postMessage({ init: 'init' });
  worker.onmessage = (e) => {
    worker.terminate();
  };
};
Web Worker code:
onmessage = function (e) {
  if (e.data.init && e.data.init === 'init') {
    fetchData();
  }
};

const fetchData = () => {
  // fetches data from indexeddb (elided; assumed to define `data`)
  createEmbedding(data, storeEmbedding);
};

// `use` is the @tensorflow-models/universal-sentence-encoder module
const createEmbedding = (data, callback) => {
  use.load().then(model => {
    model.embed(data).then(embeddings => {
      callback(embeddings);
    });
  });
};

const storeEmbedding = (matrix) => {
  let data = matrix.arraySync();
  // store data in indexeddb
};
It takes 3 minutes to create 100 embeddings using 10 Web Workers running simultaneously, each worker creating embeddings for 10 sentences. That is too slow, because I need to create embeddings for more than 1000 sentences, which takes around 25 to 30 minutes.
Whenever this code runs it hogs all the resources, making the machine very slow and almost unusable.
Are there any performance optimizations I am missing?
Using 10 Web Workers assumes that the machine running the code has at least 11 cores. Why that number? One core per Web Worker, plus one for the main thread.
To get the most out of Web Workers, each one should run on a different core. What happens when there are more workers than cores? The program won't be as fast as expected, because a lot of time is spent shuttling communication between cores.
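One way to avoid oversubscribing the cores is to size the worker pool from what the browser reports. A minimal sketch, assuming the filePath from the question; navigator.hardwareConcurrency may be undefined on some browsers:

const cores = navigator.hardwareConcurrency || 4; // fallback when the API is unavailable
const workerCount = Math.max(1, cores - 1);       // leave one core for the main thread
const workers = Array.from({ length: workerCount }, () => new Worker(filePath));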
Now let's look at what happens on each core.
arraySync is a blocking call, preventing that thread from being used for anything else.
Instead of arraySync, array can be used:
const storeEmbedding = async (matrix) => {
  let data = await matrix.array();
  // store data in indexeddb
};
array and its counterpart arraySync are slower compared to data and dataSync. It is better to store the flattened data returned by data:
const storeEmbedding = async (matrix) => {
  let data = await matrix.data();
  // store data in indexeddb
};
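If the 2-D shape of the embeddings is needed later, one option is to store the shape alongside the flattened values and rebuild the tensor when reading back. A sketch only; tf is assumed to be the loaded @tensorflow/tfjs module, and storedValues/storedShape are placeholders for whatever comes back from IndexedDB:

const storeEmbedding = async (matrix) => {
  const values = await matrix.data(); // flat Float32Array
  const shape = matrix.shape;         // e.g. [numSentences, 512]
  // store { values, shape } in indexeddb
};

// later, when reading back:
const restored = tf.tensor(storedValues, storedShape);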

How to make upload faster in angular 7?

I am trying to speed up an upload, so I have tried different solutions on both the back end and the front end:
1) I uploaded a tar file (already compressed).
2) I tried chunked upload (sequential): when one chunk's response is successful, the next API call is triggered, and on the back-end side the content is appended to the same file.
3) I tried chunked upload in parallel: I make 50 requests at a time to upload chunk content (I know the browser only handles 6 requests at a time). On the back end we store all the chunk files separately and, after receiving the final request, append all the chunks into a single file.
But I am not seeing much difference between any of these cases.
Here is my service file:
export class largeGeneUpload {
  // _http, headers, baseApiUrl and apiDataService are initialized elsewhere (elided)
  chromosomeFile: any;
  options: any;
  chunkSize = 1200000;
  activeConnections = 0;
  threadsQuantity = 50;
  totalChunkCount = 0;
  chunksPosition = 0;
  failedChunks = [];

  sendNext() {
    if (this.activeConnections >= this.threadsQuantity) {
      return;
    }
    if (this.chunksPosition === this.totalChunkCount) {
      console.log('all chunks are done');
      return;
    }
    const i = this.chunksPosition;
    const url = 'gene/human';
    const chunkIndex = i;
    const start = chunkIndex * this.chunkSize;
    const end = Math.min(start + this.chunkSize, this.chromosomeFile.size);
    const currentchunkSize = this.chunkSize * i;
    const chunkData = this.chromosomeFile.webkitSlice
      ? this.chromosomeFile.webkitSlice(start, end)
      : this.chromosomeFile.slice(start, end);
    const fd = new FormData();
    const binar = new File([chunkData], this.chromosomeFile.upload.filename);
    console.log(binar);
    fd.append('file', binar);
    fd.append('dzuuid', this.chromosomeFile.upload.uuid);
    fd.append('dzchunkindex', chunkIndex.toString());
    fd.append('dztotalfilesize', this.chromosomeFile.upload.total);
    fd.append('dzchunksize', this.chunkSize.toString());
    fd.append('dztotalchunkcount', this.chromosomeFile.upload.totalChunkCount);
    fd.append('isCancel', 'false');
    fd.append('dzchunkbyteoffset', currentchunkSize.toString());
    this.chunksPosition += 1;
    this.activeConnections += 1;
    this.apiDataService.uploadChunk(url, fd)
      .then(() => {
        this.activeConnections -= 1;
        this.sendNext();
      })
      .catch((error) => {
        this.activeConnections -= 1;
        console.log('error here');
        // chunksQueue.push(chunkId);
      });
    this.sendNext();
  }

  uploadChunk(resrc: string, item) {
    return new Promise((resolve, reject) => {
      this._http.post(this.baseApiUrl + resrc, item, {
        headers: this.headers,
        withCredentials: true
      }).subscribe(r => {
        console.log(r);
        resolve();
      }, err => {
        console.log('err', err);
        reject();
      });
    });
  }
}
But here's the thing: if I upload the same file to Google Drive, it doesn't take nearly as long.
For example, a 700 MB file takes 3 minutes to upload to Google Drive, but the same 700 MB file takes 7 minutes with my Angular code and our back-end server.
How do I improve the performance of the file upload?
Forgive me if this seems a silly answer, but it depends on your hosting infrastructure.
A lot of variables can cause this, but from your description it has nothing to do with your front-end code. Splitting the upload into chunks is not going to help, because browsers have their own optimized algorithms for uploading files. The most likely culprit is your back-end server or the connection from your client to the server.
You say that Google Drive is fast, but you should also know that Google has a very widespread global infrastructure with top-of-the-line cloud servers. If you are using, for example, a 2-euro-per-month shared hosting provider, you cannot expect the same processing and network power as Google.
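To illustrate that last point: before tuning the front end further, it is worth benchmarking against the simplest possible upload. A sketch using Angular's HttpClient progress events; the URL and form field name are placeholders, and _http/baseApiUrl are assumed from the question's service:

// assumes: import { HttpEventType } from '@angular/common/http';
uploadWhole(file: File) {
  const fd = new FormData();
  fd.append('file', file);
  this._http.post(this.baseApiUrl + 'gene/human', fd, {
    reportProgress: true,
    observe: 'events',
    withCredentials: true
  }).subscribe(event => {
    if (event.type === HttpEventType.UploadProgress && event.total) {
      console.log(`uploaded ${Math.round(100 * event.loaded / event.total)}%`);
    }
  });
}

If this plain request takes about as long as the chunked version, the bottleneck is the server or the link, not the client code.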

How to make a certain number of functions run parallel in loop in NodeJs?

I'm looking for a way to run the same function 3 times at once in a loop, wait until they all finish, then run the next 3, and so on. I think it involves a loop and the Promise API, but my solution fails. It would be great if you could tell me what I did wrong and how to fix it.
Here is what I have done so far:
I have a download function (downloadFile), a delay function (runAfter), and a multi-download function (downloadList). They look like this:
const https = require('https')
const fs = require('fs')
const { join } = require('path')
const chalk = require('chalk') // NPM
const mime = require('./MIME') // A small module that reads JSON and turns it into an object. It returns a file extension string.

exports.downloadFile = url => new Promise((resolve, reject) => {
  const req = https.request(url, res => {
    console.log('Accessing:', chalk.greenBright(url))
    console.log(res.statusCode, res.statusMessage)
    // console.log(res.headers)
    const ext = mime(res)
    const name = url
      .replace(/\?.+/i, '')
      .match(/[\ \w\.-]+$/i)[0]
      .substring(0, 250)
      .replace(`.${ext}`, '')
    const file = `${name}.${ext}`
    const stream = fs.createWriteStream(join('_DLs', file))
    res.pipe(stream)
    res.on('error', reject)
    stream
      .on('open', () => console.log(
        chalk.bold.cyan('Download:'),
        file
      ))
      .on('error', reject)
      .on('close', () => {
        console.log(chalk.bold.cyan('Completed:'), file)
        resolve(true)
      })
  })
  req.on('error', reject)
  req.end()
})
exports.runAfter = (ms, url) => new Promise((resolve, reject) => {
  setTimeout(() => {
    this.downloadFile(url)
      .then(resolve)
      .catch(reject)
  }, ms)
})
/* The list param is Array<String> only */
exports.downloadList = async (list, options) => {
  const opt = Object.assign({
    thread: 3,
    delayRange: {
      min: 100,
      max: 1000
    }
  }, options)

  // PROBLEM
  const multiThread = async (pos, run) => {
    const threads = []
    for (let t = pos; t < opt.thread + t; t++) threads.push(run(t))
    return await Promise.all(threads)
  }

  const inQueue = async run => {
    for (let i = 0; i < list.length; i += opt.thread)
      if (opt.thread > 1) await multiThread(i, run)
      else await run(i)
  }

  const delay = range => Math.floor(
    Math.random() * (new Date()).getHours() *
    (range.max - range.min) + range.min
  )

  inQueue(i => this.runAfter(delay(opt.delayRange), list[i]))
}
downloadFile downloads anything from the link given. runAfter delays a random number of milliseconds before executing downloadFile. downloadList receives a list of URLs and passes each one to runAfter for downloading. And downloadList is where the trouble begins.
If I just pass the whole list through a simple loop and download a single file at a time, it's easy, but a large request (like a list of 50 URLs) takes a long time. So I decided to run 3 to 5 downloadFile calls in parallel instead of one, using async/await and Promise.all. However, it crashes. Below is the Node.js report:
<--- Last few GCs --->
[4124:01EF5068] 75085 ms: Scavenge 491.0 (493.7) -> 490.9 (492.5) MB, 39.9 / 0.0 ms (average mu = 0.083, current mu = 0.028) allocation failure
[4124:01EF5068] 75183 ms: Scavenge 491.4 (492.5) -> 491.2 (493.2) MB, 29.8 / 0.0 ms (average mu = 0.083, current mu = 0.028) allocation failure
<--- JS stacktrace --->
==== JS stack trace =========================================
0: ExitFrame [pc: 00B879E7]
Security context: 0x03b40451 <JSObject>
1: multiThread [04151355] [<project folder>\inc\Downloader.js:~62] [pc=03C87FBF](this=0x03cfffe1 <JSGlobal Object>,0,0x041512d9 <JSFunction (sfi = 03E2E865)>)
2: inQueue [041513AD] [<project folder>\inc\Downloader.js:70] [bytecode=03E2EA95 offset=62](this=0x03cfffe1 <JSGlobal Object>,0x041512d9 ...
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Writing Node.js report to file: report.20200428.000236.4124.0.001.json
Node.js report completed
Apparently a sub-function of downloadList (multiThread) is the cause, but I can't read those numbers (they look like memory addresses or something), so I have no idea how to fix it. I'm not a professional engineer, so I would appreciate a good explanation.
Additional information:
Node.js version: 12.13.1
Local machine: Aspire SW3-013, 1.9 GB usable RAM (2 GB in spec), Intel Atom CPU Z3735F
Connecting to the Internet via WiFi (Realtek driver)
OS: Windows 10 (no other choice)
In case you might ask:
Why wrap downloadFile in a Promise? For further applications, e.g. so I can reuse it in another app that only needs one download at a time.
Is runAfter important? Maybe not; it's a little challenge for myself. But it could be useful if a server requires delays between downloads.
Homework or business? Neither, hobby only. I plan to build an app that fetches and downloads images from the Unsplash API. So I'd prefer a good explanation of what I did wrong and how to fix it rather than code that simply works.
Your for loop in multiThread never ends, because its continuation condition is t < opt.thread + t. This is always true whenever opt.thread is non-zero, so you have an infinite loop that keeps pushing promises, and that's the cause of your crash.
I suspect you wanted to do something like this:
const multiThread = async (pos, run) => {
  const threads = [];
  for (let t = 0; t < opt.thread && pos + t < list.length; t++) {
    threads.push(run(pos + t));
  }
  return await Promise.all(threads);
};
The difference is that the loop's continuation condition now limits it to at most opt.thread iterations, and it also stops at the end of the list array.
If the list variable isn't global (i.e. list.length is not available inside multiThread), you can leave out the second part of the condition and instead guard in the run function, so that any values of i past the end of the list are ignored:
inQueue(i => {
  if (i < list.length) this.runAfter(delay(opt.delayRange), list[i])
})
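To see the corrected batching in isolation, here is a minimal, self-contained demo; the setTimeout task is a stand-in for runAfter/downloadFile:

const list = Array.from({ length: 7 }, (_, i) => `https://example.com/file${i}`)
const opt = { thread: 3 }

// fake "download": resolves after half a second
const run = i => new Promise(resolve => {
  console.log('start', list[i])
  setTimeout(() => { console.log('done ', list[i]); resolve(i) }, 500)
})

const multiThread = async (pos, run) => {
  const threads = []
  for (let t = 0; t < opt.thread && pos + t < list.length; t++) threads.push(run(pos + t))
  return Promise.all(threads)
}

const inQueue = async run => {
  for (let i = 0; i < list.length; i += opt.thread) await multiThread(i, run)
  console.log('all batches finished')
}

inQueue(run) // runs batches of 3, 3, then 1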

Downloading multiple files from aws to Node js server

I need to download a large number of files (say 100k, each 0.2 to 1 MB) from AWS S3 to a Node.js server. The code I am using is:
app.get('/api/download-all', function(req, res) {
  res.json({ status: 'download initiated' })
  downloadFromS3(getDocs())
})
The function that downloads the audio files is:
function downloadFromS3(docs) {
  docs.forEach((doc, fileIndex) => {
    var s3FilePath = doc.wav
    var fileName = s3FilePath.split('/').pop()
    var s3Params = { Bucket: 'zzzzz', Key: s3FilePath }
    var file = fs.createWriteStream(dir + '/' + fileName)
    console.log(downloadSession)
    s3.getObject(s3Params)
      .on('httpData', function (chunk) {
        console.log("file writing happening", fileName)
        file.write(chunk)
      })
      .send()
  })
}
Here the download function fires an s3.getObject call for each file without waiting for any of them to finish, so nearly all 100k (in my case) getObject calls are made before a single file has downloaded. Is this the right way, or should I wait for one file to download before invoking the next S3 call? What is the right approach?
2) There is another issue I am facing with this code. Once I make the download API call from the UI, the server gets busy with the download and stops answering other requests from the UI; they all stay pending. Is there any way to do the download in the background? I have seen approaches like forking a child process or using a worker to handle this, but I am not sure which one to use. What is the best way to handle this?
I'd advise an in-between approach. Kicking off 100k downloads in parallel is really not a good idea. But similarly, waiting for each download to fully complete won't utilise your full bandwidth. I'd suggest a solution that "pools" jobs: e.g. you create a pool of promises, each of which downloads one file at a time, and as soon as one finishes, it starts the next.
I've been using a function like this:
Promise.pool = function pool(funcs, inParallel, progressCallback) {
  const promises = [];
  const results = [];

  function getNext() {
    if (funcs.length) {
      return funcs.pop()()
        .catch(() => {})
        .then((res) => {
          results.push(res);
          if (progressCallback) {
            progressCallback(results);
          }
          return getNext();
        });
    }
  }

  for (let i = 0; i < Math.min(inParallel, funcs.length); i++) {
    promises.push(getNext());
  }

  return Promise.all(promises)
    .then(() => results);
};
Then you'd define an array of functions, each of which downloads one file and returns a promise that resolves on completion:
const funcs = docs.map((doc) => {
  return () => {
    return new Promise((resolve) => {
      var s3FilePath = doc.wav
      var fileName = s3FilePath.split('/').pop()
      var s3Params = { Bucket: 'zzzzz', Key: s3FilePath }
      var file = fs.createWriteStream(dir + '/' + fileName)
      console.log(downloadSession)
      s3.getObject(s3Params)
        .on('httpData', function (chunk) {
          console.log("file writing happening", fileName)
          file.write(chunk)
        })
        // the v2 SDK request emits 'complete' (not a stream-style 'end');
        // close the write stream before resolving
        .on('complete', () => {
          file.end()
          resolve()
        })
        .send()
    })
  }
})
Finally, you'd use it like this:
const inParallel = 32;

function callback(partialResults) {
  // console.log progress, whatever you like
}

Promise.pool(funcs, inParallel, callback)
  .then(() => console.log("all done!"));
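As a side note, instead of writing chunks by hand from 'httpData' events, each task could pipe the SDK's read stream into the file; streams handle backpressure automatically. A sketch reusing docs, dir, and s3 from the question (AWS SDK v2's request.createReadStream()):

const funcs = docs.map((doc) => () => new Promise((resolve, reject) => {
  const fileName = doc.wav.split('/').pop();
  const file = fs.createWriteStream(dir + '/' + fileName);
  s3.getObject({ Bucket: 'zzzzz', Key: doc.wav })
    .createReadStream()    // readable stream over the object body
    .on('error', reject)   // e.g. missing key or network failure
    .pipe(file)
    .on('finish', resolve) // write stream fully flushed to disk
    .on('error', reject);
}));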

Upload image taken with camera to firebase storage on React Native

All I want to do is upload a photo taken with react-native-camera to Firebase Storage using react-native-fetch-blob, but no matter what I do it doesn't happen.
I've gone through all of the documentation I can find and nothing seems to work.
If anyone has a working system for accomplishing this, please post it as an answer. I can get the URI of the JPG that react-native-camera returns (it displays in the ImageView and everything), but my upload function seems to stop working when it's time to put the blob.
My current function:
uploadImage = (uri, imageName, mime = 'image/jpg') => {
  return new Promise((resolve, reject) => {
    const uploadUri = Platform.OS === 'ios' ? uri.replace('file://', '') : uri
    let uploadBlob = null
    const imageRef = firebase.storage().ref('selfies').child(imageName)
    console.log("uploadUri", uploadUri)
    fs.readFile(uploadUri, 'base64').then((data) => {
      console.log("MADE DATA")
      var blobEvent = new Blob(data, 'image/jpg;base64')
      var blob = null
      blobEvent.onCreated(genBlob => {
        console.log("CREATED BLOB EVENT")
        blob = genBlob
        firebase.storage().ref('selfies').child(imageName).put(blob).then(function (snapshot) {
          console.log('Uploaded a blob or file!')
          firebase.database().ref("selfies/" + firebase.auth().currentUser.uid).set(0)
          var updates = {}
          updates["/users/" + firebase.auth().currentUser.uid + "/signup/"] = 1
          firebase.database().ref().update(updates)
        })
      }, (error) => {
        console.log('Upload Error: ' + error)
        alert(error)
      }, () => {
        console.log('Completed upload: ' + uploadTask.snapshot.downloadURL)
      })
    })
  }).catch((error) => {
    alert(error)
  })
}
I want to be as efficient as possible, so if it's faster and uses less memory not to convert it to base64, I'd prefer that. Right now I just have no clue how to make this work.
This has been a huge source of stress, and I hope someone has it figured out.
The fastest approach is to use the native Android/iOS SDKs and avoid clogging the JS thread. There are a few libraries out there that provide a React Native module to do just this (they all have a small JS API that communicates over React Native's bridge to the native side, where all the Firebase logic runs).
react-native-firebase is one such library. It follows the Firebase Web SDK's API, so if you know how to use the Web SDK you should be able to use the exact same logic with this module, plus additional Firebase APIs that are only available in the native SDKs.
For example, it includes a storage implementation with a handy putFile function: you give it a path to a file on the device and it uploads it for you using the native Firebase SDKs. No file handling is done on the JS thread, so it is extremely fast.
Example:
// other built-in paths here: https://github.com/invertase/react-native-firebase/blob/master/lib/modules/storage/index.js#L146
const imagePath = firebase.storage.Native.DOCUMENT_DIRECTORY_PATH + '/myface.png';
const ref = firebase.storage().ref('selfies').child('/myface.png');
const uploadTask = ref.putFile(imagePath);

// the .on observer is completely optional (you can use .then/.catch instead), but it allows you to
// do things like a progress bar, for example
uploadTask.on(firebase.storage.TaskEvent.STATE_CHANGED, (snapshot) => {
  // observe state change events such as progress
  // get task progress, including the number of bytes uploaded and the total number of bytes to be uploaded
  const progress = (snapshot.bytesTransferred / snapshot.totalBytes) * 100;
  console.log(`Upload is ${progress}% done`);
  switch (snapshot.state) {
    case firebase.storage.TaskState.SUCCESS: // or 'success'
      console.log('Upload is complete');
      break;
    case firebase.storage.TaskState.RUNNING: // or 'running'
      console.log('Upload is running');
      break;
    default:
      console.log(snapshot.state);
  }
}, (error) => {
  console.error(error);
}, () => {
  const uploadTaskSnapshot = uploadTask.snapshot;
  // task finished
  // states: https://github.com/invertase/react-native-firebase/blob/master/lib/modules/storage/index.js#L139
  console.log(uploadTaskSnapshot.state === firebase.storage.TaskState.SUCCESS);
  console.log(uploadTaskSnapshot.bytesTransferred === uploadTaskSnapshot.totalBytes);
  console.log(uploadTaskSnapshot.metadata);
  console.log(uploadTaskSnapshot.downloadUrl);
});
Disclaimer: I am the author of react-native-firebase.
