JS Worker Performance - Parsing JSON

I'm experimenting with Workers because my user interface is very slow due to big tasks running in the background.
I'm starting with the simplest tasks, such as parsing JSON. See below for my very simple code to create an async function running on a Worker.
Performance-wise there is a big difference between:
JSON.parse(jsonStr);
and
await parseJsonAsync(jsonStr);
JSON.parse() takes 1ms whereas parseJsonAsync takes 102ms!
So my question is: are the overheads really that big for running worker threads, or am I missing something?
const worker = new Worker(new URL('../workers/parseJson.js', import.meta.url));

export async function parseJsonAsync(jsonStr) {
  return new Promise((resolve, reject) => {
    worker.onmessage = ({ data: { jsonObject } }) => {
      resolve(jsonObject);
    };
    worker.postMessage({ jsonStr: jsonStr });
  });
}
parseJson.js
self.onmessage = ({ data: { jsonStr } }) => {
  let jsonObject = null;
  try {
    jsonObject = JSON.parse(jsonStr);
  } catch (ex) {
    // ignore parse errors; jsonObject stays null
  } finally {
    self.postMessage({ jsonObject: jsonObject });
  }
};
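For context, a minimal sketch of how the two timings might have been compared (the question doesn't show its measurement code, so this harness is an assumption):

// Hypothetical timing harness for the comparison described above.
// Must run inside an async function or a module for top-level await.
const jsonStr = JSON.stringify({ foo: 'bar' });

const t0 = performance.now();
JSON.parse(jsonStr);
console.log('sync parse:', performance.now() - t0);   // ~1ms reported

const t1 = performance.now();
await parseJsonAsync(jsonStr);
console.log('worker parse:', performance.now() - t1); // ~102ms reported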

I can now confirm that the overhead of transferring messages between threads is pretty big. But the raw performance of the worker (at least when executing JSON.parse) is close to the main thread.
TL;DR: Just compare the numbers in the 2 tables below. Without sending a big object back via postMessage, worker performance is just fine.
For the test payload jsonStr, I create a string of a long list of [{"foo":"bar"}, ...] repeated n times. The number of items in jsonStr can be tuned by changing Array.from({ length: number }).
I then call postMessage(jsonStr) to run JSON.parse in the worker; when it is done parsing, it sends back the parsed jsonObject. In the main thread I just call JSON.parse(jsonStr) directly.
runTest(delay) uses setTimeout to wait for the worker to start up before running the actual test. runTest() without a delay runs immediately, so we can measure worker startup time.
Code for the test.
const blobURL = URL.createObjectURL(
  new Blob(
    [
      "(",
      function () {
        self.onmessage = ({ data: jsonStr }) => {
          let jsonObject = null;
          try {
            jsonObject = JSON.parse(jsonStr);
            self.postMessage(["done", jsonObject]);
          } catch (e) {
            self.postMessage(["error", e]);
          }
        };
      }.toString(),
      ")()",
    ],
    { type: "application/javascript" }
  )
);
const worker = new Worker(blobURL);
const jsonStr = "[" + Array.from({ length: 1_000 }, () => `{"foo":"bar"}`).join(",") + "]";

function test(payload) {
  worker.onmessage = ({ data }) => {
    const delta = performance.now() - t0;
    console.log("worker", delta);
    console.log("worker response", data[0]);
  };
  const t0 = performance.now();
  worker.postMessage(payload);
  testParseJsonInMain(payload);
}

function testParseJsonInMain(payload) {
  let obj;
  try {
    const t0 = performance.now();
    obj = JSON.parse(payload);
    const delta = performance.now() - t0;
    console.log("main", delta);
  } catch {}
}

function runTest(delay) {
  if (delay) {
    setTimeout(() => test(jsonStr), delay);
  } else {
    test(jsonStr);
  }
}
runTest(1000);
I observe that it takes around 30ms to start the worker on my machine. If the test runs after worker startup, I get these numbers:
#items in payload | main (ms) | worker (ms)
------------------|-----------|------------
1,000             | 0.2       | 2.1
10,000            | 1.3       | 9.8
100,000           | 15.4      | 73.5
1,000,000         | 165       | 854
10,000,000        | 2633      | 15312
When the payload reaches 10 million items, the worker really struggles (takes 15 seconds). At 10 million items, jsonStr is around 140MB: each {"foo":"bar"} plus its separating comma is 14 characters, so 10 million items come to roughly 140 million bytes.
But if the worker does not send back the parsed jsonObject, the numbers are much better. Just make a small change to the test code above:
// worker code changed from:
self.postMessage(["done", jsonObject]);
// to:
self.postMessage(["done", typeof jsonObject]);
#items in payload | main (ms) | worker (ms)
------------------|-----------|------------
1,000             | 0.2       | 1.2
10,000            | 2.1       | 3.5
100,000           | 15.7      | 26.2
1,000,000         | 196       | 232
10,000,000        | 2249      | 2801
P.S. I've actually done another test. Instead of postMessage(jsonStr), I used TextEncoder to turn the string into an ArrayBuffer, then called postMessage(arrayBuffer, [arrayBuffer]), which transfers the underlying memory from the main thread directly to the worker.
I did not see a real difference in time consumed; in fact it got a little bit slower. I guess sending a large string isn't the issue.
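For reference, a minimal sketch of that transfer variant, assuming the same worker and jsonStr as in the test above (the worker would need a matching TextDecoder step):

// Sketch: encode the string, then transfer the underlying buffer.
const encoded = new TextEncoder().encode(jsonStr); // Uint8Array
const buffer = encoded.buffer;
// The second argument is the transfer list: the buffer is moved, not copied,
// and becomes unusable (detached) in the sending thread afterwards.
worker.postMessage(buffer, [buffer]);

// Inside the worker, decode back to a string before parsing:
// self.onmessage = ({ data }) => {
//   const jsonObject = JSON.parse(new TextDecoder().decode(data));
// };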

Related

Delayed read performance when using navigator.serial for serial communication

I've been trying out the Web Serial API in Chrome (https://web.dev/serial/) to do some basic communication with an Arduino board. I've noticed quite a substantial delay when reading data from the serial port, however. This same issue is present in some demos, but not all.
For instance, the WebSerial demo linked towards the bottom has a near-instantaneous read, while using the Serial Terminal example results in a read delay (note that the write is triggered the moment a character is entered on the keyboard).
WebSerial being open source allows me to check for differences against my own implementation; however, I am seeing performance much like the second example.
As for the relevant code:
this.port = await navigator.serial.requestPort({ filters });
await this.port.open({ baudRate: 115200, bufferSize: 255, dataBits: 8, flowControl: 'none', parity: 'none', stopBits: 1 });
this.open = true;
this.monitor();

private monitor = async () => {
  const dataEndFlag = new Uint8Array([4, 3]);
  while (this.open && this.port?.readable) {
    this.open = true;
    const reader = this.port.readable.getReader();
    try {
      let data: Uint8Array = new Uint8Array([]);
      while (this.open) {
        const { value, done } = await reader.read();
        if (done) {
          this.open = false;
          break;
        }
        if (value) {
          data = Uint8Array.of(...data, ...value);
        }
        if (data.slice(-2).every((val, idx) => val === dataEndFlag[idx])) {
          const decoded = this.decoder.decode(data);
          this.messages.push(decoded);
          data = new Uint8Array([]);
        }
      }
    } catch {
    }
  }
}

public write = async (data: string) => {
  if (this.port?.writable) {
    const writer = this.port.writable.getWriter();
    await writer.write(this.encoder.encode(data));
    writer.releaseLock();
  }
}
The equivalent WebSerial code can be found here; this is pretty much an exact replica. From what I can observe, it seems to hang at await reader.read(); for a brief period of time.
This is occurring both on a Windows 10 device and a macOS Monterey device. The specific hardware device is an Arduino Pro Micro connected to a USB port.
Has anyone experienced this same scenario?
Update: I did some additional testing with more verbose logging. It seems that the time between the write and read is exactly 1 second every time.
The delay may result from SerialEvent() in your Arduino sketch: set Serial.setTimeout(1);
This means 1 millisecond instead of the default 1000 milliseconds. Serial reads block until the timeout expires when no terminator is seen, which matches the exactly-1-second delay you measured.

Universal Sentence Encoder tensorflowjs optimize performance using webworker

I am using the following code to initialize a Web Worker which creates embeddings using the Universal Sentence Encoder:
const initEmbeddingWorker = (filePath) => {
  let worker = new Worker(filePath);
  worker.postMessage({ init: 'init' });
  worker.onmessage = (e) => {
    worker.terminate();
  };
};
Webworker code
onmessage = function (e) {
  if (e.data.init && e.data.init === 'init') {
    fetchData();
  }
};

const fetchData = () => {
  // fetches data from indexeddb
  createEmbedding(data, storeEmbedding);
};

const createEmbedding = (data, callback) => {
  use.load().then(model => {
    model.embed(data).then(embeddings => {
      callback(embeddings);
    });
  });
};

const storeEmbedding = (matrix) => {
  let data = matrix.arraySync();
  // store data in indexeddb
};
It takes 3 minutes to create 100 embeddings using 10 Web Workers running simultaneously, each worker creating embeddings for 10 sentences. That is too long, as I need to create embeddings for more than 1000 sentences, which takes around 25 to 30 minutes.
Whenever this code runs, it hogs all the resources, which makes the machine very slow and almost unusable.
Are there any performance optimizations that I am missing?
Using 10 web workers assumes that the machine running the code has at least 11 cores (number of web workers + the main thread).
To get the most out of web workers, each one should run on a different core. What happens when there are more workers than cores? The program won't be as fast as expected, because a lot of time is spent exchanging communication between the cores.
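As a minimal sketch (my addition, not the original answer's code), the pool size can be derived from navigator.hardwareConcurrency instead of being hard-coded:

// Sketch: size the worker pool to the machine, leaving one core
// for the main thread. The fallback of 4 is an arbitrary assumption
// for browsers that don't expose hardwareConcurrency.
// filePath is the worker script path from the question's code.
const cores = navigator.hardwareConcurrency || 4;
const poolSize = Math.max(1, cores - 1);
const workers = Array.from({ length: poolSize }, () => new Worker(filePath));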
Now let's look at what happens on each core.
arraySync is a blocking call, preventing that thread from being used for anything else.
Instead of using arraySync, array can be used.
const storeEmbedding = async (matrix) => {
  let data = await matrix.array();
  // store data in indexeddb
};
array and its counterpart arraySync are slower compared to data and dataSync. It is better to store the flattened data returned by data.
const storeEmbedding = async (matrix) => {
  let data = await matrix.data();
  // store data in indexeddb
};

How to make a certain number of functions run parallel in loop in NodeJs?

I'm looking for a way to run 3 instances of the same function at once in a loop, wait until they finish, and then continue with the next 3. I think it involves a loop and the Promise API, but my solution fails. It would be great if you could tell me what I did wrong and how to fix it.
Here is what I have done so far:
I have a download function (called downloadFile), an on-hold function (called runAfter) and a multi-download function (called downloadList). They look like this:
const https = require('https')
const fs = require('fs')
const { join } = require('path')
const chalk = require('chalk') // NPM
const mime = require('./MIME') // A small module that reads JSON and turns it into an object. It returns a file extension string.

exports.downloadFile = url => new Promise((resolve, reject) => {
  const req = https.request(url, res => {
    console.log('Accessing:', chalk.greenBright(url))
    console.log(res.statusCode, res.statusMessage)
    // console.log(res.headers)
    const ext = mime(res)
    const name = url
      .replace(/\?.+/i, '')
      .match(/[\ \w\.-]+$/i)[0]
      .substring(0, 250)
      .replace(`.${ext}`, '')
    const file = `${name}.${ext}`
    const stream = fs.createWriteStream(join('_DLs', file))
    res.pipe(stream)
    res.on('error', reject)
    stream
      .on('open', () => console.log(
        chalk.bold.cyan('Download:'),
        file
      ))
      .on('error', reject)
      .on('close', () => {
        console.log(chalk.bold.cyan('Completed:'), file)
        resolve(true)
      })
  })
  req.on('error', reject)
  req.end()
})

exports.runAfter = (ms, url) => new Promise((resolve, reject) => {
  setTimeout(() => {
    this.downloadFile(url)
      .then(resolve)
      .catch(reject)
  }, ms);
})

/* The list param is Array<String> only */
exports.downloadList = async (list, options) => {
  const opt = Object.assign({
    thread: 3,
    delayRange: {
      min: 100,
      max: 1000
    }
  }, options)
  // PROBLEM
  const multiThread = async (pos, run) => {
    const threads = []
    for (let t = pos; t < opt.thread + t; t++) threads.push(run(t))
    return await Promise.all(threads)
  }
  const inQueue = async run => {
    for (let i = 0; i < list.length; i += opt.thread)
      if (opt.thread > 1) await multiThread(i, run)
      else await run(i)
  }
  const delay = range => Math.floor(
    Math.random() * (new Date()).getHours() *
    (range.max - range.min) + range.min
  )
  inQueue(i => this.runAfter(delay(opt.delayRange), list[i]))
}
downloadFile downloads anything from the link given. runAfter delays a random number of ms before executing downloadFile. downloadList receives a list of URLs and passes each one to runAfter to download. And that (downloadList) is where the trouble begins.
If I just pass the whole list through a simple loop and download a single file at a time, it's easy. But for a large request, like a list of 50 URLs, it would take a long time. So I decided to run 3 - 5 downloadFile calls in parallel instead of one, and I was thinking about using async/await and Promise.all to solve the problem. However, it crashes. Below is the Node.js report:
<--- Last few GCs --->
[4124:01EF5068] 75085 ms: Scavenge 491.0 (493.7) -> 490.9 (492.5) MB, 39.9 / 0.0 ms (average mu = 0.083, current mu = 0.028) allocation failure
[4124:01EF5068] 75183 ms: Scavenge 491.4 (492.5) -> 491.2 (493.2) MB, 29.8 / 0.0 ms (average mu = 0.083, current mu = 0.028) allocation failure
<--- JS stacktrace --->
==== JS stack trace =========================================
0: ExitFrame [pc: 00B879E7]
Security context: 0x03b40451 <JSObject>
1: multiThread [04151355] [<project folder>\inc\Downloader.js:~62] [pc=03C87FBF](this=0x03cfffe1 <JSGlobal Object>,0,0x041512d9 <JSFunction (sfi = 03E2E865)>)
2: inQueue [041513AD] [<project folder>\inc\Downloader.js:70] [bytecode=03E2EA95 offset=62](this=0x03cfffe1 <JSGlobal Object>,0x041512d9 ...
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Writing Node.js report to file: report.20200428.000236.4124.0.001.json
Node.js report completed
Apparently, a sub-function of downloadList (multiThread) is the cause, but I couldn't read those numbers (they look like physical RAM addresses or something), so I have no idea how to fix it. I'm not a professional engineer, so I would appreciate a good explanation.
Additional information:
Node.js version: 12.13.1
Local machine: Aspire SW3-013 > 1.9GB (2GB in spec) / Intel Atom CPU Z3735F
Connecting to the Internet via WiFi (Realtek driver)
OS: Windows 10 (no other choice)
In case you might ask:
Why wrap downloadFile in a Promise? For further use, e.g. so I can drop it into another app that only requires one download at a time.
Is runAfter important? Maybe not, it's just a little challenge for myself. But it could be useful if servers require a delay between downloads.
Homework or business? Neither, hobby only. I plan to build an app to fetch and download images from the Unsplash API. So I prefer a good explanation of what I did wrong and how to fix it rather than code that simply works.
Your for-loop in multiThread never ends because your continuation condition is t < opt.thread + t. This is always true whenever opt.thread is not zero, so you have an infinite loop here, and that's the cause of your crash.
I suspect you wanted to do something like this:
const multiThread = async (pos, run) => {
  const threads = [];
  for (let t = 0; t < opt.thread && pos + t < list.length; t++) {
    threads.push(run(pos + t));
  }
  return await Promise.all(threads);
};
The difference here is that the continuation condition limits the loop to a maximum of opt.thread iterations and also stops before going past the end of the list array. For example, with list.length = 7 and opt.thread = 3, the batches are indices (0, 1, 2), (3, 4, 5), and (6).
If the list variable isn't global (i.e., list.length is not available in the multiThread function), then you can leave out the second part of the condition and just handle it in the run function like this, so that any values of i past the end of the list are ignored:
inQueue(i => {
  if (i < list.length) this.runAfter(delay(opt.delayRange), list[i])
})
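For completeness, a hypothetical call to the fixed downloadList (the module path is assumed from the stack trace; the URLs are placeholders):

// Sketch: download 7 files, at most 3 at a time.
const { downloadList } = require('./inc/Downloader');
const urls = Array.from({ length: 7 }, (_, i) => `https://example.com/file${i}.jpg`);
downloadList(urls, { thread: 3 });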

Faster web worker messaging

I've tested web worker messaging, in Chrome specifically, and I'm getting results of about ~50ms latency to send and receive a message:
// Sender section
let imageData = mCtx.getImageData(0, 0, w, h);
let bitmapData = await createImageBitmap(imageData);
beforeAddBitmapFrame = performance.now();
videoWorker.postMessage({ action : 'addFrameBitmap', data: bitmapData }, [bitmapData]);

// Receiver section
videoWorker.onmessage = function (e) {
  let blob = e.data.data;
  beforeRenderBlobFrame = performance.now();
  let latency = (beforeRenderBlobFrame - beforeAddBitmapFrame); // 50ms
  if (latency > 10) {
    console.log('=== Max Latency Hit ===');
  }
  renderBlobTest(blob);
};
This is basically a loop test where an image is sent to the web worker and the web worker just sends it back, to measure the latency. 50 ms here might seem like nothing at first glance, but multiply it out for a video at 30 FPS: 50 ms x 30 frames = 1500 ms of accumulated latency (1.5 seconds). That's a lot considering this is not a network transfer.
What can be done to lower the latency of Web worker messaging?
[UPDATE]
To further test, I did a simple "ping" test to the web worker at a given interval:
setInterval(function () {
  let pingTime = new Date().getMilliseconds();
  videoWorker.postMessage({ action: 'ping', pingTime : pingTime });
}, 500);
Then, when the message comes back:
if (e.data.pingTime) {
  let pongTime = new Date().getMilliseconds();
  console.log('Got pong: ' + (pongTime - e.data.pingTime));
}
Similar result to the above: it averages ~50ms.
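Incidentally, Date#getMilliseconds returns only the milliseconds component (0-999) of the current second, so the ping/pong difference can wrap around a second boundary; performance.now() is monotonic and avoids that. A sketch of the same ping rewritten:

// Sketch: monotonic ping measurement, immune to getMilliseconds() wraparound.
setInterval(() => {
  videoWorker.postMessage({ action: 'ping', pingTime: performance.now() });
}, 500);

// and in the main thread's onmessage handler:
// if (e.data.pingTime) console.log('Got pong:', performance.now() - e.data.pingTime);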
You fell into one of the micro-benchmarking traps:
never run only a single instance of the test.
The first run will always be slower: the engine has to warm up, and in your case the whole Worker thread has to be spawned and a lot of other things have to be initialized (see this Q/A for a list of things delaying the first message).
Also, a single test is prone to reporting completely false results because of some external and unrelated event (a background app deciding to perform some operation just at that moment, the Garbage Collector kicking in, a UI event, anything...).
const videoWorker = new Worker( generateWorkerURL() );
let startTime;
const latencies = [];
const max_rounds = 10;

// Receiver section
videoWorker.onmessage = function (e) {
  const endTime = performance.now();
  e.data.close();
  const latency = (endTime - startTime);
  // store the current latency
  latencies.push( latency );
  if( latencies.length < max_rounds ) {
    performTest();
  }
  else {
    logResults();
  }
};

// initial call
performTest();

// the actual test code
async function performTest() {
  // we'll build a new random image every test
  const w = 1920;
  const h = 1080;
  // make some noise
  const data = Uint32Array.from( { length: w * h }, () => Math.random() * 0xFFFFFF + 0xFF000000 );
  const imageData = new ImageData( new Uint8ClampedArray( data.buffer ), w, h );
  let bitmapData = await createImageBitmap( imageData );
  // start measuring the time it takes to transfer
  startTime = performance.now();
  videoWorker.postMessage( bitmapData, [ bitmapData ] );
}

// when all the tests are done
function logResults() {
  const total = latencies.reduce( (total, lat) => total + lat );
  const avg = total / latencies.length;
  console.log( "average latency (ms)", avg );
  console.log( "first ten absolute values", latencies.slice( 0, 10 ) );
}

function generateWorkerURL() {
  const content = `onmessage = e => postMessage( e.data, [e.data] );`;
  const blob = new Blob( [ content ], { type: 'text/javascript' } );
  return URL.createObjectURL( blob );
}
Running 1000 tests leads to an average of <1.2ms per test on my machine (and 0.12ms when not generating a new ImageData every test, i.e. without GC), while the first run takes about 11ms.
These results imply that transferring the data takes virtually no time (it's almost as fast as just waiting for the next event loop).
So your bottleneck is in another castle, and there is nothing to speed up in the messaging part.
Remember that if your main thread is blocked, so are the handlers that fire from that main thread, as the sketch below illustrates.
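To illustrate that last point with a small sketch (my addition, not part of the original answer):

// Sketch: a busy main thread inflates the measured round-trip, because
// the onmessage handler can only run once the main thread is idle again.
videoWorker.postMessage('ping');
const until = performance.now() + 40;
while (performance.now() < until) { /* block the main thread for ~40ms */ }
// Even if the worker replies immediately, onmessage fires only after this
// loop ends, adding ~40ms to the apparent latency.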

429 Too Many Requests - Angular 7 - on multiple file upload

I have this problem when I try to upload more than a few hundred files at the same time.
The API interface accepts one file per request, so I have to call the service once for each file. Right now I have this:
onFilePaymentSelect(event): void {
  if (event.target.files.length > 0) {
    this.paymentFiles = event.target.files[0];
  }
  let i = 0;
  let save = 0;
  const numFiles = event.target.files.length;
  let procesed = 0;
  if (event.target.files.length > 0) {
    while (event.target.files[i]) {
      const formData = new FormData();
      formData.append('file', event.target.files[i]);
      this.payrollsService.sendFilesPaymentName(formData).subscribe(
        (response) => {
          let added = null;
          procesed++;
          if (response.status_message === 'File saved') {
            added = true;
            save++;
          } else {
            added = false;
          }
          this.payList.push({ filename, message, added });
        });
      i++;
    }
  }
}
So really I have a while loop sending each file to the API, but with a high number of files I get "429 Too Many Requests". Is there any way I can improve this?
Working with observables will make that task easier to reason about (rather than using imperative programming).
A browser usually allows you to make 6 requests in parallel and will queue the others. But we don't want the browser to manage that queue for us (and if we were running in a Node environment, for example, we wouldn't have it at all).
What we want: to upload a lot of files, queued and uploaded as efficiently as possible, by running 5 requests in parallel at all times (keeping 1 free for other requests in our app).
In order to demo that, let's build some mocks first:
// imports needed for the snippets below (RxJS 6 style)
import { from, of } from "rxjs";
import { delay, map, mergeAll, scan, tap } from "rxjs/operators";

function randomInteger(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

const mockPayrollsService = {
  sendFilesPaymentName: (file: File) => {
    return of(file).pipe(
      // simulate a 500ms to 1.5s network latency from the server
      delay(randomInteger(500, 1500))
    );
  }
};

// array containing 50 files which are mocked
const files: File[] = Array.from({ length: 50 })
  .fill(null)
  .map(() => new File([], ""));
I think the code above is self-explanatory. We are generating mocks so we can see how the core of the code will actually run, without having access to your real application.
Now, the main part:
const NUMBER_OF_PARALLEL_CALLS = 5;

const onFilePaymentSelect = (files: File[]) => {
  const uploadQueue$ = from(files).pipe(
    map(file => mockPayrollsService.sendFilesPaymentName(file)),
    mergeAll(NUMBER_OF_PARALLEL_CALLS)
  );
  uploadQueue$
    .pipe(
      scan(nbUploadedFiles => nbUploadedFiles + 1, 0),
      tap(nbUploadedFiles =>
        console.log(`${nbUploadedFiles}/${files.length} file(s) uploaded`)
      ),
      tap({ complete: () => console.log("All files have been uploaded") })
    )
    .subscribe();
};

onFilePaymentSelect(files);
We use from to emit the files one by one into an observable.
Using map, we prepare our request for 1 file (but as we don't subscribe to it and the observable is cold, the request is only prepared, not triggered!).
We then use mergeAll to run a pool of calls. Because mergeAll takes the concurrency as an argument, we can say "please run a maximum of 5 calls at the same time" (see the equivalent mergeMap form sketched after this list).
Finally, we use scan for display purposes only (to count the number of files that have been uploaded successfully).
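As a side note (my addition), the map + mergeAll pair can be collapsed into a single mergeMap, which takes the concurrency limit as its second argument (mergeMap also comes from rxjs/operators):

// Equivalent pipeline: mergeMap(project, concurrency) behaves like
// map(project) followed by mergeAll(concurrency).
const uploadQueue$ = from(files).pipe(
  mergeMap(
    file => mockPayrollsService.sendFilesPaymentName(file),
    NUMBER_OF_PARALLEL_CALLS
  )
);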
Here's a live demo: https://stackblitz.com/edit/rxjs-zuwy33?file=index.ts
Open up the console to see that we're not uploading them all at once.
