I use the Fetch API to retrieve my data from the backend as a stream.
I decrypt the data chunk by chunk and then concatenate the content back together to reconstruct the original file.
I have found that the stream seems to provide the data differently each time, making the chunks different. How can I force the stream to deliver the chunks in the original sequence?
fetch(myRequest, myInit).then(response => {
  var tmpResult = new Uint8Array();
  const reader = response.body.getReader();
  return new ReadableStream({
    start(controller) {
      return pump();
      function pump() {
        return reader.read().then(({ done, value }) => {
          if (value) {
            // values here are different in order every time
            // making my concatenated values different every time
            // Enqueue the next data chunk into our target stream
            controller.enqueue(value);
            var decrypted = cryptor.decrypt(value);
            var arrayResponse = decrypted.toArrayBuffer();
            if (arrayResponse) {
              tmpResult = arrayBufferConcat(tmpResult, arrayResponse);
            }
          }
          // When no more data needs to be consumed, close the stream
          if (done) {
            if (counter == length) {
              callback(obj);
            }
            controller.close();
            return;
          }
          return pump();
        });
      }
    }
  })
})
The documentation tells us that:
Each chunk is read sequentially and output to the UI, until the stream
has finished being read, at which point we return out of the recursive
function and print the entire stream to another part of the UI.
I made a test program with node, using node-fetch:
import fetch from 'node-fetch';

const testStreamChunkOrder = async () => {
  return new Promise(async (resolve) => {
    let response = await fetch('https://jsonplaceholder.typicode.com/todos/');
    let stream = response.body;
    let data = '';

    stream.on('readable', () => {
      let chunk;
      while (null !== (chunk = stream.read())) {
        data += chunk;
      }
    })

    stream.on('end', () => {
      resolve(JSON.parse(data).splice(0, 5).map((x) => x.title));
    })
  });
};

(async () => {
  // note: each of the 10 slots needs its own call; Array.fill with a single
  // testStreamChunkOrder() would reuse one promise and fetch only once
  let results = await Promise.all(Array.from({ length: 10 }, () => testStreamChunkOrder()))
  let joined = results.map((r) => r.join(''));
  console.log(`Is every result same: ${joined.every((j) => j.localeCompare(joined[0]) === 0)}`)
})()
This one fetches some random todo-list json and streams it chunk by chunk, accumulating the chunks into data. When the stream is done, we parse the full json, take the first 5 elements of the todo-list, keep only the titles, and return the result asynchronously.
This whole process is done 10 times. When we have 10 streamed title-lists, we go through each title-list and join the title names together to form a string. Finally we use .every to see if each of the 10 strings is the same, which means that each json was fetched and streamed in the same order.
So I believe the problem lies somewhere else - the streaming itself is working correctly. While I did use node-fetch instead of the actual Fetch API, I think it is safe to say that the actual Fetch API works as it should.
Also I noticed that you are directly calling response.body.getReader(), but when I looked at the documentation, the body.getReader call is done inside another then statement:
fetch('./tortoise.png')
  .then(response => response.body)
  .then(body => {
    const reader = body.getReader();
This might not matter, but considering everything else in your code, such as the excessive wrapping and returning of functions, I think your problems could go away just by reading a couple of tutorials on streams and cleaning up the code a bit. And if not, you will still be in a better position to figure out whether the problem is in one of the many functions you have not shown. Asynchronous code is inherently difficult to debug, and the lack of information around such code makes it even harder.
I'm assuming you're using the cipher/decipher family of methods in node's crypto library. We can simplify this using streams by first piping the ReadableStream into a decipher TransformStream (a stream that is both readable and writable) via ReadableStream#pipe().
const { createDecipheriv } = require('crypto');
const { createWriteStream } = require('fs');

// change these to match your encryption scheme and key retrieval
const algo = 'aes-256-cbc';
const key = 'my5up3r53cr3t';

// put the initialization vector you've determined here
// leave null if you are not using an iv (or the algo doesn't support one)
const iv = null;

// creates the decipher TransformStream
const decipher = createDecipheriv(algo, key, iv);

// write the plaintext file here
const destFile = createWriteStream('/path/to/destination.ext');

fetch(myRequest, myInit)
  .then(response => response.body)
  .then(body => body.pipe(decipher).pipe(destFile))
  .then(stream => stream.on('finish', () => console.log('done writing file')));
You may also pipe this into a buffer to read it out, pipe it to the browser, etc.; just be sure to match your algorithm, key, and iv wherever you define your cipher/decipher functions.
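For example, here is a minimal sketch of the buffer variant, under the same assumptions as above (node-fetch style response.body; myKey and myIv are placeholders for however you obtain your key and iv):
const { createDecipheriv } = require('crypto');

const algo = 'aes-256-cbc';
const key = myKey; // e.g. a 32-byte Buffer for aes-256-cbc
const iv = myIv;   // or null if your scheme has no iv

fetch(myRequest, myInit)
  .then(response => response.body)
  .then(body => new Promise((resolve, reject) => {
    const decipher = createDecipheriv(algo, key, iv);
    const chunks = [];
    body.pipe(decipher)
      .on('data', chunk => chunks.push(chunk))         // decrypted chunks arrive in order
      .on('end', () => resolve(Buffer.concat(chunks))) // one contiguous Buffer
      .on('error', reject);
  }))
  .then(plaintext => {
    // plaintext is the decrypted file as a single Buffer
  });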
If we take the pattern in that MDN example seriously, we should use the controller to enqueue the decrypted data (not the still encrypted value), and aggregate the results with the stream returned by the first promise. In other words...
return fetch(myRequest, myInit)
  // Retrieve its body as ReadableStream
  .then(response => {
    const reader = response.body.getReader();
    return new ReadableStream({
      start(controller) {
        return pump();
        function pump() {
          return reader.read().then(({ done, value }) => {
            // When no more data needs to be consumed, close the stream
            if (done) {
              controller.close();
              return;
            }
            // do the computational work on each chunk here and enqueue
            // *the result of that work* on the controller stream...
            const decrypted = cryptor.decrypt(value);
            controller.enqueue(decrypted);
            return pump();
          });
        }
      }
    })
  })
  // Create a new response out of the stream
  .then(stream => new Response(stream))
  // Create a blob from the response
  .then(response => response.blob())
  // Blob#arrayBuffer() resolves with the underlying bytes
  .then(blob => blob.arrayBuffer())
  .then(arrayResponse => {
    // arrayResponse is the properly sequenced result
    // if the caller wants a promise to resolve to this, just return it
    return arrayResponse;
    // OR... the OP code makes reference to a callback. if that's real,
    // call the callback with this result
    // callback(arrayResponse);
  })
  .catch(err => console.error(err));
Related
How do I write to a Node PassThrough stream, then later read that data? When I try, the code hangs as though no data is sent. Here's a minimal example (in TypeScript):
const stream = new PassThrough();
stream.write('Test chunk.');
stream.end();

// Later
const chunks: Buffer[] = [];
const output = await new Promise<Buffer>((resolve, reject) => {
  stream.on('data', (chunk) => {
    chunks.push(Buffer.from(chunk));
  });
  stream.on('error', (err) => reject(err));
  stream.on('end', () => {
    resolve(Buffer.concat(chunks));
  });
});
Please note that I can't attach the event listeners before writing to the stream: I don't know at the time of writing how I'm going to be reading from it. My understanding of a Transform stream like PassThrough was that it "decoupled" the Readable from the Writable, so that you could access them asynchronously.
Your code works for me: the promise resolves to a buffer containing "Test chunk.".
It will fail, however, if the readable side of the stream has already started emitting data when the stream.on('data', (chunk) => {...}) is executed. I could force such a behavior by enclosing the // Later part of your code in a setTimeout and inserting an additional
stream.on("data", () => {});
before that. This command will cause the stream to start emitting. Could that have happened in your case?
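For reference, here is a minimal sketch of that reproduction in plain JavaScript (types omitted); the only changes to your code are the extra no-op 'data' listener and the setTimeout:
const { PassThrough } = require('stream');

const stream = new PassThrough();
stream.write('Test chunk.');
stream.end();

// This extra no-op listener switches the stream into flowing mode, so the chunk
// and the 'end' event are emitted before the real listeners below are attached.
stream.on('data', () => {});

setTimeout(async () => {
  const chunks = [];
  const output = await new Promise((resolve, reject) => {
    stream.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
    stream.on('error', reject);
    stream.on('end', () => resolve(Buffer.concat(chunks)));
  });
  // never reached: 'end' has already fired, so the promise hangs,
  // reproducing the behaviour described in the question
  console.log(output);
}, 0);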
To be on the safe side, end the "early" part of your code with stream.pause() and begin the "later" part with stream.resume(), for example:
const output = await new Promise<Buffer>((resolve, reject) => {
  stream.resume();
  stream.on('data', (chunk) => {
    ...
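Put together, a minimal sketch of that pause/resume pattern (plain JavaScript, types omitted; readAll is just a hypothetical helper name):
const { PassThrough } = require('stream');

const stream = new PassThrough();
stream.write('Test chunk.');
stream.end();
stream.pause();        // hold the data until a consumer is actually listening

// Later, wherever the stream is finally consumed:
function readAll(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.resume();   // resuming is deferred to a later tick, so the
                       // listeners attached below will not miss anything
    stream.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
    stream.on('error', reject);
    stream.on('end', () => resolve(Buffer.concat(chunks)));
  });
}

readAll(stream).then((output) => console.log(output.toString())); // "Test chunk."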
To send a PDF file from a Node.js server to a client I use the following code:
const pdf = printer.createPdfKitDocument(docDefinition);
const chunks = [];

pdf.on("data", (chunk) => {
  chunks.push(chunk);
});

pdf.on("end", () => {
  const pdfBuffered = `data:application/pdf;base64, ${Buffer.concat(chunks).toString("base64")}`;
  res.setHeader("Content-Type", "application/pdf");
  res.setHeader("Content-Length", pdfBuffered.length);
  res.send(pdfBuffered);
});

pdf.end();
Everything is working correctly; the only issue is that the stream here uses a callback approach rather than async/await.
I've found a possible solution:
const { pipeline } = require("stream/promises");

async function run() {
  await pipeline(
    fs.createReadStream('archive.tar'),
    zlib.createGzip(),
    fs.createWriteStream('archive.tar.gz')
  );
  console.log('Pipeline succeeded.');
}

run().catch(console.error);
But I can't figure out how to adapt my initial code to the stream/promises approach.
You can manually wrap your PDF code in a promise like this and then use it as a function that returns a promise:
function sendPDF(docDefinition) {
  return new Promise((resolve, reject) => {
    const pdf = printer.createPdfKitDocument(docDefinition);
    const chunks = [];
    pdf.on("data", (chunk) => {
      chunks.push(chunk);
    });
    pdf.on("end", () => {
      const pdfBuffered =
        `data:application/pdf;base64, ${Buffer.concat(chunks).toString("base64")}`;
      resolve(pdfBuffered);
    });
    pdf.on("error", reject);
    pdf.end();
  });
}

sendPDF(docDefinition).then(pdfBuffer => {
  res.setHeader("Content-Type", "application/pdf");
  res.setHeader("Content-Length", pdfBuffer.length);
  res.send(pdfBuffer);
}).catch(err => {
  console.log(err);
  res.sendStatus(500);
});
Because there are many data events, you can't promisify just the data portion. You will still have to listen for each data event and collect the data.
You can only convert a callback-API to async/await if the callback is intended to only be executed once.
The one you found online works, because you're just waiting for the whole stream to finish before the callback runs once. What you've got is callbacks that execute multiple times, on every incoming chunk of data.
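The 'end' event, on the other hand, fires exactly once, so that part can be awaited with events.once() while the repeating 'data' events are still collected by a plain listener. A sketch under that assumption (renderPDF is a hypothetical name; printer and docDefinition are the objects from the question):
const { once } = require("events");

async function renderPDF(docDefinition) {
  const pdf = printer.createPdfKitDocument(docDefinition);
  const chunks = [];

  pdf.on("data", (chunk) => chunks.push(chunk)); // fires many times: keep a plain listener
  pdf.end();

  await once(pdf, "end");                        // fires once: safe to await
  return `data:application/pdf;base64, ${Buffer.concat(chunks).toString("base64")}`;
}
The route handler can then do const pdfBuffered = await renderPDF(docDefinition) before setting the headers and calling res.send().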
There are other resources you can look at to make streams nicer to consume, like RxJS, or the upcoming ECMAScript proposal to add observables to the language. Both of these are designed to handle the scenario where a callback can execute multiple times, something that async/await cannot do.
I'm building a browser tool that samples a big file and shows some stats about it.
The program picks k random parts of a file, and processes each part of the file separately. Once each part is processed, an object is modified that keeps track of the rolling "stats" of the file (in the example below, I've simplified to incrementing a rolling counter).
The issue is that now every part is read in parallel, but I'd like it to be in series - so that the updates to the rolling counter are thread safe.
I think the for-loop kicks off the next processFileChunk before the previous one finishes. How do I get this to be done serially?
I'm fairly new to Vue, and frontend in general. Is this a simple asynchronicity problem? How do I tell if something is asynchronous?
Edit: the parsing step uses the papaparse library (which I bet is the asynchronous part)
import {parse} from 'papaparse'

export default {
  data() {
    counter: 0
  },
  methods() {
    streamAndSample(file) {
      var vm = this;
      const k = 10 // number of samples
      var pointers = PickRandomPointers(file) // this is an array of integers, representing a random byte location of a file

      for (const k_th_random_pointer in pointers) {
        processFileChunk(file, k_th_random_pointer)
      }
    }

    processFileChunk(file, k_th_random_pointer) {
      var vm = this;
      var reader = new FileReader();
      reader.readAsText(file.slice(k_th_random_pointer, k_th_random_pointer + 100000)) // read 100 KB
      reader.onload = function (oEvent) {
        var text = oEvent.target.result
        parse(text, {complete: function (res) {
          for (var i = 0; i < res.data.length; i++) {
            vm.counter = vm.counter + 1
          }
        }})
      }
    }
  }
}
"thread safe" JavaScript
JavaScript is single-threaded, so only one thread of execution is run at a time. Async operations are put into a master event queue, and each is run until completion one after another.
Perhaps you meant "race condition", where the chunk size, rather than the read order, determines when each chunk affects the counter. That is, a smaller chunk might finish parsing (and thus bump the counter) earlier than a larger one the parser saw first.
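A tiny illustration of that kind of race, with fakeParse standing in for the real parsing work:
let counter = 0

function fakeParse(name, ms) {
  // stands in for parsing a chunk that takes `ms` milliseconds
  setTimeout(() => {
    counter++
    console.log(`${name} done, counter = ${counter}`)
  }, ms)
}

fakeParse('chunk A (large)', 50) // started first...
fakeParse('chunk B (small)', 10) // ...but B finishes first and bumps the counter before A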
Awaiting each result
To await the completion of parsing for each chunk before moving on to the next, return a Promise from processFileChunk() that resolves with the parsed data length:
export default {
  methods: {
    processFileChunk(file, k_th_random_pointer) {
      return new Promise((resolve, reject) => {
        const reader = new FileReader()
        reader.onload = oEvent => {
          const text = oEvent.target.result
          const result = parse(text)
          resolve(result.data.length)
        }
        reader.onerror = err => reject(err)
        reader.onabort = () => reject()
        reader.readAsText(file.slice(k_th_random_pointer, k_th_random_pointer + 100000)) // read 100 KB
      })
    }
  }
}
Then make streamAndSample() an async function in order to await the result of each processFileChunk() call (the result is the data length resolved in the Promise):
export default {
  methods: {
    // 👇
    async streamAndSample(file) {
      const k = 10
      const pointers = PickRandomPointers(file)

      for (const k_th_random_pointer in pointers) {
        // 👇
        const length = await this.processFileChunk(file, k_th_random_pointer)
        this.counter += length
      }
    }
  }
}
Aside: Instead of passing a cached this into a callback, use an arrow function, which automatically preserves the context. I've done that in the code blocks above.
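For instance, with the onload handler from your code, sketched both ways:
// with a cached `this`
var vm = this
reader.onload = function (oEvent) {
  vm.counter += 1     // `this` inside a regular function is not the component
}

// with an arrow function, no caching needed
reader.onload = (oEvent) => {
  this.counter += 1   // an arrow function keeps the surrounding `this`
}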
It's worth noting that papaparse.parse() also supports streaming for large files (although the starting read index cannot be specified), so processFileChunk() might be rewritten like this:
export default {
  methods: {
    processFileChunk(file, k_th_random_pointer) {
      return new Promise((resolve, reject) => {
        parse(file, {
          chunk(res, parser) {
            console.log('chunk', res.data.length)
          },
          chunkSize: 100000,
          complete: res => resolve(res.data.length),
          error: err => reject(err)
        })
      })
    }
  }
}
Problem: asynchronous code forces the rest of the source code to become asynchronous as well.
Example:
// global scope
let _variableDefinedInParentScope

WriteConfiguration(__params) {
  // destructuring.
  let { data, config } = __params

  // local variables.
  let _fileName, _configuration, _write, _read

  // variable assignment.
  _fileName = config
  _configuration = data

  // if dataset and fileName are not empty.
  if(!_.isEmpty(_configuration) && !_.isEmpty(_fileName)) {
    // create a path you want to write to
    // :warning: on iOS, you cannot write into `RNFS.MainBundlePath`,
    // but `RNFS.DocumentDirectoryPath` exists on both platforms and is writable
    _fileName = Fs.DocumentDirectoryPath + ' /' + _fileName;

    // get file data and return.
    return Fs.readDir(_fileName).then((__data) => {
      console.error(__data)

      // if data is not empty.
      if(!_.isEmpty(__data)) {
        // return data if found.
        return __data
      } else {
        // write the file
        return Fs.writeFile(_fileName, data, 'utf8')
          .then((success) => {
            // on successful file write.
            return success
          })
          .catch((err) => {
            // report failure.
            console.error(err.message);
          })
      }
    })
    .catch((err) => {
      // report failure.
      console.error(err.message)
    })
  }
} // write configuration to json file.
The following are the ways I know of to handle a promise:
.then((__onAccept) => {}, (__onReject) => {})
async function (__promise) { await WriteConfiguration() }
.then((__onAccept) => { _variableDefinedInParentScope = __onAccept })
As far as I know, the third one is useless, because I never get the value back: resolving the promise takes time, and reading that variable before the promise has resolved returns undefined.
React Native
In React Native almost every part of my code is synchronous, but the file-writing modules are asynchronous, and this is causing trouble for me.
What I want
I want to return a value from asynchronous code to synchronous code, without any asynchronous chain.
The answer is quite simple: use await and async.
EXAMPLE:
async mainFunction() {
  // will wait for asyncroFunction to finish!!
  await asyncroFunction()
}

async asyncroFunction() {
}
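Applied to the code in your question, a sketch could look like this (loadOrWriteConfiguration is just a hypothetical caller name; WriteConfiguration already returns a promise chain, so the caller only has to be async and await it):
async function loadOrWriteConfiguration(__params) {
  // pauses here until the read/write promise chain inside WriteConfiguration settles
  const __data = await WriteConfiguration(__params)

  // from this point on the code reads like synchronous code,
  // but the surrounding function is (and has to be) async
  if (!_.isEmpty(__data)) {
    console.log('configuration loaded', __data)
  }
  return __data
}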
I have some functions that connect to a database (Redis) and return some data; those functions are promise-based, asynchronous, and contain streams. I read a few things about testing and chose to go with tape, sinon and proxyquire. If I mock this function, how would I know that it actually works?
The following function (listKeys) returns (through a promise) all the keys that exist in the Redis DB once the scan completes.
let methods = {
  client: client,
  // Cache for listKeys
  cacheKeys: [],
  // Scan incrementally and resolve (through a promise) all keys,
  // storing them in cacheKeys
  listKeys: blob => {
    return new Promise((resolve, reject) => {
      blob = blob ? blob : '*';
      let stream = methods.client.scanStream({
        match: blob,
        count: 10,
      })
      stream.on('data', keys => {
        for (var i = 0; i < keys.length; i++) {
          if (methods.cacheKeys.indexOf(keys[i]) === -1) {
            methods.cacheKeys.push(keys[i])
          }
        }
      })
      stream.on('end', () => {
        resolve(methods.cacheKeys)
      })
      stream.on('error', reject)
    })
  }
}
So how do you test a function like that?
I think there are a couple of ways to exercise this function in a test, and they all revolve around configuring a test stream for your function to consume.
I like to write the test cases I think are important first, then figure out a way to implement them. To me the most important is something like:
it('should resolve cacheKeys on end')
Then a stream needs to be created to provide to your function:
var Stream = require('stream');
var stream = new Stream();
Then the scan stream needs to be controlled by your test. You could do this by creating a fake client:
client = {
  scanStream: (config) => { return stream }
}
Then the test can be configured with your assertion:
var testKeys = ['t'];

methods.listKeys().then((cacheKeys) => {
  assert(cacheKeys).toEqual(testKeys);
  done()
})
Now that your promise is waiting on your stream with an assertion in place, send data to the stream:
stream.emit('data', testKeys)
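Putting those pieces together, a sketch of the whole test could look like this (tape as the runner since you mention it, a bare EventEmitter standing in for the stream because listKeys only listens for events, and methods assumed to be the object from your question):
const test = require('tape');
const { EventEmitter } = require('events');

test('should resolve cacheKeys on end', (t) => {
  t.plan(1);

  // fake stream and fake client injected into methods
  const stream = new EventEmitter();
  methods.client = { scanStream: () => stream };

  const testKeys = ['t'];

  // listKeys attaches its listeners synchronously, so the emits below are seen
  methods.listKeys().then((cacheKeys) => {
    t.deepEqual(cacheKeys, testKeys);
  });

  // drive the fake stream
  stream.emit('data', testKeys);
  stream.emit('end');
});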
A simple way to test whether the keys get saved to cacheKeys properly is to mock the DB stream, send data over it, and check whether the data got saved. E.g.:
// Create a mock readable stream to substitute for the database stream
var { Readable } = require('stream');
var mockStream = new Readable({ read() {} }); // no-op read; we emit events manually

// Create a mock client.scanStream that returns the mocked stream
var client = {
  scanStream: function () {
    return mockStream;
  }
};

// Assign the mocks to methods
methods.client = client;

// Call listKeys(), so the stream gets consumed and the promise awaits resolution
methods.listKeys()
  .then(function (r) {
    // Set up asserts for the correct results here
    console.log('Promise resolved with: ', r);
  });

// Send test data over the mocked stream
mockStream.emit('data', 'hello');

// End the stream to resolve the promise and execute the asserts
mockStream.emit('end');