I was wondering if it is possible to stream data from JavaScript to the browser's download manager.
Using WebRTC, I stream data (from files > 1 GB) from one browser to another. On the receiver side, I store all this data in memory (as ArrayBuffers ... so the data is essentially still in chunks), and I would like the user to be able to download it.
Problem: Blob objects have a maximum size of about 600 MB (depending on the browser), so I can't re-create the file from the chunks. Is there a way to stream these chunks so that the browser downloads them directly?
If you want to fetch a large file blob from an API or URL, you can use streamsaver.
npm install streamsaver
Then you can do something like this:
import { createWriteStream } from 'streamsaver';

export const downloadFile = (url, fileName) => {
  return fetch(url).then(res => {
    const fileStream = createWriteStream(fileName);
    const writer = fileStream.getWriter();

    // If the browser supports pipeTo, let it handle the plumbing
    if (res.body.pipeTo) {
      writer.releaseLock();
      return res.body.pipeTo(fileStream);
    }

    // Otherwise pump the response body chunk by chunk
    const reader = res.body.getReader();
    const pump = () =>
      reader
        .read()
        .then(({ value, done }) => (done ? writer.close() : writer.write(value).then(pump)));
    return pump();
  });
};
And you can use it like this:
const url = "http://urltobigfile";
const fileName = "bigfile.zip";
downloadFile(url, fileName).then(() => { alert('done'); });
Following @guest271314's advice, I added StreamSaver.js to my project, and I successfully received files bigger than 1 GB on Chrome. According to the documentation, it should work for files up to 15 GB, but my browser crashed before that (the maximum file size was about 4 GB for me).
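Roughly, the receiving side ends up looking like the sketch below (the data-channel handling and the completion signal are illustrative, not my exact code): each chunk arriving over the RTCDataChannel is written straight into a StreamSaver writable stream instead of being buffered into one giant Blob.
import streamSaver from 'streamsaver';

function saveIncomingFile(dataChannel, fileName, fileSize) {
  const fileStream = streamSaver.createWriteStream(fileName, { size: fileSize });
  const writer = fileStream.getWriter();

  dataChannel.onmessage = event => {
    // event.data is an ArrayBuffer chunk sent by the peer
    writer.write(new Uint8Array(event.data));
  };

  // Call this once the sender signals that the transfer is complete
  // (however your signalling protocol defines that), so the download finalizes.
  return () => writer.close();
}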
Note I: to avoid the Blob max-size limitation, I also tried to manually append data to the href field of an <a></a>, but it failed with files of about 600 MB ...
Note II: as amazing as it might seem, the basic technique using createObjectURL works perfectly fine on Firefox for files up to 4 GB!
I have a .zip file that is in memory as a File object. I want to access the individual files inside it and add them to an array of File objects in memory. I see several options online, but they all require accessing a physical .zip file on disk. How can I do it without saving it as a physical file?
You can certainly do this with JSZip; since File implements Blob, you can probably just do JSZip.loadAsync(yourFileObject).then(zip => { /* do something */ }). See the docs. You'll want to iterate over every file in the archive and create blobs, optimally with Promise.all(), as in the sketch below.
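Something along these lines should work (an untested sketch; zipFile is assumed to be your in-memory File object):
import JSZip from 'jszip';

async function getFilesWithJSZip(zipFile) {
  const zip = await JSZip.loadAsync(zipFile);
  // Skip directory entries, keep only real files
  const entries = Object.values(zip.files).filter(entry => !entry.dir);
  // Extract every entry in parallel and wrap each one in a File object
  return Promise.all(
    entries.map(async entry => new File([await entry.async('blob')], entry.name))
  );
}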
However, for considerably better performance and smaller bundle size, I would like to point you to my library fflate. Here's how to get an array of File objects with fflate:
// If you aren't using a bundler, see the CDN instructions in the docs
import { unzipSync, unzip } from 'fflate';

// multithreaded = false is slower and blocks the UI thread if the files
// inside are compressed, but it can be faster if they are not.
const getFiles = async (zipFile, multithreaded = true) => {
  const zipBuffer = new Uint8Array(await zipFile.arrayBuffer());
  const unzipped = multithreaded
    ? await new Promise((resolve, reject) => unzip(
        zipBuffer,
        (err, unzipped) => err
          ? reject(err)
          : resolve(unzipped)
      ))
    : unzipSync(zipBuffer);
  const fileArray = Object.keys(unzipped)
    .filter(filename => unzipped[filename].length > 0)
    .map(filename => new File([unzipped[filename]], filename));
  return fileArray;
}
console.log(someFileObject);
// File { ... }
getFiles(someFileObject).then(console.log)
// [File { ... }, File { ... }, ...]
Currently, as a requirement, if a user wishes to download a large zip file, the download is streamed.
This is done by fetching an endpoint, then using StreamSaver.js to stream the download to their browser as shown below.
function download(id, fileName) {
  const endpoint = `.../extract/downloads/zip_download/?id=${id}`;
  return fetch(endpoint, requestOptions.get()).then(res => {
    const downloadSize = res.headers.get("content-length");
    const fileStream = createWriteStream(fileName, { size: downloadSize });
    const writer = fileStream.getWriter();
    if (res.body.pipeTo) {
      writer.releaseLock();
      return res.body.pipeTo(fileStream);
    }
    const reader = res.body.getReader();
    const pump = () =>
      reader
        .read()
        .then(({ value, done }) =>
          done ? writer.close() : writer.write(value).then(pump)
        );
    return pump();
  });
}
This works fine in Chrome, however I'm running into issues with Firefox and Safari. The error I get is:
TypeError: undefined is not a constructor (evaluating 'new streamSaver.WritableStream')
What other methods are there of approaching this? Surely there must be a universal way to stream a download of a large file that I'm missing?
I ran into the same issue and included the web-streams-polyfill package in my project to fix it. Currently, some non-Chromium browsers do not support WritableStream natively.
For myself, I simply included this script tag in my index.html:
<script src="https://unpkg.com/web-streams-polyfill/dist/polyfill.min.js"></script>
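If you're using a bundler instead, something like the following should also work (a sketch; the exact import path can differ between web-streams-polyfill versions): StreamSaver lets you override the WritableStream implementation it uses, so you can hand it the ponyfill build explicitly.
import streamSaver from 'streamsaver';
// Ponyfill entry point; the path may differ depending on your web-streams-polyfill version
import { WritableStream } from 'web-streams-polyfill/ponyfill';

// Only override when the browser lacks a native implementation
if (!window.WritableStream) {
  streamSaver.WritableStream = WritableStream;
}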
I have a local JSON file which I intend to read/write from a Node.js Electron app. I am not sure, but I believe that instead of using readFile() and writeFile(), I should get a FileHandle to avoid multiple open and close actions.
So I've tried to grab a FileHandle from fs.promises.open(), but the problem seems to be that I am unable to get a FileHandle for an existing file without truncating it and clearing it to 0 bytes.
const { resolve } = require('path');
const fsPromises = require('fs').promises;

function init() {
  // Save table name
  this.path = resolve(__dirname, '..', 'data', `test.json`);
  // Create/open the JSON file
  fsPromises
    .open(this.path, 'wx+')
    .then(fileHandle => {
      // We only get a file handle here if the file didn't exist yet,
      // because the 'wx+' flag fails when it does
      this.fh = fileHandle;
    })
    .catch(err => {
      if (err.code === 'EEXIST') {
        // File exists
      }
    });
}
Am I doing something wrong? Are there better ways to do it?
Links:
https://nodejs.org/api/fs.html#fs_fspromises_open_path_flags_mode
https://nodejs.org/api/fs.html#fs_file_system_flags
Because JSON is a text format that has to be read or written all at once and can't easily be modified or appended to in place, you're going to have to read the whole file or write the whole file at once.
So, your simplest option will be to just use fs.promises.readFile() and fs.promises.writeFile() and let the library open the file, read/write it, and close the file. Opening and closing a file in a modern OS takes advantage of disk caching, so if you're reopening a file you previously opened not long ago, it's not going to be a slow operation. Further, since Node.js performs these operations in the libuv thread pool, it doesn't block the main thread of Node.js either, so it's generally not a performance issue for your server.
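A minimal sketch of that approach (loadTable/saveTable and the pretty-printing are my own illustrative choices, not part of your code):
const fsPromises = require('fs').promises;

async function loadTable(path) {
  const text = await fsPromises.readFile(path, 'utf8');
  return JSON.parse(text);
}

async function saveTable(path, data) {
  // Serialize with indentation so the file stays human-readable
  await fsPromises.writeFile(path, JSON.stringify(data, null, 2));
}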
If you really wanted to open the file once and hold it open, you would open it for reading and writing using the r+ flag as in:
const fileHandle = await fsPromises.open(this.path, 'r+');
Reading the whole file is simple, as the new fileHandle object has a .readFile() method.
const text = await fileHandle.readFile({ encoding: 'utf8' });
For writing the whole file from an open filehandle, you would have to truncate the file, then write your bytes, then flush the write buffer to ensure the last bit of the data got to the disk and isn't sitting in a buffer.
await fileHandle.truncate(0); // clear previous contents
let {bytesWritten} = await fileHandle.write(mybuffer, 0, someLength, 0); // write new data
assert(bytesWritten === someLength);
await fileHandle.sync(); // flush buffering to disk
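Putting those steps together, a small helper might look like this (a sketch; writeJson is my own name for it, and it assumes the handle was opened with 'r+'):
async function writeJson(fileHandle, data) {
  const buf = Buffer.from(JSON.stringify(data, null, 2), 'utf8');
  await fileHandle.truncate(0);                                           // clear previous contents
  const { bytesWritten } = await fileHandle.write(buf, 0, buf.length, 0); // write new data
  if (bytesWritten !== buf.length) throw new Error('short write');
  await fileHandle.sync();                                                // flush buffering to disk
}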
I want to download an encrypted file from my server, decrypt it and save it locally. I want to decrypt the file and write it locally as it is being downloaded rather than waiting for the download to finish, decrypting it and then putting the decrypted file in an anchor tag. The main reason I want to do this is so that with large files the browser does not have to store hundreds of megabytes or several gigabytes in memory.
This is only going to be possible with a combination of service worker + fetch + streams.
A few browsers have service workers and fetch, but even fewer support fetch with streaming (Blink):
new Response(new ReadableStream({...}))
I have built a streaming file-saver lib that communicates with a service worker in order to intercept network requests: StreamSaver.js
It's a little bit different from Node's streams; here is an example:
function unencrypt(chunk) {
  // should return a Uint8Array
  return new Uint8Array()
}

// We use fetch instead of xhr, since fetch has streaming support
fetch(url).then(res => {
  // create a writable stream + intercept a network response
  const fileStream = streamSaver.createWriteStream('filename.txt')
  const writer = fileStream.getWriter()

  // stream the response
  const reader = res.body.getReader()
  const pump = () => reader.read()
    .then(({ value, done }) => {
      if (done) return writer.close()
      let chunk = unencrypt(value)
      // Write one chunk, then get the next one
      writer.write(chunk) // returns a promise
      // While the write stream can handle the watermark,
      // read more data
      return writer.ready.then(pump)
    })

  // Start the reader
  pump().then(() =>
    console.log('Closed the stream, done writing')
  )
})
There are also two other ways you can get a streaming response with xhr, but they're not standard and it doesn't matter whether you use them (responseType = ms-stream || moz-chunked-arrayBuffer), because StreamSaver depends on fetch + ReadableStream anyway and can't be used any other way.
Later you will be able to do something like this when WritableStream + Transform streams get implemented as well:
fetch(url).then(res => {
  const fileStream = streamSaver.createWriteStream('filename.txt')

  res.body
    .pipeThrough(unencrypt)
    .pipeTo(fileStream)
    .then(done)
})
It's also worth mentioning that the default download manager is commonly associated with background downloads, so people sometimes close the tab when they see the download start. But this is all happening in the main thread, so you need to warn the user when they leave:
window.onbeforeunload = function(e) {
  if (download_is_done()) return

  var dialogText = 'Download is not finished, leaving the page will abort the download'
  e.returnValue = dialogText
  return dialogText
}
A new solution has arrived: showSaveFilePicker/FileSystemWritableFileStream, supported in Chrome, Edge, and Opera since October 2020 (and with a ServiceWorker-based shim for Firefox, written by the author of the other major answer!), allows you to do this directly:
async function streamDownloadDecryptToDisk(url, DECRYPT) {
  // create readable stream for ciphertext
  let rs_src = fetch(url).then(response => response.body);

  // create writable stream for file
  let ws_dest = window.showSaveFilePicker().then(handle => handle.createWritable());

  // create transform stream for decryption
  let ts_dec = new TransformStream({
    async transform(chunk, controller) {
      controller.enqueue(await DECRYPT(chunk));
    }
  });

  // stream cleartext to file
  let rs_clear = rs_src.then(s => s.pipeThrough(ts_dec));
  return (await rs_clear).pipeTo(await ws_dest);
}
Depending on your performance needs (if you're trying to compete with MEGA, for instance), you might also consider modifying DECRYPT(chunk) so you can use a ReadableStreamBYOBReader with it:
…zero-copy reading from an underlying byte source. It is used for efficient copying from underlying sources where the data is delivered as an "anonymous" sequence of bytes, such as files.
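For illustration, a BYOB read loop over the response body might look roughly like this (a sketch; it assumes the body is a byte stream, which fetch responses are in Chromium, and it reuses a single buffer between reads):
async function decryptWithByob(response, DECRYPT) {
  const reader = response.body.getReader({ mode: 'byob' });
  let buffer = new ArrayBuffer(64 * 1024); // one reusable 64 KiB buffer

  while (true) {
    // read() transfers the buffer in and hands back a view over it
    const { value, done } = await reader.read(new Uint8Array(buffer));
    if (done) break;
    await DECRYPT(value);    // value is a Uint8Array over the transferred buffer
    buffer = value.buffer;   // reclaim the buffer for the next read
  }
}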
For security reasons, browsers do not allow piping an incoming readable stream directly to the local file system, so you have two ways to solve it:
window.open(Resource_URL): download the resource in a new window with Content-Disposition set to "attachment";
<a download href="path/to/resource"></a>: using the "download" attribute of an AnchorElement to download the stream to the hard disk (sketched below);
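A small sketch of the anchor approach, created programmatically (note the download attribute only takes effect for same-origin URLs, or when the server sends Content-Disposition: attachment):
// Create a temporary link and click it to hand the URL to the browser's download manager
const a = document.createElement('a');
a.href = 'path/to/resource';
a.download = 'bigfile.zip'; // suggested file name
document.body.appendChild(a);
a.click();
a.remove();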
Hope this helps :)
Assuming there's a server storing multiple files (not necessarily text documents):
http://<server>/<path>/file0001.txt ... http://<server>/<path>/file9999.txt
If a user were to download all of those files as one, how would I do it in JavaScript?
Normally the user would have to download all 9999 files and join them on their drive.
How can I prompt a download of a file and stream the data of multiple files while JavaScript fetches them, just as if it were a stream of one big file?
I imagine it would be something like this (excuse me for lack of javascript, just trying to explain):
With (download prompt of 'onefile.txt') as connection:
    While connection is open:
        For file in file_list:
            get file
            return file.contents
    connection close
Downloading each file and storing it in memory until the last one is retrieved is not a good idea, since the overall size of all those files can be quite big.
I'm wondering if that's even possible. I can write it in python, but that's another story. I wanted to make it a javascript function on a website.
I'm surprised javascript can't just create a "virtual localhost connection" where it uses some generator to "yield" the contents of each file...
Well, if you use a service worker, then you can manipulate the response and give it a ReadableStream from which you can "yield" the content of each file...
This is what StreamSaver does internally, but it takes away all the hassle...
I will show you an example using ES6 and StreamSaver.js
It's not tested, it's just a rough idea.
This will consume very little memory, but it's limited to only Blink ATM if you want to use StreamSaver.
// Promise.coroutine comes from the bluebird library
let download = Promise.coroutine(function* (files) {
  const fileStream = streamSaver.createWriteStream('onefile.txt')
  const writeStream = fileStream.getWriter()

  // Later you will be able to just simply do
  // yield res.body.pipeTo(fileStream) instead of pumping

  for (let file of files) {
    let res = yield fetch(file)
    let reader = res.body.getReader()

    let pump = () => reader.read()
      .then(({ value, done }) => !done &&
        // Write one chunk, then get the next one
        writeStream.write(value).then(pump)
      )

    yield pump()
  }
  // Close the stream when you are done writing
  writeStream.close()
})
download([
  'http://<server>/<path>/file0001.txt',
  'http://<server>/<path>/file9999.txt'
]).then(() => {
  alert('all done')
})