Splitting a file before upload?

On a webpage, is it possible to split large files into chunks before the file is uploaded to the server? For example, split a 10MB file into 1MB chunks, and upload one chunk at a time while showing a progress bar?
It sounds like JavaScript doesn't have any file manipulation abilities, but what about Flash and Java applets?
This would need to work in IE6+, Firefox and Chrome. Update: forgot to mention that (a) we are using Grails and (b) this needs to run over https.

You can try Plupload. It can be configured to check which runtimes are available on the user's side (Flash, Silverlight, HTML5, Gears, etc.) and use the first one that satisfies the required features. Among other things it supports image resizing (on the user's side, preserving EXIF data(!)), stream and multipart upload, and chunking. Files can be chunked on the user's side and sent to a server-side handler chunk by chunk (this requires some additional care on the server), so that, for example, big files can be uploaded to a server whose maximum file size limit is much lower than their size. And more.
Some runtimes support https, I believe; some need testing. Anyway, the developers there are quite responsive these days. So you might at least try ;)
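To make the chunking idea concrete, here is a minimal sketch of the byte ranges for the 10MB file and 1MB chunks from the question, with the progress value a progress bar could show after each chunk. The name chunkPlan and the percentDone field are made up for illustration and are not part of Plupload; in a browser with the File API, each range would be passed to file.slice(start, end) before sending.

```javascript
// Hypothetical helper: plan the byte ranges for a chunked upload.
// Each entry says which bytes to send and how far along the upload
// will be (in percent) once that chunk has been accepted.
function chunkPlan(totalSize, chunkSize) {
  if (chunkSize <= 0) {
    throw new RangeError('chunkSize must be positive');
  }
  const plan = [];
  for (let start = 0; start < totalSize; start += chunkSize) {
    // The last chunk may be shorter than chunkSize.
    const end = Math.min(start + chunkSize, totalSize);
    plan.push({
      start,
      end,
      percentDone: Math.round((end / totalSize) * 100),
    });
  }
  return plan;
}

// The example from the question: a 10MB file in 1MB chunks.
const MB = 1024 * 1024;
const plan = chunkPlan(10 * MB, 1 * MB);
console.log(plan.length, plan[0], plan[plan.length - 1]);
```

The server still has to reassemble the chunks in order, which is the "additional care" mentioned above.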

The only option I know of that would allow this would be a signed Java applet.
Unsigned applets and Flash movies have no filesystem access, so they wouldn't be able to read the file data. Flash is able to upload files, but most of that is handled by the built-in Flash implementation and from what I remember the file contents would never be exposed to your code.

There is no JavaScript solution for that selection of browsers. There is the File API but whilst it works in newer Firefox and Chrome versions it's not going to happen in IE (no sign of it in IE9 betas yet either).
In any case, reading the file locally and uploading it via XMLHttpRequest is inefficient because XMLHttpRequest does not have the ability to send pure binary, only Unicode text. You can encode binary into text using base-64 (or, if you are really dedicated, a custom 7-bit encoding of your own) but this will be less efficient than a normal file upload.
You can certainly do uploads with Flash (see SWFUpload et al), or even Java if you must (Jumploader... I wouldn't bother, these days, though, as Flash prevalence is very high and the Java plugin continues to decline). You won't necessarily get the low-level control to split into chunks, but do you really need that? What for?
Another possible approach is to use a standard HTML file upload field and, when the submit occurs, set an interval call to poll the server with XMLHttpRequest, asking it how far the file upload has got. This requires a bit of work on the server end to store the current upload progress in the session or database, so that another request can read it. It also means using a form parsing library that gives you a progress callback, which most standard built-in ones, like PHP's, don't.
Whatever you do, take a ‘progressive enhancement’ approach, allowing browsers with no support to fall back to a plain HTML upload. Browsers do typically have an upload progress bar for HTML file uploads, it just tends to be small and easily missed.

Do you specifically need it to be in X chunks? Or are you trying to solve the problems caused by uploading large files (e.g. the upload can't be restarted on the client side, or the server side crashes when the entire file is uploaded and held in memory all at once)?
Search for streaming upload components. Which component you will prefer depends on the technologies you are working with: JSP, ASP.NET, etc.
http://krystalware.com/Products/SlickUpload/ This one is a server side product
Here are some more pointers to various uploaders http://weblogs.asp.net/jgalloway/archive/2008/01/08/large-file-uploads-in-asp-net.aspx
Some try to manage memory on the server (e.g. so that the entire huge file isn't in memory at one time); some try to manage the client-side experience.

Related

How to efficiently handle large file uploads with django+react?

I have a Django application with a React frontend, and need to upload multi-gigabyte files (around 8-12gb) selected from the React on a remote machine.
Currently I'm using a simple form and uploading the files via fetch with formData in the body and then reading them out of request.FILES, however I'm not very happy with this solution, since it takes over a minute to upload the files even from the same computer to itself (and I'm not entirely sure why).
Is there any way to speed up this process? The production environment for this is a gigabit local network without any external internet access so no cloud storage please.
The data seems to be fairly compressible, easily reducing its size by 30%+ when zipped/tarred.
Is there any way to compress during file uploads, or any parameters I can change on either end to speed up the process?
It doesn't seem like an 8 GB file should take over a minute to upload from the same machine to itself, and in fact should theoretically be faster than that over a gigabit network. How can I streamline this?
It'd be nice if I could make a progress bar on the frontend, but fetch doesn't currently allow that. If another method that helps with the file uploads also happens to have a way of monitoring upload progress, please mention it.
React 15.4, Django 1.10 and Python 3.5 are in use. Lots of memory, many CPU cores and, on the server, even a GPU (in case someone has some bizarre idea to use CUDA for decompression or something, though I've never heard of such a thing) are available, in case there's some sort of parallelization that could help saturate the connection.
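As a rough sanity check on the numbers in the question, the arithmetic below estimates the best-case transfer time over a gigabit link. It is only a sketch: it assumes 1 Gbit/s means 10^9 bits per second and 8 GB means 8 * 10^9 bytes, and it ignores protocol overhead, disk speed and the time spent compressing. The function name transferSeconds is made up for illustration.

```javascript
// Best-case time to move sizeBytes over a link of linkBitsPerSecond.
function transferSeconds(sizeBytes, linkBitsPerSecond) {
  return (sizeBytes * 8) / linkBitsPerSecond;
}

const GIGABIT = 1e9;   // assumed: 10^9 bits per second
const FILE_BYTES = 8e9; // assumed: 8 GB as 8 * 10^9 bytes

// Uncompressed: 64 Gbit over 1 Gbit/s.
const rawSeconds = transferSeconds(FILE_BYTES, GIGABIT);

// With the 30% size reduction mentioned in the question.
const compressedSeconds = transferSeconds((FILE_BYTES * 70) / 100, GIGABIT);

console.log(rawSeconds, compressedSeconds);
```

Under these assumptions an 8 GB file needs about 64 seconds at full line rate, so "over a minute" is close to the limit of a gigabit network rather than far below it; the 30% compression would bring the figure to roughly 45 seconds, before counting the time spent compressing. The same-machine case is not bound by the network link, so other factors are at work there.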

Client-side zipping with Flash + JavaScript

I'm looking for a robust way of creating a zip archive on the fly from information on a given page and making this available for download. Client-side zipping is a must since my script runs from a bookmarklet.
My first approach while I was more concerned with writing the rest of the script was just to post the information to a few lines of PHP running on my local server which zipped it and sent it back. This is obviously not suitable for a bookmarklet worth sharing.
I found JSZip earlier today, and I thought that'd be the end of it. This library works great when it works; unfortunately, the archives I'm creating frequently exceed a couple of MBs, and this breaks JSZip. (Note: I've only tested this on Chrome.)
Pure JS downloads also have the limitation of funky file names due to the data URI, which I intended to solve using JSZip's recommended method, Downloadify, which uses Flash. This made me wonder whether the size limitations on JS zip generation could be (or have been) overcome by using a similar interplay of Flash and JS.
I Googled this, but having no experience with ActionScript I couldn't quickly figure out whether what I'm asking is possible. Is it possible to use a Flash object from JS to create a relatively large (into the tens of MBs) zip file on the client side?
Thanks!

First of all some numbers:
Flash promises that uploads will work if the file is smaller than 100 MB (I don't know whether that means base 10 or base 2).
There are two popular libraries in Flash for creating ZIP archives, but read on first.
A ZIP archiver is a program that both compresses and archives the data, and it does so in exactly this order: it compresses each file separately and then appends it to the archive. This yields a worse compression rate but allows the archive to be created iteratively, with the benefit that you can even start sending the archive before it is entirely compressed.
An alternative to ZIP is to use a dedicated archiver first and then compress the entire archive at once. This can sometimes achieve compression that is a few times better, but the cost is that you have to process all the data at once.
Flash's ByteArray.compress() method, however, offers you a native implementation of the deflate algorithm, which is mostly the same thing you would use in a ZIP archiver. So, if you implemented something like tar, you could significantly reduce the size of the files being sent.
But Flash is a single-threaded environment, so you would have to be careful about the size of the data you compress, and you will probably have to find the limit empirically. Or just use ZIP: more redundancy, but easier to implement.
I've used this library before: nochump. I didn't have any problems with it, although it is somewhat old, and it might make sense to port it to use Alchemy opcodes (which give fast memory access and significantly reduce the cost of low-level binary operations such as bitwise OR and bitwise AND). This library implements the CRC32 algorithm, which is an essential part of a ZIP archive, and it uses Alchemy, so it should be considerably faster, but you would have to implement the rest on your own.
Yet another option you might consider is Google's NaCl: there you would be able to choose among archiver and compression implementations because it essentially runs native code, so you could even use bz2 and other modern stuff. Unfortunately, it works only in Chrome (and users must enable it) or Firefox (which needs a plugin).

Is it possible to implement any kind of file upload recovery / resumption in a browser?

The project is a servlet to which people can upload files via, at present, HTTP POST. This is accompanied by Web page(s) providing a front-end to trigger the upload. We have more or less complete control over the servlet, and the Web pages, but don't want to impose any restrictions on the client beyond being a reasonably modern browser with Javascript. No Java applets etc.
Files may potentially be large, and a possible use case is mobile devices on less reliable networks. Some people on the project are demanding the ability to resume an upload if the network connection goes down. I don't think this is possible with plain HTTP and Javascript in a browser, but I'd love to be proved wrong.
Any suggestions?

Not with Plain Ol' JS. It doesn't have access to the file system, not even a file added to an input type=file control and so it cannot manipulate the data and upload via XHR instead.
You would have to look into a Flash or Java based alternative.

With your current restrictions, no.
(There may be a tiny chance that using the HTML5 file api could be capable of doing this. Maybe someone more knowledgeable can comment because I usually cannot make heads or tails of technical specifications from the w3c : http://www.w3.org/TR/file-upload/ )

Firefox 3.6 implements a FileReader interface; however, it doesn't seem to support any form of skipping. Therefore, you would need to read the whole file and split it at the point where you need to resume.
This would not be especially useful for large files, since you would probably crash the browser anyway because of the memory allocation it would need.
https://developer.mozilla.org/en/DOM/FileReader
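Later versions of the File API added Blob.slice(start, end), which addresses the "skipping" problem described above: a slice is only a reference to a byte range, so the part that was already uploaded never has to be read into memory. The sketch below assumes the server can report how many bytes it has already received; the name pendingSlices and the shape of the yielded objects are made up for illustration.

```javascript
// Hypothetical helper: lazily yield the slices of `file` that still need
// to be sent, starting at `offset` (the number of bytes the server says
// it already has). `file` is a File/Blob, or anything else with a `size`
// property and a `slice(start, end)` method.
function* pendingSlices(file, offset, chunkSize) {
  if (chunkSize <= 0) {
    throw new RangeError('chunkSize must be positive');
  }
  for (
    let start = Math.min(offset, file.size);
    start < file.size;
    start += chunkSize
  ) {
    const end = Math.min(start + chunkSize, file.size);
    // slice() does not read the data; it only describes the byte range.
    yield { start, blob: file.slice(start, end) };
  }
}

// Usage sketch (browser): send each remaining slice in turn.
// `sendChunk` is a hypothetical function that uploads one Blob.
// for (const { start, blob } of pendingSlices(file, bytesOnServer, 1024 * 1024)) {
//   await sendChunk(blob, start);
// }
```

Because the generator is lazy, only one chunk needs to be in flight at a time, regardless of how large the file is.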

Is it possible to compute a file's SHA1 ID using Javascript?

If this were possible to do prior to posting a form, it may save me having to upload the file to my server...

To do that you would have to load the file's binary information into JavaScript. Which is not possible.
But here's an implementation of SHA1 in JavaScript.

Actually you can read the contents of a client-side file now, as long as it's chosen in a file upload field and you are using Firefox. See the input.files array. You can then indeed hash it, although it might be rather slow.
See How would I get a Hash value of a users file with Javascript or Flash? for an example and a compact SHA-1 implementation.

It is possible to use SHA1, though performance isn't going to be the best...
For anything over a few hundred KB you will have to run some benchmarks and determine whether it's indeed a viable solution.
See this link for a good implementation (passpack and quite a few OS projects use it)
Edit:
As others have already replied, actually getting the file contents may be a whole different matter, so unless you use something like Google Gears or Adobe AIR it should be virtually impossible.

One can read their local file using the HTML5 File interface: https://developer.mozilla.org/en-US/docs/Web/API/File
You can then use a library like Crypto.js (https://code.google.com/p/crypto-js/) to compute the hash of the text you read.
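For illustration, here is a compact, dependency-free sketch of SHA-1 over raw bytes. It is not the code of Crypto.js or of any library mentioned on this page, and the name sha1Hex is made up. It assumes the file contents are already available as a Uint8Array, for example from FileReader.readAsArrayBuffer in a browser. Note that it pads a copy of the whole input, so memory use grows with the file size.

```javascript
// SHA-1 of a Uint8Array, returned as a 40-character hex string.
function sha1Hex(bytes) {
  const len = bytes.length;
  // Pad to a multiple of 64 bytes: message, 0x80, zeros, 64-bit bit length.
  const paddedLen = Math.ceil((len + 9) / 64) * 64;
  const padded = new Uint8Array(paddedLen);
  padded.set(bytes);
  padded[len] = 0x80;
  const view = new DataView(padded.buffer);
  view.setUint32(paddedLen - 8, Math.floor(len / 0x20000000)); // high 32 bits
  view.setUint32(paddedLen - 4, (len * 8) >>> 0);              // low 32 bits

  let h0 = 0x67452301, h1 = 0xefcdab89, h2 = 0x98badcfe,
      h3 = 0x10325476, h4 = 0xc3d2e1f0;
  const w = new Uint32Array(80);
  const rotl = (x, n) => (x << n) | (x >>> (32 - n));

  for (let off = 0; off < paddedLen; off += 64) {
    // Message schedule: 16 big-endian words expanded to 80.
    for (let i = 0; i < 16; i++) w[i] = view.getUint32(off + i * 4);
    for (let i = 16; i < 80; i++) {
      w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    }
    let a = h0, b = h1, c = h2, d = h3, e = h4;
    for (let i = 0; i < 80; i++) {
      let f, k;
      if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5a827999; }
      else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ed9eba1; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8f1bbcdc; }
      else             { f = b ^ c ^ d;                   k = 0xca62c1d6; }
      const t = (rotl(a, 5) + f + e + k + w[i]) | 0;
      e = d; d = c; c = rotl(b, 30); b = a; a = t;
    }
    h0 = (h0 + a) | 0; h1 = (h1 + b) | 0; h2 = (h2 + c) | 0;
    h3 = (h3 + d) | 0; h4 = (h4 + e) | 0;
  }
  return [h0, h1, h2, h3, h4]
    .map((h) => (h >>> 0).toString(16).padStart(8, '0'))
    .join('');
}

// Usage: hash the UTF-8 bytes of a short string.
console.log(sha1Hex(new TextEncoder().encode('abc')));
// → a9993e364706816aba3e25717850c26c9cd0d89d
```

Hashing bytes rather than text avoids encoding surprises with binary files. For large files, a streaming implementation that processes one slice at a time would be preferable to this all-at-once sketch.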

No, you can't access a file from a local computer using JavaScript.
You're going to have to upload it to the server first, then check the checksum of the file.

Not natively, no, and this is a bad idea anyway. Every byte in the file will have to be loaded into memory by Javascript, and you'd need a way to get it there.
If you must do this and you've got a way to put the file's binary information into your script, then there's plenty of third-party scripts you can use. Here's one, for example.

You could do this with a Java applet. I've never used any of them, but there are quite a few Java upload applets out there. The hash algorithm itself is available in Java and can be accessed through java.security.MessageDigest. If the client doesn't have the Java plug-in available, you could just fall back to a regular upload and hash on the server.
A side note: depending upon why you're hashing the file you'll probably want to re-hash it on the server after the upload rather than trust the client.

Is it possible to write to a file (on a disk) using JavaScript?

I am a novice-intermediate programmer taking a stab at AJAX. While reading up on JavaScript I found it curious that most of the examples I've been drawing on use PHP for such an operation. I know many of you may argue that 'I'm doing it wrong' or 'JavaScript is a client-side language' etc. but the question stands. . .can you write a file in only JavaScript?

Yes, of course you can. It just depends on what API objects your javascript engine makes available to you.
However, odds are the javascript engine you're thinking about does not provide this capability. Definitely none of the major web browsers will allow it.

You can write cookies with JavaScript, and on newer browsers you also have an SQLite database for storing client-side data. You cannot store data in an arbitrary location on the disk, though.

You can use something like Google Gears to produce JS applications which are capable of storing data in a local cache or database. You can't read or write arbitrary areas of the disk though. (This was written in 2009 - Google Gears is now deprecated)
These days, you should be looking at the local storage capabilities provided by HTML5.

No. You could use JavaScript to create an AJAX request to a server-side processing script, but allowing JS to directly write to disk - either client-side or server-side - would be a massive, nasty, glaring, unforgivable browser security hole.

The short answer is no; you cannot by default write a file to the local disk, by using plain JavaScript in a browser. You'll need a helper to do that. For example, TiddlyWiki is a wiki engine that is just a single, static HTML file, but it can write itself to disk with the help of a Java applet (Tiddly Saver).

You can in Windows Script Host.

The next version of Chrome (v52) makes this possible with the fetch API + service workers + streams; you can enable streams now with a flag.
See StreamSaver.js for some examples of how to use it.
You can do something like this:
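// `fs` is not defined in this snippet; it presumably refers to the StreamSaver.js object
const writeStream = fs.createWriteStream('filename.txt')
const encoder = new TextEncoder()
const data = 'a'.repeat(1024)
// Encode the text as UTF-8 bytes before writing
const uint8array = encoder.encode(data + "\n\n")
writeStream.write(uint8array)
writeStream.close()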
Or just go ahead and look at the demos: https://jimmywarting.github.io/StreamSaver.js/example.html

Nope, JavaScript is not allowed to access the filesystem at all; it's a security restriction in the browser. The only way you can really do it is with ActiveX, but then you're limiting yourself to using IE.
Edit:
As the above post states, it could be possible if your engine allowed it; however, I don't know of any browser engine (which is what I assume you are writing it for) that will allow you to.

If you just need to let the user download a file (.txt, .csv, images and others) via the browser's download dialog, you can use data URIs with an <a href=... download=.../> tag.
For example (for text file):
Click to download
You can also set the attribute href and download using javascript, and use element.click() to trigger the download.
However, this method cannot write a file without user confirming the file download dialog.
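A minimal sketch of the script-driven variant described above. The function names textDataUri and downloadText are made up for illustration; the second function needs a browser DOM, while the first is plain string handling.

```javascript
// Build a data URI for a piece of text. encodeURIComponent percent-encodes
// the text as UTF-8, so no base64 step is needed for plain text.
function textDataUri(text) {
  return 'data:text/plain;charset=utf-8,' + encodeURIComponent(text);
}

// Browser-only: create a link, point it at the data URI, set the
// suggested file name through the download attribute, and click it.
function downloadText(filename, text) {
  const link = document.createElement('a');
  link.href = textDataUri(text);
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link);
}

// Usage (in a browser, e.g. from a button's click handler):
// downloadText('hello.txt', 'foo bar');
console.log(textDataUri('foo bar'));
// → data:text/plain;charset=utf-8,foo%20bar
```

Where the file ends up, and whether the user is asked first, is still decided by the browser's download settings.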
