How to enforce file size limit in jquery.fileupload - javascript

I'm using jquery.fileupload() to upload files from a browser to a Node.js server, which parses the files with the "multiparty" npm package. I need to enforce a size limit on each file, and also on the total size of all the files being uploaded in one request.
The "multiparty" package lets me do the latter, but not the former. And even for the latter, the limit isn't enforced until the browser has uploaded enough data to hit it, so the user can wait a long time only to get an error message.
I'd like to enforce the limits on the client side. I've searched the Internet for solutions, but none of them seem to work. They may have worked in the past, but not with the newest version of Chrome.
I've found that I can determine that the files are too big by watching for a "change" event on the file-input element, like this:
$('#file-input-id').on('change', function() {
    console.log(this.files);
});
When this event triggers, this.files contains a FileList of the selected files, including the name and size of each. So I can determine that the caps have been exceeded, and I can alert the user. But I don't know how to stop the files from uploading anyway. Various sources on the Internet suggest that I can do this by returning false or manipulating this.files. But none of this seems to work.
I'm testing this against the latest version of Chrome (66.0.3359.139), but I'd like a solution that works with any modern browser.

The file object that exists on the element has a size property, which you can use to compare and validate on the client. I wrote an example in plain JavaScript. I know you want it in jQuery, but that was kind of already answered here.
Anyway, this is what I came up with ...
var inputElement = document.getElementById("file");
inputElement.addEventListener('change', function () {
    var fileLimit = 100; // size cap in KB; could be whatever you want
    var files = inputElement.files; // a FileList (array-like), not a true array
    var fileSize = files[0].size; // File.size defaults to bytes
    var fileSizeInKB = fileSize / 1024; // convert bytes to kilobytes
    if (fileSizeInKB < fileLimit) {
        console.log("file go for launch");
        // add file to server here
    } else {
        console.log("file too big");
        // do not pass go, do not add to server. Pass error to user
        document.getElementById("error").innerHTML = "your file is over 100 KB";
    }
});
(CodePen https://codepen.io/HappinessFactory/pen/yjggbq)
Hope that answers your question. Good luck!

Thanks! I'm sure your answer would work if I weren't using jquery.fileupload(), but jquery.fileupload() starts the upload automatically. So there's no "add file to server" logic to perform/skip.
But your answer sent me off in the right direction. For anyone else stuck on this: The trick is to use the "start" or "submit" properties of the "options" object passed into jquery.fileupload(). Both of these are functions, and if either one returns false, the upload is cancelled.
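For example, something along these lines (a sketch; the selector, URL, and caps are placeholders, not my actual code):
$('#fileupload').fileupload({
    url: '/upload',
    submit: function (e, data) {
        var maxFileSize = 100 * 1024; // per-file cap, in bytes
        var totalSize = data.files.reduce(function (sum, f) { return sum + f.size; }, 0);
        var tooBig = data.files.some(function (f) { return f.size > maxFileSize; });
        if (tooBig || totalSize > 5 * maxFileSize) { // also cap the whole request
            alert('Selected files exceed the size limits.');
            return false; // returning false from submit cancels the upload
        }
    }
});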

What is the Correct Way to Download Blob in Chrome? [duplicate]

Update
Since asking the question below and arriving at a more fundamental question after finding the error in the code, I found some more information. For example, the MDN web docs for the downloads API method downloads.download() state that an object URL should be revoked only after the file/url has been downloaded. So I spent some time trying to understand whether a web extension makes the downloads API onChanged event available to the JavaScript of a web page, and I don't think it does. I don't understand why the downloads API is available to extensions only, especially when there are quite a few questions concerning this same memory-usage/object-url-revocation issue, for example Wait for user to finish downloading a blob in Javascript.
If you know, would you please explain? Thank you.
Starting with Firefox browser closed, and right clicking on a local html file to open in Firefox, it opens with five firefox.exe processes as viewed in Windows Task Manager. Four of the processes start with between 20,000k and 25,000k of memory and one with about 115,000k.
This html page has an indexedDB database with 50 object stores each containing 50 objects. Each object is extracted from its object store and converted to string using JSON.stringify, and written to a two-dimensional array. Afterward, all elements of the array are concatenated into one large string, converted to a blob and written to the hard disk through a URL object which is revoked immediately afterward. The final file is about 190MB.
If the code is stopped just before the conversion to blob, the memory usage of one of the firefox.exe processes increases to around 425,000k and then falls back to 25,000k within about 5-10 seconds after the elements of the array have been concatenated into a single string.
If the code is run to completion, the memory usage of that same firefox.exe process grows to about 1,000,000k and then drops to about 225,000k. The firefox.exe process that started at 115,000k also increases at the blob stage of the code to about 325,000k and never decreases.
After the blob is written to disk as a text file, these two firefox.exe processes never release the approximate 2 x 200,000k increase in memory.
I have set every variable used in each function to null and the memory is never freed unless the page is refreshed. Also, this process is initiated by a button click event; and if it is run again without an intermediate refresh, each of these two firefox.exe processes grab an additional 200,000k of memory with each run.
I haven't been able to figure out how to free this memory.
The two functions are quite simple. json[i][j] holds the string version of the jth object from the ith object store in the database. os_data[] is an array of small objects { "name" : objectStoreName, "count" : n }, where n is the number of objects in the store. The build_text function appears to release the memory if write_to_disk is not invoked. So the issue appears to be related to the blob or the url.
I'm probably overlooking something obvious. Thank you for any direction you can provide.
EDIT:
I see from JavaScript: Create and save file that I have a mistake in the revokeObjectURL(blob) statment. It can't revoke blob, the createObjectURL(blob) needed to be saved to a variable like url and then revoke url, not blob.
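That is, in write_to_disk, something like:
var url = window.URL.createObjectURL(blob);
elem.href = url;
// ... click the element, remove it ...
window.URL.revokeObjectURL(url); // revoke the saved url string, not the blob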
That worked for the most part and the memory is released from both of the firefox.exe processes mentioned above, in most cases. This leaves me with one small question about the timing of the revoke of the url.
If the revoke is what allows the memory to be released, should the url be revoked only after the file has been successfully downloaded? If the revoke takes place before the user clicks OK to download the file, what happens? Suppose I click the button to prepare the file from the database, and after it's ready the browser brings up the download window, but I wait a little while deciding what to name the file or where to save it. Won't the revoke statement have run already, while the url is still 'held' by the browser, since it is what will be downloaded? I know I can still download the file, but does the revoke still release the memory? From my small amount of experimenting with this one example, it appears that it does not get released in this scenario.
If there were an event that fires when the file has either successfully or unsuccessfully been downloaded to the client, wouldn't that be the time to revoke the url? Would it be better to set a timeout of a few minutes before revoking the url, since I'm pretty sure there is no event indicating that the download to the client has ended?
I'm probably not understanding something basic about this. Thanks.
function build_text() {
    var i, j, l, txt = "";
    for ( i = 1; i <= 50; i++ ) {
        l = os_data[i-1].count;
        for ( j = 1; j <= l; j++ ) {
            txt += json[i][j] + '\n';
        } // next j
    } // next i
    write_to_disk('indexedDB portfolio', txt);
    txt = json = null;
} // close build_text

function write_to_disk( fileName, data ) {
    fileName = fileName.replace(".","");
    var blob = new Blob( [data], { type: 'text/csv' } ), elem;
    if ( window.navigator.msSaveOrOpenBlob ) {
        window.navigator.msSaveBlob(blob, fileName);
    } else {
        elem = window.document.createElement('a');
        elem.href = window.URL.createObjectURL(blob);
        elem.download = fileName;
        document.body.appendChild(elem);
        elem.click();
        document.body.removeChild(elem);
        window.URL.revokeObjectURL(blob); // the mistake noted in the EDIT: this should revoke the url string returned by createObjectURL, not the blob
    } // end if
    data = blob = elem = fileName = null;
} // close write_to_disk
I am a bit lost as to what the question is here...
But let's try to answer, at least part of it:
For a starter let's explain what URL.createObjectURL(blob) roughly does:
It creates a blob URI, which is a URI pointing to the Blob blob in memory, just as if it were in a reachable place (like a server).
This blob URI will mark blob as un-collectable by the Garbage Collector (GC) for as long as it has not been revoked, so that you don't have to maintain a live reference to blob in your script, but can still use/load it.
URL.revokeObjectURL will then break the link between the blob URI and the Blob in memory. It will not free up the memory occupied by blob directly; it will just remove its own protection regarding the GC, and the URI won't point anywhere anymore.
So if you have multiple blob URIs pointing to the same Blob object, revoking only one won't break the other blob URIs.
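For instance (a quick sketch; the fetch is only there to show the second URI still resolves):
const blob = new Blob(['bar']);
const uriA = URL.createObjectURL(blob);
const uriB = URL.createObjectURL(blob); // a second, independent blob URI for the same Blob
URL.revokeObjectURL(uriA); // loading uriA now fails...
fetch(uriB).then(r => r.text()).then(console.log); // ...but uriB still resolves to "bar"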
Now, the memory will be freed only when the GC kicks in, and that is decided solely by the browser internals, when it thinks it is the best time, or when it sees it has no other option (generally when it runs low on memory).
So it is quite normal that you don't see your memory being freed up instantly, and from experience I would say that FF doesn't mind using a lot of memory when it is available, so the GC kicks in less often, which is good for user experience (GCing often results in lags).
For your download question: indeed, web APIs don't provide a way to know whether a download has succeeded or failed, nor even whether it has simply ended.
For the revoking part, it really depends on when you do it.
If you do it directly in the click handler, then the browser won't have made the pre-fetch request yet, so when the default action of the click (the download) happens, there will no longer be anything linked by the URI.
Now, if you revoke the blob URI after the "save" prompt, the browser will have made a pre-fetch request, and thus might be able to mark by itself that the Blob resource should not be cleared. But I don't think this behavior is required by any specs, and it might be better to wait at least for the window's focus event, at which point the downloading of the resource should already have started.
const blob = new Blob(['bar']);
const uri = URL.createObjectURL(blob);
anchor.href = uri;
anchor.onclick = e => {
    window.addEventListener('focus', e => {
        URL.revokeObjectURL(uri);
        console.log("Blob URI revoked, you won't be able to download it anymore");
    }, {once: true});
};
<a id="anchor" download="foo.txt">download</a>

check uploaded file format on client side

I am creating a web portal where the end user uploads a CSV file, and I do some manipulation on that file on the server side (Python). There is some latency and lag on the server side, so I don't want to send a message from server to client about the bad format of an uploaded file. Is there any way to do the heavy lifting on the client side, maybe using JS or jQuery, to check that the uploaded file is comma-separated and so on?
I know we can set accept=".csv" in the HTML so that the file extension is .csv, but how do I check the contents to be sure?
Accessing local files from JavaScript is only possible using the File API (https://developer.mozilla.org/en-US/docs/Using_files_from_web_applications) - by using it you can read the content and check whether it matches your expectations.
Here are some bits of code I used to display a preview image client-side when a file is selected. You should be able to use this as a starting point to do something else with the file data; determining whether it's CSV is up to you (see the sketch after the code).
Obvious caveat:
You still have to check server side. Anyone can modify your clientside javascript to pretend a bad file is good.
Another caveat:
I'm pretty sure that you can have escaped comma characters in a valid csv file. I think the escape character might be different across some implementations too...
// Fired when the user chooses a file in the OS dialog box
// They will have clicked <input id="fileId" type="file">
document.getElementById('fileId').onchange = function (evt) {
    if (!evt.target.files || evt.target.files.length === 0) {
        console.log('No files selected');
        return;
    }
    var uploadTitle = evt.target.files[0].name;
    var uploadSize = evt.target.files[0].size;
    var uploadType = evt.target.files[0].type;
    // To manipulate the file you set a callback for the whole contents:
    var FR = new FileReader();
    FR.onload = function (evt2) {
        var uploadData = evt2.target.result;
        console.log(uploadTitle, uploadSize, uploadType, uploadData);
    };
    // readAsDataURL will encode the file like data:image/gif;base64,R0lGODl...
    // (for plain text there is a similar call, readAsText)
    FR.readAsDataURL(evt.target.files[0]);
};
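Building on that, here is a minimal sketch of a naive content check (untested; the comma-count heuristic and the 'fileId' id are just placeholders, and as noted above a valid CSV can contain escaped commas, so this is a sanity check, not a parser):
document.getElementById('fileId').onchange = function (evt) {
    var file = evt.target.files && evt.target.files[0];
    if (!file) return;
    var reader = new FileReader();
    reader.onload = function (e) {
        var lines = e.target.result.split(/\r?\n/).filter(function (l) { return l.length > 0; });
        // naive heuristic: every row should have the same number of commas as the header
        var cols = (lines[0] || '').split(',').length;
        var looksLikeCsv = cols > 1 && lines.every(function (l) {
            return l.split(',').length === cols;
        });
        console.log(looksLikeCsv ? 'Looks like CSV' : 'Does not look like CSV');
    };
    reader.readAsText(file); // read as plain text instead of a data URL
};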

Detecting if the user drops the same file twice on a browser window

I want to allow users to drag images from their desktop onto a browser window and then upload those images to a server. I want to upload each file only once, even if it is dropped on the window several times. For security reasons, the information from the File object that is accessible to JavaScript is limited. According to msdn.microsoft.com, only the following properties can be read:
name
lastModifiedDate
(Safari also exposes size and type).
The user can drop two images with the same name and last modified date from different folders onto the browser window. There is a very small but finite chance that these two images are in fact different.
I've created a script that reads in the raw dataURL of each image file, and compares it to files that were previously dropped on the window. One advantage of this is that it can detect identical files with different names.
This works, but it seems overkill. It also requires a huge amount of data to be stored. I could improve this (and add to the overkill) by making a hash of the dataURL, and storing that instead.
I'm hoping that there may be a more elegant way of achieving my goal. What can you suggest?
<!DOCTYPE html>
<html>
<head>
  <title>Detect duplicate drops</title>
  <style>
    html, body {
      width: 100%;
      height: 100%;
      margin: 0;
      background: #000;
    }
  </style>
  <script>
    var body
    var imageData = []

    document.addEventListener('DOMContentLoaded', function ready() {
      body = document.getElementsByTagName("body")[0]
      body.addEventListener("dragover", swallowEvent, false)
      body.addEventListener("drop", treatDrop, false)
    }, false)

    function swallowEvent(event) {
      // Prevent browser from loading the dropped image in an empty page
      event.preventDefault()
      event.stopPropagation()
    }

    function treatDrop(event) {
      swallowEvent(event)
      for (var ii = 0, file; file = event.dataTransfer.files[ii]; ii++) {
        importImage(file)
      }
    }

    function importImage(file) {
      var reader = new FileReader()
      reader.onload = function fileImported(event) {
        var dataURL = event.target.result
        var index = imageData.indexOf(dataURL)
        var img, message
        if (index < 0) {
          index = imageData.length
          console.log(dataURL)
          imageData.push(dataURL, file.name) // stores pairs: [dataURL, name, dataURL, name, ...]
          message = "Image " + file.name + " imported"
        } else {
          message = "Image " + file.name + " imported as " + imageData[index + 1]
        }
        img = document.createElement("img")
        img.src = imageData[index] // copy or reference?
        body.appendChild(img)
        console.log(message)
      }
      reader.readAsDataURL(file)
    }
  </script>
</head>
<body>
</body>
</html>
Here is a suggestion (that I haven't seen mentioned in your question):
Create a Blob URL for each file-object in the FileList object, to be stored in the browser's URL Store, saving their URL strings.
Then you pass that URL string to a web worker (separate thread) which uses the FileReader to read each file (accessed via the Blob URL string) in chunked sections, re-using one fixed-size buffer (almost like a circular buffer), to calculate the file's hash (there are simple/fast carryable hashes like crc32, which can often simply be combined with a vertical and horizontal checksum in the same loop (also carryable over chunks)).
You might speed up the process by reading in 32-bit (unsigned) values instead of 8-bit values, using an appropriate 'bufferview' (that's 4 times faster). System endianness is not important, don't waste resources on this!
Upon completion, the web worker then passes the file's hash back to the main thread/app, which then simply performs your matrix comparison of [[fname, fsize, blobUrl, fhash] /* , etc. */].
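For example, the worker side could look roughly like this sketch (untested; the rotate-and-mix loop is a stand-in for a real carryable hash like crc32, and the modern Blob.arrayBuffer() is used instead of FileReader for brevity):
// worker.js - receives a blob URL string, reads the Blob in fixed-size sections
self.onmessage = async (e) => {
    const blob = await (await fetch(e.data.blobUrl)).blob();
    const CHUNK = 1 << 20; // 1 MiB per section
    let hash = 0;
    for (let offset = 0; offset < blob.size; offset += CHUNK) {
        const buf = await blob.slice(offset, offset + CHUNK).arrayBuffer();
        // read 32-bit (unsigned) words where possible: 4x fewer iterations than bytes
        const words = new Uint32Array(buf, 0, (buf.byteLength / 4) | 0);
        for (let i = 0; i < words.length; i++) {
            const h = (hash ^ words[i]) >>> 0;
            hash = ((h << 5) | (h >>> 27)) >>> 0; // cheap rotate-and-mix, NOT a real crc32
        }
        // fold in any trailing bytes that don't fill a full 32-bit word
        const tail = new Uint8Array(buf, words.length * 4);
        for (let i = 0; i < tail.length; i++) {
            hash = Math.imul(hash ^ tail[i], 16777619) >>> 0; // FNV-style byte mix
        }
    }
    self.postMessage({ blobUrl: e.data.blobUrl, hash: hash });
};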
Pro
The re-used fixed buffer significantly brings down your memory usage (to any level you specify), and the web worker brings up performance by using an extra thread (which doesn't block your browser's main thread).
Con
You'd still need a server-side fallback for browsers with JavaScript disabled (you might add a hidden field to the form and set its value using JavaScript, as a means of a JavaScript-enabled check, to lower server-side load). However.. even then.. you'd still need server-side fallback to safeguard against malicious input.
Usefulness
So.. no net gain? Well.. if the chance is reasonable that the user might upload duplicate files (or just uses them in a web-based app), then you have saved on wasted bandwidth just to perform the check. That is quite an (ecological/financial) win in my book.
Extra
Hashes are prone to collision, period. To lower the (realistic) chance of collision you'd select a more advanced hash algorithm (most are easily carryable in chunked mode). The obvious trade-off for more advanced hashes is larger code size and lower speed (higher CPU usage).
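If you can afford whole-file hashing instead of chunked, the built-in Web Crypto API gives you SHA-256 with very little code (a sketch; note SubtleCrypto is async and digests one whole ArrayBuffer, so it doesn't fit the fixed-buffer scheme above):
// hash a File/Blob with SHA-256 and return a hex string for comparison/storage
async function hashFile(file) {
    const buffer = await file.arrayBuffer();
    const digest = await crypto.subtle.digest('SHA-256', buffer);
    return Array.from(new Uint8Array(digest))
        .map(b => b.toString(16).padStart(2, '0'))
        .join('');
}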

Ensuring HTML5 FileReader Performs When .target Content Has Changed Since Last Use

In a website I am using HTML5, rather than .php/forms and client-side AJAX, to retrieve files from my laptop's hard drive. The page, when it loads, has a textarea in which I display the files; but here I should also note that I have other functionality which permits me to save the contents of the textarea as a new file. That, I do via AJAX. Back to the chase: the "Browse" opens the dialogue, I select a file, and the HTML5 FileReader fetches the file; all is well.
I can perform that operation as many times as I wish, no problem. But if I then save a new file, the next time I go to use the FileReader, it fails. I have been on this for about 14 hours now, trying to discover why the code behaves that way.
Without putting up a wall of code: has anyone encountered a similar problem? Ah, and most annoying of all is that the Dragonfly inspector shows the textarea textContent as the proper data, from inside the event.target.result, which works so well on the first operations after opening the page.
EDIT:
Unlike most of the blog posts, question-and-answer examples, and even some spec documentation, .textContent did not work in the context of the script I am working on. But with one change as to what, specifically, the data was being written AS, all the intermittent, on-again-off-again FileReader behaviour was resolved, and extensive testing shows it performing 100%.
In order to permit use of the FileReader in between performing file saves out of the textarea, I just had to change textContent to innerText:
// Before (intermittent failures after saving a file):
reader.onloadend = function (evt) {
    if (evt.target.readyState == FileReader.DONE) { // note: DONE == 2
        console.log(">>>> 12 <<<< text is : " + evt.target.result);
        // the textarea being edited // saved from for new files...
        // ... and used to display files read by FileReader
        document.getElementById("TextAREAx1xMAIN").textContent = evt.target.result;
    }
};

// After (works reliably):
reader.onloadend = function (evt) {
    if (evt.target.readyState == FileReader.DONE) {
        console.log(">>>> 12 <<<< text is : " + evt.target.result);
        // the textarea being edited // saved from for new files...
        // ... and used to display files read by FileReader
        document.getElementById("TextAREAx1xMAIN").innerText = evt.target.result;
    }
};

Use FileAPI to download big generated data file

The JavaScript process generates a lot of data (200-300MB). I would like to save this data for further analysis, but the best I have found so far is this example http://jsfiddle.net/c2U2T/, which is not an option for me because it looks like it requires all the data to be available before the download starts. What I need is something like:
var saver = new Saver();
saver.save(); // The Save As ... dialog appears
saver.onaccepted = function () { // user accepted saving
    for (var i = 0; i < 1000000; i++) {
        saver.write(Math.random());
    }
};
Of course, instead of Math.random() there will be some meaningful construction.
#dader - I would build upon dader's example.
Use the HTML5 FileSystem API - but instead of writing each and every line to the file (more IO than it is worth), batch some of the lines in memory in a JavaScript object/array/string, and only write them to the file when they reach a certain threshold. You are thus appending to a local file as the process chugs along (which makes it easy to pause/restart/stop, etc.).
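A rough sketch of that batching (untested; fileEntry is assumed to come from requestFileSystem as in the quota example below, and names like FLUSH_THRESHOLD and errorCb are placeholders of mine):
var pending = []; // lines buffered in memory
var buffered = 0, writing = false;
var FLUSH_THRESHOLD = 4 * 1024 * 1024; // flush to disk roughly every 4 MB

function addLine(line) {
    pending.push(line);
    buffered += line.length;
    if (buffered >= FLUSH_THRESHOLD) flush();
}

function flush() {
    if (writing || pending.length === 0) return; // FileWriter handles one write at a time
    var chunk = pending.join('\n') + '\n';
    pending = []; buffered = 0; writing = true;
    fileEntry.createWriter(function (writer) {
        writer.seek(writer.length); // append at EOF
        writer.onwriteend = function () { writing = false; flush(); }; // drain any backlog
        writer.write(new Blob([chunk], { type: 'text/plain' }));
    }, errorCb);
}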
Of note is the following, which is an example of how you can spawn the dialog to request the amount of storage you would need (it sounds large). Tested in Chrome:
// args.size, persistentStorageGranted, persistentStorageDenied and errorCb are
// assumed to be defined elsewhere; in Chrome these APIs are vendor-prefixed
// (navigator.webkitPersistentStorage, window.webkitRequestFileSystem).
navigator.persistentStorage.queryUsageAndQuota(
    function (usage, quota) {
        var availableSpace = quota - usage;
        var requestingQuota = args.size + usage;
        if (availableSpace >= args.size) {
            window.requestFileSystem(PERSISTENT, availableSpace, persistentStorageGranted, persistentStorageDenied);
        } else {
            navigator.persistentStorage.requestQuota(
                requestingQuota, function (grantedQuota) {
                    window.requestFileSystem(PERSISTENT, grantedQuota - usage, persistentStorageGranted, persistentStorageDenied);
                }, errorCb
            );
        }
    }, errorCb);
When you are done, you can use JavaScript to open a new window with the URL of the file entry you saved, which you can retrieve via fileEntry.toURL().
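For instance (a two-line sketch):
var url = fileEntry.toURL(); // a filesystem: URL pointing at the sandboxed file
window.open(url); // or set it as the href of a download link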
OR - when it is done crunching, you can just display that URL in an HTML link, and the user can right-click on it and do whatever Save Link As they want.
But this is something that is new and cool that you can do entirely in the browser without needing to involve a server in any way at all. Side note, 200-300MB of data generated by a Javascript Process sounds absolutely huge... that would be a concern for whether you are storing the "right" data...
What you are actually trying to do is a kind of streaming; the FileAPI is not suited for the task. Instead, I can suggest two options:
The first, using the XHR facility, i.e. ajax: split your data into several chunks which are sent to the server sequentially, each chunk in its own request, along with an id (for identifying the stream) and a position index (for identifying the chunk position). I won't recommend that, since it adds work to break up and reassemble the data, and since there's a better solution.
The second way of achieving this is to use the WebSocket API. It allows you to send data to the server sequentially, as it is generated, following a usual stream API. I think this is exactly what you need.
This page may be a good place to start at : http://binaryjs.com/
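As a rough sketch of the WebSocket approach (the endpoint URL and the end-of-stream marker are made up; a real server would have to understand both):
// stream generated values to the server as they are produced
var ws = new WebSocket('wss://example.com/collect'); // hypothetical endpoint
ws.onopen = function () {
    for (var i = 0; i < 1000000; i++) {
        ws.send(String(Math.random())); // each value is sent as soon as it is generated
    }
    ws.send('__EOF__'); // our own end-of-stream marker, not part of the protocol
    ws.close();
};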
That's all folks !
EDIT considering your comment:
I'm not sure I perfectly get your point, but what about HTML5's FileSystem API?
There are a couple of examples here: http://www.html5rocks.com/en/tutorials/file/filesystem/ among which this sample, which allows you to append data to an existing file. You can also create a new file, etc.:
function onInitFs(fs) {
    fs.root.getFile('log.txt', {create: false}, function (fileEntry) {
        // Create a FileWriter object for our FileEntry (log.txt).
        fileEntry.createWriter(function (fileWriter) {
            fileWriter.seek(fileWriter.length); // Start write position at EOF.
            // Create a new Blob and write it to log.txt.
            var blob = new Blob(['Hello World'], {type: 'text/plain'});
            fileWriter.write(blob);
        }, errorHandler);
    }, errorHandler);
}
EDIT 2:
What you're trying to do is not possible using JavaScript, as said on SO here. The author nonetheless suggests using a Java applet to achieve the needed behaviour.
To put it in a nutshell, the HTML5 FileSystem API only provides a sandboxed filesystem, i.e. located in some hidden directory of the browser. So if you want to access the true filesystem, using Java would be just fine considering your use case. I guess there is an interface between Java and JavaScript here.
But if you want to make your data available only from the browser (constrained by the same-origin policy), use the FileSystem API.
