Point FileReader.readAsArrayBuffer to WebAssembly's linear memory


I'm working with WebAssembly right now, and wanted to be able to pass a file's bytes from JavaScript to Rust code. Currently, I've exposed the following Rust struct to JavaScript:
use js_sys::{Function, Uint8Array};
use wasm_bindgen::prelude::{wasm_bindgen, JsValue};
#[wasm_bindgen]
pub struct WasmMemBuffer {
buffer: Vec<u8>,
}
#[wasm_bindgen]
impl WasmMemBuffer {
#[wasm_bindgen(constructor)]
pub fn new(byte_length: u32, f: &Function) -> Self {
let mut buffer = vec![0; byte_length as usize];
unsafe {
let array = Uint8Array::view(&mut buffer);
f.call1(&JsValue::NULL, &JsValue::from(array))
.expect("The callback function should not throw");
}
Self { buffer }
}
}
This struct, WasmMemBuffer, will point to a place in WebAssembly memory, meaning that I can write to it from JavaScript without any expensive copy operations.
import { WasmMemBuffer } from 'rust-crate';
const buf = new WasmMemBuffer(1000, (arr: Uint8Array) => {
for (let i = 0; i < arr.length; i++) {
arr[i] = 1.0; // Initialize the buffer with ones
}
});
This works great. However, I'd like to read a user-submitted file's contents directly into WebAssembly's linear memory, without creating an ArrayBuffer, converting it to a Uint8Array, and copying its values to the WasmMemBuffer. As I understand it, FileReader allocates a new ArrayBuffer when calling readAsArrayBuffer. I'd like to point it to the WasmMemBuffer instead. Is there a way I can do this? For reference, this is the code I'm using to read the file.
// e is the onchange event from a <input type="file"> element
function readFile(e: Event) {
const target = e?.target as HTMLInputElement;
const file = target?.files?.[0];
if (!target || !file) {
return;
}
const reader = new FileReader();
reader.onload = (e) => {
const res = e.target?.result as ArrayBuffer;
if (res) {
// Do something with res
}
};
reader.readAsArrayBuffer(file);
}
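For comparison, the copy step that the question hopes to avoid can be sketched as follows. This is only an illustration: the standalone `memory` object, the `copyIntoWasmMemory` helper and the offset `16` are hypothetical stand-ins, not part of the crate above.

```javascript
// Hypothetical stand-in for the module's linear memory (1 page = 64 KiB).
const memory = new WebAssembly.Memory({ initial: 1 });

// Copies the bytes of `source` (an ArrayBuffer, e.g. a FileReader result)
// into linear memory starting at byte `offset`, and returns the byte count.
function copyIntoWasmMemory(source, offset) {
  const bytes = new Uint8Array(source);
  // Create the view right before use: growing the memory detaches older views.
  const target = new Uint8Array(memory.buffer, offset, bytes.length);
  target.set(bytes); // this is the copy the question would like to skip
  return bytes.length;
}

// Pretend this ArrayBuffer came from reader.result.
const fileBytes = new Uint8Array([1, 2, 3, 4]).buffer;
const written = copyIntoWasmMemory(fileBytes, 16);
const copied = Array.from(new Uint8Array(memory.buffer, 16, written));
```

After the call, `copied` holds the same four bytes that were in `fileBytes`, now read back out of linear memory.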

Related

How to pass an array of object from javascript to webAssembly

I know it is possible to pass arrays of integers to WebAssembly using something like this:
const bin = ...; // WebAssembly binary, I assume below that it imports a memory from module "imports", field "memory".
const module = new WebAssembly.Module(bin);
const memory = new WebAssembly.Memory({ initial: 2 }); // Size is in pages.
const instance = new WebAssembly.Instance(module, { imports: { memory: memory } });
const arrayBuffer = memory.buffer;
const buffer = new Uint8Array(arrayBuffer);
I have read a lot of documentation and some questions that look like what I was looking for:
How to pass an array of objects to WebAssembly and convert it to a vector of structs with wasm-bindgen?
Pass a JavaScript array as argument to a WebAssembly function
https://becominghuman.ai/passing-and-returning-webassembly-array-parameters-a0f572c65d97
https://rob-blackbourn.github.io/blog/webassembly/wasm/array/arrays/javascript/c/2020/06/07/wasm-arrays.html
And yet none of those answered my question.
Here is a small AssemblyScript example that describes the kind of function I would like to use:
class Dummy {
constructor() {}
}
export function getDummy(): Dummy {
return new Dummy();
}
export function workWithDummy(dummies: Dummy[] = []) {
// do something
}
And in the JavaScript code:
const fs = require('fs');
const {resolve} = require('../utils/utils.js');
const env = {
abort: (message, filename, line, column) => {
throw new Error(`${message} in ${filename} at ${line}:${column}`);
}
};
module.exports = fs.promises.readFile(resolve('parser.wasm')).then(buffer => {
return WebAssembly.instantiate(buffer, {env: env}).then(wasmModule => {
const module = wasmModule.instance.exports;
module.workWithDummy([module.getDummy()]); //won't work
});
});
I am running this code in Node.js 18.1.0.
To sum up, my question is: how do I make this line work?
module.workWithDummy([module.getDummy()]); //won't work

file input files not read onChange on mobile

I'm building a puzzle app in React that allows the user to upload their own puzzles. This works fine on the web (the user clicks the input's label and it opens a dialog. When the user picks a file the onChange event is triggered), but on mobile, or at least on Chrome on Android, the files are not read...
This is where the input is declared:
<div className="file-input-wrapper">
<label for="puzzleUpload" className="button-dark">Upload Puzzle(s)</label>
<input type="file"
accept="application/json"
multiple
id="puzzleUpload"
onChange={handleFiles}/>
</div>
And this is the handleFiles() method:
// when a file is uploaded, this checks to see that it's the right type, then adds it to the puzzle list
const handleFiles = () => {
var selectedFiles = document.getElementById('puzzleUpload').files;
// checks if the JSON is a valid puzzle
const validPuzzle = (puzzle) => {
let keys = ["name", "entitySetID", "logic", "size"];
return keys.every((key) => {return puzzle.hasOwnProperty(key)});
};
const onLoad = (event) => {
let puzzle = JSON.parse(event.target.result);
if(validPuzzle(puzzle)) {
appendPuzzleList(puzzle);
}
else {
console.log("JSON file does not contain a properly formatted Logike puzzle")
}
};
//checks the file type before attempting to read it
for (let i = 0; i < selectedFiles.length; i++) {
if(selectedFiles[i].type === 'application/json') {
//creates new readers so that it can read many files sequentially.
var reader = new FileReader();
reader.onload = onLoad;
reader.readAsText(selectedFiles[i]);
}
}
};
A working prototype with the most recent code can be found at http://logike.confusedretriever.com and it's possible to quickly write compatible JSON using the builder in the app.
I've been looking up solutions for the past hour and a half and have come up empty handed, so any help would be greatly appreciated! I read the FileReader docs, and everything seems to be supported, so I'm kind of stumped.
Interestingly, the file IS selected (you can see the filename in the ugly default version of the input once it's selected, but I hide it via CSS), so I'm tempted to implement a mobile-only button to trigger the event, if there isn't a more legit solution...

Chrome uses the OS's list of known MIME Types.
I guess Android doesn't know about "application/json", or at least doesn't map the .json extension to this MIME type. This means that when you upload your File in this browser, the type property isn't set correctly; it is set to the empty string ("") instead.
But anyway, you shouldn't trust this type property, ever.
You could always rule out some generic types, like image/* and video/*, but the only reliable way to know whether a file is valid JSON is to actually read the data it contains.
But I understand you don't want to start this operation if your user provides a huge file, like a video.
One simple solution might be to check the size property instead, if you know the size range of your generated files.
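As a rough sketch of these two checks, the size guard and the "just try to parse it" test could look like this. The 64 KiB limit and both function names are assumptions made for illustration; pick a limit that matches the files your app really produces.

```javascript
// Hypothetical upper bound for a generated puzzle file.
const MAX_PUZZLE_BYTES = 64 * 1024;

// Cheap pre-check that only looks at the size reported by the File/Blob.
function hasPlausibleSize(file, maxBytes = MAX_PUZZLE_BYTES) {
  return file.size > 0 && file.size <= maxBytes;
}

// Reliable check: actually parse the text.
// Returns the parsed value, or null when the text is not valid JSON
// (a document that is the literal `null` is treated the same way).
function parseJsonOrNull(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return null;
  }
}
```

The idea would be to call `hasPlausibleSize(file)` before creating a FileReader, and `parseJsonOrNull(reader.result)` inside the load handler.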
A less simple, but not much harder, solution would be to prepend a magic number (a.k.a. file signature) to your generated files (if your app is the only way to handle these files).
Then you would only have to check this magic number before going on to read the whole file:
// some magic-number (here "•MJS")
const MAGIC_NB = new Uint8Array([226, 128, 162, 77, 74, 83]);
// creates a json-like File, with our magic_nb prepended
function generateFile(data) {
const str = JSON.stringify(data);
const blob = new Blob([MAGIC_NB, str], {
type: 'application/myjson' // won't be used anyway
});
return new File([blob], 'my_file.json');
}
// checks whether the provided blob starts with our magic numbers or not
function checkFile(blob) {
return new Promise((res, rej) => {
const reader = new FileReader();
reader.onload = e => {
const arr = new Uint8Array(reader.result);
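// a blob shorter than the magic number (e.g. an empty file) can never match
res(arr.length === MAGIC_NB.length && arr.every((v, i) => MAGIC_NB[i] === v));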
};
reader.onerror = rej;
// read only the length of our magic nb
reader.readAsArrayBuffer(blob.slice(0, MAGIC_NB.length));
});
}
function handleFile(file) {
return checkFile(file).then(isValid => {
if (isValid) {
return readFile(file);
} else {
throw new Error('invalid file');
}
});
}
function readFile(file) {
return new Promise((res, rej) => {
const reader = new FileReader();
reader.onload = e => res(JSON.parse(reader.result));
reader.onerror = rej;
// don't read the magic_nb part again
reader.readAsText(file.slice(MAGIC_NB.length));
});
}
const my_file = generateFile({
key: 'value'
});
handleFile(my_file)
.then(obj => console.log(obj))
.catch(console.error);
Similarly, note that not all browsers accept every scheme for the accept attribute, so you might want to double your MIME notation with a simple extension one (MIME types are checked only against the file extension anyway).
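In the markup, that doubling would be something like accept="application/json,.json". On the script side, a minimal sketch of the same idea is to fall back to the file name when the reported type is empty. The function name is hypothetical, and this is only a first filter, not a replacement for reading the data.

```javascript
// Accepts a file when the browser reports the JSON MIME type,
// or when its name ends in ".json" (covers an empty `type`).
function looksLikeJsonFile(file) {
  const hasJsonType = file.type === 'application/json';
  const hasJsonExtension = file.name.toLowerCase().endsWith('.json');
  return hasJsonType || hasJsonExtension;
}
```

In the question's loop, this check could stand in for the strict comparison on selectedFiles[i].type.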

Correct MD5 hash in javascript for all filetypes

Using the library here: https://github.com/blueimp/JavaScript-MD5
I am attempting to correctly hash files in MD5 using javascript.
So far I get correct hashes for text files but if I attempt to hash an image file I get an incorrect hash.
This could be due to how the JavaScript FileReader reads the larger image files. I have tried readAsBinaryString(), readAsArrayBuffer() and readAsText(), none of which provide the correct hash with the given library.
How should I be reading the file so that I get a correct hash for all filetypes? Is there a more appropriate library that works for all filetypes?
HTML:
<input id="file-to-hash" type=file>
<button onclick="hashFile()">Hash</button>
Javascript:
function hashFile() {
var file = document.getElementById('file-to-hash').files[0];
var reader = new FileReader();
reader.readAsArrayBuffer(file);
reader.onload = readSuccess;
}
function readSuccess(evt){
fileContents = evt.target.result;
var hash = md5(fileContents);
}
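One plausible contributor for binary files, at least on the readAsText() path, is that decoding arbitrary bytes as text is lossy. The sketch below does not use the MD5 library at all; it only shows that a byte sequence from an image does not survive a round trip through UTF-8 text, which is what readAsText() uses by default.

```javascript
// First four bytes of a PNG file; 0x89 is not valid on its own in UTF-8.
const original = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);

// Decoding as UTF-8 text replaces the invalid byte with U+FFFD.
const asText = new TextDecoder('utf-8').decode(original);

// Encoding the text again does not give the original bytes back.
const roundTripped = new TextEncoder().encode(asText);

console.log(original.length, roundTripped.length); // 4 6
```

Any hash computed from such text is a hash of different data, so it cannot match the hash of the file on disk. Plain ASCII text files survive the round trip, which would be consistent with text files hashing correctly.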

There is now the SubtleCrypto API and its subtle.digest method.
You won't be able to get an MD5 hash from this API, because MD5 is not considered secure anymore.
But you'll be able to get a hash with other (more secure) algorithms, such as SHA-256.
function getHash(buffer, algo = "SHA-256") {
return crypto.subtle.digest(algo, buffer)
.then(hash => {
// here hash is an arrayBuffer, so we'll convert it to its hex version
let result = '';
const view = new DataView(hash);
for (let i = 0; i < hash.byteLength; i += 4) {
result += ('00000000' + view.getUint32(i).toString(16)).slice(-8);
}
return result;
});
}
f.onchange = e => {
const fR = new FileReader();
fR.onload = e => getHash(fR.result)
.then(hash => console.log(hash))
// Chrome only accepts it from a secure origin
.catch(e => {
if (e.code === 9) {
console.log(`Be sure to be on the https page :
https://stackoverflow.com/questions/44036218/`)
} else {
console.log(e.message)
}
})
fR.readAsArrayBuffer(f.files[0]);
}
<input type="file" id="f">

Calculate SHA-1 checksum of local html5 video file using JavaScript

When a video on my local storage—let's say it's currently located at file:///home/user/video.m4v—is opened by dragging it into a new tab in Chrome, how can I calculate the SHA-1 checksum for the file using JavaScript?
Purpose:
I am planning to write a Chrome extension which will store the calculated checksum of videos (files with extensions matching a pattern) as localStorage objects in order to save the playback position of video upon tab close and then restore it when the file is loaded again, even if the location or filename of the video is changed.

You need a crypto library for this. A well-known one is Google CryptoJS.
I found this as a specific example for your task: https://gist.github.com/npcode/11282867
After including the crypto-js source:
function sha1sum() {
var oFile = document.getElementById('uploadFile').files[0];
var sha1 = CryptoJS.algo.SHA1.create();
var read = 0;
var unit = 1024 * 1024;
var blob;
var reader = new FileReader();
reader.readAsArrayBuffer(oFile.slice(read, read + unit));
reader.onload = function(e) {
var bytes = CryptoJS.lib.WordArray.create(e.target.result);
sha1.update(bytes);
read += unit;
if (read < oFile.size) {
blob = oFile.slice(read, read + unit);
reader.readAsArrayBuffer(blob);
} else {
var hash = sha1.finalize();
console.log(hash.toString(CryptoJS.enc.Hex)); // print the result
}
}
}
I wouldn't recommend calculating a hash over the whole video file, as it can be pretty resource-consuming depending on the file size. Maybe you can use just the meta information, or reconsider using the filename and file path?
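As a sketch of the metadata idea, a storage key could be built from fields that usually survive a rename or a move. This assumes the extension can get hold of a File object (or anything with `size` and `lastModified`); the function name and key format are made up for illustration, and two different videos with the same size and timestamp would collide.

```javascript
// Builds a localStorage key from metadata instead of file contents.
// `file` needs numeric `size` and `lastModified` fields (a File has both).
function metadataKey(file) {
  return `video:${file.size}:${file.lastModified}`;
}

// Example with a plain object standing in for a File.
const key = metadataKey({ size: 734003200, lastModified: 1400000000000 });
```

Here `key` is "video:734003200:1400000000000", and the playback position could be stored under it.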

Web APIs have progressed considerably since I asked this question. Calculating a hex digest is now possible using the built-in SubtleCrypto.digest().
function u8ToHex (u8: number): string {
return u8.toString(16).padStart(2, '0');
}
/** Ref: https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/digest#supported_algorithms */
const supportedAlgorithms = [
'SHA-1',
'SHA-256',
'SHA-384',
'SHA-512',
] as const;
type SupportedAlgorithm = typeof supportedAlgorithms[number];
type Message = string | Blob | BufferSource;
async function hexDigest (
algorithm: SupportedAlgorithm,
message: Message,
): Promise<string> {
let buf: BufferSource;
if (typeof message === 'string') buf = new TextEncoder().encode(message);
else if (message instanceof Blob) buf = await message.arrayBuffer();
else buf = message;
const hash = await crypto.subtle.digest(algorithm, buf);
return [...new Uint8Array(hash)].map(u8ToHex).join('');
}

How to calculate md5 hash of a file using javascript

Is there a way to calculate the MD5 hash of a file before the upload to the server using Javascript?

While there are JS implementations of the MD5 algorithm, older browsers are generally unable to read files from the local filesystem.
I wrote that in 2009. So what about new browsers?
With a browser that supports the FileAPI, you can read the contents of a file - the user has to have selected it, either with an <input> element or drag-and-drop. As of Jan 2013, here's how the major browsers stack up:
FF 3.6 supports FileReader, FF4 supports even more file based functionality
Chrome has supported the FileAPI since version 7.0.517.41
Internet Explorer 10 has partial FileAPI support
Opera 11.10 has partial support for FileAPI
Safari - I couldn't find a good official source for this, but this site suggests partial support from 5.1, full support for 6.0. Another article reports some inconsistencies with the older Safari versions
How?
See the answer below by Benny Neugebauer which uses the MD5 function of CryptoJS

I've made a library that implements incremental md5 in order to hash large files efficiently.
Basically you read a file in chunks (to keep memory low) and hash it incrementally.
You'll find basic usage and examples in the readme.
Be aware that you need HTML5 FileAPI, so be sure to check for it.
There is a full example in the test folder.
https://github.com/satazor/SparkMD5
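The chunking part of that approach is independent of the hashing library. A minimal sketch of how the byte ranges could be computed, with a hypothetical `chunkRanges` helper that is not part of SparkMD5:

```javascript
// Yields [start, end) byte offsets that together cover `size` bytes.
function* chunkRanges(size, chunkSize) {
  if (!(chunkSize > 0)) throw new RangeError('chunkSize must be positive');
  for (let start = 0; start < size; start += chunkSize) {
    yield [start, Math.min(start + chunkSize, size)];
  }
}

// A 10-byte file read 4 bytes at a time needs three reads:
// [0, 4], [4, 8] and [8, 10].
const ranges = [...chunkRanges(10, 4)];
```

Each pair can be passed to file.slice(start, end), and the resulting chunk is read and appended to the incremental hasher before moving on to the next one.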

It is pretty easy to calculate the MD5 hash using the MD5 function of CryptoJS and the HTML5 FileReader API. The following code snippet shows how you can read the binary data and calculate the MD5 hash from an image that has been dragged into your browser:
var holder = document.getElementById('holder');
holder.ondragover = function() {
return false;
};
holder.ondragend = function() {
return false;
};
holder.ondrop = function(event) {
event.preventDefault();
var file = event.dataTransfer.files[0];
var reader = new FileReader();
reader.onload = function(event) {
var binary = event.target.result;
var md5 = CryptoJS.MD5(binary).toString();
console.log(md5);
};
reader.readAsBinaryString(file);
};
I recommend adding some CSS to make the Drag & Drop area visible:
#holder {
border: 10px dashed #ccc;
width: 300px;
height: 300px;
}
#holder.hover {
border: 10px dashed #333;
}
More about the Drag & Drop functionality can be found here: File API & FileReader
I tested the sample in Google Chrome Version 32.

The following snippet shows an example that can achieve a throughput of 400 MB/s while reading and hashing the file.
It is using a library called hash-wasm, which is based on WebAssembly and calculates the hash faster than js-only libraries. As of 2020, all modern browsers support WebAssembly.
const chunkSize = 64 * 1024 * 1024;
const fileReader = new FileReader();
let hasher = null;
function hashChunk(chunk) {
return new Promise((resolve, reject) => {
fileReader.onload = async(e) => {
const view = new Uint8Array(e.target.result);
hasher.update(view);
resolve();
};
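// reject instead of leaving the promise pending when the read fails
fileReader.onerror = () => reject(fileReader.error);
fileReader.readAsArrayBuffer(chunk);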
});
}
const readFile = async(file) => {
if (hasher) {
hasher.init();
} else {
hasher = await hashwasm.createMD5();
}
const chunkNumber = Math.floor(file.size / chunkSize);
for (let i = 0; i <= chunkNumber; i++) {
const chunk = file.slice(
chunkSize * i,
Math.min(chunkSize * (i + 1), file.size)
);
await hashChunk(chunk);
}
const hash = hasher.digest();
return Promise.resolve(hash);
};
const fileSelector = document.getElementById("file-input");
const resultElement = document.getElementById("result");
fileSelector.addEventListener("change", async(event) => {
const file = event.target.files[0];
resultElement.innerHTML = "Loading...";
const start = Date.now();
const hash = await readFile(file);
const end = Date.now();
const duration = end - start;
const fileSizeMB = file.size / 1024 / 1024;
const throughput = fileSizeMB / (duration / 1000);
resultElement.innerHTML = `
Hash: ${hash}<br>
Duration: ${duration} ms<br>
Throughput: ${throughput.toFixed(2)} MB/s
`;
});
<script src="https://cdn.jsdelivr.net/npm/hash-wasm"></script>
<!-- defines the global `hashwasm` variable -->
<input type="file" id="file-input">
<div id="result"></div>

HTML5 + spark-md5 and Q
Assuming you're using a modern browser (one that supports the HTML5 File API), here's how you calculate the MD5 hash of a large file (it calculates the hash in chunks of a configurable size):
function calculateMD5Hash(file, bufferSize) {
var def = Q.defer();
var fileReader = new FileReader();
var fileSlicer = File.prototype.slice || File.prototype.mozSlice || File.prototype.webkitSlice;
var hashAlgorithm = new SparkMD5();
var totalParts = Math.ceil(file.size / bufferSize);
var currentPart = 0;
var startTime = new Date().getTime();
fileReader.onload = function(e) {
currentPart += 1;
def.notify({
currentPart: currentPart,
totalParts: totalParts
});
var buffer = e.target.result;
hashAlgorithm.appendBinary(buffer);
if (currentPart < totalParts) {
processNextPart();
return;
}
def.resolve({
hashResult: hashAlgorithm.end(),
duration: new Date().getTime() - startTime
});
};
fileReader.onerror = function(e) {
def.reject(e);
};
function processNextPart() {
var start = currentPart * bufferSize;
var end = Math.min(start + bufferSize, file.size);
fileReader.readAsBinaryString(fileSlicer.call(file, start, end));
}
processNextPart();
return def.promise;
}
function calculate() {
var input = document.getElementById('file');
if (!input.files.length) {
return;
}
var file = input.files[0];
var bufferSize = Math.pow(1024, 2) * 10; // 10MB
calculateMD5Hash(file, bufferSize).then(
function(result) {
// Success
console.log(result);
},
function(err) {
// There was an error,
},
function(progress) {
// We get notified of the progress as it is executed
console.log(progress.currentPart, 'of', progress.totalParts, 'Total bytes:', progress.currentPart * bufferSize, 'of', progress.totalParts * bufferSize);
});
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/q.js/1.4.1/q.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/spark-md5/2.0.2/spark-md5.min.js"></script>
<div>
<input type="file" id="file"/>
<input type="button" onclick="calculate();" value="Calculate" class="btn primary" />
</div>

You need to use FileAPI. It is available in the latest FF & Chrome, but not IE9.
Grab any md5 JS implementation suggested above. I've tried this and abandoned it because JS was too slow (minutes on large image files). Might revisit it if someone rewrites MD5 using typed arrays.
Code would look something like this:
HTML:
<input type="file" id="file-dialog" multiple="true" accept="image/*">
JS (with jQuery):
$("#file-dialog").change(function() {
handleFiles(this.files);
});
function handleFiles(files) {
for (var i=0; i<files.length; i++) {
var reader = new FileReader();
reader.onload = function(e) {
// use the event's target: `var reader` is shared by every iteration of the loop
var result = e.target.result;
var md5 = binl_md5(result, result.length);
console.log("MD5 is " + md5);
};
reader.onerror = function() {
console.error("Could not read the file");
};
reader.readAsBinaryString(files.item(i));
}
}

Apart from the impossibility to get file system access in JS, I would not put any trust at all in a client-generated checksum. So generating the checksum on the server is mandatory in any case. – Tomalak, Apr 20 '09 at 14:05
Which is useless in most cases. You want the MD5 computed on the client side so that you can compare it with the value recomputed on the server side and conclude that the upload went wrong if they differ. I have needed to do that in applications working with large files of scientific data, where receiving uncorrupted files was key. My case was simple, because users already had the MD5 computed by their data analysis tools, so I just needed to ask them for it in a text field.
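A minimal sketch of that comparison, assuming the user types or pastes the digest into a text field. Both function names are hypothetical; the only fact relied on is that an MD5 digest is written as 32 hexadecimal digits.

```javascript
// Returns the digest as 32 lowercase hex digits,
// or null if the input does not look like an MD5 digest.
function normalizeMd5(input) {
  const trimmed = String(input).trim().toLowerCase();
  return /^[0-9a-f]{32}$/.test(trimmed) ? trimmed : null;
}

// True only when both values are well-formed and identical.
function digestsMatch(clientValue, serverValue) {
  const a = normalizeMd5(clientValue);
  const b = normalizeMd5(serverValue);
  return a !== null && a === b;
}
```

Normalizing first avoids reporting a corrupted upload just because the user pasted the digest in upper case or with surrounding spaces.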

If sha256 is also fine:
async function sha256(file: File): Promise<string> {
// get byte array of file
let buffer = await file.arrayBuffer();
// hash the message
const hashBuffer = await crypto.subtle.digest('SHA-256', buffer);
// convert ArrayBuffer to Array
const hashArray = Array.from(new Uint8Array(hashBuffer));
// convert bytes to hex string
const hashHex = hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
return hashHex;
}

To get the hash of files, there are a lot of options. Normally the problem is that it's really slow to get the hash of big files.
I created a little library that gets the hash of a file using only the first 64 KB and the last 64 KB of it.
Live example: http://marcu87.github.com/hashme/ and library: https://github.com/marcu87/hashme
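The sampling step can be sketched with Blob.slice alone. This is not hashme's actual code, only an illustration of the idea; `headAndTail` and `EDGE_BYTES` are hypothetical names.

```javascript
const EDGE_BYTES = 64 * 1024; // 64 KB from each end, as described above

// Returns a Blob holding the first and last `edge` bytes of `blob`,
// or the blob itself when it is too small to have two separate edges.
function headAndTail(blob, edge = EDGE_BYTES) {
  if (blob.size <= edge * 2) return blob;
  return new Blob([blob.slice(0, edge), blob.slice(blob.size - edge)]);
}
```

The returned Blob is at most 128 KB, so hashing it stays fast whatever the file size. The trade-off is that two files differing only in the middle get the same hash.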

Hope you have found a good solution by now. If not, the solution below is an ES6 promise implementation based on js-spark-md5:
import SparkMD5 from 'spark-md5';
// Read in chunks of 2MB
const CHUNK_SIZE = 2097152;
/**
* Incrementally calculate checksum of a given file based on MD5 algorithm
*/
export const checksum = (file) =>
new Promise((resolve, reject) => {
let currentChunk = 0;
const chunks = Math.ceil(file.size / CHUNK_SIZE);
const blobSlice =
File.prototype.slice ||
File.prototype.mozSlice ||
File.prototype.webkitSlice;
const spark = new SparkMD5.ArrayBuffer();
const fileReader = new FileReader();
const loadNext = () => {
const start = currentChunk * CHUNK_SIZE;
const end =
start + CHUNK_SIZE >= file.size ? file.size : start + CHUNK_SIZE;
// Selectively read the file and only store part of it in memory.
// This allows client-side applications to process huge files without the need for huge memory
fileReader.readAsArrayBuffer(blobSlice.call(file, start, end));
};
fileReader.onload = e => {
spark.append(e.target.result);
currentChunk++;
if (currentChunk < chunks) loadNext();
else resolve(spark.end());
};
fileReader.onerror = () => {
return reject('Calculating file checksum failed');
};
loadNext();
});

There are a couple of scripts out there on the internet to create an MD5 hash.
The one from webtoolkit is good, http://www.webtoolkit.info/javascript-md5.html
Although, I don't believe it will have access to the local filesystem as that access is limited.

This is another hash-wasm example, but using the Streams API instead of having to set up a FileReader:
import { createSHA1 } from 'hash-wasm';
async function calculateSHA1(file: File) {
const hasher = await createSHA1()
const hasherStream = new WritableStream<Uint8Array>({
start: () => {
hasher.init()
// you can set UI state here also
},
write: chunk => {
hasher.update(chunk)
// you can set UI state here also
},
close: () => {
// you can set UI state here also
},
})
await file.stream().pipeTo(hasherStream)
return hasher.digest('hex')
}

I don't believe there is a way in javascript to access the contents of a file upload. So you therefore cannot look at the file contents to generate an MD5 sum.
You can however send the file to the server, which can then send an MD5 sum back or send the file contents back .. but that's a lot of work and probably not worthwhile for your purposes.
