Calculate the correct MD5 value in JavaScript

I'm implementing a function to verify that the MD5 value of a file matches between the frontend and the backend.
On the frontend, I'm using SparkMD5 to calculate the MD5 value.
However, after trying two files (100 MB and 1 GB), only the 100 MB file's MD5 value matches the value generated by the Mac terminal command "md5 FILENAME"; the 1 GB file produces a different value.
Any ideas?
JS code
let readFileT0 = performance.now();
var reader = new FileReader();
reader.onload = function () {
    var hexHash = SparkMD5.hash(reader.result); // hex hash
    console.log("Hex hash: " + hexHash);
    console.log("Time elapsed:");
    console.log(performance.now() - readFileT0);
};
reader.readAsBinaryString(file);
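Since readAsBinaryString loads the entire file into a single JavaScript string, very large files are a likely culprit; SparkMD5 also offers an incremental SparkMD5.ArrayBuffer API for this case. A minimal, untested sketch of chunked hashing (the 2 MB chunk size is an arbitrary choice for illustration):
function hashFileChunked(file, callback) {
    var chunkSize = 2 * 1024 * 1024; // 2 MB per read; arbitrary
    var spark = new SparkMD5.ArrayBuffer();
    var reader = new FileReader();
    var offset = 0;

    reader.onload = function (e) {
        spark.append(e.target.result); // feed this chunk's bytes into the running hash
        offset += chunkSize;
        if (offset < file.size) {
            readNext(); // keep going until the whole file has been consumed
        } else {
            callback(spark.end()); // hex digest of the complete file
        }
    };

    function readNext() {
        reader.readAsArrayBuffer(file.slice(offset, offset + chunkSize));
    }

    readNext();
}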

Related

Client-side calculated MD5 hash using CryptoJS is different to terminal calculation

I have integrated a file upload into my web app. The file itself should not be uploaded; instead, the MD5 hash of the file should be calculated on the client side and only this hash sent to the server.
Javascript part:
if (input.files && input.files[0]) {
    let reader = new FileReader();
    reader.onload = (e) => {
        let data = e.target.result;
        var hashed = CryptoJS.MD5(data);
        console.log('hashed: ' + hashed);
    };
    reader.readAsDataURL(input.files[0]);
}
However, the code above gives me a different hash than the terminal does (md5sum). The terminal gives me the same hash as various online converters.
The same happens with the SHA1 and SHA256 algorithms I tried.
Example:
This image from Wikipedia gives the following hashes.
Terminal: e5d23cb99614778b2acb163b8ee90810
CryptoJS: 468641711626fcfe6d956ddb21ccd4c7
readAsDataURL() returns a base64 string (with a data-URI preamble), so that is what you're hashing, whereas an MD5 terminal tool just reads the raw bytes and hashes them as-is.
To fix it, use:
reader.readAsArrayBuffer(input.files[0]);
to fetch the raw bytes, and:
var hashed = CryptoJS.MD5(CryptoJS.lib.WordArray.create(data));
to pass them to CryptoJS in a format it can process.
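Put together, a sketch of the corrected handler might look like this (unverified; it simply combines the two changes above):
if (input.files && input.files[0]) {
    let reader = new FileReader();
    reader.onload = (e) => {
        // e.target.result is an ArrayBuffer holding the raw file bytes
        let wordArray = CryptoJS.lib.WordArray.create(e.target.result);
        let hashed = CryptoJS.MD5(wordArray);
        console.log('hashed: ' + hashed); // should now match md5sum
    };
    reader.readAsArrayBuffer(input.files[0]);
}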

How to pass camera image frames to a function in Wasm (C++)?

I'm trying to build a C++ function and compile it to Wasm using Emscripten.
This function will receive an image, do some processing on it, and return a result.
My first POC was successful: the user uploads an image using a file input, and I pass the image data along using the FileReader API:
const fileReader = new FileReader();
fileReader.onload = (event) => {
    const uint8Arr = new Uint8Array(event.target.result);
    passToWasm(event.target.result);
};
fileReader.readAsArrayBuffer(file); // I got this `file` from the `change` event of the file input.
But when I implemented the camera feed and started grabbing frames to pass to Wasm, I started getting exceptions on the C++ side. Here's the JS implementation:
let imageData = canvasCtx.getImageData(0, 0, videoWidth, videoHeight);
var data = imageData.data.buffer;
var uint8Arr = new Uint8Array(data);
passToWasm(uint8Arr);
This one throws an exception on the C++ side.
Now the passToWasm implementation is:
function passToWasm(uint8ArrData) {
    // copying the uint8ArrData to the heap
    const numBytes = uint8ArrData.length * uint8ArrData.BYTES_PER_ELEMENT;
    const dataPtr = Module._malloc(numBytes);
    const dataOnHeap = new Uint8Array(Module.HEAPU8.buffer, dataPtr, numBytes);
    dataOnHeap.set(uint8ArrData);
    // calling the Wasm function
    const res = Module._myWasmFunc(dataOnHeap.byteOffset, uint8ArrData.length);
}
While the C++ implementation will be something like this:
void EMSCRIPTEN_KEEPALIVE checkImageQuality(uint8_t* buffer, size_t size) {
    // I'm using OpenCV in C++ to process the image data,
    // so I read the data of the image
    cv::Mat raw_data = cv::Mat(1, size, CV_8UC1, buffer);
    // then I convert it
    cv::Mat img_data = cv::imdecode(raw_data, cv::IMREAD_COLOR | cv::IMREAD_IGNORE_ORIENTATION);
    // in one of the following steps I'm using the cvtColor function, which causes the issue for some reason
}
The exception I'm getting because of the camera implementation says:
OpenCV(4.1.0-dev) ../modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'
What is the difference between passing the data from a file input and passing it from a canvas, given that both of them are converted to a Uint8Array?
I found a solution for this (it may only suit my case).
When you get image data from a canvas, you get it as 4 channels (RGBA, as in PNG), and your image processing code has to deal with that.
My code assumed the image has 3 channels (RGB, as in JPEG), so I had to convert it using this code:
canvasBuffer.toBlob(function (blob) {
    passToWASM(blob);
}, 'image/jpeg');
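Note that this snippet hands passToWASM a Blob, while the passToWasm shown earlier expects a typed array, so presumably the Blob still has to be read back into bytes first. A rough sketch of that bridge, reusing the earlier passToWasm:
canvasBuffer.toBlob(function (blob) {
    var reader = new FileReader();
    reader.onload = function (e) {
        // e.target.result is an ArrayBuffer containing the encoded JPEG,
        // which cv::imdecode on the C++ side can decode as a 3-channel image
        passToWasm(new Uint8Array(e.target.result));
    };
    reader.readAsArrayBuffer(blob);
}, 'image/jpeg');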

Difference between JavaScript & C# image file array

In C#, I read an image file into a base64 encoded string using:
var img = Image.FromFile(@"C:\xxx.jpg");
MemoryStream ms = new MemoryStream();
img.Save(ms, System.Drawing.Imaging.ImageFormat.Gif);
var arr = ms.ToArray();
var b64 = Convert.ToBase64String(arr);
In JavaScript, I do this:
var f = document.getElementById('file').files[0], //C:\xxx.jpg
    r = new FileReader();
r.onloadend = function (e) {
    console.log(btoa(e.target.result));
};
r.readAsBinaryString(f);
I'm getting different results.
Ultimately, I'm trying to read it in Javascript, POST that to an API, save in a database, and then retrieve & use later. If I use the C# code above, I can store the file and retrieve/display it fine. If I try the Javascript option, I don't get a valid image. So I'm trying to work out why they're different / where the problem is. Any pointers?
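For what it's worth, note that the C# snippet re-saves the JPEG as a GIF before base64-encoding it, so it encodes different bytes than the original file. On the JavaScript side, one way to get base64 of the file's raw bytes directly is readAsDataURL, which already yields base64 after the data-URI prefix; a small sketch (unverified):
var f = document.getElementById('file').files[0],
    r = new FileReader();
r.onloadend = function (e) {
    // e.target.result looks like "data:image/jpeg;base64,...."
    // the part after the comma is the bare base64 payload of the raw file bytes
    console.log(e.target.result.split(',')[1]);
};
r.readAsDataURL(f);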

JavaScript FileReader: Ensure variable is populated with multiple asynchronous file reads

Basically, I'm reading local files to display the data contents to the user. The files have a metadata text section followed by a large binary section. The metadata contains vital information needed to correctly parse the binary section. So, the pattern I'm using to parse the file is the following:
Get the text and binary offsets
Parse metadata and save info about binary parsing
Parse the binary data using the info from step 2
I've set up multiple FileReaders to accomplish this, and everything seems to be working. However, during development I had to be careful about how the intermediate data is saved so that it's available for the binary parsing step.
Here's the basic code I've created, with the long file parsing details removed for better readability:
function setupReader(obj) {
    var reader = new FileReader();
    reader.addEventListener("loadend", function (evt) {
        // ...get start/end locations for text and data sections
        parseText(obj);
        parseData(obj);
    });
    var blob = obj.file.slice(0, 58);
    reader.readAsBinaryString(blob);
}

function parseText(obj) {
    var reader = new FileReader();
    reader.addEventListener("loadend", function (evt) {
        // ...do lots of stuff and record new properties in obj
        // save obj to scope so it's available to parse data section
        $scope.file_obj = obj;
    });
    var blob = obj.file.slice(obj.text_begin, obj.text_end);
    reader.readAsBinaryString(blob);
}

function parseData(obj) {
    var reader = new FileReader();
    reader.addEventListener("loadend", function (evt) {
        // ...populate array in $scope.file_obj from binary data
    });
    var blob = obj.file.slice(obj.data_begin, obj.data_end);
    reader.readAsBinaryString(blob);
}
My question is whether this pattern guarantees that $scope.file_obj will be available in parseData()?
This seems to be the case; however, parseText() happens quite fast, so I'm not sure whether I'm just lucky that it finished in time. I want to be sure I understand the behaviour.
Thanks!
To ensure that the data is available on the $scope, I would rather call parseData from inside the 'loadend' event handler in parseText. Otherwise, as you say, you might just be lucky.
Best.
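A sketch of what that chaining might look like (assuming the functions above; the parseData(obj) call in setupReader would then be dropped):
function parseText(obj) {
    var reader = new FileReader();
    reader.addEventListener("loadend", function (evt) {
        // ...do lots of stuff and record new properties in obj
        $scope.file_obj = obj;
        parseData(obj); // only start the binary parse once $scope.file_obj is set
    });
    var blob = obj.file.slice(obj.text_begin, obj.text_end);
    reader.readAsBinaryString(blob);
}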

encode/decode image with base64 breaks image

I am trying to encode and decode an image. I am using the FileReader's readAsDataURL method to convert the image to base64. Then to convert it back I have tried using readAsBinaryString() and atob() with no luck. Is there another way to persist images without base64 encoding them?
readAsBinaryString()
Starts reading the contents of the specified Blob, which may be a File. When the read operation is finished, the readyState will become DONE, and the onloadend callback, if any, will be called. At that time, the result attribute contains the raw binary data from the file.
Any idea what I'm doing wrong here?
Sample Code
http://jsfiddle.net/qL86Z/3/
$("#base64Button").on("click", function () {
var file = $("#base64File")[0].files[0]
var reader = new FileReader();
// callback for readAsDataURL
reader.onload = function (encodedFile) {
console.log("reader.onload");
var base64Image = encodedFile.srcElement.result.split("data:image/jpeg;base64,")[1];
var blob = new Blob([base64Image],{type:"image/jpeg"});
var reader2 = new FileReader();
// callback for readAsBinaryString
reader2.onloadend = function(decoded) {
console.log("reader2.onloadend");
console.log(decoded); // this should contain binary format of the image
// console.log(URL.createObjectURL(decoded.binary)); // Doesn't work
};
reader2.readAsBinaryString(blob);
// console.log(URL.createObjectURL(atob(base64Image))); // Doesn't work
};
reader.readAsDataURL(file);
console.log(URL.createObjectURL(file)); // Works
});
Thanks!
After some more research I found the answer from here.
I basically needed to wrap the raw binary in an ArrayBuffer and copy the character codes of the decoded base64 string into a Uint8Array.
This is the code that was missing:
var binaryImg = atob(base64Image);
var length = binaryImg.length;
var ab = new ArrayBuffer(length);
var ua = new Uint8Array(ab);
for (var i = 0; i < length; i++) {
    ua[i] = binaryImg.charCodeAt(i);
}
The full sample code is here
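As a usage note, once the bytes are in the typed array, a Blob (and an object URL for display) can be built from it, which lines up with the point in the next answer that URL.createObjectURL wants a Blob rather than a string:
var blob = new Blob([ua], { type: "image/jpeg" }); // ua is the Uint8Array built above
var url = URL.createObjectURL(blob); // usable as an <img> src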
URL.createObjectURL expects a Blob (which can be a File) as its argument, not a string. That's why URL.createObjectURL(file) works.
Instead, you are creating a FileReader reader that reads file as a data URL, then you use that data URL to create another Blob (whose contents are that base64 text). And then you even create a reader2 to get a binary string from the just-constructed blob. However, neither the base64Image part of the URL string (even if decoded with atob) nor the decoded.binary string is a valid argument to URL.createObjectURL!
