I am relatively new to JavaScript and I want to get the hash of a file, and would like to better understand the mechanism and code behind the process.
So, what I need: an MD5 or SHA-256 hash of a file uploaded to my website.
My understanding of how this works: A file is uploaded via an HTML input tag of type 'file', after which it is converted to a binary string, which is subsequently hashed.
What I have so far: I have managed to get the hash of an input of type 'text', and also, somehow, the hash of an uploaded file, although the hash did not match the ones produced by websites I looked at online, so I'm guessing it hashed some other details of the file instead of the binary string.
Question 1: Am I correct in my understanding of how a file is hashed? Meaning, is it the binary string that gets hashed?
Question 2: What should my code look like to upload a file, hash it, and display the output?
Thank you in advance.
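Basically yes, that's how it works: what gets hashed is the file's raw bytes.
But to generate such a hash, you don't need to do the conversion to a string yourself. Instead, pass an ArrayBuffer of your file to the SubtleCrypto API and let it handle the rest. Note that SubtleCrypto supports SHA-256 (as well as SHA-1, SHA-384 and SHA-512) but not MD5.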
async function getHash(blob, algo = "SHA-256") {
  // convert your Blob to an ArrayBuffer
  // could also use a FileReader for this for browsers that don't support Response API
  const buf = await new Response(blob).arrayBuffer();
  const hash = await crypto.subtle.digest(algo, buf);
  let result = '';
  const view = new DataView(hash);
  for (let i = 0; i < hash.byteLength; i += 4) {
    // each 32-bit word is 8 hex digits; pad to 8 so leading zeros are kept
    result += view.getUint32(i).toString(16).padStart(8, '0');
  }
  return result;
}
inp.onchange = e => {
  getHash(inp.files[0]).then(console.log);
};
<input id="inp" type="file">
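The hex-conversion step is the part most likely to produce a hash that doesn't match other tools: every byte must contribute exactly two hex digits, including leading zeros. A minimal byte-wise sketch of that step, where `bufferToHex` is a name chosen here for illustration (it is not part of any API):

```javascript
// Convert an ArrayBuffer (e.g. the result of crypto.subtle.digest) to a hex string.
// bufferToHex is a hypothetical helper name, not a built-in.
function bufferToHex(buffer) {
  const bytes = new Uint8Array(buffer);
  let hex = "";
  for (const byte of bytes) {
    // Each byte becomes exactly two hex digits, so 0x0a is "0a", not "a".
    hex += byte.toString(16).padStart(2, "0");
  }
  return hex;
}

// Example: four bytes, two of which need a leading zero.
console.log(bufferToHex(new Uint8Array([0, 10, 171, 255]).buffer)); // "000aabff"
```

Going byte by byte avoids having to think about word size and endianness, at the cost of a few more loop iterations.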
I've got a blob of audio data that is confirmed to play in the browser, but it fails to play after storing, retrieving, and converting the same data. I've tried a few methods without success, each time getting the error:
Uncaught (in promise) DOMException: Failed to load because no supported source was found
Hasura notes that bytea data must be passed in as a String, so I tried a couple things.
Converting the blob into base64 stores fine but the retrieval and playing of the data doesn't work. I've tried doing conversions within the browser to base64 and then back into blob. I think it's just the data doesn't store properly as bytea if I convert it to base64 first:
// Storing bytea data as base64 string
const arrayBuffer = await blob.arrayBuffer();
const byteArray = new Uint8Array(arrayBuffer);
const charArray = Array.from(byteArray, (x: number) => String.fromCharCode(x));
const encodedString = window.btoa(charArray.join(''));
hasuraRequest....
`
mutation SaveAudioBlob ($input: String) {
insert_testerooey_one(
object: {
blubberz: $input
}
) {
id
blubberz
}
}
`,
{ input: encodedString }
);
// Decoding bytea data
const decodedString = window.atob(encodedString);
const decodedByteArray = new Uint8Array(decodedString.length).map((_, i) =>
decodedString.charCodeAt(i)
);
const decodedBlob = new Blob([decodedByteArray.buffer], { type: 'audio/mpeg' });
const audio4 = new Audio();
audio4.src = URL.createObjectURL(decodedBlob);
audio4.play();
Then I came across a GitHub issue (https://github.com/hasura/graphql-engine/issues/3336) suggesting the use of a computed field to convert the bytea data to base64, so I tried using that instead of my decoding attempt, only to be met with the same error:
CREATE OR REPLACE FUNCTION public.content_base64(mm testerooey)
RETURNS text
LANGUAGE sql
STABLE
AS $function$
SELECT encode(mm.blobberz, 'base64')
$function$
It seemed like a base64 string was not the way to store bytea data, so I tried converting the data to a hex string prior to storing. It stores ok, I think, but upon retrieval the data doesn't play, and I think it's a similar problem as storing as base64:
// Encoding to hex string
const arrayBuffer = await blob.arrayBuffer();
const byteArray = new Uint8Array(arrayBuffer);
const hexString = Array.from(byteArray, (byte) =>
byte.toString(16).padStart(2, '0')
).join('');
But using the decoded data again didn't work, regardless of whether I tried the computed field method or my own conversion methods. So, am I just not converting it right? Is my line of thinking incorrect? Or what is it I'm doing wrong?
I've got it working if I just convert to base64 and store it as a text field, but I'd prefer to store it as bytea because it takes up less space. I think something's wrong with how the data is either stored, retrieved, or converted, but I don't know how to fix it. I know the blob itself is fine because I can play audio with it right after it is generated; it only breaks after fetching the stored value and attempting to convert it. Any ideas?
Also, I'd really like to not store the file in another service like s3, even if drastically simpler.
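One way to narrow this down is to test the conversion on its own, without the database in between. The sketch below round-trips bytes through a hex string. It assumes the stored value comes back in PostgreSQL's hex format, i.e. a hex string with a leading `\x`; that is an assumption about what the API returns, not something shown above. `bytesToHex` and `hexToBytes` are illustrative names.

```javascript
function bytesToHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

function hexToBytes(hex) {
  // Assumption: the value may carry PostgreSQL's "\x" prefix; strip it if present.
  const clean = hex.startsWith("\\x") ? hex.slice(2) : hex;
  if (clean.length % 2 !== 0 || /[^0-9a-fA-F]/.test(clean)) {
    throw new Error("not a valid hex string");
  }
  const bytes = new Uint8Array(clean.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(clean.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

const original = new Uint8Array([0x49, 0x44, 0x33, 0x00, 0xff]);
const restored = hexToBytes("\\x" + bytesToHex(original));
console.log(bytesToHex(restored)); // "49443300ff"
```

If the round trip is lossless in isolation, the restored bytes can go straight into `new Blob([restored], { type: 'audio/mpeg' })`, and the remaining suspect is how the value is written to or read from the column.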
I have a drag & drop event and I would like to hash the file that was dragged. I have this:
var file = ev.dataTransfer.items[i].getAsFile();
var hashf = CryptoJS.SHA512(file).toString();
console.log("hashf", hashf)
But when I drag different files, "hashf" is always the same string.
https://jsfiddle.net/9rfvnbza/1/
The issue is that you are attempting to hash the File object itself. The CryptoJS hash functions expect a string (or a WordArray) to hash.
When you pass the File object to the CryptoJS.SHA512() method, the API attempts to convert the object to a string. That conversion results in CryptoJS.SHA512() receiving the same string no matter which File object you pass to it.
The string is [object File]. You can replace file in your code with that string and you will get the same hash you've seen all along.
To fix this, retrieve the text from the file first and pass that to the hashing algorithm:
file.text().then((text) => {
  const hashf = CryptoJS.SHA512(text).toString();
  console.log("hashf", hashf);
});
If you prefer async/await, you can put it in an IIFE:
(async () => {
  const text = await file.text();
  const hashf = CryptoJS.SHA512(text).toString();
  console.log("hashf", hashf);
})();
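Note that `file.text()` decodes the file as UTF-8, so for non-text files (images, archives, audio) the hash of the text will generally not match a hash of the raw bytes computed by other tools. A sketch of hashing the bytes instead, using the browser's built-in Web Crypto API rather than CryptoJS (a swapped-in technique, not what the code above uses); `hashBytes` is an illustrative name:

```javascript
// Hash raw bytes with Web Crypto and return a hex string.
// In browsers crypto.subtle is a global (secure contexts only); the require()
// fallback is only there so the sketch also runs on older Node.js versions.
const subtle =
  globalThis.crypto && globalThis.crypto.subtle
    ? globalThis.crypto.subtle
    : require("node:crypto").webcrypto.subtle;

async function hashBytes(data, algo = "SHA-512") {
  // data: an ArrayBuffer or typed array, e.g. `await file.arrayBuffer()`
  const digest = await subtle.digest(algo, data);
  return Array.from(new Uint8Array(digest), (b) =>
    b.toString(16).padStart(2, "0")
  ).join("");
}

// In the drop handler this would be, roughly:
//   const hashf = await hashBytes(await file.arrayBuffer());
hashBytes(new TextEncoder().encode("abc"), "SHA-256").then(console.log);
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```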
So I am dropping a .txt file into an uploader, which converts it into base64 data like this:
const {getRootProps, getInputProps} = useDropzone({
onDrop: async acceptedFiles => {
let font = ''; // its not actually a font just reusing some code i'll change it later its a .txt file so wherever you see font assume its NOT a font.
let reader = new FileReader();
let filename = acceptedFiles[0].name.split(".")[0];
console.log(filename);
reader.readAsDataURL(acceptedFiles[0]);
reader.onload = await function (){
font = reader.result;
console.log(font);
dispatch({type:'SET_FILES',payload:font})
};
setFontSet(true);
}
});
Then a POST request is made to the Node.js server, and I do receive the base64 value. I then convert it back into a .txt file by writing it into a file called signals.txt like this:
server.post('/putInDB',(req,res)=>{
console.log(req.body);
var bitmap = new Buffer(req.body.data, 'base64');
let dirpath = `${process.cwd()}/signals.txt`;
let signalPath = path.normalize(dirpath);
connection.connect();
fs.writeFile(signalPath, bitmap, async (err) => {
if (err) throw err;
console.log('Successfully updated the file data');
//all the ending brackets and stuff
Now the thing is, the original file looks like this:
Time,1,2,3,4,5,6,7,8,9,10,11,12
0.000000,7.250553,14.951141,5.550423,2.850217,-1.050080,-3.050233,1.850141,2.850217,-3.150240,1.350103,-2.950225,1.150088
But the file written back from base64 looks like this:
u«Zµìmþ™ZŠvÚ±î¸Time,1,2,3,4,5,6,7,8,9,10,11,12
0.000000,1.250095,0.250019,-4.150317,-0.350027,3.650278,1.950149,0.950072,-1.250095,-1.150088,-7.750591,-1.850141,-0.050004
See the weird characters at the beginning? Why is this happening?
Remember to read up on what the functions you use actually do: readAsDataURL does not give you the base64-encoded version of your data, it gives you a Data-URL. Data-URLs have a header prefix that tells URL parsers what kind of data follows and how to decode the data directly after the header.
To quote the MDN article:
Note: The blob's result cannot be directly decoded as Base64 without first removing the Data-URL declaration preceding the Base64-encoded data. To retrieve only the Base64 encoded string, first remove data:*/*;base64, from the result.
If you don't, blindly converting the Data-URL from base64 to plain text will give you some nonsense data at the start:
> Buffer.from('data:*/*;base64', 'base64').toString('utf-8')
'u�Z���{�'
Which raises another point: you would have caught this with POST data validation, because the Data-URL that you sent contains characters that are not allowed in base64. POST validation is always a good idea.
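A sketch of both points: stripping the Data-URL header before decoding, and rejecting input that is not plain base64. The helper names `dataUrlToBase64` and `isBase64` are made up for this example, and the regular expression is a simple character check, not a full validator.

```javascript
// Return only the base64 payload of a Data-URL such as
// "data:text/plain;base64,VGltZQ==". Everything up to the first comma is header.
function dataUrlToBase64(dataUrl) {
  const comma = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || comma === -1) {
    throw new Error("not a Data-URL");
  }
  return dataUrl.slice(comma + 1);
}

// Rough check that a string only contains standard base64 characters and padding.
function isBase64(str) {
  return str.length % 4 === 0 && /^[A-Za-z0-9+/]*={0,2}$/.test(str);
}

// Server side (Node.js), before writing the file:
const received = "data:text/plain;base64,VGltZSwxLDI=";
console.log(isBase64(received)); // false: ':' ';' ',' are not base64 characters
const payload = dataUrlToBase64(received);
console.log(isBase64(payload)); // true
console.log(Buffer.from(payload, "base64").toString("utf-8")); // "Time,1,2"
```

The header could equally be stripped on the client before the POST; either way the server should only ever decode the part after the comma.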
I know this isn't the exact code, but it is difficult to reproduce your problem with the code you provided. In any case, the data you are sending needs to be in URL/URI-encoded form.
So essentially:
encodeURI(base64data);
encodeURI is built into JavaScript: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI
EDIT:
I saw you used the function readAsDataURL(), but try using the encodeURI function and then readAsDataURL().
So in a client-side HTML page, a user selects a file and uploads it to the JavaScript code. JavaScript parses the file and sends it to the server and back to everyone else who is on the site. Then every client makes a blob download link for the file. It's easy when I can send the file to server and back like this.
But now, I want to make that file available for future users of the site without saving it to a location. This is in a chat program, so I've been sending messages from users as strings to a database. I'd like to create a program to convert the aforementioned File object to the shortest string possible and then recreate the file (including all metadata) at another client from this string.
What is the standard way to convert a Blob to a string and back again without losing anything? If there are multiple ways, which results in the shortest string?
I found the answer to my question. I had to modify some answers from other SO questions that only partly applied to mine. Here's what I found:
This is on the uploading-client, in the function called when a file is uploaded:
let inp = document.getElementById("file_input");
let reader = new FileReader();
reader.onload = function(){
send_off_to_other_clients(reader.result);
}
reader.readAsBinaryString(inp.files[0]);
On the other clients:
<script>
function get_blob_from_string (string, type, name) {
  // each character of the binary string holds one byte (0-255)
  let array = new Uint8Array(string.length);
  for (let i = 0; i < string.length; i++){
    array[i] = string.charCodeAt(i);
  }
  // Blob options only take a type; the file name is applied via a.download below
  let end_file = new Blob([array], {type: type});
  let a = document.createElement("a");
  a.href = URL.createObjectURL(end_file);
  a.download = name;
  a.target = "_blank";
  a.click();
}
</script>
end_file is the returned-to-blob version, and then I create an anchor tag to download it. Probably isn't "proper" but it works.
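The approach works because `readAsBinaryString` produces a string in which every character code is one byte (0 to 255), so `charCodeAt` recovers the bytes exactly. A small sketch of that round trip outside the FileReader, with illustrative helper names. Note that `readAsBinaryString` is a legacy method, and how the string survives transport (for example JSON or database encoding) is not covered here.

```javascript
// Bytes -> "binary string": one character per byte, char code equals byte value.
function bytesToBinaryString(bytes) {
  let out = "";
  for (const b of bytes) out += String.fromCharCode(b);
  return out;
}

// "Binary string" -> bytes, the same loop as in get_blob_from_string.
function binaryStringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i);
  }
  return bytes;
}

const input = new Uint8Array([0, 65, 128, 255]);
const roundTripped = binaryStringToBytes(bytesToBinaryString(input));
console.log(Array.from(roundTripped)); // [ 0, 65, 128, 255 ]
```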
I want to read and write to a file in a specific way.
An example file could be:
name1:100
name2:400
name3:7865786
...etc etc
What would be the best way to read this data in, store it, and eventually write it out?
I don't know which type of data structure to use. I'm still fairly new to JavaScript.
I want to be able to determine whether any keys are already present.
For example, if I were to add to the file, I could see that name1 is already in the file, and I just edit the value instead of adding a duplicate.
You can use localStorage as temporary storage between reads and writes.
Note, though, that client-side JavaScript cannot read from or write to a user's filesystem at will. You can, however, ask the user to select a file to read, and likewise ask the user to save data you provide as a file.
localStorage allows you to store the data as key-value pairs, and it's easy to check whether an item exists or not. Alternatively, use a plain object literal, which can do basically the same but only exists in memory. Data in localStorage persists across sessions and page navigations.
// set some data
localStorage.setItem("key", "value");
// get some data
var data = localStorage.getItem("key");
// check if key exists, set if not (though, you can simply overwrite the key as well)
if (localStorage.getItem("key") === null) localStorage.setItem("key", "value");
The method getItem will always return null if the key doesn't exist.
But note that localStorage can only store strings. For binary data and/or large sizes, look into IndexedDB instead.
To read a file you have to request the user to select one (or several):
HTML:
<label>Select a file: <input type=file id=selFile></label>
JavaScript:
document.getElementById("selFile").onchange = function() {
var fileReader = new FileReader();
fileReader.onload = function() {
var txt = this.result;
// now we have the selected file as text.
};
fileReader.readAsText(this.files[0]);
};
To save a file you can use File objects this way:
var file = new File([txt], "myFilename.txt", {type: "application/octet-stream"});
var blobUrl = (URL || webkitURL).createObjectURL(file);
window.location = blobUrl;
The reason for using octet-stream is to "force" the browser to show a save as dialog instead of it trying to show the file in the tab, which would happen if we used text/plain as type.
So, how do we get the data between these stages? Assuming you're using a key/value approach and text only, you can use JSON.
var file = JSON.stringify(localStorage);
Then send to user as File blob shown above.
To read the data back, either parse the file manually if it comes in some other format, or, if it is the same JSON you saved, read the file in as shown above and convert the string back to an object:
var data = JSON.parse(txt); // continue in the function block above
Object.assign(localStorage, data); // merge data from object with localStorage
Note that you may have to delete items from the storage first. There is also the chance that other data has been stored there, so these are cases that need to be considered, but this is the basis of one approach.
Example
// due to security reasons, localStorage can't be used in stacksnippet,
// so we'll use an object instead
var test = {"myKey": "Hello there!"}; // localStorage.setItem("myKey", "Hello there!");
document.getElementById("selFile").onchange = function() {
var fileReader = new FileReader();
fileReader.onload = function() {
var o = JSON.parse(this.result);
//Object.assign(localStorage, o); // use this with localStorage
alert("done, myKey=" + o["myKey"]); // o[] -> localStorage.getItem("myKey")
};
fileReader.readAsText(this.files[0]);
};
document.querySelector("button").onclick = function() {
var json = JSON.stringify(test); // test -> localStorage
var file = new File([json], "myFilename.txt", {type: "application/octet-stream"});
var blobUrl = (URL || webkitURL).createObjectURL(file);
window.location = blobUrl;
}
Save first: <button>Save file</button> (<code>"myKey" = "Hello there!"</code>)<br><br>
Then read the saved file back in:<br>
<label>Select a file: <input type=file id=selFile></label>
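For the `name:value` format from the question, manual parsing could look like the sketch below. It uses a `Map` keyed by name, so setting an existing name overwrites its value instead of adding a duplicate. `parseEntries` and `serializeEntries` are hypothetical helpers, and the sketch assumes one entry per line with the name before the first colon.

```javascript
// Parse "name:value" lines into a Map (name -> value as string).
function parseEntries(text) {
  const entries = new Map();
  for (const line of text.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip blank or malformed lines
    entries.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }
  return entries;
}

// Turn the Map back into the same line-based text format.
function serializeEntries(entries) {
  return Array.from(entries, ([name, value]) => `${name}:${value}`).join("\n");
}

const entries = parseEntries("name1:100\nname2:400\nname3:7865786");
console.log(entries.has("name1")); // true
entries.set("name1", "250"); // existing key: value is replaced
entries.set("name4", "1"); // new key: appended
console.log(serializeEntries(entries));
// name1:250
// name2:400
// name3:7865786
// name4:1
```

The text passed to `parseEntries` would be the `this.result` from the `readAsText` handler, and the output of `serializeEntries` can be handed to the `File` constructor in place of the JSON string.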
Are you using Node.js? Or browser JavaScript?
In either case, the structure you should use is a plain JavaScript object. Then you can turn it into JSON like this:
var dataJSON = JSON.stringify(yourDataObj)
With Nodejs, you'll want to require the fs module and use one of the writeFile or appendFile functions -- here's sample code:
const fs = require('fs');
fs.writeFileSync('my/file/path', dataJSON);
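Extending that to the update-instead-of-duplicate case from the question: read the existing JSON, set the key on the object, and write it back. A minimal Node.js sketch; `upsertEntry` and the file location are made up for the example, and it does no locking, so it assumes a single writer.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Set data[key] = value in a JSON file, creating the file if it does not exist.
function upsertEntry(filePath, key, value) {
  const data = fs.existsSync(filePath)
    ? JSON.parse(fs.readFileSync(filePath, "utf8"))
    : {};
  data[key] = value; // overwrites an existing key, adds a new one otherwise
  fs.writeFileSync(filePath, JSON.stringify(data, null, 2));
  return data;
}

// Example run against a fresh temporary file (hypothetical location).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "entries-"));
const file = path.join(dir, "data.json");
upsertEntry(file, "name1", 100);
upsertEntry(file, "name2", 400);
console.log(upsertEntry(file, "name1", 250)); // { name1: 250, name2: 400 }
```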
With browser JavaScript, this Stack Overflow question may help you: Javascript: Create and save file
I know you want to write to a file, but consider a database instead so that you don't have to reinvent the wheel. MySQL's INSERT ... ON DUPLICATE KEY UPDATE seems like the logical choice for what you're looking to do.
For security reasons it's not possible to use JavaScript to write to a regular text or similar file on a client's system.
However, Asynchronous JavaScript and XML (AJAX) can be used to send an XMLHttpRequest to a script on the server, written in a server-side language like PHP or ASP.
The server-side script can then write to other files, or to a database on the server.
Cookies are useful if you just need to save relatively small amounts of data locally on a client's system.
For more information have a look at
Read/write to file using jQuery