I'm implementing image upload via a browser form, working with AWS and Node.js. The user selects a file and provides additional info, and it is all sent to the backend using multipart/form-data.
This works great: the payload goes through API Gateway ---> Lambda, and the Lambda uploads it to an S3 bucket. I'm using busboy to deal with the multipart data and end up with a nice JSON object containing all the data sent from the frontend, something like:
{
userName: "Homer Simpson",
file: base64encoded_string,
}
Then I grab this base64encoded_string and upload it to S3, so the file sits there and I'm able to open it, download it, etc.
Now, obviously I don't trust any input from the frontend, and I wonder what the best way is to ensure that the file being sent is not malicious.
In this case I need to allow only image uploads, say png and jpg/jpeg, up to 2 MB in size.
Busboy gives me the MIME type, encoding and other details, but I'm not sure whether this is reliable enough or whether I should use something like mmmagic or similar. How secure and reliable would these solutions be?
Any pointers would be much appreciated.
OWASP has a section on this with some ideas. That said, I found that the best way to secure an image upload is to convert it, period. If you can convert it, it's an image, and any attached info (code, hidden data, etc.) is removed by the conversion process; if you can't, it's not an image.
Another advantage is that you can strip EXIF info, add some data (watermarks, for example), etc.
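As a cheap first gate before the conversion step, you can also reject anything that is over the size limit or does not start with PNG/JPEG magic bytes. This is only a minimal sketch for Node.js: `sniffImageType` and `MAX_BYTES` are names made up for illustration, and it assumes you have already decoded the base64 string from the question into a Buffer. Passing this check does not make a file safe (valid magic bytes can be followed by anything), so it complements re-encoding rather than replacing it.

```javascript
// 2 MB limit taken from the question; adjust as needed.
const MAX_BYTES = 2 * 1024 * 1024;

// File signatures: every PNG starts with these 8 bytes, every JPEG with FF D8 FF.
const PNG_MAGIC = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const JPEG_MAGIC = Buffer.from([0xff, 0xd8, 0xff]);

// Returns 'image/png', 'image/jpeg', or null if the buffer is too large
// or does not start with a known signature.
function sniffImageType(buf) {
  if (buf.length === 0 || buf.length > MAX_BYTES) return null;
  if (buf.subarray(0, PNG_MAGIC.length).equals(PNG_MAGIC)) return 'image/png';
  if (buf.subarray(0, JPEG_MAGIC.length).equals(JPEG_MAGIC)) return 'image/jpeg';
  return null;
}
```

In the Lambda this could be called as `sniffImageType(Buffer.from(payload.file, 'base64'))`, where `payload.file` stands for the base64 string from the parsed form. Compare the result with the MIME type busboy reports and reject on mismatch, then hand the buffer to whichever image library does the conversion.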
This is part of an experiment I am working on.
Let's say I upload a file, e.g. a .psd (Photoshop file) or .sketch (Sketch file), through an <input type="file"> tag. It displays the name of the file, and the file can be downloaded as a .psd / .sketch at the click of a button (without data corruption).
How would this be achieved?
Edit 1:
I'm going to add a bit more info as the above was not completely clear.
This is the flow:
User uploads any file
File gets encrypted in the client before being sent to a Socket.IO server
User on the other end receives this file and is able to decrypt and download.
Note: There is no database connected to the Socket.IO server. It just listens and responds to whoever is connected to it.
I've got the enc/dec part covered. The only missing piece is reading the uploaded file and storing it as ? in a variable so it can be encrypted, and doing the opposite on the recipient's end (decrypt it and make it downloadable).
Thanks again in advance :)
I think these are your questions:
How to read a file that was opened/dropped into an <input type="file"> element
How to send a file to a server
How to receive a file from a server
When a user selects a file in your file input element, you'll be able to use its files property:
for (const file of fileInputEl.files) {
// Do something with file here...
}
Each file implements the Blob interface, which means you can call await file.arrayBuffer() to get an ArrayBuffer, which you can likely use directly in your other library. At a minimum, you can create your byte array from it.
Now, to send data, I strongly recommend that you use HTTP rather than Socket.IO. If you're only sending data one way, there is no need for a Web Socket connection or Socket.IO. If you make a normal HTTP request, you offload all the handling of it to the browser. On the upload end, it can be as simple as:
fetch('https://files.example.com/some-id-here', {
  method: 'PUT',
  body: file
});
On the receive end, you can simply open a link <a href="https://files.example.com/some-id-here">.
Now, the server part... You say that you want to just pass this file through. You didn't specify at all what you're doing on the server. So, speaking abstractly, when you receive a request for a file, you can just wait and not reply with data until the sending end connects and starts uploading. When the sending end sends data, send that data immediately to the receiving end. In fact, you can pipe the request from the sending end to the response on the receiving end.
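To make the piping idea concrete, here is a rough sketch using Node's built-in http module. Everything in it is an assumption for illustration: the question doesn't say what the server runs, and the function names and the URL layout (GET /<id> to receive, PUT /<id> to send) are made up. It also leaves out CORS headers, authentication, and timeouts for receivers whose sender never shows up.

```javascript
const http = require('node:http');

// Receivers that have connected and are waiting for a sender, keyed by transfer ID.
const waitingReceivers = new Map();

// Register a receiver's writable stream (e.g. an http.ServerResponse) under an ID.
function addReceiver(id, res) {
  waitingReceivers.set(id, res);
  // Forget the receiver if it disconnects before a sender arrives.
  res.on('close', () => {
    if (waitingReceivers.get(id) === res) waitingReceivers.delete(id);
  });
}

// Pipe a sender's readable stream (e.g. an http.IncomingMessage) to the
// waiting receiver. Returns false if nobody is waiting under that ID.
function relayFromSender(id, req) {
  const res = waitingReceivers.get(id);
  if (!res) return false;
  waitingReceivers.delete(id);
  req.pipe(res); // bytes are forwarded as they arrive; nothing is stored
  return true;
}

// Hypothetical wiring: GET /<id> waits for data, PUT /<id> supplies it.
const server = http.createServer((req, res) => {
  const id = req.url.slice(1);
  if (req.method === 'GET') {
    res.writeHead(200, { 'Content-Type': 'application/octet-stream' });
    addReceiver(id, res);
  } else if (req.method === 'PUT') {
    if (relayFromSender(id, req)) {
      // Acknowledge the sender once its whole body has been forwarded.
      req.on('end', () => { res.statusCode = 204; res.end(); });
    } else {
      res.statusCode = 404;
      res.end('No receiver is waiting for this ID');
    }
  } else {
    res.statusCode = 405;
    res.end();
  }
});
// server.listen(3000); // the port is an assumption
```

With the server listening, the receiving end opens a plain link to /some-id-here, and the sending end's fetch with method 'PUT' and body: file targets the same path. Since the file is already encrypted in the client, the server only ever sees ciphertext.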
You'll probably have some initial signalling to choose an ID, so that both ends know where to send/receive from. You can handle this via your usual methods in your chat protocol.
Some other things to consider... WebRTC. There are several off-the-shelf tools for doing this already, where the data can be sent peer-to-peer, saving you some bandwidth. There are some complexities with this, but it might be useful to you.
I know that blob is a data type for binary data, just as integer is a data type for ints. As they say, it's used to store files directly in the database (we move our audio file into a blob and save that blob in the database).
Question 1) Why store a blob for audio if I can just put the audio in storage, for example at the path /var/www/audio.mp3, and store the path name /var/www/audio.mp3 in the database?
Question 2) Which is better? How does Netflix store movies? Just blobs, or what?
Question 3) I'm curious whether there are any pros or cons. If you could give me some ideas, I'd know when to use each.
Putting the blob in the database, rather than in a file, allows you to grow to multiple servers with load balancing. If you put the data in files, you would have to replicate the files between the servers. Most databases have built-in replication features; this isn't as easy for regular files.
It is better to use external storage or a CDN for serving this kind of large content.
How do Netflix and the like work? They upload content to an external bucket, e.g. S3, and write the file name in the DB for identification. Depending on how frequently users access a file, it gets cached on a CDN/edge location. Users get a great experience when content is served from their nearest edge location.
With blob you can store all kinds of stuff.
Do you communicate with an API via SOAP or JSON and want to store it in the database? Use a blob. Want to log what a user filled into a form when it threw an exception? Store the entire post as a blob. You can save everything as is. It's handy for logging if you have different data formats. I know an API which expects some data via SOAP and some as JSON. To log the communication I use blob because the response may be in XML, JSON, a number (http code 203 for empty but accepted) or an exception as array.
My application has to generate reports which should be available for download in XLS format. I have built a REST API using Django Rest Framework and there is an endpoint for report generation. It accepts POST requests with a JSON body (report parameters, like from, to, etc., but there is also some data that is represented as JSON objects) and returns a JSON result. I successfully use it from JavaScript, render the report as an HTML table, and it works just fine.
My problem is that I need to allow users to save the report as an .xls file with a decent filename (like myawesomereport.04.12-10.12.xls). I tried the JS data URL approach, but as far as I understand, there is no way to set a filename if you go with that option (except setting a download attribute on an <a> tag, but its support is limited, so it's not the way to go). I thought that maybe I should open a new window with my API endpoint's URL appropriately formed, so it outputs an XLS file, but the problem is that I do not understand whether there is a way to send JSON with that request.
How should I approach this problem?
You can set the filename in the backend, by using the header Content-Disposition, so that in the frontend you can use a standard <a> tag.
In DRF, it would look like this:
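# response is the response object your view returns; quoting the
# filename keeps names containing spaces intact
response['Content-Disposition'] = 'attachment; filename="{}"'.format(
    file_name
)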
I'm trying to implement an HTML controller in my web app that will upload files from the client to my Azure Blob Storage.
I know how to do it on the server side with C#, but this solution isn't right for me because I'm dealing with large files (that the client uploads), so I don't want to upload them to my server. I want the client to upload them straight to Blob Storage.
But here is where I'm lost; maybe you could help me.
Objective: I need to grant SAS for that user.
Solution: I call (using AJAX) a server-side method that generates the string (URL + SAS token).
Now all that is left to do is split the files into chunks and upload them using the URL with the token that I generate on the server side.
I have read a lot of articles about this, but every article says something different, and half of them are from the period when Azure did not support CORS, so a huge number of them are out of date.
How can I do the last two things the right way:
1. Chunk the file.
2. Upload the file.
One last thing: I read in some article that I need to split the file into chunks, upload all the chunks, and then commit them (or something like that) so they become one file in storage. (Maybe I got it wrong.)
Anyway, I would appreciate guidelines or anything else that will help me get these two last jobs done.
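For reference, the "upload the chunks, then commit" flow mentioned above matches the Blob service REST operations Put Block (PUT with comp=block&blockid=...) and Put Block List (PUT with comp=blocklist and an XML list of block IDs). Below is a minimal sketch of how the client side could look. The helper names, the 4 MiB block size, and the assumption that sasUrl already contains the SAS query string are all mine, and the CORS rules on the storage account must allow these PUT requests and headers for it to work from a browser.

```javascript
const BLOCK_SIZE = 4 * 1024 * 1024; // 4 MiB per block; an assumption, tune as needed

// Block IDs must be Base64 strings of equal length within one blob,
// hence the zero-padded counter.
function blockId(index) {
  return btoa(String(index).padStart(6, '0'));
}

// Split a file of fileSize bytes into { id, start, end } byte ranges (end exclusive).
function planBlocks(fileSize, blockSize = BLOCK_SIZE) {
  const blocks = [];
  for (let start = 0, i = 0; start < fileSize; start += blockSize, i++) {
    blocks.push({ id: blockId(i), start, end: Math.min(start + blockSize, fileSize) });
  }
  return blocks;
}

// Body for the final commit request, listing the blocks in order.
function blockListXml(ids) {
  const items = ids.map((id) => `<Latest>${id}</Latest>`).join('');
  return `<?xml version="1.0" encoding="utf-8"?><BlockList>${items}</BlockList>`;
}

// Hypothetical driver: file is a File/Blob from the input element,
// sasUrl is the blob URL plus SAS token returned by your server.
async function uploadInBlocks(file, sasUrl) {
  const blocks = planBlocks(file.size);
  for (const b of blocks) {
    const url = `${sasUrl}&comp=block&blockid=${encodeURIComponent(b.id)}`;
    const res = await fetch(url, { method: 'PUT', body: file.slice(b.start, b.end) });
    if (!res.ok) throw new Error(`Put Block failed with status ${res.status}`);
  }
  const commit = await fetch(`${sasUrl}&comp=blocklist`, {
    method: 'PUT',
    headers: { 'x-ms-blob-content-type': file.type || 'application/octet-stream' },
    body: blockListXml(blocks.map((b) => b.id)),
  });
  if (!commit.ok) throw new Error(`Put Block List failed with status ${commit.status}`);
}
```

The blocks are uploaded one after another here to keep the sketch short; uploading a few in parallel is possible because the order is only fixed by the block list in the commit request.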
*Update:
The errors I get (1. the OPTIONS request, 2. the headers):
Open the image in a new tab to see it properly
*Update 2:
This is how I set up CORS:
I am performing multipart upload to S3 straight from the browser, i.e. bypassing my back-end.
My JS code sends a hand-shake request to S3 (a POST), then uploads the file in chunks of 5MB (a PUT for each) and eventually, finalises the file (a POST).
That works well. As you can guess, each request to S3 (hand-shake, part uploads and finalisation) has to be signed. It is of course out of the question to generate the signature in JS as it would expose my AWS secret key.
What I am doing so far is the following: before each request to S3, I send a request to my own back-end (to /sign?method=HTTPMethod&path=URLToSign), which returns the signature string. This way, all AWS credentials stay in the back-end, as they should.
My question is the following: is this secure?
One thing you should take into account is that anyone who sees the address you are calling to get the signature string could abuse it to upload any file (and any number of files) on your behalf.
To avoid that, my first guess would be to implement some sort of validation in the back-end, so for example, you wouldn't sign URLs if you were getting more than X requests per second.
Also, if you need to filter some files out (EXEs, huge files that could raise your AWS bill), I don't think bypassing your back-end is a good idea, because you have no control over which files are getting uploaded (maybe your user uploads a file named kitten.png which is actually a 700 MB ISO file).
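Beyond rate limiting, the back-end validation could also refuse to sign anything that doesn't look like an upload by the current user. The following is only a sketch of that idea: the function name, the /uploads/<userId>/ key layout, and the extension whitelist are all made up for illustration, and userId is assumed to come from your own session handling, never from the request itself.

```javascript
// Only the methods the multipart flow in the question needs.
const ALLOWED_METHODS = new Set(['POST', 'PUT']);

// Decide whether the /sign endpoint should sign this method + path
// for the given (already authenticated) user.
function maySign(userId, method, path) {
  if (!ALLOWED_METHODS.has(method)) return false;

  // Hypothetical layout: each user may only write under their own prefix.
  const prefix = `/uploads/${userId}/`;
  if (!path.startsWith(prefix)) return false;

  // Strip the query string (uploadId, partNumber, ...) and check the object name.
  // No slashes are allowed, so '../' tricks cannot escape the prefix.
  const name = path.slice(prefix.length).split('?')[0];
  return /^[A-Za-z0-9._-]+\.(png|jpe?g)$/i.test(name);
}
```

For example, maySign('u1', 'PUT', '/uploads/u1/cat.png?partNumber=1&uploadId=abc') passes, while a DELETE, another user's prefix, or a .exe name is refused. Note that this only constrains what gets signed; as said above, it cannot tell whether kitten.png is really an image or how large it is, so those checks still have to happen somewhere after the upload.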