I'll briefly summarize my problem:
I'm calling an API that returns a PDF, which starts like this:
"% PDF-1.4%%1 0 obj
<<
/ Type / Catalog/ PageLayout / OneColumn/
Pages 2 0 R/ PageMode / UseNone
......... "
Currently I receive it as a string so that I can make changes to it, and so far so good. But after making the changes I would like to convert the string to a Blob in order to download the PDF. In doing this I am having problems: the text string converted to a Blob does not generate a correct PDF; the PDF opens, but it is blank when it should actually contain data.
The code I'm using now is the following:
response.text().then((content) => {
    //...TODO: Modify pdf
    var blob = new Blob([content], { type: "application/pdf" });
    saveAs(blob, "invoice.pdf");
}).catch(error => {
    console.log(error);
});
The PDF is downloaded, but when I open it, it is empty.
I would like to be able to modify the PDF string and convert it back into a Blob so that I can download it.
Does anyone have an idea how I could do this?
A PDF consists of a set of objects arranged in a non-trivial structure. If you receive it as a string and use standard string-manipulation functions on it, e.g. find and replace, you are most likely going to corrupt it. You would have to edit it in accordance with the PDF specification and without violating its syntax. That is a very fragile approach; use a PDF library to edit your PDF content instead.
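As an illustration of that suggestion, here is a minimal sketch using the pdf-lib library. It assumes the response can be read as binary with response.arrayBuffer(), that pdf-lib suits your use case, and that the 'MODIFIED' stamp stands in for whatever change you actually need; it is not part of the original answer:

import { PDFDocument, StandardFonts } from 'pdf-lib';
import { saveAs } from 'file-saver'; // or however saveAs is already available in your project

async function editAndDownload(response) {
    // Read the body as binary data, never as text, so the PDF bytes stay intact.
    const bytes = await response.arrayBuffer();
    const pdfDoc = await PDFDocument.load(bytes);

    // Example modification: draw some text on the first page.
    const font = await pdfDoc.embedFont(StandardFonts.Helvetica);
    const firstPage = pdfDoc.getPages()[0];
    firstPage.drawText('MODIFIED', { x: 50, y: 700, size: 24, font: font });

    // Serialize back to bytes and download as before.
    const modifiedBytes = await pdfDoc.save();
    const blob = new Blob([modifiedBytes], { type: 'application/pdf' });
    saveAs(blob, 'invoice.pdf');
}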
Related
I receive from an API a string that is a binary stream, plus a filename. Here is an example (shortened for brevity):
{ filename: '002-000017.pdf' body: '%PDF-1.7\n\n4 0 obj\n(Identity)\nendobj\n5 0 obj\n(Adobe)\nendobj\n8 0 obj\n<<\n/Filter /FlateDecode\n/Length 181864\n/Length1 542744\n/Type /Stream\n>>\nstream\nx��}\t`�������}dw����~��\t9 #H6'\t�9 e.t.c ' }
Other files simply have \u001c\u0004N�v�$$\u0010$�\u0000
In the above example it is a PDF file, but I don't know what content type the file will be; it could be a .docx, a .txt, or even have no file extension. The name will not always have an extension either. I take this file, convert it to a Blob, attach it to an anchor, and click the anchor.
I have tried specifying no type, 'application/unknown', and 'application/octet-stream'. I even tried hardcoding it to 'application/pdf' for the above sample, just to see if that would work. In every scenario the PDF is blank. (Or unreadable with other file types; a .docx, for example, just crashes when opened with Word.)
The exception is .txt: any .txt files work perfectly.
I have confirmed that the string is convertible and readable; I did this manually using PowerShell without issue. The issue is that I'm doing something wrong in my web-side JavaScript code. What am I missing here?
Code:
const { filename } = response
const blob = new Blob([response.body])
const blobURL = window.URL.createObjectURL(blob);
// Using React Ref to handle the anchor.
downloadAnchor!.current!.href = blobURL;
downloadAnchor!.current!.download = filename;
downloadAnchor!.current!.click();
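For what it's worth, the usual cause of this symptom is that the body has already been decoded as text before the Blob is built, which corrupts any bytes that are not valid in the text encoding. Here is a minimal sketch of the same download built from a binary response instead; the fetch call, the URL, and the way the filename is obtained are assumptions for illustration, not part of the original post:

// Hypothetical endpoint; the point is reading the body as binary, not the URL.
const res = await fetch('/api/files/002-000017');

// Read the body as a Blob so the bytes never pass through a text decoder.
const blob = await res.blob();
const blobURL = window.URL.createObjectURL(blob);

const a = document.createElement('a');
a.href = blobURL;
a.download = '002-000017.pdf'; // or the filename returned by the API
a.click();
window.URL.revokeObjectURL(blobURL);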
Sorry for the long question, but I wanted to express it as well as I could and ask it in a way that is easily understood. I have a program that allows a user to crop an image using croppie.js in JavaScript and send the image to a Hunchentoot server on the backend running a Lisp program. I am having an issue saving the Base64 image to a .png file once the user uploads it.

Once the POST request reaches the server, I get the Base64 image as a string and remove invalid characters from it by taking a subsequence of the string without the heading sent by the POST request, and by replacing the "%" characters with "+" to make the Base64 valid. Next I remove the substring "+3D+3D" at the end of the string, because the s-base64 library that I am using in Common Lisp complains that "+3D+3D" is invalid padding, and I replace it with "==", which is considered valid padding.

Next I use the s-base64 library to translate the Base64 string into a byte array and store it in a variable. Then I loop through the byte array and write each byte to the output file. When that is finished, I print the end of the Base64 string to the console so that I can check whether the ending padding is valid, which it appears to be. Here is that part of the code, with comments to make it clearer:
(define-easy-handler (handle-image :uri "/handle-image.html") ()
  (let ((data-source (hunchentoot:raw-post-data :force-text t)))          ; get the Base64 string
    (let ((new-string (subseq data-source 36)))                            ; drop the POST-request heading
      (let ((final-string (substitute #\+ #\% new-string)))                ; replace % with +
        (let ((end (search "+3D+3D" final-string)))                        ; find the invalid padding
          (setf final-string
                (concatenate 'string (subseq final-string 0 end) "=="))    ; add valid padding
          (let ((byte-array (with-input-from-string (in final-string)      ; decode the Base64 string
                              (decode-base64-bytes in))))                  ; into a byte array
            (with-open-file (out "path/path/path/path/profile-image.png"   ; output stream for the file
                                 :direction :output
                                 :if-exists :supersede
                                 :element-type 'unsigned-byte)
              (dotimes (i (length byte-array))                             ; write each byte to the stream
                (write-byte (aref byte-array i) out))))                    ; stream closed by with-open-file
          (format t "!!!!!!!!: ~a"
                  (subseq final-string (- (length final-string) 30)))))))  ; print the ending to check padding
  "Upload Successful")                                                     ; send response to the client
And here is some of my JavaScript code:
$(document).ready(function(){
    $image_crop = $('#image_demo').croppie({
        enableExif: true,
        viewport: {
            width: 200,
            height: 200,
            type: 'square' // or 'circle'
        },
        boundary: {
            width: 300,
            height: 300
        }
    });
As you can see, I first create the cropper. I allow the user to have a 200 x 200 square to crop, and the total size of the cropping space is 300 x 300. There are no issues with that part of the code:
    $('#upload_image').on('change', function(){
        var reader = new FileReader();
        reader.onload = function (event) {
            $image_crop.croppie('bind', {
                url: event.target.result
            }).then(function(){
                console.log('jQuery bind complete');
            });
        }
Above, when they select a file, I bind the image they've uploaded to the cropper (as when you upload a Facebook image). Again, no issues:
reader.readAsDataURL(this.files[0]);
$('#uploadImageModal').modal('show');
Above I read the file that they've selected and then the modal "pops up" as if you're cropping a Facebook or Instagram photo:
$('.crop_image').click(function(event){
    $image_crop.croppie('result', {
        type: 'canvas',
        size: 'viewport'
    }).then(function(response){
        $.ajax({
            url: "handle-image.html",
            type: "POST",
            data: { "image": response },
            success: function(data){
                $('#uploadImageModal').modal('hide');
                $('#uploaded_image').html(data);
            }
        });
    })
});
Above I send the AJAX request; if the upload was successful they will get a message from the server saying so, and I also hide the image-cropping modal.
Now the issue is that the image is simply blank. I know that the Base64 is valid because I used a Base64 conversion tool to check the string. I also went back and did my research on bits, bytes, pixels, and image dimensions to see how the computer handles them, so I'm not sure why my image is displaying blank.
The bind is working, and I get the message that the upload was successful. BUT after writing the image and viewing it in the file system, it is either a blank image or it will sometimes say that the image type is not supported.
The reason I am tagging PHP in this post is that I am sure some people have had similar issues in PHP when uploading a cropped image via AJAX, and some of those solutions might be applicable here, although they will obviously require translating to Lisp syntax. My assumption is that something is wrong with my code where I translate the string to a byte array and write it to a file, but I thought it would be good to post other sections of my code in case I am overlooking something.
As Brad commented, you should first try to use binary uploads directly.
That aside: if you encounter a % in a Base64-encoded string, it most likely means that the whole thing has additionally been URL-encoded. A quick apropos search turns up do-urlencode as a library for decoding that. Replacing % with + produces syntactically valid Base64, but the result does not necessarily represent valid image data.
Also: use let* instead of nested let forms. Maybe use write-sequence instead of byte-wise output.
Thanks to the answers from @Brad and @Svante I was able to solve the problem. I decided to put the image that I want to upload within a form element, append the blob image produced by the cropper to the form's FormData, and then send the FormData via an AJAX POST request:
$('.crop_image').on('click mousedown touchstart', function(event){ // when the crop-image button is pressed
    $image_crop.croppie('result', {   // get the result of the cropper
        size: 'viewport',             // set the size to the viewport size (180 x 120)
        format: 'png',                // .png format
        type: 'blob'
    }).then(function (blob){
        var $form = $('#uploadForm');
        var fd = new FormData($form[0]); // FormData needs the DOM form element, not the jQuery wrapper
        fd.append('upload_image', blob);
        $.ajax({
            url: "handle-image.html",
            type: "POST",
            data: fd,
            processData: false,
            contentType: false,
            success: function (data){
                $('#uploadImageModal').modal('hide');
                $('#uploaded_image').html(data);
            }
        });
    })
});
Here I just changed the croppie result from the type "canvas" to the type "blob", and specified that it will be a .png file. From there I added the form data from my newly created form to the $.ajax request and went from there. Thanks to @Brad and @Svante for the help here.
I want to show my HTML DOM as a PDF in a new tab using a Blob. We can achieve this by calling an API that returns a blob, but I want to do it without involving any server-side API. I am facing problems converting my HTML string to an ArrayBuffer. Here is the code on StackBlitz; you can test it, and please let me know how to solve this. Thanks
generate pdf from html dom
Consider using TextEncoder and TextDecoder, which can easily convert between strings and ArrayBuffers:
const encoder = new TextEncoder(),
decoder = new TextDecoder(),
text = 'Hello',
textEncoded = encoder.encode(text),
textDecoded = decoder.decode(textEncoded);
console.log({ text, textDecoded, textEncoded });
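If the goal is then to show the result in a new tab as a Blob, here is a minimal sketch of that last step. The htmlString value is a placeholder, and note that encoding an HTML string this way only gives you HTML bytes, not a PDF, so an actual PDF would still have to be generated from the DOM first:

const htmlString = '<h1>Hello</h1>';                 // the DOM serialized to a string
const bytes = new TextEncoder().encode(htmlString);  // string -> Uint8Array backed by an ArrayBuffer

// Wrap the bytes in a Blob and open it in a new tab via an object URL.
const blob = new Blob([bytes], { type: 'text/html' });
const url = URL.createObjectURL(blob);
window.open(url, '_blank');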
I have a problem (or maybe two) with saving files using the HTML5 File API.
A file comes from the server as a byte array and I need to save it. I tried several approaches described on SO:
creating blob and opening it in a new tab
creating a hidden anchor tag with "data:" in href attribute
using FileSaver.js
All of these approaches let me save the file, but they break it by changing the encoding to UTF-8, while the file (in the current test case) is in ANSI. And it seems that I have two problems: one on the server side and one on the client side.
Server side:
The server side is an ASP.NET Web API 2 app whose controller sends the file using an HttpResponseMessage with StreamContent. The ContentType is correct and corresponds to the actual file type.
But as can be seen in the screenshot below, the server's answer (data.length) is smaller than the actual file size calculated at upload (file.size). It can also be seen that the HTML5 File object has yet another size (f.size).
If I add a CharSet with the value "ANSI" to the ContentType property of the server's response message, the file data is the same as when it was uploaded, but on saving, the resulting file still has the wrong size and is broken:
Client side:
I tried to set the charset using the JS File options, but it didn't help. As can be found here and here, Eli Grey, the author of FileSaver.js, says that
The encoding/charset in the type is just metadata for the browser, not an encoding directive.
which means, if I understood it correctly, that it is impossible to change the encoding of the file.
The end result: I can successfully download files, but they are broken and cannot be opened.
So I have two questions:
How can I save the file "as is" using the File API? At present I cannot use the simple approach with a direct link and the 'download' attribute because of a server-side check for an access_token in the request header. Maybe this is the bottleneck of the problem?
How can I avoid setting a CharSet on the server side and just send the byte array "as is"? While this could be hacked around in some way, I think it's the more critical issue. For example, while the "ANSI" charset solves the problem with the current file, WinMerge shows that its encoding is actually Cyrillic 'Windows-1251', and it could be any other encoding.
P.S. the issue is related to all file types (extensions) except *.txt.
Update
Server side code:
public HttpResponseMessage DownloadAttachment(Guid fileId)
{
    var stream = GetFileStream(fileId);

    var message = new HttpResponseMessage(HttpStatusCode.OK);
    message.Content = new StreamContent(stream);
    message.Content.Headers.ContentLength = file.Size;
    message.Content.Headers.ContentType = new MediaTypeHeaderValue(file.ContentType)
    {
        // without this charset files sent with bigger size
        // than they are as shown on image 1
        CharSet = "ANSI"
    };
    message.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")
    {
        FileName = file.FileName + file.Extension,
        Size = file.Size
    };

    return message;
}
Client side code (TypeScript):
/*
 * Handler for click event on download <a> tag
 */
private downloadFile(file: Models.File) {
    var self = this;
    this.$service.downloadAttachment(this.entityId, file.fileId).then(
        // on success
        function (data, status, headers, config) {
            var fileName = file.fileName + file.extension;
            var clientFile = new File([data], fileName);
            // here's the issue ---^
            saveAs(clientFile, fileName);
        },
        // on fail
        function (error) {
            self.alertError(error);
        });
}
My code is almost the same as in the answers to related questions on SO: instead of setting a direct link in the 'a' tag, I handle the click on it and download the file content via XHR (in my case using the AngularJS $http service). On getting the file content I create a Blob object (in my case I use the File class, which derives from Blob) and then try to save it using FileSaver.js. I also tried the approach with an encoded URL to the Blob in the href attribute, but it only opens a new tab with a file broken in the same way.

I found that the problem is in the Blob class: calling its constructor with 'normal' file data, I get an instance with the 'wrong' size, as can be seen in the first two screenshots. So, as I understand it, my problem is not in the way I try to save the file, but in the way I create it with the File API.
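For reference, here is a minimal sketch of the same download with the response requested as binary data, so the Blob is never built from a decoded string. The service wrapper is bypassed, and the URL and token handling are placeholders; this is a sketch of the idea, not code from the original post:

// Inside the AngularJS controller/service; $http injected as usual.
$http.get('/api/attachments/' + file.fileId, {
    responseType: 'arraybuffer',                        // keep the bytes intact, no text decoding
    headers: { Authorization: 'Bearer ' + accessToken } // the token check mentioned in the question
}).then(function (response) {
    // Build the Blob from the raw bytes and the server-reported content type.
    var blob = new Blob([response.data], { type: response.headers('Content-Type') });
    saveAs(blob, file.fileName + file.extension);       // FileSaver.js, as in the original code
});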
I have the following code to write an image into the filesystem and read it back for display. Prior to trying out the filesystem API, I loaded the whole base64 image into the src attribute and the image displayed fine. The problem is that the images can be large, so if you add a few 5 MB images you run out of memory. So I thought I'd just write them to the tmp storage and only pass the URL into the src attribute.
Trouble is, nothing gets displayed.
Initially I thought it might be something wrong with the URL, but then I went into the filesystem directory, found the image it was referring to and physically replaced it with the real binary image and renamed it to the same as the replaced image. This worked fine and the image is displayed correctly, so the URL looks good.
The only conclusion I can come to is that the writing of the image is somehow wrong - particularly the point where the blob is created. I've looked through the blob API and can't see anything that I may have missed, however I'm obviously doing something wrong because it seems to be working for everyone else.
As an aside, I also tried to store the image in IndexedDB and use the createObjectURL to display the image - again, although the URL looks correct, nothing is displayed on the screen. Hence the attempt at the filesystem API. The blob creation is identical in both cases, with the same data.
The source data is a base64 encoded string as I mentioned. Yes, I did also try to store the raw base64 data in the blob (with and without the prefix) and that didn't work either.
Other info: Chrome version 28, on Ubuntu Linux.
// strip the base64 stuff ...
var regex = /^data.+;base64,/;
if (regex.test(imgobj)) { // it's base64
    imgobj = imgobj.replace(regex, "");
    //imgobj = B64.decode(imgobj);
    imgobj = window.atob(imgobj);
} else {
    console.log("it's already :", typeof imgobj);
}

// store the object into the tmp space
window.requestFileSystem(window.TEMPORARY, 10*1024*1024, function(fs) {
    // check if the file already exists
    fs.root.getFile(imagename, {create: false}, function(fileEntry) {
        console.log("File exists: ", fileEntry);
        callback(fileEntry.toURL(), fileEntry.name);
        //
    }, function (e) { // file doesn't exist
        fs.root.getFile(imagename, {create: true}, function (fe) {
            console.log("file is: ", fe);
            fe.createWriter(function(fw){
                fw.onwriteend = function(e) {
                    console.log("write complete: ", e);
                    console.log("size of file: ", e.total)
                    callback(fe.toURL(), fe.name);
                };
                fw.onerror = function(e) {
                    console.log("Write failed: ", e.toString());
                };
                var data = new Blob([imgobj], {type: "image/png"});
                fw.write(data);
            }, fsErrorHandler);
        }, fsErrorHandler);
    });
    // now create a file
}, fsErrorHandler);
Output from the callback is:
<img class="imgx" src="filesystem:file:///temporary/closed-padlock.png" width="270px" height="270px" id="img1" data-imgname="closed-padlock.png">
I'm at a bit of a standstill unless someone can provide some guidance...
UPDATE
I ran a test to encode and decode the base64 image with both the B64encoder/decoder and atob/btoa -
console.log(imgobj); // this is the original base64 file from the canvas.toDataURL function
/* B64 is broken*/
B64imgobjdecode = B64.decode(imgobj);
B64imgobjencode = B64.encode(B64imgobjdecode);
console.log(B64imgobjencode);
/* atob and btoa decodes and encodes correctly*/
atobimgobj = window.atob(imgobj);
btoaimgobj = window.btoa(atobimgobj);
console.log(btoaimgobj);
The results show that the btoa/atob functions work correctly but the B64 does not - probably because the original encoding didn't use the B64.encode function...
I ran the resulting file in the filesystem's TEMPORARY storage through an online base64 encoder for comparison, and the results are totally different. So the question is: while in the filesystem temp storage, is the image supposed to be an exact copy, or is it padded with 'something' that only the filesystem API understands? Remember that I put the original PNG in the filesystem directory and the image displayed correctly, which tends to indicate that the metadata about the image (e.g. the filename) is held elsewhere...
Can someone who has a working implementation of this confirm if the images are stored as images in the filesystem, or are padded with additional meta-data?
So, to answer my own question: the core problem was in the base64 encoding/decoding. I have since changed this to use AJAX with responseTypes like arraybuffer and blob, and things have started working.
To answer the last part of the question, this is what I've found: in the filesystem tmp storage, yes, the file is supposed to be an exact binary copy; I verified this in Chrome and PhoneGap.
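To illustrate that change, here is a minimal sketch of fetching the image as binary and writing the resulting Blob with the same FileWriter flow; the URL is a placeholder, and callback and fsErrorHandler are the names from the question, so this is a sketch of the approach rather than the exact working code:

// Request the image as a Blob instead of decoding a base64 string.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/images/closed-padlock.png', true);  // placeholder URL
xhr.responseType = 'blob';                            // bytes arrive as a Blob, no atob() needed

xhr.onload = function () {
    window.requestFileSystem(window.TEMPORARY, 10 * 1024 * 1024, function (fs) {
        fs.root.getFile('closed-padlock.png', { create: true }, function (fe) {
            fe.createWriter(function (fw) {
                fw.onwriteend = function () {
                    callback(fe.toURL(), fe.name);    // same callback as in the question
                };
                fw.write(xhr.response);               // write the Blob directly
            }, fsErrorHandler);
        }, fsErrorHandler);
    }, fsErrorHandler);
};
xhr.send();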