How to create a PDF file from any Base64 string? - javascript

I want to pass any Base64 string to a function and get a PDF back. I tried the approach below. It downloads a PDF, but opening it fails with the error
"Failed to load PDF document."
This is what I tried:
let data = "SGVsbG8gd29ybGQ=" //hello world
var bufferArray = this.base64ToArrayBuffer(data);
var binary_string = window.atob(data)
var len = bufferArray.length;
var bytes = new Uint8Array(len);
for (var i = 0; i < len; i++) {
bytes[i] = binary_string.charCodeAt(i);
}
let blob = new Blob([bytes.buffer], { type: 'application/pdf' })
var url = URL.createObjectURL(blob);
window.open(url);
//convert base64 string to arraybuffer
base64ToArrayBuffer(data) {
var bString = window.atob(data);
var bLength = bString.length;
var bytes = new Uint8Array(bLength);
for (var i = 0; i < bLength; i++) {
var ascii = bString.charCodeAt(i);
bytes[i] = ascii;
}
return bytes;
};

Base64 is only an encoding, not a PDF, so hello.b64 will never morph into hello.pdf.
The decoded data needs a PDF header, a page and a trailer as raw bytes; those cannot easily be added by wrapping a Base64 object, as there are too many variables.
The PDF text needs to be carefully scripted as text wrapped around the hello text; see the hello example at https://stackoverflow.com/a/70748286/10802527
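A quick way to see this in code is to decode the Base64 and look at the first bytes before building a Blob. The sketch below is only an illustration: the helper names base64ToBytes and looksLikePdf are made up for this example, and checking for the %PDF- marker only catches obviously wrong input; it does not validate the rest of the file.

```javascript
// Decode a Base64 string into raw bytes (atob yields one character per byte).
function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper: true when the data starts with the "%PDF-" marker.
function looksLikePdf(bytes) {
  const marker = "%PDF-";
  if (bytes.length < marker.length) return false;
  for (let i = 0; i < marker.length; i++) {
    if (bytes[i] !== marker.charCodeAt(i)) return false;
  }
  return true;
}

console.log(looksLikePdf(base64ToBytes("SGVsbG8gd29ybGQ="))); // false: plain text
console.log(looksLikePdf(base64ToBytes("JVBERi0xLjIg")));     // true: "%PDF-1.2 "
```

With a check like this, the question's "SGVsbG8gd29ybGQ=" input can be rejected up front instead of producing a file the viewer cannot load.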
So, as Base64, a minimal PDF looks like this for example:
JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgSGVsbG8gV29ybGQgICApJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G
<iframe type="application/pdf" width="95%" height=150 src="data:application/pdf;base64,JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgSGVsbG8gV29ybGQgICApJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G">frame</iframe>
Try the above, but be aware that it may be blocked by browser security settings, so it will display for some users but not all.
In the comments you asked how the text could be manipulated in JavaScript. My stock answer is that JavaScript cannot generally be used easily to build a PDF or edit Base64 content. However, if you have prepared placeholders, the content can be changed by find and replace. This must be done with care, as the total file length should never change.
As an example, take the above as a prior template and switch the content to:
JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgRmFyZS10aGVlLXdlbGwpJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G
So by finding SGVsbG8gV29ybGQgICAp and replacing it with RmFyZS10aGVlLXdlbGwp we get a text change (it is important that the string length is a multiple of 4 and that the length stays the same):
<iframe type="application/pdf" width="95%" height=150 src="data:application/pdf;base64,JVBERi0xLjIgDQo5IDAgb2JqDQo8PA0KPj4NCnN0cmVhbQ0KQlQvIDMyIFRmKCAgRmFyZS10aGVlLXdlbGwpJyBFVA0KZW5kc3RyZWFtDQplbmRvYmoNCjQgMCBvYmoNCjw8DQovVHlwZSAvUGFnZQ0KL1BhcmVudCA1IDAgUg0KL0NvbnRlbnRzIDkgMCBSDQo+Pg0KZW5kb2JqDQo1IDAgb2JqDQo8PA0KL0tpZHMgWzQgMCBSIF0NCi9Db3VudCAxDQovVHlwZSAvUGFnZXMNCi9NZWRpYUJveCBbIDAgMCAyNTAgNTAgXQ0KPj4NCmVuZG9iag0KMyAwIG9iag0KPDwNCi9QYWdlcyA1IDAgUg0KL1R5cGUgL0NhdGFsb2cNCj4+DQplbmRvYmoNCnRyYWlsZXINCjw8DQovUm9vdCAzIDAgUg0KPj4NCiUlRU9G">frame</iframe>
There are strict rules to follow when using this method:
"Hello World   )" is the template; note the white space included before the closing ), so
"Fare-thee-well)" is as long as a substitution can be in this case.
The source field must therefore be pre-planned to be big enough for the largest replacement, and its plain-text length must be a multiple of 3 (which matches Base64 blocks of 4).
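The rules above can be sketched as a small helper that pads the replacement to the field width before encoding it. This is only an illustration under assumptions: encodeField and swapField are names made up for this example, the text is assumed to be plain ASCII with no parentheses or backslashes (which would need PDF escaping), and the placeholder is assumed to start on a 3-byte boundary of the decoded file, as it does in the template above.

```javascript
// Pad the text with spaces and close it with ")" so it fills the field exactly.
// Assumes ASCII-only text, since btoa only accepts code points 0-255.
function encodeField(text, width) {
  if (width % 3 !== 0) {
    throw new RangeError("field width must be a multiple of 3");
  }
  if (text.length > width - 1) {
    throw new RangeError("text does not fit in the field");
  }
  return btoa(text.padEnd(width - 1, " ") + ")");
}

// Swap one fixed-width field for another without changing the total length.
function swapField(templateBase64, oldText, newText, width) {
  const oldField = encodeField(oldText, width);
  const newField = encodeField(newText, width);
  if (!templateBase64.includes(oldField)) {
    throw new Error("placeholder not found in template");
  }
  return templateBase64.replace(oldField, newField);
}

console.log(encodeField("Hello World", 15));    // SGVsbG8gV29ybGQgICAp
console.log(encodeField("Fare-thee-well", 15)); // RmFyZS10aGVlLXdlbGwp
```

Calling swapField(template, "Hello World", "Fare-thee-well", 15) on the template string would then perform the same find and replace as done by hand above.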

Related

javascript, how could we read a local text file with accent letters into it?

I need to read a local file. I have been studying some threads and have seen various ways to handle it; in most cases there is a file input.
I would need to load it directly through code.
I have studied this thread:
How to read a local text file?
And I could read it.
The surprising part was when I tried to split the lines and words, it showed: � replacing accent letters.
The code I have right now is:
myFileReader.js
function readTextFile(file) {
var rawFile = new XMLHttpRequest();
rawFile.open("GET", file, false);
rawFile.onreadystatechange = function () {
if (rawFile.readyState === 4) {
if (rawFile.status === 200 || rawFile.status == 0) {
allText = rawFile.responseText;
console.log('The complete text is', allText);
let lineArr = intoLines(allText);
let firstLineWords = intoWords(lineArr[0]);
let secondLineWords = intoWords(lineArr[1]);
console.log('Our first line is: ', lineArr[0]);
let atlas = {};
for (let i = 0; i < firstLineWords.length; i++) {
console.log(`Our ${i} word in the first line is : ${firstLineWords[i]}`);
console.log(`Our ${i} word in the SECOND line is : ${secondLineWords[i]}`);
atlas[firstLineWords[i]] = secondLineWords[i];
}
console.log('The atlas is: ', atlas);
let atlasJson = JSON.stringify(atlas);
console.log('Atlas as json is: ', atlasJson);
download(atlasJson, 'atlasJson.txt', 'text/plain');
}
}
};
rawFile.send(null);
}
function download(text, name, type) {
var a = document.getElementById("a");
var file = new Blob([text], {type: type});
a.href = URL.createObjectURL(file);
a.download = name;
}
function intoLines(text) {
// splitting all text data into array "\n" is splitting data from each new line
//and saving each new line as each element*
var lineArr = text.split('\n');
//just to check if it works output lineArr[index] as below
return lineArr;
}
function intoWords(lines) {
var wordsArr = lines.split('" "');
return wordsArr;
}
The question is: how can we handle those special characters, the accented vowels?
I ask because even in the IDE question marks appeared when the txt file was loaded as UTF-8; when I changed to ISO-8859-1 it loaded well.
Also I have studied:
Read UTF-8 special chars from external file using Javascript
Convert special characters to HTML in Javascript
Reading a local text file from a local javascript file?
In addition, could you explain if there is a shorter way to load files in client-side JavaScript? For example, in Java there are FileReader / FileWriter / BufferedWriter. Is there something similar in JavaScript?
Thank you for your help!
It sounds like the file is encoded with ISO-8859-1 (or possibly the very-similar Windows-1252).
There's no BOM or equivalent for those encodings.
The only solutions I can see are:
Use a (local) server and have it return the HTTP Content-Type header with the encoding identified as a charset, e.g. Content-Type: text/plain; charset=ISO-8859-1
Use UTF-8 instead (e.g., open the file in an editor as ISO-8859-1, then save it as UTF-8 instead), as that's the default encoding for XHR response bodies.
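If neither of those is practical, another option is to fetch the raw bytes and decode them explicitly with TextDecoder. A minimal sketch, assuming the file really is ISO-8859-1 (or Windows-1252) and that fetch can reach it, for example from a local server; decodeLatin1 and loadLatin1Text are names made up for this example.

```javascript
// Decode bytes as ISO-8859-1; browsers treat this label as windows-1252.
function decodeLatin1(bytes) {
  return new TextDecoder("iso-8859-1").decode(bytes);
}

// Hypothetical loader: fetch the raw bytes, then decode them ourselves
// instead of letting the request assume UTF-8.
async function loadLatin1Text(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  return decodeLatin1(new Uint8Array(await response.arrayBuffer()));
}

// 0xE9 is "é" in ISO-8859-1 but an incomplete sequence on its own in UTF-8.
const sample = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);
console.log(decodeLatin1(sample));                    // café
console.log(new TextDecoder("utf-8").decode(sample)); // "caf" plus the replacement character
```

The second log line reproduces the � symptom from the question: the bytes are fine, they are just being read with the wrong decoder.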
Put your text in an .html file with the corresponding content type,
for example:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Enclose the text between two markers ("####" in my example), or put it in a div.
Read the HTML page, extract the content and select the text:
var newWindow = window.open(url); //..
var content = newWindow.document.body.innerHTML;
var strSep="####";
var x = content.indexOf(strSep);
x=x+strSep.length;
var y = content.lastIndexOf(strSep);
var points=content.slice(x, y);

JAVASCRIPT decode a base64string (which is an encoded zipfile) to a zipfile and get the zipfiles content by name

The question says it all: I'm receiving a Base64-encoded ZIP file from the server, which I first want to decode to a ZIP file in memory and then get the ZIP file's content, which is a JSON file.
I tried to use JSZip but I'm totally lost in this case... The Base64 string is received in JavaScript via a promise.
So my question in short is: how can I convert a Base64-encoded ZIP file to a ZIP file in memory to get its contents?
BASE64 -> ZIPFILE -> CONTENT
I use this complicated process to save a lot of space in my database. And I don't want to handle this process on the server side, but on the client side with JS.
Thanks in advance!
If anyone is interested in my solution to this problem read my answer right here:
I received the data in a base64-string format, then converted the string to a blob. Then I used the blob-handle to load the zipfile with the JSZip-Library. After that I could just grab the contents of the zipfile. Code is below:
function base64ToBlob(base64) {
let binaryString = window.atob(base64);
let binaryLen = binaryString.length;
let ab = new ArrayBuffer(binaryLen);
let ia = new Uint8Array(ab);
for (let i = 0; i < binaryLen; i++) {
ia[i] = binaryString.charCodeAt(i);
}
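// Blob.type is read-only, so the MIME type is passed to the constructor;
// name and lastModifiedDate belong to File, not Blob, and JSZip does not need them
let bb = new Blob([ab], { type: "application/zip" });
return bb;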
}
To get the contents of the zipfile:
let blob = base64ToBlob(resolved);
let zip = new JSZip();
zip.loadAsync(blob).then(function(zip) {
zip.file("archived.json").async("string").then(function (content) {
console.log(content);
// content is the file as a string
});
}).catch((e) => {
console.error(e); // e.g. the data is not a valid zip file
});
As you can see, first the blob is created from the base64-string. Then the handle is given over to the JSZip loadAsync method. After that you have to set the name of the file which you want to retrieve from the zipfile. In this case it is the file called "archived.json". Now because of the async("string") function the file (file contents) are returned as a string. To further use the extracted string, just work with the content variable.
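JSZip can also decode the Base64 itself: loadAsync accepts a base64 option, which removes the need for the manual Blob conversion. The sketch below assumes JSZip 3.x is already loaded and is passed in as JSZipLib; readJsonFromBase64Zip is a name made up for this example.

```javascript
// JSZipLib is the JSZip constructor (assumed to be JSZip 3.x).
async function readJsonFromBase64Zip(JSZipLib, base64, entryName) {
  // { base64: true } tells JSZip that the input string is Base64-encoded.
  const zip = await JSZipLib.loadAsync(base64, { base64: true });
  const entry = zip.file(entryName); // null when the name is not in the archive
  if (entry === null) {
    throw new Error("No entry named " + entryName + " in the archive");
  }
  const text = await entry.async("string");
  return JSON.parse(text);
}
```

With the names from the answer above, a call would look like readJsonFromBase64Zip(JSZip, resolved, "archived.json").then(console.log).catch(console.error).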

Saving binary data in a browser without it getting UTF8 encoded on download

My web app receives data in the form of a Base64-encoded string, which it decodes using atob and stores via URL.createObjectURL(). This data is then downloaded via the right-click save-as dialog. The downloaded file always matches the source file when the source file is ASCII encoded. However, this isn't the case when the source file is just plain binary data. A diff of a non-ASCII downloaded file against its source file appears to show that the downloaded file is UTF-8 encoded. How can this problem be fixed? Please note, I'm locked into using Firefox 10.
Convert the string to an ArrayBuffer and it should work. If there is any way that you can get the data into an ArrayBuffer directly without passing through a string, that would be the best solution.
The following code is tested in FF10 and uses the now-obsolete MozBlobBuilder.
var str="",
idx, len,
buf, view, blobbuild, blob, url,
elem;
// create a test string
for (var idx = 0; idx < 256; ++idx) {
str += String.fromCharCode(idx);
}
// create a buffer
buf = new ArrayBuffer(str.length);
view = new Uint8Array(buf);
// convert string to buffer
for (idx = 0, len = str.length; idx < len; ++idx) {
view[idx] = str.charCodeAt(idx);
}
blobbuild = new MozBlobBuilder();
blobbuild.append(buf);
blob = blobbuild.getBlob('application/octet-stream');
url = URL.createObjectURL(blob);
elem = document.createElement('a');
elem.href = url;
elem.textContent = 'Test';
document.body.appendChild(elem);

Change JavaScript string encoding

At the moment I have a large JavaScript string I'm attempting to write to a file, but in a different encoding (ISO-8859-1). I was hoping to use something like downloadify. Downloadify only accepts normal JavaScript strings or base64 encoded strings.
Because of this, I've decided to compress my string using JSZip which generates a nicely base64 encoded string that can be passed to downloadify, and downloaded to my desktop. Huzzah! The issue is that the string I compressed, of course, is still the wrong encoding.
Luckily JSZip can take a Uint8Array as data, instead of a string. So is there any way to convert a JavaScript string into a ISO-8859-1 encoded string and store it in a Uint8Array?
Alternatively, if I'm approaching this all wrong, is there a better solution all together? Is there a fancy JavaScript string class that can use different internal encodings?
Edit: To clarify, I'm not pushing this string to a webpage so it won't automatically convert it for me. I'm doing something like this:
var zip = new JSZip();
zip.file("genSave.txt", result);
return zip.generate({compression:"DEFLATE"});
And for this to make sense, I would need result to be in the proper encoding (and JSZip only takes strings, arraybuffers, or uint8arrays).
Final Edit (This was -not- a duplicate question because the result wasn't being displayed in the browser or transmitted to a server where the encoding could be changed):
This turned out to be a little more obscure than I had thought, so I ended up rolling my own solution. It's not nearly as robust as a proper solution would be, but it'll convert a JavaScript string into windows-1252 encoding, and stick it in a Uint8Array:
var enc = new string_transcoder("windows-1252");
var tenc = enc.transcode(result); //This is now a Uint8Array
You can then either use it in the array like I did:
//Make this into a zip
var zip = new JSZip();
zip.file("genSave.txt", tenc);
return zip.generate({compression:"DEFLATE"});
Or convert it into a windows-1252 encoded string using this string encoding library:
var string = new TextDecoder("windows-1252").decode(tenc);
To use this function, either use:
<script src="//www.eu4editor.com/string_transcoder.js"></script>
Or include this:
function string_transcoder (target) {
this.encodeList = encodings[target];
if (this.encodeList === undefined) {
return undefined;
}
//Initialize the easy encodings
if (target === "windows-1252") {
var i;
for (i = 0x0; i <= 0x7F; i++) {
this.encodeList[i] = i;
}
for (i = 0xA0; i <= 0xFF; i++) {
this.encodeList[i] = i;
}
}
}
string_transcoder.prototype.transcode = function (inString) {
var res = new Uint8Array(inString.length), i;
for (i = 0; i < inString.length; i++) {
var temp = inString.charCodeAt(i);
var tempEncode = (this.encodeList)[temp];
if (tempEncode === undefined) {
return undefined; //This encoding is messed up
} else {
res[i] = tempEncode;
}
}
return res;
};
var encodings = {
"windows-1252": {0x20AC:0x80, 0x201A:0x82, 0x0192:0x83, 0x201E:0x84, 0x2026:0x85, 0x2020:0x86, 0x2021:0x87, 0x02C6:0x88, 0x2030:0x89, 0x0160:0x8A, 0x2039:0x8B, 0x0152:0x8C, 0x017D:0x8E, 0x2018:0x91, 0x2019:0x92, 0x201C:0x93, 0x201D:0x94, 0x2022:0x95, 0x2013:0x96, 0x2014:0x97, 0x02DC:0x98, 0x2122:0x99, 0x0161:0x9A, 0x203A:0x9B, 0x0153:0x9C, 0x017E:0x9E, 0x0178:0x9F}
};
Test the following script:
<script type="text/javascript" charset="utf-8">
The best solution for me was posted here and this is my one-liner:
<!-- Required for non-UTF encodings (quite big) -->
<script src="encoding-indexes.js"></script>
<script src="encoding.js"></script>
...
// windows-1252 is just one typical example encoding/transcoding
let transcodedString = new TextDecoder( 'windows-1252' ).decode(
new TextEncoder().encode( someUtf8String ))
or this if the transcoding has to be applied on multiple inputs reusing the encoder and decoder:
let srcArr = [ ... ] // some UTF-8 string array
let encoder = new TextEncoder()
let decoder = new TextDecoder( 'windows-1252' )
let transcodedArr = srcArr.map( s =>
decoder.decode( encoder.encode( s )) )
(The slightly modified other answer from related question:)
This is what I found after a more specific Google search than just
UTF-8 encode/decode. so for those who are looking for a converting
library to convert between encodings, here you go.
github.com/inexorabletash/text-encoding
var uint8array = new TextEncoder().encode(str);
var str = new TextDecoder(encoding).decode(uint8array);
Paste from repo readme
All encodings from the Encoding specification are supported:
utf-8 ibm866 iso-8859-2 iso-8859-3 iso-8859-4 iso-8859-5 iso-8859-6
iso-8859-7 iso-8859-8 iso-8859-8-i iso-8859-10 iso-8859-13 iso-8859-14
iso-8859-15 iso-8859-16 koi8-r koi8-u macintosh windows-874 windows-1250
windows-1251 windows-1252 windows-1253 windows-1254 windows-1255
windows-1256 windows-1257 windows-1258 x-mac-cyrillic gb18030 hz-gb-2312
big5 euc-jp iso-2022-jp shift_jis euc-kr replacement utf-16be utf-16le
x-user-defined
(Some encodings may be supported under other names, e.g. ascii,
iso-8859-1, etc. See Encoding for additional labels for each
encoding.)
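Note that the TextEncoder built into browsers only produces UTF-8, so encoding to a legacy encoding needs either a library like the one above or a lookup table. One way to get the table without typing it by hand is to invert what TextDecoder does for each of the 256 byte values. This is a sketch for single-byte encodings only, and buildSingleByteEncoder is a name made up for this example.

```javascript
// Build a string-to-bytes encoder for a single-byte encoding by decoding
// every byte value once and inverting the result.
function buildSingleByteEncoder(label) {
  const decoder = new TextDecoder(label);
  const charToByte = new Map();
  for (let byte = 0; byte < 256; byte++) {
    const ch = decoder.decode(new Uint8Array([byte]));
    // Skip bytes the encoding leaves undefined (decoded as U+FFFD).
    if (ch.length === 1 && ch !== "\uFFFD") {
      charToByte.set(ch, byte);
    }
  }
  return function encode(str) {
    const out = new Uint8Array(str.length);
    for (let i = 0; i < str.length; i++) {
      const byte = charToByte.get(str[i]);
      if (byte === undefined) {
        throw new RangeError("Cannot encode character at index " + i);
      }
      out[i] = byte;
    }
    return out;
  };
}

const toWindows1252 = buildSingleByteEncoder("windows-1252");
console.log(toWindows1252("café")); // bytes 99, 97, 102, 233
```

The resulting Uint8Array can be handed to JSZip in the same way as the output of string_transcoder in the question.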

Downloading generated binary content contains utf-8 encoded chars in disk-file

I am trying to save a generated zip file to disk from within a Chrome extension with the following code:
function sendFile (nm, file) {
var a = document.createElement('a');
a.href = window.URL.createObjectURL(file);
a.download = nm; // file name
a.style.display = 'none';
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
}
function downloadZip (nm) {
window.URL = window.webkitURL || window.URL;
var content;
content = zip.generate();
var file = new Blob ([content], {type:'application/base64'});
sendFile ("x.b64", file);
content = zip.generate({base64:false});
var file = new Blob ([content], {type:'application/binary'});
sendFile ("x.zip", file);
}
Currently this saves the contents of my zip in two versions, the first one is base64 encoded, and when I decode it with base64 -d the resulting zip is ok.
The second version should just save the raw data (the zip file), but this raw data arrives UTF-8 encoded on my disk (each value >= 0x80 is prepended with 0xc2). So how do I get rid of this UTF-8 encoding? I tried various type strings like application/zip, or omitting the type info completely; it always arrives UTF-8 encoded. I am also curious how to make the browser convert the Base64 data (the first case) by itself, so that it arrives as decoded binary data on my disk... I'm using Chrome Version 23.0.1271.95 m
PS: I analysed the second content with a hexdump utility inside the browser: it does not contain UTF-8 encodings (or my hexdump calls something which does an implicit conversion). For completeness (sorry, it's just transposed from C, so it might not be that cool JS code), I append it here:
function hex (bytes, val) {
var ret="";
var tmp="";
for (var i=0;i<bytes;i++) {
tmp=val.toString (16);
if (tmp.length<2)
tmp="0"+tmp;
ret=tmp+ret;
val>>=8;
}
return ret;
}
function hexdump (buf, len) {
var p=0;
while (p<len) {
line=hex (2,p);
var i;
for (i=0;i<16;i++) {
if (i==8)
line +=" ";
if (p+i<len)
line+=" "+hex(1,buf.charCodeAt(p+i));
else
line+=" ";
}
line+=" |";
for (i=0;i<16;i++) {
if (p+i<len) {
var cc=buf.charCodeAt (p+i);
line+= ((cc>=32)&&(cc<=127)&&(cc!='|')?String.fromCharCode(cc):'.');
}
}
p+=16;
console.log (line);
}
}
From working draft:
If element is a DOMString, run the following substeps:
Let s be the result of converting element to a sequence of Unicode characters [Unicode] using the algorithm for doing so in WebIDL
[WebIDL].
Encode s as UTF-8 and append the resulting bytes to bytes.
So strings are always converted to UTF-8, and there is no parameter to affect this. This doesn't affect base64 strings because they only contain characters that match single byte per codepoint, with the codepoint and byte having the same value. Luckily Blob exposes lower level interface (direct bytes), so that limitation doesn't really matter.
You could do this:
var binaryString = zip.generate({base64: false}), //By glancing over the source I trust the string is in "binary" form
len = binaryString.length, //I.E. having only code points 0 - 255 that represent bytes
bytes = new Uint8Array(len);
for( var i = 0; i < len; ++i ) {
bytes[i] = binaryString.charCodeAt(i);
}
var file = new Blob([bytes], {type:'application/zip'});
sendFile( "myzip.zip", file );
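The effect of that UTF-8 step is easy to observe by comparing Blob sizes. A small sketch, assuming an environment with the standard Blob constructor; binaryStringToBytes is a name made up for this example.

```javascript
// Copy a "binary" string (code points 0-255 only) into a byte array.
function binaryStringToBytes(binaryString) {
  return Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
}

const binaryString = "PK\u0003\u0004\u0080\u00ff"; // six bytes' worth of data

// Passing the string directly: 0x80 and 0xFF become two bytes each.
const asText = new Blob([binaryString]);
// Passing bytes: the data is stored unchanged.
const asBytes = new Blob([binaryStringToBytes(binaryString)]);

console.log(asText.size);  // 8
console.log(asBytes.size); // 6
```

The two extra bytes in the first Blob are exactly the kind of added prefix bytes seen in the hexdump of the saved file.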
