Create PDF in browser: Custom Font


I know of two libraries for creating PDF files with JavaScript in the browser ([1], [2]), but neither of them can embed a custom font in the document.
[2] lets you set a font, but only one of the standard PDF fonts (Courier, Times-Roman), and neither library is actively developed anymore.
Does anyone know of a library for creating PDF files in the browser that is still actively developed and supports embedding custom fonts?
Cheers,
Manuel

OK, it looks like the current implementations do not support it.
So I'm porting libharu to JavaScript using Emscripten:
Project:
https://github.com/manuels/hpdf.js
Demos:
http://manuels.github.com/hpdf.js/

If anyone else is looking, there's also this: https://github.com/devongovett/pdfkit
It looks more actively developed than hpdf.js, BUT I couldn't get it working with just browserify and the brfs Node module, as the docs suggest. For one thing, brfs only works with static paths, and it also didn't seem to output the raw font data properly. I had to do the following to get it to work:
1. If your font has no cmap: open it in FontForge, then export it as TTF (with the glyph map enabled in the export options).
2. Get the contents of the TTF file as a base64 string (I used Python to read the file, base64-encode it, remove the line breaks, and save the result to another file).
3. Paste the string into your script as a variable.
4. Create a Buffer object and pass it to PDFKit as the font, i.e.
// Base64 of the TTF file, pasted in as a single-line string
var fontCenturyGothicBase64 = "your base64 encoded string here";
// Buffer.from replaces the deprecated new Buffer(string, encoding) constructor
var fontCenturyGothic = Buffer.from(fontCenturyGothicBase64, 'base64');
doc.font(fontCenturyGothic);
5. Run browserify on the JavaScript file (Buffer is a Node object rather than pure JS).
Maybe it's possible without the Buffer object (and thus without browserify); I haven't tried.
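The base64 step does not have to be done in Python. Here is a minimal sketch of the same conversion in Node, assuming Node is available on the build machine; the helper name and the file names are placeholders, not part of the original answer.

```javascript
const fs = require("fs");

// Hypothetical helper: read a font file and return its contents as a
// single base64 string (Node's encoder emits no line breaks).
function fontFileToBase64(fontPath) {
  return fs.readFileSync(fontPath).toString("base64");
}

// Example usage (placeholder file names):
// fs.writeFileSync("font-base64.txt", fontFileToBase64("CenturyGothic.ttf"));
```

The resulting string can then be pasted into the script as the base64 variable described above.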

Related

Turn CSV file into array using Vanilla Javascript without input element

I am trying to read a CSV file and turn it into an array. The file is local and could be in the same folder as the script file. Since I am writing JSX for Photoshop, I can't use any other library. There are a lot of tutorials out there that use an input element, which is not what I need. The path of the file can be hard-coded. What I am trying to do is read the CSV and take out some data. Please advise!
Let me explain it clearly!
I am writing a JSX script for Photoshop, which has no browser elements such as an input tag. It must be pure JavaScript with no library such as jQuery. I did a lot of Google searching, and what people do is use an input tag in the browser to let the user select the CSV file. I just want the file path to be hard-coded; it is a fixed path and filename. I haven't found any tutorial for reading a CSV file and turning it into an array with vanilla JavaScript.

You can use the File class. How this works is explained in the ExtendScript toolkit docs which are installed on your computer alongside Creative Cloud. An online version can also be found here. (The scripting guide references this under the File object on page 110, referring to a section about JavaScript on different platforms on page 32, which then refers to the ExtendScript docs.)
Example:
var file = new File("/c/Users/user/Desktop/text.csv");
file.encoding = 'UTF-8';
var contents = "";
// open() returns false if the file could not be opened
if (file.open("r")) {
    contents = file.read();
    file.close();
}
alert(contents);
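The example above only reads the file into a string; the question also asks for an array. Here is a minimal sketch of that step, written in ES3-style JavaScript so it should also run in ExtendScript. The function name parseCsv is hypothetical, and the parser assumes comma separators and double-quoted fields with "" as the escaped quote.

```javascript
// Hypothetical helper: turn CSV text into an array of rows,
// where each row is an array of field strings.
function parseCsv(text) {
  var rows = [], row = [], field = "", inQuotes = false;
  for (var i = 0; i < text.length; i++) {
    var ch = text.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (text.charAt(i + 1) === '"') {
          field += '"'; // doubled quote inside a quoted field
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      // treat \r\n as a single line break
      if (ch === "\r" && text.charAt(i + 1) === "\n") i++;
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // flush the last row if the text does not end with a line break
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}
```

With the string read from the file, parseCsv(contents) gives rows that can be indexed as rows[rowIndex][columnIndex]. Note that a blank line in the input yields a row containing one empty string.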

Javascript for multiple file conversion in Adobe Acrobat failing

I am trying to convert multiple files from PDF to plain text in Adobe. I found a solution online that reads:
/* PDF to Text */
this.saveAs("C:\Users\sandr\Dropbox\Light\Doctorate\Supervisor meetings\2018\October\Method\test_corpus\2sleep.tar\2sleep\2sleep\pdf\txt_output" + this.documentFileName + ".txt","com.adobe.acrobat.plain-text");
The script runs but it always gives an error saying it could not open the file and it doesn't actually create the text file. Does anyone know why this is?

Because Acrobat JavaScript needs to run on both Mac and Windows, you need to use platform-independent paths. Windows-specific file names and paths won't work, and the backslashes in your string are treated as escape characters by JavaScript anyway. For example...
this.saveAs("/C/Users/user/Dropbox/foo.pdf");
Also...
this.documentFileName
will have a ".pdf" extension at the end; you may want to trim that before appending the ".txt".
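Both points can be combined in a small path builder. This is only a sketch: buildTextPath is a hypothetical helper name and the folder is a placeholder. It strips the ".pdf" extension and also inserts the separator between folder and file name, which the script in the question leaves out.

```javascript
// Hypothetical helper: build a device-independent output path for the text file.
function buildTextPath(folder, documentFileName) {
  // drop a trailing ".pdf" (any letter case) from the document name
  var baseName = documentFileName.replace(/\.pdf$/i, "");
  // avoid a doubled separator if the folder already ends with "/"
  var dir = folder.replace(/\/+$/, "");
  return dir + "/" + baseName + ".txt";
}

// Inside Acrobat this might be used as:
// this.saveAs(buildTextPath("/C/Users/user/Dropbox", this.documentFileName),
//             "com.adobe.acrobat.plain-text");
```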

What is this encoding / why are these .txt files not plain text?

I am browsing the deck.gl repo. It ships with some examples with text files, for example this one. These files have a .txt extension, but aren't plain text:
!OohmwFjqwbMg#[?ADKJYXF#^?N?FAD
=wnmwFvvwbM_#WNg####C?C_#UA?AD#?Of#_#UTu#??BK?A??FUVP?#JF?AVP?#JF?AVPGTA?EL#?
=urmwF|swbM_#UFS##BK?C#C#A#E?CIGA?GE?CIGA#CF?#ABA#CJ##GR]Ud#wA\T?#DB?AXP?#DB?A\T
<aymwFnvwbMaAOKCA#OKPk#CCDKAADKAADKAADKAADKAAL_#fBjAIVCCEL
The examples also contain JavaScript files that look as though they are used to decode these files, for example this one for the file above.
What exactly is going on here? I assume this is a way of reducing the size of the data, but why not just rely on browser gzipping?
And why use a plain text extension when the file is clearly not plain text? And why have a custom decoder at all?

It looks like a custom encoding that uses byte values to encode coordinates/GeoJSON features.
For example, this line from /dist-demo/data/building-data.txt:
!GqgmwFrhwbM}C}##K#IBO#IlBh#BOBMn#PHBGd#KC
is decoded using the decodePolyline() utility function into this array:
[
[0.00004,0.00001],
[40.70541,0.00002],
[40.7062,-74.01624],
[40.70619,-74.01593],
[40.70618,-74.01587],
[40.70616,-74.01582],
[40.70615,-74.01574],
[40.7056,-74.01569],
[40.70558,-74.0159],
[40.70556,-74.01582],
[40.70532,-74.01575],
[40.70527,-74.01584],
[40.70531,-74.01586],
[40.70537,-74.01605],
[40.70537,-74.01603]
]
which is substantially larger in JSON format.
So my guess would be that the main reason is to be able to use smaller data files that are still portable/cacheable. It's still line-based clear text, so it's diffable as well.
Also, these files are still compressible. I assume that a full JSON file is not only larger to begin with but also exhibits less favorable compression characteristics than this file. A quick test on building-data.txt shows a compression ratio of roughly 2:1 for gzip/deflate (139,089 bytes to 72,660 bytes compressed). The compression result for the same file in raw JSON won't be anywhere near that.
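deck.gl's own decodePolyline() is not reproduced here. As a rough illustration of how a text encoding like this can pack coordinates into printable characters, the sketch below implements the widely used encoded polyline scheme: each value is a zigzag-signed delta from the previous one, split into 5-bit chunks offset by 63. Whether the deck.gl demo files follow exactly this scheme is an assumption, and the function name is hypothetical.

```javascript
// Sketch of the common "encoded polyline" decoding, not deck.gl's actual decoder.
function decodePolylineSketch(encoded, precision) {
  var factor = Math.pow(10, precision === undefined ? 5 : precision);
  var index = 0, lat = 0, lng = 0, points = [];

  // Read one variable-length value and undo the zigzag sign encoding.
  function nextDelta() {
    var result = 0, shift = 0, chunk;
    do {
      chunk = encoded.charCodeAt(index++) - 63;
      result |= (chunk & 0x1f) << shift;
      shift += 5;
    } while (chunk >= 0x20); // bit 0x20 means "more chunks follow"
    return (result & 1) ? ~(result >> 1) : (result >> 1);
  }

  while (index < encoded.length) {
    lat += nextDelta(); // values are stored as deltas from the previous point
    lng += nextDelta();
    points.push([lat / factor, lng / factor]);
  }
  return points;
}
```

Because each point is stored as a small delta in a few characters, nearby coordinates cost far fewer bytes than their JSON decimal form, which is consistent with the size argument above.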

Windows UWP apps via Cordova: convert canvas into stream for InkRecognizer in Javascript

I'm developing an application in Cordova for Android and Windows, and I'm struggling with the recognition of text and numbers in a canvas element on the Windows platform (W10).
I've spent the last couple of days trying to use the Windows.Media.OCR namespace to recognize handwritten numbers on my HTML5 canvas scribble pad, as you can see here in another SO question.
I then found the Windows.UI.Input.Inking namespace, which has a few classes available for JavaScript solutions. I found that there is an InkManager that can recognize InkStrokes either in its own collection or in an InkRecognizerContainer.
InkRecognizerContainer has a "loadAsync()" method that accepts an input stream. So I thought I'd just convert the canvas to a stream, load it, and use the InkManager to recognize this container.
Unfortunately, when I pass in the HTML5 canvas converted to a stream, it throws "WIN RT: Unspecified Error", and not in the callbacks; it just crashes the app.
var blob = canvas.msToBlob();
var randomAccessStream = blob.msDetachStream();
var inkStrokeContainer = new Windows.UI.Input.Inking.InkStrokeContainer();
inkStrokeContainer.loadAsync(randomAccessStream).done(function () {
    debugger;
}, function (error) {
    console.log(error);
});
Any help would be greatly appreciated as I'm spending way too much time on this.

InkStrokeContainer.LoadAsync requires a file with ink stroke information, not an arbitrary bitmap. Generally this will be an ISF (Ink Serialized Format) file saved out from a previous InkStrokeContainer. ISF files include stroke information as metadata in a GIF file, so they can be displayed by normal GIF viewers, but typical GIF files do not include ISF data and cannot be loaded into InkStrokeContainers.
InkManager does handwriting recognition, not OCR. It requires individual stroke information and takes into account properties such as stroke order and direction. To use it you'll need to pass pointer information to the InkManager, typically as the input occurs, so the InkManager can build the strokes to recognize.
Take a look at the Simplified Ink Sample for an example. The JavaScript version uses WinJS rather than Cordova, but it shouldn't be too hard to convert. The inking is Windows specific, so you'll need to put this in a platform specific part of your app.
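As a rough sketch of what "passing pointer information to the InkManager" can look like, the helper below forwards pointer events from a canvas to an InkManager. The helper name attachInkCapture is hypothetical, and the sketch assumes that pointer events in a Windows JavaScript app expose a currentPoint property holding the PointerPoint, as the ink samples do; it is not taken from the sample itself.

```javascript
// Hypothetical helper: feed pointer input from a canvas into an InkManager
// so that it can build strokes as the user writes.
function attachInkCapture(canvas, inkManager) {
  var activePointerId = null;

  canvas.addEventListener("pointerdown", function (evt) {
    activePointerId = evt.pointerId;
    inkManager.processPointerDown(evt.currentPoint);
  });

  canvas.addEventListener("pointermove", function (evt) {
    // only track the pointer that started the current stroke
    if (evt.pointerId === activePointerId) {
      inkManager.processPointerUpdate(evt.currentPoint);
    }
  });

  canvas.addEventListener("pointerup", function (evt) {
    if (evt.pointerId === activePointerId) {
      inkManager.processPointerUp(evt.currentPoint);
      activePointerId = null;
    }
  });
}

// On Windows this might be used as:
// var inkManager = new Windows.UI.Input.Inking.InkManager();
// attachInkCapture(canvas, inkManager);
// inkManager.recognizeAsync(Windows.UI.Input.Inking.InkRecognitionTarget.all)
//     .done(function (results) { /* inspect recognition results */ });
```

Drawing the strokes on the canvas still has to be done separately; this only records them for recognition.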

Generate a Word document in JavaScript with Docx.js?

I am trying to use docx.js to generate a Word document but I can't seem to get it to work.
I copied the raw code into the Google Chrome console after amending line 247 to fix a "'textAlign' undefined error"
if (inNode.style && inNode.style.textAlign){..}
This makes the function convertContent available. Its result is an object, e.g.
JSON.stringify( convertContent($('<p>Word!</p>')[0]) )
Results in -
"{"string":
"<w:body>
<w:p>
<w:r>
<w:t xml:space=\"preserve\">Word!</w:t>
</w:r>
</w:p>
</w:body>"
,"charSpaceCount":5
,"charCount":5,
"pCount":1}"
I copied
<w:body>
<w:p>
<w:r>
<w:t xml:space="preserve">Word!</w:t>
</w:r>
</w:p>
</w:body>
into Notepad++ and saved it as a file with an extension of 'docx', but when I open it in MS Word it says 'cannot be opened because there is a problem with the contents'.
Am I missing some attribute or XML tags or something?

You can generate a Docx Document from a template using docxtemplater (library I have created).
It can replace tags by their values (like a template engine), and also replace images in a paid version.
Here is a demo of the templating engine: https://docxtemplater.com/demo/

This code can't work on JSFiddle because of the Ajax calls to local files (everything that is in the blank folder), unless you enter all the files in ByteArray format and use the JSFiddle echo API: http://doc.jsfiddle.net/use/echo.html

I know this is an older question and you already have an answer, but I struggled getting this to work for a day, so I thought I'd share my results.
Like you, I had to fix the textAlign bug by changing the line to this:
if (inNode.style && inNode.style.textAlign)
Also, it didn't handle HTML comments. So, I had to add the following line above the check for a "#text" node in the for loop:
if (inNodeChild.nodeName === '#comment') continue;
To create the docx was tricky since there is absolutely no documentation on this thing as of yet. But looking through the code, I see that it is expecting the HTML to be in a File object. For my purposes, I wanted to use the HTML I rendered, not some HTML file the user has to select to upload. So I had to trick it by making my own object with the same property that it was looking for and pass it in. To save it to the client, I use FileSaver.js, which requires a blob. I included this function that converts base64 into a blob. So my code to implement it is this:
var result = docx({ DOM: $('#myDiv')[0] });
var blob = b64toBlob(result.base64, "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
saveAs(blob, "test.docx");
In the end, this would work for simple Word documents, but it isn't nearly sophisticated enough for anything more. I couldn't get any of my styles to render and I didn't even attempt to get images working. I've since abandoned this approach and am now researching DocxgenJS or some server-side solution.
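The b64toBlob helper called above is not shown in this answer. Here is a minimal sketch of such a helper, assuming an environment that provides atob, Uint8Array and Blob; it is an illustration, not necessarily the function the answer originally linked to.

```javascript
// Sketch of a base64-to-Blob helper.
function b64toBlob(base64, contentType) {
  // atob yields a "binary string": one character per byte
  var binary = atob(base64);
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: contentType || "" });
}
```

The returned Blob can be passed straight to FileSaver's saveAs(blob, fileName).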

You may find this link useful,
http://evidenceprime.github.io/html-docx-js/
An online demo here:
http://evidenceprime.github.io/html-docx-js/test/sample.html

You are doing the correct thing codewise, but your file is not a valid docx file. If you look through the docx() function in docx.js, you will see that a docx file is actually a zip containing several xml files.

I am using the Open XML SDK for JavaScript.
http://ericwhite.com/blog/open-xml-sdk-for-javascript/
Basically, I keep an empty docx file on the web server as a template for new documents.
When the user clicks "new docx file" in the browser, I retrieve the empty docx file, convert it to Base64, and return it as the Ajax response.
In the client script, you convert the Base64 string to a byte array and use openxmlsdk.js to load the byte array as a JavaScript OpenXmlPackage object.
Once the package is loaded, you can use the regular OpenXmlPart API to build a real document (inserting images, creating tables and rows).
The last step is to stream it out to the end user as a document. This part is security related: in my code I send it back to the web server, where it is saved temporarily, and prepare an HTTP response that prompts the end user to download it.
Check the URL above; it has useful samples of doing this in JavaScript.
