how can we do pdf merging using javascript

how can we do pdf merging using javascript - javascript

I wanted to do client side scrpting for merging and splitting pdf, so i wanted to use itextsharp. Can that be used with javascript. I am new to Javascript. Please help me with your valuable suggestions.

I found an entirely client-side solution using the PDF-LIB library: https://pdf-lib.js.org/
It uses the function mergeAllPDFs which takes one parameter: urls, which is an array of urls to the files.
Make sure to include the following in the header:
<script src='https://cdn.jsdelivr.net/npm/pdf-lib/dist/pdf-lib.js'></script>
<script src='https://cdn.jsdelivr.net/npm/pdf-lib/dist/pdf-lib.min.js'></script>
Then:
async function mergeAllPDFs(urls) {
const pdfDoc = await PDFLib.PDFDocument.create();
const numDocs = urls.length;
for(var i = 0; i < numDocs; i++) {
const donorPdfBytes = await fetch(urls[i]).then(res => res.arrayBuffer());
const donorPdfDoc = await PDFLib.PDFDocument.load(donorPdfBytes);
const docLength = donorPdfDoc.getPageCount();
for(var k = 0; k < docLength; k++) {
const [donorPage] = await pdfDoc.copyPages(donorPdfDoc, [k]);
//console.log("Doc " + i+ ", page " + k);
pdfDoc.addPage(donorPage);
}
}
const pdfDataUri = await pdfDoc.saveAsBase64({ dataUri: true });
//console.log(pdfDataUri);
// strip off the first part to the first comma "data:image/png;base64,iVBORw0K..."
var data_pdf = pdfDataUri.substring(pdfDataUri.indexOf(',')+1);
}

There are several client-side JavaScript libraries supporting merging and splitting existing PDFs that I'm aware of which might be useful to you:
PDF Assembler supports this and has a live demo. The Prior Art / Alternatives part of its README is worth checking out, where several other JavaScript PDF libraries (not necessarily client-side) are mentioned and commented on.
pdf-lib is a newer library under active development. Similarly, the Prior Art part is worth checking out.

If you just want to display multiple PDFs as merged into a single document in the browser, this is surely possible with pdf.js - see my answer here.
Surely that example could also be used to show a specific subset of pages, thus giving the user the ability to split it.
However, if you need the result to be available for download, there's no way (afaik) around server-side processing - at least if you want to stay in the open source, free of charge realm.
Update: Using a combination of pdf.js (Mozilla) to render the pdf - which happens on the canvas by default - and jsPdf (parall.ax) one should be able to get the merged result (or even anything else that was drawn on your canvas/es) for download, printing, etc. by using the canvas' toDataUrl() approach found this answer to export page by page (using jsPdf's addPage function)

If you want to do a merge operation then you can use easy-pdf-merge below is the link :
https://www.npmjs.com/package/easy-pdf-merge
Here is an example:
var merge = require('easy-pdf-merge');
merge(['File One.pdf','File Two.pdf'],'File Ouput.pdf',function(err){
if(err)
return console.log(err);
console.log('Successfully merged!');
});
Hope this helps.

No, you can't use iTextSharp (a .Net port of iText, which was written in Java) with JavaScript in a browser.
You could use iText in a Java applet, or there are a couple of PDF libraries for JavaScript if you search (mostly experimental ones, I understand, such as this one Mozilla did, or this one).

You can use pure JS with pdfjs-dist:
function pdfMerge(urls, divRootId) {
//necessário pois para manter as promisses sincronizadas com await
(async function loop() {
for (url_item of urls) {
console.log("loading: " + url_item);
var loadingTask = pdfjsLib.getDocument(url_item);
//sem isso fica dessincronizado
await loadingTask.promise.then(function (pdf) {
pdf.getMetadata().then(function (metaData) {
console.log("pdf (" + urls + ") version: " + metaData.info.PDFFormatVersion); //versão do pdf
}).catch(function (err) {
console.log('Error getting meta data');
console.log(err);
});
console.log("páginas: " + pdf.numPages);
let i = 0;
while (i < pdf.numPages) {
var pageNumber = i;
pdf.getPage(pageNumber).then(function (page) {
var div = document.createElement("div");
var documentosDiv = document.querySelector('#' + divRootId);
documentosDiv.appendChild(div);
var canvas = document.createElement("canvas");
div.appendChild(canvas);
// Prepare canvas using PDF page dimensions
var viewport = page.getViewport({scale: 1, });
//var canvas = document.getElementById('the-canvas');
var context = canvas.getContext('2d');
canvas.height = viewport.height;
canvas.width = viewport.width;
// Render PDF page into canvas context
var renderContext = {
canvasContext: context,
viewport: viewport
};
var renderTask = page.render(renderContext);
renderTask.promise.then(function () {
console.log('Page rendered');
});
});
i++;
}
// Fetch the first page
}, function (reason) {
// PDF loading error
console.error(reason);
});
}
})();
}
html example:
<script src="pdf.js"></script>
<script src="pdf.worker.js"></script>
<h1>Merge example</h1>
<div id="documentos-container"></div>
<style>
canvas {
border:1px solid black;
}
</style>

Here https://codepen.io/hudson-moreira/pen/MWmpqPb you have a working example of adding multiple pdfs and combining them into one, all directly from the front-end using the Hopding/pdf-lib library. The layout needs some adjustment but in this example it is possible to dynamically add and remove files directly from the screen.
In short, this function combines the PDFs, but it is necessary to read them in Uint8Array format before applying this function.
async function joinPdf() {
const mergedPdf = await PDFDocument.create();
for (let document of window.arrayOfPdf) {
document = await PDFDocument.load(document.bytes);
const copiedPages = await mergedPdf.copyPages(document, document.getPageIndices());
copiedPages.forEach((page) => mergedPdf.addPage(page));
}
var pdfBytes = await mergedPdf.save();
download(pdfBytes, "pdfconbined" + new Date().getTime() + ".pdf", "application/pdf");
}

This is a library for javascript to generate PDFs. It works on the client:
http://parall.ax/products/jspdf#
Another solution would be to make an ajax call, generate the PDF on the server with whatever technology you want to use (itextsharp) and return the generated PDF to the client.

I achieved this using pdfjs on browser (i.e. compiled it to browser environment using parcel)
https://github.com/Munawwar/merge-pdfs-on-browser (folder pdfjs-approach)

Related

How to execute / access local file from Thunderbird WebExtension?

I like to write a Thunderbird AddOn that encrypts stuff. For this, I already extracted all data from the compose window. Now I have to save this into files and run a local executable for encryption. But I found no way to save the files and execute an executable on the local machine. How can I do that?
I found the File and Directory Entries API documentation, but it seems to not work. I always get undefined while trying to get the object with this code:
var filesystem = FileSystemEntry.filesystem;
console.log(filesystem); // --> undefined
At least, is there a working AddOn that I can examine to find out how this is working and maybe what permissions I have to request in the manifest.json?
NOTE: Must work cross-platform (Windows and Linux).

The answer is, that WebExtensions are currently not able to execute local files. Also, saving to some local folder on the disk is also not possible.
Instead, you need to add some WebExtension Experiment to your project and there use the legacy APIs. There you can use the IOUtils and FileUtils extensions to reach your goal:
Execute a file:
In your background JS file:
var ret = await browser.experiment.execute("/usr/bin/executable", [ "-v" ]);
In the experiment you can execute like this:
var { ExtensionCommon } = ChromeUtils.import("resource://gre/modules/ExtensionCommon.jsm");
var { FileUtils } = ChromeUtils.import("resource://gre/modules/FileUtils.jsm");
var { XPCOMUtils } = ChromeUtils.import("resource://gre/modules/XPCOMUtils.jsm");
XPCOMUtils.defineLazyGlobalGetters(this, ["IOUtils");
async execute(executable, arrParams) {
var fileExists = await IOUtils.exists(executable);
if (!fileExists) {
Services.wm.getMostRecentWindow("mail:3pane")
.alert("Executable [" + executable + "] not found!");
return false;
}
var progPath = new FileUtils.File(executable);
let process = Cc["#mozilla.org/process/util;1"].createInstance(Ci.nsIProcess);
process.init(progPath);
process.startHidden = false;
process.noShell = true;
process.run(true, arrParams, arrParams.length);
return true;
},
Save an attachment to disk:
In your backround JS file you can do like this:
var f = messenger.compose.getAttachmentFile(attachment.id)
var blob = await f.arrayBuffer();
var t = await browser.experiment.writeFileBinary(tempFile, blob);
In the experiment you can then write the file like this:
async writeFileBinary(filename, data) {
// first we need to convert the arrayBuffer to some Uint8Array
var uint8 = new Uint8Array(data);
uint8.reduce((binary, uint8) => binary + uint8.toString(2), "");
// then we can save it
var ret = await IOUtils.write(filename, uint8);
return ret;
},
IOUtils documentation:
https://searchfox.org/mozilla-central/source/dom/chrome-webidl/IOUtils.webidl
FileUtils documentation:
https://searchfox.org/mozilla-central/source/toolkit/modules/FileUtils.jsm

Google chrome froce async function?

I'm a beginner in developpement, and for my personal knowledge I try to make a website in html, css and js.
The website get a random project from Artstation with Axios at a Json and display it on the web page, I have 24 random image generated
I share you the Js code :
const images = document.getElementsByClassName("pic");
const redirect1 = document.getElementsByClassName("redirect1");
const redirect2 = document.getElementsByClassName("redirect2");
randomButton.addEventListener("click", function () {
for (i = 0; i < images.length; i++) {
images[i].src = "small_square.png";
loadImages(i);
}
});
async function loadImages(i) {
const response = await axios.get("https://www.artstation.com/random_project.json");
images[i].src = response.data.cover.smaller_square_image_url;
redirect1[i].href = response.data.permalink;
redirect2[i].href = response.data.permalink;
}
Don't juge it I know is not verry clean, now I explain my problem, I made the code like that for generate the 24 images simultaneously, the problem is if I not open my console in google chrome he load the function one by one, this is a issue I don't have with firefox
I have made the result in video
Did you know how fix that ?

Using pdf.js to render a PDF but it doesn't work and I don't get any error messages to help me debug the issue

I'm trying to build a Flask app where I upload pdf's and I'm working on previewing them before submitting to the back-end.
The script I'm using is as follows:
const imageUploadValidation = (function () {
"use strict";
pdfjsLib.GlobalWorkerOptions.workerSrc =
"https://mozilla.github.io/pdf.js/build/pdf.js";
const onFilePicked = function (event) {
// Select file Nodelist containing 1 file
const files = event.target.files;
const filename = files[0].name;
if (filename.lastIndexOf(".") <= 0) {
return alert("Please add a valid file!");
}
const fileReader = new FileReader();
fileReader.onload = function (e) {
const pdfData = e.target.result;
let loadingTask = pdfjsLib.getDocument({ data: pdfData })
loadingTask.promise.then(function (pdf) {
console.log("PDF loaded", pdf);
pdf.getPage(1).then((page) => {
console.log("page loaded", page);
// var scale = 1.5;
// var viewport = page.getViewport({ scale: scale });
var iframe = document.getElementById("image-preview");
iframe.src = page
// var context = canvas.getContext("2d");
// canvas.height = viewport.height;
// canvas.width = viewport.width;
// var renderContext = {
// canvasContext: context,
// viewport: viewport,
// };
// var renderTask = page.render(renderContext);
// renderTask.promise.then(function () {
// console.log("Page rendered");
// });
});
})
.catch((error) => {
console.log(error);
});
};
const pdf = fileReader.readAsArrayBuffer(files[0]);
console.log("read as Data URL", pdf);
};
const Constructor = function (selector) {
const publicAPI = {};
const changeHandler = (e) => {
// console.log(e)
onFilePicked(e);
};
publicAPI.init = function (selector) {
// Check for errors.
const fileInput = document.querySelector(selector);
if (!selector || typeof selector !== "string") {
throw new Error("Please provide a valid selector");
}
fileInput.addEventListener("change", changeHandler);
};
publicAPI.init(selector);
return publicAPI;
};
return Constructor;
})();
imageUploadValidation("form input[type=file]");
The loading task promise never seems to run. Everything seems to work up until that point. I'm not familiar with this Promise syntax, so I can't be sure if the problem is there or how I'm passing in the pdf file.
P.S. The commented out code is the original way I had this setup, what
s uncommented was just me testing a different way.

Check Datatype
First you might want to check what your getting back from your FileReader, specifically what is the datatype for pdfData. If you have a look at the documentation (direct link) getDocument is expecting a Unit8Array or a binary string.
Add Missing Parameters
The next problem you have is your missing required parameters in your call to getDocument. Here is the minimum required arguments:
var args = {
url: 'https://example.com/the-pdf-to-load.pdf',
cMapUrl: "./cmaps/",
cMapPacked: true,
}
I have never used the data argument in place of the url but as long as you supply the correct datatype you should be fine. Notice that cMapUrl should be a relative or absolute path to the cmap folder. PDFJS often needs these files to actually interpret a PDF file. Here are all the files from the demo repository (GitHub pages): cmaps You'll need to add these to your project.
Instead of using data I would recommend uploading your files as blobs and then all you have to do is supply the blob URL as url. I am not familiar with how to do that, I just know its possible in modern browsers.
Where Is Your Viewer / You Don't Need iFrame or Canvas
PDFJS just needs a div to place the PDF inside of. It's picky about some of the CSS rules, for exmaple it MUST be positioned absolute, otherwise PDFJS generates the pages as 0px height.
I don't see PDFViewer or PDFLinkService in your code. It looks like you are trying to build the entire viewer from scratch yourself. This is no small endeavor. When you get loadingTask working correctly the response should be handled something like this:
loadingTask.promise.then(
// Success function.
function( doc ) {
// viewer is holding: new pdfjsViewer.PDFViewer()
// linkService is: new pdfjsViewer.PDFLinkService()
viewer.setDocument( doc );
linkService.setDocument( doc );
},
// Error function.
function( exception ) {
// What type of error occurred?
if ( exception.name == 'PasswordException' ) {
// Password missing, prompt the user and try again.
elem.appendChild( getPdfPasswordBox() );
} else {
// Some other error, stop trying to load this PDF.
console.error( exception );
}
/**
* Additional exceptions can be reversed engineered from here:
* https://github.com/mozilla/pdf.js/blob/master/examples/mobile-viewer/viewer.js
*/
}
);
Notice that PDFViewer does all the hard work for you. PDFLinkService is needed if you want links in the PDF to work. You really should checkout the live demo and the example files.
Its a lot of work but these example files specifically can teach you all you need to know about PDFJS.
Example / Sample Code
Here is some sample code from a project I did with PDFJS. The code is a bit advanced but it should help you reverse engineer how PDFJS is working under the hood a bit better.
pdfObj = An object to store all the info and objects for this PDF file. I load multiple PDFs on a single page so I need this to keep them separate from each other.
updatePageInfo = My custome function that is called by PDFJS's eventBus when the user changes pages in the PDF; this happens as they scroll from page to page.
pdfjsViewer.DownloadManager = I allow users to download the PDFs so I need to use this.
pdfjsViewer.EventBus = Handles events like loading, page changing, and so on for the PDF. I am not 100% certain but I think the PDFViewer requires this.
pdfjsViewer.PDFViewer = What handles actually showing your PDF to users. container is the element on the page to render in, remember it must be positioned absolute.
// Create a new PDF object for this PDF.
var pdfObj = {
'container': elem.querySelector('.pdf-view-wrapper'),
'document': null,
'download': new pdfjsViewer.DownloadManager(),
'eventBus': new pdfjsViewer.EventBus(),
'history': null,
'id': id,
'linkService': null,
'loaded': 0,
'loader': null,
'pageTotal': 0,
'src': elem.dataset.pdf,
'timeoutCount': 0,
'viewer': null
};
// Update the eventBus to dispatch page change events to our own function.
pdfObj.eventBus.on( 'pagechanging', function pagechange(evt) {
updatePageInfo( evt );
} );
// Create and attach the PDFLinkService that handles links and navigation in the viewer.
var linkService = new pdfjsViewer.PDFLinkService( {
'eventBus': pdfObj.eventBus,
'externalLinkEnabled': true,
'externalLinkRel': 'noopener noreferrer nofollow',
'externalLinkTarget': 2 // Blank
} );
pdfObj.linkService = linkService;
// Create the actual PDFViewer that shows the PDF to the user.
var pdfViewer = new pdfjsViewer.PDFViewer(
{
'container': pdfObj.container,
'enableScripting': false, // Block embeded scripts for security
'enableWebGL': true,
'eventBus': pdfObj.eventBus,
'linkService': pdfObj.linkService,
'renderInteractiveForms': true, // Allow form fields to be editable
'textLayerMode': 2
}
);
pdfObj.viewer = pdfViewer;
pdfObj.linkService.setViewer( pdfObj.viewer );

MS PowerPoint JavaScript API BETA - How can I insert a new slide with insertSlidesFromBase64(base64File, options)?

I am experimenting with the JavaScript APIs in beta/preview for a MS PowerPoint Add-in.
What I want to achieve is to insert a new slide from a base64-encoded .pttx file into the current document.
I would expect that this is possible with the insertSlidesFromBase64(base64File, options) method, which is documented here:
PowerPoint API doc
I have included https://appsforoffice.microsoft.com/lib/beta/hosted/office.js in the Add-in
I am working on Mac OS 10.15.7
I have updated PowerPoint to the newest version in the Beta Channel. The PowerPoint Version is 16.44 (20111100).
Now, I am not quite sure if the beta API Methods are actually available in my environment.
The bigger issue I am facing though is, that I don't know on which object I can call this method. I think the method should be available somewhere in the context of the current document/presentation?!?
I think a very simple example of how I can insert a "base64EncodedPptx" with
insertSlidesFromBase64("base64EncodedPptx")
would solve the problem.

Your Mac PowerPoint version should have the implementation of this API.
In terms of very simple usage, here are some code snipplets:
await PowerPoint.run(async function(context) {
context.presentation.insertSlidesFromBase64( base64EncodedPptxFileAsString );
context.sync();
});
await PowerPoint.run(async function (context) {
context.presentation.insertSlidesFromBase64( base64EncodedPptxFileAsString,
{
formatting: "UseDestinationTheme",
targetSlideId: "257#",
sourceSlideIds: ["257#3396654126", "258#"]
});
context.sync();
});
From javascript side, you can use a file picker for example to get the base64 string:
If you have this in HTML
<form>
<input type="file" id="file" />
</form>
and this in the script:
$("#file").change(() => tryCatch(useInsertSlidesApi));
async function useInsertSlidesApi() {
const myFile = <HTMLInputElement>document.getElementById("file");
const reader = new FileReader();
reader.onload = async (event) => {
// strip off the metadata before the base64-encoded string
const startIndex = reader.result.toString().indexOf("base64,");
const copyBase64 = reader.result.toString().substr(startIndex + 7);
await PowerPoint.run(async function(context) {
context.presentation.insertSlidesFromBase64(copyBase64);
context.sync();
});
};
// read in the file as a data URL so we can parse the base64-encoded string
reader.readAsDataURL(myFile.files[0]);
}
/** Default helper for invoking an action and handling errors. */
async function tryCatch(callback) {
try {
await callback();
} catch (error) {
// Note: In a production add-in, you'd want to notify the user through your add-in's UI.
console.error(error);
}
}

Stream canvas to imgITools to make ico

So on WinXP I have been having a hard time converting a PNG file to an ICO with canvas. I found this method encodeImage. I don't know if it works but it looks promising but I can't figure out how to feed the image I drew on a canvas into imgITools.decodeData.
What should I use for aImageStream and/or aMimeType?
imgTools.decodeImageData(aImageStream, aMimeType, imgContainer);
This is more of my code:
img['asdf'].file = new FileUtils.File(myPathHere);
let iconStream;
try {
let imgTools = Cc["#mozilla.org/image/tools;1"]
.createInstance(Ci.imgITools);
let imgContainer = { value: null };
imgTools.decodeImageData(aImageStream, aMimeType, imgContainer);
iconStream = imgTools.encodeImage(imgContainer.value,
"image/vnd.microsoft.icon",
"format=bmp;bpp=32");
} catch (e) {
alert('failure converting icon ' + e)
throw("processIcon - Failure converting icon (" + e + ")");
}
let outputStream = FileUtils.openSafeFileOutputStream(img['asdf'].file);
NetUtil.asyncCopy(iconStream, outStream, netutilCallback);

Since you're having a canvas already(?), it would be probably easier to use either on of the following canvas methods:
toDataURI/toDataURIHD
toBlob/toBlobHD
mozFetchAsStream
There is also the undocumented -moz-parse-options:, e.g -moz-parse-options:format=bmp;bpp=32. (ref-tests seem to depend on it, so it isn't going away anytime soon I'd think).
So, here is an example for loading stuff into an ArrayBuffer.
(canvas.toBlobHD || canvas.toBlob).call(canvas, function (b) {
var r = new FileReader();
r.onloadend = function () {
// r.result contains the ArrayBuffer.
};
r.readAsArrayBuffer(b);
}, "image/vnd.microsoft.icon", "-moz-parse-options:format=bmp;bpp=32");
Here is a more complete example fiddle creating a 256x256 BMP icon.
Since you likely want to feed that data into js-ctypes, having an ArrayBuffer is great because you can create pointers from it directly, or write it to a file using OS.File.

We Keep Coding

JavaScript is the programming language of the Web.

how can we do pdf merging using javascript - javascript

I wanted to do client side scrpting for merging and splitting pdf, so i wanted to use itextsharp. Can that be used with javascript. I am new to Javascript. Please help me with your valuable suggestions.

No, you can't use iTextSharp (a .Net port of iText, which was written in Java) with JavaScript in a browser. You could use iText in a Java applet, or there are a couple of PDF libraries for JavaScript if you search (mostly experimental ones, I understand, such as this one Mozilla did, or this one).

This is a library for javascript to generate PDFs. It works on the client: http://parall.ax/products/jspdf# Another solution would be to make an ajax call, generate the PDF on the server with whatever technology you want to use (itextsharp) and return the generated PDF to the client.

I achieved this using pdfjs on browser (i.e. compiled it to browser environment using parcel) https://github.com/Munawwar/merge-pdfs-on-browser (folder pdfjs-approach)

Related

How to execute / access local file from Thunderbird WebExtension?

Google chrome froce async function?

Using pdf.js to render a PDF but it doesn't work and I don't get any error messages to help me debug the issue

MS PowerPoint JavaScript API BETA - How can I insert a new slide with insertSlidesFromBase64(base64File, options)?

Stream canvas to imgITools to make ico

Categories

Resources