I would like to implement an in-browser Microsoft Word document merge feature that will convert the merged document into PDF and offer it to the user for download. I would like this process to be supported in Google Chrome and Firefox. Here is how I would like it to work:
1. Client-side JavaScript obtains the Word template document in docx format, either from a server, or by asking the user for a file upload (which it can then read using the FileReader API).
2. The JavaScript uses its local data structures (e.g., data lists it has obtained via Ajax) to expand the template into a document. It can do this either directly, by unzipping the docx file and processing its contents, or using DOCx.js. The template expansion is just a matter of substituting template variables with values obtained from the local data structures.
3. The JavaScript then converts the expanded template into PDF.
4. The JavaScript offers the PDF file to the user for download, e.g., using Downloadify.
The difficulty I am having is in step 3. My understanding (based on all the Googling I have done so far) is that I have the following options:
Require that the local machine is a Windows machine, and invoke Word on it, to convert to PDF. This can be done using a little bit of scripting using WScript.shell, and it looks doable with Internet Explorer. But based on what I have read, it doesn't look like I can call WScript.shell from within either Chrome or Firefox, because of their security constraints.
I am open to trying Silverlight to do the conversion, but I have not found enough documentation on how to do this. Ideally, if I used Silverlight, I would like to write the Silverlight code in JavaScript, because (a) I don't know much CSharp, and (b) I think it would be much easier in JavaScript.
Create a web service that will convert a given docx file to a pdf file, and invoke that service via Ajax. I would rather not do this, if possible, for a few reasons: (a) I tried using docx4java (I am a reasonably skilled Java programmer) but the conversion process is far too slow, and it does not preserve document content very well; and (b) I would like to avoid a call out to the network, to avoid security issues. It does seem possible to write a little service on a Windows server for doing the conversion, and if there is no other good option, I might go that route.
If I have been unclear about anything, please let me know. I would appreciate your ideas and feedback.
I love command line tools.
Load the doc to your server and use LibreOffice to convert it to PDF via the command line
soffice.exe --headless --convert-to pdf --outdir E:\Docs\Out E:\Docs\In\a.doc
You can display a progress bar to the user and when complete give them the option to download the doc.
More information is available in the documentation for LibreOffice's command-line parameters.
Done.
Old, old question now, but for anyone who stumbles across this, WebAssembly (wasm) now makes this sort of approach possible.
We've just released https://www.npmjs.com/package/@nativedocuments/docx-wasm which can perform the conversion locally.
I am trying to complete an impossible mission.
I need to generate docx documents on ServiceNow (server side), which implements the JavaScript Rhino engine. Doing so on the client side is super easy; I usually use docxtemplater or similar great libraries. The problem here is that we need to build it on the server using ServiceNow technologies (script includes, etc.).
That said, I am trying to port the client docxtemplater version but I am struggling because on the server there is no concept of DOM.
At the same time, using the server-side version is difficult because ServiceNow does not use Node.js but Rhino, and all the libraries out there are based on Node.js.
The best thing I was able to do using vanilla JS is to generate a data URI that, when downloaded from the browser, returns a docx document, but I was wondering if anyone has any suggestions.
Thanks a lot.
There are at least two ways to accomplish this. One is to embrace the nightmare, and either transpile the OpenXml JS libs to ES5 compatibility or rewrite them. The other is to create a MS Word template, encode as Base64 text (as it's zipped XML) and save in ServiceNow, then unzip and traverse the XML using the ServiceNow XMLDocument2 library to update the text. Finally, you re-zip and save the file to create the updated OpenXml document.
The second solution requires you to get JSZip in ES5.
The source code to my solution is currently proprietary and I am not free to share it, but it can be done. Just make sure you've got a big enough budget, as it's not trivial and takes a fair amount of time to implement.
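The "update the text" step of the second approach can be sketched in ES5-style JavaScript, so that it should also run under Rhino. This is only an illustration, not the proprietary solution mentioned above: the `{{name}}` placeholder syntax and the function names `escapeXml` and `fillPlaceholders` are assumptions, and the input is assumed to be the already-unzipped contents of `word/document.xml` as a string.

```javascript
// Escape the characters that are special inside XML text nodes.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Replace {{name}} placeholders (a hypothetical syntax) in the XML text
// with values from a plain object. Unknown placeholders are left as-is.
function fillPlaceholders(documentXml, values) {
  return documentXml.replace(/\{\{(\w+)\}\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(values, name)
      ? escapeXml(values[name])
      : match;
  });
}
```

One caveat: Word often splits a piece of visible text across several runs, so a placeholder typed in the template may not appear as one contiguous string in the XML. A plain string replacement like this only works when each placeholder sits inside a single text node.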
I have a node web app that needs to convert a docx file into pdf (using client side resources only and no plugins). I've found a possible solution by converting my docx into HTML using docxjs and then HTML to PDF using jspdf (docx->HTML->PDF).
This solution could make it but I encountered several issues especially with rendering. I know that docxjs doesn't keep the same rendering in HTML as the docx file so it is a problem...
So my question is do you know any free module/solution that could directly do the job without going through HTML (I'm open to odt as a source as well)? If not, what would you advise me to do?
Thanks
As you already know, there are no ready-to-use open-source libraries for this. You just can't get good results with the available options. My suggestions are:
Use third party API. Like https://market.mashape.com/convertapi/word2pdf-1#!documentation
Create your own service for this purpose. If you are able to, I suggest creating a small server on Node.js (I bet you know how to do this). You can use LibreOffice as a converter with good rendering quality, like this:
libreoffice --headless --invisible --convert-to pdf --outdir /www-disk/ {$file_name}
Don't forget that this usually takes a lot of time, so do not block the request-response flow: use a separate process for each conversion.
One last thing: LibreOffice is not very lightweight, but it produces good-quality output. The unoconv tool is also worth a look.
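A minimal sketch of running the conversion in a separate process from Node.js, assuming a LibreOffice binary named `soffice` is on the PATH. The helper names (`buildConvertArgs`, `expectedPdfPath`, `convertToPdf`) and the two-minute timeout are illustrative choices, not part of any library.

```javascript
const { execFile } = require('child_process');
const path = require('path');

// Argument list for a headless LibreOffice conversion to PDF.
function buildConvertArgs(inputPath, outDir) {
  return ['--headless', '--convert-to', 'pdf', '--outdir', outDir, inputPath];
}

// LibreOffice writes the result into outDir under the input's base name
// with a .pdf extension.
function expectedPdfPath(inputPath, outDir) {
  const base = path.basename(inputPath, path.extname(inputPath));
  return path.join(outDir, base + '.pdf');
}

// Run the conversion in a child process so the Node.js event loop stays
// free to serve other requests. Resolves with the expected output path.
function convertToPdf(inputPath, outDir, binary = 'soffice') {
  return new Promise((resolve, reject) => {
    const args = buildConvertArgs(inputPath, outDir);
    execFile(binary, args, { timeout: 120000 }, (err) => {
      if (err) reject(err);
      else resolve(expectedPdfPath(inputPath, outDir));
    });
  });
}
```

A request handler would call `convertToPdf(uploadedPath, outputDir)` and send the file once the promise resolves. A busy service would probably also want to cap how many conversions run at once.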
As of January 2019, there is docx-wasm, which works in node and performs the conversion locally where node is installed. Proprietary but freemium.
It appears that even after three years ncohen had not found an answer. It was also unclear if it had to be a free (as in dollars) solution.
The original requirements were:
using client side resources only and no plugins
Do you mean you don't want server side conversion? Right, I would like my app to be totally autonomous.
Since all the other answers/comments only offered server side component solutions, which the author clearly stated was not what they wanted, here is a proposed answer.
The company I work for has had this solution for a few years now, that can convert DOCX (not odt yet) files to PDF completely in the browser, with no server side component required. This currently uses either asm.js/PNaCl/WASM depending on the exact browser being used.
https://www.pdftron.com/samples/web/samples/viewing/viewing/
Open an office file using the demo above, and you will see no server communication. Everything is done client side. This demo works on mobile browsers also.
Is it possible with Javascript or jQuery to convert mp3, wav, etc. to m4r format?
Let's assume you had a library that can change the format of files.
Let's also assume you only need the application to work on current browsers that implement FileAPI or FileReference so you can have access to uploaded files (you can't have access to them without FileAPI or FileReference unless you use Flash or Java Applets or equivalent technologies).
You wouldn't be able to write the output file back to the user because JavaScript is not allowed to access the local filesystem.
Your only solution would become sending the converted file to the server and the server sending it back to you with a force download directive so that the user will be prompted to download the results.
Now, back to whether there is a library that can do the conversion (or even native JavaScript): I haven't heard of any. It's not impossible to build one, but it is impractical and wouldn't run very fast.
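The "force download directive" mentioned above is usually a `Content-Disposition: attachment` response header. Here is a rough Node.js sketch of a server that echoes an uploaded body back as a download. The file name `converted.m4r`, the helper `downloadHeaders` and the handler `createEchoServer` are hypothetical names for illustration only, and no conversion is actually performed.

```javascript
const http = require('http');

// Headers that make the browser prompt for a download instead of
// rendering the response inline.
function downloadHeaders(filename, contentType, byteLength) {
  // Replace characters that would break the quoted filename parameter.
  const safeName = filename.replace(/["\\\r\n]/g, '_');
  return {
    'Content-Type': contentType,
    'Content-Length': byteLength,
    'Content-Disposition': 'attachment; filename="' + safeName + '"',
  };
}

// Hypothetical handler: collects the request body and sends the same
// bytes back as a downloadable file.
function createEchoServer() {
  return http.createServer((req, res) => {
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const body = Buffer.concat(chunks);
      res.writeHead(200, downloadHeaders('converted.m4r', 'application/octet-stream', body.length));
      res.end(body);
    });
  });
}
```

Calling `createEchoServer().listen(8080)` would start it on a port of your choosing.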
Edit:
Let's not forget Node.js!
It's a backend server that uses Google Chrome's V8 JavaScript interpreter/compiler, and it runs JavaScript as a backend scripting engine.
You have access to the filesystem, databases and everything else if you use that (or any other backend system, for that matter) while still using JavaScript. You can use libraries too, either written in JavaScript or written in other languages and linked to interface with Node.js.
Edit 2:
There is a PC emulator written entirely in JavaScript. It runs binary executables if you want it to. It's called JSLinux.
If you're feeling particularly rambunctious you can grab the ffmpeg binary executable (compiled with static linking). And embed it into your application code as an uuencoded string then use JSLinux to execute the commands and grab the results.
Indeed, it is possible doing this on the client using the latest js technologies. A web-worker thread can do the work in the background. At least in Firefox and Chrome it is also possible to read ("upload in memory") and write ("download from memory") files using the new W3C File API, see here.
I managed to read files via drag&drop from and within the client using google's GWT which in the end is plain javascript, so it must also be possible to do it "natively".
Besides that, the conversion algorithm of course has to be implemented in a JavaScript web worker to avoid blocking the GUI. This should be the hardest part, but it is not impossible.
You would need a backend to do this. You may want to look into the PHPExtension of FFmpeg
I have a Rails application that has some JavaScript that needs to parse CSVs and make some AJAX calls based on each record.
I'd like to just load the local CSV directly into browser memory and have the JavaScript parse it and make the required AJAX calls but I haven't found a cross-browser, dependable way to accomplish this (I need to support cruddy old IE6).
I could upload the CSV to my Rails application, but I plan on hosting the application on Heroku and, as I understand it, Heroku doesn't allow you to edit the file system (create files). I could also write the CSVs to a database, but these CSVs are large (10 MB+) and I imagine I would suffer performance costs in doing this.
Is my best option pushing the CSV to Rails and having Rails respond with a JSON or string version of the CSV? This seems somewhat computationally expensive given the size of these CSVs. I'd prefer to keep it on the client-side. If that's the case can someone point me to an example on how to accomplish this or something similar?
Edit: I don't want users to have to copy and paste these CSVs into a textfield manually.
Edit2: Also, I'm aware of the security restrictions on accessing the local filesystem via JS. A solid flash embed is an acceptable option.
I am trying to write a small web tool which takes an Excel file, parses the contents and then compares the data with another dataset. Can this be easily done in JavaScript? Is there a JavaScript library which does this?
How would you load a file into JavaScript in the first place?
In addition, Excel is a proprietary format and complex enough that server-side libraries with years of development (such as Apache POI) haven't yet managed to reverse engineer these Microsoft formats with 100% accuracy.
So I think that the answer is that you can't.
Update: That is in pure JavaScript.
Update 2: It is now possible to load files in JavaScript: https://developer.mozilla.org/en-US/docs/DOM/FileReader
In the past four years, there have been many advancements. HTML5 File API has been embraced by the major browser vendors and performance enhancements actually make it somewhat possible to parse excel files (both xls and xlsx) in the browser.
My entries in this space:
http://oss.sheetjs.com/js-xls/ (xls)
http://oss.sheetjs.com/js-xlsx/ (xlsx)
Both are pure-JS parsers
To do everything in js, you'll have to use ActiveX and probably the office web components as well. Just a suggestion, but you probably don't want to go this route; it'll be inefficient and IE/Win only. You'll be better off with a server based solution.
You will need to use ActiveX (see W3Schools on the use of AJAX) and register the file in the hosting computer's Dataconnectors (only on the computer hosting the file). Contrary to what was mentioned before, this method is not Microsoft platform dependent (for the client, anyway) and you do not need to have Office components installed.
This can be done for most datafiles registered in Windows, including MDB's, and allows you as much control as you want, as you can assign different Windows Accounts for different purposes.
Like I said before, this all is serverside and has no impact on the client, apart from maybe retrieving credentials, actions and all that.
This method uses JavaScript, SQL (no, not even MSSQL, just SQL standard) and requires only that the hosting computer is running ANY Microsoft NT platform.
What Windows dataconnectors do is provide a generalised interface for various data components much like DirectX does for videocards and other peripherals. You can also use it to link an MDB (Microsoft Access) to a MySQL server and feed data live that way, which I believe is even simpler than using XLS spreadsheets...especially since you can import XLS into MDB.
Do you really need an Excel file? Why not use Excel to export the data in CSV or XML and load that?
The Excel file format is very specific to Excel's implementation. If you just need the data, use a file format that just contains the data.
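If the data is exported as CSV, parsing it in JavaScript is fairly contained. The sketch below is one possible hand-rolled parser, assuming comma-separated fields, double-quoted fields with `""` as an escaped quote, and a header row. The names `parseCsv` and `toRecords` are illustrative, and it has not been tuned for very large files.

```javascript
// Parse CSV text into an array of rows, each an array of string fields.
// Handles quoted fields, embedded commas and newlines, and "" escapes.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') { field += '"'; i++; } // escaped quote
        else inQuotes = false;                          // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ',') {
      row.push(field);
      field = '';
    } else if (ch === '\n' || ch === '\r') {
      if (ch === '\r' && text[i + 1] === '\n') i++;     // treat CRLF as one break
      row.push(field);
      field = '';
      rows.push(row);
      row = [];
    } else {
      field += ch;
    }
  }
  // Flush the last row when the text does not end with a line break.
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Turn rows into objects keyed by the header row's column names.
function toRecords(rows) {
  const [header, ...body] = rows;
  return body.map((cells) =>
    Object.fromEntries(header.map((name, i) => [name, cells[i] ?? '']))
  );
}
```

With the data as plain records, comparing it against another dataset is ordinary array and object work. Note that all values come back as strings, so numeric columns need converting before comparison.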