Extract PDF Form Data Using JavaScript and write to CSV File

I have been given a PDF file with a form. The form is not formatted as a table. My requirement is to extract the form field values, and write them to a CSV file which can be imported into Excel. I have tried using the automated "Merge data files to Spreadsheet" menu item in Acrobat Pro, but the output includes both the labels and form field values. I am interested in mostly just the form field values.
I would like to use JavaScript to extract the form data, and instruct JavaScript how to write the CSV (since I know what the end spreadsheet should look like). I got as far as extracting the form fields:
this.getField("Today_s_Date").value;
And following this post: How to write a text file in Acrobat Javascript , I tried to write to CSV using:
var cMyC = "abc";
var doc = this.createDataObject({cName: "test.txt", cValue: cMyC});
but I get the following error:
"SyntaxError: syntax error
1:Console:Exec"
Ideally, I do not want to use an online third party tool to do this, because the data is sensitive. But please let me know if you have suggestions. The ideal output will be a CSV file that an end business user can open in Excel to see the spreadsheet format of her choice.
Has anyone done this before? Open to hearing any alternative solutions as well. Thanks in advance!

Your code should work. Make sure you select the entire block of code before running it in the console; running only part of a statement produces a syntax error.
For security reasons you are limited in what you can output from Acrobat without user interaction. There is a good discussion of what can be output from PDFs here, and if you haven't already, be sure to check out what's possible with exportDataObject() in the reference.
An example to get you going -- you could place a button on the form that would iterate through each of the fields in the form, adding them to an array that could then be output as a csv.
Something like:
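var fieldValues = [];
// Collect the value of every field in the document.
for (var i = 0; i < this.numFields; i++) {
  fieldValues.push(this.getField(this.getNthFieldName(i)).value);
}
// Attach the comma-joined values to the PDF as a data object.
this.createDataObject('output.csv', fieldValues.join());
// nLaunch must be a number: 2 saves to a temporary file and opens it.
this.exportDataObject({ cName: 'output.csv', nLaunch: 2 });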
In this example the .csv would be opened as a temporary file by the default csv program on the machine. Alternatively you could omit nLaunch, and give the user a file save dialog.
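One caveat with a plain join() is that a field value containing a comma, a double quote, or a line break will break the row when Excel opens it. A minimal sketch of a quoting helper follows; the names csvEscape and toCsvRow are hypothetical, and the result of toCsvRow(fieldValues) would be passed to createDataObject in place of fieldValues.join().

```javascript
// Hypothetical helper: quote a single value for CSV. The value is wrapped in
// double quotes when it contains a comma, a quote, or a line break, and any
// embedded double quotes are doubled.
function csvEscape(value) {
  var text = (value === null || value === undefined) ? "" : String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Hypothetical helper: join an array of values into one CSV row.
function toCsvRow(values) {
  var cells = [];
  for (var i = 0; i < values.length; i++) {
    cells.push(csvEscape(values[i]));
  }
  return cells.join(",");
}
```

Plain ES5 syntax is used on the assumption that the code may run in an older Acrobat JavaScript engine.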

Related

Parse csv with line breaks in a single cell in google app script

I am trying to load some csv data onto spreadsheet. Unfortunately, I keep getting this error when I try to feed my csv data onto Utilities.parseCsv():
Exception: Could not parse text.
The payload for parseCsv looks something like this:
"\"colName1\", \"colName2\"\n, \"<some html>\n<more html>\", \"colVal2\"\n"
Initially, I thought it might be because there are some html stuff inside the csv data. However, after doing some more testing, I realized it's the \n that's screwing things up. That's because when I take out the \n<more html>\" part from that csv string, it's now able to parse my csv data. Is there a way to get around this without removing that portion of the payload?
My code is pretty simple but it looks like this:
const response = UrlFetchApp.fetch(csvLink);
const csvData = response.getContentText();
const parsedCsvData = Utilities.parseCsv(csvData);
where csvLink is a url to the csv that allows users to grab the csv file/contents.
Essentially, I want to be able to parse my csv whilst being able to keep the line breaks present in a single cell.
Instead of using parseCsv(), I had to approach this problem in a different way since I couldn't just replace all \n with a different character. I ended up using batchUpdate() instead. This way, I was able to paste my csv data directly onto google spreadsheet in a more efficient manner.
Here is a useful GitHub gist that covers several ways to write CSV data to a Google spreadsheet:
https://gist.github.com/tanaikech/030203c695b308606041587e6da269e7
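If the goal is only to keep line breaks inside quoted cells, another route is a small hand-written parser. The sketch below is not what the answer above used; it is a hypothetical parseCsvText function that assumes comma delimiters and double-quoted cells with doubled quotes for escaping. Whitespace outside the quotes is kept as part of the cell, so a payload with spaces after the commas would need trimming first.

```javascript
// Hypothetical parser: split CSV text into rows of cells, keeping line
// breaks that appear inside double-quoted cells.
function parseCsvText(text) {
  var rows = [];
  var row = [];
  var cell = "";
  var inQuotes = false;
  for (var i = 0; i < text.length; i++) {
    var ch = text[i];
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          cell += '"';      // doubled quote is a literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        cell += ch;         // includes \n inside a quoted cell
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(cell);
      cell = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat \r\n as one break
      row.push(cell);
      rows.push(row);
      row = [];
      cell = "";
    } else {
      cell += ch;
    }
  }
  // Flush the last row when the text does not end with a line break.
  if (cell !== "" || row.length > 0) {
    row.push(cell);
    rows.push(row);
  }
  return rows;
}
```

The resulting array of rows has the same shape as the two-dimensional array that Utilities.parseCsv() returns, provided every row ends up with the same number of cells.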

Javascript to prompt for pdf files to insert?

I am attempting to have a button on a form that will launch file explorer for the user to select pdf files to insert into form.
The insertPages script will insert pages from a specific cPath, but I need the user to be able to select the pages to insert, as they will be different from case to case. Is there a way to accomplish this using javascript?
I am using Bluebeam, which is very similar to Acrobat. I have created several templates and javascript code using the Acrobat API Reference, and thus far the Bluebeam engine appears to operate nearly identically. In a perfect world, the button would launch the "Insert Pages" menu in Bluebeam.
Thanks in advance for the help!!
If this were Acrobat, I'd use app.browseForDoc(). The returned object has three properties...
cFS - A string containing the resulting file system name for the chosen file.
cPath - A string containing the resulting path for the chosen file.
cURL - A string containing the resulting URL for the chosen file.
Get the full path to the file from there then use insertPages.
In Acrobat, it can only be run in a Privileged context. I'm not sure if it will work the same way in BlueBeam though they have done a fairly good job of duplicating the form field related JavaScripts.
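As a rough sketch of how the two calls could fit together in Acrobat: the function name insertChosenPdf is hypothetical, app and doc are passed in as parameters so the logic can be tried with stand-in objects, and it assumes browseForDoc() returns nothing when the user cancels. Whether Bluebeam supports browseForDoc at all is untested here.

```javascript
// Hypothetical helper: ask the user to pick a PDF, then append all of its
// pages after the last page of `doc`. Returns false if nothing was chosen.
function insertChosenPdf(app, doc) {
  var chosen = app.browseForDoc();
  if (!chosen) {
    return false; // assumed: no object is returned when the dialog is cancelled
  }
  doc.insertPages({
    nPage: doc.numPages - 1, // 0-based index of the page to insert after
    cPath: chosen.cPath      // path property described above
  });
  return true;
}
```

From a button action the call would look like insertChosenPdf(app, this), subject to the privileged-context restriction mentioned above.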

Text file data into a webpage for graphing

I am new to web dev and I have a text file that I created using C# to collect some data from a website. Now I want to use that data to make graphs or some way to show the info on a website. Is it possible to use I/O in javascript or what is my best option here? Thanks in advance.
You have several options at your disposal:
Use a server-side technology (like ASP.Net, Node.js etc) to load, parse and display the file contents as HTML
Put the file on a web server and use AJAX to load and parse it. As @Quantastical suggested in his comment, convert the file to JSON format for easier handling in JavaScript.
Have the original program save the file in HTML format instead of text, and serve that page. You could just serve the txt file as is, but the user experience would be horrible.
Option 1 probably makes the most sense on its own; combining options 1 and 2 is the most recommended approach if you want some dynamic behavior.
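For option 2, the parsing step might look like the sketch below. The file layout is not given in the question, so this assumes one "label,value" pair per line; the function name parseDataFile and the label/value property names are hypothetical.

```javascript
// Hypothetical parser: turn lines of "label,value" text into an array of
// { label, value } objects that a charting library could consume.
// Lines without a comma or without a numeric value are skipped.
function parseDataFile(text) {
  var points = [];
  var lines = text.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var parts = lines[i].trim().split(",");
    if (parts.length < 2) continue;
    var value = Number(parts[1]);
    if (isNaN(value)) continue;
    points.push({ label: parts[0].trim(), value: value });
  }
  return points;
}
```

In a browser, the text could be loaded from the server with something like fetch("data.txt").then(function (r) { return r.text(); }).then(parseDataFile), where data.txt is a placeholder file name.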
If you are working in C# and ASP then one option is to render the html from the server without need for javascript.
In C# the System.IO namespace gives access to the File object.
String thetext = File.ReadAllText(fileName);
or
String[] thetextLines = File.ReadAllLines(fileName);
Or, if you have JSON or XML in the file, you can read and deserialize it into an object for easier use.
When you have the text you can create the ASP/HTML elements with the data. A crude example would be:
HtmlGenericControl label = new HtmlGenericControl("div");
label.InnerHtml = thetext;
Page.Controls.Add(label);
There are also HtmlEncode and HtmlDecode methods (on HttpUtility) if you need them.
Of course, that is a really crude example of loading the text on the server and then adding HTML to the ASP page. Your question doesn't say where you want this processing to happen. JavaScript might be better, or a combination of C# and JavaScript.
Lastly to resolve a physical file path from a virtual path you can use HttpContext.Current.Server.MapPath(virtualPath). A physical path is required to use the File methods shown above.

Javascript - Read file and compare content

I'm having a file which contains a couple of space separated (or comma separated, it will be editable) serial-numbers (all unique).
Now through my Oracle APEX application I get one serial number. My goal is to check whether this serial number, which could be passed in as a parameter or obtained through $v('P#_SERIAL_ID'), is equal to one of the serial numbers in the file.
Is this even possible within Javascript? If so, is there an existing function/code to achieve my goal?
Stackoverflow questions that didn't help me but look alike:
Javascript-read-file-contents
C#-reading-and-editing-file
Java-string-comparison
You can do this without JavaScript. Import your file into APEX through a Data Load Wizard page so that the content of your file ends up in a table. You can then compare your value against it with a SQL validation.
If you don't like the Data Load Wizard page, you can add a file browse item on a simple page that will take your file and save it as a BLOB in the database. From there you can again process the file and compare the values.
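If the file contents are already available to the page as a string, the comparison itself is straightforward in JavaScript. A minimal sketch, assuming the serial numbers are separated by spaces, commas, or line breaks as described in the question; the function name containsSerial is hypothetical.

```javascript
// Hypothetical helper: check whether `serial` appears in `fileText`, where
// serial numbers are separated by any mix of whitespace and commas.
function containsSerial(fileText, serial) {
  var wanted = String(serial).trim();
  if (wanted === "") return false;
  var serials = fileText.split(/[\s,]+/);
  for (var i = 0; i < serials.length; i++) {
    if (serials[i] === wanted) return true; // exact match only
  }
  return false;
}
```

The second argument could be the value returned by $v('P#_SERIAL_ID'). How the file text reaches the browser is a separate question that this sketch does not address.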

How to send only the text from a text file

What I need to do is:
Let user choose txt file from his disc
Get the text from it to let's say a variable
Send it (the variable value) via AJAX
For the first point I want to know if I should use normal input type (like if I would like to send file via POST) <input type="file">
For the second point I need to know how to get the name of the file user selected and then read text from it. Also I'm not good with javascript so I don't really know how long can a string be there (file will have about 15k lines on average)
For the third point I don't need anything more, as long as I can have the data stored in a variable or an array.
Thanks in advance.
P.S. I guess javascript is not a fast language, but (depending on the editor) it sometimes opens on my computer the way that I have all the needed data in first 5 or 6 lines. Is it possible to read only first few lines from the file?
It is possible to get what you want using the File API, as @dandavis and other commenters have mentioned (and linked), but there are some things to consider about that solution, namely browser support. The File API is currently a W3C working draft, and even W3C recommendations aren't always fully supported by all browsers.
What solution is "best" for you really boils down to what browser/versions you want to support. If it were my own personal project or for a "modern" site/audience, I would use the File API. But if this is for something that requires maximum browser support (for older browsers), I would not currently recommend using the File API.
Having said all that, here is a suggested solution that does NOT involve using the File API.
Supply an input of type file in a form so the user can specify the file. The user has to select the file (JavaScript cannot do this).
Use form.submit() or set the target attribute to submit the form. There is an iframe trick for submitting a form without refreshing the page.
Use your server-side language of choice to respond with the file info (name, contents, etc.). For example, in PHP you'd access the posted file with $_FILES.
Then you can use JavaScript to parse the response. Normally you'd send it as a JSON-encoded response. Then you can do whatever you want with the file info in JavaScript.
With Chrome and Firefox you can read the contents of a text file like this:
HTML:
<input type="file" id="in-file" />
JavaScript with jQuery:
var fileInput = $('#in-file');
fileInput.change(function(e) {
var reader = new FileReader();
reader.onload = function(e) {
console.log(reader.result);
}
reader.readAsText(fileInput[0].files[0]);
});
IE 9 and earlier don't support the FileReader object.
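On the question of reading only the first few lines: once the text is in a string, taking the leading lines is a one-liner. The helper name firstLines below is hypothetical.

```javascript
// Hypothetical helper: return at most `count` lines from the start of `text`.
// Handles both \n and \r\n line endings.
function firstLines(text, count) {
  return text.split(/\r?\n/).slice(0, count);
}
```

Inside the onload handler above this could be called as firstLines(reader.result, 6). To avoid loading the whole file, the File object can also be cut down before reading, for example reader.readAsText(fileInput[0].files[0].slice(0, 4096)), where 4096 bytes is an arbitrary guess at how much covers the first few lines; the last line of such a chunk may be incomplete.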
