Get PDF content from browser via Javascript - javascript

Background:
I am trying to migrate a traditional web application to a Windows 8 Store app with the least effort. The basic idea is to run the web application inside a web view.
Problem:
1. The web app produces a PDF file and displays it directly in the browser from the PDF binary (Base64). Is it possible to listen for this action and intercept it so that I can get the PDF content?
2. If it is impossible to intercept the action (opening the PDF in a popup browser tab), is it possible to get the PDF content via JavaScript from an already rendered browser page (with the PDF rendered)?
3. I previously thought that a PDF viewer plugin helps the browser open PDFs, but apparently not: I checked my IE settings and no PDF reader plugin is enabled. How can it open the PDF, then?
4. In the existing web application, the PDF page is invoked via "window.open(url,...)", where the URL is a link that responds with the PDF binary. The corresponding server-side implementation of this link is very simple: it just writes the PDF binary into the response object and sets the content type to "application/pdf". That's it.
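Since point 4 says the URL simply responds with the PDF binary, one option is to request that same URL from script instead of intercepting the popup. The following is only a minimal sketch, assuming the web view's engine supports fetch and typed arrays (an older engine would need XMLHttpRequest with responseType "arraybuffer" instead). The names fetchPdfBytes, looksLikePdf and base64ToBytes are made up for illustration and are not part of any library.

```javascript
// Hypothetical helper: download the PDF bytes from the same URL that
// window.open() would have displayed. `fetchImpl` defaults to the global
// fetch and can be swapped out (for example in tests).
async function fetchPdfBytes(url, fetchImpl = fetch) {
  // "include" sends cookies, assuming the PDF link relies on the session.
  const response = await fetchImpl(url, { credentials: "include" });
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  if (!looksLikePdf(bytes)) {
    throw new Error("Response does not look like a PDF");
  }
  return bytes;
}

// A PDF file starts with the ASCII signature "%PDF-".
function looksLikePdf(bytes) {
  const signature = [0x25, 0x50, 0x44, 0x46, 0x2d];
  return signature.every((value, index) => bytes[index] === value);
}

// If the page only has the PDF as a Base64 string, decode it into raw bytes.
function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

With the bytes in hand, the hosting app can decide what to do with them (save, hand to a native viewer, and so on). That part depends on the platform and is not shown here.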

Related

How can I get all the text in Chrome's PDF viewer in Javascript?

I am writing a Chrome extension that manipulates PDF files and am looking to get all the text of a PDF file that's currently open in Chrome's PDF viewer.
I learned from "How can I get selected text in pdf in Javascript?" how to get the selected text, but I couldn't find a function in the API that extracts the entire text (or, better yet, the PDF itself, so that I can send it to a server). Is it possible?
I am aware of the solution of sending the URL and downloading it on the server side, but sometimes it is problematic (e.g. PDFs from password-protected websites).
Thanks.
I have tried looking in the API, https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/resources/pdf/pdf_scripting_api.ts
I have also tried "downloading" the PDF on the client side and sending it as a Blob to a server, but that solution is problematic (for example, it is sometimes blocked, and it requires extra permissions).

PDF overview links not working inside my angularjs app

My web application uses PDF.js to load PDFs in the browser. It gets the PDFs from a REST API.
The web app is a single-page AngularJS affair. You can navigate inside the app and open one of these PDFs. When you do, the relative links to other PDFs inside the Outline of the first PDF do not work.
When I access the REST API URL at, say, api.example.com/rest/my-pdf.pdf with Firefox (which uses PDF.js to render PDFs), the PDF opens and the Outline has the correct relative links.
I suspect that it has something to do with my app being on a different origin than the PDF serving REST API.
Each OutlineItem has an action dictionary with my relative links in there. The action dictionary has an ActionType (S) of Launch and a FileStream (F) value shown below.
I'm using the PDF.js viewer.html file to view PDFs.
This is done by creating an iframe like so:
<iframe src="/path/to/pdfjs/viewer.html?file=http://api.example.com/rest/my-pdf.pdf">
How can I get relative links working here?
P.S. I've scratched out only the actual filenames in red. I've left the path as is if it helps.
Try this method:
onclick="window.open('../../assets/python/python.pdf'); return true;"
Hope it works for you.
The problem was that the links in the PDF outline were the "Launch ActionType".
PDF.js can't actually handle "Launch" types because it cannot launch external programs, seeing as it runs in a browser.
So, what PDF.js does is it pretends that the "Launch" type is a "GoToR ActionType" (Go To Remote).
This works if the PDF is loaded from the same origin that PDF.js is loaded from.
If that is not the case, as in my app (where PDF.js is loaded in an iframe inside an Angular app), PDF.js is unable to resolve the URLs, so it leaves the links blank.
Section 8.5 of the PDF Reference contains information on Actions in PDFs.
PDF.js source file shows how different ActionTypes are processed.
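To illustrate what "resolving" a relative link means here, the sketch below resolves an outline target against the URL the PDF was fetched from, using the standard URL constructor. This is not PDF.js code; resolvePdfLink and viewerUrlFor are hypothetical helpers, and other.pdf is a made-up file name.

```javascript
// Hypothetical helper, not part of PDF.js: resolve a link target found in
// the PDF outline against the URL the PDF itself was loaded from.
function resolvePdfLink(relativeTarget, pdfUrl) {
  return new URL(relativeTarget, pdfUrl).href;
}

// Hypothetical helper: build the viewer.html URL used as the iframe src,
// with the PDF location passed (URL-encoded) in the "file" parameter.
function viewerUrlFor(viewerPath, pdfUrl) {
  return viewerPath + "?file=" + encodeURIComponent(pdfUrl);
}

// Example: a link to a sibling PDF, resolved against the REST API URL and
// then opened through the same viewer.
const base = "http://api.example.com/rest/my-pdf.pdf";
const target = resolvePdfLink("other.pdf", base);
const iframeSrc = viewerUrlFor("/path/to/pdfjs/viewer.html", target);
```

Resolving against the PDF's own URL, rather than the page's URL, is what keeps the links pointing at the API host when the viewer lives on a different origin.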

Window.open loading localhost PDF as Blank page

I am running a local Tomcat server within which I am hosting a PDF report. Once my Tomcat server is up and running, if I enter the following URL in my browser:
https://localhost:9000/Report.pdf
Then the page displays just fine. But within my JavaScript application, if I call window.open("https://localhost:9000/Report.pdf"), a page opens with that URL but displays as blank. Refreshing or reloading the page doesn't help; I need to close the whole tab and paste the URL to get it to load properly.
The server is linked through the Symphony Messaging application, so HTTPS is a must. I'm not sure whether that's what's causing the error or whether window.open just doesn't work with a PDF file. I've tested it with other file types (e.g. https://localhost:9000/logo.png) and it works just fine.
I've seen some similar questions about passing a byte array into window.open to display a PDF, but this seems kind of redundant: do I really have to convert the PDF to a byte array and then have window.open convert it back to PDF format just to display?
FYI both the HTML page and underlying Javascript from which I am attempting to call window.open are hosted on localhost:9000 as well.

ways to open large-sized pdf without downloading (using open on javascript)

Good day,
I have a system that renders a large amount of data as a PDF (30 MB+). I want the user to view the PDF first so that he can either download it or print it right away. At the moment I am forcing the user to download the file, since open('datauri here') won't work with larger files. The problem with downloading is that the files multiply and consume space over time, and it is not necessary to download every file that users want to print right away.
I need functionality similar to Chrome's preview when using window.print.
Can you please suggest any ideas or other ways to do this?
I am currently using a JavaScript library (pdfmake) to create the PDF. I am also using Chrome as my main browser.
You would have to make sure that the PDF is optimized for fast web view, and that your server is using the byteserving protocol for serving the file.
If that is the case, a useful PDF viewer (such as the web browser component provided by Acrobat/Reader) understands this protocol and requests (after the first page plus overhead of the PDF) only the data for the pages which are to be displayed.
A quick search, however, did not reveal whether the Chrome PDF viewing component is smart enough to understand the byteserving protocol.
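Byteserving is built on HTTP range requests: the client sends a Range header and a server that supports it answers with status 206 and a Content-Range header. Below is a rough sketch of how one could check whether a server answers range requests. The helper names are invented for illustration, and the 1024-byte probe size is an arbitrary assumption.

```javascript
// Build the value of an HTTP Range header for a byte span (end is inclusive).
function rangeHeader(start, end) {
  return "bytes=" + start + "-" + end;
}

// Parse a Content-Range response header such as "bytes 0-1023/31457280".
// Returns null when the value does not have that shape.
function parseContentRange(value) {
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(value);
  if (!match) {
    return null;
  }
  return {
    start: Number(match[1]),
    end: Number(match[2]),
    // "*" means the server did not report the total size.
    total: match[3] === "*" ? null : Number(match[3]),
  };
}

// Hypothetical probe: ask for the first 1024 bytes and see whether the
// server replies with "206 Partial Content" and a usable Content-Range.
async function supportsByteServing(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    headers: { Range: rangeHeader(0, 1023) },
  });
  const contentRange = response.headers.get("Content-Range") || "";
  return response.status === 206 && parseContentRange(contentRange) !== null;
}
```

This only tells you whether the server side is ready; whether a given viewer actually takes advantage of it is a separate question.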

Open PDF from HTML5 Storage

I want to store PDF files client-side in one of the HTML5 storages (IndexedDB or localStorage) and then open them later with Adobe Reader.
The scenario is as follows:
The user visits my site and downloads a bunch of PDFs into a storage.
Later, the user revisits the site and wants to view one of the pre-downloaded PDFs.
He chooses one of the stored PDFs and it gets rendered with Adobe Reader (or the default PDF renderer).
Is this possible with pure HTML5/JS, or do I have to write a Firefox extension?
You can use the data URI scheme (http://en.wikipedia.org/wiki/Data_URI_scheme).
Something like this, but with a PDF:
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAAKBJREFUeNpiYBjpgBFd4P///wJAaj0QO9DEQiAg5ID9tLIcmwMYsDgABhqoaTHMUHRxpsGYBv5TGqTIZsDkYWLo6gc8BEYdMOqAUQeMOoAqDgAWcgZAfB9EU63SIAGALH8PZb+H8v+jVz64KiOK6wIg+ADEArj4hOoCajiAqMpqtDIadcCoA0YdQIoDDtCqQ4KtBY3NAYG0csQowAYAAgwAgSqbls5coPEAAAAASUVORK5CYII=
You can see this example at its original page: http://iconhandbook.co.uk/reference/examples/data/
Create links with the PDF media type (application/pdf) and Base64-encoded data (representing the PDF binary), using the PDF name as the link text.
The base64 encoded content can be stored in HTML5 storage.
Warning: this does not work in IE, which blocks it citing security reasons.
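A minimal sketch of the idea, assuming the PDF is already available as a Base64 string and that the storage object behaves like localStorage (getItem/setItem). All function names and the "pdf:" key prefix are made up for this example.

```javascript
// Build a data URI for a PDF from its Base64-encoded content.
function pdfDataUri(base64Pdf) {
  return "data:application/pdf;base64," + base64Pdf;
}

// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the markup for a link that opens the PDF from a data URI.
function pdfLinkHtml(name, base64Pdf) {
  return '<a href="' + pdfDataUri(base64Pdf) + '">' + escapeHtml(name) + "</a>";
}

// Store the Base64 content under a (hypothetical) "pdf:" key prefix.
// `storage` is any object with setItem/getItem, e.g. window.localStorage.
function savePdf(storage, name, base64Pdf) {
  storage.setItem("pdf:" + name, base64Pdf);
}

// Read the content back and return link markup, or null if nothing is stored.
function loadPdfLink(storage, name) {
  const base64Pdf = storage.getItem("pdf:" + name);
  return base64Pdf === null ? null : pdfLinkHtml(name, base64Pdf);
}
```

In a page this would be used as savePdf(localStorage, "Report", base64) and later loadPdfLink(localStorage, "Report"). Keep in mind that localStorage has a fairly small quota and that browsers differ in how they treat navigation to data URIs, so behaviour may vary.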
