pdfjs cannot use function 'getPageIterator'

I am trying to follow the documentation for pdfjs found here https://www.pdftron.com/documentation/core/guides/features/manipulation/remove/ in an attempt to remove a page from a PDF I have uploaded to my HTML page. Here is my HTML code:
<script src="https://cdnjs.cloudflare.com/ajax/libs/pdf.js/2.2.228/pdf.js"></script>
<input type="file" id="input"/>
<script>
document.getElementById('input').addEventListener('change', function(e){
var reader = new FileReader()
reader.onload = function(x){
window['pdfjs-dist/build/pdf'].getDocument({data:x.target.result}).promise.then(function(doc){
doc.pageRemove(doc.getPageIterator(5));
console.log(doc.numPages)
})}
reader.readAsBinaryString(e.target.files[0])
}, false)
</script>
which gives this console error when I upload a PDF file to the page:
removeDemo.html:10 Uncaught (in promise) TypeError: doc.getPageIterator is not a function
The PDF I am uploading has more than 5 pages, so asking to remove the 5th page in particular shouldn't be the problem. Other functionality does seem to work however, for example, I have a line in the above code that prints the number of pages of the document. This works fine when I comment out the 'getPageIterator' line. So it seems to be a problem with this specific function, rather than a more general problem. I would like to know what is causing this problem. In case this is relevant, I am running this in chromium on a macbook pro.
Please let me know if there is something in the above question that I can further clarify.

The Mozilla-led PDF.js is primarily an in-browser PDF viewer, without editing functions.
The function you're calling, doc.pageRemove(doc.getPageIterator(5)), belongs to the PDFTron WebViewer / editing SDK and is thus specific to that commercial JavaScript library.
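One quick way to confirm which library an object comes from is to check which methods it actually exposes before calling them. The following is only a sketch: `missingMethods` is a hypothetical helper, and `fakePdfJsDoc` is a stand-in that mimics just the `numPages` and `getPage` members of a loaded PDF.js document, not the real object.

```javascript
// Hypothetical helper: returns the names in `names` that are not callable on `obj`.
function missingMethods(obj, names) {
  return names.filter(function (name) {
    return typeof obj[name] !== 'function';
  });
}

// Stand-in for a loaded PDF.js document (assumption: only the members used here).
var fakePdfJsDoc = {
  numPages: 7,
  getPage: function (n) { return Promise.resolve({ pageNumber: n }); }
};

// getPage exists on the stand-in; the two PDFTron names do not.
var missing = missingMethods(fakePdfJsDoc, ['getPage', 'getPageIterator', 'pageRemove']);
console.log(missing);
```

Running the same check against the real `doc` inside the `then` callback from the question should show whether the methods you want are there before you rely on them.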

The documentation you linked to, from PDFTron, has nothing to do with PDF.js; it describes a completely separate SDK. That is why you get an error: you are calling one SDK's API on an object from a different SDK.
Since PDF.js does not support removing pages from a PDF (nor editing in general), I assume your intention is to use the PDFTron SDK to remove pages from (or otherwise edit) a PDF file client-side in the browser.
In that case, you want to do the following.
See this sample:
https://www.pdftron.com/documentation/web/samples/pdf-manipulation/#page-operations
See here to get started with the SDK:
https://www.pdftron.com/documentation/web/get-started/

Related

How to move a file with JavaScript

I want to move a txt file from drive C to drive D. I found this code by searching, but it does not work properly.
Please guide me.
Thanks
<html>
<body>
<script language="JScript">
function move() {
var object = new ActiveXObject("Scripting.FileSystemObject");
var file = object.GetFile("C:\\1.txt");
file.Move("d:\\");
console.log("File is moved successfully");
}
</script>
<button onClick="move()">Move File txt</button>
</body>
</html>

Browsers do not provide any features that let code provided by a webpage move files on the user's hard disk.
The code you've found may have worked in old versions of Internet Explorer (I think the feature was removed in later versions) but only when the security settings were altered from the default to allow it.
You could probably use it in server-side Classic ASP (but then it would move files on the server rather than the client).
For a browser-style UI which can do this, look to tools like Electron which pair a custom browser with Node.js in a desktop application. You can then use the Node.js side of your custom application to move files.
Obviously this will require that the user download and install your application and use that instead of their web browser.
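As a rough sketch of what the Node.js side could look like (this runs in Node.js or in Electron's main process, not in a web page): `moveFileSync` is a hypothetical helper name, not a built-in. Renaming does not work across drives (Node reports the error code `EXDEV`), which matters when going from C: to D:, so the sketch falls back to copy and delete in that case.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: move `src` into directory `destDir`, keeping the file name.
// fs.renameSync cannot cross drives/partitions (it fails with EXDEV),
// so fall back to copy + delete in that case.
function moveFileSync(src, destDir) {
  const dest = path.join(destDir, path.basename(src));
  try {
    fs.renameSync(src, dest);
  } catch (err) {
    if (err.code !== 'EXDEV') throw err; // some other failure: report it
    fs.copyFileSync(src, dest);
    fs.unlinkSync(src);
  }
  return dest;
}

// Example call for the paths in the question (Windows only):
// moveFileSync('C:\\1.txt', 'D:\\');
```

The synchronous calls keep the example short; in a real desktop application the promise-based `fs.promises` equivalents are probably the better choice so the UI is not blocked.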

Getting JSON from a website with no visible files using p5.js

I am trying to import a JSON file from a website using p5.js, and I thought it would be quite easy. However, when I tried it I realized the JSON was actually just plain text on the page (it is the only thing on the page). I checked Chrome's developer tools to look at index.html, but I was greeted by "(index)". Is it a problem with Google, or am I just going to have to use something other than this?
function preload() {
httpGet('leaderboard.popcat.click', 'json', function(response) {
});
}
//there are the setup and draw functions as well
I got an error when I ran the code as well. It was
Error: JSONP request to url failed
Here is a picture of the page, by the way (the URL is leaderboard.popcat.click)
EDIT: The main problem I am having is that there is no file at https://leaderboard.popcat.click/, not the fetching of the JSON.
The network tab says no such URL exists, and I believe that is because I didn't specify a file.
Here is the console output as well

I solved my issue by starting Chrome with web security disabled (which turns off the cross-origin checks) AND using the full path of the website. I did this by making a shortcut with this target
"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --disable-web-security --user-data-dir="~/chromeTemp"
and running it as an administrator.
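The second half of the fix, passing a full address rather than a bare host name, can be sketched like this. Without a scheme, a string such as 'leaderboard.popcat.click' is resolved as a path relative to the current page, which is why the network tab reports that no such URL exists. `toAbsoluteUrl` is a hypothetical helper, and defaulting to https is my assumption.

```javascript
// Hypothetical helper: turn a bare host such as "leaderboard.popcat.click"
// into an absolute URL. Strings that already carry a scheme are kept as they are.
function toAbsoluteUrl(input) {
  var hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  // Assumption: default to https when no scheme was given.
  return new URL(hasScheme ? input : 'https://' + input).href;
}

console.log(toAbsoluteUrl('leaderboard.popcat.click'));
```

The result is what you would then hand to httpGet in preload. This only addresses the URL; whether the browser permits the cross-origin request is a separate matter.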

PDF.js not working when deploying to different Server in IE

I have a local IIS site where I developed some code with PDF.js. There it worked fine to load a specific PDF and read the text contents from it.
Then I copied everything to a library on a SharePoint Server (that's the only difference, IIS vs. SharePoint) and changed all references. The code does not throw any errors; with the debugging level set to info it just prints
Info: Cannot use postMessage Transfers
to the console. Adding a console.log line into the PDF.js catch block of the promise did not result in any new information. It doesn't even get to the first logging inside the then:
var pdfobj = PDFJS.getDocument(docPath);
pdfobj.then(function (pdf) {
console.log(pdf);
});
Any ideas?
EDITS: Updated from PDF.JS 1.1 to 1.2
There are not many error logs in PDF.js. I accidentally hardcoded a wrong URL where even the server is nonexistent, and there was no error log; not even the handler in then(...).catch(...) is called.
It is working now in Firefox but not in IE and I cannot see any reason for this. The Info message about Cannot use postMessage Transfers is also only displayed in IE (using IE 11).

It does work now. I am not sure what I did to fix it, but I will update this answer when I know. I think it has something to do with the directory structure of the PDF.js files. Previously I just uploaded all JS files (there were no errors though).
Still there is no exception handling when the PDF does not exist.
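Since neither the success nor the failure handler seems to fire in that situation, one generic workaround is to put a deadline on the promise so that a silent hang at least turns into a visible error. This is only a sketch and not a PDF.js feature: `withTimeout` is a hypothetical helper, and it assumes a Promise implementation is available on the page (IE 11 has none natively, so a polyfill would be needed).

```javascript
// Hypothetical helper: reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  var timer;
  var timeout = new Promise(function (resolve, reject) {
    timer = setTimeout(function () {
      reject(new Error('Timed out after ' + ms + ' ms'));
    }, ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).then(
    function (value) { clearTimeout(timer); return value; },
    function (err) { clearTimeout(timer); throw err; }
  );
}

// Usage sketch with the call from the question (PDFJS and docPath come from the page):
// withTimeout(PDFJS.getDocument(docPath), 10000).then(
//   function (pdf) { console.log(pdf); },
//   function (err) { console.error('PDF load failed or hung:', err); }
// );
```

The 10 second limit in the usage comment is arbitrary; pick whatever is reasonable for the size of your PDFs.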

How do I prevent Javascript from mutating a page in Selenium? How do I download the original page source? [duplicate]

This question already has answers here:
getting the raw source from Firefox with javascript
(3 answers)
Closed 8 years ago.
I'm not using Selenium to automate testing, but to automate saving AJAX pages that inject content, even if they require prior authentication to access.
I tried
tl;dr: I tried multiple tools for downloading sites with AJAX and gave up because they were hard to work with or simply didn't work. I'm resorting to using Selenium after trying out WebHTTrack (whose GUI wasn't able to start up on my Ubuntu machine + was a headache to provide authentication with in interactive-terminal mode), wget (which didn't download any of the scripts or stylesheets included on my page, see the bottom for what I tried with wget)... and then I finally gave up after a promising post on using a Mozilla XULRunner AJAX scraper called Crowbar simply seg-faulted on me. So...
ended up making my own broken thing in NodeJS and Selenium-WebdriverJS
My NodeJS script uses selenium-webdriver npm module which is "officially supported by the main project" to:
provide login information + do necessary button-clicking & typing for authentication
download all JS and CSS referenced on target page
download target page with original JS/CSS file links changed to local file paths
Now when I view my test page locally I see double of many page elements because the target site loads HTML snippets into the page each time it's loaded. I use this to download my target page right now:
var $;
var getTarget = function () {
driver.getPageSource().then(function (source) {
$ = cheerio.load(source.toString());
});
};
var targetHtmlDest = 'test.html';
var writeTarget = function () {
fs.writeFile(targetHtmlDest, $.html());
}
driver.get(targetSite)
.then(authenticate)
.then(getTarget)
.then(downloadResources)
.then(writeTarget);
driver.quit();
The problem is that the page source I get is the already modified page source, instead of the original one. Trying to run alert("x");window.stop(); within driver.executeAsyncScript() and driver.executeScript() does nothing.

Perhaps using curl to fetch the page (you can pass authentication on the command line) will get you the bare source?
Otherwise you may be able to turn off JavaScript on your test browsers to prevent JS actions from firing.
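If you'd rather stay inside the Node script than shell out to curl, the same idea can be sketched by reusing the cookies of the authenticated Selenium session for a plain HTTP request, so no browser ever runs the page's scripts. Assumptions: the site authenticates via cookies, the runtime has a global fetch (e.g. Node 18+), and `cookieHeader` and `fetchRawSource` are hypothetical helper names.

```javascript
// Hypothetical helper: build a Cookie header from cookie objects shaped like
// { name: 'sid', value: 'abc' } (the shape Selenium reports for browser cookies).
function cookieHeader(cookies) {
  return cookies.map(function (c) { return c.name + '=' + c.value; }).join('; ');
}

// Fetch the page as the server sends it; no browser executes the scripts,
// so nothing gets injected. Assumes a global fetch is available.
function fetchRawSource(url, cookies) {
  return fetch(url, { headers: { Cookie: cookieHeader(cookies) } })
    .then(function (res) { return res.text(); });
}

// Usage sketch after authenticating (driver and targetSite are the question's variables):
// driver.manage().getCookies()
//   .then(function (cookies) { return fetchRawSource(targetSite, cookies); })
//   .then(function (html) { fs.writeFileSync('test.html', html); });
```

Whether this works depends on how the site checks the session; if it also relies on other headers or tokens, those would have to be copied over as well.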

PDF files do not open in Internet Explorer with Adobe Reader 10.0 - users get an empty gray screen. How can I fix this for my users?

There is a known issue with opening a PDF in Internet Explorer (v 6, 7, 8, 9) with Adobe Reader X (version 10.0.*). The browser window loads with an empty gray screen (and doesn't even have a Reader toolbar). It works perfectly fine with Firefox, Chrome, or with Adobe Reader 10.1.*.
I have discovered several workarounds. For example, hitting "Refresh" will load the document properly. Upgrading to Adobe Reader 10.1.*, or downgrading to 9.*, fixes the issue too.
However, all of these solutions require the user to figure it out. Most of my users get very confused at seeing this gray screen, and end up blaming the PDF file and blaming the website for being broken. Honestly, until I researched the issue, I blamed the PDF too!
So, I am trying to figure out a way to fix this issue for my users.
I've considered providing a "Download PDF" link (that sets the Content-Disposition header to attachment instead of inline), but my company does not like that solution at all, because we really want these PDF files to display in the browser.
Has anyone else experienced this issue?
What are some possible solutions or workarounds?
I'm really hoping for a solution that is seamless to the end-user, because I can't rely on them to know how to change their Adobe Reader settings, or to automatically install updates.
Here's the dreaded Gray Screen:
Edit: screenshot was deleted from file server! Sorry!
The image was a browser window, with the regular toolbar, but a solid gray background, no UI whatsoever.
Background info:
Although I don't think the following information is related to my issue, I'll include it for reference:
This is an ASP.NET MVC application, and has jQuery available.
The link to the PDF file has target=_blank so that it opens in a new window.
The PDF file is being generated on-the-fly, and all the content headers are being set appropriately.
The URL does NOT include the .pdf extension, but we do set the content-disposition header with a valid .pdf filename and the inline setting.
Edit: Here is the source code that I'm using to serve up the PDF files.
First, the Controller Action:
public ActionResult ComplianceCertificate(int id){
byte[] pdfBytes = ComplianceBusiness.GetCertificate(id);
return new PdfResult(pdfBytes, false, "Compliance Certificate {0}.pdf", id);
}
And here is the ActionResult (PdfResult, inherits System.Web.Mvc.FileContentResult):
using System.Net.Mime;
using System.Web.Mvc;
/// <summary>
/// Returns the proper Response Headers and "Content-Disposition" for a PDF file,
/// and allows you to specify the filename and whether it will be downloaded by the browser.
/// </summary>
public class PdfResult : FileContentResult
{
public ContentDisposition ContentDisposition { get; private set; }
/// <summary>
/// Returns a PDF FileResult.
/// </summary>
/// <param name="pdfFileContents">The data for the PDF file</param>
/// <param name="download">Determines if the file should be shown in the browser or downloaded as a file</param>
/// <param name="filename">The filename that will be shown if the file is downloaded or saved.</param>
/// <param name="filenameArgs">A list of arguments to be formatted into the filename.</param>
/// <returns></returns>
[JetBrains.Annotations.StringFormatMethod("filename")]
public PdfResult(byte[] pdfFileContents, bool download, string filename, params object[] filenameArgs)
: base(pdfFileContents, "application/pdf")
{
// Format the filename:
if (filenameArgs != null && filenameArgs.Length > 0)
{
filename = string.Format(filename, filenameArgs);
}
// Add the filename to the Content-Disposition
ContentDisposition = new ContentDisposition
{
Inline = !download,
FileName = filename,
Size = pdfFileContents.Length,
};
}
protected override void WriteFile(System.Web.HttpResponseBase response)
{
// Add the filename to the Content-Disposition
response.AddHeader("Content-Disposition", ContentDisposition.ToString());
base.WriteFile(response);
}
}

It's been 4 months since asking this question, and I still haven't found a good solution.
However, I did find a decent workaround, which I will share in case others have the same issue.
I will try to update this answer, too, if I make further progress.
First of all, my research has shown that there are several possible combinations of user-settings and site settings that cause a variety of PDF display issues. These include:
Broken version of Adobe Reader (10.0.*)
HTTPS site with Internet Explorer and the default setting "Don't save encrypted files to disk"
Adobe Reader setting - disable "Display PDF files in my browser"
Slow hardware (thanks @ahochhaus)
I spent some time researching PDF display options at pdfobject.com, which is an EXCELLENT resource and I learned a lot.
The workaround I came up with is to embed the PDF file inside an empty HTML page. It is very simple: See some similar examples at pdfobject.com.
<html>
<head>...</head>
<body>
<object data="/pdf/sample.pdf" type="application/pdf" height="100%" width="100%"></object>
</body>
</html>
However, here's a list of caveats:
This ignores all user-preferences for PDFs - for example, I personally like PDFs to open in a stand-alone Adobe Reader, but that is ignored
This doesn't work if you don't have the Adobe Reader plugin installed/enabled, so I added a "Get Adobe Reader" section to the html, and a link to download the file, which usually gets completely hidden by the <object /> tag, ... but ...
In Internet Explorer, if the plugin fails to load, the empty object will still hide the "Get Adobe Reader" section, so I had to set the z-index to show it ... but ...
Google Chrome's built-in PDF viewer also displays the "Get Adobe Reader" section on top of the PDF, so I had to do browser detection to determine whether to show the "Get Reader".
This is a huge list of caveats. I believe it covers all the bases, but I am definitely not comfortable applying this to EVERY user (most of whom do not have an issue).
Therefore, we decided to ONLY do this embedded option if the user opts-in for it. On our PDF page, we have a section that says "Having trouble viewing PDFs?", which lets you change your setting to "embedded", and we store that setting in a cookie.
In our GetPDF Action, we look for the embed=true cookie. This determines whether we return the PDF file, or if we return a View of HTML with the embedded PDF.
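The cookie check can be sketched roughly as follows. Our real code is an ASP.NET MVC action, so treat this JavaScript version as an illustration of the decision only; `parseCookies` and `chooseResponse` are hypothetical names, not our actual code.

```javascript
// Parse a raw Cookie header ("a=1; b=2") into an object.
function parseCookies(header) {
  var out = {};
  (header || '').split(';').forEach(function (pair) {
    var idx = pair.indexOf('=');
    if (idx > -1) {
      out[pair.slice(0, idx).trim()] = pair.slice(idx + 1).trim();
    }
  });
  return out;
}

// Decide what to return: the raw PDF, or an HTML page that embeds it.
function chooseResponse(cookieHeaderValue, pdfUrl) {
  if (parseCookies(cookieHeaderValue).embed === 'true') {
    return {
      type: 'text/html',
      body: '<html><body><object data="' + pdfUrl +
        '" type="application/pdf" height="100%" width="100%"></object></body></html>'
    };
  }
  // No opt-in cookie: the caller streams the PDF bytes as usual.
  return { type: 'application/pdf', body: null };
}
```

One thing to watch for in a setup like this: the URL placed in the object tag has to return the actual PDF regardless of the cookie, otherwise the embedded request would get the HTML wrapper again.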
Ugh. This was even less fun than writing IE6-compatible JavaScript.
I hope that others with the same problem can find comfort knowing that they're not alone!

I don't have an exact solution, but I'll post my experiences with this in case they help anyone else.
From my testing, the gray screen is only triggered on slower machines [1]. To date, I have not been able to recreate it on newer hardware [2]. All of my tests have been in IE8 with Adobe Reader 10.1.2. For my tests I turned off SSL and removed all headers that could have disabled caching.
To recreate the gray screen, I followed the following steps:
1) Navigate to a page that links to a PDF
2) Open the PDF in a new window or tab (either via the context menu or target="_blank")
3) In my tests, this PDF will open without error (however I have received user reports indicating failure on the first PDF load)
4) Close the newly opened window or tab
5) Open the PDF (again) in a new window or tab
6) This PDF will not open, but instead only show the "gray screen" mentioned by the first user (all subsequent PDFs that are loaded will also not display -- until all browser windows are closed)
I performed the above test with several different PDF files (both static and dynamic) generated from different sources and the gray screen issue always occurs when following the above steps (on the "slow" computer).
To mitigate the problem in my application, I "tore down" the page that links to the PDF (removed parts piece by piece until the gray screen no longer occurred). In my particular application (built on closure-library) removing all references to goog.userAgent.adobeReader [3] appears to have fixed the issue. This exact solution won't work with jQuery or ASP.NET MVC, but maybe the process can help you isolate the source of the issue. I have not yet taken the time to isolate which particular portion of goog.userAgent.adobeReader triggers the bug in Adobe Reader, but it is likely that jQuery has plugin detection code similar to that used in closure-library.
[1] Machine experiencing gray screen:
Win Server '03 SP3
AMD Sempron 2400+ at 1.6GHz
256MB memory
[2] Machine not experiencing gray screen:
Win XP x64 SP2
AMD Athlon II X4 620 at 2.6 GHz
4GB memory
[3] http://closure-library.googlecode.com/svn/docs/closure_goog_useragent_adobereader.js.source.html

I ran into this issue around the time MVC1 was first released. See Generating PDF, error with IE and HTTPS regarding the Cache-Control header.

For Win7 Acrobat Pro X
Since I did all of these without rechecking whether the problem still existed after each one, I am not sure which one of these actually fixed the problem, but one of them did. In fact, after doing #3 and rebooting, it worked perfectly.
FYI: Below is the order in which I stepped through the repair.
Go to Control Panel > Folder Options. Under each of the General, View and Search tabs,
click the Restore Defaults button and the Reset Folders button
Go to Internet Explorer, Tools > Options > Advanced > Reset ( I did not need to delete personal settings)
Open Acrobat Pro X, under Edit > Preferences > General.
At the bottom of page select Default PDF Handler. I chose Adobe Pro X, and click Apply.
You may be asked to reboot (I did).
Best Wishes

In my case the solution was quite simple.
I added this header and the browsers opened the file in every test.
header('Content-Disposition: attachment; filename="filename.pdf"');

I had this problem. Reinstalling the latest version of Adobe Reader did nothing. Adobe Reader worked in Chrome but not in IE. This worked for me ...
1) Go to IE's Tools-->Compatibility View menu.
2) Enter a website that has the PDF you wish to see. Click OK.
3) Restart IE
4) Go to the website you entered and select the PDF. It should come up.
5) Go back to Compatibility View and delete the entry you made.
6) Adobe Reader works OK now in IE on all websites.
It's a strange fix, but it worked for me. I needed to go through an Adobe acceptance screen after reinstall that only appeared after I did the Compatibility View trick. Once accepted, it seemed to work everywhere. Pretty flaky stuff. Hope this helps someone.

Hm, would it be possible to simply do this:
The first time your user opens a pdf, using Javascript you make a popout that basically says "If you cannot see your document, please click HERE". Make "HERE" a big button where it will explain to your user what's the problem. Also make another button "everything's fine". If the user clicks on this one, you remember it, so it isn't displayed in the future.
I'm trying to be practical. Going to great lengths trying to solve this kind of problem "properly" for a small subset of Adobe Reader versions doesn't sound very productive to me.

Experimenting more, the underlying cause in my app (calling goog.userAgent.adobeReader) was accessing Adobe Reader via an ActiveXObject on the page with the link to the PDF. This minimal test case causes the gray screen for me (however removing the ActiveXObject causes no gray screen).
<!DOCTYPE html>
<html lang="en">
<head>
<title>hi</title>
<meta charset="utf-8">
</head>
<body>
<script>
new ActiveXObject('AcroPDF.PDF.1');
</script>
<a target="_blank" href="http://partners.adobe.com/public/developer/en/xml/AdobeXMLFormsSamples.pdf">link</a>
</body>
</html>
I'm very interested if others are able to reproduce the problem with this test case and following the steps from my other post ("I don't have an exact solution...") on a "slow" computer.
Sorry for posting a new answer, but I couldn't figure out how to add a code block in a comment on my previous post.
For a video example of this minimal test case, see: http://youtu.be/IgEcxzM6Kck

I realize this is a rather late post but still a possible solution for the OP. I use IE9 on Win 7 and have been having Adobe Reader's grey screen issues for several months when trying to open pdf bank and credit card statements online. I could open everything in Firefox or Opera but not IE. I finally tried PDF-Viewer, set it as the default pdf viewer in its preferences and no more problems. I'm sure there are other free viewers out there, like Foxit, PDF-Xchange, etc., that will give better results than Reader with less headaches. Adobe is like some of the other big companies that develop software on a take it or leave it basis ... so I left it.

We were getting this issue even after updating to the latest Adobe Reader version.
Two different methods solved this issue for us:
Using the free version of Foxit Reader application in place of Adobe Reader
Since most of our clients use Adobe Reader, instead of requiring users to switch to Foxit Reader, we started using window.open(url) to open the PDF instead of window.location.href = url. For some reason, Adobe Reader was losing the file handle in different iframes when the PDF was opened using the window.location.href method.
