How to save a webpage locally including pictures, etc. - javascript

I am building an add-on for an application. The clients pay to view some webpages and download files from them, and they want the add-on to automate the download. Instead of selecting "Save Page As" and waiting for the download to complete, they could click the add-on and forget about it. The problem is that the webpage sets some cookies in the browser, so the best way is File -> "Save Page As", and I want to do the same thing from the add-on. Is there a Firefox JavaScript way to do this? I used nsIDownloader, but it saves only the HTML, not the pictures, etc. Can anybody guide me on this issue?
EDIT:
Hi, this is the code that did the trick, thanks to sai prasad:
// Folder that receives the page's images, stylesheets and other resources
var dir = Components.classes["@mozilla.org/file/local;1"]
    .createInstance(Components.interfaces.nsILocalFile);
dir.initWithPath("C:\\filename");
// File that receives the page's HTML
var file = Components.classes["@mozilla.org/file/local;1"]
    .createInstance(Components.interfaces.nsILocalFile);
file.initWithPath("C:\\filename.html");
var wbp = Components.classes["@mozilla.org/embedding/browser/nsWebBrowserPersist;1"]
    .createInstance(Components.interfaces.nsIWebBrowserPersist);
alert("going to save");
// Arguments: document, target file, data folder, output content type, encoding flags, wrap column
wbp.saveDocument(content.document, file, dir, null, null, null);
alert("saved");
EDIT:
But some webpages are still not saved exactly as they are with "Save Page As". The saved pages do not render like the originals; they look like a plain HTML example.

Since you mention that File->"Save Page As" is working as expected, I tried looking through the source code (chrome://browser/content/browser.xul) and found this:
https://developer.mozilla.org/en/nsIWebBrowserPersist#saveDocument()
Make sure you call this function only after the webpage has completely loaded (the load event, not DOMContentLoaded)!
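A minimal sketch of that timing rule, assuming a hypothetical helper named `runWhenFullyLoaded` (not part of any Mozilla API). It relies only on the standard `document.readyState` property and the window `load` event:

```javascript
// Hypothetical helper: run `callback` once the page and its subresources
// (images, stylesheets) have finished loading.
function runWhenFullyLoaded(win, callback) {
  if (win.document.readyState === "complete") {
    // The document and its subresources have already finished loading.
    callback(win.document);
    return;
  }
  // Otherwise wait for the window's load event, which fires later than
  // DOMContentLoaded, once images and other subresources are in.
  win.addEventListener("load", function onLoad() {
    win.removeEventListener("load", onLoad);
    callback(win.document);
  });
}
```

In the add-on from the question this might be used as `runWhenFullyLoaded(content, function (doc) { wbp.saveDocument(doc, file, dir, null, null, null); });`, assuming `content` is the page's window as in the code above.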

Related

Chrome asks if an application should be opened

We have a website that allows the user to download and open a Word file.
Chrome recently started showing an annoying popup when a user downloads the Word file.
The JavaScript code called to open the file is:
function webDAVOnLineEditionAcm(docURL) {
    try {
        setTimeout(function(){ ITHit.WebDAV.Client.DocManager.EditDocument(docURL, "/", protocolInstallCallback); }, 5000);
    } catch (e) {
        console.log(e);
    }
}
Is there a way to tell Chrome to always open that kind of link? The popup annoys our users, who need to click that button frequently, and it also breaks all our acceptance tests...
Short answer, no. This is a security feature to prevent a malicious script from being able to launch local applications or open executable files.
One solution is to educate your users. Show a notice on the page when they click to open that explains how to suppress the prompt.
Another solution is to trigger a download of the file. They can then open that file by clicking on it in the downloads list. They won't be prompted in this case.
Finally, a more involved solution is to process the word document and convert it to HTML so they can be viewed directly on the site. You could then pair this with the download solution to get the original document.

Android webview displaying differently when from file instead of from online

I am trying to save the HTML from a webpage into a file, so that if my app is opened and no internet is available, the webview loads from the file instead. Here is my debugging code: the first time the view is created it downloads the file, and from the second time onwards it opens the file.
WebView cwebView = (WebView) rootView.findViewById(R.id.aboutWebview);
if (loadedLatest) {
    // Offline path: load the HTML that was saved earlier
    cwebView.loadData(FileUtils.read(Values.aboutWebviewOfflineFile, getContext()), "text/html", "UTF-8");
    cwebView.getSettings().setJavaScriptEnabled(true);
    //cwebView.getSettings().setBuiltInZoomControls(true);
} else {
    cwebView.loadUrl(Values.aboutPageURL);
    // Saves the HTML to a file
    new GetWebviewContents(getContext()).execute(Values.aboutPageURL, Values.aboutWebviewOfflineFile);
    loadedLatest = true;
}
The HTML download and the file seem to be working correctly. However, the webview looks completely different when loaded from the file than when loaded online: it is much narrower and the images overlap. I have tried using .loadUrl(File...) and it has the same effect. Enabling JavaScript makes no difference.
Does anyone have any idea what could be causing this?
Thanks
I don't know for sure, but it could be Cross-Origin blocking by the WebView. To verify this, open the same webpage on your laptop or desktop and save it to disk. Then open the saved page and check the JavaScript console in your browser's Developer Tools for any Cross-Origin access restriction errors.

Force browser to refresh javascript code while developing an MVC View?

Pretty straightforward: I'm developing an MVC5 application and have noticed (lately) that my browser appears to be caching the JavaScript code I have on the view within @section Scripts { }.
Currently I am developing with Chrome and I have tried CTRL+F5 and CTRL+SHIFT+R, which reload the page, but the alert() I uncommented within the JavaScript code is still rendered as commented out. I also tried going to my localhost through Incognito Mode as well as other browsers (Firefox, IE) and am getting the same behavior. This is my /Home/Index.cshtml View, which is the default View that loads when the application starts. I have also tried adding some extra HTML text to the page, and again the new code is not taking effect or showing.
My current Chrome version is 41.0.2272.118 m. Does anyone have any idea what might be going on?
UPDATE:
I have gone to Developer Tools => General Settings in Chrome and checked [X] Disable cache (while DevTools is open), and then repeatedly (with DevTools still open) tried CTRL+SHIFT+R and CTRL+F5, with the same result as before: my changes are not taking effect.
UPDATE 2:
With DevTools open I have also held the Refresh button down and tried the Normal Reload, Hard Reload, and Empty Cache & Hard Reload options, all with the same result. For simplicity of testing I added an alert to the code below, to display as soon as the page loads (and currently no alert comes up):
$(document).ready(function () {
alert("Test");
// Other Code/Functions -- No Error showing in Console
});
If you are using Bundling from MVC, you have two options to disable caching:
Set BundleTable.EnableOptimizations to true. This instructs the bundling to minify and optimize your bundle even while debugging. It generates a hash in the process, based on the content of the script, so your customers' browsers can cache this file for a long time. It will generate a whole different hash the next time your file changes, so your customers can see your changes. The downside is that your script will become unreadable and you won't be able to debug it, so this might not be your best option.
Use System.Web.Optimization.BundleTable.Bundles.ResolveBundleUrl("url", true) to resolve your script's URL. The second parameter (true) requests that a hash be generated with the URL, which prevents your browser from using a cached copy when you change the file. This is exactly the same hash generated in the first option, but without minifying.
I created a small demo showing that the second option prevents caching from happening, the trick is getting the hash generated from your script's content without minifying your script.
I created a script file called myscript.js with this content:
$(document).ready(function () {
alert('a');
});
Then I added this to my BundleConfig.cs:
// PLEASE NOTE this is **NOT** a ScriptBundle
bundles.Add(new Bundle("~/bundles/myscripts").Include(
"~/Scripts/myscript*"));
If you add a ScriptBundle, you will get a minified response again, since ScriptBundle is just a Bundle using the JsMinify transformation. That's why we just use Bundle.
Now you can add your script using this method to resolve the script URL with the hash appended. You can use Scripts.Render:
@Scripts.Render(System.Web.Optimization.BundleTable.Bundles.ResolveBundleUrl("~/bundles/myscripts", true))
Or the script tag:
<script src="@System.Web.Optimization.BundleTable.Bundles.ResolveBundleUrl("~/bundles/myscripts", true)"></script>
Either way will generate a URL with a hash to prevent caching:
After editing my file:
You might want to add a nocache variable after your script URL, like:
<script src="js/stg/Stg.js?nocache=@random_number"></script>
If you manage to put a random number in the place I indicated, the browser will automatically download the latest version of the script after an F5.
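As a rough sketch of the same idea on the client side, the hypothetical helper below (`withCacheBuster` is an illustrative name, not an existing API) appends a `nocache` parameter to a script URL. It assumes the URL has no `#fragment` part:

```javascript
// Hypothetical helper: append a cache-busting query parameter to a script URL.
function withCacheBuster(url, token) {
  // Use "&" if the URL already has a query string, otherwise start one with "?".
  const separator = url.includes("?") ? "&" : "?";
  return url + separator + "nocache=" + encodeURIComponent(String(token));
}

// Example: a timestamp differs on every page load, so the browser
// treats each URL as a new resource and fetches it again.
const scriptSrc = withCacheBuster("js/stg/Stg.js", Date.now());
```

A random or timestamp token defeats caching on every load, which suits development; a version number or content hash as the token would let browsers keep caching between releases.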
A quick trick that solves this problem is to open the script file in a new tab and then refresh it there.
If you happen to have Chrome dev tools open, it will even refresh it there.
From dev tools you can also easily right-click the script and open it in a new tab.

Chrome: JavaScript window.open to be Save-able

Imagine an FTP client written in HTML and JavaScript. This part works. But it would be nice if the user could "copy the listing" to the clipboard. It turns out that clipboard handling is not so easy in JS (besides, listings can be huge). So a better option is to pop up a window with the generated listing; then the user can choose to copy and paste, or save the page to disk.
Currently I do:
my_window = window.open("", "Copy List");
my_window.document.write('<pre>\n'+string+'</pre>');
my_window.document.close();
Which works. I get a new tab, and the listing I have generated in "string" displays nicely.
But Chrome disables (greys out) the "Save Page" option. It would be nice if the user could save the page (as HTML or text). What magic is required to open a window/tab and let them save the content?
Since we use WebSockets (key1/key2) this only works in Chrome, no other browsers needed.
Way after the fact but you can use a data URI for this:
window.open("data:text/plain;base64,"+btoa(theCode))
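One caveat with the snippet above is that `btoa()` throws on characters outside the Latin-1 range. A minimal sketch that sidesteps this by percent-encoding instead of base64 follows; `toTextDataUri` is a hypothetical helper name, not a browser API:

```javascript
// Hypothetical helper: build a text/plain data URI from an arbitrary string.
function toTextDataUri(text) {
  // Percent-encoding copes with any Unicode text, whereas btoa() only
  // accepts characters in the Latin-1 range.
  return "data:text/plain;charset=utf-8," + encodeURIComponent(text);
}
```

It would be used as `window.open(toTextDataUri(string))`. Newer Chrome versions may restrict opening data: URLs in a top-level window, so this is worth checking against the browser version in use.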

PDF files do not open in Internet Explorer with Adobe Reader 10.0 - users get an empty gray screen. How can I fix this for my users?

There is a known issue with opening a PDF in Internet Explorer (v 6, 7, 8, 9) with Adobe Reader X (version 10.0.*). The browser window loads with an empty gray screen (and doesn't even have a Reader toolbar). It works perfectly fine with Firefox, Chrome, or with Adobe Reader 10.1.*.
I have discovered several workarounds. For example, hitting "Refresh" will load the document properly. Upgrading to Adobe Reader 10.1.*, or downgrading to 9.*, fixes the issue too.
However, all of these solutions require the user to figure it out. Most of my users get very confused at seeing this gray screen, and end up blaming the PDF file and blaming the website for being broken. Honestly, until I researched the issue, I blamed the PDF too!
So, I am trying to figure out a way to fix this issue for my users.
I've considered providing a "Download PDF" link (that sets the Content-Disposition header to attachment instead of inline), but my company does not like that solution at all, because we really want these PDF files to display in the browser.
Has anyone else experienced this issue?
What are some possible solutions or workarounds?
I'm really hoping for a solution that is seamless to the end-user, because I can't rely on them to know how to change their Adobe Reader settings, or to automatically install updates.
Here's the dreaded Gray Screen:
Edit: screenshot was deleted from file server! Sorry!
The image was a browser window, with the regular toolbar, but a solid gray background, no UI whatsoever.
Background info:
Although I don't think the following information is related to my issue, I'll include it for reference:
This is an ASP.NET MVC application, and has jQuery available.
The link to the PDF file has target=_blank so that it opens in a new window.
The PDF file is being generated on-the-fly, and all the content headers are being set appropriately.
The URL does NOT include the .pdf extension, but we do set the content-disposition header with a valid .pdf filename and the inline setting.
Edit: Here is the source code that I'm using to serve up the PDF files.
First, the Controller Action:
public ActionResult ComplianceCertificate(int id){
    byte[] pdfBytes = ComplianceBusiness.GetCertificate(id);
    return new PdfResult(pdfBytes, false, "Compliance Certificate {0}.pdf", id);
}
And here is the ActionResult (PdfResult, inherits System.Web.Mvc.FileContentResult):
using System.Net.Mime;
using System.Web.Mvc;
/// <summary>
/// Returns the proper Response Headers and "Content-Disposition" for a PDF file,
/// and allows you to specify the filename and whether it will be downloaded by the browser.
/// </summary>
public class PdfResult : FileContentResult
{
    public ContentDisposition ContentDisposition { get; private set; }
    /// <summary>
    /// Returns a PDF FileResult.
    /// </summary>
    /// <param name="pdfFileContents">The data for the PDF file</param>
    /// <param name="download">Determines if the file should be shown in the browser or downloaded as a file</param>
    /// <param name="filename">The filename that will be shown if the file is downloaded or saved.</param>
    /// <param name="filenameArgs">A list of arguments to be formatted into the filename.</param>
    /// <returns></returns>
    [JetBrains.Annotations.StringFormatMethod("filename")]
    public PdfResult(byte[] pdfFileContents, bool download, string filename, params object[] filenameArgs)
        : base(pdfFileContents, "application/pdf")
    {
        // Format the filename:
        if (filenameArgs != null && filenameArgs.Length > 0)
        {
            filename = string.Format(filename, filenameArgs);
        }
        // Add the filename to the Content-Disposition
        ContentDisposition = new ContentDisposition
        {
            Inline = !download,
            FileName = filename,
            Size = pdfFileContents.Length,
        };
    }
    protected override void WriteFile(System.Web.HttpResponseBase response)
    {
        // Add the filename to the Content-Disposition
        response.AddHeader("Content-Disposition", ContentDisposition.ToString());
        base.WriteFile(response);
    }
}
It's been 4 months since asking this question, and I still haven't found a good solution.
However, I did find a decent workaround, which I will share in case others have the same issue.
I will try to update this answer, too, if I make further progress.
First of all, my research has shown that there are several possible combinations of user-settings and site settings that cause a variety of PDF display issues. These include:
Broken version of Adobe Reader (10.0.*)
HTTPS site with Internet Explorer and the default setting "Don't save encrypted files to disk"
Adobe Reader setting - disable "Display PDF files in my browser"
Slow hardware (thanks @ahochhaus)
I spent some time researching PDF display options at pdfobject.com, which is an EXCELLENT resource and I learned a lot.
The workaround I came up with is to embed the PDF file inside an empty HTML page. It is very simple; see some similar examples at pdfobject.com.
<html>
<head>...</head>
<body>
<object data="/pdf/sample.pdf" type="application/pdf" height="100%" width="100%"></object>
</body>
</html>
However, here's a list of caveats:
This ignores all user-preferences for PDFs - for example, I personally like PDFs to open in a stand-alone Adobe Reader, but that is ignored
This doesn't work if you don't have the Adobe Reader plugin installed/enabled, so I added a "Get Adobe Reader" section to the html, and a link to download the file, which usually gets completely hidden by the <object /> tag, ... but ...
In Internet Explorer, if the plugin fails to load, the empty object will still hide the "Get Adobe Reader" section, so I had to set the z-index to show it ... but ...
Google Chrome's built-in PDF viewer also displays the "Get Adobe Reader" section on top of the PDF, so I had to do browser detection to determine whether to show the "Get Reader".
This is a huge list of caveats. I believe it covers all the bases, but I am definitely not comfortable applying this to EVERY user (most of whom do not have an issue).
Therefore, we decided to ONLY do this embedded option if the user opts-in for it. On our PDF page, we have a section that says "Having trouble viewing PDFs?", which lets you change your setting to "embedded", and we store that setting in a cookie.
In our GetPDF Action, we look for the embed=true cookie. This determines whether we return the PDF file, or if we return a View of HTML with the embedded PDF.
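The answer's application is ASP.NET MVC, so the following JavaScript is only an illustration of the decision logic, not the author's code. The helper names and the two return labels are hypothetical; only the `embed=true` cookie comes from the description above:

```javascript
// Hypothetical helper: decide from the Cookie request header whether the
// user opted in to the embedded PDF page.
function wantsEmbeddedPdf(cookieHeader) {
  if (!cookieHeader) return false;
  // A Cookie header looks like "name1=value1; name2=value2".
  return cookieHeader.split(";").some(function (pair) {
    const index = pair.indexOf("=");
    if (index === -1) return false;
    const name = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    return name === "embed" && value === "true";
  });
}

// Hypothetical dispatch: pick which kind of response the action returns.
function choosePdfResponse(cookieHeader) {
  return wantsEmbeddedPdf(cookieHeader) ? "html-with-embedded-pdf" : "raw-pdf";
}
```

Users who never opted in send no such cookie and keep getting the PDF file directly, which matches the opt-in design described above.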
Ugh. This was even less fun than writing IE6-compatible JavaScript.
I hope that others with the same problem can find comfort knowing that they're not alone!
I don't have an exact solution, but I'll post my experiences with this in case they help anyone else.
From my testing, the gray screen is only triggered on slower machines [1]. To date, I have not been able to recreate it on newer hardware [2]. All of my tests have been in IE8 with Adobe Reader 10.1.2. For my tests I turned off SSL and removed all headers that could have disabled caching.
To recreate the gray screen, I followed the following steps:
1) Navigate to a page that links to a PDF
2) Open the PDF in a new window or tab (either via the context menu or target="_blank")
3) In my tests, this PDF will open without error (however I have received user reports indicating failure on the first PDF load)
4) Close the newly opened window or tab
5) Open the PDF (again) in a new window or tab
6) This PDF will not open, but instead only show the "gray screen" mentioned by the first user (all subsequent PDFs that are loaded will also not display -- until all browser windows are closed)
I performed the above test with several different PDF files (both static and dynamic) generated from different sources and the gray screen issue always occurs when following the above steps (on the "slow" computer).
To mitigate the problem in my application, I "tore down" the page that links to the PDF (removed parts piece by piece until the gray screen no longer occurred). In my particular application (built on closure-library) removing all references to goog.userAgent.adobeReader [3] appears to have fixed the issue. This exact solution won't work with jquery or .net MVC but maybe the process can help you isolate the source of the issue. I have not yet taken the time to isolate which particular portion of goog.userAgent.adobeReader triggers the bug in Adobe Reader, but it is likely that jquery might have similar plugin detection code to that used in closure-library.
[1] Machine experiencing gray screen:
Win Server '03 SP3
AMD Sempron 2400+ at 1.6GHz
256MB memory
[2] Machine not experiencing gray screen:
Win XP x64 SP2
AMD Athlon II X4 620 at 2.6 GHz
4GB memory
[3] http://closure-library.googlecode.com/svn/docs/closure_goog_useragent_adobereader.js.source.html
I ran into this issue around the time MVC1 was first released. See Generating PDF, error with IE and HTTPS regarding the Cache-Control header.
For Win7 Acrobat Pro X
Since I did all of these without rechecking to see if the problem still existed afterwards, I am not sure which one of these actually fixed the problem, but one of them did. In fact, after doing #3 and rebooting, it worked perfectly.
FYI: Below is the order in which I stepped through the repair.
Go to Control Panel > Folder Options; under each of the General, View and Search tabs
click the Restore Defaults button and the Reset Folders button
Go to Internet Explorer, Tools > Options > Advanced > Reset ( I did not need to delete personal settings)
Open Acrobat Pro X, under Edit > Preferences > General.
At the bottom of page select Default PDF Handler. I chose Adobe Pro X, and click Apply.
You may be asked to reboot (I did).
Best Wishes
In my case the solution was quite simple.
I added this header and the browsers opened the file in every test.
header('Content-Disposition: attachment; filename="filename.pdf"');
I had this problem. Reinstalling the latest version of Adobe Reader did nothing. Adobe Reader worked in Chrome but not in IE. This worked for me ...
1) Go to IE's Tools-->Compatibility View menu.
2) Enter a website that has the PDF you wish to see. Click OK.
3) Restart IE
4) Go to the website you entered and select the PDF. It should come up.
5) Go back to Compatibility View and delete the entry you made.
6) Adobe Reader works OK now in IE on all websites.
It's a strange fix, but it worked for me. I needed to go through an Adobe acceptance screen after reinstall that only appeared after I did the Compatibility View trick. Once accepted, it seemed to work everywhere. Pretty flaky stuff. Hope this helps someone.
Hm, would it be possible to simply do this:
The first time your user opens a pdf, using Javascript you make a popout that basically says "If you cannot see your document, please click HERE". Make "HERE" a big button where it will explain to your user what's the problem. Also make another button "everything's fine". If the user clicks on this one, you remember it, so it isn't displayed in the future.
I'm trying to be practical. Going to great lengths trying to solve this kind of problem "properly" for a small subset of Adobe Reader versions doesn't sound very productive to me.
Experimenting more, the underlying cause in my app (calling goog.userAgent.adobeReader) was accessing Adobe Reader via an ActiveXObject on the page with the link to the PDF. This minimal test case causes the gray screen for me (however removing the ActiveXObject causes no gray screen).
<!DOCTYPE html>
<html lang="en">
<head>
<title>hi</title>
<meta charset="utf-8">
</head>
<body>
<script>
new ActiveXObject('AcroPDF.PDF.1');
</script>
<a target="_blank" href="http://partners.adobe.com/public/developer/en/xml/AdobeXMLFormsSamples.pdf">link</a>
</body>
</html>
I'm very interested if others are able to reproduce the problem with this test case and following the steps from my other post ("I don't have an exact solution...") on a "slow" computer.
Sorry for posting a new answer, but I couldn't figure out how to add a code block in a comment on my previous post.
For a video example of this minimal test case, see: http://youtu.be/IgEcxzM6Kck
I realize this is a rather late post but still a possible solution for the OP. I use IE9 on Win 7 and have been having Adobe Reader's grey screen issues for several months when trying to open pdf bank and credit card statements online. I could open everything in Firefox or Opera but not IE. I finally tried PDF-Viewer, set it as the default pdf viewer in its preferences and no more problems. I'm sure there are other free viewers out there, like Foxit, PDF-Xchange, etc., that will give better results than Reader with less headaches. Adobe is like some of the other big companies that develop software on a take it or leave it basis ... so I left it.
We were getting this issue even after updating to the latest Adobe Reader version.
Two different methods solved this issue for us:
Using the free version of Foxit Reader application in place of Adobe Reader
Since most of our clients use Adobe Reader, instead of requiring users to switch to Foxit Reader, we started using window.open(url) to open the PDF instead of window.location.href = url. For some reason, Adobe was losing the file handle in different iframes when the PDF was opened using the window.location.href method.
