Get rendered HTML page file from URL input with browser - javascript

What would be the fastest and least resource-intensive (CPU, RAM) way to get a JavaScript-rendered HTML page and save it to disk, given a URL, using an ordinary browser (Google Chrome or Firefox) in headless mode?
The idea is to also change the browser's proxy settings per request.
I'm well aware of Selenium, Puppeteer, PhantomJS and similar solutions. This needs to be done with a REAL browser, remotely managed through some API in a Linux environment.
I've found only JS API implementations for building add-ons, and haven't found any solution except Remote browser, which I'm not sure is still maintained.
Any pointers, snippets or whatever are more than welcome since I can't find anything.

Is it necessary for the JavaScript-rendered HTML page to be functional after it is saved?
Just take a screenshot using Python and save it on drive.
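One option not mentioned in this thread is to drive the real Chrome binary directly through its command-line switches. The sketch below assumes a Chrome or Chromium build that supports `--headless`, `--dump-dom` and `--proxy-server`, and that the binary is called `google-chrome` (the name varies by distribution). The function names and the proxy value are made up for illustration. Because each request starts a fresh process, the proxy can differ per request, at the cost of browser startup time.

```javascript
// Sketch: render a page with a real headless Chrome and save the DOM to disk.
const { execFile } = require('child_process');
const fs = require('fs');

// Build the Chrome argument list for one request.
// `proxy` is optional, e.g. "http://127.0.0.1:8080" (hypothetical value).
function buildChromeArgs(url, proxy) {
  const args = ['--headless', '--disable-gpu', '--dump-dom'];
  if (proxy) {
    args.push(`--proxy-server=${proxy}`);
  }
  args.push(url);
  return args;
}

// Render `url` and write the serialized DOM to `outFile`.
// `chromeBin` is an assumption; adjust it to the binary installed on your system.
function renderToFile(url, outFile, proxy, callback, chromeBin = 'google-chrome') {
  const options = { maxBuffer: 64 * 1024 * 1024 }; // allow large pages on stdout
  execFile(chromeBin, buildChromeArgs(url, proxy), options, (err, stdout) => {
    if (err) return callback(err);
    fs.writeFile(outFile, stdout, callback);
  });
}

// Usage (not executed here):
// renderToFile('https://example.com', 'page.html', 'http://127.0.0.1:8080',
//   (err) => console.log(err ? err.message : 'saved'));
```

If startup cost matters more than per-request isolation, a long-running browser controlled over its remote debugging port is the usual alternative, but that is beyond this sketch.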

Related

How to make a user-accessible file on the user's filesystem with a chrome extension

I'm making an extension that, among other things, lets the user edit a JavaScript file in an external editor (one on the user's computer). The extension has the JavaScript file saved in chrome.storage, but it will of course be a lot easier for the user to write code in their own editor.
This is why I decided to look for something that creates a file on the user's filesystem that the user can find and edit themselves, and that syncs any changes back to the extension (either by periodically checking or by using some listener).
I have looked around but nothing really seems to fit what I'm trying to do. Chrome's fileSystem API only works for Chrome apps, not Chrome extensions, and the HTML5 fileSystem API does not allow a simple filesystem URL to be requested and opened; instead it obfuscates the stored file and makes it practically impossible to edit that file easily.
Something else I looked at, which might be more promising, is letting the user edit one of the files in the directory where the extension is stored and somehow retrieving that content. This is however going to be tough to implement given all the hash checking Chrome does on extension files, not to mention having the extension itself modify those files' contents (possibly by hacking around it: specifying your own update URL and "updating" a dummy JavaScript file that is going to be written to).
Is there any way to simply ask for a location to store a file and then allow the user to edit that file and sync it back up?
No, extensions are sandboxed from the real filesystem.
As you said, it's possible to read extension's own files; however, this is read-only for the extension and modifying those files on a deployed extension will result in Chrome detecting extension "tampering" and immediate disabling as a precaution.
The only way for a Chrome extension to escape the sandbox is, as wOxxOm suggested, a Native Host module. Note that this cannot be distributed in Chrome Web Store with the extension; it needs a separate installer.
Alternatively, you could use some sort of cloud storage with API to access it; e.g. a user could store something in a Dropbox subfolder, and your extension can authorize access to it via Dropbox API. Unfortunately, there is no "native" solution like syncFileSystem for Apps.
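To make the Native Host option above more concrete: Chrome talks to a native messaging host over stdin/stdout, and each message is UTF-8 JSON preceded by a 32-bit length in native byte order. Below is a minimal sketch of that framing for a host written in Node.js. The message fields (`cmd`, `path`) and the host name in the comment are hypothetical, and the host still needs its own manifest and installer as noted above.

```javascript
// Sketch: message framing for a Chrome native messaging host written in Node.js.
const os = require('os');

// Chrome uses the machine's native byte order for the 4-byte length prefix.
const LITTLE_ENDIAN = os.endianness() === 'LE';

// Turn an object into one framed message ready to write to stdout.
function encodeMessage(obj) {
  const body = Buffer.from(JSON.stringify(obj), 'utf8');
  const header = Buffer.alloc(4);
  if (LITTLE_ENDIAN) header.writeUInt32LE(body.length, 0);
  else header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Pull every complete message out of `buffer`; return leftover bytes as `rest`
// so the caller can prepend them to the next chunk read from stdin.
function decodeMessages(buffer) {
  const messages = [];
  let offset = 0;
  while (buffer.length - offset >= 4) {
    const length = LITTLE_ENDIAN
      ? buffer.readUInt32LE(offset)
      : buffer.readUInt32BE(offset);
    if (buffer.length - offset - 4 < length) break; // message not complete yet
    const start = offset + 4;
    messages.push(JSON.parse(buffer.toString('utf8', start, start + length)));
    offset = start + length;
  }
  return { messages, rest: buffer.subarray(offset) };
}

// Extension side (needs the "nativeMessaging" permission; host name is made up):
// const port = chrome.runtime.connectNative('com.example.file_sync');
// port.onMessage.addListener((msg) => console.log(msg));
// port.postMessage({ cmd: 'read', path: 'script.js' });
```

The host would then read or write the user's file with ordinary `fs` calls and send the content back as a framed message.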

Can I open a Windows Explorer window from a web app?

I built a CRM for a client of mine, and now they've requested an interesting feature:
For each customer record, they have a matching directory of files on their local computer. They want the ability to open that folder in Windows Explorer directly from within the web app (the app doesn't need access to the directory/files; it just has to launch Windows Explorer so that the user can interact with their files).
This is obviously not possible with regular JavaScript running in the browser (thankfully). I thought there might be some way to accomplish this by building a Chrome extension for this purpose, but it seems Chrome extensions/apps can only access a sandboxed filesystem, which doesn't serve my needs at all. Building an NPAPI plugin is out of the question since Chrome is discontinuing support for NPAPI.
File URIs don't solve this problem either. Their display is ugly, there's no drag-and-drop, no big icons/thumbnails, no sorting etc. They want the full capability of the Windows Explorer.
The only viable option I thought of is to create a local node.js server, make a localhost CORS request to that server, and then run an exec command from node.
Any better idea?
One possibility is to register a custom URI protocol handler with the user's operating system, and then your web page can contain links using your custom protocol, such as openfolder://c/path/to/folder. This sort of customization is probably most commonly seen in practice with itunes:// links.
A quick Google search led me to this decent looking tutorial: https://support.shotgunsoftware.com/hc/en-us/articles/200213756-How-to-launch-external-applications-using-custom-protocols-rock-instead-of-http-
The downside is that the user will have to run a small installer of some sort in order to set the correct registry entries (or whatever the non-Windows equivalent is for other OSes) and to drop a small script on disk. That would be much lighter-weight than running a node.js server like you proposed, though.
The linked tutorial uses a Python script, but even that is probably overkill for your needs. A batch file would likely suffice.
EDIT: One additional note, please be aware of the security implications of implementing a custom handler like this. Any webpage in any browser can potentially take advantage of your custom protocol, and an attacker would be able to pass arbitrary data to your script. You should take steps to ensure that the script will not accidentally execute arbitrary commands that may be injected by a malicious web page, and that it will only open a folder and nothing else.
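As a rough illustration of the validation described above, here is a sketch of a handler script in Node.js (the answer suggests a batch file or Python; Node is used here only to keep the examples in one language). It assumes the `openfolder://c/path/to/folder` format from the example link and a hypothetical allow-listed root `C:\CustomerFiles`. Anything that does not resolve to a folder under that root is rejected, and Explorer is started without a shell so the URI is never interpreted as a command.

```javascript
// Sketch: validate an openfolder:// URI before opening it in Windows Explorer.
const path = require('path');
const { spawn } = require('child_process');

// Hypothetical root folder that holds the customer directories.
const ALLOWED_ROOT = 'C:\\CustomerFiles';

// Convert "openfolder://c/some/folder" to "C:\some\folder".
// Returns null unless the result stays inside `allowedRoot`.
function uriToFolder(uri, allowedRoot = ALLOWED_ROOT) {
  const match = /^openfolder:\/\/([a-zA-Z])\/(.*)$/.exec(uri);
  if (!match) return null;
  let rest;
  try {
    rest = decodeURIComponent(match[2]); // browsers may percent-encode the path
  } catch (e) {
    return null; // malformed escape sequence
  }
  const candidate = path.win32.normalize(`${match[1].toUpperCase()}:\\${rest}`);
  const relative = path.win32.relative(path.win32.normalize(allowedRoot), candidate);
  const escapesRoot =
    relative === '..' ||
    relative.startsWith('..\\') ||
    path.win32.isAbsolute(relative); // different drive
  return escapesRoot ? null : candidate;
}

// Open the folder; arguments go to explorer.exe as an array, not through a shell.
function openFolder(uri) {
  const folder = uriToFolder(uri);
  if (folder === null) return false;
  spawn('explorer.exe', [folder], { detached: true, stdio: 'ignore' }).unref();
  return true;
}
```

The registry entry for the protocol would point at this script and pass the clicked URI as its argument.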
That would require each customer to run a node.js server, which seems unrealistic in your case.
You could use File URIs.
Browsers will refuse to open them by default. However, as suggested in this answer, you could ask your customers to install LocalLinks.

How to Launch a PDF from a Chrome Packaged App?

My chrome packaged app contains a PDF, and I would like to let the user view it. If I open it in the current frame I get the error "Chrome PDF Viewer is not Allowed".
Frankly, the chrome PDF viewer is pretty awful, so I'd rather let the user view it in their PDF viewer of choice anyway. If I disable the chrome PDF plugin (just as an experiment) and I try to open the PDF using chrome.app.window.open, it "downloads" the PDF, and then the user could open it. But this has two issues:
I can't realistically make the user go to chrome://plugins and do that disable
There isn't any browser window, so the user has no idea the download happened
Any suggestions? Opening PDFs that are embedded in my app is kind of a must-have feature for this app.
I've looked at this extensively, and have come to the conclusion that there's no way to get a Chrome App to open a PDF that's local. I, too, have tried data URIs.
I don't think the issue is the PDF support in the window, as it's still Chrome, or the size of the PDFs. Rather, I think it's just an engineering problem, one that might get solved someday.
As for me, I build the PDF in my Chrome App. Since I can't display it, and there's no server to upload it to, I write it to a file of the user's choosing and let the user deal with it on his/her own.
I've got this working, but whether it is a solution for you depends a lot on your use case. The solution has three parts:
Use pdfjs to do the actual rendering.
To get this running in a packaged app, you'd need to do some violence to the internationalization support. And even after you do that, you'll find that some PDFs refuse to load for no apparent reason whatsoever. So don't bother trying to make pdfjs work in a packaged app. Just:
Put your entire app into a <webview> with a persist partition, and use a HTML5 cache manifest to get all your files available for offline viewing.
Yeah, yeah, I know that cache manifests are not cool anymore. But if you can list all your files for use in a packaged app, then you are doing the one case where cache manifests actually work great.
Then use a packaged app to distribute a tiny wrapper around your page with the webview in it.
You'll also get the benefit that you don't have to rewrite your app to live within the draconian packaged app rules (eval, sync xhr, 2GB limit, etc.).
You can see a working example at m.kaon.com/c/ka (visit with Chrome to get the desktop app; if you visit that with Firefox, you'll get access to a hosted app that is using the same tricks). PDFs are down in the bottom "Why Choose Kaon" section.

Download multiple images without asking in Chrome extension

I am currently creating a Chrome extension (written mainly in JavaScript) that allows users to scrape the images on a webpage and download them. I have finished the link scraping part, and the code will return an array like:
["http://example.com/image1.jpg","http://example.com/image2.jpg"]
But how do I download all of the links in ONE CLICK? I tried listing all photos in a new tab and letting users save the page with Ctrl+S, but this hurts the UI and I do not like it. I do not host a web page, so a server-side script may not work. Any other solutions?
As far as I know, Chrome extensions technically can't save files to disk like Firefox.
The only way to do this is using NPAPI.
Unfortunately, extensions using NPAPI will most likely not be accepted by the Web Store due to security problems. Of course it'll be okay if you use it for yourself or host the extension on your website.
You can install and examine the code of the following extensions; maybe you can even use the provided NPAPI plugin too:
Screen Capture (by Google) https://chrome.google.com/webstore/detail/cpngackimfmofbokmjmljamhdncknpmg
Chrome Toolbox (by Google) https://chrome.google.com/webstore/detail/fjccknnhdnkbanjilpjddjhmkghmachn
Awesome Screenshot: Capture & Annotate https://chrome.google.com/webstore/detail/alelhddbbhepgpmgidjdcjakblofbmce
Download Assistant (by Google) - got killed, I guess.

Save webpages for offline access in web app

I have a web app (sencha/phonegap) that includes a feature allowing users to click on buttons that link to Wikipedia articles. This obviously works fine if the device has internet access, but I get numerous requests to make the app work when the app is offline too. To accomplish this, I'd like to give the user the option to download the linked articles/webpages for offline access. When the device does not have internet access, the app would instead display the saved version (which might be stale/out-of-date, but is better than nothing). What are possible ways to accomplish this task?
My first thought was to somehow use the html manifest to cache the pages in the phone's browser, which sounds possible on the Android browser, but iOS apparently has a 5MB browser cache limit - too small.
My next thought was to save the needed html & associated files and bundle them up inside the app. But this seems a rather cumbersome approach, the app becomes much larger than it needs to be, and the webpages are stale back to the date the app was installed.
Using javascript, is it possible to download webpages, which I could then save (on the sd card, for example) for access later?
Or is there a more elegant approach?
If anyone could point me in the right direction it would be much appreciated.
In pure Javascript you can make an Ajax request to download a page. Then you can use the FileWriter to write the responseText to a file on the file system. However, that won't help you when it comes to images. You'll need to use the FileTransfer.download() command to get the binary image files.
If I were you I'd:
Use AJAX to download the html.
Parse the html looking for images.
Use FileTransfer.download to get the images.
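The three steps above can be sketched as follows. The image extraction uses a simple regular expression, which is a simplification (inside the app's WebView a DOM parser would be more robust). `downloadImages` assumes the Cordova/PhoneGap file-transfer plugin is installed and that `targetDir` is a writable directory path ending in a slash; both function names are made up for illustration.

```javascript
// Sketch: find the images in downloaded HTML and fetch them for offline use.

// Step 2: collect absolute image URLs from the page's HTML.
// `pageUrl` is the address the HTML came from, used to resolve relative links.
function extractImageUrls(html, pageUrl) {
  const urls = new Set(); // Set removes duplicates
  const imgSrc = /<img\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1/gi;
  let match;
  while ((match = imgSrc.exec(html)) !== null) {
    try {
      urls.add(new URL(match[2], pageUrl).href);
    } catch (e) {
      // ignore values that cannot be resolved to a URL
    }
  }
  return Array.from(urls);
}

// Step 3: download each image with the file-transfer plugin.
// Only works inside a Cordova/PhoneGap app where FileTransfer is defined.
function downloadImages(urls, targetDir, onDone) {
  let remaining = urls.length;
  if (remaining === 0) return onDone();
  const finishOne = () => {
    remaining -= 1;
    if (remaining === 0) onDone();
  };
  urls.forEach((url, index) => {
    // Prefix with the index so equal file names from different hosts don't clash.
    const name = `${index}-${url.split('/').pop().split('?')[0]}`;
    new FileTransfer().download(encodeURI(url), targetDir + name, finishOne, finishOne);
  });
}
```

Step 1 is the plain Ajax request for the page; its `responseText` is what gets passed to `extractImageUrls`. The saved HTML would also need its `src` attributes rewritten to the local file paths before it is displayed offline.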
