Headless Chrome: Run a webpage from command line without launching it?


I have a webpage that uses D3, canvg and gif.js to generate GIFs of time-lapse maps. The page generates 3,000 gifs, one at a time. The page is not meant for public consumption.
While it works pretty well to just open this page and download the GIFs, it tends to be asking a lot of the browser. So I'm curious if there's a way to run a page headlessly from the command-line without actually opening it, but running the full app to render the page.
Why not just use Phantom from Node, you might ask? For starters, Phantom is hard! But more importantly, I've never had complete success using Phantom or any other client-side browser engine, like jsdom, to completely render SVGs exactly right.
So my question is basically whether it's possible to use Chrome instead of Phantom and launch a page from the command line that executes the page as if it was merely opened in the browser but without actually opening the page.
Thanks!

You could use Electron. The advantage would be that you could very easily save your generated GIFs, something you can't do with Chrome unless you also run a server.
Otherwise, there is documentation available for headless Chrome.
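For reference, here is a minimal sketch of driving headless Chrome from Node using only the standard library. It assumes a Chrome build that supports the --headless flag and a binary reachable as google-chrome (that name is an assumption and varies by platform); headlessArgs and runHeadless are hypothetical helper names. It only loads the page and captures a screenshot or the DOM, so on its own it does not solve saving the generated GIFs.

```javascript
// Sketch: launch Chrome in headless mode from Node (standard library only).
const { execFile } = require('child_process');

// Build the argument list for a headless run.
// `url` and `screenshotPath` are placeholders supplied by the caller.
function headlessArgs(url, screenshotPath) {
  const args = ['--headless', '--disable-gpu'];
  if (screenshotPath) {
    // Capture the rendered page to an image file
    args.push(`--screenshot=${screenshotPath}`);
  } else {
    // No screenshot requested: print the serialized DOM to stdout instead
    args.push('--dump-dom');
  }
  args.push(url);
  return args;
}

// Run Chrome and resolve with whatever it wrote to stdout.
// ASSUMPTION: `chromeBinary` is on PATH under this name; adjust per platform.
function runHeadless(url, screenshotPath, chromeBinary = 'google-chrome') {
  return new Promise((resolve, reject) => {
    execFile(chromeBinary, headlessArgs(url, screenshotPath), (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}
```

Something like runHeadless('http://localhost:8000/', 'frame.png') would then write a screenshot; the URL and file name are placeholders.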

Related

Can a browser's dev console continue executing JavaScript after a new page loads?

I'm trying to automate some online work through JavaScript and the Firefox (or Chrome) dev console. The work is mostly inputting the same (or similar) data on the same exact pages for many many people.
Example:
unique id
date 1 and 2
some more numbers
I wrote a very simple script that runs in the console and enters the data just fine.
The Problem
My script stops execution whenever it requires the page to reload or it loads another page. I cannot find any information on how to continue executing a script after a page has loaded.
My Limitations
I'm basically limited to what's in Firefox, Chrome, or Edge. Unfortunately, I cannot download any programs or tools that would make the automation any easier right now. Otherwise, I would just use Selenium and Python.
What I've Tried
First I tried to use the script that I describe above (simple DOM manipulation)
Then I tried to use the Selenium browser add-on, but I had to enter a starting URL for it to run. Selenium was not able to get past the login page of our system which is the only static URL that I can use as a starting point.
I then tried to use the Firefox Browser Console (different from the dev console) because the documentation seemed to suggest that I can use JavaScript on the entire browser (not just one tab). Unfortunately, I cannot find any helpful information on how to use the browser console for DOM manipulation. Everything that I search for points to how you create a browser extension, add-on, or how to use JavaScript on your own website.
What I Want To Do
I want to create a script that runs in a dev console. The script should take all of the data either from a separate page or an array then enter the data on each page for each person. I'll also have it prompt the user to verify the data before submission.
What I'm Looking For
What I'm hoping to get from this question is at least one of three things:
An answer to the question's title.
Being directed to documentation or some other solution that can solve any of the above problems.
Being told if this is impossible and why by those who have more experience than me (I don't understand if the problem is just a lack of knowledge or limitations on the tools themselves.)

I think you can create a Chrome extension and put your code in the background service worker, or use workers.
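Whatever re-runs the script after each page load (an extension content script is one option), the script itself needs to remember how far it got. A rough sketch of that idea follows. It assumes a storage object that behaves like window.sessionStorage (getItem/setItem with string values) and that the pages share an origin so the stored value survives navigation; CURSOR_KEY and nextRecord are hypothetical names.

```javascript
// Sketch: keep a cursor in storage so a script that is re-run on every page
// load can pick up where the previous run stopped.
// ASSUMPTION: `storage` behaves like window.sessionStorage, and something
// outside this snippet re-runs the code after each load.
const CURSOR_KEY = 'autofill.cursor'; // hypothetical key name

function nextRecord(storage, records) {
  // A missing key means no record has been processed yet
  const cursor = Number(storage.getItem(CURSOR_KEY) || 0);
  if (cursor >= records.length) return null; // nothing left to enter
  // Advance before the page navigates away, so the next load sees the new position
  storage.setItem(CURSOR_KEY, String(cursor + 1));
  return records[cursor];
}
```

On each load the script would call nextRecord(sessionStorage, people), fill in the form for the returned record, and stop when it gets null.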

python + selenium - focus to make infinite scroll load

I am scraping a limited number of items from the top of an infinite-scroll website.
links = driver.find_elements_by_xpath("//div[@class='fixed-recipe-card__info']//a")
while len(links) < 100:
    # Scroll to the bottom of the page to trigger loading of more cards
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    links = driver.find_elements_by_xpath("//div[@class='fixed-recipe-card__info']//a")
This works wonderfully when the window is active. However, if I have the test browser minimized, the new content does not load and the loop runs infinitely. I'm rather new to selenium, so I'm not quite sure why. I suspect there is a Javascript onChange that isn't being triggered. Is there a javascript command I should add to my script, or another selenium command that will cause the new content to load?
I am using Python 2.7, selenium with Chromedriver. An example site is allrecipes.com.

Do you minimize it because you are busy with other things? You can use headless mode once your code is doing what you want visually and avoid this problem.
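A sketch of what that could look like, written for Node's selenium-webdriver bindings rather than the asker's Python. It assumes the selenium-webdriver package and a matching chromedriver are installed; the two modules are passed in as parameters, and collectLinks, wanted, and maxScrolls are hypothetical names. It also caps the number of scroll rounds, so a page that stops loading cannot spin the loop forever.

```javascript
// Sketch: bounded version of the question's scroll loop, in headless Chrome.
// ASSUMPTION: `webdriver` is require('selenium-webdriver') and `chrome` is
// require('selenium-webdriver/chrome'); both are supplied by the caller.
const CARD_LINKS = "//div[@class='fixed-recipe-card__info']//a";

async function collectLinks(webdriver, chrome, url, wanted = 100, maxScrolls = 50) {
  const options = new chrome.Options().addArguments('--headless');
  const driver = await new webdriver.Builder()
    .forBrowser('chrome')
    .setChromeOptions(options)
    .build();
  try {
    await driver.get(url);
    let links = await driver.findElements(webdriver.By.xpath(CARD_LINKS));
    // Give up after maxScrolls rounds instead of looping forever
    for (let i = 0; links.length < wanted && i < maxScrolls; i++) {
      await driver.executeScript('window.scrollTo(0, document.body.scrollHeight);');
      await driver.sleep(1000); // crude wait for new content; tune as needed
      links = await driver.findElements(webdriver.By.xpath(CARD_LINKS));
    }
    // Read the hrefs before quitting; element handles are unusable afterwards
    const hrefs = await Promise.all(links.map((a) => a.getAttribute('href')));
    return hrefs;
  } finally {
    await driver.quit();
  }
}
```

It would be called as collectLinks(require('selenium-webdriver'), require('selenium-webdriver/chrome'), 'https://www.allrecipes.com/'). The one-second sleep is a placeholder for a proper wait condition.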

By the way, you should try PhantomJS as the driver if minimizing the window is a big concern. It basically works the same way as the Chrome driver, but it has no browser window, so all your code will run in the background. It worked for me and it may work for you. Happy coding! http://phantomjs.org

As you mentioned, the new content does not load when the test browser is minimized. That is pretty much expected, as Selenium needs focus on the browsing window to interact with the DOM elements.
Reason
It is worth mentioning that a webpage can change its content when focus is lost. You also need to consider that Selenium was mainly designed for testing.
Solution
Ideally, Automated Test Execution or Web Scraping must be performed within an isolated Test Environment preferably in a Test Lab configured with the required Hardware and Software configuration which must be free from Manual Intervention.

Chrome silently embedded in desktop application, and that can save files locally via Javascript

Is it possible to have a Python application with a GUI (such as TkInter or WxPython) with:
a Chrome browser as a widget occupying the main part of the GUI, displaying a certain .html page
the ability to save files locally from JavaScript that runs in the embedded Chrome (which is normally impossible in any browser for security reasons)
If it's not possible in Python, I'm open to use another language (C++, etc.).

You mention that you're open to trying platforms other than Python. Have you seen Electron? It's a framework and runtime, made by GitHub, for building desktop applications in JavaScript with full access to the file system. It's based on Chromium, the same open source project that Chrome is based on.
http://electron.atom.io
http://electron.atom.io/docs/api/file-object/

Yes, it should be doable. In current versions of wxPython there is the wx.html2 module, which provides classes for embedding a fully featured browser in a wx window. It's not Chrome itself, but probably close enough. See https://wxpython.org/Phoenix/docs/html/wx.html2.WebView.html
For your task you can probably have the javascript trigger an action which is caught by event handlers in the application GUI code, which will then save the files or do whatever you need.

Headless node.js javascript browser with screenshot capability?

Are there any headless browsers for node.js that support dumping a rendered page out to a file? I know phantomjs supports rendering to a file, but it doesn't run on node.js. I know zombie.js is a node.js headless browser, but it doesn't support rendering to a file.

I doubt you will find anything that is going to work as well as phantomjs. I would just treat the rendering as an async backend process and execute phantom in a subprocess from your main node.js process and call it a day. Rendering a web page is HARD, and since phantom is based on WebKit, it can actually do it. I don't think there will ever be a node library that can render a web page to a graphic file that isn't built upon an existing browser rendering engine. But maybe one day phantomjs will integrate more seamlessly with node.

Try Nightmare. It uses Electron, it is way faster than PhantomJS, and its API is easy and uses modern ES6 JavaScript.

This might look like a solution with a little bit of overhead...
You can use Mozilla Firefox with the MozRepl plugin. Basically, this plugin gives you a telnet port to your Firefox, which allows you to control the browser from the outside. You can open URLs, take screenshots, etc.
Running Firefox under the Xvfb server will run it in headless mode.
Now you just have to control the browser from the outside with node.js. I've seen a few examples where someone has implemented an HTTP-like interface inside the chrome.js of Firefox, so you can send an HTTP command to get a screenshot and make those HTTP calls from node.js. This might look strange, and it actually is, but it might work well for you.
http://hyperstruct.net/2009/02/05/turning-firefox-into-a-screenshot-server-with-mozrepl/
I'm running a slightly modified version in production with Perl Mojolicious in async mode to trigger the screenshots. There is one small problem, however. Plugins do work when required, but Flash usually gets activated only when it is in the visible area. That never happens here, so movies and other Flash content might not get initialized.

You might find this helpful, though it's not javascript specific.
There is a WebKit-based tool called "wkhtmltopdf" that, as I understand it, includes JavaScript support through the Qt WebKit widget. It outputs a visual representation ("screenshot" if you will) of the page in PDF format.
FWIW, there are also PHP bindings for it here: php-wkthmltox

The Chrome dev team has released Puppeteer, which can be used in Node. It uses Chrome with the headless option.
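As a rough illustration of rendering a page to a file with Puppeteer, here is a minimal sketch. It assumes the puppeteer package is installed; the module is passed in as a parameter so the function has no hard dependency of its own, and screenshotPage is a hypothetical name. Waiting for 'networkidle0' is one possible readiness signal, not necessarily the right one for every page.

```javascript
// Sketch: render a page in headless Chrome via Puppeteer and save a PNG.
// ASSUMPTION: `puppeteer` is the module returned by require('puppeteer').
async function screenshotPage(puppeteer, url, outPath) {
  const browser = await puppeteer.launch(); // headless unless configured otherwise
  try {
    const page = await browser.newPage();
    // Wait until network activity settles before capturing
    await page.goto(url, { waitUntil: 'networkidle0' });
    // fullPage captures the whole scrollable page, not just the viewport
    await page.screenshot({ path: outPath, fullPage: true });
    return outPath;
  } finally {
    // Always shut the browser down, even if navigation or capture failed
    await browser.close();
  }
}
```

A call such as screenshotPage(require('puppeteer'), 'https://example.com/', 'page.png') would write the rendered page to page.png; both arguments are placeholders.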

There's a project called Node-Chimera. Although it's not as mature as Phantomjs, it has all the features you have mentioned: it runs on native Nodejs, and allows you to render pages to a file. Repository is here: https://github.com/deanmao/node-chimera. It has examples to do exactly what you need.

How do I find out what is slowing down my page load in IE7?

http://shanamccormick.com
The page loads all the images and then says "(1 item remaining) Waiting on http:// shanamccormick com..." How can I see what it is waiting to load here, and why does it take so long?
The index.html file uses a couple of small internal JS files and one external JS file hosted on my website (jquery.min). The external JS file is 54 KB.

IE8 has a Javascript profiler, like Firebug for Firefox.
But I think you need Fiddler to profile the performance of the HTTP request/response.
If you want the JavaScript developer tools (including the profiler), I recommend moving to IE8. If you can't, IE7 has downloadable dev tools that are similar to IE8's built-in ones.

I have a few solutions that might help. I learned of these from another question I asked earlier:
Firebug Lite is a JavaScript file you can insert into your pages to simulate some Firebug features in browsers that are not named "Firefox". Firebug Lite creates the variable "firebug" and doesn't affect or interfere with HTML elements that aren't created by itself.
WebWait is a website timer. Use WebWait to benchmark your website or test the speed of your web connection. Timing is accurate because WebWait pulls down the entire website into your browser, so it takes into account Ajax/Javascript processing and image loading which other tools ignore.
Also, the IE JavaScript Real Performance Tester can test your scripts.
Hope these help!

The page doesn't exist anymore? Or is it shanamccormick.com, and the link was mistyped?
Anyway, I've found that Google's IE8.js (for IE7 and below) (https://code.google.com/p/ie7-js/) takes up to 1 minute to load on every page load (on a computer from that time). Could that have been the problem?
