Continuous headless page automation with IPC - javascript

I need to build a headless app (to run in a Docker container) that waits for an external signal and then acts on it by clicking several HTML elements (selectors, buttons, links) and filling in some input fields. All of this can be done with jQuery, and I know how to do that.
The app needs to keep the page loaded so it can act immediately; reloading the page every time takes too long. The whole sequence of receiving a signal, filling in a form and submitting it should take less than one second.
I made an Electron app that does all this, but I need to make it headless so it can run inside a Docker container.
It looks like PhantomJS could do this, but I see two problems:
The PhantomJS script needs to keep the web page loaded, because the page I need to automate is very heavy and can take more than a minute to load.
The PhantomJS script needs to be able to receive a signal and report back on progress. HTTP or file-based communication is too slow, so I'd like to use WebSockets for this.
I hope someone can point me to the right tools for this and/or to some examples of how to achieve it.
I would like to use JavaScript, but if there is a perfect solution in another modern language, I have no problem using that.

I managed to get it working inside a Docker container using Electron.
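The answer gives no details, so the following is only a minimal sketch of how the signal handling could look, not the code that was actually used. The message shape (`fields`, `submit`), the status replies and both function names are assumptions made for illustration. The page stays loaded the whole time; each signal is just turned into a few jQuery statements that are run inside it.

```javascript
// Turn a signal into jQuery statements that run inside the already-loaded page.
// JSON.stringify is used so selectors and values are safely quoted.
function buildFillScript(fields, submitSelector) {
  const lines = Object.entries(fields).map(
    ([selector, value]) =>
      `$(${JSON.stringify(selector)}).val(${JSON.stringify(value)}).trigger('change');`
  );
  lines.push(`$(${JSON.stringify(submitSelector)}).trigger('click');`);
  return lines.join('\n');
}

// runInPage and reply are injected, so the handler does not depend on a
// particular browser shell or socket library (both are assumptions here).
async function handleSignal(message, runInPage, reply) {
  const { fields, submit } = JSON.parse(message);
  reply(JSON.stringify({ status: 'started' }));
  try {
    await runInPage(buildFillScript(fields, submit));
    reply(JSON.stringify({ status: 'done' }));
  } catch (err) {
    reply(JSON.stringify({ status: 'error', error: String(err) }));
  }
}
```

In Electron, `runInPage` could be `(code) => win.webContents.executeJavaScript(code)`, and with a WebSocket server such as the `ws` package, `reply` could be `(text) => socket.send(text)`.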

Related

I am trying to use a locally hosted Django webserver's HTML button to execute a python script that's on my local machine and I'm stumped

So, I'm pretty new to Django, python, and javascript.
I have a partially functional Django webserver that, while developing it, I host locally on my machine.
I've made a button with HTML which I've figured out can be tied to a javascript script. The next step for me was to make this button "execute" a python script that's sitting on my machine, particularly on the /desktop directory. I can of course move the script though, but the important part is that I want that python script (let's call it django_test_script.py) to open its own window on my machine. It's just a file that says "hi" in stdout, then closes after 5 seconds, but due to the plan for this whole project, I want that to happen. I want to have my website open on my machine, click that button, then have the script's console pop up on my desktop and run/finish.
The eventual goal is to control an LED strip that's plugged into my raspberry pi. I want colored buttons on the website that, when clicked, run appropriate python scripts that turn the lights to that color. It wouldn't take me long to write up the python scripts themselves that would change colors, but I need to bridge the gap between "button causes a py script to run" and the python script actually running.
I see a ton of questions similar to this, but they all seem to involve running the python scripts WITHIN the webserver, like internal files that do everything and then return something to the server via an HttpResponse.
I don't need an HttpResponse. Literally all I want to do for right now is figure out how to make a script that's stored on the machine run.
I've done some reading on AJAX and I'm guessing that's involved, however everything I've tried has failed with AJAX in terms of actually getting the server to RUN a script. I've been scouring the internet for over an hour now and have found basically nothing useful (as far as I can tell) so I figured I may as well ask for help. Can someone please point me in the right direction in terms of what I'll need to do?
I'm afraid that's not possible. The browser doesn't allow web pages to execute local scripts, since everything runs in a sandbox for security reasons.
If you want to execute something, you could do one of the following:
Use an HTTP server that executes the script, and call it via AJAX.
Build it into an Electron / Node.js app, which opens up more possibilities for accessing the system. See this example
But from your requirement I assume the website will be opened on another device to remote control the Pi LED? If so, you'll need to use some sort of HTTP server, as otherwise you can't run the script on the Pi itself.
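As a rough illustration of the first option, here is a minimal sketch of an HTTP handler that runs a script when the page sends a request. It uses Node.js rather than Django, and the path `/led/red`, the script location and the port are made-up placeholders. Only commands listed in the whitelist can be started, so the URL never reaches the shell.

```javascript
const http = require('http');
const { execFile } = require('child_process');

// Hypothetical whitelist: URL path -> command to run on the server machine.
const COMMANDS = {
  '/led/red': ['python3', ['/home/pi/led_red.py']],
};

// Build a request handler; `run` defaults to execFile and can be swapped out.
function makeHandler(commands, run = execFile) {
  return (req, res) => {
    const entry = commands[req.url];
    if (req.method !== 'POST' || !entry) {
      res.writeHead(404, { 'Content-Type': 'text/plain' });
      res.end('unknown command');
      return;
    }
    const [file, args] = entry;
    // Run the script and send its stdout (or the error message) back.
    run(file, args, (err, stdout) => {
      res.writeHead(err ? 500 : 200, { 'Content-Type': 'text/plain' });
      res.end(err ? String(err.message) : stdout);
    });
  };
}

// Start listening; call startServer(8000) on the machine that owns the script.
function startServer(port) {
  return http.createServer(makeHandler(COMMANDS)).listen(port);
}
```

From the page, the button handler would then call something like `fetch('/led/red', { method: 'POST' })`.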

How to post back to server after Single Page Application release/update

I have a SPA Angular website. Whenever we release a change to the website, the user's browser does not go back to the server to get the new javascript files. The app happily keeps running in the user's browser, and while it will make ajax calls for data, the javascript files do not change. This can cause errors if the signature of the back-end API being called changes, etc. If the user refreshes the page, they get the updated javascript files and everything works fine after that.
Is there a way to tell the browser that the site has been updated and to get the new javascript files, rather than just running the app with the same files?
I use the Angular CLI to build the application, so when the website is released, the javascript files have hashes at the end etc. This isn't an issue with files being cached and not updated... it's an issue with the browser knowing that it needs to request the files or refresh the page.
You could use web workers to poll the server for changes and refresh the browser when changes are found.
An alternative to web workers is to use setInterval and simply refresh after a given time.
Yet another alternative is to have a version number in your API responses, and the JavaScript handlers would refresh the page when the version numbers are out of sync.
You could write a routine in your Angular code that:
periodically checks the version of the API to see whether changes were made
uses the periodic check to determine when the user is idle AND not on an edit page with dirty fields
refreshes the page when the condition in step 2 is met
Use this library to detect idleness:
https://github.com/shawnmclean/Idle.js
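The steps above can be sketched roughly as follows. This is a minimal illustration, not tied to Angular or to Idle.js: `fetchVersion`, `isIdle`, `isDirty` and `reload` are hypothetical callbacks that the application would supply, and the one-minute interval is an arbitrary default.

```javascript
// Reload only when the server reports a different version AND the user is
// idle AND no form has unsaved changes.
function shouldReload({ loadedVersion, serverVersion, userIdle, formDirty }) {
  return serverVersion !== loadedVersion && userIdle && !formDirty;
}

// Poll the server version at a fixed interval and reload when it is safe.
function startVersionWatch({ loadedVersion, fetchVersion, isIdle, isDirty, reload, intervalMs = 60000 }) {
  const check = async () => {
    const serverVersion = await fetchVersion();
    const decision = shouldReload({
      loadedVersion,
      serverVersion,
      userIdle: isIdle(),
      formDirty: isDirty(),
    });
    if (decision) reload();
  };
  // A failed poll is ignored; the next tick simply tries again.
  const timer = setInterval(() => check().catch(() => {}), intervalMs);
  return { check, stop: () => clearInterval(timer) };
}
```

In the browser, `fetchVersion` could read a version field from an API response (for example from a hypothetical `/api/version` endpoint), and `reload` would be `() => window.location.reload()`.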
If the updated file has the same name, adding text after a "?" in its URL, like "?ver1.1", is supposed to tell the browser that there is a new version of the file.
You can use a manifest file:
https://html.spec.whatwg.org/multipage/offline.html#manifests
Another way is with CacheStorage: remove the cached entries with caches.delete() so that the files are fetched again.
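A minimal sketch of emptying CacheStorage, assuming the standard `caches.keys()` and `caches.delete()` methods (the interface has no single call that clears everything). The function name is made up for illustration, and this only affects entries stored through the Cache API, not the browser's regular HTTP cache.

```javascript
// Delete every named cache in the given CacheStorage and return their names.
// In a browser, call it without arguments to use the global `caches` object.
async function clearAllCaches(cacheStorage = globalThis.caches) {
  const names = await cacheStorage.keys();
  await Promise.all(names.map((name) => cacheStorage.delete(name)));
  return names;
}
```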

How to have a javascript function running in page context call puppeteer functions?

I am trying to make a bot that automatically plays a game on a web page. As of right now, I can navigate to the website, sign in, and load the game page, but I am stuck here.
I would like to inject a script at the webpage level that uses jQuery to constantly scrape the webpage and determine what state the game is in. When a certain event happens, I want the script to fire a custom event(?) that would notify a function at the Node.js/Puppeteer level to execute. My problem is that I do not understand how to make Puppeteer react when a custom event from the page is fired.
For example, let's say we have a webpage-level function on a 1-second timer that scrapes the webpage for X. When X is found, I want to scrape Y data off the webpage and send it up to the Node level. When Y is received at the Node level, it moves the mouse to the Y location, or does something else with Y that cannot be done at the browser level.
I'm not sure if this is the most appropriate way to handle this kind of task, but trying to understand how async and promises work on the Node.js level without being able to use jQuery to select elements is giving me an advanced form of terminal ebola AIDS.
This answer shows a couple of different ways to send data back from an injected script to your Node.js code:
Communicate "out" from Chromium via DevTools protocol
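One such option is Puppeteer's `page.exposeFunction`, which makes a Node.js function callable from inside the page; it is not necessarily the approach the linked answer uses. The sketch below is only an illustration: the names `notifyNodeWhenFound` and `reportToNode`, the 1-second polling and the plain `querySelector` (instead of jQuery) are all assumptions.

```javascript
// Poll the page for `selector`; when it appears, hand its centre point to Node.
async function notifyNodeWhenFound(page, selector, onFound) {
  // Adds window.reportToNode in the page; calling it there runs onFound in Node.js.
  await page.exposeFunction('reportToNode', onFound);
  // The function below is serialised and executed in the page context.
  await page.evaluate((sel) => {
    const timer = setInterval(() => {
      const el = document.querySelector(sel);
      if (!el) return;
      clearInterval(timer);
      const box = el.getBoundingClientRect();
      window.reportToNode({ x: box.left + box.width / 2, y: box.top + box.height / 2 });
    }, 1000);
  }, selector);
}
```

With a real Puppeteer `page`, it could be used as `await notifyNodeWhenFound(page, '#target', ({ x, y }) => page.mouse.move(x, y));`, where `#target` is a placeholder selector.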

Get live html feed from website

When a webpage like https://poloniex.com/exchange#btc_eth is opened in the browser, we see that the browser constantly shows updated buy and sell orders. Also, in the Elements section in the chrome console, these updates are visible in the HTML tables.
Is there a way I can use a nodejs script run on my pc (so not in the browser console) to get these live html table updates from that website, without having to do a GET request every time?
If the Chrome browser is able to do it, Node.js / jQuery / AJAX should be able to do it as well. I tried the XMLHttpRequest npm module but no luck yet.
It's possible they are using token authentication which means you wouldn't be able to get all the connection info you need just from their client-side code. Have you downloaded it and looked at it yet?
If you find it's not possible to call their services, there are other free products designed for webscraping. AutoHotKey is one that can open a web page and traverse its DOM. I believe it has the ability to run in the background, but don't quote me.

What's the simplest way to refresh a particular browser tab from a Go application?

I have a client-side Golang application running on my machine. I also have a browser open, and in that browser there might be a tab running my web application (which is completely separate from the Golang app).
From the Golang app, I would like to programmatically refresh the browser tab (and maybe if possible, bring it to front, but that's less important).
I have researched this quite a lot already, and I concluded that it is not possible just by communicating with the browser: there is no standard (especially cross-platform and cross-browser) interface for triggering a refresh of a specific browser tab.
So I suppose I'll need to have some custom JS code running on the website with which my Golang application can communicate and trigger the refresh of the tab.
What's the easiest way to do this?
(I was looking at livereload.js and lrserver, but these all start with the premise that there is a folder of content we'd like to watch and automatically reload on any change. But I don't want that, I just programmatically want to trigger the refresh. Also, this Golang app is not hosting the website, it's just a separate client-side application.)
As suggested by some comments, there seems to be no API through which we could connect to a browser from Golang, query the list of tabs, and refresh a particular page (at least not in a cross-browser and cross-platform way).
One possible approach to do this is to host a small WebSocket endpoint in Golang, and connect to it from the site we want to refresh. Then send a message through the WS connection every time we want to reload the site, and in JavaScript call location.reload() when we receive the message.
I described all the details in a blog post, and uploaded a complete working example to GitHub.
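The browser side of this approach can be sketched in a few lines of JavaScript. This is not the code from the blog post: the endpoint URL and the `reload` message text are assumptions, and the Go WebSocket endpoint itself is not shown here.

```javascript
// Connect to the reload endpoint and refresh the tab when told to.
// WebSocketImpl and reload can be overridden; the defaults are the browser's own.
function listenForReload(url, options = {}) {
  const {
    WebSocketImpl = globalThis.WebSocket,
    reload = () => globalThis.location.reload(),
  } = options;
  const socket = new WebSocketImpl(url);
  socket.onmessage = (event) => {
    // 'reload' is a made-up message; use whatever the server actually sends.
    if (event.data === 'reload') reload();
  };
  return socket;
}
```

On the site, this would be called once at page load, for example `listenForReload('ws://localhost:8080/ws')`, where the address is a placeholder for wherever the Go application listens.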
