How to "hook in" puppeteer into a running Chrome instance/tab

Is it somehow possible to attach puppeteer to a running Chrome instance (a manually started browser) and then take over control within a tab? I'm assuming it is related to starting the Chrome browser with the --no-sandbox flag, but I don't know how to continue from there.
Thanks for any help

You can use puppeteer.connect(options):
const puppeteer = require('puppeteer');
(async () => {
  // WebSocket URL of the running browser (placeholder value)
  const browserWSEndpoint = 'a browser websocket endpoint to connect to';
  const browser = await puppeteer.connect({browserWSEndpoint});
  // continue from here
})();
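
The answer leaves open where the WebSocket endpoint comes from. A minimal sketch, assuming Chrome was started manually with --remote-debugging-port=9222 (the flag used in the related question below) and Node 18+ for the global fetch: Chrome then serves a /json/version document whose webSocketDebuggerUrl field can be passed to puppeteer.connect. The helper names versionUrl, extractWsEndpoint and attachToRunningChrome are made up for this illustration.

```javascript
// Build the URL of Chrome's DevTools HTTP endpoint that reports version info.
// Host and port are assumptions: they must match --remote-debugging-port.
function versionUrl(host = '127.0.0.1', port = 9222) {
  return `http://${host}:${port}/json/version`;
}

// Pull the browser-level WebSocket URL out of the /json/version payload.
function extractWsEndpoint(versionInfo) {
  const endpoint = versionInfo && versionInfo.webSocketDebuggerUrl;
  if (typeof endpoint !== 'string' || !endpoint.startsWith('ws')) {
    throw new Error('No webSocketDebuggerUrl in /json/version response');
  }
  return endpoint;
}

// Connect Puppeteer to the running browser and return the first open tab.
async function attachToRunningChrome(host, port) {
  const puppeteer = require('puppeteer'); // loaded lazily; needs `npm i puppeteer`
  const response = await fetch(versionUrl(host, port)); // global fetch: Node 18+
  const browserWSEndpoint = extractWsEndpoint(await response.json());
  const browser = await puppeteer.connect({ browserWSEndpoint });
  const [page] = await browser.pages(); // tabs that are already open
  return { browser, page };
}
```

When the script is done, calling browser.disconnect() rather than browser.close() should leave the manually started Chrome running.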

Related

how to launch Google Chrome with authenticated Proxies

I want to launch Google Chrome with authenticated proxies so I can connect a puppeteer instance to it. I'm using this command line to launch a new instance:
chrome --remote-debugging-port=9222 --user-data-dir="C:\Users\USER\AppData\Local\Google\Chrome\User Data"
I managed to use authenticated proxies with Chrome, but it was a bit complicated, especially in my case, as I want to launch multiple Chrome browsers, each with its own proxy.
I used this: proxy-login-automator
It worked fine, but as I said, it's a bit complicated and needs some work before I can integrate it the way I want to use it. This is how I am connecting to the Chrome instance:
const puppeteer = require('puppeteer');
(async () => {
  const browserURL = 'http://127.0.0.1:9222';
  const browser = await puppeteer.connect({browserURL});
  const page = await browser.newPage();
})();
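
One possible direction, offered as a sketch rather than a confirmed answer: give every Chrome instance its own debugging port, profile directory and --proxy-server flag, then supply the proxy credentials from Puppeteer with page.authenticate(), which is commonly used to answer a proxy's authentication challenge. The function names, ports, paths and proxy addresses below are hypothetical.

```javascript
// Build the command-line flags for one Chrome instance.
// Each instance needs a distinct debugging port and profile directory.
function chromeFlags({ debugPort, profileDir, proxy }) {
  return [
    `--remote-debugging-port=${debugPort}`,
    `--user-data-dir=${profileDir}`,
    `--proxy-server=${proxy}`, // e.g. "proxy.example.com:8080" (hypothetical)
  ];
}

// Attach to an instance started with the flags above and open a page
// that answers the proxy's authentication challenge.
async function openPageThroughProxy(debugPort, credentials) {
  const puppeteer = require('puppeteer'); // loaded lazily; needs `npm i puppeteer`
  const browser = await puppeteer.connect({
    browserURL: `http://127.0.0.1:${debugPort}`,
  });
  const page = await browser.newPage();
  await page.authenticate(credentials); // { username, password }
  return { browser, page };
}
```

Whether this can replace proxy-login-automator depends on the proxy setup, so it is worth testing against a single instance first.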

How to get a screenshot/preview of another website

Is there a way to get a screenshot of another website's pages?
For example: you enter a URL in an input, hit Enter, and a script gives you a screenshot of that site. I managed to do it with headless browsers, but I fear that launching one (say, PhantomJS) each time the input is used would take too much time and too many resources, since the headless browser has to fetch the new data every time. I investigated HotJar, which does something similar to what I'm looking for, but it gives you a script that you must put into the page header, which is fine by me. Afterwards, you get a preview. How does it work, and how can one replicate it?

Do you want a screenshot of your own page or of someone else's?
Own page
Run Puppeteer or PhantomJS with every build of your site. That way it only runs when the site changes, and a screenshot is ready at any time.
Foreign page
You have access to it (the owner runs your script)
Either get into the owner's build pipeline and use the solution above,
or use the approach described in "Using HTML5/Canvas/JavaScript to take in-browser screenshots".
You don't have any access
Use a long-running process that returns a screenshot on request.
Imagine a server with one URL endpoint: screenshot.example.com?facebook.com.
The long-running server keeps a Puppeteer/PhantomJS instance ready to go. When given a URL, it loads that page, takes the screenshot, and sends it back. The requesting browser simply sees a slow PNG image request.
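
The long-running server described above could look roughly like the following sketch. It assumes Node's built-in http module and a Puppeteer browser that has already been launched once and is reused for every request; parseTargetUrl and createScreenshotServer are names made up for this example.

```javascript
const http = require('http');

// Turn a request path such as "/?facebook.com" into an absolute URL to visit.
// Returns null when there is no query or it cannot be parsed as a URL.
function parseTargetUrl(requestUrl) {
  const mark = requestUrl.indexOf('?');
  if (mark === -1) return null;
  try {
    const query = decodeURIComponent(requestUrl.slice(mark + 1)).trim();
    if (!query) return null;
    // Default to https when the caller passes a bare host name.
    const candidate = /^https?:\/\//i.test(query) ? query : `https://${query}`;
    return new URL(candidate).href;
  } catch {
    return null; // malformed escape sequence or unparsable URL
  }
}

// Wrap an already-launched Puppeteer browser in an HTTP server that
// answers every request with a PNG screenshot of the requested page.
function createScreenshotServer(browser) {
  return http.createServer(async (req, res) => {
    const target = parseTargetUrl(req.url);
    if (!target) {
      res.writeHead(400, { 'Content-Type': 'text/plain' });
      res.end('Usage: /?example.com');
      return;
    }
    let page;
    try {
      page = await browser.newPage(); // one tab per request, one shared browser
      await page.goto(target, { waitUntil: 'networkidle2', timeout: 30000 });
      const image = await page.screenshot({ type: 'png' });
      res.writeHead(200, { 'Content-Type': 'image/png' });
      res.end(image);
    } catch (err) {
      res.writeHead(502, { 'Content-Type': 'text/plain' });
      res.end(`Could not capture ${target}: ${err.message}`);
    } finally {
      if (page) await page.close().catch(() => {});
    }
  });
}
```

A caller would launch the browser once with puppeteer.launch() and then run createScreenshotServer(browser).listen(3000), where the port is arbitrary. Because the server visits any URL it is given, it should not be exposed publicly without restrictions.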

You can do this with Puppeteer.
Install it with: npm i puppeteer
Save the following code to example.js:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({path: 'example.png'});
  await browser.close();
})();
and run it with:
node example.js

Capturing application screen with JavaScript

Is it possible to capture the entire window as a screenshot using JavaScript?
The application might contain many iframes and divs whose content is loaded asynchronously.
I have explored canvas2image, but it works on a single HTML element and discards any iframe present on the page.
I am looking for a solution where the capture takes care of all the iframes present.

The only way to capture the contents of an iframe using ONLY JavaScript in the webpage (no extensions and no application running outside the browser on the user's system) is to use the HTMLIFrameElement.getScreenshot() API in Firefox. This API is non-standard and ONLY works in Firefox.
In any other browser, no. A cross-origin iframe is isolated from the embedding page, so by design its contents are not accessible to the page's scripts.
The best way I have found to get a screenshot of a webpage, and the one I use, is an instance of Headless Chrome or Headless Firefox. These will take a screenshot of everything on the page, just as a user would see it.

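Yes, with Puppeteer it is possible.
1 - Just install the dependency (the full puppeteer package, since the script below requires 'puppeteer' and relies on its bundled browser):
npm i puppeteer
2 - Create a JavaScript file, screenshot.js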
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://yourweb.com');
  await page.screenshot({path: 'screenshot.png'});
  await browser.close();
})();
3 - Run:
node screenshot.js

Web pages are not the easiest things to screenshot because of their nature: they can include async elements, frames and the like, and they are usually responsive.
For your purpose, the best way is to use an external API or service. I don't think it is a good idea to try doing that with JS.
You could try https://www.url2png.com/

How to remain Chromium opened using Puppeteer?

I want to launch only one Chromium instance from a first script and then attach to it from other scripts. I know about puppeteer.connect(), but the problem is the script that is supposed to launch Chromium:
const puppeteer = require('puppeteer');
const fs = require('fs');
const logger = fs.createWriteStream('log.txt', {
  flags: 'a' // 'a' means appending (old data will be preserved)
});
(async() => {
  const browser = await puppeteer.launch({ headless: false});
  logger.write('-----Browser is launched\n');
  logger.write(browser.wsEndpoint());
})();
...it never ends because I didn't call browser.close(). Thus, I can't start running other scripts. How can I launch Chromium, obtain its endpoint, and end the script while leaving Chromium running?
(This one doesn't contain an appropriate answer)

Answering the question
Basically, you can spawn a child process with detached set to true, then exit your main script with process.exit(). For launching Chromium this way, see 1.js.
The script responsible for launching Chromium and saving the WebSocket endpoint is chromiumLauncher.js.
Once the WebSocket endpoint is saved, you can connect via puppeteer.connect; see 2.js.
I pushed the code to GitHub (dirty code).
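
Since the linked files are not reproduced here, the following is only a sketch of how the three scripts named in the answer might fit together, not the author's actual code. It assumes the endpoint is handed over through a text file; ENDPOINT_FILE and all function names are hypothetical.

```javascript
const { spawn } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical file where the launcher stores the endpoint for other scripts.
const ENDPOINT_FILE = path.join(os.tmpdir(), 'chromium-ws-endpoint.txt');

// Persist the WebSocket endpoint so that later scripts can find it.
function saveEndpoint(wsEndpoint, file = ENDPOINT_FILE) {
  fs.writeFileSync(file, wsEndpoint.trim() + '\n', 'utf8');
}

// Read the endpoint back, without the trailing newline.
function loadEndpoint(file = ENDPOINT_FILE) {
  return fs.readFileSync(file, 'utf8').trim();
}

// Role of 1.js: start the launcher script detached, then let the parent exit.
function startDetachedLauncher(launcherScript) {
  const child = spawn(process.execPath, [launcherScript], {
    detached: true,  // child runs independently of the parent process
    stdio: 'ignore', // no pipes tying the child to the parent
  });
  child.unref();     // do not keep the parent's event loop alive
  return child.pid;
}

// Role of chromiumLauncher.js: launch Chromium and record its endpoint.
async function launchAndRecord() {
  const puppeteer = require('puppeteer'); // loaded lazily; needs `npm i puppeteer`
  const browser = await puppeteer.launch({ headless: false });
  saveEndpoint(browser.wsEndpoint());
}

// Role of 2.js: attach to the Chromium instance that is already running.
async function connectToSavedBrowser() {
  const puppeteer = require('puppeteer');
  return puppeteer.connect({ browserWSEndpoint: loadEndpoint() });
}
```

Scripts that attach this way would normally finish with browser.disconnect() instead of browser.close(), so that Chromium stays open for the next script.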

Puppeteer: is there a way to access the DevTools Network API?

I am trying to use Puppeteer for end-to-end tests. These tests require accessing the network emulation capabilities of DevTools (e.g. to simulate offline browsing).
So far I am using chrome-remote-interface, but it is too low-level for my taste.
As far as I know, Puppeteer does not expose the network DevTools features (emulateNetworkConditions in the DevTools protocol).
Is there an escape hatch in Puppeteer to access those features, e.g. a way to execute a JavaScript snippet in a context in which the DevTools API is accessible?
Thanks
Edit:
OK, so it seems that I can work around the lack of an API using something like this:
const client = page._client;
const res = await client.send('Network.emulateNetworkConditions', {
  offline: true,
  latency: 40,
  downloadThroughput: 40*1024*1024,
  uploadThroughput: 40*1024*1024
});
But I suppose it is Bad Form and may slip under my feet at any time?

Update: headless Chrome now supports network throttling!
In Puppeteer, you can emulate devices (https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pageemulateoptions) but not network conditions. It's something we're considering, but headless Chrome needs to support network throttling first.
To emulate a device, I'd use the predefined devices found in DeviceDescriptors:
const puppeteer = require('puppeteer');
const devices = require('puppeteer/DeviceDescriptors');
const iPhone = devices['iPhone 6'];
puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  await page.emulate(iPhone);
  await page.goto('https://www.google.com');
  // other actions...
  await browser.close(); // wait for the browser to shut down
});
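
For the network side of the question, a hedged sketch that avoids the private page._client property: it assumes a Puppeteer version that exposes page.target().createCDPSession(), and sends the same Network.emulateNetworkConditions command through that public session. The helper names networkConditions and emulateNetwork are made up for this example.

```javascript
// Build the parameters for the DevTools method Network.emulateNetworkConditions.
// Latency is in milliseconds; throughput is in bytes per second, -1 = no throttling.
function networkConditions({
  offline = false,
  latencyMs = 0,
  downloadBps = -1,
  uploadBps = -1,
} = {}) {
  return {
    offline,
    latency: latencyMs,
    downloadThroughput: downloadBps,
    uploadThroughput: uploadBps,
  };
}

// Open a DevTools protocol session for the page and apply the conditions.
async function emulateNetwork(page, options) {
  const client = await page.target().createCDPSession();
  await client.send('Network.enable');
  await client.send('Network.emulateNetworkConditions', networkConditions(options));
  return client; // keep it to change or reset the conditions later
}
```

For example, await emulateNetwork(page, { offline: true }) before page.goto(...) should make the navigation fail as it would without a connection.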
