Puppeteer Launch Incognito - javascript

I am connected to a browser using a ws endpoint (puppeteer.connect({ browserWSEndpoint: '' })).
When I launch the browser that I ultimately connect to, is there a way to launch it in incognito mode?
I know I can do something like this:
const incognito = await this.browser.createIncognitoBrowserContext();
But it seems like the incognito session is tied to the originally opened browser. I just want it to be by itself.
I also see you can do this:
const baseOptions: LaunchOptions = { args: ['--incognito']};
But I am not sure if this is the best way or not.
Any advice would be appreciated. Thank you!

The best way to accomplish your goal is to launch the browser directly into incognito mode by passing the --incognito flag to puppeteer.launch():
const browser = await puppeteer.launch({
  args: [
    '--incognito',
  ],
});
Alternatively, you can create a new incognito browser context after launching the browser using browser.createIncognitoBrowserContext():
const browser = await puppeteer.launch();
const context = await browser.createIncognitoBrowserContext();
You can check whether a browser context is incognito using browserContext.isIncognito():
if (context.isIncognito()) { /* ... */ }

The solutions above didn't work for me:
an incognito window is created, but when a new page is then created, that page is no longer incognito.
The solution that worked for me was:
const browser = await puppeteer.launch();
const context = await browser.createIncognitoBrowserContext();
const page = await context.newPage();
You can then use page, which is an incognito page.
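For the case in the question, where the browser is reached through puppeteer.connect(), the same pattern can be wrapped in a small helper. This is only a sketch, not code from the answers: openIsolatedPage is a hypothetical name, and the fallback to createBrowserContext() assumes a recent Puppeteer release (v22 or later), where createIncognitoBrowserContext() was renamed.

```javascript
// Hypothetical helper: open a page in a fresh, isolated browser context.
// `browser` is whatever puppeteer.launch() or puppeteer.connect() returned.
async function openIsolatedPage(browser) {
  // Older Puppeteer exposes createIncognitoBrowserContext();
  // newer releases (assumed: v22+) expose createBrowserContext() instead.
  const create =
    typeof browser.createIncognitoBrowserContext === 'function'
      ? browser.createIncognitoBrowserContext
      : browser.createBrowserContext;
  const context = await create.call(browser);
  // Pages must be created from the context, not from the browser,
  // otherwise they land in the default (non-incognito) context.
  const page = await context.newPage();
  return { context, page };
}

// Usage sketch (the endpoint value is a placeholder):
// const browser = await puppeteer.connect({ browserWSEndpoint: '<ws endpoint>' });
// const { context, page } = await openIsolatedPage(browser);
// ...
// await context.close(); // disposes the context and its pages
```

Each context created this way has its own cookies and storage, so it does not share session data with the browser's default context.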

For Puppeteer Sharp it's rather messy, but this seems to work. Hopefully it helps someone.
using (Browser browser = await Puppeteer.LaunchAsync(options))
{
    // create the incognito context
    var context = await browser.CreateIncognitoBrowserContextAsync();
    // get the page created by default when LaunchAsync ran and close it, keeping the browser active
    var browserPages = await browser.PagesAsync();
    await browserPages[0].CloseAsync();
    // create a new page using the incognito context
    using (Page page = await context.NewPageAsync())
    {
        // do something
    }
}

Related

Puppeteer change proxy of already opened browser page

Once I open a Puppeteer browser page and navigate to a specific site, the proxy used by the browser may get banned. When that happens, I want to switch proxies without closing or restarting the browser; ideally a page refresh would be enough. I have not been able to figure out how to do this. I tried putting the '--proxy-server=xxxx' value in a variable and changing that variable whenever the proxy is banned, but that didn't work. I've tried many other things too, without success. Any help would be much appreciated.
var proxies = get_proxy() // get proxy
var useragent = randomUseragent.getRandom()
let browser = await puppeteer.launch({
  headless: false,
  args: [
    `--proxy-server=${proxies.address}:${proxies.port}`,
    `--user-agent="${useragent}"`,
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-infobars',
    '--window-position=0,0',
    '--ignore-certificate-errors',
    '--ignore-certificate-errors-spki-list'
  ]
})
let page = await browser.newPage()
Example of how I want it to work:
const proxy_banned = await check_proxy_ban()
if (proxy_banned) {
  // switch the Puppeteer browser's proxy
  // refresh the page
  // return
} else {
  // return
}
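Since --proxy-server is a launch flag, changing a variable afterwards has no effect on a browser that is already running. One possible workaround, sketched here under assumptions, is to open a new browser context with a different proxy instead of restarting the browser. The sketch assumes a Puppeteer version whose context-creation method accepts a proxyServer option; rotateProxy is a hypothetical name, and the asker's own get_proxy helper is passed in rather than defined.

```javascript
// Hypothetical helper: replace a banned proxy by opening a new context.
// Assumes createBrowserContext() (or the older createIncognitoBrowserContext())
// accepts { proxyServer }, which depends on the installed Puppeteer version.
async function rotateProxy(browser, oldContext, getProxy, url) {
  const proxy = await getProxy(); // expected shape: { address, port }
  const options = { proxyServer: `${proxy.address}:${proxy.port}` };
  const context = browser.createBrowserContext
    ? await browser.createBrowserContext(options)
    : await browser.createIncognitoBrowserContext(options);
  const page = await context.newPage();
  await page.goto(url); // load the site again through the new proxy
  if (oldContext) await oldContext.close(); // drop pages using the banned proxy
  return { context, page };
}

// Usage sketch:
// if (await check_proxy_ban()) {
//   ({ context, page } = await rotateProxy(browser, context, get_proxy, url));
// }
```

A new context starts with empty cookies and storage, so any login state would have to be re-established after the switch.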

Puppeteer and Playwright chrome headful bugs when making a screenshot

I am currently developing a Node.js script that needs to launch a headful Chromium instance using Puppeteer and then take a screenshot of a page every 3 seconds. This is my code:
const puppeteer = require('puppeteer');
async function init() {
  // launch headful, matching the description above
  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();
  await page.goto('https://example.com');
  screenshot(page);
}
async function screenshot(page) {
  const buffer = await page.screenshot();
  const imageBuffer = buffer.toString('base64');
  // save imageBuffer to database
  // schedule the next screenshot in 3 seconds
  setTimeout(screenshot, 3000, page);
}
init();
My current issue is that the user needs to be able to keep using the browser and the computer normally, but this is impossible because:
The page lags when the screenshot is taken, as you can see in the following video: https://youtu.be/Tl2w-qKckkc
The browser window takes focus and jumps on top of all other windows when the screenshot is taken.
I also tried Playwright, but the same bug occurs when using it with Chromium. Can someone please help?
In Playwright, do the following:
// Affects all the platforms.
const page = await browser.newPage({ viewport: null });
// Local fix for those using Apple hardware with Retina displays.
const page = await browser.newPage({ deviceScaleFactor: 2 });
I posted a detailed reply at https://github.com/microsoft/playwright/issues/2576. Please feel free to follow up and ask questions / request features there!

Can the browser be turned headless mid-execution when it was started normally, or vice versa?

I want to start a Chromium browser instance headless, do some automated operations, and then make it visible before doing the rest.
Is this possible to do using Puppeteer, and if it is, can you tell me how? And if it is not, is there any other framework or library for browser automation that can do this?
So far I've tried the following but it didn't work.
const browser = await puppeteer.launch({'headless': false});
browser.headless = true;
const page = await browser.newPage();
await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
await page.pdf({path: 'hn.pdf', format: 'A4'});
Short answer: It's not possible
Chrome only allows you to start the browser in either headless or non-headless mode. You have to specify the mode when you launch the browser, and it is not possible to switch at runtime.
What is possible is to launch a second browser and reuse cookies (and any other data) from the first browser.
Long answer
You would assume that you could just reuse the data directory when calling puppeteer.launch, but this is currently not possible due to multiple bugs (#1268, #1270 in the puppeteer repo).
So the best approach is to save any cookies or local storage data that you need to share between the browser instances and restore the data when you launch the second browser. You then visit the website a second time. Be aware that any state the website holds in JavaScript variables will be lost when you recrawl the page.
Process
Summing up, the whole process should look like this (or vice versa for headless to headful):
Crawl in non-headless mode until you want to switch mode
Serialize cookies
Launch or reuse second browser (in headless mode)
Restore cookies
Revisit page
Continue crawling
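The serialize, restore, and revisit steps above might look roughly like this. It is a sketch that assumes page.cookies() and page.setCookie(), which exist in Puppeteer versions of this era but may be deprecated in newer releases. moveSession is a hypothetical name, and local storage would need separate handling.

```javascript
// Hypothetical helper: copy cookies from a page in the first browser
// to a page in the second browser, then revisit the same URL.
async function moveSession(fromPage, toPage) {
  const url = fromPage.url();
  const cookies = await fromPage.cookies(); // cookies for the current URL
  // Round-trip through JSON to show the data could also be written to
  // disk if the two browsers do not run at the same time.
  const serialized = JSON.stringify(cookies);
  const restored = JSON.parse(serialized);
  if (restored.length > 0) {
    await toPage.setCookie(...restored);
  }
  await toPage.goto(url); // in-page JavaScript state starts fresh here
  return restored.length;
}

// Usage sketch:
// const headful = await puppeteer.launch({ headless: false });
// const headless = await puppeteer.launch({ headless: true });
// await moveSession(pageInHeadful, await headless.newPage());
```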
As mentioned, this isn't currently possible since the headless switch occurs via Chromium launch flags.
I usually do this with userDataDir, which the Chromium docs describe as follows:
The user data directory contains profile data such as history, bookmarks, and cookies, as well as other per-installation local state.
Here's a simple example. This launches a browser headlessly, sets a local storage value on an arbitrary page, closes the browser, re-opens it headfully, retrieves the local storage value and prints it.
const puppeteer = require("puppeteer"); // ^18.0.4
const url = "https://www.example.com";
const opts = {userDataDir: "./data"};
let browser;
(async () => {
  {
    browser = await puppeteer.launch({...opts, headless: true});
    const [page] = await browser.pages();
    await page.goto(url, {waitUntil: "domcontentloaded"});
    await page.evaluate(() => localStorage.setItem("hello", "world"));
    await browser.close();
  }
  {
    browser = await puppeteer.launch({...opts, headless: false});
    const [page] = await browser.pages();
    await page.goto(url, {waitUntil: "domcontentloaded"});
    const result = await page.evaluate(() => localStorage.getItem("hello"));
    console.log(result); // => world
  }
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());
Change const opts = {userDataDir: "./data"}; to const opts = {}; and you'll see null print instead of world; the user data doesn't persist.
The answer from a few years ago mentions issues with userDataDir and suggests a cookies solution. That's fine, but I haven't had any issues with userDataDir so either they've been resolved on the Puppeteer end or my use cases haven't triggered the issues.
There's a useful-looking answer from a reputable source in How to turn headless on after launch? but I haven't had a chance to try it yet.

How can I use Puppeteer with my current Chrome (keeping my credentials)?

I'm trying to use Puppeteer for scraping, and I need to use my current Chrome so that I keep all my credentials instead of logging in again and typing my password each time, which wastes a lot of time.
Is there a way to connect to it? How do I do that?
I'm using Node v11.1.0 and Puppeteer 1.10.0.
let scrape = async () => {
  const browser = await log()
  const page = await browser.newPage()
  const delayScroll = 200
  // Login
  await page.goto('somesite.com');
  await page.type('#login-email', '*******');
  await page.type('#login-password', "******");
  await page.click('#login-submit');
  // Wait to login
  await page.waitFor(1000);
}
Ideally I would not need the login code at all: I want to open the page headlessly (I don't want to see the page opening, I'm only using the scraped information in Node) but with my current Chrome, which is already logged in and has the information I need. (In the end I want to use this as a Chrome extension.)
Thanks in advance if someone knows how to do that.
First, welcome to the community.
You can use Chrome instead of Chromium, but in my experience that produces a lot of errors and makes a mess of my personal tabs. Instead, you can create and save a profile, then log in with an existing or a new account.
Your code has a function called "log"; I'm guessing that is where you launch Puppeteer.
const browser = await log()
Inside that function, pass an argument that sets a relative directory for your profile data:
const browser = await puppeteer.launch({
  args: ["--user-data-dir=./Google/Chrome/User Data/"]
});
Run your application and log in with an account; the next time you start it, your credentials should still be there.
If anything is unclear, please add a comment.
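The same profile directory can also be given through Puppeteer's userDataDir launch option instead of a raw Chromium flag. A minimal sketch: profileLaunchOptions is a hypothetical helper, and the directory name in the usage line is only an example.

```javascript
const path = require('path');

// Hypothetical helper: build launch options that persist the profile
// (cookies, logins) in the given directory.
function profileLaunchOptions(profileDir, headless = true) {
  return {
    headless,
    // Resolve against the current working directory so the browser
    // receives an absolute, unambiguous profile location.
    userDataDir: path.resolve(profileDir),
  };
}

// Usage sketch: log in once with a visible window, then reuse the
// same directory headlessly on later runs.
// const browser = await puppeteer.launch(profileLaunchOptions('./chrome-profile', false));
```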

Detect and test Chrome Extension using Puppeteer

Is there a way to test a Chrome extension using Puppeteer? For example can an extension detect that Chrome was launched in "test" mode to provide different UI, check content scripts are working, etc?
Passing --user-agent in puppeteer.launch() is a useful way to override the browser's UA with a custom value. Then, your extension can read back navigator.userAgent in its background page and identify that Chrome was launched with Puppeteer. At that point, you can provide different code paths for testing the crx vs. normal operation.
puppeteer_script.js
const puppeteer = require('puppeteer');
const CRX_PATH = '/path/to/crx/folder/';
puppeteer.launch({
  headless: false, // extensions only supported in full chrome.
  args: [
    `--disable-extensions-except=${CRX_PATH}`,
    `--load-extension=${CRX_PATH}`,
    '--user-agent=PuppeteerAgent'
  ]
}).then(async browser => {
  // ... do some testing ...
  await browser.close();
});
Extension background.js
chrome.runtime.onInstalled.addListener(details => {
  console.log(navigator.userAgent); // "PuppeteerAgent"
});
Alternatively, if you wanted to preserve the browser's original UA string, it gets tricky.
Launch Chrome and create a blank page in Puppeteer.
Set its title to a custom name.
Detect the tab's title update in your background script.
Set a global flag to reuse later.
background.js
let LAUNCHED_BY_PUPPETEER = false; // reuse in other parts of your crx as needed.
chrome.tabs.onUpdated.addListener((tabId, info, tab) => {
  if (!LAUNCHED_BY_PUPPETEER && tab.title.includes('PuppeteerAgent')) {
    chrome.tabs.remove(tabId);
    LAUNCHED_BY_PUPPETEER = true;
  }
});
puppeteer_script.js
const puppeteer = require('puppeteer');
const CRX_PATH = '/path/to/crx/folder/';
puppeteer.launch({
  headless: false, // extensions only supported in full chrome.
  args: [
    `--disable-extensions-except=${CRX_PATH}`,
    `--load-extension=${CRX_PATH}`,
  ]
}).then(async browser => {
  const page = await browser.newPage();
  await page.evaluate("document.title = 'PuppeteerAgent'");
  // ... do some testing ...
  await browser.close();
});
Note: The downside is that this approach requires the "tabs" permission in manifest.json.
Testing an extension page
Let's say you wanted to test your popup page UI? One way to do that would be to navigate to its chrome-extension:// URL directly, then use puppeteer to do the UI testing:
// Can we navigate to a chrome-extension page? YES!
const page = await browser.newPage();
await page.goto('chrome-extension://ipfiboohojhbonenbbppflmpfkakjhed/popup.html');
// click buttons, test UI elements, etc.
To create a stable extension id for testing, check out: https://stackoverflow.com/a/23877974/274673
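Instead of hard-coding the extension ID in the URL, it can usually be read from the browser's targets at runtime. The sketch below assumes the extension has a background page or service worker whose target URL starts with chrome-extension://; findExtensionId is a hypothetical helper.

```javascript
// Hypothetical helper: find the ID of a loaded extension from the
// browser's targets (as returned by browser.targets()).
function findExtensionId(targets) {
  for (const target of targets) {
    const type = target.type();
    if (type !== 'background_page' && type !== 'service_worker') continue;
    // Extension IDs are 32 characters drawn from the letters a-p.
    const match = /^chrome-extension:\/\/([a-p]{32})\//.exec(target.url());
    if (match) return match[1];
  }
  return null; // no extension background target found
}

// Usage sketch:
// const id = findExtensionId(browser.targets());
// await page.goto(`chrome-extension://${id}/popup.html`);
```

If several extensions are loaded, this returns the first match, which is why the launch examples restrict loading with --disable-extensions-except.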
