I open a Puppeteer browser page and navigate to a specific site. If the proxy used for the browser gets banned, I want to be able to switch it without closing or restarting the browser; simply refreshing the page would be the best option. However, I have not yet been able to figure out how to do it. I've tried putting the '--proxy-server=xxxx' value in a variable and changing that variable whenever the proxy is banned, but that didn't work. I've tried many other things too, without success. Any kind of help would be much appreciated.
var proxies = get_proxy() // get proxy
var useragent = randomUseragent.getRandom()
let browser = await puppeteer.launch({
headless: false,
args: [
`--proxy-server=${proxies.address}:${proxies.port}`,
`--user-agent="${useragent}"`,
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-infobars',
'--window-position=0,0',
'--ignore-certificate-errors',
'--ignore-certificate-errors-spki-list'
]
})
let page = await browser.newPage()
Example of how I want it to work:
const proxy_banned = await check_proxy_ban()
if (proxy_banned) {
// switch the Puppeteer browser's proxy
// refresh the page
// return
} else {
// return
}
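A note on why changing the variable has no effect: Chromium reads `--proxy-server` once at launch, so the running browser never sees a later change. One possible workaround, sketched below, is to keep the browser open and move the work into a fresh browser context that has its own proxy. This assumes a Puppeteer version whose `createIncognitoBrowserContext` accepts a `proxyServer` option. `ProxyPool`, `toProxyUrl` and `openPageWithProxy` are hypothetical helper names, and the `{ address, port }` shape is taken from the question's `get_proxy()`.

```javascript
// Round-robin pool of proxies shaped like the question's get_proxy() result.
class ProxyPool {
  constructor(proxies) {
    if (!proxies.length) throw new Error('ProxyPool needs at least one proxy');
    this.proxies = proxies;
    this.index = 0;
  }

  // The proxy currently in use.
  current() {
    return this.proxies[this.index];
  }

  // Advance to the next proxy (wrapping around) and return it.
  rotate() {
    this.index = (this.index + 1) % this.proxies.length;
    return this.current();
  }
}

// Assumption: the proxies speak HTTP; adjust the scheme for SOCKS proxies.
const toProxyUrl = (proxy) => `http://${proxy.address}:${proxy.port}`;

// Opens a page in a new browser context that routes traffic through `proxy`.
// Assumption: createIncognitoBrowserContext accepts { proxyServer } in the
// installed Puppeteer version.
async function openPageWithProxy(browser, proxy) {
  const context = await browser.createIncognitoBrowserContext({
    proxyServer: toProxyUrl(proxy),
  });
  const page = await context.newPage();
  return { context, page };
}
```

When `check_proxy_ban()` reports a ban, the idea would be to close the old context, call `pool.rotate()`, and open the same URL again through `openPageWithProxy(browser, pool.current())`. The browser process itself keeps running.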
Related
I am trying to build a simple scraper for this website by using puppeteer.
The code goes as follows:
const browser = await puppeteer.launch({
headless: false
});
const page = await browser.newPage();
let pagelink = "https://www.speisekarte.de/berlin/restaurants?page=1"
await page.waitFor(3 * 1000);
await page.goto(pagelink);
await page.waitFor(3 * 1000);
await page.waitForSelector("#notice")
However, I cannot access the cookie overlay notice, which should have the id "notice".
await page.waitForSelector("#notice")
does not work in my Puppeteer code.
Nor does document.getElementById("notice") in Chromium when I use the console manually during the session. It does not work in Firefox's console either. Funnily enough, calls like
document.querySelectorAll("button")
work as expected. I checked with a colleague, and she can access the element using the queries above in both her Chrome and her Firefox browser. She also uses a Mac. Any idea what is happening here? Any help would be much appreciated.
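Since the notice seems to be present for some visitors and absent for others, a scraper may be better off treating it as optional rather than blocking on it. The sketch below wraps `page.waitForSelector` so that a timeout yields `null` instead of an exception. `waitForOptional` is a hypothetical helper name, and the check on `err.name === 'TimeoutError'` assumes Puppeteer's timeout error carries that name.

```javascript
// Waits up to `timeoutMs` for `selector`; resolves to the element handle,
// or to null if the element never shows up. Other errors are rethrown.
async function waitForOptional(page, selector, timeoutMs = 5000) {
  try {
    return await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Assumption: Puppeteer's timeout error is named 'TimeoutError'.
    if (err && err.name === 'TimeoutError') return null;
    throw err;
  }
}

// Possible usage inside an async function:
//   const notice = await waitForOptional(page, '#notice');
//   if (notice) { /* handle the overlay */ }
```

This does not explain why the element is missing in one browser session, but it lets the rest of the scraper run either way.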
Most websites ask you to accept cookies and privacy settings when you load them; I think this is mainly in the EU.
I'm struggling to reuse the cookies so that I don't have to keep clicking "accept all" every time I start Chrome.
My idea is that if I click "accept all" the first time and save the cookie, I can write code that loads the cookie file, so the website knows I have already accepted its cookies and the prompt doesn't pop up again.
The website I'm using for this example is https://finviz.com/
const puppeteer = require('puppeteer')
const fs = require('fs')
;(async () => {
const browser = await puppeteer.launch({ headless: false })
const page = await browser.newPage()
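// fs.readFile is callback-based; fs.promises.readFile returns a promise
const cookiesString = await fs.promises.readFile('./cookies.json', 'utf8')
const cookies = JSON.parse(cookiesString)
// set the cookies before navigating so the first request already carries them
await page.setCookie(...cookies)
await page.goto('https://finviz.com/')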
})()
Writing an app that listens for cookies being set, copies them to a file, and restores them when the browser is restarted is complicated at best. The same applies if you want to save the cookies manually.
But if you do that, deleting the cookies becomes unnecessary, so you could simply allow cookies in your browser's settings.
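For readers who still want to try the save-and-restore approach from the question, here is a minimal sketch. It assumes a Puppeteer version that provides `page.cookies()` and `page.setCookie()`. `saveCookies` and `loadCookies` are hypothetical helper names, and the file path is whatever the caller chooses. Whether this suppresses a given consent prompt depends on the site, since some sites store consent outside cookies (for example in localStorage).

```javascript
const fs = require('fs').promises;

// Writes the cookies visible to `page` to `file` as JSON.
async function saveCookies(page, file) {
  const cookies = await page.cookies();
  await fs.writeFile(file, JSON.stringify(cookies, null, 2));
}

// Restores cookies from `file` into `page`.
// Resolves to false when no cookie file exists yet, true otherwise.
async function loadCookies(page, file) {
  let raw;
  try {
    raw = await fs.readFile(file, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return false; // first run: nothing saved yet
    throw err;
  }
  const cookies = JSON.parse(raw);
  if (cookies.length > 0) await page.setCookie(...cookies);
  return true;
}
```

A typical flow would call `loadCookies(page, './cookies.json')` before `page.goto(...)`, and `saveCookies(page, './cookies.json')` once "accept all" has been clicked.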
I am currently developing a Node.js script that needs to launch a headful Chromium instance using Puppeteer and then take a screenshot of a page every 3 seconds. This is my code:
const puppeteer = require('puppeteer');
async function init (){
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('https://example.com');
screenshot(page)
};
async function screenshot(page){
let buffer = await page.screenshot();
let imageBuffer = buffer.toString('base64');
// save imageBuffer to database
setTimeout(screenshot, 3000, page)
}
My current issue is that the user must still be able to navigate normally in the browser and on their computer, but this is impossible because:
The page lags while the screenshot is taken, as you can see in the following video: https://youtu.be/Tl2w-qKckkc
The browser window takes focus and jumps on top of all other windows when the screenshot is taken.
I also tried Playwright, but the same bug occurs with Chromium. Can someone please help?
In Playwright, do the following:
// Affects all the platforms.
const page = await browser.newPage({ viewport: null });
// Local fix for those using Apple hardware with Retina displays.
const page = await browser.newPage({ deviceScaleFactor: 2 });
I posted a detailed reply at https://github.com/microsoft/playwright/issues/2576. Please feel free to follow up and ask questions / request features there!
I'm trying to use Puppeteer for scraping, and I need it to use my existing Chrome so that all my credentials are kept, instead of logging in again and typing the password each time, which is a real waste of time!
Is there a way to connect to it? How would I do that?
I'm using Node v11.1.0
and Puppeteer 1.10.0.
let scrape = async () => {
const browser = await log()
const page = await browser.newPage()
const delayScroll = 200
// Login
await page.goto('somesite.com');
await page.type('#login-email', '*******');
await page.type('#login-password', "******");
await page.click('#login-submit');
// Wait to login
await page.waitFor(1000);
}
Ideally I would not need the login code above and could go straight to the page (headless; I don't want to see the page opening, I'm just using the scraped info in Node), using my existing Chrome, which is already logged in and has the information I need (in the end I want to use this as a Chrome extension).
Thanks in advance if someone knows how to do that.
First, welcome to the community.
You can use Chrome instead of Chromium, but honestly, in my case I got a lot of errors and it made a mess of my personal tabs. Instead, you can create and save a profile, then log in with an existing or a new account.
In your code you have a function called "log"; I'm guessing that is where you launch Puppeteer.
const browser = await log()
Inside that function, pass launch arguments and create a relative directory for your profile data:
const browser = await puppeteer.launch({
args: ["--user-data-dir=./Google/Chrome/User Data/"]
});
Run your application and log in with an account; the next time you start it, your credentials should still be there.
If anything is unclear, please add a comment.
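As a variation on the profile idea above, Puppeteer's launch options also accept a `userDataDir` field, which avoids quoting problems with spaces in a command-line argument. The sketch below only builds the options object. `buildLaunchOptions` is a hypothetical helper and `./profile` is an assumed directory name, not something from the original answer.

```javascript
const path = require('path');

// Builds launch options that keep the browser profile (cookies, logins)
// in `profileDir`, resolved to an absolute path.
function buildLaunchOptions(profileDir, headless = true) {
  return {
    headless,
    userDataDir: path.resolve(profileDir),
  };
}

// Possible usage inside an async function:
//   first run, headful, to log in by hand:
//     const browser = await puppeteer.launch(buildLaunchOptions('./profile', false));
//   later runs, reusing the same profile:
//     const browser = await puppeteer.launch(buildLaunchOptions('./profile'));
```

Whether a session created in a headful run is fully honoured by a later headless run may depend on the site and the browser version, so it is worth verifying on the target site.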
I am connected to a browser using a ws endpoint (puppeteer.connect({ browserWSEndpoint: '' })).
When I launch the browser that I ultimately connect to, is there a way to launch this in incognito?
I know I can do something like this:
const incognito = await this.browser.createIncognitoBrowserContext();
But it seems like the incognito session is tied to the originally opened browser. I just want it to be by itself.
I also see you can do this:
const baseOptions: LaunchOptions = { args: ['--incognito']};
But I am not sure if this is the best way or not.
Any advice would be appreciated. Thank you!
The best way to accomplish your goal is to launch the browser directly into incognito mode by passing the --incognito flag to puppeteer.launch():
const browser = await puppeteer.launch({
args: [
'--incognito',
],
});
Alternatively, you can create a new incognito browser context after launching the browser using browser.createIncognitoBrowserContext():
const browser = await puppeteer.launch();
const context = await browser.createIncognitoBrowserContext();
You can check whether a browser context is incognito using browserContext.isIncognito():
if (context.isIncognito()) { /* ... */ }
The solutions above didn't work for me:
an incognito window is created, but when a new page is then created, it is no longer incognito.
The solution that worked for me was:
const browser = await puppeteer.launch();
const context = await browser.createIncognitoBrowserContext();
const page = await context.newPage();
Then you can use page, and it's an incognito page.
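Building on the snippet above, the context can be wrapped in a small helper that always closes it afterwards, so each task gets its own isolated session even when the browser was reached through `puppeteer.connect`. `withIncognitoPage` is a hypothetical name. The sketch assumes `browser.createIncognitoBrowserContext()` and `context.close()` as used in this thread; I believe newer Puppeteer releases rename the former to `createBrowserContext`.

```javascript
// Runs `task(page)` in a fresh incognito context and closes the context
// afterwards, whether the task succeeds or throws.
async function withIncognitoPage(browser, task) {
  const context = await browser.createIncognitoBrowserContext();
  try {
    const page = await context.newPage();
    return await task(page);
  } finally {
    // Closing the context also closes every page that belongs to it.
    await context.close();
  }
}

// Possible usage inside an async function:
//   const title = await withIncognitoPage(browser, async (page) => {
//     await page.goto('https://example.com');
//     return page.title();
//   });
```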
For PuppeteerSharp it's rather messy, but this seems to work. Hopefully it helps someone.
using (Browser browser = await Puppeteer.LaunchAsync(options))
{
// create the incognito context
var context = await browser.CreateIncognitoBrowserContextAsync();
// get the page created by default when launch async ran and close it whilst keeping the browser active
var browserPages = await browser.PagesAsync();
await browserPages[0].CloseAsync();
// create a new page using the incognito context
using (Page page = await context.NewPageAsync())
{
// do something
}
}