How to handle "accept cookies"? - javascript

I am trying to make a scraper that gets the reviews for hotels on tripadvisor.com. I was working on pagination and testing whether the browser would go all the way to the end, where there are no more pages.
Here is my code so far:
const puppeteer = require("puppeteer");
const cheerio = require("cheerio");

async function main() {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://www.tripadvisor.com/Hotels-g298656-Ankara-Hotels.html');
  while (true) {
    await page.click('a[class="nav next ui_button primary"]');
    await page.waitForNavigation({ waitUntil: 'networkidle0' });
  }
}

main();
However, this stops when the 'accept cookies' popup appears. How can I handle this?
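One possible approach, offered as a sketch rather than a confirmed fix: wait briefly for the consent button, click it if it shows up, and carry on if it does not. The helper names `acceptCookiesIfPresent` and `clickThroughPages`, the 5-second default timeout, and any accept-button selector are assumptions made for illustration; the real selector has to be read from the banner in DevTools. The loop also assumes that each click on "next" triggers a full navigation, as the question's code does.

```javascript
// Sketch: dismiss a cookie-consent banner if one appears.
// `page` is a Puppeteer Page; `selector` must match the banner's accept button.
async function acceptCookiesIfPresent(page, selector, timeoutMs = 5000) {
  try {
    // Wait until the button is rendered and visible, then click it.
    await page.waitForSelector(selector, { visible: true, timeout: timeoutMs });
    await page.click(selector);
    return true;
  } catch (err) {
    // Any failure here (typically a timeout) is treated as "no banner to dismiss".
    return false;
  }
}

// Sketch: click "next" until the button is gone; returns the number of pages seen.
async function clickThroughPages(page, nextSelector, acceptSelector) {
  let pagesVisited = 1;
  await acceptCookiesIfPresent(page, acceptSelector);
  // page.$ resolves to null when nothing matches, which ends the loop.
  while (await page.$(nextSelector)) {
    await Promise.all([
      // Register the navigation wait before clicking to avoid a race.
      page.waitForNavigation({ waitUntil: 'networkidle0' }),
      page.click(nextSelector),
    ]);
    pagesVisited += 1;
  }
  return pagesVisited;
}
```

In the script above, this would replace the `while (true)` loop with something like `await clickThroughPages(page, 'a[class="nav next ui_button primary"]', '#accept-cookies')`, where `#accept-cookies` is a placeholder and not the site's actual selector.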

Related

chromium always shows "about:blank" and stops working on raspberry pi 3A+

On a Raspberry Pi 3A+, Chromium always shows "about:blank" and stops working unless I manually close the tab and open a new one.
const puppeteer = require("puppeteer");

async function test() {
  const browser = await puppeteer.launch({
    headless: false,
    executablePath: "chromium-browser",
  });
  const [page] = await browser.pages();
  await page.evaluate(() => window.open("https://www.example.com/"));
  const page1 = await browser.newPage();
  page.close();
  await page1.goto("https://allegro.pl/");
  await page1.screenshot({ path: "hello.png" });
  await browser.close();
}

test();
I am trying to get this code working on my Raspberry Pi 3A+.

How do I open a new window page from a button in Puppeteer?

I'm trying to open a new window page from a button in Puppeteer.
For example: I log in to a website, and the moment I click the login button, a new window pops up and redirects to the site the button points to. How can I do this?
You can do that by holding down the Shift key while calling page.click.
To catch the newly opened window, you can use browser.waitForTarget.
const puppeteer = require('puppeteer')

;(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
  })
  const context = browser.defaultBrowserContext()
  const page = (await context.pages())[0]
  await page.goto('https://www.amazon.com/gp/product/B093GQSVPX/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1', {waitUntil: 'load'})
  await page.waitForSelector('a[title="Add to List"]', {visible: true})
  await page.keyboard.down('Shift')
  await page.click('a[title="Add to List"]')
  await page.keyboard.up('Shift')
  const popup = await browser.waitForTarget(
    (target) => target.url().includes('www.amazon.com/ap/signin')
  )
  const popupPage = await popup.page()
  await popupPage.waitForSelector('a.a-link-expander[role="button"]')
  await popupPage.click('a.a-link-expander[role="button"]')
  await popupPage.click('input#continue[type="submit"]')
  await browser.close()
})()

Puppeteer for scraping a page (with authentication)

I am using Puppeteer to scrape a page (a load test application), and I cannot enter the username and password on this page. Does anyone know Puppeteer and can help me? This is the code:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();
  await page.goto('https://d22syekf1i694k.cloudfront.net/', {waitUntil: 'networkidle2'});
  await page.waitForSelector('input[name=username]');
  await page.type('input[name=username]', 'Adenosine');
  await page.$eval('input[name=username]', el => el.value = 'Adenosine');
  await browser.close();
})();

Puppeteer: Grabbing html from page that doesn't refresh after input tag button is clicked

I am trying to grab some HTML after an input tag button is clicked. I am clicking the button with page.evaluate() since page.click() does not seem to work for an input tag button. I have tried visual debugging with headless:false in the Puppeteer launch options to verify that the browser did reach the state after the button is clicked. I am unsure why page.content() returns the HTML from before the button is clicked rather than the HTML after the event happens.
const puppeteer = require('puppeteer');
const url = 'http://www.yvr.ca/en/passengers/flights/departing-flights';
const fs = require('fs');
const tomorrowSelector = '#flights-toggle-tomorrow'

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  await page.goto(url);
  await page.evaluate((selector)=>document.querySelector(selector).click(),tomorrowSelector);
  let html = await page.content();
  await fs.writeFile('index.html', html, function(err){
    if (err) console.log(err);
    console.log("Successfully Written to File.");
  });
  await browser.close();
});
You can click on the label for the radio button. You also need to wait for some sign that the state has changed (an XHR/fetch response or new selectors). For example, this code works for me, but you can use any other condition or simply wait a few seconds.
const fs = require('fs');
const puppeteer = require('puppeteer');
const url = 'http://www.yvr.ca/en/passengers/flights/departing-flights';
const tomorrowLabelSelector = 'label[for=flights-toggle-tomorrow]';
const tomorrowLabelSelectorChecked = '.yvr-form__toggle:checked + label[for=flights-toggle-tomorrow]';

puppeteer.launch({ headless: false }).then(async (browser) => {
  const page = await browser.newPage();
  await page.goto(url);
  // Click the label and wait until the toggle reports the checked state.
  await Promise.all([
    page.click(tomorrowLabelSelector),
    page.waitForSelector(tomorrowLabelSelectorChecked),
  ]);
  const html = await page.content();
  // Use the promise-based API so that `await` actually waits for the write.
  try {
    await fs.promises.writeFile('index.html', html);
    console.log('Successfully Written to File.');
  } catch (err) {
    console.log(err);
  }
  // await browser.close();
});
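The answer also mentions waiting for an XHR/fetch response as the sign of changed state. A minimal sketch of that variant follows; the helper name `clickAndWaitForResponse` and the URL fragment passed to it are hypothetical, since the endpoint this page actually calls is not known here.

```javascript
// Sketch: click an element and wait for a matching, successful network response.
// `page` is a Puppeteer Page; `urlFragment` is a substring of the request URL
// that the click is expected to trigger (an assumption supplied by the caller).
async function clickAndWaitForResponse(page, selector, urlFragment) {
  const [response] = await Promise.all([
    // Start listening before the click so a fast response is not missed.
    page.waitForResponse((res) => res.url().includes(urlFragment) && res.ok()),
    page.click(selector),
  ]);
  return response;
}
```

It could be called as `await clickAndWaitForResponse(page, tomorrowLabelSelector, '/some/api/path')` before `page.content()`, with the fragment taken from the Network tab in DevTools.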

Not able to capture image while generating pdf using puppeteer API

Node v8.11.1, Headless Chrome
I'm trying to generate a PDF, but the background image is not captured in it.
Below is the code. Any help is appreciated.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({headless: true});
  const page = await browser.newPage();
  await page.goto('http://54.201.139.151/', {waitUntil : 'networkidle0'});
  await page.pdf({path: 'hn40.pdf', printBackground: true, width: '1024px' , height: '768px'});
  await browser.close();
})();
Update: page.emulateMedia() has been dropped in favor of page.emulateMediaType().
As Rippo mentioned, you need page.emulateMedia("screen") for this to work properly. I have updated your script below, but I changed the page to Google for testing.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('http://google.ca/', {waitUntil : 'networkidle2'});
  await page.emulateMedia('screen');
  await page.pdf({path: 'hn40.pdf', printBackground: true, width: '1024px' , height: '768px'});
  await browser.close();
})();
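Given the update note above, a script that has to run on both older and newer Puppeteer versions could pick whichever method the installed version provides. This is only a sketch under that assumption; the helper name `emulateScreenMedia` is made up for illustration.

```javascript
// Sketch: switch the page to "screen" media using whichever method exists.
// Returns the name of the method that was called.
async function emulateScreenMedia(page) {
  if (typeof page.emulateMediaType === 'function') {
    // Newer Puppeteer versions expose emulateMediaType.
    await page.emulateMediaType('screen');
    return 'emulateMediaType';
  }
  // Older versions only have emulateMedia.
  await page.emulateMedia('screen');
  return 'emulateMedia';
}
```

In the script above, `await emulateScreenMedia(page)` would take the place of the `page.emulateMedia('screen')` line, before the call to `page.pdf(...)`.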
