How to click on popup contents in Puppeteer? - javascript

I open the 'deliver to' popup but am not able to click on the input field and enter information.
(async () => {
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
const url = 'https://www.tntsupermarket.com/eng/store-flyer';
await page.goto(url, {waitUntil: 'networkidle0'});
const newPagePromise = new Promise(x => browser.once('targetcreated', target => x(target.page())));
await page.evaluate(()=> {
document.querySelector('span[class="deliverCss-city-FJJ"]').click();
});
const popup = await newPagePromise;
await popup.waitForSelector('input[aria-label="Enter your Postal Code"]');
await popup.focus('input[aria-label="Enter your Postal Code"]');
await popup.click('input[aria-label="Enter your Postal Code"]');
await popup.keyboard.type('a2b');
})();

The pop-up isn't a new page, just a modal element that's shown with JS and without navigation. Removing the navigation promise gives a pretty clear result:
const puppeteer = require("puppeteer"); // ^13.5.1
let browser;
(async () => {
browser = await puppeteer.launch({headless: false});
const [page] = await browser.pages();
const url = "https://www.tntsupermarket.com/eng/store-flyer";
await page.goto(url, {waitUntil: "networkidle0", timeout: 90000});
const cityEl = await page.waitForSelector('span[class="deliverCss-city-FJJ"]');
await cityEl.evaluate(el => el.click());
const postalSel = 'input[aria-label="Enter your Postal Code"]';
const postalEl = await page.waitForSelector(postalSel);
await postalEl.type("a2b");
await page.waitForTimeout(30000); // just to show that the state is as we wish
})()
.catch(err => console.error(err))
.finally(() => browser?.close())
;
This is a bit slow; there's an annoying pop-up you might wish to click off instead of using "networkidle0":
// ... same code
await page.goto(url, {waitUntil: "domcontentloaded", timeout: 90000});
const closeEl = await page.waitForSelector("#closeActivityPop");
await closeEl.click();
const cityEl = await page.waitForSelector('span[class="deliverCss-city-FJJ"]');
// same code ...
At a quick glance, if the page is cached, the pop-up might not show, so you might want to give up on page.waitForSelector("#closeActivityPop") after 30 seconds or so and continue without clicking on it, depending on how flexible you want the script to be.
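A minimal sketch of that "click it if it shows up" idea. clickIfPresent is a hypothetical helper name, not part of Puppeteer, and it assumes the rejection from page.waitForSelector has name "TimeoutError" when the element never appears, which is how Puppeteer's TimeoutError is normally identified:

```javascript
// Hypothetical helper (not a Puppeteer API): click `selector` if it appears
// within `timeout` ms, otherwise carry on without failing the script.
async function clickIfPresent(page, selector, timeout = 30000) {
  let el;
  try {
    el = await page.waitForSelector(selector, {timeout});
  } catch (err) {
    // Assumption: a timed-out wait rejects with an error named "TimeoutError".
    if (err.name === "TimeoutError") {
      return false; // the element never showed up, so skip the click
    }
    throw err; // anything else is a real failure
  }
  await el.click();
  return true;
}

// Usage in the script above, instead of the unconditional wait and click:
// await clickIfPresent(page, "#closeActivityPop");
```

The trade-off is that a run where the pop-up never appears still waits out the full timeout before moving on, so pick a timeout you can live with.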

Related

Capture a screenshot as a table using Puppeteer

I am learning to scrape items from a website using Puppeteer, practicing on table data from Basketball-Reference.com. So far, I use Puppeteer to search for the stats of my favorite player (Stephen Curry), open the table page, and take a screenshot of the page, after which the script finishes and closes the browser. However, I cannot seem to scrape the table I need and I am completely stuck.
The following is the code I have written so far:
const puppeteer = require("puppeteer");
async function run() {
const browser = await puppeteer.launch({
headless: false,
ignoreHTTPSErrors: true,
});
const page = await browser.newPage();
await page.goto(`https://www.basketball-reference.com/`);
await page.waitForSelector("input[name=search]");
await page.$eval("input[name=search]", (el) => (el.value = "Stephen Curry"));
await page.click('input[type="submit"]');
await page.waitForSelector(`a[href='${secondPageLink}']`, { visible: true });
await page.click(`a[href='${secondPageLink}']`);
await page.waitForSelector();
await page.screenshot({
path: `StephenCurryStats.png`,
});
await page.close();
await browser.close();
}
run();
I am trying to scrape the PER GAME table at the following link and take a screenshot of it. However, I cannot find the right selector to use and I am very confused.
The URL is https://www.basketball-reference.com/players/c/curryst01.html
There seems to be at least a couple of issues here. I'm not sure what secondPageLink refers to or the intent behind await page.waitForSelector() (throws TypeError: Cannot read properties of undefined (reading 'startsWith') on my version). I would either select the first search result with .search-item-name a[href] or skip that page entirely by clicking on the first autocompleted name in the search after using page.type(). Even better, you can build the query string URL (e.g. https://www.basketball-reference.com/search/search.fcgi?search=stephen+curry) and navigate to that in your first goto.
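Building the query string URL can be sketched like this. buildSearchUrl is a hypothetical helper name; the endpoint is the one from the example URL above, and URLSearchParams takes care of the encoding (spaces become +):

```javascript
// Hypothetical helper: build the search URL for a player name.
function buildSearchUrl(name) {
  const url = new URL("https://www.basketball-reference.com/search/search.fcgi");
  url.searchParams.set("search", name); // encodes spaces and special characters
  return url.href;
}

console.log(buildSearchUrl("stephen curry"));
// https://www.basketball-reference.com/search/search.fcgi?search=stephen+curry
```

You would then pass the result straight to page.goto() instead of typing into the search box.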
The final page loads a video and a ton of Google ad junk. Best to block all requests that aren't relevant to the screenshot.
const puppeteer = require("puppeteer"); // ^16.2.0
let browser;
(async () => {
browser = await puppeteer.launch({headless: true});
const [page] = await browser.pages();
const url = "https://www.basketball-reference.com/";
await page.setViewport({height: 600, width: 1300});
await page.setRequestInterception(true);
const allowed = [
"https://www.basketball-reference.com",
"https://cdn.ssref.net"
];
page.on("request", request => {
if (allowed.some(e => request.url().startsWith(e))) {
request.continue();
}
else {
request.abort();
}
});
await page.goto(url, {waitUntil: "domcontentloaded"});
await page.type('input[name="search"]', "Stephen Curry");
const $ = sel => page.waitForSelector(sel);
await (await $(".search-results-item")).click();
await (await $(".adblock")).evaluate(el => el.remove());
await page.waitForNetworkIdle();
await page.screenshot({
path: "StephenCurryStats.png",
fullPage: true
});
})()
.catch(err => console.error(err))
.finally(() => browser?.close());
If you just want to capture the per game table:
// same boilerplate above this line
await page.goto(url, {waitUntil: "domcontentloaded"});
await page.type('input[name="search"]', "Stephen Curry");
const $ = sel => page.waitForSelector(sel);
await (await $(".search-results-item")).click();
const table = await $("#per_game");
await (await page.$(".scroll_note"))?.click();
await table.screenshot({path: "StephenCurryStats.png"});
But I'd probably want a CSV for maximum ingestion:
await page.goto(url, {waitUntil: "domcontentloaded"});
await page.type('input[name="search"]', "Stephen Curry");
const $ = sel => page.waitForSelector(sel);
await (await $(".search-results-item")).click();
const btn = await page.waitForFunction(() =>
[...document.querySelectorAll("#all_per_game-playoffs_per_game li button")]
.find(e => e.textContent.includes("CSV"))
);
await btn.evaluate(el => el.click());
const csv = await (await $("#csv_per_game"))
.evaluate(el => [...el.childNodes].at(-1).textContent.trim());
const table = csv.split("\n").map(e => e.split(",")); // TODO use proper CSV parser
console.log(table);
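As for the TODO, here is a minimal sketch of a parser that copes with quoted fields, which a plain split(",") breaks on as soon as a value contains a comma. parseCsv is a hypothetical helper written for illustration; it assumes double-quoted fields with "" as the escaped quote, and for anything serious a maintained CSV library is the safer choice:

```javascript
// Hypothetical minimal CSV parser: handles quoted fields, escaped quotes ("")
// and both \n and \r\n line endings.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  // Flush the last row when the text does not end with a newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

console.log(parseCsv('Season,Tm\n2015-16,"GSW, West"\n'));
// [ [ 'Season', 'Tm' ], [ '2015-16', 'GSW, West' ] ]
```

In the snippet above, const table = parseCsv(csv); would replace the split-based line.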

Puppeteer: line of code being executed before others

I have this code:
const puppeteer = require("puppeteer");
(async () => {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto("https://www.sisal.it/scommesse-matchpoint/quote/calcio/serie-a");
const [button1] = await page.$x('//div[@class="marketBar_changeMarketLabel__l0vzl"]/p');
button1.click();
const [button2] = await page.$x('//div[@class="listItem_container__2IdVR white marketList_listItemHeight__1aiAJ marketList_bgColorGrey__VdrVK"]/p[text()="1X2 ESITO FINALE"]');
button2.click();
})();
The problem is that after button1 is clicked, the page changes and Puppeteer immediately executes the next line of code. Instead, I want it to wait for the new page to load, because otherwise it throws an error since it can't find button2.
I found this solution on Stack Overflow:
const puppeteer = require("puppeteer");
function delay(time) {
return new Promise(function (resolve) {
setTimeout(resolve, time);
});
}
(async () => {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto("https://www.sisal.it/scommesse-matchpoint/quote/calcio/serie-a");
const [button1] = await page.$x('//div[@class="marketBar_changeMarketLabel__l0vzl"]/p');
button1.click();
await delay(4000);
const [button2] = await page.$x('//div[@class="listItem_container__2IdVR white marketList_listItemHeight__1aiAJ marketList_bgColorGrey__VdrVK"]/p[text()="1X2 ESITO FINALE"]');
button2.click();
})();
But of course this isn't the best solution.
I think you need to modify your code slightly:
await button1.click();
await page.waitForNavigation({waitUntil: 'networkidle2'});
For reference, see the documentation.
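One caveat with clicking first and waiting afterwards: if the navigation finishes before waitForNavigation() starts listening, the wait can hang until it times out. A common way around that is to start both together with Promise.all. A minimal sketch, where clickAndWaitForNavigation is a hypothetical helper name and the click is passed in as a function so it works with element handles such as button1:

```javascript
// Hypothetical helper: start waiting for the navigation before the click
// can trigger it, then resolve once both have finished.
async function clickAndWaitForNavigation(page, clickFn, options = {waitUntil: "networkidle2"}) {
  const [response] = await Promise.all([
    page.waitForNavigation(options), // registered first, so it cannot miss the event
    clickFn(),
  ]);
  return response;
}

// Usage with the element handle from the question:
// await clickAndWaitForNavigation(page, () => button1.click());
```

This only helps when the click causes a real navigation. If the site just updates the page in place, there is nothing to wait for and waiting for the next element to appear is the better option.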
I found a solution, here's the code:
const puppeteer = require("puppeteer");
(async () => {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto("https://www.sisal.it/scommesse-matchpoint/quote/calcio/serie-a");
await page.waitForXPath('//div[@class="marketBar_changeMarketLabel__l0vzl"]/p');
const [button1] = await page.$x('//div[@class="marketBar_changeMarketLabel__l0vzl"]/p');
await button1.click();
await page.waitForXPath('//div[@class="listItem_container__2IdVR white marketList_listItemHeight__1aiAJ marketList_bgColorGrey__VdrVK"]/p[text()="1X2 ESITO FINALE"]');
const [button2] = await page.$x('//div[@class="listItem_container__2IdVR white marketList_listItemHeight__1aiAJ marketList_bgColorGrey__VdrVK"]/p[text()="1X2 ESITO FINALE"]');
await button2.click();
})();

Is puppeteer supplying real time data

I'm trying to scrape live scores and pick up every score change. Can Puppeteer do this? If so, what should I add to this code so that it returns live data?
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('site to go');
await page.waitForSelector('input[name="username"]');
await page.type('input[name="username"]', 'username');
await page.type('input[name="password"]', 'password');
await page.click('button[type="submit"]');
let score = await page.evaluate(() => document.getElementById("scores").innerHTML);
})();
You could use exposeFunction to register a callback function:
await page.exposeFunction('newScore', s => console.log(s));
Then you can call that function on the DOMSubtreeModified event:
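await page.evaluate(() => {
const scores = document.getElementById('scores');
// report the current markup whenever anything inside #scores changes
scores.addEventListener('DOMSubtreeModified', () => newScore(scores.innerHTML));
});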
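Note that DOMSubtreeModified belongs to the deprecated mutation events, and MutationObserver is the documented replacement. A minimal sketch of the same idea with an observer follows. watchElement and the default callback name onElementChange are hypothetical names chosen for illustration; the Puppeteer calls used are page.exposeFunction() and page.evaluate():

```javascript
// Hypothetical helper: call `onChange` (in Node) with the element's innerHTML
// every time something inside it changes in the page.
async function watchElement(page, selector, onChange, callbackName = "onElementChange") {
  // Make the Node callback available in the page as window[callbackName].
  await page.exposeFunction(callbackName, onChange);
  await page.evaluate((sel, name) => {
    const el = document.querySelector(sel);
    if (!el) {
      throw new Error(`No element matches ${sel}`);
    }
    const observer = new MutationObserver(() => window[name](el.innerHTML));
    // Watch added/removed nodes and text changes anywhere below the element.
    observer.observe(el, {childList: true, subtree: true, characterData: true});
  }, selector, callbackName);
}

// Usage after logging in:
// await watchElement(page, "#scores", html => console.log(html));
```

The script has to keep the browser open for as long as you want updates, since the observer lives inside the page.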

I'm having trouble logging in using Puppeteer. Also can't find the selector for the sign-in button on the site provided below

Below is the code I'm trying, please help!
I'm facing two problems:
1. The browser opens at http://164.52.197.129/signin, but after a certain time it goes back to http://164.52.197.129
2. I can't locate the sign-in button selector. The selector I'm choosing is not working, maybe because it's nested.
const puppeteer = require('puppeteer');
// const URL = 'http://164.52.197.129/signin';
const chromeOptions = {
headless:false,
defaultViewport: null};
(async function main() {
const browser = await puppeteer.launch(chromeOptions);
const page = await browser.newPage();
//await page.setDefaultNavigationTimeout(0);
console.log("Opening page");
await page.goto(('http://164.52.197.129/signin'), { waitUntil: 'networkidle2' , timeout: 60000 });
console.log("Page opened");
await page.waitForSelector('#email', {timeout: 60000});
console.log("Inputting username");
await page.type('#email', 'guest@gmail.com');
console.log("Username input completed");
await page.waitForSelector('#password', {timeout: 60000});
console.log("Inputting password");
await page.type('#password', 'sdah1234');
console.log("Password input completed");
await page.click('#app > div > main > div > div > div > form > div > div.v-card__text > div > div.text-xs-center.col > button');
await page.waitForNavigation({waitUntil: 'networkidle2'});
})()
I would suggest the following algorithm:
1. Open the page.
2. Wait for the redirection (the carousel appears).
3. Request the sign-in form again (clicking on the link with page.click() does not work, so we use page.evaluate()).
4. Wait for the form.
5. As the form is autocompleted before page.type() runs and the input ends up doubled, we use page.evaluate() again to set the values.
6. Click and wait for navigation in Promise.all() to avoid a race condition.
const puppeteer = require('puppeteer');
const chromeOptions = {
headless:false,
defaultViewport: null};
(async function main() {
const browser = await puppeteer.launch(chromeOptions);
const page = await browser.newPage();
await page.goto(('http://164.52.197.129/signin'), { waitUntil: 'networkidle2' , timeout: 60000 });
await page.waitForSelector('.carousel-3d-container');
await page.waitForSelector('a[href="/signin"]');
await page.evaluate(() => { document.querySelector('a[href="/signin"]').click(); });
await page.waitForSelector('#email', {timeout: 60000});
await page.waitForSelector('#password', {timeout: 60000});
await page.evaluate(() => {
document.querySelector('#email').value = 'guest@gmail.com';
document.querySelector('#password').value = 'sdah1234';
});
await Promise.all([
page.click('#app form button'),
page.waitForNavigation({waitUntil: 'networkidle2'}),
]);
console.log("Done");
})();

Puppeteer not working as expected when clicking button

I need to set the comment selector to "all comments" with Puppeteer, but the comments don't render after Puppeteer clicks on the correct button ("all the comments"); the comment section just disappears. Below are the code and a video of the browser in action.
const $ = require('cheerio');
const puppeteer = require('puppeteer');
const url = 'https://www.facebook.com/pg/SamsungGlobal/posts/';
const main = async () => {
const browser = await puppeteer.launch({
headless: false,
args: ['--no-sandbox', '--disable-setuid-sandbox']
});
const page = await browser.newPage();
await page.setViewport({
width: 1920,
height: 1080
});
await page.goto(url, {
waitUntil: 'networkidle2',
timeout: 0
});
page.mouse.click(50, 540, {});
for (var a = 0; a < 18; a++) {
setTimeout(() => {}, 16);
await page.keyboard.press('ArrowDown');
}
let bodyHTML = await page.evaluate(() => document.body.innerHTML);
var id = "#" + $("._427x ._4-u2.mbm._4mrt", bodyHTML).attr('id'); // selects id of first post
try {
var exp = await page.$(`${id} a._21q1`); // clicks on "most relevant" from the first post
await exp.evaluate(exp => exp.click());
await page.click('div[data-ordering="RANKED_UNFILTERED"]'); // selects "all the comments"
var exp = await page.$(`${id} a._42ft`); // should click on "more comments" but it doesn't load
await exp.evaluate(exp => exp.click());
await page.waitForSelector(`${id} a._5v47.fss`); // wait for the "others" in facebook comments
var exp = await page.$$(`${id} a._5v47.fss`);
await exp.evaluate(exp => exp.click());
await page.screenshot({
path: "./srn4.png"
});
// var post = await page.$eval(id + " .userContentWrapper", el => el.innerHTML);
// console.log("that's the post " + post);
} catch (e) {
console.log(e);
}
setTimeout(async function() {
await browser.close(); //close after some time
}, 1500);
};
main();
Here is a video of the full execution: https://youtu.be/jXpSOBfVskg
Here is a slow-motion clip of the moment it clicks on the menu: https://youtu.be/1OgfFNokxsA
You can try a variant with selectors:
'use strict';
const puppeteer = require('puppeteer');
(async function main() {
try {
const browser = await puppeteer.launch({ headless: false });
const [page] = await browser.pages();
await page.goto('https://www.facebook.com/pg/SamsungGlobal/posts/');
await page.waitForSelector('[data-ordering="RANKED_THREADED"]');
await page.click('[data-ordering="RANKED_THREADED"]');
await page.waitForSelector('[data-ordering="RANKED_UNFILTERED"]');
await page.click('[data-ordering="RANKED_UNFILTERED"]');
} catch (err) {
console.error(err);
}
})();
page.mouse.click(50, 540, {});
This is not necessarily going to work. What are you trying to click? Clicking at fixed coordinates depends on the layout, so use CSS selectors to find the elements that you want to click.
Also, dynamic elements might not appear in the page right away. Use waitForSelector as needed before interacting with them.
