I want to type into two inputs at the same time, but both texts end up in the second input.
const puppeteer = require("puppeteer");
(async () => {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto("https://example.com");
await Promise.all([
page.type("#user", "user"),
page.type("#password", "password"),
]);
await browser.close();
})();
The second input ends up containing interleaved text such as upsaesrsword
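This behavior is intended: page.type focuses the element and then sends keystrokes through the page's single keyboard, so two concurrent calls interleave their characters in whichever input was focused last.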
Related issue on GitHub:
https://github.com/puppeteer/puppeteer/issues/1958
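If the two fields do not actually need to be filled concurrently, awaiting each page.type call in turn avoids the interleaving. A minimal sketch, assuming a hypothetical helper named typeSequentially (not part of Puppeteer) that receives a Puppeteer page and a list of [selector, text] pairs:

```javascript
// Hypothetical helper: fills several fields one after another.
// `page` is a Puppeteer Page; `fields` is a list of [selector, text] pairs.
async function typeSequentially(page, fields) {
  for (const [selector, text] of fields) {
    // Awaiting each call lets one field finish before the next is focused.
    await page.type(selector, text);
  }
}

// Usage inside an async function, with the selectors from the question:
// await typeSequentially(page, [
//   ["#user", "user"],
//   ["#password", "password"],
// ]);
```

This trades the (apparent) parallelism of Promise.all for a predictable order, which is usually what a login form needs.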
Alternative solution:
await page.$eval(
  '#user',
  (handle, text) => {
    handle.value = text;
    // Let the event bubble so listeners on ancestor elements see it
    handle.dispatchEvent(new Event('change', {bubbles: true}));
  },
  'user'
);
I have a string whose value changes, and I need to put it into an input field.
(async () => {
let pageNum = 1004327;
let browser = await puppeteer.launch({
headless: true,
});
let page = await browser.newPage();
while (1) {
await page.goto(`${link}${pageNum}`);
page.setDefaultTimeout(0);
let html = await page.evaluate(async () => {
let mail = document.getElementById(
"ctl00_phWorkZone_DbLabel8"
).innerText;
let obj = {
email: mail,
link: window.location.href,
};
return obj;
});
if (Object.values(html)[0]) {
await staff.type(
"textarea[name=mail_address]",
Object.values(html).join("\n"),
{
delay: 0,
}
);
console.log(pageNum);
pageNum++;
staff.click("input[name=save]", { delay: 0 });
}
}
})();
I used the .type() method, and it works, but I need something faster.
The .type() method fills the input key by key, like a human typing. For an instant fill, try the .keyboard.sendCharacter() method instead: it inserts the whole string into the focused element at once rather than character by character.
Example usage:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto("https://stackoverflow.com/q/75395199/12715723");
  // Focus the search box first; sendCharacter targets the focused element
  const input = await page.$('input[name="q"]');
  await input.click();
  await page.keyboard.sendCharacter('test');
})();
I'm not sure how the input works on the page since it requires auth, but in many cases (depending on how the site listens for changes on the element) you can set the value using browser JS:
await staff.$eval(
  'textarea[name="mail_address"]',
  (el, value) => el.value = value,
  Object.values(html).join("\n")
);
Runnable, minimal example:
const puppeteer = require("puppeteer"); // ^19.6.3
const html = `<!DOCTYPE html><html><body>
<textarea name="mail_address"></textarea></body></html>`;
let browser;
(async () => {
browser = await puppeteer.launch();
const [page] = await browser.pages();
await page.setContent(html);
const sel = 'textarea[name="mail_address"]';
await page.$eval(
sel,
(el, value) => el.value = value,
"testing"
);
console.log(await page.$eval(sel, el => el.value)); // => testing
})()
.catch(err => console.error(err))
.finally(() => browser?.close());
If this doesn't work, you can try triggering a change handler on the element with a trusted Puppeteer keyboard action or manually with an untrusted JS browser event. You can also use keyboard.sendCharacter when focus is on the element, as pointed out in this answer:
await page.focus(selector);
await page.keyboard.sendCharacter("foobar");
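For the "untrusted JS browser event" route mentioned above, one possible shape is to set the value and then dispatch input and change events by hand. This is only a sketch: setValueAndNotify is a hypothetical helper name, and whether a given site reacts to untrusted events depends on how its handlers are written.

```javascript
// Hypothetical helper: sets a field's value in the page and fires the
// events that many change handlers listen for. `page` is a Puppeteer Page.
async function setValueAndNotify(page, selector, value) {
  await page.$eval(
    selector,
    (el, text) => {
      el.value = text;
      // These events are untrusted (isTrusted === false); some sites
      // only react to trusted input such as real keyboard actions.
      el.dispatchEvent(new Event("input", {bubbles: true}));
      el.dispatchEvent(new Event("change", {bubbles: true}));
    },
    value
  );
}

// Usage inside an async function:
// await setValueAndNotify(page, 'textarea[name="mail_address"]', "testing");
```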
Note that in your original code, .setDefaultTimeout(0); is risky because it can cause the script to hang forever without raising an error. staff.click and page.authenticate need await since they return promises.
I am learning to scrape websites with Puppeteer, using table data from Basketball Reference (basketball-reference.com) for practice. So far, my script searches for the stats of my favorite player (Stephen Curry), opens the table page, takes a screenshot of the page, and closes the browser. However, I cannot work out how to scrape the table I need, and I am completely stuck.
The following is the code I have written so far:
const puppeteer = require("puppeteer");
async function run() {
const browser = await puppeteer.launch({
headless: false,
ignoreHTTPSErrors: true,
});
const page = await browser.newPage();
await page.goto(`https://www.basketball-reference.com/`);
await page.waitForSelector("input[name=search]");
await page.$eval("input[name=search]", (el) => (el.value = "Stephen Curry"));
await page.click('input[type="submit"]');
await page.waitForSelector(`a[href='${secondPageLink}']`, { visible: true });
await page.click(`a[href='${secondPageLink}']`);
await page.waitForSelector();
await page.screenshot({
path: `StephenCurryStats.png`,
});
await page.close();
await browser.close();
}
run();
I am trying to scrape the PER GAME table on the following link and take its screenshot. However, I cannot seem to find the right selector to pick and scrape and I am very confused.
The URL is https://www.basketball-reference.com/players/c/curryst01.html
There seems to be at least a couple of issues here. I'm not sure what secondPageLink refers to or the intent behind await page.waitForSelector() (throws TypeError: Cannot read properties of undefined (reading 'startsWith') on my version). I would either select the first search result with .search-item-name a[href] or skip that page entirely by clicking on the first autocompleted name in the search after using page.type(). Even better, you can build the query string URL (e.g. https://www.basketball-reference.com/search/search.fcgi?search=stephen+curry) and navigate to that in your first goto.
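Building the search URL can be done with the standard URL API, which takes care of encoding the query. A small sketch, where buildSearchUrl is a hypothetical helper name and the endpoint is the one quoted above:

```javascript
// Builds the search URL for a player name; URLSearchParams handles encoding.
function buildSearchUrl(query) {
  const url = new URL("https://www.basketball-reference.com/search/search.fcgi");
  url.searchParams.set("search", query);
  return url.href;
}

console.log(buildSearchUrl("stephen curry"));
// https://www.basketball-reference.com/search/search.fcgi?search=stephen+curry
```

The result can be passed straight to page.goto, which skips the home page and the search form entirely.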
The final page loads a video and a ton of Google ad junk. Best to block all requests that aren't relevant to the screenshot.
const puppeteer = require("puppeteer"); // ^16.2.0
let browser;
(async () => {
browser = await puppeteer.launch({headless: true});
const [page] = await browser.pages();
const url = "https://www.basketball-reference.com/";
await page.setViewport({height: 600, width: 1300});
await page.setRequestInterception(true);
const allowed = [
"https://www.basketball-reference.com",
"https://cdn.ssref.net"
];
page.on("request", request => {
if (allowed.some(e => request.url().startsWith(e))) {
request.continue();
}
else {
request.abort();
}
});
await page.goto(url, {waitUntil: "domcontentloaded"});
await page.type('input[name="search"]', "Stephen Curry");
const $ = sel => page.waitForSelector(sel);
await (await $(".search-results-item")).click();
await (await $(".adblock")).evaluate(el => el.remove());
await page.waitForNetworkIdle();
await page.screenshot({
path: "StephenCurryStats.png",
fullPage: true
});
})()
.catch(err => console.error(err))
.finally(() => browser?.close());
If you just want to capture the per game table:
// same boilerplate above this line
await page.goto(url, {waitUntil: "domcontentloaded"});
await page.type('input[name="search"]', "Stephen Curry");
const $ = sel => page.waitForSelector(sel);
await (await $(".search-results-item")).click();
const table = await $("#per_game");
await (await page.$(".scroll_note"))?.click();
await table.screenshot({path: "StephenCurryStats.png"});
But I'd probably want a CSV for maximum ingestion:
await page.goto(url, {waitUntil: "domcontentloaded"});
await page.type('input[name="search"]', "Stephen Curry");
const $ = sel => page.waitForSelector(sel);
await (await $(".search-results-item")).click();
const btn = await page.waitForFunction(() =>
[...document.querySelectorAll("#all_per_game-playoffs_per_game li button")]
.find(e => e.textContent.includes("CSV"))
);
await btn.evaluate(el => el.click());
const csv = await (await $("#csv_per_game"))
.evaluate(el => [...el.childNodes].at(-1).textContent.trim());
const table = csv.split("\n").map(e => e.split(",")); // TODO use proper CSV parser
console.log(table);
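Regarding the TODO above: splitting on commas breaks as soon as a field is quoted and contains a comma. A maintained CSV library is the safer choice; the sketch below only illustrates what quote handling involves. parseCsv is a hypothetical helper, not a full RFC 4180 implementation, and the sample input is made up for illustration.

```javascript
// Minimal CSV parser sketch: handles quoted fields, embedded commas,
// doubled quotes ("") and newlines inside quotes.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];

    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // a doubled quote is an escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }

  // Flush the last row when the text has no trailing newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Made-up sample input, not data from the site.
console.log(parseCsv('a,b\n1,"x, y"\n'));
```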
I am trying to scrape the key features part of the page at "https://www.alpinestars.com/products/stella-missile-v2-1-piece-suit-1" using Puppeteer. However, whenever I use a selector that works in the Chrome console for that page, the output of my code is always an empty array or object. For example, document.querySelectorAll("#key\ features > p") and document.getElementById('key features') both come back as empty arrays or objects when I run them through my code, but work in the Chrome console.
I have attached my code below:
const puppeteer = require('puppeteer');
async function getDescripData(url) {
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto(url);
const descripFeatures = await page.evaluate(() => {
const tds = Array.from(document.getElementById('key features'))
console.log(tds)
return tds.map(td => td.innerText)
});
console.log(descripFeatures)
await browser.close();
return {
features: descripFeatures
}
}
How should I go about overcoming this issue?
Thanks in advance!
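The problem is the Array.from call: document.getElementById returns a single element (or null), not a collection. An element is neither iterable nor array-like, so Array.from turns it into an empty array.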
This works for me:
const puppeteer = require('puppeteer');
const url = 'https://www.alpinestars.com/products/stella-missile-v2-1-piece-suit-1';
(async () => {
const browser = await puppeteer.launch({
headless: false,
defaultViewport: null,
args: ['--start-maximized'],
devtools: true
});
const page = (await browser.pages())[0];
await page.goto(url);
const descripFeatures = await page.evaluate(() => {
const tds = document.getElementById('key features').innerText;
return tds.split('• ');
});
console.log(descripFeatures)
await browser.close();
})();
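The Array.from behavior from the question can be reproduced without a browser. In the sketch below, a plain object stands in for the DOM element, and the bullet text with one feature per line is an assumption for illustration, not the page's actual content.

```javascript
// Stand-in for a DOM element (assumption: a plain object instead of a real
// node, so the snippet runs in Node without a browser).
const fakeElement = {innerText: "• First feature\n• Second feature"};

// Neither iterable nor array-like, so the result is an empty array.
const asArray = Array.from(fakeElement);
console.log(asArray.length); // 0

// Reading the text and splitting it gives one entry per bullet.
const features = fakeElement.innerText
  .split("\n")
  .map(line => line.replace(/^•\s*/, "").trim())
  .filter(Boolean);
console.log(features);
```

Filtering out empty strings also avoids the leading empty entry that a plain split on the bullet character produces.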
My problem is that I need to set the comment selector to "all comments" with Puppeteer, but after Puppeteer clicks the correct button ("all the comments") the comments don't render; the comment section just disappears. Below are the code and videos of the browser in action.
const $ = require('cheerio');
const puppeteer = require('puppeteer');
const url = 'https://www.facebook.com/pg/SamsungGlobal/posts/';
const main = async () => {
const browser = await puppeteer.launch({
headless: false,
args: ['--no-sandbox', '--disable-setuid-sandbox']
});
const page = await browser.newPage();
await page.setViewport({
width: 1920,
height: 1080
});
await page.goto(url, {
waitUntil: 'networkidle2',
timeout: 0
});
page.mouse.click(50, 540, {});
for (var a = 0; a < 18; a++) {
setTimeout(() => {}, 16);
await page.keyboard.press('ArrowDown');
}
let bodyHTML = await page.evaluate(() => document.body.innerHTML);
var id = "#" + $("._427x ._4-u2.mbm._4mrt", bodyHTML).attr('id'); // selects id of first post
try {
var exp = await page.$(`${id} a._21q1`); // clicks on "most relevant" from the first post
await exp.evaluate(exp => exp.click());
await page.click('div[data-ordering="RANKED_UNFILTERED"]'); // selects "all the comments"
var exp = await page.$(`${id} a._42ft`); // should click on "more comments" but it doesn't load
await exp.evaluate(exp => exp.click());
await page.waitForSelector(`${id} a._5v47.fss`); // wait for the "others" in facebook comments
var exp = await page.$$(`${id} a._5v47.fss`);
await exp.evaluate(exp => exp.click());
await page.screenshot({
path: "./srn4.png"
});
// var post = await page.$eval(id + " .userContentWrapper", el => el.innerHTML);
// console.log("that's the post " + post);
} catch (e) {
console.log(e);
}
setTimeout(async function() {
await browser.close(); //close after some time
}, 1500);
};
main();
That's the video of the full execution process: https://youtu.be/jXpSOBfVskg
That's a slow motion of the moment it click on the menu: https://youtu.be/1OgfFNokxsA
You can try a variant with selectors:
'use strict';
const puppeteer = require('puppeteer');
(async function main() {
try {
const browser = await puppeteer.launch({ headless: false });
const [page] = await browser.pages();
await page.goto('https://www.facebook.com/pg/SamsungGlobal/posts/');
await page.waitForSelector('[data-ordering="RANKED_THREADED"]');
await page.click('[data-ordering="RANKED_THREADED"]');
await page.waitForSelector('[data-ordering="RANKED_UNFILTERED"]');
await page.click('[data-ordering="RANKED_UNFILTERED"]');
} catch (err) {
console.error(err);
}
})();
page.mouse.click(50, 540, {});
This is not necessarily going to work: a click at fixed coordinates depends on the viewport size and the page layout. What are you trying to click? Use CSS selectors to find the elements you want to click.
Also, dynamic elements might not appear in the page right away. You should use waitForSelector as needed.
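The wait-then-click pattern can be wrapped in a small helper. A sketch, assuming a hypothetical function named clickWhenReady that receives a Puppeteer page; page.waitForSelector resolves to the element handle once the selector matches, so the handle can be clicked directly:

```javascript
// Hypothetical helper: waits for an element to appear, then clicks it.
// `page` is a Puppeteer Page; `options` is passed to waitForSelector.
async function clickWhenReady(page, selector, options = {}) {
  // Rejects with a timeout error if the element never appears.
  const handle = await page.waitForSelector(selector, options);
  await handle.click();
  return handle;
}

// Usage inside an async function, with a selector from the question:
// await clickWhenReady(page, '[data-ordering="RANKED_UNFILTERED"]', {visible: true});
```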
I want to take a screenshot with Puppeteer, and it works for one post, but I want to repeat it for several posts.
With a normal function I could just call the function again at the end of the code to repeat it, but this is an async function, so I don't know how to iterate.
const puppeteer = require('puppeteer');
let postNumber = 1;
let by;
(async () => {
const browser = await puppeteer.launch({
executablePath: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
userDataDir: 'C:\\Users\\{computerName}\\AppData\\Local\\Google\\Chrome\\User Data',
headless: false
}); // default is true
const page = await browser.newPage();
await page.goto(`https://band.us/band/{someNumbers}/post/${postNumber}`, {
waitUntil: 'networkidle2'
});
let element = await page.$('.boardList');
by = await page.evaluate(() => document.getElementsByClassName('text')[0].textContent);
console.log(by);
await element.screenshot({
path: `./image/${postNumber}-${by}.png`
});
console.log(`SAVED : ${postNumber}-${by}.png`)
postNumber++;
await browser.close();
})();
After the function finishes, the postNumber variable should increase by one, and the code should then run again with the new URL.
As you want to run the code one iteration after another, a normal for (or while) loop can be used. async/await code works fine with these.
You can use a for in your case like this:
(async () => {
const browser = await puppeteer.launch(/* ... */);
const page = await browser.newPage();
for (let postNumber = 1; postNumber < 10; postNumber++) {
await page.goto(/* ... */);
let element = await page.$('.boardList');
// ...
}
await browser.close();
})();
You can use any appropriate loop, such as a while loop:
'use strict';
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({
executablePath: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',
userDataDir: 'C:\\Users\\{computerName}\\AppData\\Local\\Google\\Chrome\\User Data',
headless: false
}); // default is true
const page = await browser.newPage();
let postNumber = 1;
while (postNumber <= 10) {
await page.goto(`https://band.us/band/{someNumbers}/post/${postNumber}`, {
waitUntil: 'networkidle2'
});
const element = await page.$('.boardList');
const by = await page.evaluate(() => document.getElementsByClassName('text')[0].textContent);
console.log(by);
await element.screenshot({
path: `./image/${postNumber}-${by}.png`
});
console.log(`SAVED : ${postNumber}-${by}.png`)
postNumber++;
}
await browser.close();
})();
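One thing a long-running loop may need is per-iteration error handling, so that a single missing post does not abort the whole run. The sketch below is one way to arrange that; capturePosts and urlFor are hypothetical names, the file naming is simplified, and the ./image directory is assumed to exist.

```javascript
// Hypothetical helper: screenshots posts first..last and skips failures.
// `page` is a Puppeteer Page; `urlFor` maps a post number to its URL.
async function capturePosts(page, first, last, urlFor) {
  const saved = [];
  for (let postNumber = first; postNumber <= last; postNumber++) {
    try {
      await page.goto(urlFor(postNumber), {waitUntil: "networkidle2"});
      const element = await page.$(".boardList");
      if (!element) {
        continue; // page.$ resolves to null when nothing matches
      }
      const path = `./image/${postNumber}.png`;
      await element.screenshot({path});
      saved.push(path);
    } catch (err) {
      // Log the failure and move on to the next post.
      console.error(`Post ${postNumber} failed:`, err.message);
    }
  }
  return saved;
}
```

The returned list of paths makes it easy to report afterwards which posts were actually saved.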