Puppeteer Open Dropdown Menu and then click on [1st 2nd 3rd...] option - javascript

https://www.nhtsa.gov/ratings
I have been trying to make Puppeteer select an option from the dropdown menu so that I can scrape information from the website above.
1. I'm having trouble writing a selector that Puppeteer understands.
What I mean is that after telling Puppeteer to click "manufacturer" at the end of the paragraph, I can't seem to click (or maybe select?) an option.
The default option in this dropdown menu is "select a manufacturer".
2. I would also like to know how to select the 2nd, 3rd, and 4th options without hard-coding them.
I haven't even begun scraping information /sad ;(
const puppeteer = require('puppeteer');
async function spider() {
  let browser; // declared outside try so the catch block can see it
  try {
    browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();
    await page.goto('https://www.nhtsa.gov/ratings');
    await page.click('a[data-target=".manufacturer-search-modal"]');
    // These two clicks are the part that does not work
    await page.click('select');
    await page.click('option[value="AUDI"]');
  } catch (error) {
    console.log(error);
    if (browser) await browser.close();
  }
}
module.exports = spider;

I think there are a few issues. One, sometimes that page has a popup for a survey, which you may need to close first. Two, you need to wait a bit between clicking the .manufacturer-search-modal link and trying to interact with the select box, because the options in the box aren't populated immediately (maybe it's making a request to the server to get the list of options). Three, I think select boxes are a little special and clicking on them doesn't work, but you can use page.select instead. Putting that all together, plus some ugly code for selecting items by number:
const puppeteer = require('puppeteer');
async function spider() {
  let browser; // declared outside try so the catch block can see it
  try {
    browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();
    await page.goto('https://www.nhtsa.gov/ratings');
    try {
      // Give the 'take our survey' box a chance to pop up, and close it if it does
      await page.waitForSelector('.acsCloseButton', { timeout: 1000 });
      await page.click('.acsCloseButton');
    } catch {
      // No survey popup appeared within the timeout; carry on
    }
    await page.click('a[data-target=".manufacturer-search-modal"]');
    // Wait for the options to be populated
    await page.waitForSelector('option:nth-child(2)');
    // Search for AUDI
    await page.select('select', 'AUDI');
    await page.click('.manufacturer-search-submit');
    // Select the third element in the drop-down by reading its value from the DOM
    const values = await page.$$eval('option', (options) => options.map((option) => option.value));
    await page.select('select', values[2]);
    await page.click('.manufacturer-search-submit');
  } catch (error) {
    console.log(error);
    if (browser) await browser.close();
  }
}
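
For the second part of the question (picking the 2nd, 3rd, or 4th option without hard-coding its value), one possible approach is to read the option values from the DOM and hand the chosen one to page.select. A minimal sketch, assuming `page` is an already open Puppeteer page; the helper name `selectNthOption` is made up for illustration and is not part of Puppeteer:

```javascript
// Hypothetical helper (not part of Puppeteer): choose the option at a
// zero-based index inside the <select> matched by selectSelector.
async function selectNthOption(page, selectSelector, index) {
  // Read every option's value from the live DOM.
  const values = await page.$$eval(`${selectSelector} option`, (options) =>
    options.map((option) => option.value)
  );
  if (index < 0 || index >= values.length) {
    throw new RangeError(`No option at index ${index}; found ${values.length}`);
  }
  // page.select sets the value on the <select> element.
  await page.select(selectSelector, values[index]);
  return values[index];
}
```

With this, `await selectNthOption(page, 'select', 2)` would pick the third option. Index 0 is often the placeholder entry, so real choices usually start at 1.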

Related

Puppeteer React/NodeJS form submissions automation with a database

So I'm trying to test a reCAPTCHA implementation by writing my own spambot using React and Puppeteer. My script can already do a single form submission when executed, but what I'm actually hoping for is to have it loop through a database of form submission details, iterating over every row of the CSV file until the database is exhausted.
So far I have the following script:
const puppeteer = require('puppeteer');
// Server Authentication
const username = "username";
const password = 'password';
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
// Pass server side authentication needed for website when accessing url
await page.authenticate({ username, password});
// Load page with form and take screenshot to confirm page has been reached
await page.goto('https://pagewiththeformimtryingtosubmit.com');
await page.screenshot({path: 'example2.png'});
// Click away cookie pop-up banner, take screenshot after
await page.keyboard.press('Escape');
await page.screenshot({path: 'cookiebotcleared.png', fullPage: true});
// Fill in first form fields, screenshot to see if it works
await page.type('#firstname', 'Somename', {delay:500});
await page.type('#lastname', 'Somelastname', {delay:500});
await page.type('#telephone', '123456789', {delay:500});
await page.type('#email_address', 'somename.lastname@gmail.com', {delay:500});
await page.type('#password', 'Chooseapassword', {delay:500});
await page.type('#password-confirmation', 'Chooseapassword', {delay:500});
await page.click('.consent', {delay:500});
await page.click('.submit', {delay: 500});
// Take screenshot after form submission to see if it has worked
await page.screenshot({path: 'formfillout.png', fullPage: true});
await browser.close();
})();
What I'm trying to do, is take a CSV with all the data I've randomized, and then have this script run but take elements from the csv for the various form inputs and loop that over.
I've tried working with CSVToJSON in order to process the csv database into objects that I should then be able to use in my code:
const CSVToJSON = require('csvtojson')
CSVToJSON().fromFile('formbot_database.csv')
.then(users => {
console.log(users);
console.log(users.firstname);
}).catch(err => {
console.log(err);
});
Here's where my first troubles start: I want to take the row headers of my database to map them to variables, so I can process those variables within my script. I first tried users.firstname, but when I console log that, it gives me undefined.
If anyone has any suggestion on how I can work this through, that'd be great. I've tried visiting multiple resources but can't figure it out I'm afraid.
Thanks in advance!
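
One likely cause of the undefined value: csvtojson resolves to an array with one object per CSV row, keyed by the column headers, so the first row's name would be users[0].firstname rather than users.firstname. A rough sketch of looping over the rows follows. It assumes the CSV headers are firstname, lastname, telephone, email_address, and password (guessed from the form field IDs above), and `fillForm` and `submitAllRows` are hypothetical helper names:

```javascript
// Assumed mapping from CSV column headers to form selectors.
const FIELD_SELECTORS = {
  firstname: '#firstname',
  lastname: '#lastname',
  telephone: '#telephone',
  email_address: '#email_address',
  password: '#password',
};

// Type one CSV row (a plain object keyed by header) into the form.
async function fillForm(page, row) {
  for (const [column, selector] of Object.entries(FIELD_SELECTORS)) {
    await page.type(selector, row[column]);
  }
  // The form asks for the password twice.
  await page.type('#password-confirmation', row.password);
}

// Visit the form once per row, fill it in, and submit it.
async function submitAllRows(page, formUrl, rows) {
  for (const row of rows) {
    await page.goto(formUrl);
    await fillForm(page, row);
    await page.click('.consent');
    await page.click('.submit');
  }
}
```

Inside the existing script this might be called as `const rows = await CSVToJSON().fromFile('formbot_database.csv');` followed by `await submitAllRows(page, 'https://pagewiththeformimtryingtosubmit.com', rows);`. Depending on what the site does after a submission, a wait between iterations may be needed.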

Playwright Javascript skip fill selector in Google Form

I use the Playwright framework in JS to autofill an unknown Google Form (meaning I don't know the XPath of each answer field, only the question text; in my case the form asks for address, name, size, and phone number).
const { webkit } = require('playwright');
const URL = 'https://forms.gle/B4r6qZKdyxZCApTWA';
(async () => {
const browser = await webkit.launch({ headless: false });
const page = await browser.newPage();
await page.goto(URL);
await page.fill('input:below(:has-text("Họ và tên"))','name');
await page.fill('input:below(:has-text("Số điện thoại"))','phone number');
await page.fill('input:below(:has-text("Địa chỉ"))','Address');
await page.fill('input:below(:has-text("CMND"))','id');
await page.fill('input:below(:has-text("Game"))','LOL');
await page.pause();
await browser.close();
})();
URL: https://forms.gle/B4r6qZKdyxZCApTWA
The name and number fields are fine, but at the address field things get messed up. It skips the address field, jumps to the ID field, and fills in 'Address' -> 'LOL' -> 'id'.
Google Forms answer fields come in two kinds: input and textarea. I could just change the selector for that one field, but is there a more general way that fits this kind of Google Form?

Image shows 1 thing, but queried data shows another when loading a website

I was trying to query a website, const url = "https://personal.vanguard.com/us/FixedIncomeHome", hoping to automate some functionality with Puppeteer.
I noticed that if I take a screenshot with page.screenshot({path: "preclick.png"}), it shows the page data with tabs. When I follow that up with a query, it does not seem to return the second tab, denoted by the selector a[container="CD"]:
const browser = await puppeteer.launch()
const page = await browser.newPage()
await page.goto(url, {waitUntil: 'networkidle2'})
await page.screenshot({path: "start.png"})
page.evaluate( () => {
document.querySelectorAll("a[container='CD']")[0].click()
})
///...
and I don't really know why this is the case. Ideally, I am trying to click CD and then run an empty search. Since session IDs are tracked, I wanted to do this as a sort of E2E test in order to get the resulting table data.
I see that the content of the tabs is dynamically loaded, so somehow the page has a problem querying it.
I tried something else to see what would happen, waiting for the tag to appear, but it just timed out after 30 seconds:
await page.waitForSelector("a[container='CD']").then( async resolve => {
page.execute( () => document.querySelector("a[container='CD']").click() );
});
I don't know why the screenshot shows the HTML while querying for it from within execute fails. It doesn't make sense to me. Ideally, I want to click the CD tab, then click Search, then loop through the 20 results in the table.
EDIT: I noticed that evaluate was not querying the component correctly because of an iframe. If I want to develop E2E tests, though, I assume there is a way to get a reference to the button and click it, or to simulate a click.
You can get the iframe from a selector. As the iframe has the ID TWRIFrame, you can wait for that selector and then get the contentFrame from that element.
Once you have the frame, the Frame class has almost the same functions as the Page class, e.g. click.
Note that because the iframe is served from another domain, you need to launch the browser with the --disable-features=site-per-process flag.
const browser = await puppeteer.launch({headless: false, args: ['--disable-features=site-per-process']});
const page = await browser.newPage();
await page.goto('https://personal.vanguard.com/us/FixedIncomeHome', {waitUntil: 'networkidle2'});
await page.screenshot({path: "start.png"});
await page.waitForSelector('#TWRIFrame');
const frameElement = await page.$('#TWRIFrame');
const frame = await frameElement.contentFrame();
await frame.click("a[container='CD']");
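
The same steps can be wrapped in a small reusable function. This is only a sketch: `clickInsideFrame` is a hypothetical name, and it additionally waits for the target inside the frame on the assumption that the iframe's content may load later than the outer page:

```javascript
// Hypothetical helper: click an element that lives inside an iframe.
async function clickInsideFrame(page, frameSelector, targetSelector) {
  // waitForSelector resolves to the element handle of the <iframe>.
  const frameHandle = await page.waitForSelector(frameSelector);
  const frame = await frameHandle.contentFrame();
  if (!frame) {
    throw new Error(`${frameSelector} has no content frame`);
  }
  // Wait for the target inside the frame before clicking it.
  await frame.waitForSelector(targetSelector);
  await frame.click(targetSelector);
  return frame;
}
```

For the page above this might be called as `const frame = await clickInsideFrame(page, '#TWRIFrame', "a[container='CD']");`, and the returned frame can then be used for the Search click and for reading the result table.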

Click on an dropdown menu

Hello,
I'm currently working on a Node.js script that uses Puppeteer to scrape the web. As the title says, I'm trying to click on a dropdown menu item, where the dropdown changes depending on what you type. First, here is my code:
// Navigate to the Homepage
await page.goto('https://www.futbin.com/');
await page.click('#player_search');
await page.keyboard.type(playerName);
await page.keyboard.press('ArrowDown');
await page.keyboard.press('ArrowDown');
// Create a screenshot
await page.screenshot({
path: 'screenshot.png'
});
So basically I take the screenshot just to prove that the headless browser does the right thing.
The website is futbin; if you want to see how it works, taking a look at it in the inspector might help.
But my real problem is that normally when you press Enter you go directly to the player page (which is where I want to get). With my script, though, I always get the error "no target", so keyboard.press('Enter') doesn't work. Other suggestions from SO didn't work for me either, as the dropdown isn't native and doesn't have numbered indexes.
I would really appreciate some suggestions!
I also wanted to include the HTML of the first row of the dropdown, but that never worked, so I would appreciate it if you'd take a look at the website, please!
You were very close. You don't need to use the arrow keys, as you can find the item to click another way. The secret is to add a waitForSelector, because the dropdown makes a call to an API endpoint. Also notice the waitForSelector that waits for the final page to render before we take the screenshot.
Therefore just do this:
const puppeteer = require('puppeteer');
async function run() {
const browser = await puppeteer.launch( {
headless: false
});
const page = await browser.newPage();
await page.goto('https://www.futbin.com/');
await page.type('#player_search', "Dave");
await page.waitForSelector("ul li a[data-id]");
await page.click("ul li a[data-id]");
await page.waitForSelector('#cal');
await page.screenshot( { path: "./dave.png"});
await browser.close();
};
run();
EDIT: ADDITIONAL
To select any index, use:
let index = 3;
let selector = "ul li:nth-child(" + index +") a[data-id]"
await page.click(selector);
This goes to the third item in the dropdown. HTH
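
A small sketch that combines the index-based selector with the wait from the main answer, so the click is not attempted before the suggestion list has loaded. The function names `suggestionSelector` and `clickSuggestion` are made up for illustration:

```javascript
// Build the selector for the suggestion at a 1-based position.
// (:nth-child counts from 1, so index 1 is the first suggestion.)
function suggestionSelector(index) {
  if (!Number.isInteger(index) || index < 1) {
    throw new RangeError('index must be a positive integer');
  }
  return `ul li:nth-child(${index}) a[data-id]`;
}

// Wait for that suggestion to appear, then click it.
async function clickSuggestion(page, index) {
  const selector = suggestionSelector(index);
  await page.waitForSelector(selector);
  await page.click(selector);
}
```

In the script above, `await clickSuggestion(page, 3);` would replace the separate waitForSelector and click calls.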

Puppeteer: How to submit a form?

Using Puppeteer, how can you programmatically submit a form? So far I've been able to do this using page.click('input[type="submit"]') if the form actually includes a submit input. But for forms that don't include a submit input, focusing on the form's text input element and using page.press('Enter') doesn't seem to cause the form to submit:
const puppeteer = require('puppeteer');
(async() => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://stackoverflow.com/', {waitUntil: 'load'});
console.log(page.url());
// Type our query into the search bar
await page.focus('.js-search-field');
await page.type('puppeteer');
// Submit form
await page.press('Enter');
// Wait for search results page to load
await page.waitForNavigation({waitUntil: 'load'});
console.log('FOUND!', page.url());
// Extract the results from the page
const links = await page.evaluate(() => {
const anchors = Array.from(document.querySelectorAll('.result-link a'));
return anchors.map(anchor => anchor.textContent);
});
console.log(links.join('\n'));
browser.close();
})();
If you are attempting to fill out and submit a login form, you can use the following:
await page.goto('https://www.example.com/login');
await page.type('#username', 'username');
await page.type('#password', 'password');
// Start waiting before clicking so a fast navigation is not missed
await Promise.all([
  page.waitForNavigation(),
  page.click('#submit'),
]);
console.log('New Page URL:', page.url());
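
For the Enter-key case in the question, here is a possible sketch using the current API, where page.type takes a selector and key presses go through page.keyboard. The helper name `submitWithEnter` is made up, and the sketch assumes that submitting the form triggers a full page navigation:

```javascript
// Hypothetical helper: type into a field and submit its form with Enter.
async function submitWithEnter(page, fieldSelector, text) {
  // page.type focuses the element before typing into it.
  await page.type(fieldSelector, text);
  // Start waiting before pressing Enter so a fast navigation is not missed.
  await Promise.all([
    page.waitForNavigation({ waitUntil: 'load' }),
    page.keyboard.press('Enter'),
  ]);
  return page.url();
}
```

For the Stack Overflow example this might look like `const url = await submitWithEnter(page, '.js-search-field', 'puppeteer');`.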
Try this
const form = await page.$('form-selector');
await form.evaluate(form => form.submit());
For v0.11.0 and later:
await page.$eval('form-selector', form => form.submit());
I was scraping a SPA, and I had to use waitForNetworkIdle since the form submit was not triggering a page navigation event. Instead it submitted data to the server, and updated the DOM of the page which was already loaded.
// waitForNetworkIdle resolves with no value, so there is nothing to destructure
await Promise.all([
  page.waitForNetworkIdle(),
  page.click('#form-submit-button'),
]);
When to use waitForNetworkIdle
To check, open a normal web browser, submit the form, and see whether the page URL changes. If it does not, you should probably use waitForNetworkIdle.
Also, take this advice with a grain of salt; I've only been using Puppeteer for an hour.
