I am trying to parse a website using Puppeteer. Everything works well, except that the iframe does not load properly.
This is how it loads
Here is my code:
const browser = await puppeteer.launch({
  args: [
    "--no-sandbox",
  ],
});
const page = await browser.newPage();
await page.setViewport({ width: 1440, height: 600 });
// await waitForFrame(page);
await page.goto(url, {
  waitUntil: 'networkidle2',
  timeout: 0,
});
await page.evaluate(({ applicationUrl }: any) => {
  // Here I want to evaluate the iframes
});
When I log the iframes, I don't get the actual iframe link in the parsed page,
and I don't see the iframe tag there either.
But when I look at the actual page, I can see the iframe link
and also the iframe tag.
Here is the link to the actual page I am trying to parse:
https://leza.notion.site/Trade-log-721b1ebb4cc74175abb55408b6f2d0c3
Any help would be highly appreciated.
The iframe is lazy loaded.
Try scrolling around, wait a bit for the iframe to load, and then get the iframe:
await page.goto(url, {
  waitUntil: 'networkidle2',
  timeout: 0,
});
await page.evaluate(async () => {
  const scroll = document.querySelector('.notion-scroller');
  scroll.scrollBy(0, 900);
  document.querySelector('.notion-video-block').scrollIntoView();
});
// wait a bit for the iframe to load
await page.waitFor(5000);
const iframeSelector = 'iframe';
const frameHandle = await page.$(iframeSelector);
const src = await page.evaluate(el => el.src, frameHandle);
console.log(src);
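A fixed sleep is either longer than needed or too short, and page.waitFor is deprecated in newer Puppeteer releases. As a minimal sketch, assuming the iframe is attached to the DOM once the block has been scrolled into view, the wait can be tied to the element itself. getIframeSrc is a hypothetical helper name, and the 30-second default timeout is an arbitrary choice:

```javascript
// Hypothetical helper (not from the original answer).
// `page` is a Puppeteer Page that has already been navigated and scrolled.
async function getIframeSrc(page, selector = 'iframe', timeout = 30000) {
  // Resolves with an element handle as soon as a matching element is in the
  // DOM; rejects if `timeout` milliseconds pass first.
  const frameHandle = await page.waitForSelector(selector, { timeout });
  // Read the src property inside the page context, as in the answer above.
  return page.evaluate((el) => el.src, frameHandle);
}
```

Called as `const src = await getIframeSrc(page);` after the scrolling step, this should reject with a timeout error if no iframe ever appears, rather than silently returning nothing.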
Related
I am trying to send a PUT request to the final URL, but there is a redirect before the final URL is reached. Sending a fetch request from inside the final page would also be fine. When I write a fetch call in the DevTools console it works, but of course I need to do it from code.
When I set await page.setRequestInterception(true); and page.once('request', (req) => {...}), the PUT request is sent to the first page, which I don't want.
Let's say the first URL is https://example.com/first --> this redirects to the final URL.
The final URL is https://example.com/final --> this is where I want to send the PUT request and retrieve the status code. I have tried setting a timer, and I have tried getting the current URL with page.url() and branching on it with if/else statements, but neither worked.
Here is my current code:
app.get('/cookie', async (req, res) => {
  puppeteer.use(StealthPlugin());
  const browser = await puppeteer.launch({
    headless: false,
    executablePath: `C:/Program Files (x86)/Google/Chrome/Application/chrome.exe`,
    defaultViewport: null,
    args: ['--start-maximized'],
    slowMo: 150,
  });
  const page = await browser.newPage();
  await page.setUserAgent(randomUserAgent.getRandom());
  page.setDefaultNavigationTimeout(0);
  page.setJavaScriptEnabled(true);
  await page.goto(
    'finalURL',
    { waitUntil: 'load', timeout: 0 }
  );
  await delay(5000);
  await page.setRequestInterception(true);
  page.once('request', (request) => {
    request.continue({
      method: 'PUT',
    });
    page.setRequestInterception(false);
  });
  let statusCode;
  await page.waitForResponse((response) => {
    statusCode = response.status();
    return true;
  });
  res.json(statusCode);
});
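Since the question says a fetch from inside the final page would be acceptable, one possible direction is to let the navigation finish first and only then issue the PUT from the page context, instead of intercepting the navigation request. This is only a sketch, assuming the redirect is an ordinary HTTP redirect that page.goto follows and that the endpoint accepts a PUT without a body. putOnFinalPage is a hypothetical helper name:

```javascript
// Hypothetical helper (an assumption, not a confirmed fix).
// `page` is a Puppeteer Page; `startUrl` is the first URL, which redirects.
async function putOnFinalPage(page, startUrl) {
  // page.goto resolves after HTTP redirects have been followed, so
  // page.url() reports the final URL afterwards.
  await page.goto(startUrl, { waitUntil: 'load', timeout: 0 });
  const finalUrl = page.url();
  // The callback runs in the browser, so fetch is sent from the final
  // page's origin, like a fetch typed into the DevTools console.
  const status = await page.evaluate(async (target) => {
    const response = await fetch(target, { method: 'PUT' });
    return response.status;
  }, finalUrl);
  return { finalUrl, status };
}
```

In the route handler this would look like `const { status } = await putOnFinalPage(page, 'https://example.com/first'); res.json(status);`, with no request interception involved.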
I have tried Puppeteer locally and it worked for other websites, but mine has a few iframes and it doesn't capture them.
const { chromium } = require("playwright");
(async () => {
  let browser = await chromium.launch();
  let page = await browser.newPage();
  await page.setViewportSize({ width: 1280, height: 1080 });
  await page.goto("https://raddy.dev/blog/build-news-website-with-node-js-express-ejs-wp-rest-api/");
  await page.screenshot({ path: `nyt-playwright-chromium.png` });
  await browser.close();
})();
How can I capture the iframes in the screenshot as well? It would be great if I could just press a button on my website and get a full screenshot without any missing parts.
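One thing that may be worth trying is to wait for every frame to finish loading before taking the screenshot, and to request the full page rather than just the viewport. The sketch below assumes the iframes are already attached when page.goto returns. Lazily loaded iframes would additionally need the page to be scrolled first. screenshotWithFrames is a hypothetical helper name:

```javascript
// Hypothetical helper (an assumption, not a confirmed fix).
// `page` is a Playwright Page that has already been navigated.
async function screenshotWithFrames(page, path) {
  // page.frames() lists the main frame plus every attached iframe.
  const frames = page.frames();
  // Wait until each frame has fired its load event.
  await Promise.all(frames.map((frame) => frame.waitForLoadState('load')));
  // fullPage captures the whole scrollable page, not only the viewport.
  await page.screenshot({ path, fullPage: true });
  // Returning the frame count makes it easy to log how many were awaited.
  return frames.length;
}
```

In the script above, this would replace the plain page.screenshot call: `await screenshotWithFrames(page, 'nyt-playwright-chromium.png');`.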
I have used Puppeteer to capture a screenshot of my React page, but it produces a blank screenshot instead of the charts that are actually present on the page. Here is my code:
const puppeteer = require('puppeteer');
const url = process.argv[2];
if (!url) {
  throw "Please provide URL as a first argument";
}
// An async function already returns a promise, so no new Promise wrapper is needed
async function run () {
  const browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox'], headless: true, ignoreHTTPSErrors: true});
  try {
    const page = await browser.newPage();
    await page.goto(url, {
      timeout: 30000,
      waitUntil: "networkidle0"
    });
    // Full-page JPEG, returned as a base64 string
    return await page.screenshot({quality: 100, fullPage: true, encoding: "base64", type: "jpeg"});
  } finally {
    // Close the browser even if navigation or the screenshot fails
    await browser.close();
  }
}
run().then(console.log).catch(console.error);
The reason could be that the document loads before the charts do, and Puppeteer takes the screenshot as soon as the document has loaded. Can anyone please help me with this? The screenshot should only be taken once the charts have finished loading, so that they are captured properly. Thanks in advance.
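If the charts are indeed rendered after the document loads, one option is to wait for a chart element to become visible before taking the screenshot. This is a sketch under assumptions: chartSelector is a placeholder for whatever element the charting library actually renders (for example a canvas or svg node), and screenshotWhenReady is a hypothetical helper name:

```javascript
// Hypothetical helper (an assumption, not a confirmed fix).
// `page` is a Puppeteer Page that has already been navigated.
async function screenshotWhenReady(page, chartSelector, timeout = 30000) {
  // visible: true waits until the element is in the DOM and not hidden;
  // the call rejects if that does not happen within `timeout` milliseconds.
  await page.waitForSelector(chartSelector, { visible: true, timeout });
  // Same screenshot options as in the question's code.
  return page.screenshot({
    type: 'jpeg',
    quality: 100,
    fullPage: true,
    encoding: 'base64',
  });
}
```

A visible element does not guarantee that a chart animation has finished, so a library-specific "render complete" signal, if one exists, would be a stronger condition to wait on.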
const browser = await puppeteer.launch({headless: false, args: ['--window-size=950,340', '--window-position=970,700']});
const page = await browser.newPage();
const url = "https://www.qimai.cn/rank/index/brand/grossing/device/iphone/country/us/genre/6014/date/" + today;
await page.goto(url, {waitUntil: 'load'});
await page.setViewport({
  width: 1200,
  height: 1000
});
Right now, when I minimise the Chromium browser, the script stops running and stays paused until I reopen the browser. I want it to keep working while it is minimised, so that I can get on with my own tasks as well.
I'm using Puppeteer to do some web scraping and I'm having trouble. The website I'm trying to scrape is this one, and I'm trying to take a screenshot of the calendar that appears after clicking the button "Reserve now" > "Dates".
const puppeteer = require('puppeteer');
const fs = require('fs');
void (async () => {
  try {
    const browser = await puppeteer.launch({headless: false});
    const page = await browser.newPage();
    await page.goto('https://www.marriott.com/hotels/travel/reumd-le-meridien-ra-beach-hotel-and-spa');
    await page.setViewport({ width: 1920, height: 938 });
    await page.waitForSelector('.m-hotel-info > .l-container > .l-header-section > .l-m-col-2 > .m-button');
    await page.click('.m-hotel-info > .l-container > .l-header-section > .l-m-col-2 > .m-button');
    await page.waitForSelector('.modal-content');
    await page.waitFor(5000);
    await page.waitForSelector('.js-recent-search-inputs .js-datepick-container .l-h-field-input');
    await page.click('.js-recent-search-inputs .js-datepick-container .l-h-field-input');
    await page.waitFor(5000);
    await page.screenshot({ path: 'myscreenshot.png'});
    await browser.close();
  } catch (error) {
    console.log(error);
  }
})();
This is what myscreenshot.png should contain:
but I'm getting this instead:
As you can see, myscreenshot.png doesn't contain the calendar. I don't understand what I'm doing wrong, since I click on the right node and I even give it enough time to load everything.
Thank you in advance!
Edit: I forgot to say that I have also tried Puppeteer recorder in order to achieve this and I haven't had luck either.
As you have many .l-h-field-input elements, I would try being more specific there.
This worked for me:
await page.click('.js-recent-search-inputs .js-datepick-container .l-h-field-input');