TestCafe scripts mostly fail in Mozilla Firefox due to a page load issue - javascript

The page is very slow to load when run via TestCafe in Firefox. If I load the page in the browser manually, it loads reasonably fast, so I'm not sure what is causing the slowdown. Even basic TestCafe scripts show this slowness in Firefox, and overall script execution time is much longer because of it. When the same script runs in Chrome there is no page load issue, and execution time is roughly half of what it is in Firefox.
To quantify it: the pass ratio of our scripts is 70-75% in Chrome, but in Firefox it's barely 20-25%. I'm not sure what causes this issue.
Any help or suggestion would be much appreciated. Thank you in advance!
Sample code:
import { Role, Selector } from "testcafe";

fixture`FirefoxSlowness`.beforeEach(async t => {
    await t.wait(8000);
    await t.navigateTo("https://dxc.com/us/en");
    await t.maximizeWindow();
});

test('DXCTestWebsite_Test_01', async t => {
    const Logo = Selector('#Logo_and_Copy').exists;
    await t.expect(Logo).ok();
    await t.click('[class="nav-link mega-menu-toggle collapsed"]');
    await t.wait(1000);
    await t.click('#onetrust-accept-btn-handler');
    await t.click('[class="mega-menu__nav-item nav-item"]');
    await t.click('[class="card-link"]');
    await t.click('#contact-us-link');
    const contactus = Selector('[class="cmp-accordion__title"]');
    await t.click(contactus.withExactText("Careers"));
    await t.wait(2000);
    await t.click('[href="/us/en/careers"]');
    await t.click('#typehead');
    await t.typeText('#typehead', '42A - Human Resources Specialist');
    await t.click('#ph-search-backdrop');
    await t.click('[data-ph-id="ph-page-element-page24-AA0w8h"]');
});
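One way to narrow this down is to time the navigation itself, separately from the clicks and assertions. A minimal sketch (assuming the same URL and logo selector as above; the console timing is only meant for comparing browsers against each other):

import { Selector } from "testcafe";

fixture`LoadTiming`;

test('Measure raw page load time', async t => {
    const start = Date.now();
    await t.navigateTo("https://dxc.com/us/en");
    // Resolve a known selector so both browsers are timed to the same point.
    await Selector('#Logo_and_Copy', { timeout: 60000 })();
    console.log(`Page ready after ${Date.now() - start} ms`);
});

Running this with testcafe firefox test.js and testcafe chrome test.js then gives directly comparable numbers.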

Related

Await, catch and assign to variables - fetch/xhr responses node js, puppeteer

I am using Puppeteer to get page data, but unfortunately there is no way for me to reproduce all the requests myself.
Hence the question: after opening the site, how can I get the JSON contained in the responses of all Fetch/XHR requests whose name includes v2?
As I understand it, this requires waiting for those responses.
Peeking at a request body and replaying a similar request is not an option, because the body contains code that is randomly generated on every call; that is why I simply want to capture all the JSON responses from the requests named v2.
I am attaching a screenshot and my code. Please point me in the right direction; I will be grateful for any help!
// puppeteer-extra is a drop-in replacement for puppeteer;
// it augments the installed puppeteer with plugin functionality
import puppeteer from "puppeteer-extra";
// add stealth plugin and use defaults (all evasion techniques)
import StealthPlugin from "puppeteer-extra-plugin-stealth";

export async function ProductAPI() {
    puppeteer.use(StealthPlugin());
    const browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();
    await page.goto('here goes link for website');
    const pdata = await page.content(); // this just returns the page HTML
    console.log(pdata);
    await browser.close();
}

ProductAPI();
link for image: https://i.stack.imgur.com/ZR6T1.png
I know the code I wrote just returns HTML. I'm trying to figure out how to get the data I need; I googled for a very long time but could not find the answer.
It is very important that this runs on Node.js (JavaScript); whether it uses Puppeteer or something else does not matter.
This works!
import puppeteer from "puppeteer-extra";
import StealthPlugin from "puppeteer-extra-plugin-stealth";

async function SomeFunction() {
    puppeteer.use(StealthPlugin());
    const browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();
    page.on('response', async (response) => {
        if (response.url().includes('write_link_here')) {
            console.log('XHR response received');
            const HTMLdata = await response.text();
            console.log(HTMLdata);
        }
    });
    await page.goto('some_website_link');
}
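Since the goal is JSON rather than raw text, Puppeteer's response.json() can parse the body directly. A small variation on the handler above (the 'v2' filter is an assumption based on the request names in the question, and the try/catch guards against matching responses that are not valid JSON):

page.on('response', async (response) => {
    if (response.url().includes('v2')) {
        try {
            const data = await response.json(); // parse the response body as JSON
            console.log(response.url(), data);
        } catch (err) {
            // body was empty or not JSON; skip it
        }
    }
});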

HTTPRequests are lost/missing when using Puppeteer with setRequestInterception enabled

Here's the scoop.
I'm trying to use Puppeteer v18.0.5 with the bundled Chromium browser against a specific website, on Node v16.16.0. However, when I enable request interception via page.setRequestInterception(true), all of the HTTPRequests for image resources are lost. My handler is invoked far less often while intercepting than when not intercepting, and the page never fires any requests for images. When I disable interception, the page loads normally. Yes, I know about invoking continue() on all requests; I'm already doing that in the request handler on the page.
I've also pored over the Puppeteer issues pages and found similar symptoms in some earlier Puppeteer versions, but those were all different issues that have since been resolved. This one seems unique.
I've looked through Puppeteer source code as well as CDP events to try and find any explanation, but have found none.
As an important note for anyone trying to reproduce this, you must be proxied through a server in the London general area in order to successfully load this site.
Here's my code to reproduce:
const puppeteer = require('puppeteer');

(async () => {
    const options = {
        browserWidth: 1366,
        browserHeight: 983,
        intercepting: false
    };
    const browser = await puppeteer.launch({
        args: [`--window-size=${options.browserWidth},${options.browserHeight}`],
        defaultViewport: { width: options.browserWidth, height: options.browserHeight },
        headless: false
    });
    const page = (await browser.pages())[0];
    page.on('request', async (request) => {
        console.log(`Request: ${request.method()} | ${request.url()} | ${request.resourceType()} | ${request._requestId}`);
        if (options.intercepting) await request.continue();
    });
    await page.setRequestInterception(options.intercepting);
    await page.goto('https://vegas.williamhill.com', { waitUntil: 'networkidle2', timeout: 65000 });
    // Give a moment to view the page in headful mode before closing the browser.
    await new Promise(resolve => setTimeout(resolve, 5000));
    await browser.close();
})();
Here's what the page looks like with intercepting disabled:
Expected Page Load
Here's what the page looks like with intercepting enabled and continuing all requests.
Page load while intercepting and continuing all requests
With request interception disabled my handler is invoked for 104 different requests. But with the interception enabled it's only invoked 22 times. I'm not hitting a navigation timeout as the .goto() method returns before my timeout each time.
Any insight into what configuration/strategy I'm missing would be immensely appreciated.
Maybe you are intercepting some JavaScript files that initiate the requests you are not seeing?
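If it helps to narrow that down, Puppeteer also emits requestfailed and requestfinished events, so you can compare what starts against what actually completes while interception is on. A small diagnostic sketch to register before page.goto() (these are standard Puppeteer page events; the logging format is just illustrative):

page.on('requestfailed', (request) => {
    // failure() describes why the request never completed, e.g. net::ERR_ABORTED
    console.log(`Failed: ${request.url()} | ${request.failure().errorText}`);
});
page.on('requestfinished', (request) => {
    console.log(`Finished: ${request.url()}`);
});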

Browser console and Puppeteer cannot find certain selector

I am trying to build a simple scraper for this website by using puppeteer.
The code goes as follows:
const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        headless: false
    });
    const page = await browser.newPage();
    let pagelink = "https://www.speisekarte.de/berlin/restaurants?page=1";
    await page.waitFor(3 * 1000);
    await page.goto(pagelink);
    await page.waitFor(3 * 1000);
    await page.waitForSelector("#notice");
})();
However, I cannot access the cookie overlay notice, which should have the id "notice".
await page.waitForSelector("#notice") does not work in my Puppeteer code.
Nor does document.getElementById("notice") in Chromium, when I use its console manually during the session. It does not work in Firefox's console either. Funnily enough, queries like
document.querySelectorAll("button")
work as expected. I checked with a colleague and she can access the element using the queries mentioned above in her Chrome and in her Firefox browser. She also uses a Mac. Any idea what is happening here? Any help would be much appreciated.
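One possibility worth ruling out (an assumption on my part, since no answer is given here): cookie-consent overlays are often rendered inside an iframe, and selectors evaluated in the top frame cannot see into child frames, which would explain why #notice is invisible while top-level buttons are found. A quick sketch that checks every frame for the element:

for (const frame of page.frames()) {
    // frame.$() resolves to null when the selector matches nothing in that frame
    const notice = await frame.$('#notice');
    console.log(frame.url(), notice ? 'has #notice' : 'no #notice');
}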

How can I use Puppeteer with my current Chrome (keeping my credentials)?

I'm trying to use Puppeteer for scraping, and I need it to use my current Chrome so that I keep all my credentials, instead of re-logging in and typing my password every time, which is a real waste of time!
Is there a way to connect to it? How do I do that?
I'm currently using Node v11.1.0 and Puppeteer 1.10.0.
let scrape = async () => {
    const browser = await log();
    const page = await browser.newPage();
    const delayScroll = 200;
    // Login
    await page.goto('somesite.com');
    await page.type('#login-email', '*******');
    await page.type('#login-password', "******");
    await page.click('#login-submit');
    // Wait to login
    await page.waitFor(1000);
}
Ideally I would not need that login step at all: I want to open the page headless (I don't want to see the page opening; I only use the scraped info in Node), but with my current Chrome, which is already logged in and has the information I need. (In the end I want to use this as a Chrome extension.)
Thanks in advance if someone knows how to do that.
First, welcome to the community.
You can use Chrome instead of Chromium, but honestly in my case it produced a lot of errors and made a mess of my personal tabs. So instead you can create and save a profile, then log in once with a current or a new account.
In your code you have a function called "log"; I'm guessing that is where you launch Puppeteer:
const browser = await log()
Inside that function, pass launch arguments that create a relative directory for your profile data:
const browser = await puppeteer.launch({
    args: ["--user-data-dir=./Google/Chrome/User Data/"]
});
Run your application and log in with an account; the next time you run it, your credentials should still be there.
If you have any doubts, please add a comment.
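If you specifically want to attach to the Chrome you already run (rather than a saved profile directory), recent Puppeteer versions can also connect to a browser started with remote debugging enabled. A sketch, assuming Chrome was launched with --remote-debugging-port=9222 (note that the browserURL option needs a newer Puppeteer than the 1.10.0 mentioned above; older versions must pass a browserWSEndpoint instead):

const puppeteer = require('puppeteer');

(async () => {
    // Attach to a Chrome started with: chrome --remote-debugging-port=9222
    const browser = await puppeteer.connect({
        browserURL: 'http://127.0.0.1:9222'
    });
    const page = await browser.newPage();
    await page.goto('https://example.com'); // reuses the existing logged-in session
    // disconnect() detaches without closing your browser
    browser.disconnect();
})();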

Using page.getMetrics() to get page load time in puppeteer

I am trying to use Puppeteer to measure how fast a set of web sites loads in my environment. My focus is on the quality and speed of the network connection, so I am happy to know the time taken for a page to load, for a layman's definition of load: when all images and HTML have been downloaded by the browser.
By using puppeteer I can run the test repeatedly and measure the difference in load times precisely.
I can see that in 64.0.3240.0 (r508693) page.getMetrics and event: 'metrics' have landed, which should help me in getting what I am looking for.
But being a newbie in Node and JS, I am not sure how to read page.getMetrics or which of the different key/value pairs carries useful information in my context.
My current pathetic attempt at reading metrics is as follows:
const puppeteer = require('puppeteer');

async function run() {
    const browser = await puppeteer.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox'] });
    const page = await browser.newPage();
    page.on('load', () => console.log("Loaded: " + page.url()));
    await page.goto('https://google.com');
    const metrics = page.getMetrics();
    console.log(metrics.Documents, metrics.Frames, metrics.JSEventListeners);
    await page.goto('https://yahoo.com');
    await page.goto('https://bing.com');
    await page.goto('https://github.com/login');
    browser.close();
}

run();
Any help in getting this code to some thing more respectable is much appreciated :)
In recent versions you have page.metrics() available. It returns a promise resolving to an object with a bunch of numbers, including:
Timestamp - when the metrics sample was taken
LayoutDuration - combined duration of all page layouts
TaskDuration - combined duration of all tasks performed by the browser
Check out the docs for the full list.
You can use it like this:
await page.goto('https://github.com/login');
const gitMetrics = await page.metrics();
console.log(gitMetrics.Timestamp)
console.log(gitMetrics.TaskDuration)
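One caveat (my addition, not part of the answer above): page.metrics() reports CPU and DOM counters rather than wall-clock load time. If the goal is "time until all HTML and images are downloaded", the browser's own Navigation Timing data may be closer to what you want; a sketch using page.evaluate (loadEventEnd and navigationStart are standard Navigation Timing fields, not Puppeteer-specific):

await page.goto('https://github.com/login');
const loadTime = await page.evaluate(() => {
    // time from navigation start until the load event finished, in milliseconds
    const t = performance.timing;
    return t.loadEventEnd - t.navigationStart;
});
console.log(`Page load took ${loadTime} ms`);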
