Scraping a table from an iframe with Playwright - JavaScript

Hi, I'm trying to use $$eval on a website that works with iframes using Playwright, but I am not getting the information from the table that I'm trying to scrape.
Here is a snippet of my code:
tableData = await frame.$$eval('body > app-root > document-list > div.container-document-list > div.table-document-list > mat-card > mat-card-content > mat-table > mat-row', (users) => {
  // Map every mat-row to a plain object, one property per column
  return users.map(user => {
    const folio = user.querySelector('mat-cell:nth-child(1)')
    const fecha = user.querySelector('mat-cell:nth-child(2)')
    const cliente = user.querySelector('mat-cell:nth-child(3)')
    const vendedor = user.querySelector('mat-cell:nth-child(4)')
    const cierre = user.querySelector('mat-cell:nth-child(5)')
    const estado = user.querySelector('mat-cell:nth-child(6)')
    const total = user.querySelector('mat-cell:nth-child(7)')
    return {
      Folio: folio ? folio.textContent.trim() : '',
      Fecha: fecha ? fecha.textContent.trim() : '',
      Cliente: cliente ? cliente.textContent.trim() : '',
      Vendedor: vendedor ? vendedor.textContent.trim() : '',
      Cierre: cierre ? cierre.textContent.trim() : '',
      Estado: estado ? estado.textContent.trim() : '',
      Total: total ? total.textContent.trim() : '',
    }
  })
})
frame is defined this way:
const frame = page.mainFrame()
The full path that does read one value from a row is this:
const valores1 = page.frameLocator('iframe[name="main"]').locator('body > app-root > document-list > div.container-document-list > div.table-document-list > mat-card > mat-card-content > mat-table > mat-row:nth-child(2) > mat-cell.mat-cell.cdk-column-IDNumCot.mat-column-IDNumCot.ng-star-inserted');
texts1 = await valores1.allTextContents();
I am not sure why the $$eval code is not reading the table and returning the information. If someone can point me toward the solution, I would be thankful.
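One possible cause, judging only from the two snippets above: page.mainFrame() is the top-level document, while the locator that works goes through iframe[name="main"], so $$eval may be searching a frame that does not contain the table. Below is a minimal sketch, assuming the iframe's name attribute really is "main" and that page is an already-navigated Playwright Page. The names rowsToRecords, scrapeTable and COLUMNS, and the shortened selector, are hypothetical and chosen for illustration only.

```javascript
// Column names in table order (taken from the question's code).
const COLUMNS = ['Folio', 'Fecha', 'Cliente', 'Vendedor', 'Cierre', 'Estado', 'Total'];

// Runs in the browser: turns mat-row elements into plain objects.
// It only uses its arguments, because Playwright serializes it into the page.
function rowsToRecords(rows, columns) {
  return rows.map((row) => {
    const record = {};
    columns.forEach((name, i) => {
      const cell = row.querySelector(`mat-cell:nth-child(${i + 1})`);
      record[name] = cell ? cell.textContent.trim() : '';
    });
    return record;
  });
}

// Assumption: `page` is a Playwright Page that has already loaded the site.
async function scrapeTable(page) {
  // Look up the child frame by name instead of using page.mainFrame().
  const frame = page.frame({ name: 'main' });
  if (!frame) throw new Error('iframe named "main" not found');
  // Wait until at least one row has been rendered inside the iframe.
  await frame.waitForSelector('mat-table mat-row');
  return frame.$$eval('mat-table mat-row', rowsToRecords, COLUMNS);
}
```

If the table is still empty with this approach, checking that the rows exist at the moment of the call (for example by logging the row count) would be a reasonable next step.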

Related

Apify - Extract URL in HTML code

I would like to extract a URL from HTML code with the Apify scraper.
Here is the HTML code:
<a class="app-aware-link profile-rail-card__profile-link t-16 t-black t-bold tap-target" href="https://www.linkedin.com/in/benjaminejzenberg?miniProfileUrn=urn%3Ali%3Afs_miniProfile%3AACoAAAj58zYBTN8loEzvrFJhh-16iFZ8gnfPSGU" data-test-app-aware-link="">
<div class="single-line-truncate t-16 t-black t-bold mt2">
Voir le profil complet
</div>
</a>
And this is my input code in Apify:
async function pageFunction(context) {
  const $ = context.jQuery;
  const pageTitle = $('title').first().text();
  const h1 = $('h1').first().text();
  const first_h2 = $('h2').first().text();
  const random_text_from_the_page = $('p').first().text();
  const author_profile_link = $('div.scaffold-layout.scaffold-layout--breakpoint-xl.scaffold-layout--sidebar-main-aside.scaffold-layout--reflow > div > div > div > div > div > div > div.pt3.ph3.pb4.break-words > a:nth-child(5) a[href]').text();
  context.log.info(`URL: ${context.request.url}, TITLE: ${pageTitle}`);
  await context.enqueueRequest({ url: 'http://www.example.com' });
  return {
    url: context.request.url,
    pageTitle,
    h1,
    first_h2,
    random_text_from_the_page,
    author_profile_link
  };
}
Regards :)
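A hedged observation on the code above: jQuery's .text() returns the text inside the element ("Voir le profil complet" here), while the URL lives in the href attribute, which .attr('href') reads. The sketch below assumes the class profile-rail-card__profile-link from the HTML in the question identifies the link; extractProfileLink is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the href of the profile link, or null if absent.
// `$` is expected to be the jQuery function (context.jQuery in the question).
function extractProfileLink($) {
  // .attr('href') reads the attribute; .text() would return the link label.
  const href = $('a.profile-rail-card__profile-link').first().attr('href');
  return href || null;
}
```

Inside pageFunction this could be used as const author_profile_link = extractProfileLink($); in place of the long positional selector, which tends to break whenever the page layout changes.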

How can I get the img src from this page with puppeteer?

I am trying to get some data from this wikipedia page: https://en.wikipedia.org/wiki/List_of_mango_cultivars
I can get everything that I need except the img src with this code
const recordList = await page.$$eval(
  'div#mw-content-text > div.mw-parser-output > table > tbody > tr',
  (trows) => {
    let rowList = []
    trows.forEach((row) => {
      let record = { name: '', image: '', origin: '', notes: '' }
      record.image = row.querySelector('a > img').src
      const tdList = Array.from(row.querySelectorAll('td'), (column) => column.innerText)
      const imageSrc = row.querySelectorAll('a > img').getAttribute('src')
      record.name = tdList[0]
      record.origin = tdList[2]
      record.notes = tdList[3]
      rowList.push(record)
    })
    return rowList
  }
)
The error I am getting: Evaluation failed: TypeError: Cannot read properties of null (reading 'src')
You can wrap your record.image line in a conditional like this:
if (row.querySelector('a > img')) {
  record.image = row.querySelector('a > img').src
}
This checks whether an img inside an a tag exists in the row and only then adds its src to the object, so rows without an image (such as header rows) no longer throw.
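The same guard can be written more compactly with optional chaining, which yields undefined instead of throwing when querySelector returns null. This is only a sketch: rowToRecord is a hypothetical helper, and its body would need to live inside the $$eval callback, since that callback runs in the browser and cannot see functions defined in Node.

```javascript
// Hypothetical helper: converts one <tr> into a record without throwing
// when the row has no image or fewer cells than expected.
function rowToRecord(row) {
  const img = row.querySelector('a > img');
  const cells = Array.from(row.querySelectorAll('td'), (td) => td.innerText);
  return {
    name: cells[0] ?? '',
    image: img?.src ?? '', // empty string when the row has no <img>
    origin: cells[2] ?? '',
    notes: cells[3] ?? '',
  };
}
```

With this helper the loop in the question reduces to trows.map(rowToRecord), and the unused imageSrc line (which calls getAttribute on a NodeList) can be dropped.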

Socket hang up when I use Axios

I am trying to use Axios with a proxy server to read the text at a CSS selector on different pages.
But I get the error: "Error: socket hang up"
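// Imports assumed from the identifiers used below; adjust to your setup.
import axios from 'axios';
import * as cheerio from 'cheerio';
import { HttpsProxyAgent } from 'https-proxy-agent';
import UserAgent from 'user-agents';

// `proxys` is assumed to be an array of proxy URLs defined elsewhere.
async function get_condition(url: string) {
  // Pick a random proxy for this request
  const random = Math.floor(Math.random() * proxys.length);
  const proxy = proxys[random]
  var agent = new HttpsProxyAgent(proxy)
  const get = await axios.get(url, {
    headers: {
      'user-agent': new UserAgent().toString()
    },
    proxy: false, // disable axios' own proxy handling, the agent does the tunneling
    httpsAgent: agent
  });
  // Parse the HTML and read the text of the "condition" field
  const cheerio_load = cheerio.load(get.data);
  const condition = cheerio_load('body > main > div.site-wrapper > section > div > div.row.u-position-relative > main > aside > div.box.box--item-details > div.details-list.details-list--main-info > div.details-list.details-list--details > div:nth-child(5) > div.details-list__item-value').text()
  const condition_trim = condition.trim();
  console.log(condition_trim);
  return condition_trim;
}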
Any ideas?
How can I solve this problem?
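"socket hang up" generally means the connection was closed before a response arrived, and with rotating proxies a dead or overloaded proxy is one plausible (not confirmed) cause. A common mitigation is to try the remaining proxies when one fails instead of giving up on the first error. The sketch below shows only that retry logic; tryProxies and shuffle are hypothetical helpers, and the request callback stands in for the axios.get call from the question.

```javascript
// Returns a shuffled copy of `items` (Fisher-Yates).
function shuffle(items) {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Hypothetical helper: calls `request(proxy)` with each proxy in random
// order and returns the first successful result. If every proxy fails,
// the last error (e.g. "socket hang up") is rethrown.
async function tryProxies(proxies, request) {
  let lastError = new Error('no proxies configured');
  for (const proxy of shuffle(proxies)) {
    try {
      return await request(proxy);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Possible usage with the question's code (not executed here):
// const get = await tryProxies(proxys, (proxy) =>
//   axios.get(url, { proxy: false, httpsAgent: new HttpsProxyAgent(proxy), timeout: 10000 }));
```

The timeout option in the usage comment is an addition not present in the question; it makes a silent proxy fail quickly so the next one can be tried.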

How to import data into a table in JS?

Please help me.
I'm new to Electron.js and also to web development.
I want to add table items on the index.js page, but I'm facing the problem shown in the picture.
The error code is: ts(2657)
Here is the code:
'''
const sideMenu = document.querySelector("aside");
const menuBtn = document.querySelector("#menu-btn");
const closeBtn = document.querySelector("#close-btn");
const themeToggler = document.querySelector(".theme-toggler");
// show sidebar
menuBtn.addEventListener('click', () => {
sideMenu.style.display = 'block';
})
// close sidebar
closeBtn.addEventListener('click', () => {
sideMenu.style.display = 'none';
})
// change theme
themeToggler.addEventListener('click', () => {
document.body.classList.toggle('dark-theme-variables');
themeToggler.querySelector('span:nth-child(1)').classList.toggle('active');
themeToggler.querySelector('span:nth-child(2)').classList.toggle('active');
})
// fill orders in table
Orders.forEach(order => {
const tr = document.createElement('tr');
const trContent =
'
<td>${order.productName}</td>
<td>${order.productNumber}</td>
<td>${order.paymentStatus}</td>
<td class="${order.shipping ===
'Declined' ? 'danger' : order.
'shipping' === 'pending' ? 'warning'
: 'primary'}">${order.shipping}</td>
<td class="primary">Details</td>
';
})
'''
I found the problem.
After "const trContent =" I used the first key (the single quote '), but I actually have to use the second key (the backtick `). Look at the picture.
If you want to insert data as shown in the picture using plain text, first of all you should use backticks (``). They let you interpolate data from variables and write multiline text.
After that, you will want to put that text inside the element, of course. The code will look something like this:
const tr = document.createElement('tr');
// Template literal (backticks): supports ${...} interpolation and line breaks
const trContent = `
  <td>${order.productName}</td>
  <td>${order.productNumber}</td>
  <td>${order.paymentStatus}</td>
  <td class="${order.shipping === 'Declined' ? 'danger' : (order.shipping === 'pending' ? 'warning' : 'primary')}">
    ${order.shipping}
  </td>
  <td class="primary">Details</td>
`;
tr.innerHTML = trContent;
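The snippet above stops at tr.innerHTML, so the row still has to be attached to the table before it shows up. A minimal sketch of the whole loop follows, assuming Orders is an array of objects with the four properties used in the question and that the page contains a table body matching the selector 'table tbody'. That selector and the helper names shippingClass, orderRowHtml and fillOrders are hypothetical.

```javascript
// Maps a shipping status to the CSS class names used in the question.
function shippingClass(status) {
  if (status === 'Declined') return 'danger';
  if (status === 'pending') return 'warning';
  return 'primary';
}

// Builds the inner HTML of one <tr> with a template literal (backticks).
function orderRowHtml(order) {
  return `
    <td>${order.productName}</td>
    <td>${order.productNumber}</td>
    <td>${order.paymentStatus}</td>
    <td class="${shippingClass(order.shipping)}">${order.shipping}</td>
    <td class="primary">Details</td>
  `;
}

// Browser-only part: creates one row per order and appends it to the table.
function fillOrders(orders) {
  const tbody = document.querySelector('table tbody'); // hypothetical selector
  orders.forEach((order) => {
    const tr = document.createElement('tr');
    tr.innerHTML = orderRowHtml(order);
    tbody.appendChild(tr);
  });
}
```

Pulling the nested ternary out into shippingClass is a readability choice only; it returns the same classes as the inline expression.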

How to use "multiple arrays" in "each" cycle in javascript

I want the product pages marked as PageType = 'item' (inside only one category) to display different HTML code depending on the language version ("mutation") of the webpage. What I've achieved so far is that every language version shows the same content X times (X = the number of items in the object, such as "eng": "categoryname").
var html = `
<div class="">
<a class="" href="#" target="_blank">
<img src="different images with site language mutation" alt="banner">
</a>
</div>
`;
var langcode = $('html').attr('lang');
var maincat = [];
$(".breadcrumbclass").each(function() {
  var vat = $(this).text();
  maincat.push(vat);
});
var mycategory = maincat[1];
$.each(langmutations, function(key, val) {
  if (((PageType == 'item') || (PageType === 'category')) && (mycategory === langmutations[langcode])) {
    $('.classForPastingMyHtml').after(html);
  }
});
//This is what i have in JS
var langmutations0 = {
eng: 'categoryname',
de: 'kategoriename',
ru: 'categorija'
};
//or
var langmutations1 = [
["eng", "categoryname"],
["de", "kategoriename"],
["ru", "categorija"]
];
//I believe this is PHP style
var langmutations2 = ['eng' => 'categoryname', 'de' => 'kategoriename', 'ru' => 'categorija'];
//This could be multiple array in PHP style ? I want to have this in JS
var multiple = [eng => [“cat” => “categoryname”, “banner” => “link”], de => [“cat” => “kategoriename”, “banner” => “link”], ru => [“cat” => “categorija”, “banner” => “link”]];
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
I think I should use something like a multidimensional array, but I don't know whether that exists in JS or how to structure it. Or maybe a JavaScript object that corresponds to that PHP style.
Here is some simple code to get you started:
var multiple = {
  eng : {"cat" : "categoryname", "banner" : "link"},
  de : {"cat" : "Kategoriename", "banner" : "link"},
  ru : {"cat" : "categorija", "banner" : "link"}
};
var language = 'de';
var translations = multiple[language];
var cat = translations.cat;
console.log(cat);
//Show all translations
$.each(multiple, function(language, translations) {
  var cat = translations.cat;
  console.log(cat);
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
It uses objects inside objects with language as keys.
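Applied to the question, the nested object also explains the repeated output: the condition inside the $.each loop does not depend on key or val, so the same HTML is inserted once per language entry. Looking the current language up directly avoids the loop. Below is a minimal sketch, assuming the banner values are image URLs; bannerFor is a hypothetical helper, and the "link" values are placeholders copied from the answer above.

```javascript
// Nested object keyed by language code, as in the answer above.
const langmutations = {
  eng: { cat: 'categoryname', banner: 'link' },
  de: { cat: 'kategoriename', banner: 'link' },
  ru: { cat: 'categorija', banner: 'link' },
};

// Hypothetical helper: returns the banner HTML for the current page,
// or null when the page should not show a banner.
function bannerFor(langcode, pageType, category) {
  const entry = langmutations[langcode];
  if (!entry) return null; // unknown language
  const isTargetPage = pageType === 'item' || pageType === 'category';
  if (!isTargetPage || category !== entry.cat) return null;
  return `<div class=""><a class="" href="#" target="_blank"><img src="${entry.banner}" alt="banner"></a></div>`;
}

// Possible usage with the variables from the question (browser only):
// const html = bannerFor(langcode, PageType, mycategory);
// if (html) $('.classForPastingMyHtml').after(html);
```

Because the lookup happens once, the banner is inserted at most one time per page, regardless of how many languages the object contains.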
