How can I get the img src from this page with puppeteer? - javascript

I am trying to get some data from this wikipedia page: https://en.wikipedia.org/wiki/List_of_mango_cultivars
I can get everything I need except the img src with this code:
const recordList = await page.$$eval(
'div#mw-content-text > div.mw-parser-output > table > tbody > tr',
(trows) => {
let rowList = []
trows.forEach((row) => {
let record = { name: '', image: '', origin: '', notes: '' }
record.image = row.querySelector('a > img').src
const tdList = Array.from(row.querySelectorAll('td'), (column) => column.innerText)
const imageSrc = row.querySelectorAll('a > img').getAttribute('src')
record.name = tdList[0]
record.origin = tdList[2]
record.notes = tdList[3]
rowList.push(record)
})
return rowList
}
)
The error I am getting: Evaluation failed: TypeError: Cannot read properties of null (reading 'src')

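For rows that have no img inside an a tag, row.querySelector('a > img') returns null, and reading .src from null throws this error. You can wrap your record.image line in a conditional like this: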
if(row.querySelector('a > img')){
record.image = row.querySelector('a > img').src
}
This checks whether an img inside an a tag exists and, if it does, adds its src to the object.
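The same null check can also be written with optional chaining. The following is only a minimal sketch: extractRecord and fakeRow are hypothetical names that are not part of the original code, and the fake rows exist only so the logic can be tried outside a browser.

```javascript
// Hypothetical helper: build one record from a table row.
// `row` is any object exposing querySelector/querySelectorAll like a DOM element.
function extractRecord(row) {
  const cells = Array.from(row.querySelectorAll('td'), (td) => td.innerText);
  const img = row.querySelector('a > img'); // null when the row has no image
  return {
    name: cells[0] ?? '',
    image: img?.src ?? '', // falls back to '' instead of throwing on null
    origin: cells[2] ?? '',
    notes: cells[3] ?? '',
  };
}

// Minimal stand-in for a DOM row (an assumption, for illustration only).
function fakeRow(texts, src) {
  return {
    querySelectorAll: () => texts.map((t) => ({ innerText: t })),
    querySelector: () => (src ? { src } : null),
  };
}

const withImage = extractRecord(
  fakeRow(['Alphonso', '', 'India', 'Sweet'], 'https://example.org/a.jpg')
);
const withoutImage = extractRecord(fakeRow(['Name', 'Image', 'Origin', 'Notes'], null));
console.log(withImage.image);
console.log(withoutImage.image === '');
```

Inside page.$$eval the callback runs in the browser, so a helper defined in Node is not visible there; the body of extractRecord would need to be inlined in the callback.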

Related

Alpine JS - VM11705:16 Uncaught TypeError: Cannot read properties of undefined (reading 'title') Error Issue

I made a widget using Alpine.js, but I'm getting the error below and couldn't find a solution in my research. What do you think is wrong with my code? When I pull the data, the title part is reported as incorrect, but I can't see the problem or figure it out.
[Errors I see in console][1]
[1]: https://i.stack.imgur.com/ofmW1.png
<div class="text-base w-56 px-5 rounded-full overflow-hidden" x-data="{
textArray: [],
text: '',
textIndex: 0,
charIndex: 0,
pauseEnd: 2750,
cursorSpeed: 420,
pauseStart: 20,
typeSpeed: 50,
direction: 'forward'
}" x-init="(() => {
fetch('/wp-json/wp/v2/pages/219/?_fields=acf.positions&acf_format=standard')
.then(response => response.json())
.then(data => textArray = data.acf.positions );
let typingInterval = setInterval(startTyping, $data.typeSpeed);
function startTyping(){
let current = $data.textArray[$data.textIndex].title;
if($data.charIndex > current.length){
$data.direction = 'backward';
clearInterval(typingInterval);
setTimeout(function(){
typingInterval = setInterval(startTyping, $data.typeSpeed);
}, $data.pauseEnd);
}
$data.text = current.substring(0, $data.charIndex);
if($data.direction == 'forward'){
$data.charIndex += 1;
} else {
if($data.charIndex == 0){
$data.direction = 'forward';
clearInterval(typingInterval);
setTimeout(function(){
$data.textIndex += 1;
if($data.textIndex >= $data.textArray.length){
$data.textIndex = 0;
}
typingInterval = setInterval(startTyping, $data.typeSpeed);
}, $data.pauseStart);
}
$data.charIndex -= 1;
}
}
setInterval(function(){
if($refs.cursor.classList.contains('hidden')){
$refs.cursor.classList.remove('hidden');
} else {
$refs.cursor.classList.add('hidden');
}
}, $data.cursorSpeed);
})()">
<span x-text="text"></span>
<span class="opacity-70" x-ref="cursor">|</span>
</div>
Thanks in advance for your suggestions and help.
It's complaining that $data.textArray[$data.textIndex] is undefined, and you're trying to read .title from it.
When you first load your widget, you do a fetch to populate textArray, but until that returns, textArray is empty. So when startTyping runs let current = $data.textArray[$data.textIndex].title, it throws the error.
You basically need to ensure textArray has data before you try to access it:
You could move everything into your .then(data => textArray = data.acf.positions ); callback so typing only starts once the fetch has completed.
Or you could put a simple if (textArray.length === 0) return; line at the start of your startTyping function so it doesn't try to do anything until textArray is populated.
This might not be your only issue, but it is the cause of the error you've posted.
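As a rough illustration of the guard option, here is a simplified, forward-only typing step in plain JavaScript. The state object and typeStep are hypothetical stand-ins for the Alpine data and startTyping, and 'Developer' is made-up sample data.

```javascript
// Hypothetical stand-in for the widget state; keys mirror the x-data object.
const state = { textArray: [], text: '', textIndex: 0, charIndex: 0 };

// One typing tick. Returns false when it skipped because no data is loaded yet.
function typeStep(data) {
  if (data.textArray.length === 0) return false; // guard: fetch has not resolved
  const current = data.textArray[data.textIndex].title;
  data.text = current.substring(0, data.charIndex);
  data.charIndex += 1;
  return true;
}

const skipped = typeStep(state); // nothing loaded yet, so the tick is a no-op
state.textArray = [{ title: 'Developer' }]; // simulates the fetch resolving
typeStep(state);
typeStep(state);
console.log(skipped, JSON.stringify(state.text));
```

With the guard in place, the interval can keep firing before the fetch returns without ever reading .title from undefined.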

Scraping table from iFrame with playwright

Hi, I'm trying to run a $$eval function on a website that works with iframes using Playwright, but I am not getting the information from the table that I'm trying to scrape.
Here is the snippet of my code:
tableData = await frame.$$eval('body > app-root > document-list > div.container-document-list > div.table-document-list > mat-card > mat-card-content > mat-table > mat-row ',(users) => {
return users.map(user => {
const folio = user.querySelector('mat-cell:nth-child(1)')
const fecha = user.qu.querySelector('mat-cell:nth-child(2)')
const cliente = user.querySelector('mat-cell:nth-child(3)')
const vendedor = user.querySelector('mat-cell:nth-child(4)')
const cierre = user.querySelector('mat-cell:nth-child(5)')
const estado = user.querySelector('mat-cell:nth-child(6)')
const total = user.querySelector('mat-cell:nth-child(7)')
return {
Folio: folio ? folio.textContent.trim() : '',
Fecha: fecha ? fecha.textContent.trim(): '',
Cliente: cliente ? cliente.textContent.trim(): '',
Vendedor: vendedor ? vendedor.textContent.trim(): '',
Cierre: cierre ? cierre.textContent.trim(): '',
Estado: estado ? estado.textContent.trim():'',
Total: total ? total.textContent.trim():'',
}
})
})
frame is defined this way:
const frame = page.mainFrame()
Now, the full path to read one value from the row is this:
const valores1 = page.frameLocator('iframe[name="main"]').locator('body > app-root > document-list > div.container-document-list > div.table-document-list > mat-card > mat-card-content > mat-table > mat-row:nth-child(2) > mat-cell.mat-cell.cdk-column-IDNumCot.mat-column-IDNumCot.ng-star-inserted');
texts1 = await valores1.allTextContents();
I am not sure why the $$eval code is not reading the table and getting the information. If someone can help point me toward the solution, I would be thankful.
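One thing that may be worth checking, though the thread does not confirm it: frame above is page.mainFrame(), which is the top-level document, while the working locator goes through iframe[name="main"]. A sketch that runs $$eval against that iframe instead could look like this. scrapeRows is a hypothetical name, and the row selector is shortened here as an assumption.

```javascript
// Sketch only: `page` is expected to be a Playwright Page.
// The shortened row selector is an assumption, not the original full path.
async function scrapeRows(page) {
  const frame = page.frame({ name: 'main' }); // the iframe, not page.mainFrame()
  if (!frame) return []; // iframe not attached (yet)
  return frame.$$eval('mat-table > mat-row', (rows) =>
    rows.map((row) => {
      const cell = (n) => row.querySelector(`mat-cell:nth-child(${n})`);
      const text = (n) => (cell(n) ? cell(n).textContent.trim() : '');
      return { Folio: text(1), Fecha: text(2), Cliente: text(3) };
    })
  );
}
```

The remaining columns (Vendedor, Cierre, Estado, Total) would follow the same pattern with indexes 4 to 7.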

How to loop through HTML elements and populate a Json-object?

I'm looping through all the HTML tags in an HTML file, checking whether those tags match conditions, and trying to compose a JSON object with the following schema:
[
{ title: 'abc', date: '10.10.10', body: ' P tags here', href: '' },
{ title: 'abc', date: '10.10.10', body: ' P tags here', href: '' },
{ title: 'abc', date: '10.10.10', body: ' P tags here', href: '' }
]
But I'd like to create a new entry only for elements with the class "header"; all other elements have to be added to the most recently created entry. How do I achieve that?
Current code:
$('*').each((index, element) => {
if ( $(element).hasClass( "header" ) ) {
jsonObject.push({
title: $(element).text()
});
};
if( $(element).hasClass( "date" )) {
jsonObject.push({
date: $(element).text()
});
}
//links.push($(element))
});
console.log(jsonObject)
Result is:
{
title: 'TestA'
},
{ date: '10.10.10' },
{
title: 'TestB'
},
{ date: '10.10.11' }
I'd like it to be at this stage something like:
{
title: 'TestA'
,
date: '10.10.10' },
{
title: 'TestB'
,
date: '10.10.11' }
UPD:
Here's the example of HTML file:
<h1 class="header">H1_Header</h1>
<h2 class="date">Date</h2>
<p>A.</p>
<p>B.</p>
<p>С.</p>
<p>D.</p>
<a class="source">http://</a>
<h1 class="header">H1_Header2</h1>
<h2 class="date">Date2</h2>
<p>A2.</p>
<p>B2.</p>
<p>С2.</p>
<p>D2.</p>
<a class="source">http://2</a>
Thank you for your time!
Based on your example HTML, it appears everything you are trying to collect is in linear order: you get a title, date, body and link, then a new header with its associated items. Since there is no complication of things being ordered in a non-linear fashion, you could do something like the following:
let jsonObject = null;
let newObject = false;
let appendParagraph = false;
let jObjects = [];
$('*').each((index, element) => {
if ($(element).hasClass("header")) {
//If newObject is true, push object into array
if(newObject)
jObjects.push(jsonObject);
//Reset the json object variable to an empty object
jsonObject = {};
//Reset the paragraph append boolean
appendParagraph = false;
//Set the header property
jsonObject.header = $(element).text();
//Set the boolean so on the next encounter of a header tag the jsonObject is pushed into the array
newObject = true;
};
if( $(element).hasClass( "date" )) {
jsonObject.date = $(element).text();
}
if( $(element).prop("tagName") === "P") {
//If you are storing paragraph as one string value
//Otherwise switch the body var to an array and push instead of append
if(!appendParagraph){ //Use boolean to know if this is the first p element of object
jsonObject.body = $(element).text();
appendParagraph = true; //Set boolean to true to append on next p and subsequent p elements
} else {
jsonObject.body += (", " + $(element).text()); //append to the body
}
}
//Add the href property
if( $(element).hasClass("source")) {
//edit to do what you wanted here, based on your comment:
jsonObject.link = $(element).next().html();
//jsonObject.href= $(element).attr('href');
}
});
//Push final object into array
jObjects.push(jsonObject);
console.log(jObjects);
Here is a jsfiddle for this: https://jsfiddle.net/Lyojx85e/
I can't get the text of the anchor tags on the fiddle (I believe because nested anchor tags are not valid and will be parsed as separate anchor tags by the browser), but the code provided should work in a real-world example. If .text() doesn't work, you can switch it to .html() on the link. I was confused about what you are trying to get from the link, so I updated the answer to get the href attribute, as that appears to be what you want. However, the anchor with the class doesn't have an href attribute, so I'll leave it to you to fix that part yourself; this answer should give you what you need.
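The same linear grouping idea can be sketched without jQuery or the boolean flags. This is only an illustration under assumptions: the { cls, tag, text } shape and the name groupByHeader are hypothetical, and body is kept as an array rather than one joined string.

```javascript
// Flat list mimicking the example HTML; the { cls, tag, text } shape is hypothetical.
const elements = [
  { cls: 'header', tag: 'H1', text: 'H1_Header' },
  { cls: 'date', tag: 'H2', text: 'Date' },
  { cls: '', tag: 'P', text: 'A.' },
  { cls: '', tag: 'P', text: 'B.' },
  { cls: 'source', tag: 'A', text: 'http://' },
  { cls: 'header', tag: 'H1', text: 'H1_Header2' },
  { cls: 'date', tag: 'H2', text: 'Date2' },
  { cls: '', tag: 'P', text: 'A2.' },
];

// Start a new record at every header; attach everything else to the latest record.
function groupByHeader(items) {
  const records = [];
  for (const el of items) {
    if (el.cls === 'header') {
      records.push({ title: el.text, date: '', body: [], href: '' });
      continue;
    }
    const current = records[records.length - 1];
    if (!current) continue; // ignore anything before the first header
    if (el.cls === 'date') current.date = el.text;
    else if (el.tag === 'P') current.body.push(el.text);
    else if (el.cls === 'source') current.href = el.text;
  }
  return records;
}

const grouped = groupByHeader(elements);
console.log(JSON.stringify(grouped, null, 2));
```

Because each header pushes its record immediately, there is no need for a final push after the loop.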
$('*').each((index, element) => {
var obj = {};
if ( $(element).hasClass( "header" ) ) {
obj.title = $(element).text();
};
if( $(element).hasClass( "date" )) {
obj.date = $(element).text()
}
jsonObject.push(obj);
});
I don't know about jQuery, but with plain JavaScript you can do something like this.
const arr = [];
document.querySelectorAll("li").forEach((elem) => {
const obj = {};
const title = elem.querySelector("h2");
const date = elem.querySelector("date");
if (title) obj["title"] = title.textContent;
if (date) obj["date"] = date.textContent;
arr.push(obj);
});
console.log(arr);
<ul>
<li>
<h2>A</h2>
<date>1</date>
</li>
<li>
<h2>B</h2>
</li>
<li>
<date>3</date>
</li>
</ul>
Always use map for things like this. This should look something like:
let objects = $('.header').get().map(el => {
return {
date: $(el).attr('date'),
title: $(el).attr('title'),
}
})

Gatsby - Trying to render image with URL - Cannot read property '0' of undefined

This is my code:
const Image = () => {
const data = useStaticQuery(graphql`
query {
facebook {
posts {
data {
full_picture
}
}
}
}
`)
// console.log(data.facebook.posts.data)
const images = data.facebook.posts.data;
const sources = images.filter(function(img) {
if (img.full_picture == null) {
return false;
}
return true
}).map(function (img) {
return (
<Img src={img.full_picture} />
)
})
return (
<div>
{sources}
</div>
)
}
The error I'm receiving is
"TypeError: Cannot read property '0' of undefined"
If I remove the HTML from the return (Img) then it will display all the URLs on the page, but as soon as I add the Img tag it doesn't run.
It is possible that the images are not loaded yet when you try to display them in Img. You could do:
<Img src={img?img.full_picture:''} />
This should help.
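Independently of the rendering issue, the null filtering from the question can be tried on its own. A minimal sketch with made-up sample data follows; pictureUrls is a hypothetical helper, and it returns plain URL strings rather than components so it runs without React.

```javascript
// Sample shaped like data.facebook.posts.data; the values are made up.
const posts = [
  { full_picture: 'https://example.org/1.jpg' },
  { full_picture: null },
  null,
  { full_picture: 'https://example.org/2.jpg' },
];

// Keep only entries that exist and carry a non-empty full_picture URL.
function pictureUrls(items) {
  return (items ?? [])
    .filter((img) => img && img.full_picture)
    .map((img) => img.full_picture);
}

const urls = pictureUrls(posts);
console.log(urls);
```

Each remaining URL could then be mapped to an element in the component's return value.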

Access the data attribute while instantiating multiple tool-tips using tippyjs

I am creating multiple dynamic tooltips using the tippyjs library on a page that fetches content using the fetch API.
How do I access the data attribute of each selected element during initialisation of the tooltip?
Here is what I have:
Code
<span class="order-tooltip" data-orderid="123456">Order ID 123456</span>
<span class="order-tooltip" data-orderid="454515">Order ID 454515</span>
<span class="order-tooltip" data-orderid="487848">Order ID 487848</span>
<span class="order-tooltip" data-orderid="154214">Order ID 154214</span>
<div id="tooltipTemplate" style="display: none;">
Loading data...
</div>
<script>
const template = document.querySelector('#tooltipTemplate')
const initialText = template.textContent
const tip = tippy('.order-tooltip', {
animation: 'shift-toward',
arrow: true,
html: '#tooltipTemplate',
onShow() {
// `this` inside callbacks refers to the popper element
const content = this.querySelector('.tippy-content')
if (tip.loading || content.innerHTML !== initialText) return
tip.loading = true
console.log($('.order-tooltip').data('orderid')) // This is not working
var orderid = $(this).data('orderid');
var url = "/fetch_position_tooltip?" + $.param({orderid: orderid})
fetch(url).then(resp => resp.json()).then (responseJSON =>{
content.innerHTML = responseJSON
tip.loading = false
}).catch(e => {
console.log(e)
content.innerHTML = 'Loading failed'
tip.loading = false
})
},
onHidden() {
const content = this.querySelector('.tippy-content')
content.innerHTML = initialText
},
// prevent tooltip from displaying over button
popperOptions: {
modifiers: {
preventOverflow: {
enabled: false
},
hide: {
enabled: false
}
}
}
})
</script>
I need to access the data attribute of each span element when instantiating the tooltip.
How could I do this?
I contacted the maintainer of the library.
Anyone looking for this can use:
this._reference
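Assuming this._reference is the span the tooltip is attached to, as the answer suggests, the order id could be read from its dataset. A small sketch follows; tooltipUrl and fakePopper are hypothetical names, and the fake object only lets the helper run outside a browser.

```javascript
// Builds the request URL for one tooltip.
// `popper` stands in for `this` inside onShow; `_reference` is taken from the answer above.
function tooltipUrl(popper) {
  const orderid = popper._reference.dataset.orderid; // reads data-orderid
  const query = new URLSearchParams({ orderid });
  return `/fetch_position_tooltip?${query}`;
}

// Fake popper (an assumption) for trying the helper without tippy or a DOM.
const fakePopper = { _reference: { dataset: { orderid: '123456' } } };
const url = tooltipUrl(fakePopper);
console.log(url);
```

Inside onShow this would replace the $(this).data('orderid') lookup, for example as tooltipUrl(this).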
