I am trying to use Axios with a proxy server to read a CSS selector on different pages,
but I get the error: "Error: socket hang up".
async function get_condition(url:string) {
const random = Math.floor(Math.random() * proxys.length);
const proxy = proxys[random]
var agent = new HttpsProxyAgent(proxy)
const get = await axios.get(url,{
headers: {
'user-agent': new UserAgent().toString()
},
proxy: false,
httpsAgent: agent
});
const cheerio_load = cheerio.load(get.data);
const condition = cheerio_load('body > main > div.site-wrapper > section > div > div.row.u-position-relative > main > aside > div.box.box--item-details > div.details-list.details-list--main-info > div.details-list.details-list--details > div:nth-child(5) > div.details-list__item-value').text()
const condition_trim = condition.trim();
console.log(condition_trim);
return condition_trim;
}
Any ideas?
How can I solve this problem?
I would like to extract a URL from HTML with an Apify scraper.
Here is the HTML:
<a class="app-aware-link profile-rail-card__profile-link t-16 t-black t-bold tap-target" href="https://www.linkedin.com/in/benjaminejzenberg?miniProfileUrn=urn%3Ali%3Afs_miniProfile%3AACoAAAj58zYBTN8loEzvrFJhh-16iFZ8gnfPSGU" data-test-app-aware-link="">
<div class="single-line-truncate t-16 t-black t-bold mt2">
Voir le profil complet
</div>
</a>
And this is my page function in Apify:
async function pageFunction(context) {
const $ = context.jQuery;
const pageTitle = $('title').first().text();
const h1 = $('h1').first().text();
const first_h2 = $('h2').first().text();
const random_text_from_the_page = $('p').first().text();
const author_profile_link = $('div.scaffold-layout.scaffold-layout--breakpoint-xl.scaffold-layout--sidebar-main-aside.scaffold-layout--reflow > div > div > div > div > div > div > div.pt3.ph3.pb4.break-words > a:nth-child(5) a[href]').text();
context.log.info(`URL: ${context.request.url}, TITLE: ${pageTitle}`);
await context.enqueueRequest({ url: 'http://www.example.com' });
return {
url: context.request.url,
pageTitle,
h1,
first_h2,
random_text_from_the_page,
author_profile_link
};
}
Regards :)
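Two things stand out in the page function above. First, .text() returns the text inside the matched element ("Voir le profil complet" in the HTML shown), not the URL; the URL lives in the href attribute. Second, the selector ends in a:nth-child(5) a[href], which looks for an anchor nested inside another anchor. A minimal sketch of reading the attribute instead, assuming the class profile-rail-card__profile-link from the HTML above is stable enough to select on (getProfileLink is a hypothetical helper name):

```javascript
// Sketch: read the href attribute of the profile link instead of its text.
// `$` is the jQuery handle from context.jQuery; the class name is taken
// from the HTML in the question and is assumed to be stable.
function getProfileLink($) {
  const href = $('a.profile-rail-card__profile-link').first().attr('href');
  // attr() returns undefined when nothing matched; normalise that to null.
  return href || null;
}
```

In the page function this would replace the author_profile_link line: const author_profile_link = getProfileLink($);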
I am trying to get some data from this Wikipedia page: https://en.wikipedia.org/wiki/List_of_mango_cultivars
(Screenshot: the img src that I need)
I can get everything that I need except the img src with this code:
const recordList = await page.$$eval(
'div#mw-content-text > div.mw-parser-output > table > tbody > tr',
(trows) => {
let rowList = []
trows.forEach((row) => {
let record = { name: '', image: '', origin: '', notes: '' }
record.image = row.querySelector('a > img').src
const tdList = Array.from(row.querySelectorAll('td'), (column) => column.innerText)
const imageSrc = row.querySelectorAll('a > img').getAttribute('src')
record.name = tdList[0]
record.origin = tdList[2]
record.notes = tdList[3]
rowList.push(record)
})
return rowList
}
)
The error I am getting: Evaluation failed: TypeError: Cannot read properties of null (reading 'src')
You can wrap your record.image line in a conditional like this:
if(row.querySelector('a > img')){
record.image = row.querySelector('a > img').src
}
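This checks whether the row contains an img inside an a tag and only reads its src when it does. Rows without an image, such as a header row, make querySelector return null, which is what caused the error.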
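The same check can be written more compactly with optional chaining. The sketch below is one way to do it, with rowToRecord as a hypothetical helper name; it also avoids the second lookup in the question, row.querySelectorAll('a > img').getAttribute('src'), which would fail because querySelectorAll returns a NodeList and a NodeList has no getAttribute method.

```javascript
// Build one record from a table row. Works on any object that exposes
// querySelector/querySelectorAll, such as a DOM <tr> element.
function rowToRecord(row) {
  const cells = Array.from(row.querySelectorAll('td'), (td) => td.innerText);
  return {
    name: cells[0] ?? '',
    // Optional chaining yields undefined instead of throwing when the row
    // has no image; the ?? fallback turns that into an empty string.
    image: row.querySelector('a > img')?.src ?? '',
    origin: cells[2] ?? '',
    notes: cells[3] ?? '',
  };
}
```

Inside page.$$eval the function has to be declared within the callback, because the callback runs in the browser and cannot see variables from the Node.js scope.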
I made a widget using Alpine.js, but I am getting an error and could not find a solution in my research. What do you think is wrong in my code? When I pull the data, the header part is reported as incorrect, but I can't see a problem and couldn't figure it out.
[Errors I see in the console][1]
[1]: https://i.stack.imgur.com/ofmW1.png
<div class="text-base w-56 px-5 rounded-full overflow-hidden" x-data="{
textArray: [],
text: '',
textIndex: 0,
charIndex: 0,
pauseEnd: 2750,
cursorSpeed: 420,
pauseStart: 20,
typeSpeed: 50,
direction: 'forward'
}" x-init="(() => {
fetch('/wp-json/wp/v2/pages/219/?_fields=acf.positions&acf_format=standard')
.then(response => response.json())
.then(data => textArray = data.acf.positions );
let typingInterval = setInterval(startTyping, $data.typeSpeed);
function startTyping(){
let current = $data.textArray[$data.textIndex].title;
if($data.charIndex > current.length){
$data.direction = 'backward';
clearInterval(typingInterval);
setTimeout(function(){
typingInterval = setInterval(startTyping, $data.typeSpeed);
}, $data.pauseEnd);
}
$data.text = current.substring(0, $data.charIndex);
if($data.direction == 'forward'){
$data.charIndex += 1;
} else {
if($data.charIndex == 0){
$data.direction = 'forward';
clearInterval(typingInterval);
setTimeout(function(){
$data.textIndex += 1;
if($data.textIndex >= $data.textArray.length){
$data.textIndex = 0;
}
typingInterval = setInterval(startTyping, $data.typeSpeed);
}, $data.pauseStart);
}
$data.charIndex -= 1;
}
}
setInterval(function(){
if($refs.cursor.classList.contains('hidden')){
$refs.cursor.classList.remove('hidden');
} else {
$refs.cursor.classList.add('hidden');
}
}, $data.cursorSpeed);
})()">
<span x-text="text"></span>
<span class="opacity-70" x-ref="cursor">|</span>
</div>
Thanks in advance for your suggestions and help.
It's complaining that $data.textArray[$data.textIndex] is undefined, and you're trying to read .title from it.
When the widget first loads, you call fetch to populate textArray, but until that request resolves textArray is still empty, so let current = $data.textArray[$data.textIndex].title throws the error.
You basically need to ensure textArray has data before you try to access it:
You could move everything into your .then(data => textArray = data.acf.positions ); callback, so the typing only starts once the fetch has completed.
Or you could put a simple if (textArray.length === 0) return; line at the start of your startTyping function, so it does nothing until textArray is populated.
This might not be your only issue, but it is the cause of the error you've posted.
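As a rough illustration of the second option, here is the guard in isolation. This is a simplified sketch, not the widget's full logic: typeStep is a hypothetical helper that only handles forward typing, and state stands in for the x-data object.

```javascript
// Hypothetical helper: advance the typing state by one step.
// `state` mirrors the x-data fields (textArray, textIndex, charIndex, text).
function typeStep(state) {
  // Guard: nothing to type until the fetch has filled textArray.
  if (!Array.isArray(state.textArray) || state.textArray.length === 0) {
    return false;
  }
  const current = state.textArray[state.textIndex].title;
  state.text = current.substring(0, state.charIndex);
  if (state.charIndex < current.length) {
    state.charIndex += 1;
  }
  return true;
}
```

Checking with Array.isArray also covers the case where data.acf.positions comes back undefined, which a plain length check would not.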
Hi, I'm trying to run a $$eval call with Playwright on a website that uses iframes, but I am not getting the information from the table that I'm trying to scrape.
Here is a snippet of my code:
tableData = await frame.$$eval('body > app-root > document-list > div.container-document-list > div.table-document-list > mat-card > mat-card-content > mat-table > mat-row ',(users) => {
return users.map(user => {
const folio = user.querySelector('mat-cell:nth-child(1)')
const fecha = user.qu.querySelector('mat-cell:nth-child(2)')
const cliente = user.querySelector('mat-cell:nth-child(3)')
const vendedor = user.querySelector('mat-cell:nth-child(4)')
const cierre = user.querySelector('mat-cell:nth-child(5)')
const estado = user.querySelector('mat-cell:nth-child(6)')
const total = user.querySelector('mat-cell:nth-child(7)')
return {
Folio: folio ? folio.textContent.trim() : '',
Fecha: fecha ? fecha.textContent.trim(): '',
Cliente: cliente ? cliente.textContent.trim(): '',
Vendedor: vendedor ? vendedor.textContent.trim(): '',
Cierre: cierre ? cierre.textContent.trim(): '',
Estado: estado ? estado.textContent.trim():'',
Total: total ? total.textContent.trim():'',
}
})
})
frame is defined this way:
const frame = page.mainFrame()
Now, the full path that does read a single value from a row is this:
const valores1 = page.frameLocator('iframe[name="main"]').locator('body > app-root > document-list > div.container-document-list > div.table-document-list > mat-card > mat-card-content > mat-table > mat-row:nth-child(2) > mat-cell.mat-cell.cdk-column-IDNumCot.mat-column-IDNumCot.ng-star-inserted');
texts1 = await valores1.allTextContents();
I am not sure why the $$eval call is not reading the table and returning the information. If someone can point me toward the solution, I would be thankful.
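A possible explanation, judging only from the two snippets above: page.mainFrame() is the top-level document, while the locator that works goes through iframe[name="main"], so the $$eval selector is being run against a document that does not contain the table. The sketch below runs the same mapping inside the child frame. It assumes the iframe's name attribute is "main" (as in the working locator) and uses a shortened selector for readability; scrapeTable is a hypothetical helper name. Note also that user.qu.querySelector on the fecha line would throw once rows are actually matched.

```javascript
// Sketch: read the table rows from inside the iframe named "main".
// `page` is a Playwright Page; the selector is shortened for illustration.
async function scrapeTable(page) {
  // page.mainFrame() is the top-level document; the table is in the child frame.
  const frame = page.frame({ name: 'main' });
  if (!frame) throw new Error('iframe "main" not found');
  // Wait until at least one row has been rendered.
  await frame.waitForSelector('mat-table mat-row');
  return frame.$$eval('mat-table mat-row', (rows) =>
    rows.map((row) => {
      // Text of the n-th cell in this row, or '' when the cell is missing.
      const cell = (n) =>
        row.querySelector(`mat-cell:nth-child(${n})`)?.textContent.trim() ?? '';
      return {
        Folio: cell(1),
        Fecha: cell(2),
        Cliente: cell(3),
        Vendedor: cell(4),
        Cierre: cell(5),
        Estado: cell(6),
        Total: cell(7),
      };
    })
  );
}
```

The helper cell is declared inside the callback on purpose: the callback is executed in the browser, so it cannot use functions defined in the Node.js scope.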
Hi, I'm trying to inject HTML from a string into a view, but I'm getting an error and I'm stuck.
This is the code in the Node.js route:
router.get('/profile/:page', isLoggedIn, async (req, res) => {
// Get current page from url (request parameter)
let page_id = parseInt(req.params.page);
let currentPage = 0;
if (page_id > 0) currentPage = page_id;
//Change pageUri to your page url without the 'page' query string
pageUri = '/profile/';
/*Get total items*/
await pool.query('SELECT COUNT(id) as totalCount FROM user where user_type="Client"', async (err, result,) => {
// Display 10 items per page
const perPage = 10,
totalCount = result[0].totalCount;
console.log("Estos son los datos",totalCount, currentPage, pageUri, perPage);
// Instantiate Pagination class
const Paginate = new Pagination(totalCount, currentPage, pageUri, perPage);
/*Query items*/
const data = {
users: await pool.query('SELECT * FROM user where user_type="Client" LIMIT ' + 10 + ' OFFSET ' + Paginate.offset),
pages: Paginate.links() // Paginate.links() returns a string with all the HTML
}
res.render('profile', { data });
});
});
This is the links() function:
class Pagination{
constructor(totalCount,currentPage,pageUri,perPage=2){
this.perPage = perPage;
this.totalCount =parseInt(totalCount);
this.currentPage = parseInt(currentPage);
this.previousPage = this.currentPage - 1;
this.nextPage = this.currentPage + 1;
this.pageCount = Math.ceil(this.totalCount / this.perPage);
this.pageUri = pageUri;
this.offset = this.currentPage > 1 ? this.previousPage * this.perPage : 0;
this.sidePages = 4;
this.pages = false;
}
links(){
this.pages='<ul class="pagination pagination-md">';
if(this.previousPage > 0)
this.pages+='<li class="page-item"><a class="page-link" href="'+this.pageUri + this.previousPage+'">Previous</a></li>';
/*Add back links*/
if(this.currentPage > 1){
for (var x = this.currentPage - this.sidePages; x < this.currentPage; x++) {
if(x > 0)
this.pages+='<li class="page-item"><a class="page-link" href="'+this.pageUri+x+'">'+x+'</a></li>';
}
}
/*Show current page*/
this.pages+='<li class="page-item active"><a class="page-link" href="'+this.pageUri+this.currentPage+'">'+this.currentPage+'</a></li>';
/*Add more links*/
for(x = this.nextPage; x <= this.pageCount; x++){
this.pages+='<li class="page-item"><a class="page-link" href="'+this.pageUri+x+'">'+x+' </a></li>';
if(x >= this.currentPage + this.sidePages)
break;
}
/*Display next button navigation*/
if(this.currentPage + 1 <= this.pageCount)
this.pages+='<li class="page-item"><a class="page-link" href="'+this.pageUri+this.nextPage+'">Next</a></li>';
this.pages+='</ul>';
return this.pages;
}
}
module.exports = Pagination;
In the HTML:
<div id="pages">
{{ data.pages }}
</div>
In the end, the browser does not render the HTML that I send from the route correctly.
Please help me, I'm stuck.
This is pool:
const pool = require('../database');
And this is database.js:
const mysql = require('mysql');
const { promisify } = require ('util');
const { database } = require('./keys');
const pool= mysql.createPool(database);
pool.getConnection((err, connection)=>{
if(err){
if(err.code === 'PROTOCOL_CONNECTION_LOST'){
console.error('DATABASE CONNECTION WAS CLOSED');
}
if(err.code === 'ER_CON_COUNT_ERROR'){
console.error('DATABASE HAS TOO MANY CONNECTIONS');
}
if(err.code === 'ECONNREFUSED'){
console.error('DATABASE CONNECTION WAS REFUSED');
}
}
if (connection) connection.release();
console.log('DB is CONNECTED');
return;
});
// Promisify pool queries
pool.query = promisify(pool.query);
module.exports = pool;
Also, the browser shows it only as text, not as markup.
View Source in the browser
Which exact database library are you using?
require('mysql')
And, what exactly is the error in your browser?
Your template engine by default escapes any text that you insert into the page so it will be rendered as text and not accidentally interpreted as HTML. This is why the HTML you inject is displaying as plain text.
If you want to inject actual HTML, then you have to tell the template engine that you don't want it to escape this particular insertion. When you tell us what template engine you're using, we can help you with how you do that.
To stop Handlebars from escaping your HTML, just use triple braces like this:
<div id="pages">
{{{ data.pages }}}
</div>
Here's the relevant doc page that describes it.
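To make the difference concrete, here is a simplified stand-in for what escaping does to the pagination markup. It is an illustration only, not the Handlebars implementation (which escapes a few more characters); escapeHtml is a hypothetical helper.

```javascript
// Simplified stand-in for the escaping applied to {{ value }}.
function escapeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

const pages = '<ul class="pagination"><li>1</li></ul>';

// Roughly what {{ data.pages }} writes into the page: the browser shows the
// tags as literal text because < and > have become entities.
const escaped = escapeHtml(pages);

// What {{{ data.pages }}} writes: the string is inserted unchanged and the
// browser parses it as markup.
const raw = pages;
```

Because triple braces switch escaping off, they should only be used for HTML that your own code builds. If any part of the string can come from user input, unescaped output opens the page to script injection.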
Also, the mysql module itself does not support promises, so await on pool.query() does nothing useful unless the method has been wrapped, as your database.js does with promisify. Even then, don't mix the two styles: your COUNT query is awaited and is also given a callback, so pick one. Alternatively, you can use the mysql2 module as in require('mysql2/promise') to get built-in promise support. Then, don't pass a callback and just use the returned promise.
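A minimal sketch of the two queries written in promise style only. It assumes a pool whose query() resolves to [rows, fields], which is what mysql2/promise returns; loadClientsPage is a hypothetical helper, and the table and column names are taken from the question.

```javascript
// Sketch: count + page query with a promise-based pool (mysql2/promise style).
// `pool.query` is assumed to resolve to [rows, fields].
async function loadClientsPage(pool, currentPage, perPage = 10) {
  const [countRows] = await pool.query(
    'SELECT COUNT(id) AS totalCount FROM user WHERE user_type = ?',
    ['Client']
  );
  const totalCount = countRows[0].totalCount;

  // Same offset rule as the Pagination class: page 1 (or lower) starts at 0.
  const offset = currentPage > 1 ? (currentPage - 1) * perPage : 0;

  // Placeholders keep the values out of the SQL string.
  const [users] = await pool.query(
    'SELECT * FROM user WHERE user_type = ? LIMIT ? OFFSET ?',
    ['Client', perPage, offset]
  );
  return { totalCount, offset, users };
}
```

In the route, the result would feed the Pagination constructor and res.render, with no callback passed to pool.query. With the promisified mysql pool from database.js, query() resolves to the rows directly, so the destructuring ([countRows], [users]) would have to be dropped.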