How can I get <a> objects inside <div> when using puppeteer? - javascript

Using puppeteer, I tried getting the two <a> objects inside a <div> but failed.
My test code is like this:
const btnUp = await page.$('div#ember322 > a:nth-child(0)');
const btnDown = await page.$('div#ember322 > a:nth-child(1)');
How can I solve this problem?
This is the example HTML for the test.
<div id="ember322" class="ember-view">
<div class="order-btn-group">
<div class="order-value"><span>0.8</span></div>
<div class="row">
<div class="col-xs-6">
<a class="btn-order btn-down txt-left" data-ember-action="" data-ember-action-323="323">
<span class="btn-order-text">down</span>
<span class="btn-order-value txt-center">
<small>300</small>
</span>
<i class="btn-order-status"></i>
</a>
</div>
<div class="col-xs-6">
<a class="btn-order btn-up txt-right" data-ember-action="" data-ember-action-324="324">
<span class="btn-order-text">up</span>
<span class="btn-order-value txt-center">
<small>500</small>
</span>
<i class="btn-order-status"></i>
</a>
</div>
</div>
</div>
</div>

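Try the selectors without the >. The child combinator requires the <a> elements to be direct children of the <div> (see for example W3Schools), but here they are nested several levels deeper. Note also that :nth-child() counts from 1 and counts among siblings; each <a> is the first child of its own col-xs-6 <div>, so index the columns rather than the links. Like so:
const btnDown = await page.$('div#ember322 .col-xs-6:nth-child(1) > a');
const btnUp = await page.$('div#ember322 .col-xs-6:nth-child(2) > a');
And, maybe instead of using nth-child, why not try the btn-up and btn-down classes, for example 'div#ember322 a.btn-up' and 'div#ember322 a.btn-down'?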

Related

Associating Multiple IDs in HTML CSS to Script/Javascript

From the SO threads "Does ID have to be unique in the whole page?" and "Is multiple ids allowed in html and javascript?", I understand that HTML/CSS does not allow the same ID to be reused for several elements that a script on the page targets.
However, I am looking for an efficient & less complicated solution than simply copying over the large-sized Javascript and adding a progressive number to each id.
I have this submit button with a spinner:
<button class="submit-button" id="SubmitButton">
<div class="spinner hidden" id="spinner"></div>
<span id="buttonText">Submit Now</span>
</button>
and that is linked to a LARGE_SIZED SCRIPT as follows:
<script>
const myConst = MyNUM('<?php echo SOME_DETAILS; ?>');
// Select submit button
const subBtn = document.querySelector("#SubmitButton");
// Submit request handler
.......
.......
.......
// Several hundred lines of script code,
// including functions and other processing logic for spinners and whatnot
.......
.......
</script>
I need to have multiple such SubmitButton on the same webpage, so one way is to suffix the id with an incrementing number (like id="SubmitButton1", id="SubmitButton2" and so on)
and copy-paste the same <script></script> part for each button id.
However, that will make the webpage very bulky and lengthy.
Is there any way to use minimal code without repeating the whole block again and again, yet achieve the desired (multiple submit buttons)?
You really should delegate. If you then navigate the DOM of the target using the class names, then you have no need of IDs:
document.addEventListener("click", function(e) {
  // closest() returns null when the click was not inside a submit button
  const tgt = e.target.closest("button.submit-button");
  if (!tgt) return;
  const spinner = tgt.querySelector("div.spinner");
  tgt.querySelector("span.buttonText").hidden = true;
  spinner.hidden = false;
  console.log(spinner.textContent);
});
<button class="submit-button">
<div class="spinner" hidden>Spinner 1</div>
<span class="buttonText">Submit Now</span>
</button>
<button class="submit-button">
<div class="spinner" hidden>Spinner 2</div>
<span class="buttonText">Submit Now</span>
</button>
<button class="submit-button">
<div class="spinner" hidden>Spinner 3</div>
<span class="buttonText">Submit Now</span>
</button>
Maybe this helps.
// Get all elements with the same class (an array-like collection, not a true array)
const submitButtons = document.getElementsByClassName("submit-button") // or document.querySelectorAll(".submit-button");
// Loop through the collection
for (let i = 0; i < submitButtons.length; i++) {
// Get each individual element including the element's children
const submitButton = submitButtons[i]
const spinner = submitButton.querySelector(".spinner");
const submitText = submitButton.querySelector(".buttonText");
console.log(submitButton, spinner, submitText)
}
// if you want a specific button you use the index
console.log(submitButtons[3])
<button class="submit-button">
<div class="spinner hidden"></div>
<span class="buttonText">Submit Now</span>
</button>
<button class="submit-button">
<div class="spinner hidden"></div>
<span class="buttonText">Submit Now</span>
</button>
<button class="submit-button">
<div class="spinner hidden"></div>
<span class="buttonText">Submit Now</span>
</button>
<button class="submit-button">
<div class="spinner hidden"></div>
<span class="buttonText">Submit Now</span>
</button>
<button class="submit-button">
<div class="spinner hidden"></div>
<span class="buttonText">Submit Now</span>
</button>

How to use a CSS selector that gets all matching elements except those inside a specific child?

Assume an HTML structure as shown:
<div id="container">
<div class="A">
<div id="excludedElement">
<p>
<span class="MyClass">1</span>
<span class="MyClass">2</span>
<span class="MyClass">3</span>
</p>
</div>
</div>
<div class="B">
<p>
<span class="MyClass">4</span>
<span class="MyClass">5</span>
<span class="MyClass">6</span>
</p>
</div>
</div>
I want all elements inside of the "container" div that have the class "MyClass" except for those inside of the "excludedElement" div. In this case, the result contains only the spans 4, 5, and 6.
My current solution is to first get all elements with "MyClass", then get all elements inside of excludedElement with "MyClass". For each element in the first list, we check if it's in the second, and skip over it if so. This is O(n^2) running time, so I'd like to avoid this. Code for reference:
const allElements = container.querySelectorAll('.MyClass');
const excludedElements = [...container.querySelectorAll('#excludedElement .MyClass')];
const result = [];
for (const element of allElements) {
  // includes() scans the whole excluded list each time, hence the quadratic cost
  if (!excludedElements.includes(element)) {
    result.push(element);
  }
}
Is there a way to craft a CSS selector in querySelectorAll() that can retrieve this particular set of elements?
One way is to temporarily remove excludedElement from the tree, query for "MyClass", then replace the excludedElement, but I want to avoid modifying the DOM.
If the structure is predictable and already known:
container.querySelectorAll('div:not(#excludedElement) > p .MyClass');
If the structure is not known and you're okay with adding classes in order to avoid O(n^2):
const excludes = [...container.querySelectorAll('#excludedElement .MyClass')];
excludes.forEach(element => element.classList.add('excluded'));
const filteredMyClass = [...container.querySelectorAll('.MyClass:not(.excluded)')];
You can select all .MyClass descendants, then .filter the collection by whether the current item being iterated over has a #excludedElement ancestor with .closest:
const classes = [...container.querySelectorAll('.MyClass')]
.filter(span => !span.closest('#excludedElement'));
for (const span of classes) {
span.style.backgroundColor = 'yellow';
}
<div id="container">
<div class="A">
<div id="excludedElement">
<p>
<span class="MyClass">1</span>
<span class="MyClass">2</span>
<span class="MyClass">3</span>
</p>
</div>
</div>
<div class="B">
<p>
<span class="MyClass">4</span>
<span class="MyClass">5</span>
<span class="MyClass">6</span>
</p>
</div>
</div>
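Unless you know in advance the exact sort of structure of the descendants of #container, I don't think there's an elegant way to do this with a single query string in browsers where :not accepts simple selectors only (Selectors Level 3). Browsers that implement Selectors Level 4 accept complex selectors inside :not, so there container.querySelectorAll('.MyClass:not(#excludedElement *)') selects the same elements.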
Just for informational purposes, a silly and repetitive method that you shouldn't use would be to use the query string:
:scope > .MyClass,
:scope > *:not(#excludedElement) > .MyClass,
:scope > *:not(#excludedElement) > *:not(#excludedElement) > .MyClass
...
const selector = `
:scope > .MyClass,
:scope > *:not(#excludedElement) > .MyClass,
:scope > *:not(#excludedElement) > *:not(#excludedElement) > .MyClass
`;
const classes = container.querySelectorAll(selector);
for (const span of classes) {
span.style.backgroundColor = 'yellow';
}
<div id="container">
<div class="A">
<div id="excludedElement">
<p>
<span class="MyClass">1</span>
<span class="MyClass">2</span>
<span class="MyClass">3</span>
</p>
</div>
</div>
<div class="B">
<p>
<span class="MyClass">4</span>
<span class="MyClass">5</span>
<span class="MyClass">6</span>
</p>
</div>
</div>
I have this....
const Excludes = [...container.querySelectorAll('#excludedElement .MyClass')]
, noExcludes = [...container.querySelectorAll('.MyClass')].filter(el=>(!Excludes.includes(el)))
;
noExcludes.forEach(element => element.style.backgroundColor = 'lightgreen');
<div id="container">
<div class="A">
<div id="excludedElement">
<p>
<span class="MyClass">1</span>
<span class="MyClass">2</span>
<span class="MyClass">3</span>
</p>
</div>
</div>
<div class="B">
<p>
<span class="MyClass">4</span>
<span class="MyClass">5</span>
<span class="MyClass">6</span>
</p>
</div>
</div>
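The filter above still scans the Excludes array once per element, which is the quadratic cost the question wants to avoid. A minimal sketch of one way around that is to put the excluded elements in a Set, so each lookup is constant time on average. The helper name withoutExcluded is hypothetical, and plain objects stand in for the DOM elements here so the sketch runs outside a browser:

```javascript
// Hypothetical helper: returns the items of `all` that are not in `excluded`.
// Both arguments can be any iterable, e.g. NodeLists from querySelectorAll.
function withoutExcluded(all, excluded) {
  const excludedSet = new Set(excluded); // built once, O(m)
  return [...all].filter((el) => !excludedSet.has(el)); // O(1) average per lookup
}

// Assumption: plain objects stand in for the six <span class="MyClass"> elements.
const spans = [1, 2, 3, 4, 5, 6].map((n) => ({ textContent: String(n) }));
const insideExcluded = spans.slice(0, 3); // spans 1, 2 and 3

const kept = withoutExcluded(spans, insideExcluded);
console.log(kept.map((el) => el.textContent)); // logs [ '4', '5', '6' ]
```

In a page, the same helper could be called as withoutExcluded(container.querySelectorAll('.MyClass'), container.querySelectorAll('#excludedElement .MyClass')).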
You can use this precise selector in .querySelectorAll():
:not(#excludedElement) > p > .MyClass
Working Example:
const includedSpans = [... document.querySelectorAll(':not(#excludedElement) > p > .MyClass')];
includedSpans.forEach((includedSpan) => console.log(includedSpan.textContent));
<div id="container">
<div class="A">
<div id="excludedElement">
<p>
<span class="MyClass">1</span>
<span class="MyClass">2</span>
<span class="MyClass">3</span>
</p>
</div>
</div>
<div class="B">
<p>
<span class="MyClass">4</span>
<span class="MyClass">5</span>
<span class="MyClass">6</span>
</p>
</div>
</div>

How to parse and fetch text from HTML tags in Node.js?

I have some HTML as text in Node.js, as follows:
var htmlText = `<div class="X7NTVe">
<a class="tHmfQe" href="/link1">
<div class="am3QBf">
<div>
<span>
<div class="BNeawe deIvCb AP7Wnd">
<span dir="rtl">My First Text</span>
</div>
</span>
</div>
</div>
</a>
<div class="HBTM6d XS7yGd">
<a href="/anotherLink1">
<div class="BNeawe mAdjQc uEec3 AP7Wnd">></div>
</a>
</div>
</div>
<div class="x54gtf"></div>
<div class="X7NTVe">
<a class="tHmfQe" href="/link2">
<div class="am3QBf">
<div>
<span>
<div class="BNeawe deIvCb AP7Wnd">
<span dir="rtl">My Second Text</span>
</div>
</span>
</div>
</div>
</a>
<div class="HBTM6d XS7yGd">
<a href="/anotherLink2">
<div class="BNeawe mAdjQc uEec3 AP7Wnd">></div>
</a>
</div>
</div>
<div class="x54gtf"></div>`
Now I want to fetch the text from it as an array. In the above example it must return My First Text and My Second Text. How can I do it?
Note: I want to do it in Node.js, not in browser JavaScript.
With cheerio:
const cheerio = require('cheerio');
// htmlText is the string from the question
const $ = cheerio.load(htmlText);
const strings = $('div[class="BNeawe deIvCb AP7Wnd"]>span[dir]')
  .get().map(span => $(span).text());
Method #1: replace all tags using the regex /<[^>]*>/g.
Method #2: parse the HTML with jsdom and access the HTML nodes via the JS document API.
Method #2 is much more flexible.
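As a rough illustration of method #1, the sketch below strips tags with that regex and keeps the non-empty text fragments. The helper name extractTextFragments and the shortened sample markup are assumptions made for the example, not part of the question. Note that the stray ">" text from the second link survives, which is one reason a real parser is the more flexible choice:

```javascript
// Hypothetical helper: replaces every tag with a line break, then returns
// the non-empty, trimmed text fragments that remain.
function extractTextFragments(html) {
  return html
    .replace(/<[^>]*>/g, '\n') // method #1: drop all tags
    .split('\n')
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Shortened stand-in for the htmlText string from the question.
const sample = `<div class="X7NTVe">
  <a class="tHmfQe" href="/link1"><span dir="rtl">My First Text</span></a>
  <a href="/anotherLink1"><div>></div></a>
</div>`;

const fragments = extractTextFragments(sample);
console.log(fragments); // logs [ 'My First Text', '>' ]
```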

Get all href values from a string which is filled with HTML code from a textarea

Solved :)
var test = $('textarea[name=extract]').val();
var hh = $.parseHTML(test) ;
$.each($(test).find('.tile__link'),function(i,b){
var reff = $(this).attr('href');
$('.links').append("link/" +reff + "<br><br>");
})
I have HTML code copied from a website, and I want all href values of the elements with the class .tile__link in a string.
I did not find a solution: how can I get just the href value of the elements with the class .tile__link, without the divs and text?
Here's an example:
var test = $('textarea[name=extract]').val();
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea name="extract">
<span class="txt-raise">8min</span>
</div></div>
<div class="js-hunq-badge fit-tr pr-- pt--"></div>
<div class="tile__footprint">
</div>
</a>
</div><div class="tile grid-tile tile--bordered"> <a href="#/profile//grid" class="tile__link">
<div role="image" aria-label="HSHBerl" style="background-image:url()" class="tile__image"></div>
<div class="bg-raise tile__info">
<div class="info info--middle txt-raise">
<div class="txt-truncate layout-item--consume">
<div class="typo-small lh-heading txt-truncate">
8 km <span class="icon icon-small icon-gps-needle icon-badge"></span>
</div>
<div class="lh-heading txt-truncate">
<div class="info__main-data">
<div class="info__username">
</div>
<div class="js-romeo-badge"></div>
<div class="info__icon-set">
</div>
</div>
</div>
</div>
</div>
</div>
<div class="tile__onlinestate js-online-state"><div>
<span class="icon icon-online-status ui-status--online icon-raise" title="Online"></span>
</textarea>
But I don't know how to extract it to get only the values of href.
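You can do something like this:
$(test).find('.tile__link').attr('href');
Note that the class in the markup is tile__link (two underscores), and that the HTML lives in the textarea string, so it has to be parsed with $(test) before searching it. If you have more than one element with that class, you can use jQuery's .each or .map.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map
Example:
$(test).find('.tile__link').map(function () { return $(this).attr('href'); }).get();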

Button works only after second click

The close button, which removes the element from the DOM, works only on the second click.
Here is the handler and the HTML for the button (closeBtn):
function removeHeader() {
var list = document.getElementById("main");
list.removeChild(list.childNodes[0]);
}
<div id="main">
<div class="nanoSaturnBanner">
<p>teteasdasdasdsadasds sad asdasdasdasdasdas</p>
<div class="banner-buttons">
<label class="showme">Ads by Google</label>
<a class="infoLink" href="https://support.google.com/adsense/#topic=3373519" target="_blank">
<i class="fas fa-info-circle"></i>
</a>
<div class="closeBtn" onclick="removeHeader()">
closeBtn
</div>
</div>
</div>
</div>
You should use list.childNodes[1], because list.childNodes[0] is the #text node holding the whitespace after <div id="main">. So the first click removed that text node, and the second click removed the actual <div class="nanoSaturnBanner"> element.
function removeHeader() {
var list = document.getElementById("main");
list.removeChild(list.childNodes[1]);
}
<div id="main">
<div class="nanoSaturnBanner">
<p>teteasdasdasdsadasds sad asdasdasdasdasdas</p>
<div class="banner-buttons">
<label class="showme">Ads by Google</label>
<a class="infoLink" href="https://support.google.com/adsense/#topic=3373519" target="_blank">
<i class="fas fa-info-circle"></i>
</a>
<div class="closeBtn" onclick="removeHeader()">
<i class="far fa-window-close">close</i>
</div>
</div>
</div>
Note: Whitespace inside elements is considered as text, and text is considered as nodes. Comments are also considered as nodes.
Since childNodes also includes non-element nodes, like text and comments, use e.g. children instead to get the first actual element.
Note that this also makes sure you get the element no matter how many non-element nodes there might be in your markup.
Stack snippet
function removeHeader() {
var list = document.getElementById("main");
list.removeChild(list.children[0]);
}
<div id="main">
<div class="nanoSaturnBanner">
<p>teteasdasdasdsadasds sad asdasdasdasdasdas</p>
<div class="banner-buttons">
<label class="showme">Ads by Google</label>
<a class="infoLink" href="https://support.google.com/adsense/#topic=3373519" target="_blank">
<i class="fas fa-info-circle"></i>
</a>
<div class="closeBtn" onclick="removeHeader()">
<i class="far fa-window-close">close</i>
</div>
</div>
</div>
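To make the difference between childNodes and children concrete, here is a small sketch that runs outside a browser. It assumes plain objects stand in for the nodes under #main (a whitespace text node, the banner <div>, and trailing whitespace); the nodeType values 1 and 3 are the ones the DOM standard assigns to element and text nodes:

```javascript
// nodeType values defined by the DOM standard.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Assumed stand-in for document.getElementById("main").childNodes.
const childNodes = [
  { nodeType: TEXT_NODE, nodeName: '#text' },  // whitespace after <div id="main">
  { nodeType: ELEMENT_NODE, nodeName: 'DIV' }, // <div class="nanoSaturnBanner">
  { nodeType: TEXT_NODE, nodeName: '#text' },  // trailing whitespace
];

// Roughly what the `children` property exposes: element nodes only.
const children = childNodes.filter((node) => node.nodeType === ELEMENT_NODE);

console.log(childNodes[0].nodeName); // logs #text
console.log(children[0].nodeName);   // logs DIV
```

This is why children[0] reaches the banner on the first click while childNodes[0] does not.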
function removeHeader() {
var list = document.getElementById("main");
list.remove(); // replacing removeChild with remove worked, but note that remove() takes no arguments: it removes #main itself, banner included
}
Check the fiddle.
