Unable to grab an image with cheerio/node.js - javascript

My problem is pretty straightforward. I am trying to console log the URL of an image from the Amazon link below, either by selecting the element more precisely or by grabbing the img src itself.
I've spent most of my time trying to select the id/class of the link, but the closest I can get is #imgTagWrapperId, which returns a lot of redundant information. In theory I should be able to narrow things down with a regex, but for the life of me I can only seem to replace the text I return rather than simply grab it. Alternatively, as stated, I have tried to grab the img src itself, only to get back a nonsensical string of code. When I view the page source the same ball of text appears there, but not when I inspect the element directly.
const request = require('request');
const cheerio = require('cheerio');
request(`http://amazon.com/dp/B079H6RLKQ`, (error,response,html) =>{
if (!error && response.statusCode ==200) {
const $ = cheerio.load(html);
const productTitle = $("#productTitle").text().replace(/\s\s+/g, '');
const prodImg = $(`#imgTagWrapperId`).html();
console.log(productTitle);
console.log(prodImg);
} else {
console.log(error);
}
})
This current code returns the product title faithfully but returns this for the prodImg output:
<img alt="Samsung Galaxy S9 G960U 64GB Unlocked 4G LTE Phone w/ 12MP Camera - Midnight Black" src="

...(this nonsense goes on for a mile) ....
" data-old-hires="https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SL1500_.jpg" class="a-dynamic-image a-stretch-horizontal" id="landingImage" data-a-dynamic-image="{"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX522_.jpg":[564,522],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX342_.jpg":[369,342],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX679_.jpg":[733,679],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX425_.jpg":[459,425],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX466_.jpg":[503,466],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX569_.jpg":[615,569],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX385_.jpg":[416,385]}" style="max-width:679px;max-height:733px;">
</div>
Thank you in advance for any help and guidance with this. I have exhausted all the other usual sources and am ready to be called an idiot.
EDIT:
Someone wanted the HTML from before and after the selection. I'll oblige, but it might be easier to just view the page source at the link and Ctrl+F. Wall of text below.
<div class="variationUnavailable unavailableExp">
<div class="inner">
<div class="a-box a-alert a-alert-error" aria-live="assertive" role="alert"><div class="a-box-inner a-alert-container"><h4 class="a-alert-heading">Image Unavailable</h4><i class="a-icon a-icon-alert"></i><div class="a-alert-content">
<span class="a-text-bold">
Image not available for<br/>Color:
<span class="unvailableVariation"></span>
</span>
</div></div></div>
</div>
</div>
<!-- Append onload function to stretch image on load to avoid flicker when transitioning from low res image from Mason to large image variant in desktop -->
<!-- any change in onload function requires a corresponding change in Mason to allow it pass in /mason/amazon-family/gp/product/features/embed-features.mi -->
<!-- and /mason/amazon-family/gp/product/features/embed-landing-image.mi -->
<ul class="a-unordered-list a-nostyle a-horizontal list maintain-height">
<span id="imageBlockEDPOverlay"></span>
<li class="image item itemNo0 selected maintain-height"><span class="a-list-item">
<span class="a-declarative" data-action="main-image-click" data-main-image-click="{}">
<div id="imgTagWrapperId" class="imgTagWrapper">
<img alt="Samsung Galaxy S9 G960U 64GB Unlocked 4G LTE Phone w/ 12MP Camera - Midnight Black" src="

" data-old-hires="https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SL1500_.jpg" class="a-dynamic-image a-stretch-horizontal" id="landingImage" data-a-dynamic-image="{"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX522_.jpg":[564,522],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX342_.jpg":[369,342],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX679_.jpg":[733,679],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX425_.jpg":[459,425],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX466_.jpg":[503,466],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX569_.jpg":[615,569],"https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX385_.jpg":[416,385]}" style="max-width:679px;max-height:733px;">
</div>
</span>
</span></li>
<li class="mainImageTemplate template"><span class="a-list-item">
<span class="a-declarative" data-action="main-image-click" data-main-image-click="{}">
<div class="imgTagWrapper">
<span class="placeHolder"></span>
</div>
</span>
</span></li>

Thanks to Rishi Raj for giving a quick-fix solution: $('#landingImage').attr('data-old-hires'). I was also adding an unnecessary .html() to the const, which had gotten in the way. Thanks again, everyone!
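On the regex route mentioned in the question: replace() returns the edited string, whereas match() returns what the pattern captured, which is what "grabbing" needs. A minimal sketch, assuming the markup keeps the data-old-hires attribute in the form shown above (the sample string is shortened from the question's output):

```javascript
// Sample shortened from the markup shown in the question.
const html = '<img alt="Samsung Galaxy S9" data-old-hires="https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SL1500_.jpg" id="landingImage">';

// match() returns an array whose index 1 is the first capture group,
// or null when the pattern does not occur at all.
const found = html.match(/data-old-hires="([^"]+)"/);
const imageUrl = found ? found[1] : null;

console.log(imageUrl);
```

Reading the attribute through the parsed document, as in the quick fix, is usually less brittle than a regex over raw HTML.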

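Couldn't you simply target the image directly and read the URL from one of its attributes? In the markup you posted, src holds the long inline string, while data-old-hires holds a plain image URL, so .attr('data-old-hires') is the one to read: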
const request = require('request');
const cheerio = require('cheerio');
request('http://amazon.com/dp/B079H6RLKQ', (error,response,html) => {
if (!error && response.statusCode === 200) {
const $ = cheerio.load(html);
const productTitle = $('#productTitle').text().replace(/\s\s+/g, '');
const prodImg = $('#landingImage').attr('data-old-hires');
console.log(productTitle);
console.log(prodImg);
} else {
console.log(error);
}
});
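If a different size is wanted, the data-a-dynamic-image attribute shown in the question appears to hold a JSON map from image URL to a pair of pixel dimensions. A minimal sketch of picking the largest entry follows. pickLargestImage is a hypothetical helper, the sample is shortened from the question's markup, and in the scraper the string would presumably come from $('#landingImage').attr('data-a-dynamic-image'):

```javascript
// Hypothetical helper: given the JSON text of data-a-dynamic-image
// (a map of URL -> [dimension, dimension]), return the URL whose
// two dimensions multiply to the largest area.
function pickLargestImage(dynamicImageJson) {
  const sizes = JSON.parse(dynamicImageJson);
  let best = null;
  let bestArea = -1;
  for (const [url, [a, b]] of Object.entries(sizes)) {
    const area = a * b;
    if (area > bestArea) {
      bestArea = area;
      best = url;
    }
  }
  return best; // null when the map is empty
}

// Three of the entries from the attribute shown in the question.
const sample = JSON.stringify({
  'https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX522_.jpg': [564, 522],
  'https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX679_.jpg': [733, 679],
  'https://images-na.ssl-images-amazon.com/images/I/81%2Bh9mpyQmL._SX342_.jpg': [369, 342],
});

console.log(pickLargestImage(sample));
```

JSON.parse throws on malformed input, so a try/catch around the call may be sensible if the page layout changes.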

Related

One div input element will not pull information from sessionStorage using javascript

I am trying to develop a ticketing tool for work. It is a template site that makes taking notes for the helpdesk easier; when something is an escalation, you click a button to load the escalation template. When the escalation template loads, I want it to pull information from session storage to avoid copying and pasting notes already taken.
This is how I am storing the information in session storage. I checked in Chrome that the information is actually being stored.
if(callInbound)
{
sessionStorage.setItem("ACTIVITYTYPE", document.form1.activity.value);
sessionStorage.setItem("CONTACTNAME", document.form1.name.value);
sessionStorage.setItem("VIPSTATUS", document.form1.vip.value);
sessionStorage.setItem("CONTACTNUMBER", document.form1.phone.value);
sessionStorage.setItem("CALLERLOCATION", document.form1.location.value);
sessionStorage.setItem("ISSUEDESCRIPTION", document.form1.issueDescription.value); // <-- PROBLEM ITEM
sessionStorage.setItem("ERRORMESSAGE", document.form1.errorMessage.value);
sessionStorage.setItem("TROUBLESHOOTING", document.form1.troubleshooting.value);
sessionStorage.setItem("KNOWLEDGEUSED", document.form1.knowledgeUsed.value);
sessionStorage.setItem("SEARCHTERMSTRIED", document.form1.searchtermsTried.value);
sessionStorage.setItem("SCREENSHOTSATTACHED", document.form1.screenshotsAttached.value);
sessionStorage.setItem("SURVEYOFFERED", document.form1.surveyOffered.value);
sessionStorage.setItem("SURVEYTAKEN", document.form1.surveyTaken.value);
}
This is the page where the information is loaded; everything except the item marked as the problem is populated into the template.
<!DOCTYPE html>
<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"><title>DSI Call Outbound</title>
<script type="text/javascript" src="DSI.js"></script>
<script language="JavaScript" type="text/javascript">
window.addEventListener('load', ()=>{
let callInboundCallDropped = sessionStorage.getItem("CALLINBOUNDCALLDROPPED");
let callOutboundCallDropped = sessionStorage.getItem("CALLOUTBOUNDCALLDROPPED")
let additionalIssue = sessionStorage.getItem("ADDITIONALISSUE");
let voicemailCallBack = sessionStorage.getItem("VOICEMAILRECEIVED");
if (callInboundCallDropped == "true")
{
document.form1.issueDescription.value = sessionStorage.getItem("ISSUEDESCRIPTION"); // <-- problem item
document.form1.name.value = sessionStorage.getItem("CONTACTNAME");
document.form1.phone.value = sessionStorage.getItem("CONTACTNUMBER");
document.form1.errorMessage.value = sessionStorage.getItem("ERRORMESSAGE");
}
else if (callOutboundCallDropped == "true")
{
document.form1.name.value = sessionStorage.getItem("CONTACTNAME");
document.form1.phone.value = sessionStorage.getItem("CONTACTNUMBER");
document.form1.issueDescription.value = sessionStorage.getItem("ISSUEDESCRIPTION"); // <-- problem item
document.form1.errorMessage.value = sessionStorage.getItem("ERRORSMESSAGE");
}
else if(additionalIssue == "true")
{
document.form1.activity.value = "***ADDITIONAL ISSUE***";
document.form1.name.value = sessionStorage.getItem("CONTACTNAME");
document.form1.phone.value = sessionStorage.getItem("CONTACTNUMBER");
}
else if(voicemailCallBack == "true");
{
document.form1.name.value = sessionStorage.getItem("CONTACTNAME");
document.form1.phone.value = sessionStorage.getItem("CONTACTNUMBER");
document.form1.issueDescription.value = sessionStorage.getItem("VOICEMAILSUBJECT");
document.form1.errorMessage.value = sessionStorage.getItem("ERRORMESSAGE");
}
//sessionStorage.clear();
})
</script>
All other items load properly from session storage when the page is loaded. I confirmed that the information is in session storage via the Application tab of Chrome's developer tools, and an alert("ISSUEDESCRIPTION"); added below the problem item line fires as expected.
I have tried the following:
// Store your value from one page
sessionStorage.setItem("values", "input_text");
// Retrieve the value from another page
var value = sessionStorage.getItem("values");
// Replacing the line with this, after assigning an id to the element in the HTML
document.getElementById("issueDescription").value = sessionStorage.getItem("ISSUEDESCRIPTION");
This is the problem element:
<div class="w3-row w3-section">
<div class="w3-col" style="width:50px"><i class="w3-xxlarge w3-animate-zoom"></i></div>
<div class="w3-rest">
<p>Issue Description</p>
<input class="w3-input w3-border" name="issueDescription" type="text" placeholder="[Issue Description Here]">
</div>
</div>
This is an element on the same page that is correctly being filled with session storage information:
<div class="w3-row w3-section">
<div class="w3-col" style="width:50px"><i class="w3-xxlarge w3-animate-zoom"></i></div>
<div class="w3-rest">
<p>Phone Number Used</p>
<input class="w3-input w3-border" name="phone" type="text" placeholder="(xxx)xxx-xxxx">
</div>
</div>
Please tell me if I'm just referencing something incorrectly, like a typo, although I've spent multiple hours double-checking everything. I wish there were a troubleshooting/debugging feature in Visual Studio Code for stepping through this kind of thing. If there is one and I'm unaware of it, please let me know.
I'm unsure why this didn't work the first few times I tried it, but this time, when I copied and pasted the element's name into the line document.form1.issueDescription.value = sessionStorage.getItem("ISSUEDESCRIPTION");
the issue was resolved. I swear I did this repeatedly late last night; it must have been a spelling error.
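Separately from the fix above, two details in the posted loader script may be worth a second look, though whether either contributed to the original symptom is an assumption. The last branch reads else if(voicemailCallBack == "true"); and a semicolon in that position ends the statement, so the braces after it form a standalone block that runs on every load. One branch also reads the key "ERRORSMESSAGE", while the value is stored under "ERRORMESSAGE". The sketch below reproduces the semicolon behaviour with a plain object standing in for sessionStorage and the form (both stand-ins are hypothetical):

```javascript
// Hypothetical stand-ins: `stored` plays the role of sessionStorage,
// and the returned object plays the role of document.form1.
function fillForm(stored) {
  const form = {};
  if (stored.CALLINBOUNDCALLDROPPED == "true") {
    form.issueDescription = stored.ISSUEDESCRIPTION;
  }
  else if (stored.VOICEMAILRECEIVED == "true"); // stray semicolon: the else-if ends here
  {
    // Standalone block: runs no matter which branch matched above.
    form.issueDescription = stored.VOICEMAILSUBJECT;
  }
  return form;
}

const result = fillForm({
  CALLINBOUNDCALLDROPPED: "true",
  ISSUEDESCRIPTION: "Printer offline",
  // VOICEMAILSUBJECT was never stored in this scenario
});

console.log(result.issueDescription); // undefined: the first assignment was overwritten
```

With the real sessionStorage.getItem, a missing key yields null, and assigning null to a text input's value leaves the field empty. Removing the semicolon after the condition makes the block part of the else-if again.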

Puppeteer delete node inside element

I want to scrape a page with some news items on it.
Here is a simplified version of the HTML I have:
<info id="random_number" class="news">
<div class="author">
Name of author
</div>
<div class="news-body">
<blockquote>...</blockquote>
Here it's the news text
</div>
</info>
<info id="random_number" class="news">
<div class="author">
Name of author
</div>
<div class="news-body">
Here it's the news text
</div>
</info>
I want to get the author and body text of each news item, without the blockquote part.
So I wrote this code:
let newsItems = await newsPage.$$("info.news"); // separate name: newsPage is the page itself
for (const news of newsItems) { // Loop through each element
let author = await news.$eval('.author', s => s.textContent.trim());
let textBody = await news.$eval('.news-body', s => s.textContent.trim());
console.log('Author :'+ author);
console.log('TextBody :'+ textBody);
}
It works well, but I don't know how to remove the blockquote from the "news-body" element before getting the text. How can I do this?
EDIT: Sometimes a blockquote exists, sometimes not.
You can use optional chaining with ChildNode.remove(). You may also find that innerText gives more readable output.
let textBody = await news.$eval('.news-body', (element) => {
element.querySelector('blockquote')?.remove();
return element.innerText.trim();
});
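A variation on the same idea, offered as a sketch rather than a drop-in replacement: work on a clone so the live page is not modified, and remove every blockquote instead of only the first. textWithoutBlockquotes is a hypothetical name, and textContent is used here instead of innerText because the clone is never attached to the document:

```javascript
// Intended to be passed to ElementHandle.$eval, so it runs in the page
// context and receives the matched DOM element.
function textWithoutBlockquotes(element) {
  // Deep-clone so the removals below do not touch the live page.
  const copy = element.cloneNode(true);
  // querySelectorAll returns an empty list when there is no blockquote,
  // so the "sometimes present, sometimes not" case needs no special check.
  copy.querySelectorAll('blockquote').forEach((quote) => quote.remove());
  return copy.textContent.trim();
}

// Usage inside the loop from the question (requires a Puppeteer page):
// const textBody = await news.$eval('.news-body', textWithoutBlockquotes);
```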

Trouble loading upload image previews, Angular 1.5.8 on iOS 11 Safari

OK, weird error. This code works in desktop browsers and in Safari on iOS 10 and earlier.
I've got a page where we're building an item to submit to our DB. The item is made up of multiple line items, each with an images array that stores the images for that line item. When a file is uploaded (using ng-file-upload), we check that it meets some size and type restrictions and then push it into the array. Another control on the page then shows the user a preview of the image and a button to remove it if they no longer want to upload it.
HTML:
<div ng-repeat="item in items">
<div class="row">
<div class="col-xs-12 col-md-2">
<div class="form-group">
<div class="btn btn-default" ngf-select ngf-change="UploadFile(item, $file)" ngf-multiple="false">Upload Image</div>
</div>
</div>
</div>
<div class="row">
<div class="col-md-12 col-xs-12">
<ul class="thumbnail-list">
<li class="image-thumbnail" ng-repeat="image in item.images">
<span ng-click='removeImage(image, item, $index)' class='remove'>X</span>
<img ngf-src="image.file" ngf-no-object-url="true" ngf-accept="'image/*'">
</li>
</ul>
</div>
</div>
</div>
JS Controller code:
$scope.fileReader = new FileReader();
$scope.fileReader.onload = function () {
$scope.CurrentItem.images[$scope.CurrentItem.images.length - 1].base64 = this.result;
$scope.selectedImage.base64 = this.result;
};
$scope.UploadFile = function (item, file) {
if (file != null && file != "") {
$scope.error = '';
$scope.CurrentItem = item;
// Validation Logic here removed
if (valid) {
$scope.selectedImage = {};
$scope.selectedImage = {
'size': file.size,
'type': file.type,
'name': file.name,
'file': file
};
$scope.CurrentItem.images.push($scope.selectedImage);
$scope.fileReader.readAsDataURL(file);
} else {
// Handle Validation Failure
}
}
}
On iOS 11, the image is successfully added to the array, an element for the image is added to the page, and the red X for the delete operation is positioned correctly on it. The actual image preview, however, does not appear. If you change the orientation of the iPad from portrait to landscape or vice versa, some kind of refresh seems to be triggered and the image preview fills in.
On iOS 10 and below, as well as in desktop browsers (including Safari 11 on OS X), the image renders as soon as the element is added to the page, as intended.
Anyone have any ideas?

html parse jquery function

I'm having trouble coding a function.
Here's the situation:
I've got divs like this:
<div class='sound'>
<img src='$artwork' class='artwork' />
<div>
<p class='genre'>$genre</p>
<p class='title'>$title</p>
<i href ="$link" class='link'></i>
</div>
</div>
<div class='sound'> ... ...
and many others like that.
I'd like to make a button that collects the content of every div with the class name 'sound'
and passes it to this function of the player's API:
$.fullwidthAudioPlayer.addTrack(trackUrl, title, meta, cover, linkUrl);
I tried this function in jQuery, but it gets the data unparsed:
$('.sound').each(function() {
$.fullwidthAudioPlayer.addTrack($('.content',this).text());
});
So, I'd like to know the right way to do it!
Thank you so much in advance!
You have to query for each one separately, inside the each() callback so that this refers to the current .sound div:
$('.sound').each(function () {
var trackUrl = $('.link', this).attr('href'),
title = $('.title', this).text(),
meta = $('.genre', this).text(),
cover = $('.artwork', this).attr('src'),
linkUrl = null;
$.fullwidthAudioPlayer.addTrack(trackUrl, title, meta, cover, linkUrl);
});

Inserting Anchors in a Javascript Photo Gallery - Not Working?

I am using jAlbum (with the LightFlow skin) to create a photo gallery for my website. The gallery loads and is in a nice carousel format. I would like to add anchors so that I can link directly to a certain photo within the gallery. I tried adding an anchor in the HTML, but it does not work. I assume this is because the gallery takes a few seconds to load, so the page does not jump to the anchor. I could easily be wrong and need some advice on what to try to get anchors to work. Here is example code for the anchor and the photo itself:
<div class="item">
<a name="anchor3" id="anchor3"></a>
<img class="content hidden" src="thumbs/tree-w-sun.jpg" alt="Gifts" />
<div class="ref hidden">item8</div>
<div class="caption"><h3>Gifts</h3></div>
<div class="comment hidden"></div>
<div class="author hidden"></div>
<div class="params hidden"></div>
<div class="info hidden"><div><p>Artist: UBhapE2</p></div></div>
<div class="thumbWidth hidden">261</div>
<div class="thumbHeight hidden">350</div>
<a id="item8" class="lightwindow hidden" title="<h3>Gifts</h3>"
rel="gal[cat]" href="slides/tree-w-sun.jpg" ></a>
</div>
I have tried linking to the anchor I inserted (anchor3) and to the id inserted by jAlbum (item8), and neither works.
A few scripts control the gallery; I will put them here:
Script 1 - "Lightflow JS"
var LightFlowGlobal = {};
function getParam( name ){
name = name.replace(/[\[]/,"\\\[").replace(/[\]]/,"\\\]");
var regexS = "[\\?&]"+name+"=([^&#]*)";
var regex = new RegExp( regexS );
var results = regex.exec( window.location.href );
if( results == null )
return "";
else
return results[1];
}
Script 2 - "ContentFlow JS": this JS is long, so for the sake of space I put the link directly to the JS file here
Script 3 - This script is in the page:
<script language="JavaScript" type="text/javascript">
var startItem = getParam('p');
if(startItem == "") startItem = "first";
if(startItem.isNaN) startItem = "'"+startItem+"'";
new ContentFlow('contentFlow', {
reflectionColor: "#000000",
maxItemHeight: 350,
marginTop: 50,
reflectionHeight: 0.25,
endOpacity: 1,
startItem: startItem,
circularFlow: false,
stretchThumbs: false
});
function lightWindowInit() {
LightFlowGlobal.myLightWindow = new lightwindow({
infoTabName : "More Info",
rootPath: "res/lightwindow/",
loadingTxt: "loading or ",
cancelTxt: "cancel",
playTxt: "start slideshow",
stopTxt: "stop slideshow",
slowerTxt: "slower by 1 second",
fasterTxt: "faster by 1 second",
downloadSlideTxt: "Download",
downloadSlide: false,
showSlideshow: false,
slideshowDuration: 5000,
circular: false,
animationDuration: 0.25
});
}
LightFlowGlobal.readyJS=true;
var rootPath = ".";
</script>
I am unsure what other scripts or CSS are needed. I link to the test gallery I am working with here if you need to view the page. I will post additional info if requested.
So how do I get anchors to work with this? I am not that great at JavaScript, so please explain the answer rather than just saying "you need to add this function to the script".
Thank you for any and all assistance!
On the ContentFlow site, under Documentation --> items as links, the developer specifically states that "no element within the item may contain any anchors". Maybe someone can offer a way around this restriction.
I figured out a way; the answer was provided by the creator of the photo gallery:
It's not only the js. You'd need to pass a parameter to AddThis in order to
identify the image. Without it, you wouldn't know which image has been clicked.
The best would be to use LightFlow's query parameter p=index, where index is the
number of the image of the current web page.
For example, the following link would focus the 4th image of the gallery
(index begins at 0): http://your-domain.com/album/index.html?p=3
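The query-parameter approach described above can be sketched as follows. buildGalleryLink is a hypothetical helper, not part of jAlbum or LightFlow, and getParamFrom reuses the core regular expression from the page's getParam but takes the URL as an argument so it can run outside a browser:

```javascript
// Hypothetical helper: build a link that focuses a given image.
// `index` is zero-based, per the explanation above.
function buildGalleryLink(galleryUrl, index) {
  const url = new URL(galleryUrl);
  url.searchParams.set('p', String(index));
  return url.toString();
}

// Same core pattern as the skin's getParam, reading from an explicit
// href instead of window.location.href.
function getParamFrom(href, name) {
  const results = new RegExp('[\\?&]' + name + '=([^&#]*)').exec(href);
  return results === null ? '' : results[1];
}

const link = buildGalleryLink('http://your-domain.com/album/index.html', 3);
console.log(link);
console.log(getParamFrom(link, 'p'));
```

Because the page script passes the value of p to ContentFlow as startItem, a link built this way should open the gallery on that image without any anchor inside the item.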
