Scrapy: how can I crawl data that only appears after a click event? - javascript

I am trying to crawl this page: http://www.11st.co.kr/html/main.html
but I have run into some problems.
First, Scrapy cannot interpret JavaScript.
I want to get the 'href' values behind the button (the one marked with a red square) so that I can crawl those links in turn.
(Site screenshot)
I cannot even use Selenium,
because the button's markup is inside a script template,
so XPath cannot find it.
<script id="headerNavigationTemplate" type="text/x-handlebars-template">
{{#ifCond templateType '===' 'main'}}
<nav class="header_gnb" id="gnbNavArea">
{{else}}
<div class="header_gnb" id="gnbNavArea">
{{/ifCond}}
<div class="inner">
<h1 class="hide">대메뉴</h1>
<div class="gnb_l">
<div class="gnb_nav gnb_nav_category" id="gnbCategoryArea">
<p name="gnbNavBtn"><button type="button" class="gnb_btn_all" data-ga-event-category="PC_GNB" data-ga-event-action="전체보기 버튼" data-ga-event-label=""><span class="in_btn"><span class="ico"></span>전체보기</span></button></p>
<div class="gnb_nav_category_layer">
<div class="gnb_total_category">
<div class="row" id="navCtgrRow1"></div>
<div class="row" id="navCtgrRow2"></div>
<div class="row" id="navCtgrRow3"></div>
<div class="row" id="navCtgrRow4"></div>
<div class="row" id="navCtgrRow5"></div>
<div class="row" id="navCtgrRow6"></div>
<div class="row" id="navCtgrRow7"></div>
<div class="row" id="navCtgrRow8"></div>
<div class="row" id="navCtgrRow9"></div>
I want to get the data hidden in
//div[@class = "gnb_total_category"]/div
How can I crawl it?
Please help me.

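Please try the following script to get the required data:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('http://www.11st.co.kr/html/main.html')
# Click the "전체보기" (view all) button to open the category layer.
driver.find_element(By.XPATH, "//span[contains(text(), '전체보기')]").click()
# Print the text of the first row inside the category container.
print(driver.find_element(By.XPATH, '//div[@class="gnb_total_category"]/div').text)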

Related

Using Xpath how to select multiple parent classes of an element that has a specific innerText?

Using Puppeteer, I need to select the hour and the minutes in a widget to schedule a post.
The HTML code of the widget is this:
<div class="vdatetime-time-picker">
<div class="vdatetime-time-picker__list vdatetime-time-picker__list--hours">
<div class="vdatetime-time-picker__item vdatetime-time-picker__item--selected">00</div>
<div class="vdatetime-time-picker__item">01</div>
<div class="vdatetime-time-picker__item">02</div>
<div class="vdatetime-time-picker__item">03</div>
<div class="vdatetime-time-picker__item">04</div>
<div class="vdatetime-time-picker__item">05</div>
<div class="vdatetime-time-picker__item">06</div>
<div class="vdatetime-time-picker__item">07</div>
<div class="vdatetime-time-picker__item">08</div>
<div class="vdatetime-time-picker__item">09</div>
<div class="vdatetime-time-picker__item">10</div>
<div class="vdatetime-time-picker__item">11</div>
<div class="vdatetime-time-picker__item">12</div>
<div class="vdatetime-time-picker__item">13</div>
<div class="vdatetime-time-picker__item">14</div>
<div class="vdatetime-time-picker__item">15</div>
<div class="vdatetime-time-picker__item">16</div>
<div class="vdatetime-time-picker__item">17</div>
<div class="vdatetime-time-picker__item">18</div>
<div class="vdatetime-time-picker__item">19</div>
<div class="vdatetime-time-picker__item">20</div>
<div class="vdatetime-time-picker__item">21</div>
<div class="vdatetime-time-picker__item">22</div>
<div class="vdatetime-time-picker__item">23</div>
</div>
<div class="vdatetime-time-picker__list vdatetime-time-picker__list--minutes">
<div class="vdatetime-time-picker__item">00</div>
<div class="vdatetime-time-picker__item">05</div>
<div class="vdatetime-time-picker__item">10</div>
<div class="vdatetime-time-picker__item">15</div>
<div class="vdatetime-time-picker__item">20</div>
<div class="vdatetime-time-picker__item">25</div>
<div class="vdatetime-time-picker__item">30</div>
<div class="vdatetime-time-picker__item">35</div>
<div class="vdatetime-time-picker__item vdatetime-time-picker__item--selected">40</div>
<div class="vdatetime-time-picker__item">45</div>
<div class="vdatetime-time-picker__item">50</div>
<div class="vdatetime-time-picker__item">55</div>
</div>
</div>
Let's say I need to select 15:15.
I know that with XPath I can select an element by its inner text with
const xpathHour = "//div[text()='15']";
The problem is that the minutes are multiples of 5, so the same text can appear in both lists. When I select the minutes, this expression matches the hour again, because that is the first element with the text 15 that Puppeteer finds.
Their parent elements are different, so how can I get the same result in XPath as with this selector?
document.querySelector('.vdatetime-time-picker__list--hours .vdatetime-time-picker__item').innerText === "15"
You're probably looking for:
const xpathMinute = "(//div[text()='15'])[2]";
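That is the second div in the document with the text "15".
The parentheses are important because a predicate in [] binds more tightly than the path: without them, //div[text()='15'][2] would match only a div that is the second such child of its own parent.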
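An alternative to counting matches is to scope the XPath to the parent list, which is what the querySelector in the question does. Here is a minimal sketch, assuming the class names from the widget HTML above. pickerItemXPath is a hypothetical helper name, not part of Puppeteer:

```javascript
// Hypothetical helper (not a Puppeteer API): builds an XPath for one
// time-picker item, restricted to the hours list or the minutes list.
// Assumes the label contains no single quote.
function pickerItemXPath(listKind, label) {
  if (listKind !== 'hours' && listKind !== 'minutes') {
    throw new Error(`unknown list kind: ${listKind}`);
  }
  // contains() is used because the parent element carries two classes.
  const listClass = `vdatetime-time-picker__list--${listKind}`;
  return `//div[contains(@class, '${listClass}')]/div[text()='${label}']`;
}

const xpathHour = pickerItemXPath('hours', '15');
const xpathMinute = pickerItemXPath('minutes', '15');
console.log(xpathHour);
console.log(xpathMinute);
```

Either string can then be passed to whichever XPath query method your Puppeteer version provides. Unlike the positional form, this does not depend on the hours list coming first in the document.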

Click event is not working: it fires for some elements but not for others

In my code, the click handlers for cmd and normal work, but the one for #aboutme does not. I don't know why this event is ignored while the handlers above it in the snippet work.
var normal = $(".normal");
var cmd = $(".cmd");
normal.on("click", function(){
$(".UI").show();
$(".console").hide();
});
cmd.on("click", function(){
$(".console").show();
$(".UI").hide();
});
$("#aboutme").on("click",function(){
console.log("okay");
});
My HTML code (the dashboard class acts as a wrapper):
<div class="dashboard ">
<div class="option">
<div class="normal">Normal</div>
<div class="cmd">Terminal</div>
</div>
<hr style="background-color: white;">
<div class="console">
</div>
<div class="UI ">
<div class="showcase">
<div id="aboutme">
<h2><span>»About</span></h2>
<p>Self-motivated fresher seeking a career in recognized organization to prove my skills and utilize my knowledge and intelligence in the growth of organization.</p>
</div>
<div id="Skills">
<h2><span>Skills</span></h2>
<p> <kbd>Programming Languages</kbd> : Python, Node.js, C++</p>
<p> <kbd>Platform & Development Tools</kbd> : VS Code , Spyder and Jupiter Notebook</p>
</div>
</div>
</div>
This seems to work. Can you provide your full HTML?
$("#aboutme").on("click",function(){
console.log("okay");
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<button id="aboutme">Test</button>

AngularJS - JQuery News Ticker - After moving news the javascript function doesn't get called

This is an awkward problem that is best explained by looking at the live demo. Basically, I have a news box populated with ng-repeat using Angular. I am then using a jQuery plugin called news ticker, which lets the news headlines move around. The first time you land on the page, the news items are in their proper spots and they call the function as expected. After they are moved up or down, they stop calling the function: if I set a breakpoint in the code, it is never hit after a move.
Edit: after researching this further, I have found that this happens because the elements need to be compiled using $compile after they are re-added. However, following some examples, I have not gotten this to work. I need a way of compiling after a move event with this plugin.
Live demo
Angular script with the function that should be called:
$scope.UpdateNews = function (item,number) {
console.log(item + ' ' + number);
switch (item)
{
case 'General':
$scope.NewsCast.General.Body = $scope.News.GeneralNews[number].Text;
break;
}
};
What the HTML looks like with the news boxes:
<script src="http://www.jqueryscript.net/demo/Responsive-jQuery-News-Ticker-Plugin-with-Bootstrap-3-Bootstrap-News-Box/scripts/jquery.bootstrap.newsbox.min.js" type="text/javascript"></script>
<div class="row">
<div class="col-md-6 col-lg-3">
<div class="panel panel-default">
<div class="panel-heading"> <span class="glyphicon glyphicon-list-alt logo-inverse pull-left"></span><b> General News</b></div>
<div class="panel-body">
<div class="row">
<div class="col-xs-12">
<ul class="demo1" style="overflow-y: hidden; height: 280px;" ng-model="Idk">
<li style="" class="news-item text-left" ng-repeat="item in News.GeneralNews"><strong>{{item.DateValue | date: "MMM dd yyyy"}}</strong> {{item.Preview}}<a class="FakeClickable" ng-click="UpdateNews('General',item.index)" data-target="#GeneralModal" data-toggle="modal">Read more...</a></li>
</ul>
<div ng-if="Width>768">
<div ng-include="'../pages/Modals/General/GENERAL_INLINE.html'"></div>
</div>
</div>
</div>
</div>
<div class="panel-footer"> </div>
</div>
</div>

Navigating HTML with the Cheerio web page scraper

I'm following this tutorial on how to screen scrape with Cheerio for Node.js, and I'm two seconds away from just downloading the entire page and using plain JavaScript to extract the information I need. I'm sure that would be much more difficult than actually using Cheerio, but I'm having difficulty understanding how to navigate the HTML with Cheerio.
How do I extract the number '2' from the second "blind-cow-white-number" element?
Here is the HTML:
<div id="mainCows" class="row-fluid">
<div class="zone zone-content">
<article class="projection-page content-item">
<article class="post post-page content-item blind-cow">
<h1>blind cow</h1>
<div>
<div class="blind-cow-header" style="margin-bottom:15px">
<div class="blind-cow-list"> my list </div>
<div style="margin: 0 auto; width: 90%; text-wrap: none; text-align: center;">
<div class="blind-cow-white-number"> 1 </div>
<div class="blind-cow-white-number"> 2 </div>
</div>
<div class="blind-cow-died"> 3 </div>
<table class="blind-cow-table">
<table class="blind-cow-table">
<div class="blind-cow-Locations"> </div>
<br>
<div class="blind-cow-footer">
</div>
</article>
</article>
</div>
</div>
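How do I achieve this with Cheerio?
Is there a web screen scraper for Node.js that allows me to use XPath instead?
Cheerio uses the same selector syntax and almost the same API as jQuery, so .eq(1) selects the second matching element:
$(".blind-cow-white-number").eq(1).html();
This returns the element's inner HTML, including the surrounding spaces; call .text().trim() instead if you only want the 2.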

AngularJS binding works only once in the page

I have a page where I bind to the same objects (item, item2, message, messageType) in several places.
Although I placed the bindings in several parts of the page, they work only in the first place I put them. The objects are filled by AJAX calls that return the correct values (I logged them in the console).
What seems even stranger to me is that the <infomessage> directive has been used in several other places in the app (twice in the same page) and worked perfectly.
Do you have any idea why these bindings don't work?
I even tried to $watch the objects, and they change properly, but it seems that the view uses the updated value only in the first place.
<div class="container" ng-app="MyApp" >
<div class="row" ng-controller="MyCtrl" >
<div class="col-lg-10">
<h3>...</h3>
</div>
<div class="col-lg-10">
<infomessage type="{{messageType}}" message="{{message}}"></infomessage>
</div>
<div class="col-lg-10">
Item: {{item.idbene_ext}} / {{item.id}} / {{item.img}}<br>
Ubicazione: {{item2.id}} {{item2.code}}
</div>
</div>
<div class="row" style="margin-top:20px">
<div class="col-xs-1"></div>
<div class="col-xs-4" ng-class="{'ubiBox':true,'ausilio-enabled':(item!=null),'ausilio-disabled':(item==null), 'boxfocus':(item==null)}">
<div ng-show="item==null">
<div class="number">1</div>
<img src="assets/images/disabled-128.png" width="100" class="img_none"/>
<h4> {{item.idbene_ext}} Select an item</h4>
</div>
<div ng-show="ausilio!=null">
<h4>Item:{{item.idbene_ext}}</h4>
</div>
</div>
<div class="col-xs-2"></div>
<div class="col-xs-4" ng-class="{'ubiBox':true,'ubi-enabled':(item!=null),'ubi-disabled':(item==null),'boxfocus':(item!=null) }">
<div ng-show="item==null">
<div class="number">2</div>
<img src="assets/images/Office-disabled-128.png" width="100" class="img_none"/>
<h4> Select the second item</h4>
</div>
<div ng-show="item!=null">
<h4>Item2 {{item2.code}}</h4>
</div>
</div>
<div class="col-xs-1"></div>
</div>
<div class="row">
<div class="col-lg-10">
<infomessage type="{{messageType}}" message="{{message}}"></infomessage>
</div>
<div class="col-lg-10">
Item: {{item.idbene_ext}} / {{item.id}} / {{item.img}}<br>
Ubicazione: {{item2.id}} {{item2.code}}
</div>
</div>
Here's the AngularJS code:
MyApp.controller("MyCtrl",function($scope,$http,Config,BarcodeService){
$scope.iditem2=-1
$scope.iditem=-1
$scope.item2=null
$scope.item=null
$scope.message=""
$scope.messageType=""
$scope.$on(BarcodeService.handleitem2,function(){
$scope.message=""
$scope.messageType=""
if($scope.item==null){
$scope.message="select an item before"
$scope.messageType="error"
}
$scope.iditem2=BarcodeService.id
$http
.post(Config.aj,{call:"item2.getitem2",id:$scope.iditem2})
.success(function(data){
$scope.item2=data.payload
})
})
$scope.$watch("item",function(){
console.log("---->",$scope.item)
},true)
$scope.$on(BarcodeService.handleitem,function(){
$scope.message="loading item"
$scope.messageType="info"
$scope.iditem=BarcodeService.id
$http
.post(Config.aj,{call:"item.getArticoloByIdMin",id:BarcodeService.id})
.success(function(data){
$scope.message="item loaded!!"
$scope.item=data.payload
})
})
})
OK guys, got it!
The problem was that I have two <div> elements in my application, but I placed ng-controller only on the first one.
This caused all the bindings in the second div to fail, because they were outside the controller's scope.
I moved ng-controller to the parent div that contains both, and now everything works like a charm.
Thanks anyway to @wachme and @ivarni for looking into my question.
Happy coding to *
