How to load HTML after content has been loaded - javascript

I am trying to get a list of the content on a website (this one, if anyone is interested). The layout changed recently, and the content is no longer loaded all at once but with magic (JS, probably). I'm currently using JSoup to analyze the HTML, but I'm open to suggestions.
This is what I am getting:
<div class="row" data-v-6e4dbe9e>
<div class="col-17 podcasts-group" data-v-6e4dbe9e>
<div class="loading-spinner" data-v-6e4dbe9e> //the devil himself
<div class="spinner" data-v-ac3cb376 data-v-6e4dbe9e>
<div class="rect1" data-v-ac3cb376></div>
<div class="rect2" data-v-ac3cb376></div>
<div class="rect3" data-v-ac3cb376></div>
<div class="rect4" data-v-ac3cb376></div>
<div class="rect5" data-v-ac3cb376></div>
</div>
</div>
<div mode="in-out" class="transition-group row" data-v-6e4dbe9e>
//Here should be stuff!
</div>
</div>
</div>
The code that achieves this:
String selector = "div.podcasts-items";
Elements elem = Jsoup.connect(link).get().select(selector);
System.out.println("html: " + elem.html());
This is what I would like to see (copied from Inspect Element after the page has loaded all the content):
<div class="row" data-v-6e4dbe9e>
<div class="col-17 podcasts-group" data-v-6e4dbe9e>
<!----> //begone evil!
<div mode="in-out" class="transition-group row" data-v-6e4dbe9e>
<div class="col-17 col-md-8 center-margin" data-v-6e4dbe9e="">...</div>
<div class="col-17 col-md-8 center-margin" data-v-6e4dbe9e="">...</div>
<div class="col-17 col-md-8 center-margin" data-v-6e4dbe9e="">...</div>
<div class="col-17 col-md-8 center-margin" data-v-6e4dbe9e="">...</div>
</div>
</div>
</div>
Google doesn't help much, because everything about spinners etc. is about JavaScript.
Solution:
Because JSoup only loads the HTML and does not execute any JavaScript, the page never had a chance to load the content. You would have to use an actual browser engine or a WebDriver such as Selenium to get the data to load.
For this specific problem I was able to get the content directly by loading the JSON data through the webpage's API.
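As a rough sketch of the JSON route: the snippet below is JavaScript (Node 18+ for the built-in fetch), not the Java/JSoup code from the question. The endpoint passed to `fetchTitles` and the `items`/`title` field names are assumptions, since the real API and its payload are not shown here.

```javascript
// Assumed payload shape (hypothetical): { "items": [ { "title": "..." }, ... ] }
function extractTitles(jsonText) {
  const data = JSON.parse(jsonText);
  const items = Array.isArray(data.items) ? data.items : [];
  // Keep only entries that actually carry a string title.
  return items
    .map((item) => item.title)
    .filter((title) => typeof title === 'string');
}

// apiUrl is a placeholder; look up the real endpoint in the browser's network tab.
async function fetchTitles(apiUrl) {
  const response = await fetch(apiUrl, { headers: { Accept: 'application/json' } });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return extractTitles(await response.text());
}
```

The parsing is kept separate from the request so it can be tried on a saved response before touching the network.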

If I understood your question correctly, your best bet is to use the Selenium WebDriver. Link to similar question

Related

Replacing a Block of HTML code with new html

So I am trying to replace a header section with a new header section on a couple of pages.
The reason I am doing this with jQuery or JavaScript is that the site is built using PL/SQL and I cannot get into that part, only the template that calls the PL/SQL. So I want to replace the header with an entire new block of code as well as change the header to
Hopefully this is making sense.
Currently I have tried the following with no luck
$('header').replaceWith("new html");
and I have also tried
$('header').html("my new html");
Still didn't work.
If anyone has any ideas, I am open to them. Note that this is not just a single line of HTML: it's the entire nav menu, the logo and all of that, so it may be that I am just not writing the HTML properly for it to be passed to the replace method.
Any help would be appreciated.
Here is an example
<header class="header-account">
<nav class="navbar navbar-static-top">
<div class="container">
<div class="row">
<div class="col-sm-12 col-md-3">
<div class="navbar-header">
<a class="navbar-brand" href="http://aae.org/specialty" alt=""></a>
</div>
</div>
<div class="col-sm-12 col-md-9">
<div class="row">
<div class="col-sm-6">
<div class="welcome-name">Welcome,
<!--#include object="CUST_DISPLAY_NM"-->
</div>
</div>
to be replaced with
<header class="fl-page-header fl-page-header-fixed fl-page-nav-right">
<div class="fl-page-header-wrap">
<div class="fl-page-header-container container">
<div class="fl-page-header-row row">
<div class="fl-page-logo-wrap col-md-3 col-sm-12">
<div class="fl-page-header-logo">
<a href="https://aaendo.wpengine.com/patients/"><img
class="fl-logo-img" itemscope itemtype="http://schema.org/ImageObject"
src="https://aaendo.wpengine.com/patients/wp-content/uploads/sites/3/2017/08/American-Association-of-Endodontists-1.png"
data-retina="https://aaendo.wpengine.com/patients/wp-content/uploads/sites/3/2017/08/American-Association-of-Endodontists#2x.png"
alt="Endodontists: Specialists in Saving Teeth" /><img
class="sticky-logo fl-logo-img" itemscope itemtype="http://schema.org/ImageObject"
src="https://aaendo.wpengine.com/patients/wp-content/uploads/sites/3/2017/08/American-Association-of-Endodontists-1.png"
alt="Endodontists: Specialists in Saving Teeth" /><meta itemprop="name"
content="Endodontists: Specialists in Saving Teeth" /></a>
</div>
</div>
<div class="fl-page-fixed-nav-wrap col-md-9 col-sm-12">
<div class="fl-page-nav-wrap">
<nav class="fl-page-nav fl-nav navbar navbar-default">
<div class="fl-page-nav-collapse collapse navbar-c
I managed to get it to work; see below. I removed the formatting from the code you wanted to change it to, so it's all on one line.
<script>
var newhtml = '<header class="fl-page-header fl-page-header-fixed fl-page-nav-right"><div class="fl-page-header-wrap"><div class="fl-page-header-container container"><div class="fl-page-header-row row"><div class="fl-page-logo-wrap col-md-3 col-sm-12"><div class="fl-page-header-logo"><img class="fl-logo-img" itemscope itemtype="http://schema.org/ImageObject" src="https://aaendo.wpengine.com/patients/wp-content/uploads/sites/3/2017/08/American-Association-of-Endodontists-1.png" data-retina="https://aaendo.wpengine.com/patients/wp-content/uploads/sites/3/2017/08/American-Association-of-Endodontists#2x.png" alt="Endodontists: Specialists in Saving Teeth" /><img class="sticky-logo fl-logo-img" itemscope itemtype="http://schema.org/ImageObject" src="https://aaendo.wpengine.com/patients/wp-content/uploads/sites/3/2017/08/American-Association-of-Endodontists-1.png"alt="Endodontists: Specialists in Saving Teeth" /><meta itemprop="name" content="Endodontists: Specialists in Saving Teeth" /></div></div><div class="fl-page-fixed-nav-wrap col-md-9 col-sm-12"><div class="fl-page-nav-wrap"><nav class="fl-page-nav fl-nav navbar navbar-default"><div class="fl-page-nav-collapse collapse navbar-c'
$('header').replaceWith(newhtml);
</script>
Oh, and I'm sure you have it in your source code, but make sure all your divs are closed, and of course your header element.
To replace the content shown in your example you need to use the .replaceWith method, because the header has different CSS classes in each case.
Another thing you will have to consider is the tag structure: every tag you open must also be closed. To replace an object in the DOM you must supply a complete object (or objects), for example:
Good usage:
<header>
<div>
<!-- your header context here -->
</div>
</header>
Bad usage:
<header>
<div>
<!-- your header context here -->
Finally: use the .replaceWith method and respect the tag structure.
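If it helps, here is a rough way to check a hand-written fragment for unclosed tags before passing it to .replaceWith. It is only a sketch: `isBalanced` is a made-up helper, not part of jQuery, and the regex is not a real HTML parser (it will trip over a ">" inside an attribute value, for instance).

```javascript
// Elements that never take a closing tag.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Returns true when every non-void opening tag is closed in the right order.
function isBalanced(html) {
  const stack = [];
  // Comments may contain tag-like text, so drop them first.
  const withoutComments = html.replace(/<!--[\s\S]*?-->/g, '');
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*?(\/?)>/g;
  let match;
  while ((match = tagPattern.exec(withoutComments)) !== null) {
    const [, closing, rawName, selfClosing] = match;
    const name = rawName.toLowerCase();
    if (VOID_TAGS.has(name) || selfClosing) continue;
    if (closing) {
      // A closing tag must match the most recently opened element.
      if (stack.pop() !== name) return false;
    } else {
      stack.push(name);
    }
  }
  return stack.length === 0;
}
```

In the page you could then guard the call, for example by running $('header').replaceWith(newhtml) only when isBalanced(newhtml) returns true, and logging a warning otherwise.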

Prevent iframe reload

I am currently working on a PDF viewer page with AngularJS (although the same problem would probably occur with Angular or React).
The document itself is embedded by the pdf-box component in an iframe. The problem is that when switching from one layout type to another, all the documents are downloaded again, causing unnecessary requests and bad UX. Using ng-show instead of ng-if causes some documents to be downloaded multiple times.
Here is a small piece showing how the component is structured:
<div class="doc-container">
<div class="container-fluid" ng-if="model.currentLayout.Label == '1'">
<div class="row">
<div class="col-lg-12 col-md-12">
<pdf-box doc="model.slots[0].doc"></pdf-box>
</div>
</div>
</div>
<div class="container-fluid" ng-if="model.currentLayout.Label == '1x2'">
<div class="row">
<div class="col-lg-6 col-md-6">
<pdf-box doc="model.slots[0].doc"></pdf-box>
</div>
<div class="col-lg-6 col-md-6">
<pdf-box doc="model.slots[1].doc"></pdf-box>
</div>
</div>
</div>
</div>
Some ideas that I have:
Try with ng-include or $templateCache, just for the iframe internally, not sure if it would work though.
Use pdf.js and cache the documents as base64 strings. That would prevent the additional http requests but would still cause the previously opened documents to be "restarted" to page 0, causing again a bad UX.
It seems to me that there might be a much easier solution right in front of my eyes.
Is there a way to force the browser to "reuse" an existing iframe src after DOM manipulation?
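One direction that might be worth trying, sketched under assumptions: ng-if destroys and recreates its contents, so each layout switch builds new iframes, and two ng-show copies of the same slot mean two iframes for one document. Rendering every slot exactly once and only changing its CSS classes may avoid both. The helper below is hypothetical (the names `LAYOUTS` and `slotViewState` are made up, and the class strings are copied from the markup above); it only computes the per-slot state for a layout label.

```javascript
// Hypothetical layout table: how many slots are visible and which
// Bootstrap column classes each slot gets for a given layout label.
const LAYOUTS = {
  '1': { visibleSlots: 1, columnClass: 'col-lg-12 col-md-12' },
  '1x2': { visibleSlots: 2, columnClass: 'col-lg-6 col-md-6' },
};

// Returns one entry per slot, so the template can keep a single
// pdf-box per slot and toggle classes instead of rebuilding elements.
function slotViewState(layoutLabel, slotCount) {
  const layout = LAYOUTS[layoutLabel];
  if (!layout) {
    throw new Error(`Unknown layout: ${layoutLabel}`);
  }
  return Array.from({ length: slotCount }, (_, index) => ({
    index,
    columnClass: layout.columnClass,
    hidden: index >= layout.visibleSlots,
  }));
}
```

The template would then contain a single ng-repeat over the slots, with ng-class bound to columnClass and ng-show bound to !hidden, so each pdf-box stays in the DOM across layout changes. Whether the iframe keeps its scroll position still depends on how pdf-box itself reacts to changes.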

Element unexpectedly hidden on some instances of Safari, always appears in Firefox, Chrome

I've created an Angular app that acts as a "bounce" page for a social iOS application that is in public beta.
Users can share their profiles with each other as a link to the site.
If the page is opened on an iOS device and the app is installed, it opens the app using the app's URL scheme. Otherwise it redirects to the App Store.
For all other devices (Android, desktop, etc.) it just renders a simple information page.
The problem:
On one instance of Safari, two elements of the page are not rendered.
Missing elements:
"Instruments" label below the profile name does not appear.
Copyright, privacy and terms and conditions do not appear.
On all other instances of Safari, as well as Chrome and Firefox, it appears correctly as follows:
The site is implemented using Angular + Bootstrap.
Page source for the section with the missing instruments label is:
<div ng-show="vm.loaded">
<div class="row">
<div class="col-sm-4 col-sm-offset-4 profile-container">
<div class="row">
<div class="col-xs-12" style="min-width: 300px !important;">
<div class="card hovercard">
<div class="photo-container">
</div>
<div class="useravatar">
<img ng-src="{{vm.profile.coverPhotoURL}}" vpr-load="vm.onImageLoaded()">
</div>
</div>
</div>
<div class="col-xs-12 text-center" style="margin-top: -10px">
<span class="profile-title">{{vm.titleText()}}</span>
</div>
<div class="col-xs-12 text-center" style="margin-top: 8px">
<span class="profile-sub-title">{{vm.instruments()}}</span>
</div>
<div class="col-xs-12 text-center" style="padding-bottom: 20px">
<span class="profile-genres">{{vm.genres()}}</span>
</div>
</div>
</div>
</div>
And the footer is as follows:
<footer class="vFooter hidden-xs">
<div class="container">
<div class="row">
<div class="col col-xs-6 text-center" style="height: 80px">
<div class="vertical-center">
<p class="copyright vertical-center" style="margin-top: 20px; margin-bottom: 20px">
<span class="text-nowrap">© Vampr Pty Ltd 2016 | <vpr-link-bar></vpr-link-bar></span>
</p>
</div>
</div>
(etc, etc)
(some closing <div> tags not shown in source snippet above)
Question:
Without having access to the machine where the problem occurs, how can I debug and correct the issue?
I found a similar problem on another instance of Safari that seemed to be caused by AdBlock; disabling it made the elements appear correctly. However, the elements in question were different ones: the social media buttons.
The problem appears to have been caused, simply, by an unescaped ampersand '&' character in the HTML source. Fixing this solved the problem. I'm not sure why the issue occurred only on one particular instance of Safari out of many others.
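For anyone chasing the same thing, a quick way to locate bare ampersands in a template is sketched below. `findUnescapedAmpersands` is a made-up helper, and the check is deliberately strict: it flags every "&" that does not start a character reference, including ones that HTML parsers normally tolerate, and it does not skip script blocks or URLs.

```javascript
// Matches '&' that is not the start of a named (&amp;), decimal (&#169;)
// or hexadecimal (&#xA9;) character reference.
const UNESCAPED_AMPERSAND =
  /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/g;

// Returns the 1-based line and column of every bare ampersand.
function findUnescapedAmpersands(html) {
  const hits = [];
  html.split('\n').forEach((line, lineIndex) => {
    for (const match of line.matchAll(UNESCAPED_AMPERSAND)) {
      hits.push({ line: lineIndex + 1, column: match.index + 1 });
    }
  });
  return hits;
}
```

Running it over the page source gives a list of positions to inspect and replace with &amp; where appropriate.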

How to retrieve a script tag from the HTML file source using the TWebBrowser component?

In the HTML document there is a script tag that contains a JavaScript function call:
<div class="container">
<div id="container2">
<div id="container3">
<script>
loadme ('main');
</script>
</div>
</div>
But when I do 'Inspect Element' in the browser, a block of tags appears instead:
<div class="container">
<div id="container2">
<div id="container3">
<div class="content">
<div class="contenthead">
Some Text
</div>
<div class="c1">
<div class="c2">
<form id="myForm">
<label>
Text
</label>
</form>
<div class="c3">
<a href="#" onclick="javascript:f1('Text', 0, 0)">
</a>
</div>
<div class="clear">
</div>
</div>
...
I want to get this block with my own app but I cannot. I use Delphi TWebBrowser to do this.
How can I get this HTML code using Delphi WebBrowser?
Yes, obviously, because when inspecting with Firebug in Firefox or the developer tools in Chrome or IE, you see the live DOM rather than the original source, so the script tag you mentioned does not show up as written.
You can see your script tag by viewing the page source (Ctrl+U is the shortcut to open the source window in Chrome and Firefox).

Initializing DHTMLXscheduler

I was testing out the DHTMLXscheduler module in my application. I followed all the setup steps explained at http://docs.dhtmlx.com/doku.php?id=dhtmlxscheduler:how_to_start but couldn't get it to work.
I have included the JS and CSS in my header file and implemented all the necessary code. After running my application, I end up with an empty screen without any errors in my console.
Has anyone experienced something like this?
The code I used to initialize it is exactly the same as shown on their webpage:
<script type="text/javascript">
scheduler.init('scheduler',null,"week");
</script>
<div id="content">
<div id="scheduler" class="dhx_cal_container" style='width:100%; height:100%;'>
<div class="dhx_cal_navline">
<div class="dhx_cal_prev_button"> </div>
<div class="dhx_cal_next_button"> </div>
<div class="dhx_cal_today_button"></div>
<div class="dhx_cal_date"></div>
<div class="dhx_cal_tab" name="day_tab" style="right:204px;"></div>
<div class="dhx_cal_tab" name="week_tab" style="right:140px;"></div>
<div class="dhx_cal_tab" name="month_tab" style="right:76px;"></div>
</div>
<div class="dhx_cal_header">
</div>
<div class="dhx_cal_data">
</div>
</div>
I have also tried calling the init method after the HTML content, which gave the same result.
The problem is solved. I had to increase the height of the inner div #content.
