How to get the contents of a Third Party lazy loading page?

Via a Chrome extension, I'm trying to get and modify the contents of a third-party page. Everything works for the part of the content that's immediately visible on initial page load.
The problem is that this page uses lazy-load/ajax pagination. To get all of the content I have to click "view all" (an ajax link). I believe this works essentially the same way as lazy loading, which is why I put that keyword in the title.
Upon clicking that link (on the third-party website), all the content gets loaded and becomes visible to the user, but when I view the page source, only the originally loaded content is present. None of the freshly loaded content can be found anywhere in the source, even though it is visible to the end user.
Initially, I tried to overcome the problem by using setInterval and checking the page content every second, but as that wasn't working I checked the source code and, sure enough, none of the newly loaded content is visible there. No wonder my Chrome extension can't get that content.
Another confusing thing I just realized while typing this:
When I view the source code, even the initial HTML content that my Chrome extension is detecting/loading is NOT actually present in the source code! It actually sits in a JavaScript array. So, somehow, my Chrome extension is correctly getting the initial HTML content that's constructed from that JS array. But it's NOT getting the content that gets loaded after clicking the "view all" ajax link on that page (even though I'm using setInterval and checking for new content every second).
What are possible solutions for this?
I can't post the link to the page because it's the "my certificates" page on Lynda.com and I don't know of a publicly accessible website/page with the same behavior.

You should find the actual service call in the network panel when the lazy loading happens, and then adapt the following code:
// Recursively make calls and gather responses.
// cb: callback to run on each response; end: last page number (ends the recursion);
// pageId: the query parameter that changes in every subsequent lazy-loading call.
var callIfRequiredConfigured = ({cb, end, step = 1, pageURL, pageId}) => {
    let currentCounter = 0; // last page requested; the first request is page `step`
    let queueCounter = 0;   // number of requests currently in flight
    const callIfRequired = () => {
        // queueCounter check: never have more than 6 calls in flight at once
        if (queueCounter >= 6) {
            return;
        }
        currentCounter = currentCounter + step;
        if (currentCounter > end) {
            return;
        }
        (async (currentCounter) => {
            queueCounter++;
            // modify this as needed
            const r = await fetch(pageURL + currentCounter, {credentials: "same-origin"});
            const response = await r.text();
            cb(response);
            queueCounter--;
            callIfRequired();
        })(currentCounter);
    };
    return callIfRequired;
};
var call = (config) => {
    const callIfRequired = callIfRequiredConfigured(config);
    callIfRequired();
};
call({
    cb: (response) => {
        // do something with the response
    },
    end: 50,
    step: 1,
    pageId: 'PageNumber=',
    pageURL: `https://www.lynda.com/home/CertificateOfCompletion/GetCertificatesByFilter?Start=0&Limit=99999&SortBy=CompletionDate&SortByOrder=1&_=[my_personal_id]&PageNumber=`
});
So the main effort will be to deduce the service endpoint and how it changes in subsequent requests. I have updated the URL given in the comments, but check whether the fetch call is successful. Also, this URL should include [my_personal_id], as given in the URL above.
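Once the endpoint is known, the list of page URLs can be generated up front rather than by string concatenation. The following is only a sketch under assumptions: buildPageUrls is a hypothetical helper, and the example.com URL and the PageNumber parameter stand in for whatever the network panel actually shows.

```javascript
// Hypothetical helper: build one URL per page by varying a single query parameter.
// baseUrl and pageParam are assumptions; take the real values from the network panel.
function buildPageUrls(baseUrl, pageParam, end, step = 1) {
  const urls = [];
  for (let page = step; page <= end; page += step) {
    const url = new URL(baseUrl);
    // set() replaces the parameter if it is already present, otherwise appends it
    url.searchParams.set(pageParam, String(page));
    urls.push(url.toString());
  }
  return urls;
}

// Placeholder endpoint, not the real service.
const pageUrls = buildPageUrls("https://example.com/items?Limit=50", "PageNumber", 3);
```

Each entry of pageUrls could then be passed to fetch(url, {credentials: "same-origin"}) from the extension, in the same way as in the answer's code.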

Related

Handling url targeted PHP page with custom jQuery loaded content

First I used include('pageName.php'); for every page I wanted to load.
Now I decided to rewrite everything and to load a page in a <div id="page_content"></div> with the jQuery function: $('#page_content').load(pageurl, {access:true});
I hope this is best practice, because I want to reduce the load time of my web application by not refreshing the whole website with all its CSS and JS files, and instead refreshing only the content when a new page is clicked.
Currently I am using the following function to load pages into the division and to pushState to history:
//Dynload pages
$("a[rel='dynload']").click(function(e) {
e.preventDefault();
var page = $(this).attr("page");
var pageurl = "pages/" + page + ".php";
$('#page_content').load(pageurl, {access:true});
if(pageurl!=window.location){
window.history.pushState({path:pageurl},'',page);
}
//stop refreshing to the page given in
return false;
});
This works perfectly.
I have a button that triggers this function and has the attribute page="index_content". The function loads this page into the division and the state is pushed to the history.
However, we get a URL something like this: https://mywebsite.com/index_content
The problem is that when I load this specific URL in my browser I get "Page not found", of course, because it is trying to find the index_content folder, which does not exist.
Is there a way to check the URL with my PHP/jQuery script and load the correct page into the division? If not, how can I solve this for the generic case?
When I add a new page, I want to spend little to no time on the page-switcher function.
That way I can also handle non-existing pages.
Thanks in advance!
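One possible direction, sketched here under assumptions: if the server is configured to serve the main page for every path (for example through a rewrite rule), the script can map the current path to a page file on load. resolvePageUrl, the list of known pages, and the not_found page are all hypothetical names used for illustration.

```javascript
// Hypothetical helper: map a URL path such as "/index_content" to a page file.
// knownPages is a whitelist, so unknown paths fall back to a "not found" page.
function resolvePageUrl(pathname, knownPages, fallback = "not_found") {
  const page = pathname.replace(/^\/+|\/+$/g, ""); // strip leading/trailing slashes
  const name = knownPages.includes(page) ? page : fallback;
  return "pages/" + name + ".php";
}

// Browser usage, assuming jQuery and the #page_content division from the question:
// $(function () {
//   var pages = ["index_content"];
//   $("#page_content").load(resolvePageUrl(location.pathname, pages), {access: true});
// });
```

The same function could be called from a popstate handler so that the back and forward buttons load the matching content.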

Ajax load div in frame

I have a PHP page in which I designed a div to be populated by an Ajax call.
function showcopay()
{
var apa = document.getElementById("alert_id").value;
$("#copay").load('show_copay.php?pid='+apa);
}
The parent page of the div used to be a popup page. I have moved the page into an iframe, and Ajax does not work any more: when I click the link to load the div, nothing happens.
The content file being called (show_copay.php) is in the same folder as the parent file, as before. Nothing else moved; I only moved the parent page into the iframe, and everything stopped working.
Do I need to include a path to the file?

The Ajax call was being more precise than I had imagined in the first place, because
document.getElementById("alert_id").value
was returning a null value (which was shown in the console, but I was ignoring it). The console was showing an uncaught exception, so adding a try block around it solved the problem. Now the page works like it used to.
function showcopay()
{
    try {
        var apa = document.getElementById("alert_id").value;
    } catch (err) {
    }
    $("#copay").load('show_copay.php?pid=' + apa);
}
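As an alternative to swallowing the exception, the element can be checked before it is used. This is a sketch of a different technique from the try block above, not the poster's fix; copayUrl is a hypothetical helper, and it skips the request entirely when the element is missing instead of sending pid=undefined.

```javascript
// Hypothetical helper: build the request URL only when the element exists.
// `doc` is the document to search, passed in so the check is explicit.
function copayUrl(doc) {
  const el = doc.getElementById("alert_id");
  if (!el) {
    return null; // element not found in this document
  }
  return "show_copay.php?pid=" + encodeURIComponent(el.value);
}

// Browser usage, assuming jQuery is loaded as in the original snippet:
// var url = copayUrl(document);
// if (url) { $("#copay").load(url); }
```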

How to initialise a lightbox

I tried to use a page from
http://dimsemenov.com/plugins/magnific-popup/
as a starting point for my project. I took the code and tried to work out which parts of the larger page I need. However, cutting anything away made it stop working altogether.
What resources are needed (CSS, JS, links)?
I need a lightbox on several pages and want to load the first picture as soon as the page loads. I tried to build a test page at
http://grillparzerhof.at/magnificversuch/index.html
but no lightbox appears at all. It is very much a beginner's question; please help.
~ Karl

This is the code from the Public Methods section of that page that you should use to fire the lightbox on page load; the instruction is near the bottom of the Documentation page:
// Open popup immediately. If popup is already opened - it'll just overwrite the content (but old options will be kept).
// - first parameter: options object
// - second parameter (optional): index of item to open
$.magnificPopup.open({
    items: {
        src: 'someimage.jpg'
    },
    type: 'image'
    // You may add options here, they're exactly the same as for $.fn.magnificPopup call
    // Note that some settings that rely on click event (like disableOn or midClick) will not work here
}, 0);

how to know when the change of a dynamic page is fully loaded (from a userscript)?

On facebook.com, where clicking on a 'link' appears to load a new page but (I believe) really just updates the existing page with new code, how can I know when that new content is fully loaded? For example, when you go to facebook.com, you start at your 'home' page (the news feed). When you click on your name in the navbar you're taken to facebook.com/yourname. However, it's not really a new page as a userscript doesn't get reloaded.
Because of this I wrote a little checker for my userscript that watches to see if the current href of the page has changed:
var startURL;
var curURL;
function main() {
startURL = window.location.href;
console.log("initial page. src: "+startURL);
setInterval(checkfornewpage, 500);
}
function checkfornewpage() {
curURL = window.location.href;
if(curURL != startURL) {
console.log("new page. src: "+curURL);
// do something
startURL = curURL;
}
}
main();
This works fine and notifies me when the page has changed. But when trying to then access elements within that new page I'm finding that what I get back is from the old page. I presume this is because even though the window.location.href has changed, the rest of the page isn't fully loaded.
So then I decided to look for some content on the new page that I know will change to clue me into whether the new content is fully loaded. I chose the body tag classes because the list of classes for body is always different with each new page. I wrote a function to watch for this, but even after the body class list has fully changed, my queries are still bringing up old page data.
I also tried calling load() on the body element, but that never fires.
I don't want to use some kind of generic setTimeout that just waits long enough...I want to know precisely when it's loaded so I can move forward immediately. Ideas?
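One way to avoid a fixed wait is to watch the DOM itself and react once it stops changing. The following is a minimal sketch using the standard MutationObserver API; onDomSettled, the 300 ms quiet period, and the injectable ObserverCtor parameter are assumptions made for illustration, and "quiet for a while" is only a heuristic for "fully loaded".

```javascript
// Sketch: call `callback` each time the DOM under `target` has been quiet for `quietMs`.
// onDomSettled is a hypothetical name; ObserverCtor defaults to the browser's MutationObserver.
function onDomSettled(target, quietMs, callback, ObserverCtor = globalThis.MutationObserver) {
  let timer = null;
  const observer = new ObserverCtor(() => {
    // Every batch of mutations restarts the quiet-period countdown.
    clearTimeout(timer);
    timer = setTimeout(callback, quietMs);
  });
  observer.observe(target, { childList: true, subtree: true });
  // Return a function that stops observing and cancels any pending callback.
  return () => {
    clearTimeout(timer);
    observer.disconnect();
  };
}

// Usage in a userscript (browser only):
// const stop = onDomSettled(document.body, 300, () => {
//   console.log("content settled at " + window.location.href);
// });
```

The quiet period trades latency for reliability: a shorter value reacts sooner but may fire while content is still arriving.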

Javascript: check if the url is changed or not?

My url format is like :
http://domain.in/home
http://domain.in/books/notes
http://domain.in/books/notes/copy
I call a JavaScript function on window.onload to check whether the URL has changed.
If the URL has changed, the code is executed; otherwise the function returns and checks again after 5 seconds.
My code is :
window.onload = function(){
setInterval(function(){
page_open();
}, 5000);
};
function page_open(){
var pages=unescape(location.href);
pages=pages.substr( pages.lastIndexOf("studysquare.in/") + 15 );
// gives book if url is http://studysquare.in/book
//alert("pages"+pages+"\n\n recent"+recent);
if (pages==recent) { return; }
recent=pages;
alert("Reached down now the code will execute.");
}
The problem now is that when the URL is like:
http://domain.in/book
(a single level deep), everything works fine. But when the URL is like
http://domain.in/book/copy or http://domain.in/book/copy/notes
then nothing works.
Any help with checking for a URL change three levels deep in JavaScript every 5 seconds? :)
Sorry, I forgot to mention that I have an .htaccess file that prevents navigation away from the page, whatever is written after domain.in/ in the URL. That means a single page remains open and is not affected by the URL change.

When the user changes the URL, the browser unloads the entire page they're currently on (including your javascript, hence it stops running) and then loads the next page. No javascript is able to run across page changes. You can't monitor a change in the URL like you're doing if they're navigating to another page.
The best way to catch a change in the URL is to add an onUnload event to the body object to fire your javascript when the browser unloads the page just before starting to load the new page the user has requested -- but I'm not sure that's going to help achieve your goal of tracking their recent page views (if that's what you're looking to do).

Sounds like a history plugin such as jQuery address would help you a lot.
It lets you handle the event when the URL is changed, so you can load in new content as required.
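If polling is kept, comparing the whole path rather than a substring after the domain handles URLs of any depth. This is only a sketch: makePathWatcher is a hypothetical helper, and it assumes the page stays loaded while the URL changes, as the question's update describes.

```javascript
// Hypothetical helper: returns a check() function that reports path changes.
// getHref supplies the current URL; onChange receives the new path.
function makePathWatcher(getHref, onChange) {
  let recent = null; // last path seen, null before the first check
  return function check() {
    const path = new URL(getHref()).pathname; // e.g. "/books/notes/copy"
    if (path === recent) {
      return false; // unchanged, nothing to do
    }
    recent = path;
    onChange(path);
    return true;
  };
}

// Browser usage:
// var check = makePathWatcher(function () { return location.href; }, function (path) {
//   console.log("path is now " + path);
// });
// setInterval(check, 5000);
```

Note that the first call to check() reports the initial path as a change, because nothing has been recorded yet.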
