Open link in PhantomJS synchronously - javascript

I want to know whether there is any way to click a link (open a link) in PhantomJS synchronously. The first page must be opened first, then a link on that page is clicked to go to the second page. Here is my approach using setTimeout:
var page = require('webpage').create();
var url = "http://domain.tld/index.html";
page.open(url, function (status) {
    page.render("1-home.png");
    page.evaluate(function() {
        // search for element and click it to redirect
        document.getElementById("RetailUser").click();
    });
    // I use setTimeout to wait for the next page to get loaded
    setTimeout(function () {
        // do another process on second page
    }, 5000);
});
PS: I am aware that I can accomplish this using CasperJS, but if possible I'd rather not use it.

Clicking a link is always synchronous. What happens afterwards is always asynchronous, because of the asynchronous nature of JavaScript.
There are multiple ways to wait for the next page:
Static amount of time with setTimeout,
Dynamic waiting for a condition with waitFor,
Next page load by registering to page.onLoadFinished before the click,
Wait until all outstanding requests are finished, or
A combination of those four methods.
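For illustration, option 3 applied to the code above could look roughly like this (a minimal sketch; "2-next.png" is just an illustrative file name):
var page = require('webpage').create();

page.open("http://domain.tld/index.html", function (status) {
    page.render("1-home.png");

    // Register the handler *before* triggering the click, so the
    // next navigation runs the second-page logic once it has loaded.
    page.onLoadFinished = function (status) {
        // do another process on second page
        page.render("2-next.png");   // illustrative file name
        phantom.exit();
    };

    page.evaluate(function () {
        document.getElementById("RetailUser").click();
    });
});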

Related

Cannot get link to be clicked and switch to next page using PhantomJS

I am having an issue getting PhantomJS to click the login button on a website.
I can see in my second screenshot that it is trying to select the login button, but I cannot get it to wait and take the screenshot on the next page.
Here is my JS file:
var page = require('webpage').create();
page.viewportSize = {width: 1920, height: 1080};
page.open('http://clubs.bluesombrero.com/default.aspx?portalid=1809', function (status) {
    console.log("Status: " + status);
    if (status === "success") {
        var url = page.url;
        console.log('URL: ' + url);
        console.log("TC0001: Pass");
        page.render('TC0001.png');
        var a = page.evaluate(function() {
            return document.querySelector('#dnn_dnnLOGIN_cmdLogin');
        });
        page.sendEvent('click', a.offsetLeft, a.offsetTop);
        page.render('TC0002.png');
    } else {
        console.log("TC0001: Failed, Page did not load.");
    }
    phantom.exit();
});
I have tried a few ways to get it to wait to take the screenshot after the page has loaded, but I have not had any luck.
page.sendEvent() is a synchronous function that finishes as soon as its action is done. The next call (page.render()) is therefore executed before the request that the click triggered has been answered.
1. setTimeout
JavaScript provides setTimeout (and its repeating counterpart setInterval) to wait a static amount of time:
page.sendEvent('click', a.offsetLeft, a.offsetTop);
setTimeout(function(){
    page.render('TC0002.png');
    phantom.exit();
}, 5000);
(don't forget to remove the other phantom.exit() since you don't want to exit too early)
The problem, of course, is that the page might still not be ready after 5 seconds, or it might have loaded extremely fast and the script then just sits there doing nothing.
2. waitFor
A better approach would be to use the waitFor() function that is provided in the examples folder of PhantomJS. You can wait for a specific condition of the page like the existence of a specific element:
page.sendEvent('click', a.offsetLeft, a.offsetTop);
waitFor(function _testFx(){
    return page.evaluate(function(){
        return !!document.querySelector("#someID");
    });
}, function _done(){
    page.render('TC0002.png');
    phantom.exit();
}, 10000);
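In case you don't have the examples folder at hand, this is roughly what that helper looks like (condensed from the PhantomJS examples; the string-eval variants and the elapsed-time log are dropped):
function waitFor(testFx, onReady, timeOutMillis) {
    var maxtimeOutMillis = timeOutMillis ? timeOutMillis : 3000, // default timeout: 3 s
        start = new Date().getTime(),
        condition = false,
        interval = setInterval(function () {
            if ((new Date().getTime() - start < maxtimeOutMillis) && !condition) {
                condition = testFx();        // re-check the condition
            } else {
                if (!condition) {
                    console.log("'waitFor()' timeout");
                    phantom.exit(1);         // give up after the timeout
                } else {
                    onReady();               // condition fulfilled
                    clearInterval(interval); // stop polling
                }
            }
        }, 250);                             // repeat check every 250 ms
}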
3. page.onLoadFinished
Another approach would be to listen to the page.onLoadFinished event which will be called when the next page is loaded, but you should register to it before you click:
page.onLoadFinished = function(){
    page.render('TC0002.png');
    phantom.exit();
};
page.sendEvent('click', a.offsetLeft, a.offsetTop);
4. page.onPageCreated
Whenever an action would open a new window/tab in a desktop browser, page.onPageCreated is triggered in PhantomJS. It provides a reference to the newly created page, so the previous page instance is not overwritten.
page.onPageCreated = function(newPage){
    newPage.render('TC0002.png');
    newPage.close();
    phantom.exit();
};
page.sendEvent('click', a.offsetLeft, a.offsetTop);
In all the other cases, the page instance is overwritten by the new page.
5. "Full" page load
That might still not be sufficient, because PhantomJS doesn't specify what it means when a page is loaded and the JavaScript of the page may still make further requests to build up the page. This Q&A has some good suggestions to wait for a "full" page load: phantomjs not waiting for “full” page load
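The general idea behind that approach (not the exact code from the linked answer) is to count outstanding requests and only act once the counter has stayed at zero for a short settle period; the 500 ms value is arbitrary, and real code would also de-duplicate resources that both error and report an 'end' stage:
// Register these *before* sending the click.
var pending = 0,
    settleTimer = null;

function resourceDone() {
    pending--;
    if (pending === 0) {
        // wait a moment in case the page's JS fires follow-up requests
        settleTimer = setTimeout(function () {
            page.render('TC0002.png');
            phantom.exit();
        }, 500);   // arbitrary settle period
    }
}

page.onResourceRequested = function () {
    pending++;
    if (settleTimer) { clearTimeout(settleTimer); settleTimer = null; }
};

page.onResourceReceived = function (response) {
    if (response.stage === 'end') { resourceDone(); }   // 'end' marks a finished resource
};

page.onResourceError = function () { resourceDone(); }; // don't hang on failed resources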

Unload JS loaded via load() to avoid duplicates?

I'm building a dynamic website that loads all pages inside a "body" div via jQuery's load(). The problem is that I have a script looped with setInterval inside the loaded PHP page, because I want that script loaded only when that page is displayed. I've now discovered that the script keeps running even after "leaving" the page (loading something else inside the div without a refresh), and if I keep leaving and returning, the loops stack up, flooding my server with GET requests from the JavaScript.
What's a good way to unload all JS once you leave the page? I could use a simple dummy variable to avoid loading scripts twice, but I'd rather stop the loop entirely after leaving the page, because it causes useless traffic and spews console errors since the elements it's supposed to fill are no longer there.
Sorry if this has already been asked, but it's pretty hard to come up with keywords for this.
1) Why don't you try clearInterval?
2) If you have a general (main) function a() { ... } doing something, you can override it with a function a() { } that does nothing.
3) If you null the references to something, it will be garbage collected.
No code was provided, so there isn't much more I can do to help.
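For illustration, suggestion 1 boils down to keeping the handle that setInterval returns so it can be cleared before the next load() call (pollServer, status.php, #body and next-page.php are made-up names):
// inside the script that the loaded page brings in: keep the interval handle around
function pollServer() {
    $.get("status.php");        // the looped GET request from the question
}
var pollTimer = setInterval(pollServer, 2000);

// in the code that swaps the page content: stop the loop before loading the next page
clearInterval(pollTimer);
$("#body").load("next-page.php");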
This really sounds like you need to reevaluate your design. Either you need to drop AJAX, or you need to avoid collisions in your method names.
You can review this link, which explains how to remove the JavaScript from the DOM: http://www.javascriptkit.com/javatutors/loadjavascriptcss2.shtml. However, modern browsers will keep the already-parsed code in memory.
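For what it's worth, removing such a script tag is a one-liner once it has an id (the id and file name here are made up), though as noted the already-parsed code stays in memory:
// assume the page-specific script tag was added with an id:
// <script id="page-specific-script" src="page-loop.js"></script>

var oldScript = document.getElementById("page-specific-script");
if (oldScript) {
    // removes the tag from the DOM, but not the code the browser already parsed
    oldScript.parentNode.removeChild(oldScript);
}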
Since you are not dealing with real page loads/unloads I would build a system that simulates an unload event.
var myUnload = (function () {
    var queue = [],
        myUnload = function () {
            queue.forEach(function (unloadFunc) {
                unloadFunc();
            });
            queue = [];
        };
    myUnload.add = function (unloadFunc) {
        queue.push(unloadFunc);
    };
    return myUnload;
}());
The code that loads the new pages should just run myUnload() before it loads the new page in.
function loadPage(url) {
    myUnload();
    $('#page').load(url);
}
Any code that is loaded by a page can call myUnload.add() to register a cleanup function that should be run when a new page is loaded.
// some .js file that is loaded by a page
(function () {
    var doSomething = function () {
            // do something here
        },
        timer = setInterval(doSomething, 1000);

    // register our cleanup callback with the unload event system
    myUnload.add(function () {
        // since all of this code is isolated in an IIFE,
        // clearing the timer will remove the last reference to
        // doSomething and it will automatically be GCed.
        // This callback, the timer var and the enclosing IIFE
        // will be GCed too when myUnload sets queue back to an empty array.
        clearInterval(timer);
    });
}());

How to put a sleep after clicking a button

I would like to put a delay after a button is pressed in order for the button to load the data from the cache before executing the next line of code. Would putting a sleep be the best way to do this?
Something like this, or is there a better approach to solve this problem?
setInterval(document.getElementById("generateButton"), 1000);
Don't use setInterval to do this. It doesn't have the functionality you seem to desire (it repeats). Instead, use jQuery and do something like this:
$("#generateButton").click(function(event){
setTimeout(function(){
//Do what the button normally does
}, 1000);
});
Or (without jQuery):
var generateButton = document.getElementById("generateButton");
generateButton.addEventListener("click", function(){
    setTimeout(function(){
        //Do what the button normally does
    }, 1000);
});
Using setTimeout over setInterval is preferred in your case because setTimeout runs only once while setInterval runs multiple times.
I assume you have, in your HTML, <button id='generateButton' onclick='someFunction()'>Button Text</button>. Remove the onclick='someFunction()' and put your someFunction() where I said (in the examples) "Do what the button normally does."
You can also set this up with callbacks in the code that loads the cache: when someFunction() is called from the button, it loads the cache, and once the cache has been loaded it calls another method, onCacheLoaded(), containing whatever should only run after the cache is available.
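Sketched with the names used above and an assumed loadCache(callback) helper (generateReport() stands in for whatever should run next):
function someFunction() {
    // loadCache is assumed to take a callback and invoke it
    // only after the cached data has actually arrived
    loadCache(function onCacheLoaded() {
        // safe to run the code that depends on the cached data
        generateReport();
    });
}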
You should use callbacks, so that the moment the data is loaded from the cache you can continue executing the rest of the script.
You cannot use an interval, since you cannot be sure how much time the data needs to load. Also keep in mind the asynchronous nature of JavaScript and don't block the parts of the script that do not depend on the data being loaded.
Try setTimeout:
myButton.addEventListener('click', function() {
    setTimeout(delayed, 1e3); // Delay code
}, false);

function delayed() {
    // Do whatever
}
Note setInterval runs a function periodically, setTimeout only once.
Also note that the delayed code must be a function (or a string which will be evaluated, but better avoid that). However, document.getElementById("generateButton") returns an html element (or null).

How to change current page in Phantomjs using buttons? [duplicate]

This question already has an answer here:
Open link in PhantomJS synchronously (1 answer)
Closed 6 years ago.
I have started using PhantomJS and I don't understand something. How do I change the current page by clicking buttons? For example, I've got this code:
var page = require('webpage').create();
page.open('https://ru-ru.facebook.com', function() {
    page.injectJs('jQuery.js');
    page.evaluate(function() {
        $('#email').val('MyLogin');
        $('#pass').val('MyPass');
        $('#u_0_l').click();
    });
    page.render('example.png');
    phantom.exit();
});
So after clicking a button I need to go to the next page. How can I do this?
Assuming the $.click() actually worked, you need to wait until the page has loaded.
setTimeout:
You may feel lucky and try it with
page.evaluate(function() {
    // form vals and click
});
setTimeout(function(){
    page.render('example.png');
    phantom.exit();
}, 3000); // 3 sec timeout
waitFor selector:
You may use the waitFor example from the PhantomJS package to wait for an element that you know exists only on the next page, not on the current one.
waitFor loadFinished:
You may combine waitFor with the onLoadFinished callback handler to wait for the next page to load. This is preferable, since it works every time and doesn't require overshooting with a conservative timeout.
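Roughly, assuming the waitFor() helper from the PhantomJS examples is available in the script, that combination could look like this:
var nextPageLoaded = false;

// register before the click so the flag flips when the post-login page arrives
page.onLoadFinished = function () {
    nextPageLoaded = true;
};

page.evaluate(function () {
    $('#email').val('MyLogin');
    $('#pass').val('MyPass');
    $('#u_0_l').click();
});

// waitFor polls the flag and only then renders the next page
waitFor(function () {
    return nextPageLoaded;
}, function () {
    page.render('example.png');
    phantom.exit();
});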
Just use CasperJS:
Nothing more to add.

How can I capture a click on the browser page without any possible effect on the site's robustness?

My JavaScript code is added to random websites. I would like to be able to report to my server when a (specific) link/button on a website is clicked. However, I want to do it without any possible interruption of the website's execution under any circumstances (such as an error in my code, or my server being down). In other words, I want the site to perform its default action regardless of my code.
The simple way to do it is to add an event listener for the click event, call the server synchronously to make sure the call is registered, and then let the click execute. But I don't want my code or server to be able to prevent the click from completing.
Any other ideas on how to do that?
As long as you don't return false; inside your callback and your AJAX call is asynchronous, I don't think you'll have any problems with your links not working.
$("a.track").mousedown(function(){ $.post("/tools/track.php") })
I would also suggest you encapsulate this whole logic inside a try{} catch(){} block so that any errors encountered will not prevent the normal click behaviour from continuing.
Perhaps something like this? I haven't tested it, so it may contain some typos, but the idea is the same...
<script type="text/javascript">
function mylinkwasclicked(id){
try{
//this function is called asynchronously
setTimeOut('handlingfunctionname('+id+');',10);
}catch(e){
//on whatever error occured nothing was interrupted
}
//always return true to allow the execution of the link
return true;
}
</script>
then your link could look like this:
<a id="linkidentifier" href="somelink.html" onclick="mylinkwasclicked(5)" >click me!</a>
or you could add the onclick dynamically:
<script type="text/javascript">
var link = document.getElementById('linkidentifier');
link.onclick=function(){mylinkwasclicked(5);};
</script>
Attach this to the link's click handler:
(new Image()).src = 'http://example.com/track?url=' + escape(this.href) + '&' + Math.random();
It is asynchronous (the "pseudo image" is loaded in the background).
It can be cross-domain (unlike AJAX).
It uses only the most basic JavaScript functionality.
It can, however, miss some clicks, because the page may unload before the image request has completed.
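Tying this together with the try/catch advice above, attaching the beacon to a specific link could look like this (the element id and tracking URL are the placeholders reused from the snippets above):
document.getElementById('linkidentifier').addEventListener('click', function () {
    try {
        // fire-and-forget beacon; nothing here blocks or cancels the click
        (new Image()).src = 'http://example.com/track?url=' +
            escape(this.href) + '&' + Math.random();
    } catch (e) {
        // swallow every error so the default navigation always proceeds
    }
    // no return false / preventDefault, so the browser follows the link as usual
}, false);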
The click should be processed normally.
1) If your JavaScript code has an error, the page might show an error icon in the status bar, but it will continue processing; it won't hang.
2) If your AJAX request is asynchronous, the page will make that request and process the click simultaneously. If your server is down and the background AJAX request times out, it won't stop the click event from being processed.
If you make the request to your server synchronously, you'll block execution of the original event handler until the response is received; if you make it asynchronously, the original behaviour of the link or button (a form post, or changing the document's URL) may interrupt your asynchronous request.
Delay the page exit just long enough to ping your server url
function link_clicked(el)
{
    try {
        var img = new Image();
        img.src = 'http://you?url=' + escape(el.href) + '&rand=' + Math.random();
        window.onbeforeunload = wait;
    }catch(e){}
    return true;
}

function wait()
{
    // busy-wait briefly to give the tracking request time to go out
    for (var a = 0; a < 100000000; a++){}
    // do not return anything or a message window will appear
}
So what we've done is add a small delay to the page exit, to give the outbound ping enough time to register. You could also just call wait() in the click handler, but that would add an unnecessary delay to links that don't leave the page. Set the delay to whatever gives good results without noticeably slowing down the user; anything more than a second would be rude, but a second is a long time for a request round trip that returns no data. Just make sure your server doesn't need longer than that to process the request, or simply dump to a log and process it later.
