Why does phantom.exit() have a 2 second delay? - javascript

I noticed that for a simple script like:
var url = "http://stackoverflow.com";
var page = require('webpage').create();
page.onConsoleMessage = function(msg) {
    console.log('Page title is ' + msg);
};
page.onLoadFinished = function(status) {
    console.log('Status: ' + status);
};
page.open(url, function(status) {
    page.evaluate(function() {
        console.log(document.title);
    });
    phantom.exit();
});
calling phantom.exit() will not exit immediately, rather it will wait 2 seconds before doing so. I'm using version 2.1.1.
Do you know where this delay comes from and how I can make phantom exit immediately? Thank you!

This looks like an open issue:
https://github.com/ariya/phantomjs/issues/14033
I tried your code with the latest version and got the same behaviour.
You can use phantomjs#1.9.X instead, as "jmullo" commented in the issue thread.

PhantomJS is a browser like any other, just headless, so its exit() method has a lot of cleanup to do, and that may take a couple of seconds.

Related

CasperJS running not showing output to console

I'm running CasperJS from cmd, and while everything seems to work OK (the script runs as expected),
it won't show the echo output in cmd unless I press a key (Return).
Here's the code:
var casper = require('casper').create();
casper.start('http://casperjs.org/');
casper.then(function() {
    this.echo('First Page: ' + this.getTitle());
});
casper.thenOpen('http://phantomjs.org', function() {
    this.echo('Second Page: ' + this.getTitle());
});
casper.run();
And here's an image of the problem (won't show anything without me pressing return):
After I hit return twice:
Thanks!
Well, I found a workaround, so to speak...
I created a batch file, ran it, and it worked.
The batch file looks something like this:
echo Testing something...
casperjs test1.js

How to end an endless stream of alert messages in PhantomJS

How can I close an endless stream of alerts in PhantomJS?
A website I visit raises alerts endlessly.
I used
page.onAlert = function(msg) {
    console.log('ALERT: ' + msg);
};
to check whether the website raises alerts,
but this handler never stops firing because the alerts are endless.
You don't have to print all alerts that the page you're visiting sends your way. Simply remove the handler to silence them:
page.onAlert = function(msg) {
    console.log('ALERT: ' + msg);
    page.onAlert = function(){};
};
This prints only the first alert. You can make this more sophisticated by counting the alerts or something like that.
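The counting idea can be sketched roughly as follows. makeAlertLimiter is a hypothetical helper name, not part of PhantomJS, and the limit of 3 in the usage comment is an arbitrary assumption. The handler logs the first few alerts and ignores the rest:

```javascript
// Hypothetical helper (not a PhantomJS API): builds an alert handler that
// logs the first `limit` alerts through `log` and silently ignores the rest.
function makeAlertLimiter(limit, log) {
    var seen = 0; // number of alerts received so far
    return function (msg) {
        seen += 1;
        if (seen <= limit) {
            log('ALERT ' + seen + ': ' + msg);
        }
    };
}

// Assumed usage in a PhantomJS script, where `page` is your webpage object:
// page.onAlert = makeAlertLimiter(3, function (line) { console.log(line); });
```

Keeping the counter inside the closure means the handler never has to be reassigned, so it keeps working across page reloads.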
Try this on every reloaded page that would raise an alert later.
driver.execute_script("window.confirm = function(){return true;}");
See more reference here.

HTML output from PhantomJS and Google Chrome/Firefox are different

I've been debugging this for a long time and it has me completely baffled. I need to save ads to my computer for a work project. Here is an example ad that I got from CNN.com:
http://ads.cnn.com/html.ng/site=cnn&cnn_pagetype=main&cnn_position=300x250_rgt&cnn_rollup=homepage&page.allowcompete=no&params.styles=fs&Params.User.UserID=5372450203c5be0a3c695e599b05d821&transactionID=13999976982075532128681984&tile=2897967999935&domId=6f4501668a5e9d58&kxid=&kxseg=
When I visit this link in Google Chrome and Firefox, I see an ad (if the link stops working, simply go to CNN.com and grab the iframe URL for one of the ads). I developed a PhantomJS script that will save a screenshot and the HTML of any page. It works on any website, but it doesn't seem to work on these ads. The screenshot is blank and the HTML contains a tracking pixel (a 1x1 transparent gif used to track the ad). I thought that it would give me what I see in my normal browser.
The only thing that I can think of is that the AJAX calls are somehow messing up PhantomJS, so I hard-coded a delay but I got the same results.
Here is the most basic piece of test code that reproduces my problem:
var fs = require('fs');
var page = require('webpage').create();
var url = phantom.args[0];
page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Unable to load the address!');
        phantom.exit();
    }
    else {
        // Output Results Immediately
        var html = page.evaluate(function () {
            return document.getElementsByTagName('html')[0].innerHTML;
        });
        fs.write("HtmlBeforeTimeout.htm", html, 'w');
        page.render('RenderBeforeTimeout.png');
        // Output Results After Delay (for AJAX)
        window.setTimeout(function () {
            var html = page.evaluate(function () {
                return document.getElementsByTagName('html')[0].innerHTML;
            });
            fs.write("HtmlAfterTimeout.htm", html, 'w');
            page.render('RenderAfterTimeout.png');
            phantom.exit();
        }, 9000); // 9 Second Delay
    }
});
You can run this code using this command in your terminal:
phantomjs getHtml.js 'http://www.google.com/'
The above command works well. When you replace the Google URL with an ad URL (like the one at the top of this post), it gives me the unexpected results that I explained.
Thanks so much for your help! This is my first question that I've ever posted on here, because I can almost always find the answer by searching Stack Overflow. This one, however, has me completely stumped! :)
EDIT: I'm running PhantomJS 1.9.7 on Ubuntu 14.04 (Trusty Tahr)
EDIT: Okay, I've been working on it for a while now and I think it has something to do with cookies. If I clear all of my history and view the link in my browser, it also comes up blank. If I then refresh the page, it displays fine. It also displays fine if I open it in a new tab. The only time it doesn't is when I try to view it directly after clearing my cookies.
EDIT: I've tried loading the link twice in PhantomJS without exiting (manually requesting it twice in my script before calling phantom.exit()). It doesn't work. In the PhantomJS documentation it says that the cookie jar is enabled by default. Any ideas? :)
You should try using the onLoadFinished callback instead of checking for status in page.open. Something like this should work:
var fs = require('fs');
var page = require('webpage').create();
var url = phantom.args[0];
page.open(url);
page.onLoadFinished = function()
{
    // Output Results Immediately
    var html = page.evaluate(function () {
        return document.getElementsByTagName('html')[0].innerHTML;
    });
    fs.write("HtmlBeforeTimeout.htm", html, 'w');
    page.render('RenderBeforeTimeout.png');
    // Output Results After Delay (for AJAX)
    window.setTimeout(function () {
        var html = page.evaluate(function () {
            return document.getElementsByTagName('html')[0].innerHTML;
        });
        fs.write("HtmlAfterTimeout.htm", html, 'w');
        page.render('RenderAfterTimeout.png');
        phantom.exit();
    }, 9000); // 9 Second Delay
};
I have an answer here that loops through all files in a local folder and saves images of the resulting pages: Using Phantom JS to convert all HTML files in a folder to PNG
The same principle applies to remote HTML pages.
Here is what I have from the output:
Before Timeout:
http://i.stack.imgur.com/GmsH9.jpg
After Timeout:
http://i.stack.imgur.com/mo6Ax.jpg

Accessing the contentDocument of an iframe in phantomjs

I'm having difficulties accessing the contentDocument of an iframe. I am using phantomjs (1.9). I have looked into various threads but none seem to have the answer.
This is my phantomjs script where I have injected jquery to try and select the element.
var page = require('webpage').create();
page.onConsoleMessage = function(msg, lineNum, sourceId) {
    console.log('CONSOLE: ' + msg);
};
page.onError = function(msg) {
    console.log('ERROR MESSAGE: ' + msg);
};
page.open('http://localhost:8080/', function() {
    page.includeJs("http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js", function() {
        page.evaluate(function() {
            console.log( $('iframe').contentDocument.documentElement );
        });
        phantom.exit();
    });
});
Apart from jQuery, I have also used these two lines of code to get the DOM element that I want (the HTML element inside the iframe). PhantomJS seems unable to evaluate anything beyond getElementsByTagName('iframe') or $('iframe'). Could it be because the iframe hasn't finished loading yet?
document.getElementsByTagName('iframe')[0].contentDocument.activeElement;
document.getElementsByTagName('iframe')[0].contentDocument.documentElement;
I am also running the script with the --web-security=no setting (web security disabled).
I ran into this issue, but found it was because I was not wrapping the code in evaluate(). You seem to be doing that, though. Try this without jQuery:
page.evaluate(function () {
    var iframe = document.getElementById('iframeName').contentDocument;
    iframe.getElementById("testInput").value = "test";
});
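One more thing worth checking in the question's code: $('iframe') returns a jQuery collection, not a DOM element, so $('iframe').contentDocument is undefined and reading .documentElement from it throws. The raw element is available as $('iframe')[0]. Below is a minimal sketch of a plain-DOM version. frameRootTag is a hypothetical helper name, and the sketch assumes the frame has already loaded and is accessible from the page. It returns a string rather than a DOM node because values coming back from page.evaluate() need to be simple, serializable data:

```javascript
// Hypothetical helper, meant to be passed to page.evaluate() so that it runs
// inside the page, where `document` is the page's own document.
// Returns the tag name of the root element inside the index-th iframe,
// or null when the frame is missing or its document is not accessible.
function frameRootTag(index) {
    var frame = document.getElementsByTagName('iframe')[index];
    if (!frame || !frame.contentDocument) {
        return null;
    }
    return frame.contentDocument.documentElement.tagName;
}

// Assumed usage in a PhantomJS script (extra arguments are passed through):
// var tag = page.evaluate(frameRootTag, 0);
// console.log('Root element of first iframe: ' + tag);
```

If this returns null, the frame may simply not be loaded yet, so retrying after a short delay is a reasonable next step.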

Phantomjs check if javascript exists and is working

I am quite new to PhantomJS and am starting to get to know how to use it. However, none of the semi-advanced tutorials cover what I want to use PhantomJS for.
Now my question is: how would I check whether JavaScript is active on the site and working correctly (i.e. not throwing errors in the console)?
I hope someone is able to point me in the right direction or knows how to do this.
You can interact with the open page using the webpage.evaluate method:
var page = require('webpage').create();
page.open('http://m.bing.com', function(status) {
    var title = page.evaluate(function(s) {
        //what you do here is done in the context of the page
        //this console.log will appear in the virtual page console
        console.log("test")
        //here they are just returning the page title (tag passed as argument)
        return document.querySelector(s).innerText;
        //you are not required to return anything
    }, 'title');
    console.log(title);
    phantom.exit(); //closes phantom, free memory!
});
In order to see the virtual page console, you have to add an onConsoleMessage callback:
page.onConsoleMessage = function(msg, lineNum, sourceId) {
    console.log('CONSOLE: ' + msg + ' (from line #' + lineNum + ' in "' + sourceId + '")');
};
EDIT: JavaScript is executed by default.
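For the "not throwing errors" part of the question, PhantomJS also calls page.onError with a message and a stack trace whenever a script on the page throws. A minimal sketch of collecting those errors follows. formatPageError and the errors array are hypothetical names introduced for illustration, and the trace entries are assumed to carry the file, line and function fields that PhantomJS passes to the callback:

```javascript
// Hypothetical helper: turns the (msg, trace) pair that PhantomJS passes to
// page.onError into a single readable string, one line per stack frame.
function formatPageError(msg, trace) {
    var lines = ['ERROR: ' + msg];
    (trace || []).forEach(function (frame) {
        var where = frame.file + ':' + frame.line;
        if (frame.function) {
            where += ' (in function "' + frame.function + '")';
        }
        lines.push('  -> ' + where);
    });
    return lines.join('\n');
}

// Assumed usage in a PhantomJS script, set before calling page.open():
// var errors = [];
// page.onError = function (msg, trace) {
//     errors.push(formatPageError(msg, trace));
// };
// ...then, once the page has loaded:
// console.log(errors.length === 0 ? 'No script errors' : errors.join('\n'));
```

An empty errors list after loading only means nothing threw up to that point; scripts that run later (timers, AJAX callbacks) can still fail afterwards.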
