I've been looking at how to automate actions on a webpage with PhantomJS, but I'm having trouble manipulating the page to do what I want.
I'm using this as a test site. I've managed to get Phantom to open the webpage and scrape the random sentence from the #result span. Now I want to get another sentence without re-launching the script. I don't want to close and re-open the page, as Phantom takes ages to launch WebKit and load the page. So I thought I could get another sentence by having Phantom click the 'Refresh' button below the sentence box. Here's what I have at the moment:
var page = require('webpage').create();
console.log("connecting...");
page.open("http://watchout4snakes.com/wo4snakes/Random/RandomSentence", function(){
console.log('connected');
var content = page.content;
var phrase = page.evaluate(function() {
return document.getElementById("result").innerHTML;
});
console.log(phrase);
page.includeJs("http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js", function() {
page.evaluate(function() {
$("frmSentence").click();
});
});
var content = page.content;
var phrase = page.evaluate(function() {
return document.getElementById("result").innerHTML;
});
console.log(phrase);
phantom.exit();
});
As you can see, I'm trying to click the refresh button with .click(), but this isn't working: I still get the same sentence as before. Given the HTML for the button:
<form action="/wo4snakes/Random/NewRandomSentence" id="frmSentence" method="post" novalidate="novalidate">
<p><input type="submit" value="Refresh"></p>
</form>
I'm not sure what I should reference in the script to be clicked. I'm trying the form ID 'frmSentence', but that isn't working. Is .click() the right way to go about this, or is there some way for Phantom to submit the form that the button belongs to? Or maybe I can run the associated script on the page that gets the sentence? I'm a bit lost on this one, so I don't really know which method to go with.
You have a problem with your control flow. page.includeJs is an asynchronous function. If you have other statements after page.includeJs, they are likely executed before the script is loaded and the callback is executed. In your case, that means you read the sentence twice before you even trigger a click.
If you want to do this multiple times, I suggest using recursion, since you cannot write this synchronously. Also, since you want this to be fast, you should not use a static setTimeout with a timeout of 1 second, because sometimes the request will be faster (you lose time) and sometimes slower (your script breaks). You should use waitFor from the examples.
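A minimal sketch of that idea, assuming a polling helper in the spirit of waitfor.js from the PhantomJS examples (this is not that file's code). The readSentence and triggerRefresh arguments are hypothetical stand-ins for the page.evaluate calls that read #result and click the button, and makeFakePage only simulates a page so the sketch runs on its own.

```javascript
// Poll `condition` every `intervalMs`; call `onReady` once it returns true,
// or `onTimeout` if `timeoutMs` elapses first.
function waitFor(condition, onReady, onTimeout, timeoutMs, intervalMs) {
  var start = Date.now();
  var timer = setInterval(function () {
    if (condition()) {
      clearInterval(timer);
      onReady();
    } else if (Date.now() - start >= timeoutMs) {
      clearInterval(timer);
      onTimeout();
    }
  }, intervalMs);
}

// Recursively collect `remaining` sentences, waiting for the text to change
// after each refresh instead of sleeping for a fixed time.
function collect(remaining, readSentence, triggerRefresh, results, done) {
  var current = readSentence();
  results.push(current);
  if (remaining <= 1) { return done(results); }
  triggerRefresh();
  waitFor(
    function () { return readSentence() !== current; },
    function () { collect(remaining - 1, readSentence, triggerRefresh, results, done); },
    function () { done(results); },   // give up on timeout, return what we have
    5000,
    50
  );
}

// Hypothetical stand-in for the real page: the "sentence" changes 20 ms after a refresh.
function makeFakePage() {
  var count = 0;
  var sentence = 'sentence 0';
  return {
    read: function () { return sentence; },
    refresh: function () {
      setTimeout(function () { count += 1; sentence = 'sentence ' + count; }, 20);
    }
  };
}

var demo = makeFakePage();
collect(3, demo.read, demo.refresh, [], function (sentences) {
  console.log(sentences.join('\n'));
});
```

In a real PhantomJS script, read and refresh would wrap page.evaluate calls and phantom.exit() would go in the final callback. Note that the wait condition compares text, so if the site happened to return the same sentence twice in a row, the wait would run into the timeout.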
Instead of loading jQuery every time, you can move page.includeJs up and put everything else in its callback. If you only need to click an element, or if jQuery's click doesn't work (yes, that happens from time to time), you should use the approach from "PhantomJS; click an element".
Web scraping is about sending the required information to a web server and getting the result. It is not about behaving like a user clicking buttons or entering search criteria.
All you need to do in this example is send a POST request to http://watchout4snakes.com/wo4snakes/Random/NewRandomSentence. The result is just text in page.content; it does not even need page.evaluate. So to get more than one sentence, you just need a loop of page.open calls.
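A rough sketch of that approach, under some assumptions: page.open is asynchronous, so the "loop" is written as a chain of callbacks; the POST body is assumed to be empty, which the original does not confirm; and page.plainText is used instead of page.content so that any markup PhantomJS wraps around the response is dropped. fetchSentences is a hypothetical helper name.

```javascript
// Endpoint taken from the form's action attribute.
var SENTENCE_URL = 'http://watchout4snakes.com/wo4snakes/Random/NewRandomSentence';

// Fetch `count` sentences one after another.
// Assumption: `page` is a PhantomJS webpage object (require('webpage').create()).
function fetchSentences(page, count, results, done) {
  if (count === 0) { return done(results); }
  // page.open(url, method, data, callback) issues the POST; the body is assumed empty.
  page.open(SENTENCE_URL, 'post', '', function (status) {
    if (status !== 'success') { return done(results); } // stop on the first failure
    results.push(page.plainText);
    fetchSentences(page, count - 1, results, done);
  });
}

// Only runs under PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
  fetchSentences(require('webpage').create(), 3, [], function (sentences) {
    console.log(sentences.join('\n'));
    phantom.exit();
  });
}
```

Because the same page object is reused, PhantomJS only has to start once, which addresses the slow launch mentioned in the question.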
Related
Is there a way to re-execute JS without refreshing a page?
Say if I have a parent page and an inside page. When the inside page gets called, it gets called via ajax, replacing the content of the parent page. When user clicks back, I would like to navigate them back to the parent page without having to reload the page. But, the parent page UI relies on javascript so after they click back, I would like to re-execute the parent page's javascript. Is this possible?
Edit: Here is some code. I wrap my code in a function but where and how would you call this function?
function executeGlobJs() {
alert("js reload success");
}
You could use the html5 history-api:
In your click handler you call the pushState method, which stores the current state for later reuse:
$(document).on('click', 'a.link', function () {
// some ajax magic here to load the page content
// plus something that replaces the content...
// execute your custom javascript stuff that should be called again
executeGlobJs()
// replace the browser-url to the new url
// maybe a link where the user has clicked
history.pushState(data, title, url);
})
...later if the user browses back:
$(window).on('popstate', function () {
// the user has navigated back,
// load the content again (either via ajax or from a cache object)
// execute your custom stuff here...
executeGlobJs()
})
This is a pretty simple example and of course not perfect!
You should read more about it here:
https://css-tricks.com/using-the-html5-history-api/
https://developer.mozilla.org/en-US/docs/Web/API/History_API
For the ajax and DOM-related parts, you'll need to learn a bit about jQuery: http://api.jquery.com/jquery.ajax/. (It's all about the magic dollar sign.)
Another option would be the hashchange event, if you have to support older browsers.
You can encapsulate all your javascript into a function, and call this function on page load.
And eventually this will give you control of re-executing entire javascript without reloading the page.
This is common practise when you use any concat utility (eg. Gulp)
If you want to reload the script files as they would be on a page reload, have a look at this.
For all other script functions needed, just create a wrapper function as @s4n989 and @Rudolf Manusadzhyan wrote. Then execute that function when you need to reinitialize your page.
I'm having the same problem, and I don't use jQuery.
I don't have a solution yet. I think your problem is that the document.getElement... lookups are not run again after you add content, so my idea is to put all the element declarations in a function, and then call that function after the ajax call ends to get all the elements again.
So it might be something like that
function getElems() {
const elem = document.getElementsBy...
// ...further element declarations...
}
// At the end of the JS file, call the function:
getElems();
// Then, at the end of the ajax call's handler, call the function again.
Sorry, this is just something that came to mind on the fly while reading and thinking about the problem, which I have too :).
Hope it helps. I will try it too when I'm at the computer :)
I believe you are looking for a function called
.preventDefault();
Here's a link to better explain what it does - https://api.jquery.com/event.preventdefault/
Hope this helps!
EDIT:
By the way, if you want to execute the JS on back you can wrap the script inside of
$('.your-div').on('load', function(e) {
e.preventDefault();
//your JavaScript goes here
});
I have a simple javascript program that runs onclick of an image.
However, whenever I clicked the image, the page reloaded.
After a lot of debugging I found that the page doesn't reload until right as the script completes.
There are several setTimeouts in the code, but I noticed the page was reloading instantly. I even changed these timeouts to 15000 milliseconds, but it still reloads immediately.
I am using jquery, if it makes any difference.
I also want a different result from the program every time you click it, so that each click runs a different script and some text changes in a specific order. I did this by changing the onclick attribute of the image in each script to the name of the next script, so that script one switches onclick to script two, and so on. I set a timeout on these switches so that one click doesn't race through every single script. Script two isn't running on the first click, so that much works.
my code:
function getSounds() {
console.log("script. initiated");
$("#soundwebGetSoundDirections").html("Now, Wait until the file is done downloading and click below again.");
console.log("new message");
$("#soundwebGetSoundA").attr('href',"");
console.log("href eliminated");
setTimeout($("#soundwebGetSoundImg").attr('onclick','findFile()'),2000);
console.log("onclick to findFile()");
}
function findFile(){
console.log("FINDFILE")
$("#soundwebGetSoundDirections").html("Find the file(it's probably in your downloads), copy the path of the file (usually at the top of the file explorer) and paste in it the box below. Then, make sure there is a '/' at the end of the path and type 'Linkiness.txt' (case sensitive, without quotes) at the end. Once you have all that stuff typed, click the icon again.");
console.log("FIND IT, DARN IT!!");
$("#soundwebGetSoundPathInput").css("opacity",1);
console.log("diving into reader");
setTimeout($("#soundwebGetSoundImg").attr('onclick','readFile()'),1000);
}
function readFile(){
console.log("loading...");
$("#soundwebGetSoundDirections").html("loading...");
if(document.getElementById("soundwebGetSoundPathInput").value.length == 0){
setTimeout($("#soundwebGetSoundDirections").html("Please fill in Path!"),1000);
setTimeout(findFile(),2000);
}
}
and the HTML that's linked to,
<a id = "soundwebGetSoundA" href = "https://docs.google.com/feeds/download/documents/export/Export?id=1ynhHZihlL241FNZEar6ibzEdhHcWJ1qXKaxMUKM-DpE&exportFormat=txt">
<img onclick = "getSounds();" class = "soundwebImgResize" src = "https://cdn3.iconfinder.com/data/icons/glypho-music-and-sound/64/music-note-sound-circle-512.png" id = "soundwebGetSoundImg"/>
</a>
Thanks for any help,
Lucas N.
If you don't want clicking the image to cause the anchor tag to load the href, then move the image tag outside of the anchor tag.
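You aren't using setTimeout correctly. You should be passing in a function, not the result of calling one. As written, the expression runs immediately and only its return value is handed to setTimeout, which is why changing the delay made no difference. So, for example, instead of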
setTimeout($("#soundwebGetSoundDirections").html("Please fill in Path!"),1000);
setTimeout(findFile(),2000);
you should use
setTimeout(function () { $("#soundwebGetSoundDirections").html("Please fill in Path!") },1000);
setTimeout(findFile,2000);
I think the same goes for setting the onclick attribute but I've never tried dynamically changing an onclick attribute like that.
Since you're already using jQuery, you could try using .on('click', ...) and .off('click', ...) if your current setup isn't working.
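One way the .on/.off suggestion could look, as a sketch rather than a drop-in fix. bindSteps is a hypothetical helper; $img is assumed to be a jQuery object such as $("#soundwebGetSoundImg"), and steps is assumed to be an array of functions to run on successive clicks.

```javascript
// Run steps[0] on the first click, steps[1] on the second, and so on.
// After the last step has run, it stays bound and repeats on further clicks.
function bindSteps($img, steps, index) {
  index = index || 0;
  if (index >= steps.length) { return; }
  $img.off('click').on('click', function (event) {
    event.preventDefault();              // keep the surrounding <a> from navigating
    steps[index]();
    bindSteps($img, steps, index + 1);   // swap in the handler for the next step
  });
}

// Usage in the page (assumes jQuery and the three functions from the question):
// bindSteps($("#soundwebGetSoundImg"), [getSounds, findFile, readFile]);
```

Because the next handler is only bound after the current one has run, a single click cannot race through several steps, so no timeout is needed for the switch. Note that calling preventDefault also stops the download link from being followed, so this only fits if the download is triggered some other way; the inline onclick attribute on the image would need to be removed as well.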
So I recently started working on Greasemonkey scripts without much prior experience in JavaScript. It was going fine until I hit this roadblock.
I'm writing a script for a page that has a small table of information. If a link at the bottom is clicked, the table expands fully in the page to display all information. I need to call a function in Greasemonkey when this happens, however, the link doesn't appear to have an ID or anything I can actually reference to watch it. It's simply this:
When it's clicked, the table expands and it then shows as true. I initially used the following to expand the table upon loading the page, but that broke several things:
window.location.href = ('javascript: expandFullTable(false)');
I've attempted using "click", "onclick", and even "mouseover" to have Greasemonkey detect when it's pressed but nothing seems to work. From what I can tell it's simply a link that calls a function, but after some significant searching I wasn't able to find out anything about how to reference it in my script. I'm sure it's incredibly simple, but it's frustrated me to no end.
You can hijack the function like this:
var oldExpandFullTable = unsafeWindow.expandFullTable;
unsafeWindow.expandFullTable = function() {
// Do something
alert("You clicked on that thing!");
// Call the original function
oldExpandFullTable.apply(this, arguments);
};
But since you tagged this jquery, this should let you retrieve the link:
var link = $("a[href^=\"javascript: expandFullTable\"]");
It should work if jQuery is injected into your script with @require. If it's already in the page, you can add this before to access it: var $ = unsafeWindow.jQuery;.
And by the way, perhaps you should learn more about unsafeWindow to avoid security holes.
Page A:
$(document).ready(function () {
bindData();
});
function bindData() {
$('#searchbtn').bind('click', function () { SearchResult(); });
}
function SearchResult() {
ajax call...
}
Page A HTML:
<input type="button" id="searchbtn" />
Page B Details---> this page comes after selecting a specific search result from page A search list
Back<br />
Now when I go back to Page A, I can see my search criteria as they were selected, but the result div is gone. I want the search list to stay when the page comes back.
I think what I can do here is somehow fire the searchbtn click event again when the page comes back, so the list comes up again. Can anyone tell me how to fire the searchbtn click event only when the page comes back from Page B, or point me in the right direction?
Thanks
The Browser Back button has long been problematic with AJAX. There are scripts, workarounds, and techniques out there (depending on the framework that you want to use).
Since it appears that you are using jQuery (based on your posted JavaScript syntax), here is a link to another Stackoverflow post regarding back button jQuery plugins.
history.back() will return you to the last URL visited, meaning that any ajax calls made during the user's visit will not be automatically repeated. Your browser may automatically restore your form selections, but the SearchResult() function is only called by a click event, not a selection event.
You can bind URLs to ajax states using a framework like sammy.js. That way, history.back() would take you to a URL associated with SearchResult().
function bindData() {
var chkinput1 = $("input:checkbox[name=x]:checked").length;
var chkinput2 = $("input:checkbox[name=y]:checked").length;
if (chkinput1 > 0 && chkinput2 > 0) {
SearchResult();
}
$('#searchbtn').bind('click', function () { SearchResult(); });
}
I know this is far from the best way to achieve this result, but rather than add complexity with another plugin, we will go with this for now. If anyone else is looking at the same question, let me say again that this is not best practice: on returning via the history, we call SearchResult again based on the cached checkbox selections and repeat the whole ajax call to display the list. On the first request I cache the list with a sliding expiration, so it takes no time to come back, and everyone lives happily.
I am working on a Chrome extension for Facebook. If you use Facebook, you know that when you scroll down to the bottom of the news feed/timeline/profile, it shows more posts. The extension adds a button beside the "like" button, so I need to check whether there are more posts to add that button to.
Right now to check if the page has been modified, I use setInterval(function(){},2000).
I want to run a function when the user clicks the button. But this function doesn't work if I put it outside (or even inside) setInterval().
How can I check if the webpage has been modified WITHOUT using a loop?
Example:
$(document).ready(function(){
window.setInterval(function(){
$(".UIActionLinks").find(".dot").css('display','none');
$(".UIActionLinks").find(".taheles_link").css('display','none');
$(".miniActionList").find(".dot").css('display','none');
$(".miniActionList").find(".taheles_link").css('display','none');
//only this function doesn't work:
$(".taheles_link").click(function(){
$(".taheles_default_message").hide();
$(".taheles_saving_message").show();
});
//end
$(".like_link").after('<span class="dot"> · </span><button class="taheles_link stat_elem as_link" title="תגיד תכל´ס" type="submit" name="taheles" onclick="apply_taheles()" data-ft="{"tn":">","type":22}"><span class="taheles_default_message">תכל´ס</span><span class="taheles_saving_message">לא תכלס</span></button>');
$(".taheles_saving_message").hide();
}, 2000);
});
In the future, this extension will use AJAX, so setInterval() could cause even more problems for me.
If I understand correctly, you want to get a notification when the page's DOM changes, and you want to do this without using setInterval().
As your problem lies in attaching event handlers to elements that are created after the page has loaded, you might be interested in the jquery.live event attachment technique. I think it will solve your issue.
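For readers on newer jQuery: .live() was removed in jQuery 1.9, and delegated .on() is its replacement. A minimal sketch of the same idea follows; attachDelegatedClick is a hypothetical helper name, and the handler body is taken from the question.

```javascript
// Bind one click handler on `root` that also fires for .taheles_link elements
// added later, so nothing has to be re-bound when new posts load.
// `$` is passed in so the function does not depend on a global jQuery.
function attachDelegatedClick($, root) {
  $(root).on('click', '.taheles_link', function () {
    $('.taheles_default_message').hide();
    $('.taheles_saving_message').show();
  });
}

// In the extension's content script (only runs where jQuery and a DOM exist):
if (typeof jQuery !== 'undefined' && typeof document !== 'undefined') {
  attachDelegatedClick(jQuery, document);
}
```

The handler is registered once, outside any setInterval, which avoids stacking a new handler on every tick.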
In general you want the page to throw a mutation event. There is a mutation event spec that might be what you're looking for. Here are some links that might be useful.
http://tobiasz123.wordpress.com/2009/01/19/utilizing-mutation-events-for-automatic-and-persistent-event-attaching/
Detect element content changes with jQuery
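Mutation events have since been deprecated in favour of MutationObserver, which delivers the same kind of notification without polling. A small sketch, assuming a hypothetical helper watchForAddedNodes; the optional third parameter exists only so the function can be exercised outside a browser.

```javascript
// Call `onAdded(node)` for every node inserted anywhere under `root`.
// `ObserverCtor` defaults to the browser's MutationObserver.
function watchForAddedNodes(root, onAdded, ObserverCtor) {
  var Ctor = ObserverCtor || MutationObserver;
  var observer = new Ctor(function (mutations) {
    mutations.forEach(function (mutation) {
      // addedNodes is a NodeList, so borrow Array's forEach.
      Array.prototype.forEach.call(mutation.addedNodes, function (node) {
        onAdded(node);
      });
    });
  });
  // childList + subtree: report insertions at any depth below root.
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}

// In the page, something like:
// watchForAddedNodes(document.body, function (node) { /* add the button here */ });
```

Calling disconnect() on the returned observer stops the notifications.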
$(document).ready(function(){
setInterval(fun, 5000);
fun();
});
function fun()
{
alert(11)
}