First I use Python and Selenium to load a website in Firefox. Then I fill in a simple JavaScript-driven form. The site is poorly made, but usually if I tell Selenium to send Keys.RETURN, a list of options drops down. The problem is that I don't know how to click on one of these options, because they didn't load with the web page. I tried using Keys.ARROW_DOWN to step through them, but it still doesn't really work.
How can I interact with JavaScript through Selenium using Python?
Thanks.
P.S. I know almost nothing about JavaScript, so even if there is a way to do it with JavaScript, I would still be clueless about how to use it anyway...
You might have to tell the browser to wait a few milliseconds for the options to load.
Some places to look:
Clicking on a Javascript Link on Firefox with Selenium
http://seleniumhq.org/docs/04_webdriver_advanced.jsp
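For instance, here is a minimal sketch using Selenium's explicit waits; the URL, field id, and CSS selector are hypothetical stand-ins for whatever the real site uses:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get('http://example.com/form')  # hypothetical URL

field = driver.find_element(By.ID, 'search')  # hypothetical field id
field.send_keys('query')
field.send_keys(Keys.RETURN)

# Wait up to 10 seconds for the JavaScript-generated options to appear,
# then click the first one. The CSS selector is a hypothetical example.
option = WebDriverWait(driver, 10).until(
    EC.element_to_be_clickable((By.CSS_SELECTOR, '#suggestions li'))
)
option.click()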
I recently posted a question about how to log in to Twitter using the requests library. I finally got a solution for that, but now I am facing another problem: I am able to scrape only the content that is initially visible on the page. How do I scrape the dynamically loaded content on that page?
Note: I am not using Selenium. Please suggest some other means to do this.
How to load dynamic content and then scrape it?
Without using something like Selenium or another browser (headless or otherwise) that will actually run the JavaScript in a normal-ish manner, the only other method is to manually reverse-engineer the JavaScript, see what kind of calls it's making, and make them yourself directly.
There isn't any other kind of "one-size-fits-all" solution.
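As a rough sketch of that reverse-engineering approach, suppose the page's JavaScript fetches results from a JSON endpoint; the URL, parameters, and header below are hypothetical examples, so find the real ones in your browser's network panel:

import requests

session = requests.Session()
# Replay the XHR request the page's JavaScript would have made.
response = session.get(
    'https://example.com/api/timeline',  # hypothetical endpoint
    params={'page': 2},
    headers={'X-Requested-With': 'XMLHttpRequest'},
)
data = response.json()  # such endpoints often return JSON rather than HTML
print(data)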
Is it possible to autorun a script in Firefox, for example from Scratchpad?
I want to add one button to a website without making a full extension, because I don't want to use XUL and RDF.
Could I perhaps make an add-on containing only a JS file?
If you use Python, then you can use Selenium to mimic a person clicking in the browser by writing a Python script.
Besides that, Python's mechanize module is also good for automating interactions with websites.
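For instance, a minimal mechanize sketch for submitting a form; the URL and field names are hypothetical, and note that mechanize does not execute JavaScript:

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)  # mechanize honours robots.txt by default
br.open('http://example.com/login')  # hypothetical URL
br.select_form(nr=0)       # pick the first form on the page
br['username'] = 'me'      # hypothetical field names
br['password'] = 'secret'
response = br.submit()
print(response.read())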
I need to scrape a site with Python. I obtain the HTML source code with the urllib module, but I also need to scrape some HTML that is generated by a JavaScript function (which is included in the HTML source). What this function does on the site is output some HTML code when you press a button. How can I "press" this button with Python code? Can Scrapy help me? I captured the POST request with Firebug, but when I try to pass it to the URL I get a 403 error. Any suggestions?
In Python, I think Selenium 1.0 is the way to go. It’s a library that allows you to control a real web browser from your language of choice.
You need to have the web browser in question installed on the machine your script runs on, but it looks like the most reliable way to programmatically interrogate websites that use a lot of JavaScript.
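A rough sketch using the Selenium 1.0 (RC) Python client; it assumes the Selenium RC server is already running on port 4444, and the site URL and button locator are hypothetical:

from selenium import selenium
import time

sel = selenium('localhost', 4444, '*firefox', 'http://example.com/')  # hypothetical site
sel.start()
sel.open('/')
sel.click('id=showResults')   # hypothetical locator for the button
time.sleep(2)                 # crude wait for the AJAX response to render
html = sel.get_html_source()  # HTML after the JavaScript has run
sel.stop()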
Since there is no comprehensive answer here, I'll go ahead and write one.
To scrape JS-rendered pages, we will need a browser that has a JavaScript engine (i.e., supports JavaScript rendering).
Options like mechanize or urllib2 will not work, since they DO NOT support JavaScript.
So here's what you do:
Set up PhantomJS to run with Selenium. After installing the dependencies for both of them (refer this), you can use the following code as an example to fetch a fully rendered website.
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.PhantomJS()
driver.get('http://jokes.cc.com/')
# page_source fetches the page after rendering is complete
soupFromJokesCC = BeautifulSoup(driver.page_source, 'html.parser')
driver.save_screenshot('screen.png')  # save a screenshot to disk
driver.quit()
I have had to do this before (in .NET), and you are basically going to have to host a browser, get it to click the button, and then interrogate the browser's DOM (Document Object Model) to get at the generated HTML.
This is definitely one of the downsides of web apps moving towards an Ajax/JavaScript approach to generating HTML client-side.
I use WebKit, which is the browser renderer behind Chrome and Safari. There are Python bindings to WebKit through Qt, and they can be used to execute JavaScript and extract the final HTML.
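Here is a minimal sketch of that approach, assuming PyQt4 with its QtWebKit bindings is installed; the Render class name and the URL are illustrative:

import sys
from PyQt4.QtGui import QApplication
from PyQt4.QtCore import QUrl
from PyQt4.QtWebKit import QWebPage

class Render(QWebPage):
    """Load a URL in an off-screen WebKit page and keep the final DOM."""
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._load_finished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()  # block until loadFinished fires

    def _load_finished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()

r = Render('http://example.com')  # hypothetical URL
html = r.frame.toHtml()           # HTML after the JavaScript has run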
For Scrapy (a great Python scraping framework) there is scrapyjs: an additional downloader handler / middleware handler able to scrape JavaScript-generated content.
It's based on the WebKit engine via pygtk, python-webkit, and python-jswebkit, and it's quite simple to use.
I want to interact with my local HTML page through my C++ application, just like using the JavaScript console, where we can edit a page in real time, e.g.
document.getElementById('divlayer').style.visibility = 'hidden';
Similarly, I want to call such functions in real time through my application.
Can you give me some idea if there is a way to accomplish this job?
I am using Google Chrome at the moment.
Do I need some plugin? If so, how can I make a plugin that interacts with my application?
Also, I heard about jQuery; can this be done using jQuery? Or do I have to try some server mechanism, maybe using Ajax?
I know that you can control the IE browser with COM on Windows, and you can interact with the page through it. I didn't try it with C++; I have only used it with Python, and it works well. Maybe you'd like to check it out.
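For example, a minimal Python sketch using the pywin32 COM bindings to drive Internet Explorer; the URL is hypothetical, and the element id reuses the one from the question:

import time
import win32com.client

ie = win32com.client.Dispatch('InternetExplorer.Application')
ie.Visible = True
ie.Navigate('http://localhost/page.html')  # hypothetical local page

# Wait until the page has finished loading (ReadyState 4 = complete)
while ie.Busy or ie.ReadyState != 4:
    time.sleep(0.1)

# Run the same kind of DOM manipulation you would type in the JS console
ie.Document.getElementById('divlayer').style.visibility = 'hidden'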
This is kind of tricky. There is a webpage which, I am guessing, uses some kind of AJAX to pull in content based on the search query. When I fetch the page using GET in Perl, it fetches the script code behind the PHP/HTML, but not the results which are displayed when the query is entered manually. I need to be able to fetch the content of the results page. Is there any way to do this in Perl?
Take a look at Selenium RC and the WWW::Selenium module in Perl. With them you can control a real web browser.
Another option is WWW::HtmlUnit which uses the HtmlUnit Java library to execute the JavaScript without a web browser. WWW::HtmlUnit uses Inline::Java to give Perl access to the library. I have found that when installing, it is best to say No to the question "Do you wish to build the JNI extension?".
If you are writing tests that need to check the rendered page, you can have a look at Schwern's javascript-tap-harness, which works with Selenium and handles all the scaffolding.
I also found Using WWW::Selenium To Test Or Automate An Ajax Website pretty useful.