Web scraping javascript based websites using selenium gives error

I've been working on a project that scrapes a few numbers from a JavaScript-based website and sends them to a specific Discord server. Everything works except the scraping itself. When I try to get the numbers, this error pops up:
Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\cukor4_dry.py", line 48, in <module>
    element = wait.until(EC.visibility_of_element_located((By.ID, "mainbgsection")))
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Code I use:
#import libraries
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup
import time
from twill.commands import *
import pyautogui
import os
import subprocess
from dhooks import Webhook, File
import sys
#set settings
chrome_options = webdriver.ChromeOptions()
webdriver = webdriver.Chrome("chromedriver.exe", options=chrome_options)
hook = Webhook('webhook link')
time.sleep(4)
print('form')
showforms()
try:
    #try to log into page
    webdriver.get('url')
    webdriver.find_element_by_id('username').send_keys('username')
    webdriver.find_element_by_id('password').send_keys('password')
    webdriver.find_element_by_name('actionButton').click()
    print('submit')
except:
    #already logged in
    pass
print('waited')
#try to scrape the website
url = "url"
webdriver.get(url)
wait = WebDriverWait(webdriver, 10)
element = wait.until(EC.visibility_of_element_located((By.ID, "mainbgsection")))

Related

Python web scraping with requests sign in

I am working with www.freightquote.com, and at some point I need to sign in; otherwise I am not allowed to get freight rates for more than 45 pairs.
I would like to enter the sign-in information for this website, but for some reason it is not working, and I cannot work out what the problem is.
You can go directly to this page: https://account.chrobinson.com/
I have a problem entering the information I am asked for. Here is what I did:
from selenium import webdriver
from time import sleep
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.service import Service
PATH = r'C:\Users\b\Desktop\Webscraping\chromedriver.exe'
s= Service(PATH )
driver = webdriver.Chrome(service=s)
link = "https://www.freightquote.com/book/#/free-quote/pickup"
driver.get(link)
sleep(2)
driver.maximize_window()
sleep(2)
driver.find_elements(by=By.XPATH, value = '//button[@type="button"]')[0].click()
sleep(3)
#Username:
driver.find_element(by=By.XPATH, value='//input[@type="email"]').send_keys('USERNAME')
driver.find_elements(by=By.XPATH, value = '//input[@class="button button-primary" and @type="submit"]')[0].click()
#password
driver.find_element(by=By.XPATH, value='//input[@type="password"]').send_keys('PASSWORD')
driver.find_elements(by=By.XPATH, value = '//input[@class="button button-primary" and @type="submit"]')[0].click()
sleep(2)

Your code and technique have several problems; it would be worth learning Selenium properly before writing more code.
I modified your code up to the point of entering the email; please complete the rest accordingly.
driver = webdriver.Chrome()
link = "https://www.freightquote.com/book/#/free-quote/pickup"
driver.get(link)
driver.maximize_window()
WebDriverWait(driver, 30).until(
    EC.presence_of_element_located((By.XPATH,
        '(//button[@type="button"])[1]'))).click()
WebDriverWait(driver, 30).until(
    EC.presence_of_element_located((By.XPATH,
        '//input[@type="email"]'))).send_keys('USERNAME')
Also, you don't need to hard-code the chromedriver path in your code if the executable can be found on your PATH. For example, on Windows or Linux you can put it in your virtualenv's bin folder (named Scripts on Windows), and on macOS in /usr/local/bin.
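A quick way to check whether the driver is already discoverable is to ask Python where it would find it. This is only a sketch using the standard library; the helper name find_driver is made up for illustration and is not part of Selenium.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable if it is on PATH, else None.

    find_driver is a hypothetical helper, not a Selenium function.
    """
    return shutil.which(name)


path = find_driver()
if path:
    print("chromedriver found at:", path)
else:
    print("chromedriver is not on PATH")
```

If this reports that the driver is not on PATH, either move the executable into one of the folders mentioned above or keep passing its location explicitly.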

To enter sign in information for the website you need to induce WebDriverWait for the element_to_be_clickable() and you can use the following locator strategies:
Using CSS_SELECTOR:
driver.get("https://account.chrobinson.com/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='username']"))).send_keys("Ribella")
driver.find_element(By.CSS_SELECTOR, "input[name='password']").send_keys("Ribella")
driver.find_element(By.CSS_SELECTOR, "input[value='Sign In']").click()
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
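For readers wondering what the explicit wait is doing: conceptually it calls a condition over and over until the condition returns something truthy or a timeout expires. The sketch below imitates that idea with the standard library only. It is not Selenium's implementation; wait_until and the fake condition are hypothetical names used for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if nothing truthy is returned within `timeout` seconds.
    Hypothetical helper that only mimics the idea behind an explicit wait.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Stand-in for a page whose element only shows up on the third poll.
attempts = {"count": 0}


def element_appears_on_third_poll():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


found = wait_until(element_appears_on_third_poll, timeout=2.0, poll_interval=0.01)
print(found, "after", attempts["count"], "polls")
```

This is why a wait is more reliable than a fixed sleep on JavaScript-heavy pages: it returns as soon as the element is ready and fails with a clear timeout when the element never shows up.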

Clicking a button in JavaScript page - Selenium/Python

My code accesses a page, and I am trying to click on the button that says "Physician Program" on the menu list. If you click on this on the browser, it directs you to a new webpage.
However, there is no href in the page's HTML that would help me find this link via code (I assume because it is JavaScript?). Currently, I just use its XPath.
My question is: if I am able to click on it in a browser, shouldn't I be able to click on it using Selenium? If so, how can this be done?
import time
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.kidney.org/spring-clinical/program')
time.sleep(6)
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')
element1 = driver.find_element_by_xpath('//*[@id="dx-c7ad8807-6124-b55e-d292-29a4389dee8e"]/div')
element1.click()

The element is inside an iframe, so you need to switch to the iframe first:
driver.switch_to.frame("SCM20 Advanced Practitioner Program")
element1 = driver.find_element_by_xpath("//div[text()='Physician Program']")
element1.click()
Ideally you should use WebDriverWait and wait for the frame to be available:
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.NAME,"SCM20 Advanced Practitioner Program")))
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//div[text()='Physician Program']"))).click()
You need to import the libraries below:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import subprocess
#other imports
import time
from bs4 import BeautifulSoup
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.kidney.org/spring-clinical/program')
time.sleep(6)
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'html.parser')
frame = WebDriverWait(driver, 10).until(EC.presence_of_element_located(
    (By.NAME, "SCM20 Advanced Practitioner Program")))
driver.switch_to.frame(frame)
options = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located(
    (By.CSS_SELECTOR, '[class="track-selector-popup"] [role="option"]')))
options[0].click()
input()
The element is inside an iframe, so switch to it and use waits. To switch back and interact with elements outside the frame, use:
driver.switch_to.default_content()

How to click on the Cookie numerous times in-order to play cookie clicker within https://orteil.dashnet.org/cookieclicker/ using Selenium and Python

I was trying to make a simple Selenium program to play Cookie Clicker, but I can't figure out why it is not working. Here is my code:
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
PATH = 'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(PATH)
driver.get('https://orteil.dashnet.org/cookieclicker/')
driver.implicitly_wait(5)
cookie = driver.find_elements_by_id('bigCookie')
cookie_count = driver.find_elements_by_id('cookies')
items = [driver.find_elements_by_id('productPrice' + str(i)) for i in range (1,-1,-1)]
actions = ActionChains(driver)
actions.click(cookie)
for i in range(5000):
    actions.perform()
And here is the error I was getting:
Traceback (most recent call last):
  File "c:/Users/ffl_s/Desktop/Botting/My Bot/cookie.py", line 15, in <module>
    actions.click(cookie)
  File "C:\Users\ffl_s\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\common\action_chains.py", line 102, in click
    self.move_to_element(on_element)
  File "C:\Users\ffl_s\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\common\action_chains.py", line 273, in move_to_element
    self.w3c_actions.pointer_action.move_to(to_element)
  File "C:\Users\ffl_s\AppData\Local\Programs\Python\Python38-32\lib\site-packages\selenium\webdriver\common\actions\pointer_actions.py", line 42, in move_to
    raise AttributeError("move_to requires a WebElement")
AttributeError: move_to requires a WebElement
PS C:\Users\ffl_s\Desktop\Botting\My Bot> [21704:18120:0918/223803.402:ERROR:device_event_log_impl.cc(208)] [22:38:03.402] Bluetooth: bluetooth_adapter_winrt.cc:1074 Getting Default Adapter failed.

If you want to click 5000 times and display the cookies text, you could do this.
Just pip install webdriver-manager to take care of the driver binary as well.
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.action_chains import ActionChains
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://orteil.dashnet.org/cookieclicker/')
for i in range(5000):
    # find_element (singular) returns one WebElement, which ActionChains requires
    ActionChains(driver).move_to_element(driver.find_element_by_id('bigCookie')).click().perform()
items = driver.find_element_by_id('cookies')
print(items.text)

To click on the cookie numerous times in order to play Cookie Clicker, you need to induce WebDriverWait for element_to_be_clickable(), and you can use the following locator strategy:
Using CSS_SELECTOR:
driver.get('https://orteil.dashnet.org/cookieclicker/')
for i in range(100):
    driver.execute_script("arguments[0].click();", WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#bigCookie"))))
print(driver.find_element_by_css_selector("#cookies").text)
Console Output:
80 cookies
per second : 0
Note: You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Python Selenium BeautifulSoup Page Source Does Not Display Everything

Goal:
Hello, I am pretty new to web development and Selenium. I am currently trying to grab a value from my JIRA board.
Problem:
For some reason that value does not show up in the page source. I think it might be a JavaScript-rendered value, or maybe it gets generated after the page loads. I tried using implicitly_wait, WebDriverWait, and switch_Frame, but nothing seems to work.
Code:
#!/usr/local/bin/python2.7
#import requests
import json
import base64
import sys
import getopt
import argparse
from datetime import datetime
from datetime import timedelta
from bs4 import BeautifulSoup
from jira import JIRA
from jira.client import GreenHopper
import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
JIRA_INSTALLATION = "jira.turn.com"
STATE_IN_PROGRESS = "In Progress"
STATE_RESOLVED = "Resolved"
STATE_CLOSED = "Closed"
options = {'server': 'https://jira.turn.com'}
CUR_TIMEZONE_SHIFT = timedelta(hours=7)
def main(argv):
    p=argparse.ArgumentParser(description="Gets a set of completed stories and list them with their implementation time.")
    p.add_argument('filter_id', help="Id of the filter that contains completed stories.")
    p.add_argument('-u', dest='username', help="JIRA username. Needs to have read access to all tickets returned from the search filter.")
    p.add_argument('-p', dest='password', help="Password for the JIRA user to use for API calls.")
    args = p.parse_args(argv)
    driver = webdriver.Firefox()
    driver.get('https://jira.turn.com/')
    driver.find_element_by_id("login-form-username").send_keys(args.username)
    driver.find_element_by_id ("login-form-password").send_keys(args.password)
    driver.find_element_by_id("login").click()
    #driver.implicitly_wait(10)
    #ele = WebDriverWait(driver, 10)
    driver.get('https://jira.turn.com/secure/RapidBoard.jspa?rapidView=184&view=reporting&chart=controlChart&days=30&column=1214&column=1298')
    #WebDriverWait
    soup_level1 = BeautifulSoup(driver.page_source, 'lxml')#'html.parser')#'lxml')
    print soup_level1.find(id='ghx-chart-snapshot')#find(id='content').find(id="gh").find(id="ghx-content-main").find(id="ghx-chart-header"))
    print soup_level1.find(id='ghx-chart-snapshot').find(id='ghx-chart-snapshot')
    driver.quit()
    return
if __name__ == "__main__":
    main(sys.argv[1:])
Output:
<div id="ghx-chart-snapshot"></div>
None

Failing to scrape web data with Selenium

I'm trying to fetch data from the front page table on https://icostats.com/. But something just isn't clicking.
from selenium import webdriver
browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").s()
posts = browser.find_element_by_class_name("tdPrimary-0-75")
for post in posts:
    print(post.text)
The errors I'm getting:
C:\Python36\python.exe C:/.../PycharmProjects/PyQtPS/ICO_spyder.py
Traceback (most recent call last):
  File "C:/.../PycharmProjects/PyQtPS/ICO_spyder.py", line 5, in <module>
    browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[1]/div[2]""").click()
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 313, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 791, in find_element
    'value': value})['value']
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "C:\Python36\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="app"]/div/div[2]/div[2]/div[2]/div[1]/div[2]"}
  (Session info: chrome=59.0.3071.115)
  (Driver info: chromedriver=2.30.477700 (0057494ad8732195794a7b32078424f92a5fce41),platform=Windows NT 6.1.7600 x86_64)
EDIT
Finally got it working:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait
browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
wait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#app > div > div.container-0-16 > div.table-0-20 > div.tbody-0-21 > div:nth-child(2) > div:nth-child(8)")))
posts = browser.find_elements_by_class_name("thName-0-55")
for post in posts:
    print(post.text)
posts = browser.find_elements_by_class_name("tdName-0-73")
for post in posts:
    print(post.text)
Is there any way to iterate over every header/column and export it to a csv file without having to go through each class like this?
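On the CSV part of the question, one possible approach is to collect the header texts and the cell texts first and then let the csv module write them out. The sketch below uses made-up sample data instead of a live page, and it assumes the cells arrive as one flat list in row order with exactly one cell per header; rows_from_cells and write_table are hypothetical helper names.

```python
import csv
import io


def rows_from_cells(headers, cells):
    """Split a flat, row-ordered list of cell texts into rows of len(headers)."""
    width = len(headers)
    return [cells[i:i + width] for i in range(0, len(cells), width)]


def write_table(fileobj, headers, cells):
    """Write one header row followed by the data rows to an open text file."""
    writer = csv.writer(fileobj)
    writer.writerow(headers)
    writer.writerows(rows_from_cells(headers, cells))


# Made-up sample data standing in for the scraped texts.
headers = ["Name", "Price"]
cells = ["Alpha", "1.00", "Beta", "2.50"]

buffer = io.StringIO()
write_table(buffer, headers, cells)
print(buffer.getvalue())
```

With Selenium, the two lists could be built from the .text of the elements returned by find_elements calls like the ones above. To write a real file, open it with open("out.csv", "w", newline="") and pass that instead of the StringIO buffer.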

The required data is generated dynamically by JavaScript. You need to wait until it is present on the page:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait as wait
browser = webdriver.Chrome(executable_path=r'C:\Scrapers\chromedriver.exe')
browser.get("https://icostats.com")
wait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div#app>div")))
# find_elements (plural) returns a list, which is what the loop below needs
posts = browser.find_elements_by_class_name("tdPrimary-0-75")
for post in posts:
    print(post.text)

It seems there is no s() method, which this line calls:
browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").s()
so what you need might be
browser.find_element_by_xpath("""//*[@id="app"]/div/div[2]/div[2]/div[2]/div[2]/div[8]/span/span""").text
Since you want to iterate on the results, this line:
posts = browser.find_element_by_class_name("tdPrimary-0-75")
should be
posts = browser.find_elements_by_class_name("tdPrimary-0-75")
