Python Forum

I am looking to download the tables in this URL for different selections like Crude, Oil etc using Python but I am a beginner and I tried using Selenium to get the table but I don't know how to export this format to a file like excel or csv. I am not even sure if this is the correct approach.

http://www.jodidb.org/TableViewer/tableV...rtId=93904

Below is my code so far

from selenium import webdriver
import chromedriver_binary
import pandas as pd
driver = webdriver.Chrome()

driver.get('http://www.jodidb.org/TableViewer/tableView.aspx?ReportId=93904')

    elem = driver.find_element_by_class_name('ExplorerContentPane').click()

Any help is appreciated.

you are missing a lot of code.
you need to set options, start selenium, do your work and then stop selenium.
start with the following two tutorials (which will only take a short time to complete)
web scraping part 1
web scraping part 2 (this is main one for selenium)

Thank you for the links. They were useful. I was able to simulate most of the filters I need to change using Selenium but unfortunately the final step involves clicking on download to excel or csv and that link opens a new window. Not sure how to go about handling that. I tried find_element_by_xpath on the download but that doesnt open the new tab. Please help.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
from bs4 import BeautifulSoup
import requests
import pandas as pd


 
#browser = webdriver.Chrome()
#--| Setup
chrome_options = Options()
#chrome_options.add_argument("--headless")
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--log-level=3')
browser = webdriver.Chrome(chrome_options=chrome_options)
# chrome_options.add_argument("headless")
# browser = webdriver.Chrome(options=chrome_options)

browser.get('http://www.jodidb.org/TableViewer/tableView.aspx?ReportId=93904')


wait = WebDriverWait(browser, 30)


elem = wait.until(EC.element_to_be_clickable((By.ID, 'DnDTmp')))

elem.click()
#SelectAll(0)
elem_selectAllNone = browser.find_element_by_xpath('//*[@id="w_itemSelContainer"]/table[2]/tbody/tr[2]/td[3]')
elem_selectAllNone.click()

elem_selectcheck = wait.until(EC.element_to_be_clickable((By.ID, 'SelectRange')))  
elem_selectcheck.click()

elem_startdate = wait.until(EC.element_to_be_clickable((By.ID, 'WD_DateLower')))  
elem_startdate.send_keys('12/2019')

elem_enddate = browser.find_element_by_id('WD_DateUpper')
elem_enddate.send_keys('05/2020')

elem_search = browser.find_element_by_xpath('//*[@id="w_itemSelContainer"]/table[1]/tbody/tr/td[2]/input[1]')
elem_search.click()


elem_SelectAllAll = browser.find_element_by_xpath('//*[@id="w_itemSelContainer"]/table[2]/tbody/tr[1]/td[3]/a')

# driver.execute_script("arguments[0].click();", element) 
elem_SelectAllAll.click()

elem_showtbl = browser.find_element_by_xpath('//*[@id="cell4"]/table[2]/tbody/tr[1]/td[4]/input')
elem_showtbl.click()

# Save the window opener (current window)

elem_menudwnload = browser.find_element_by_xpath('//*[@id="MenuCell_Download"]')
elem_menudwnload.click()



#time.sleep(30) #seconds
browser.close()

My post in the tutorial 2 explains a little about switching to new tabs.

(Aug-02-2020, 10:50 PM)gj31980 Wrote: [ -> ]but unfortunately the final step involves clicking on download to excel or csv and that link opens a new window

When i run your code and select download csv file it opens a new tab, not a new window. When a new tab opens you can run

browser.switch_to.window(browser.window_handles[1])

to switch to the next tab and select download from it.

As for getting the xpath, i was able to get it from expanding the html and reselecting inspect to again get the correct xpath.

Put this at the end of your script and it downloads for me

# Save the window opener (current window)
 
elem_menudwnload = browser.find_element_by_xpath('//*[@id="MenuCell_Download"]')
elem_menudwnload.click()
 
drop_down_download = browser.find_element_by_xpath('//*[@id="ActDiv"]/table/tbody/tr/td[5]/a/img')
drop_down_download.click()
csv = browser.find_element_by_xpath('//*[@id="MenuCell_DownloadDiv"]/p[3]/nobr/a')
csv.click()
browser.switch_to.window(browser.window_handles[1])
wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="cell4"]/input[1]')))
download = browser.find_element_by_xpath('//*[@id="cell4"]/input[1]')
download.click()


time.sleep(3000) #seconds
browser.close()

Im assuming you have enough knowledge to modify it enough to change to excel if you want that instead.

Thank you so much for your help!!! That worked perfectly.
I was not able to get below by inspecting myself but I will figure it out.

drop_down_download = browser.find_element_by_xpath('//*[@id="ActDiv"]/table/tbody/tr/td[5]/a/img')
drop_down_download.click()

gj31980

Larz60+

gj31980

metulburr

gj31980