Posts: 7
Threads: 1
Joined: Sep 2019
Sep-24-2019, 04:17 PM
(This post was last modified: Sep-24-2019, 04:24 PM by buran.)
Hi All,
I'm learning Python right now (this is actually my first thread, so let me know if there is a clearer way to ask my question),
and I want to retrieve the rates from the table at the following link:
https://www.global-rates.com/interest-ra...libor.aspx
however with the following code:
import urllib.request as ur
from bs4 import BeautifulSoup
url = input('Enter URL: ')
html = ur.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')
data = []
table = soup.find('table', attrs={'class':'lineItemsTable'})
table_body = table.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])
I receive this error:
Error:
Traceback (most recent call last):
  File "....\global_rates.py", line 11, in <module>
    table_body = table.find('tbody')
AttributeError: 'NoneType' object has no attribute 'find'
Can anybody help me on that?
Thanks a lot!
Gerald
Posts: 8,170
Threads: 160
Joined: Sep 2016
Sep-24-2019, 04:34 PM
(This post was last modified: Sep-24-2019, 05:47 PM by buran.)
I don't see a table tag with the class attribute lineItemsTable.
Also, this site uses JavaScript, so you need a tool like Selenium to render the page before you can access the content.
I had a typo in the class name and that confused me.
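To illustrate why the traceback above appears, here is a minimal sketch on a small inline HTML string (made up for this example, not the real page). `soup.find` returns `None` when no tag matches, and `html.parser` does not add a `<tbody>` that the markup itself lacks, so both lookups are guarded. The helper name `table_rows` is hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real page (an assumption for illustration).
SAMPLE_HTML = """
<table class="rates">
  <tr><td>Euro LIBOR - 1 month</td><td>-0.50200 %</td></tr>
  <tr><td>Euro LIBOR - 3 months</td><td>-0.42529 %</td></tr>
</table>
"""


def table_rows(html, css_class):
    """Return the cell texts of each row, or [] if no table has css_class."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": css_class})
    if table is None:
        # This is the case that raised AttributeError in the original code.
        return []
    # Fall back to the table itself when the markup has no <tbody>.
    body = table.find("tbody") or table
    rows = []
    for tr in body.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append([cell for cell in cells if cell])
    return rows


print(table_rows(SAMPLE_HTML, "lineItemsTable"))  # class not present
print(table_rows(SAMPLE_HTML, "rates"))
```

The first call returns an empty list instead of raising, which makes it easier to tell a wrong class name apart from a parsing problem.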
Posts: 7
Threads: 1
Joined: Sep 2019
Thanks Buran! I will follow the correct tagging going forward!
What would be the correct table tag to retrieve the table with the rates?
If I use 'tabledata1' I receive the same error:
import urllib.request as ur
from bs4 import BeautifulSoup
url = input('Enter URL: ')
html = ur.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')
data = []
table = soup.find('table', attrs={'class':'tabledata1'})
table_body = table.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])
Posts: 12,052
Threads: 488
Joined: Sep 2016
Sep-24-2019, 04:57 PM
(This post was last modified: Sep-24-2019, 05:19 PM by Larz60+.)
Try this:
(will scrape page and show all table elements)
You will need to install requests and lxml:
pip install requests lxml

import requests
from bs4 import BeautifulSoup

def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    # soup.find returns only the first <table> on the page, or None
    table = soup.find('table')
    if table is not None:
        trs = table.find_all('tr')
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')
            for n1, td in enumerate(tds):
                print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
                # td.prettify is not called here, so the bound method's repr is printed;
                # td.prettify() would print the formatted markup instead
                print(f"{td.prettify}")
    else:
        print("Could not find table")

def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.global-rates.com/interest-rates/libor/libor.aspx'
    scrape_url(url)
partial results:
Output:
------------------------------ tr_0, td_0 ------------------------------
<bound method Tag.prettify of <td>
<table cellpadding="0" cellspacing="0" style="width:100%;margin:10px 0px 0px 0px;">
<tr>
<td>
<img alt="" src="//www.global-rates.com/images/misc/ittybittyclear.gif" style="margin:3px 4px 3px 0px;"/>
</td>
<td align="right" valign="bottom">
<a href="//www.global-rates.com/"><img alt="English - worldwide actual interest rates and economic indicators" border="0" src="//www.global-rates.com/images/misc/gb.gif"/></a>
<a href="//nl.global-rates.com/"><img alt="Nederlands - actuele, internationale rentetarieven en economische indicatoren" border="0" src="//www.global-rates.com/images/misc/nl.gif"/></a>
<a href="//de.global-rates.com/"><img alt="Deutsch - aktuelle, internationale Zinssätze und Wirtschaftindikatoren" border="0" src="//www.global-rates.com/images/misc/de.gif"/></a>
<a href="//es.global-rates.com/"><img alt="Español - Español - tipos de interés e indicadores económicos actuales e internacionales" border="0" src="//www.global-rates.com/images/misc/es.gif"/></a>
<a href="//it.global-rates.com/"><img alt="Italiano - tassi d’interesse internazionali e sugli indicatori economici" border="0" src="//www.global-rates.com/images/misc/it.gif"/></a>
<a href="//fr.global-rates.com/"><img alt="Français - taux d’intérêts et indicateurs économiques actuelles et internationaux" border="0" src="//www.global-rates.com/images/misc/fr.gif"/></a>
<a href="//pt.global-rates.com/"><img alt="Português - taxas de juros actuais e internacionais e indicadores económicos" border="0" src="//www.global-rates.com/images/misc/pt.gif"/></a>
</td>
</tr>
</table>
</td>>
------------------------------ tr_0, td_1 ------------------------------
<bound method Tag.prettify of <td>
<img alt="" src="//www.global-rates.com/images/misc/ittybittyclear.gif" style="margin:3px 4px 3px 0px;"/>
</td>>
Posts: 7
Threads: 1
Joined: Sep 2019
Thanks Larz60+! I was able to run the code successfully, but it seems to pick up the wrong table.
It parses through the first table with all the different languages instead of the table with the rates.
How can I tweak the code to jump to the right table?
Output:
------------------------------ tr_0, td_0 ------------------------------
<bound method Tag.prettify of <td>
<table cellpadding="0" cellspacing="0" style="width:100%;margin:10px 0px 0px 0px;">
<tr>
<td>
<img alt="" src="//www.global-rates.com/images/misc/ittybittyclear.gif" style="margin:3px 4px 3px 0px;"/>
</td>
<td align="right" valign="bottom">
<a href="//www.global-rates.com/"><img alt="English - worldwide actual interest rates and economic indicators" border="0" src="//www.global-rates.com/images/misc/gb.gif"/></a>
<a href="//nl.global-rates.com/"><img alt="Nederlands - actuele, internationale rentetarieven en economische indicatoren" border="0" src="//www.global-rates.com/images/misc/nl.gif"/></a>
Posts: 5,151
Threads: 396
Joined: Sep 2016
Something is funky with that site, or I am just having a moment. If I loop over the tables with Larz's code modified to:
def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    tables = soup.find_all('table')
    for table in tables:
        print(table)
        print("------------------------------------------------------------")
    return
I am able to see the table with the rates. But if I go to the index of that table, tables[6], there are other tables before and after it (makes no sense).
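One likely explanation, assuming the page uses nested layout tables (the output earlier in the thread shows a table inside a td): `find_all('table')` searches recursively, so it returns outer tables as well as every table nested inside them, and printing an outer table prints everything it contains. A small sketch with made-up markup; the `id` attributes are invented for this example.

```python
from bs4 import BeautifulSoup

# Made-up nested layout, not the real page (an assumption for illustration).
NESTED_HTML = """
<table id="layout">
  <tr><td>
    <table id="languages"><tr><td>English</td></tr></table>
  </td></tr>
  <tr><td>
    <table id="rates"><tr><td>Euro LIBOR - 1 month</td><td>-0.50200 %</td></tr></table>
  </td></tr>
</table>
"""

soup = BeautifulSoup(NESTED_HTML, "html.parser")
tables = soup.find_all("table")

# The outer table comes first, then each nested table in document order.
all_ids = [t["id"] for t in tables]
print(all_ids)

# Keep only the innermost tables, i.e. those with no <table> inside them.
leaf_ids = [t["id"] for t in tables if t.find("table") is None]
print(leaf_ids)
```

Filtering for innermost tables like this is one way to make the index of the rates table easier to reason about.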
Posts: 7
Threads: 1
Joined: Sep 2019
Thanks everyone for your help :)
I was able to modify the code from Larz60+ a bit and get the correct table:
import requests
from bs4 import BeautifulSoup

lst = list()

def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find_all('table')[7]
    if table is not None:
        trs = table.find_all('tr')[12:13]
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')[:2]
            for n1, td in enumerate(tds):
                # print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
                # print(f"{td.prettify}")
                # print(td.contents)
                for key in td:
                    if key is not None:
                        lst.append(key)
    else:
        print("Could not find table")
    print(lst)

def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.global-rates.com/interest-rates/libor/libor.aspx'
    scrape_url(url)
My output is now a list with the following:
Output: ['\xa0', <a class="tabledatalink" href="/interest-rates/libor/european-euro/eur-libor-interest-rate-1-month.aspx" title="1 month European euro (EUR) LIBOR interest rate">Euro LIBOR - 1 month</a>, '-0.50200\xa0%']
However, I cannot work out how to turn this into a nice dictionary of tuples. I would like it to look like the following:
Output: {'Euro LIBOR - 1 month, -0.50200%', ...}
Could you help me to create this dictionary?
Thanks a lot!
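A possible way to get from the list shown above to a name/rate mapping, sketched on a single row rebuilt from that output. The surrounding `<tr>`/`<td>` markup and the shortened `href` are assumptions about the page, and the helper name `row_to_pair` is hypothetical. It reads the cell text with `get_text` instead of collecting the child nodes.

```python
from bs4 import BeautifulSoup

# One row rebuilt from the output above; the cell layout is an assumption.
ROW_HTML = (
    "<table><tr>"
    '<td>\xa0<a class="tabledatalink" href="/example.aspx">'
    "Euro LIBOR - 1 month</a></td>"
    "<td>-0.50200\xa0%</td>"
    "</tr></table>"
)


def row_to_pair(tr):
    """Return (name, rate) from the first two cells of a row, or None."""
    tds = tr.find_all("td")[:2]
    if len(tds) < 2:
        return None
    # strip=True drops whitespace-only strings such as the leading '\xa0'.
    name = tds[0].get_text(strip=True)
    # Remove the non-breaking space between the number and the percent sign.
    rate = tds[1].get_text(strip=True).replace("\xa0", "")
    if not name or not rate:
        return None
    return name, rate


soup = BeautifulSoup(ROW_HTML, "html.parser")
rates = {}
for tr in soup.find_all("tr"):
    pair = row_to_pair(tr)
    if pair is not None:
        name, rate = pair
        rates[name] = rate

print(rates)
```

Rows that lack either a name or a rate are skipped here; storing `None` for them would be an equally reasonable choice.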
Posts: 7
Threads: 1
Joined: Sep 2019
Hi All,
just wanted to let you know that I was able to write the code :)
Thanks for all your help!!
import requests
from bs4 import BeautifulSoup

d = dict()

def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find_all('table')[7]
    if table is not None:
        trs = table.find_all('tr')[8:15]
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')[:2]
            for n1, td in enumerate(tds):
                # print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
                # print(f"{td.prettify}")
                # print(td.contents)
                td = str(td)
                # first cell: the rate name is the text of the <a> tag
                if td.find('LIBOR')>0:
                    spos = td.find('">')
                    epos = td.find('</a>')
                    title = td[spos+2:epos]
                    d[title]=d.get(title,0)
                # second cell: the rate is the text before the '%' sign
                if td.find('%')>0:
                    spos = td.find('>')
                    epos = td.find('%')
                    rate = td[spos+1:epos-1]
                    d[title] = rate
    else:
        print("Could not find table")

def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.global-rates.com/interest-rates/libor/libor.aspx'
    scrape_url(url)
    print(d)
Output:
{'Euro LIBOR - overnight': '-0.56971', 'Euro LIBOR - 1 week': '-0.54743', 'Euro LIBOR - 2 weeks': 0, 'Euro LIBOR - 1 month': '-0.50200', 'Euro LIBOR - 2 months': '-0.44600', 'Euro LIBOR - 3 months': '-0.42529'}
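In the output above, 'Euro LIBOR - 2 weeks' has the value 0, which is the placeholder set by `d.get(title, 0)`; presumably no rate was parsed for that row (an assumption, since the page itself is not shown here). A small sketch of converting the scraped strings to floats while marking such entries as missing. The helper name `to_rate` is hypothetical and the sample dictionary is a subset of the output above.

```python
# A subset of the scraped output above.
scraped = {
    'Euro LIBOR - overnight': '-0.56971',
    'Euro LIBOR - 1 week': '-0.54743',
    'Euro LIBOR - 2 weeks': 0,
    'Euro LIBOR - 1 month': '-0.50200',
}


def to_rate(value):
    """Convert a scraped rate string to float; return None for anything else."""
    if not isinstance(value, str):
        # The integer 0 placeholder means no rate string was stored.
        return None
    try:
        return float(value.replace('\xa0', '').strip())
    except ValueError:
        return None


rates = {name: to_rate(value) for name, value in scraped.items()}
print(rates)
```

Using `None` rather than 0 keeps a missing rate from being mistaken for a real rate of zero percent.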
Posts: 7
Threads: 1
Joined: Sep 2019
Hi guys!
I tried the code Larz60+ sent earlier on a different website but it seems like it is unable to retrieve any data. I went through the code and the website details but couldn't figure out what causes the error. Can someone point me in the right direction - not sure what is causing this...
Thanks a lot for your help!!
Gerald
Website: https://www.barchart.com/forex/quotes/%5...erDir=desc
import requests
from bs4 import BeautifulSoup

def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find('table')
    if table is not None:
        trs = table.find_all('tr')
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')
            for n1, td in enumerate(tds):
                print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
                print(f"{td.prettify}")
    else:
        print("Could not find table")

def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.barchart.com/forex/quotes/%5EEURUSD/forward-rates?orderBy=bidPrice&orderDir=desc'
    scrape_url(url)
Output:
unable to retrieve https://www.barchart.com/forex/quotes/%5EEURUSD/forward-rates?orderBy=bidPrice&orderDir=desc
Posts: 12,052
Threads: 488
Joined: Sep 2016
Change:
    table = soup.find('table')
to
    table = soup.find_all('table')[tableno]
Replace tableno with the index of the desired table: 0 = first, 1 = second, etc.
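Note that the barchart output above comes from the else branch of scrape_url, so the response status was not 200 and the table lookup was never reached. A sketch that reports the status code and sends a browser-like User-Agent header follows. Whether this particular site accepts such a request is an assumption that has not been verified here; the header value and the helper names `build_request` and `fetch` are hypothetical.

```python
import requests

# Hypothetical header value; any browser-like string could be used here.
HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) example-scraper"}


def build_request(url):
    """Prepare a GET request carrying the custom headers, without sending it."""
    return requests.Request("GET", url, headers=HEADERS).prepare()


def fetch(url, timeout=10):
    """Send the request and report the status code instead of hiding it."""
    with requests.Session() as session:
        response = session.send(build_request(url), timeout=timeout)
    if response.status_code != 200:
        print(f"unable to retrieve {url}: HTTP {response.status_code} {response.reason}")
        return None
    return response.content
```

Calling `fetch(url)` and passing the result to parsepage (when it is not None) would show which status code the site returns, which helps narrow down whether the request is being rejected or the content is rendered by JavaScript.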