
Table data with BeautifulSoup
#1
Hi All,

I'm learning Python right now (and this is actually my first thread, so let me know if there is a way to ask my question in a clearer manner),
and I want to retrieve the rates from the table in the attached link:
https://www.global-rates.com/interest-ra...libor.aspx

however with the following code:

import urllib.request as ur
from bs4 import BeautifulSoup

url = input('Enter URL: ')

html = ur.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')

data = []
table = soup.find('table', attrs={'class':'lineItemsTable'})
table_body = table.find('tbody')

rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])
I receive this error:

Error:
Traceback (most recent call last):
  File "....\global_rates.py", line 11, in <module>
    table_body = table.find('tbody')
AttributeError: 'NoneType' object has no attribute 'find'
Can anybody help me on that?

Thanks a lot!
Gerald
buran wrote Sep-24-2019, 04:24 PM:
Please use proper tags when posting code, tracebacks, output, etc. This time I have added the tags for you.
See BBcode help for more info.
Quote
#2
I don't see a table tag with the class attribute lineItemsTable.
Also, this site uses JavaScript, so you would need a tool like Selenium to render the webpage and be able to access the content.
I had a typo in the class name and that confused me.
Quote
#3
Thanks Buran! I will follow the correct tagging going forward!

What would be the correct table tag to retrieve the table with the rates?
If I use 'tabledata1' I receive the same error:

import urllib.request as ur
from bs4 import BeautifulSoup

url = input('Enter URL: ')

html = ur.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')

data = []
table = soup.find('table', attrs={'class':'tabledata1'})
table_body = table.find('tbody')

rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])
Quote
#4
Try this (it will scrape the page and show all the table elements).
You will need to install requests and lxml:
pip install requests lxml
import requests
from bs4 import BeautifulSoup


def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find('table')
    if table is not None:
        trs = table.find_all('tr')
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')
            for n1, td in enumerate(tds):
                print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
                print(td.prettify())
    else:
        print("Could not find table")

def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.global-rates.com/interest-rates/libor/libor.aspx'
    scrape_url(url)
partial results:
Output:
------------------------------ tr_0, td_0 ------------------------------ <bound method Tag.prettify of <td> <table cellpadding="0" cellspacing="0" style="width:100%;margin:10px 0px 0px 0px;"> <tr> <td> <img alt="" src="//www.global-rates.com/images/misc/ittybittyclear.gif" style="margin:3px 4px 3px 0px;"/> </td> <td align="right" valign="bottom"> <a href="//www.global-rates.com/"><img alt="English - worldwide actual interest rates and economic indicators" border="0" src="//www.global-rates.com/images/misc/gb.gif"/></a> <a href="//nl.global-rates.com/"><img alt="Nederlands - actuele, internationale rentetarieven en economische indicatoren" border="0" src="//www.global-rates.com/images/misc/nl.gif"/></a> <a href="//de.global-rates.com/"><img alt="Deutsch - aktuelle, internationale Zinssätze und Wirtschaftindikatoren" border="0" src="//www.global-rates.com/images/misc/de.gif"/></a> <a href="//es.global-rates.com/"><img alt="Español - tipos de interés e indicadores económicos actuales e internacionales" border="0" src="//www.global-rates.com/images/misc/es.gif"/></a> <a href="//it.global-rates.com/"><img alt="Italiano - tassi d'interesse internazionali e sugli indicatori economici" border="0" src="//www.global-rates.com/images/misc/it.gif"/></a> <a href="//fr.global-rates.com/"><img alt="Français - taux d'intérêts et indicateurs économiques actuelles et internationaux" border="0" src="//www.global-rates.com/images/misc/fr.gif"/></a> <a href="//pt.global-rates.com/"><img alt="Português - taxas de juros actuais e internacionais e indicadores económicos" border="0" src="//www.global-rates.com/images/misc/pt.gif"/></a> </td> </tr> </table> </td>> ------------------------------ tr_0, td_1 ------------------------------ <bound method Tag.prettify of <td> <img alt="" src="//www.global-rates.com/images/misc/ittybittyclear.gif" style="margin:3px 4px 3px 0px;"/> </td>>
Quote
#5
Thanks Larz60+! I was able to run the code successfully, but it seems to pick the wrong table:
it parses the first table, the one with all the different languages, instead of the table with the rates.
How can I tweak the code to jump to the right table?

Output:
------------------------------ tr_0, td_0 ------------------------------ <bound method Tag.prettify of <td> <table cellpadding="0" cellspacing="0" style="width:100%;margin:10px 0px 0px 0px;"> <tr> <td> <img alt="" src="//www.global-rates.com/images/misc/ittybittyclear.gif" style="margin:3px 4px 3px 0px;"/> </td> <td align="right" valign="bottom"> <a href="//www.global-rates.com/"><img alt="English - worldwide actual interest rates and economic indicators" border="0" src="//www.global-rates.com/images/misc/gb.gif"/></a>   <a href="//nl.global-rates.com/"><img alt="Nederlands - actuele, internationale rentetarieven en economische indicatoren" border="0" src="//www.global-rates.com/images/misc/nl.gif"/></a>
Quote
#6
Something funky is going on with that site, or I am just having a moment. If I loop over the tables with Larz60+'s code modified to:
def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    tables = soup.find_all('table')
    for table in tables:
        print(table)
        print("------------------------------------------------------------")
    return
I am able to see the table with the rates. But if I go to the index of that table, tables[6], there are other tables before and after it, which makes no sense to me.
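The jumping indices may simply be nesting: the page wraps tables inside other tables, and find_all('table') returns nested tables too, in the order their opening tags appear in the document. A minimal sketch (the ids here are made up for illustration):

```python
from bs4 import BeautifulSoup

# find_all('table') also returns tables *nested inside* other tables,
# in the order their opening tags appear in the document.
html = "<table id='outer'><tr><td><table id='inner'></table></td></tr></table>"
soup = BeautifulSoup(html, 'html.parser')
ids = [t.get('id') for t in soup.find_all('table')]
print(ids)  # ['outer', 'inner']
```

So tables[6] can be a table that is visually inside another one, which is why the neighbouring indices look out of order.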
Quote
#7
Thanks everyone for your help :)
I was able to modify the code from Larz60+ a bit and get the correct table:
import requests
from bs4 import BeautifulSoup

lst=list()

def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find_all('table')[7]
    if table is not None:
        trs = table.find_all('tr')[12:13]
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')[:2]
            for n1, td in enumerate(tds):
    #            print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
    #            print(f"{td.prettify}")
#                print(td.contents)
                for key in td:
                    if key is not None:
                        lst.append(key)
    else:
        print("Could not find table")
    print(lst)


def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.global-rates.com/interest-rates/libor/libor.aspx'
    scrape_url(url)
my output is now a list with the following:
Output:
['\xa0', <a class="tabledatalink" href="/interest-rates/libor/european-euro/eur-libor-interest-rate-1-month.aspx" title="1 month European euro (EUR) LIBOR interest rate">Euro LIBOR - 1 month</a>, '-0.50200\xa0%']
however I fail to turn this into a nice dictionary of rates... I would like it to look like the following:
Output:
{'Euro LIBOR - 1 month': '-0.50200%', ...}
Could you help me create this dictionary?
Thanks a lot!
Quote
#8
Hi All,

just wanted to let you know that I was able to write the code :)
Thanks for all your help!!

import requests
from bs4 import BeautifulSoup

d=dict()

def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find_all('table')[7]
    if table is not None:
        trs = table.find_all('tr')[8:15]
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')[:2]
            for n1, td in enumerate(tds):
    #            print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
    #            print(f"{td.prettify}")
#                print(td.contents)
                td = str(td)
                if td.find('LIBOR')>0:
                    spos = td.find('">')
                    epos = td.find('</a>')
                    title = td[spos+2:epos]
                    d[title]=d.get(title,0)

                if td.find('%')>0:
                    spos = td.find('>')
                    epos = td.find('%')
                    rate = td[spos+1:epos-1]
                    d[title] = rate



    else:
        print("Could not find table")


def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.global-rates.com/interest-rates/libor/libor.aspx'
    scrape_url(url)

print(d)
Output:
{'Euro LIBOR - overnight': '-0.56971', 'Euro LIBOR - 1 week': '-0.54743', 'Euro LIBOR - 2 weeks': 0, 'Euro LIBOR - 1 month': '-0.50200', 'Euro LIBOR - 2 months': '-0.44600', 'Euro LIBOR - 3 months': '-0.42529'}
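A less fragile alternative to slicing str(td) with find() is to call get_text() on the tags themselves. This is only a sketch: the snippet below mimics the page's row structure (the tabledatalink class is taken from the output earlier in the thread), it is not live data.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML that mimics one rate row per <tr>, as on the page:
# a link with the rate's name in the first td, the rate itself in the second.
html = """
<table>
  <tr><td><a class="tabledatalink" href="#">Euro LIBOR - 1 month</a></td>
      <td>-0.50200&nbsp;%</td></tr>
  <tr><td><a class="tabledatalink" href="#">Euro LIBOR - 3 months</a></td>
      <td>-0.42529&nbsp;%</td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
rates = {}
for tr in soup.find_all('tr'):
    link = tr.find('a', class_='tabledatalink')
    tds = tr.find_all('td')
    if link and len(tds) >= 2:
        name = link.get_text(strip=True)
        # drop the non-breaking space and the percent sign from the rate cell
        rate = tds[1].get_text(strip=True).replace('\xa0', '').rstrip('%')
        rates[name] = rate

print(rates)
```

This way there is no title variable that can be left unset between iterations, and a missing rate cell is simply skipped instead of producing a 0 entry.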
Quote
#9
Hi guys!

I tried the code Larz60+ posted earlier on a different website, but it seems unable to retrieve any data. I went through the code and the website but couldn't figure out what causes the error. Can someone point me in the right direction? Not sure what is causing this...

Thanks a lot for your help!!
Gerald

Website: https://www.barchart.com/forex/quotes/%5...erDir=desc

import requests
from bs4 import BeautifulSoup


def parsepage(page):
    soup = BeautifulSoup(page, 'lxml')
    table = soup.find('table')
    if table is not None:
        trs = table.find_all('tr')
        for n, tr in enumerate(trs):
            tds = tr.find_all('td')
            for n1, td in enumerate(tds):
                print(f"\n------------------------------ tr_{n}, td_{n1} ------------------------------")
                print(td.prettify())
    else:
        print("Could not find table")

def scrape_url(url):
    response = requests.get(url)
    if response.status_code == 200:
        page = response.content
        parsepage(page)
    else:
        print(f"unable to retrieve {url}")

if __name__ == '__main__':
    url = 'https://www.barchart.com/forex/quotes/%5EEURUSD/forward-rates?orderBy=bidPrice&orderDir=desc'
    scrape_url(url)
Output:
unable to retrieve https://www.barchart.com/forex/quotes/%5EEURUSD/forward-rates?orderBy=bidPrice&orderDir=desc
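Note that this output means the else branch fired, i.e. the status code was not 200, so the problem is at the request level, before any parsing happens. Many sites refuse requests that lack a browser-like User-Agent header (and barchart.com loads much of its data via JavaScript, which requests cannot run). A sketch of the first thing to check; the header value below is just an example, not from the thread:

```python
import requests

# Print the status code so you can see *why* the request failed
# (e.g. 403 for blocked clients), and send a browser-like User-Agent.
# The header string is only an example value.
HEADERS = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)'}

def scrape_url(url):
    response = requests.get(url, headers=HEADERS)
    if response.status_code == 200:
        return response.content
    print(f"unable to retrieve {url} (status {response.status_code})")
    return None
```

If the table still comes back empty even with a 200 response, the data is probably fetched by JavaScript after page load, and a browser-driving tool such as Selenium would be needed.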
Quote
#10
change:
table = soup.find('table')
to
table = soup.find_all('table')[tableno]
Replace tableno with the index of the desired table: 0 = first, 1 = second, etc.
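To figure out which index to use, it can help to print a short text preview of every table first. A sketch with made-up HTML standing in for a real page:

```python
from bs4 import BeautifulSoup

# Print an index and a short text preview for each table on a page,
# to help pick the right value for tableno.  The HTML here is made up.
html = """
<table><tr><td>language links</td></tr></table>
<table><tr><td>Euro LIBOR - 1 month</td><td>-0.50200 %</td></tr></table>
"""
soup = BeautifulSoup(html, 'html.parser')
for n, table in enumerate(soup.find_all('table')):
    print(n, table.get_text(' ', strip=True)[:40])
```

Run this against the real page once, note the index of the table whose preview shows the data you want, and hard-code that index.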
Quote
