Python Forum
BeautifulSoup: Error while extracting a value from an HTML table
#1
Hi all,

From the HTML text below:

<div class="card-body">
        <div class="table-responsive">
          <table class="table table__group table-sm table-hover">
                          <tr>
                <td>Trading currency</td>
                <td><strong>EUR</strong></td>
              </tr>
                                      <tr>
                <td>Price multiplier</td>
                <td><strong>1</strong></td>
              </tr>
                                      <tr>
                <td>Quantity notation</td>
                <td><strong>Number of units</strong></td>
              </tr>
                                      <tr>
                <td>Shares outstanding</td>
                <td><strong>872,308,162</strong></td>
              </tr>
                                      <tr>
                <td>Trading group</td>
                <td><strong>P0</strong></td>
              </tr>
                                      <tr>
                <td>Trading type</td>
                <td><strong>Continuous</strong></td>
              </tr>
I would like to extract the value 872,308,162


from bs4 import BeautifulSoup
import requests, io
import pandas as pd

timestamp = pd.Timestamp.today().strftime('%Y%m%d-%H%M%S')

links_df = pd.read_excel(r'myfolder\myfile.xlsx', sheet_name='Sheet1')
links_df = links_df[(links_df['Country'] == 'PT')]

results = pd.DataFrame(columns=['ISIN', 'N Shares', 'Link'])

for ISIN in links_df.ISIN:
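    link='https://live.euronext.com/en/product/equities/=' + ISIN + '-XLIS/market-information'
    # Fetch and parse the page (assumed; this step was missing from the posted snippet)
    soup = BeautifulSoup(requests.get(link).text, 'html.parser')
    shares = soup.find('td', {'Shares outstanding'}).contents  # this line raises the error
    results = results.append({'ISIN': ISIN, 'N Shares': shares, 'Link': link}, ignore_index=True)
    print(f'{ISIN}: {shares}')

    results.to_csv(r'myfolder\myoutputfile' + timestamp + '.csv', index=False)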

print('Finish')
The error I get is
Error:
'NoneType' object has no attribute 'contents'
Could you please guide me on this?
#2
Quote: shares = soup.find('td', {'Shares outstanding'}).contents

I am sorry, but I could not find an argument of type set in the documentation for BeautifulSoup's find(). Can you show me where it is described?
#3
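First, you should always tell BS which parser to use; here it is html.parser, which comes with Python.
You cannot search the way you tried, with {'Shares outstanding'}: the second positional argument of find() is attrs, which filters on tag attributes, not on the text inside the tag.
Here is an example.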
from bs4 import BeautifulSoup

html = '''\
<div class="card-body">
        <div class="table-responsive">
          <table class="table table__group table-sm table-hover">
                          <tr>
                <td>Trading currency</td>
                <td><strong>EUR</strong></td>
              </tr>
                                      <tr>
                <td>Price multiplier</td>
                <td><strong>1</strong></td>
              </tr>
                                      <tr>
                <td>Quantity notation</td>
                <td><strong>Number of units</strong></td>
              </tr>
                                      <tr>
                <td>Shares outstanding</td>
                <td><strong>872,308,162</strong></td>
              </tr>
                                      <tr>
                <td>Trading group</td>
                <td><strong>P0</strong></td>
              </tr>
                                      <tr>
                <td>Trading type</td>
                <td><strong>Continuous</strong></td>
              </tr>
          </table>
        </div>
</div>'''

soup = BeautifulSoup(html, 'html.parser')
# Locate the table by its full class string, then take the 4th <strong> (index 3)
table = soup.find('table', class_="table table__group table-sm table-hover")
shares = table.find_all('strong')[3]
print(f'Shares outstanding: {shares.text}')
Output:
Shares outstanding: 872,308,162
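
Picking the value by position ([3]) breaks as soon as a row is added or removed. A minimal sketch of an alternative, assuming the label cell contains exactly the text "Shares outstanding" and the value sits in the next td of the same row (as in the HTML posted in this thread; the shortened table below is only for illustration), is to search by the label text with the string argument and then step to the sibling cell:

```python
from bs4 import BeautifulSoup

# Shortened, illustrative copy of the table from the question
html = '''
<table>
  <tr><td>Price multiplier</td><td><strong>1</strong></td></tr>
  <tr><td>Shares outstanding</td><td><strong>872,308,162</strong></td></tr>
</table>
'''

soup = BeautifulSoup(html, 'html.parser')

# string= matches on the text inside the tag, which an attrs filter cannot do
label = soup.find('td', string='Shares outstanding')

# The value is in the next <td> of the same row
shares_text = label.find_next_sibling('td').get_text(strip=True)
print(shares_text)
```

This keeps working if the rows are reordered, as long as the label text itself does not change.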
#4
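(Aug-23-2019, 11:24 AM)snippsat Wrote: First, you should always tell BS which parser to use; here it is html.parser, which comes with Python.
You cannot search the way you tried, with {'Shares outstanding'}: the second positional argument of find() is attrs, which filters on tag attributes, not on the text inside the tag.
Here is an example.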

Thanks. I am not sure I fully understand why it is not possible to search for "Shares outstanding" directly, but the solution provided works great.

Cheers
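
On why the original call failed: find() returns None when no tag matches the filter, and calling .contents on None raises the 'NoneType' error from the first post. A hedged sketch of a helper for the loop follows; the name extract_shares is hypothetical (not from this thread), and it assumes the same table layout as the HTML in the question. It returns None for a page without the row instead of raising:

```python
from bs4 import BeautifulSoup

def extract_shares(html):
    """Return the 'Shares outstanding' value as an int, or None if absent."""
    soup = BeautifulSoup(html, 'html.parser')
    label = soup.find('td', string='Shares outstanding')
    if label is None:           # row not present on this page
        return None
    value = label.find_next_sibling('td')
    if value is None:           # label found but no value cell next to it
        return None
    # Strip the thousands separators: '872,308,162' -> 872308162
    return int(value.get_text(strip=True).replace(',', ''))

sample = ('<table><tr><td>Shares outstanding</td>'
          '<td><strong>872,308,162</strong></td></tr></table>')
print(extract_shares(sample))
print(extract_shares('<p>no table here</p>'))
```

Inside the loop, a None result can then be logged or skipped for that ISIN rather than stopping the whole run.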