Python Forum

BeautifulSoup: Error while extracting a value from an HTML table
Hi all,

From the below HTML text:

<div class="card-body">
        <div class="table-responsive">
          <table class="table table__group table-sm table-hover">
                          <tr>
                <td>Trading currency</td>
                <td><strong>EUR</strong></td>
              </tr>
                                      <tr>
                <td>Price multiplier</td>
                <td><strong>1</strong></td>
              </tr>
                                      <tr>
                <td>Quantity notation</td>
                <td><strong>Number of units</strong></td>
              </tr>
                                      <tr>
                <td>Shares outstanding</td>
                <td><strong>872,308,162</strong></td>
              </tr>
                                      <tr>
                <td>Trading group</td>
                <td><strong>P0</strong></td>
              </tr>
                                      <tr>
                <td>Trading type</td>
                <td><strong>Continuous</strong></td>
              </tr>
I would like to extract the value 872,308,162 (the "Shares outstanding" row).


from bs4 import BeautifulSoup
import requests, io
import pandas as pd

timestamp = pd.datetime.today().strftime('%Y%m%d-%H%M%S')

links_df = pd.read_excel(r'myfolder\myfile.xlsx', sheet_name='Sheet1')
links_df = links_df[(links_df['Country'] == 'PT')]

results = pd.DataFrame(columns=['ISIN', 'N Shares', 'Link'])

for ISIN in links_df.ISIN:
    link='https://live.euronext.com/en/product/equities/=' + ISIN + '-XLIS/market-information'
    shares = soup.find('td', {'Shares outstanding'}).contents
    results = results.append({'ISIN': ISIN, 'N Shares': shares, 'Link': link}, ignore_index=True)
    print(ISIN +": " + shares)
    
    results.to_csv(r'myfolder\myoutputfile' + timestamp + '.csv', index=False)

print('Finish')
The error I get is
Error:
'NoneType' object has no attribute 'contents'
Could you please guide me on this?
Quote: shares = soup.find('td', {'Shares outstanding'}).contents

I'm sorry, but I couldn't find anything in the documentation for BeautifulSoup's find() that accepts a set as an argument. Can you show me where you found it?
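To illustrate where the 'NoneType' error comes from, here is a minimal sketch, assuming bs4 is installed and using a cut-down copy of the table from the question. find() returns None when nothing matches, and a bare string in the second position is documented to be matched against the class attribute, not against the text inside the tag. I have not checked how a set is handled internally, so the string form is used here instead; the variable names are only for illustration.

```python
from bs4 import BeautifulSoup

# Cut-down copy of the table from the question (assumption: same structure).
html = """
<table>
  <tr><td>Shares outstanding</td><td><strong>872,308,162</strong></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# A bare string as the second positional argument filters on the CSS class,
# so this looks for <td class="Shares outstanding">, which does not exist.
by_class = soup.find("td", "Shares outstanding")
print(by_class)

# Check for None before touching attributes such as .contents;
# None.contents is what raises the AttributeError from the question.
if by_class is None:
    message = "no matching <td> found"
else:
    message = str(by_class.contents)
print(message)
```

So the lookup itself does not raise; it quietly returns None, and the error only appears once .contents is accessed on that result.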
First, you should always tell BeautifulSoup which parser to use; here that is html.parser, which comes with Python.
You cannot search the way you are trying to with {'Shares outstanding'}.
Here is an example.
from bs4 import BeautifulSoup

html = '''\
<div class="card-body">
        <div class="table-responsive">
          <table class="table table__group table-sm table-hover">
                          <tr>
                <td>Trading currency</td>
                <td><strong>EUR</strong></td>
              </tr>
                                      <tr>
                <td>Price multiplier</td>
                <td><strong>1</strong></td>
              </tr>
                                      <tr>
                <td>Quantity notation</td>
                <td><strong>Number of units</strong></td>
              </tr>
                                      <tr>
                <td>Shares outstanding</td>
                <td><strong>872,308,162</strong></td>
              </tr>
                                      <tr>
                <td>Trading group</td>
                <td><strong>P0</strong></td>
              </tr>
                                      <tr>
                <td>Trading type</td>
                <td><strong>Continuous</strong></td>
              </tr><div class="g-recaptcha" data-sitekey="VALUE_TO_RETURN"></div>'''

soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', class_="table table__group table-sm table-hover")
# The value sits in the fourth <strong> tag of the table (index 3)
shares = table.find_all('strong')[3]
print(f'Shares outstanding: {shares.text}')
Output:
Shares outstanding: 872,308,162
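The fixed index works for this page, but it breaks if a row is added or removed. As a hedged alternative, the sketch below reads every two-cell row into a dict keyed by the label, assuming each row has exactly one label cell and one value cell as in the sample above. The name rows is only for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample table (assumption: label cell, then value cell).
html = """
<table class="table table__group table-sm table-hover">
  <tr><td>Trading currency</td><td><strong>EUR</strong></td></tr>
  <tr><td>Price multiplier</td><td><strong>1</strong></td></tr>
  <tr><td>Shares outstanding</td><td><strong>872,308,162</strong></td></tr>
  <tr><td>Trading group</td><td><strong>P0</strong></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# class_ also matches a single class out of several on the tag.
table = soup.find("table", class_="table__group")

# Map each label to its value, skipping rows that do not have two cells.
rows = {}
for tr in table.find_all("tr"):
    cells = tr.find_all("td")
    if len(cells) == 2:
        label = cells[0].get_text(strip=True)
        rows[label] = cells[1].get_text(strip=True)

print(rows["Shares outstanding"])
```

Looking the value up by its label keeps the code working when the order of the rows changes.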
(Aug-23-2019, 11:24 AM) snippsat wrote: First, you should always tell BeautifulSoup which parser to use [...]

Thanks. I am not sure I fully understand why it is not possible to search for "Shares outstanding" directly, but the solution provided works great.

Cheers
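On the question of searching for "Shares outstanding": it does seem possible, just not through the attributes argument. A minimal sketch, assuming bs4 4.4.0 or later (for the string= argument) and a label cell whose text matches exactly; value_for_label is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the question.
html = """
<table class="table table__group table-sm table-hover">
  <tr><td>Trading currency</td><td><strong>EUR</strong></td></tr>
  <tr><td>Shares outstanding</td><td><strong>872,308,162</strong></td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")


def value_for_label(soup, label):
    """Hypothetical helper: text of the cell next to `label`, or None."""
    # string= matches on the text inside the tag, not on its attributes.
    label_cell = soup.find("td", string=label)
    if label_cell is None:
        return None
    # The value is in the next <td> of the same row.
    value_cell = label_cell.find_next_sibling("td")
    if value_cell is None:
        return None
    return value_cell.get_text(strip=True)


shares_text = value_for_label(soup, "Shares outstanding")
# Drop the thousands separators to get a number.
shares_count = int(shares_text.replace(",", ""))
print(shares_text, shares_count)
```

If the live page pads the label with whitespace, the exact match will miss it; in that case a compiled regular expression can be passed as string= instead.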