 BeautifulSoup: Error while extracting a value from an HTML table
#1
Hi all,

From the below HTML text:

<div class="card-body">
        <div class="table-responsive">
          <table class="table table__group table-sm table-hover">
                          <tr>
                <td>Trading currency</td>
                <td><strong>EUR</strong></td>
              </tr>
                                      <tr>
                <td>Price multiplier</td>
                <td><strong>1</strong></td>
              </tr>
                                      <tr>
                <td>Quantity notation</td>
                <td><strong>Number of units</strong></td>
              </tr>
                                      <tr>
                <td>Shares outstanding</td>
                <td><strong>872,308,162</strong></td>
              </tr>
                                      <tr>
                <td>Trading group</td>
                <td><strong>P0</strong></td>
              </tr>
                                      <tr>
                <td>Trading type</td>
                <td><strong>Continuous</strong></td>
              </tr>
I would like to extract the value 872,308,162


from bs4 import BeautifulSoup
import requests, io
import pandas as pd

timestamp = pd.Timestamp.today().strftime('%Y%m%d-%H%M%S')

links_df = pd.read_excel(r'myfolder\myfile.xlsx', sheet_name='Sheet1')
links_df = links_df[(links_df['Country'] == 'PT')]

results = pd.DataFrame(columns=['ISIN', 'N Shares', 'Link'])

for ISIN in links_df.ISIN:
    link = 'https://live.euronext.com/en/product/equities/=' + ISIN + '-XLIS/market-information'
    # Download and parse the page for this ISIN
    soup = BeautifulSoup(requests.get(link).text, 'html.parser')
    # This is the line that fails: find() returns None here
    shares = soup.find('td', {'Shares outstanding'}).contents
    results = results.append({'ISIN': ISIN, 'N Shares': shares, 'Link': link}, ignore_index=True)
    print(ISIN + ": " + shares)

    results.to_csv(r'myfolder\myoutputfile' + timestamp + '.csv', index=False)

print('Finish')
The error I get is
Error:
'NoneType' object has no attribute 'contents'
Could you please guide me on this?
#2
Quote: shares = soup.find('td', {'Shares outstanding'}).contents

Sorry, but I could not find any argument of type set in the BeautifulSoup find() documentation. Can you show me where it is?
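
For what it's worth, here is a minimal sketch of why that call comes back empty. As far as I know, BeautifulSoup treats a plain string passed as the second positional argument to find() as a CSS class filter, so the search looks for a td whose class is "Shares outstanding" rather than one whose text is. A set appears to be handled the same way, but I have not seen that documented, so the sketch uses the string form. The short html string is a trimmed stand-in for the page, not the real Euronext markup.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the table in the question (assumption: same structure)
html = '''
<table>
  <tr><td>Shares outstanding</td><td><strong>872,308,162</strong></td></tr>
</table>'''

soup = BeautifulSoup(html, 'html.parser')

# A string as the second argument is a CSS class filter, so this looks
# for <td class="Shares outstanding"> and finds nothing.
by_class = soup.find('td', 'Shares outstanding')
print(by_class)

# Calling .contents on None is what raises the AttributeError,
# so check the result before using it.
if by_class is None:
    print('No matching <td> found')
```

Checking for None before touching .contents at least turns the crash into a readable message.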
#3
First, you should always tell BS which parser to use; here html.parser, which comes with Python.
You cannot search the way you tried with {'Shares outstanding'}.
Here is an example.
from bs4 import BeautifulSoup

html = '''\
<div class="card-body">
        <div class="table-responsive">
          <table class="table table__group table-sm table-hover">
                          <tr>
                <td>Trading currency</td>
                <td><strong>EUR</strong></td>
              </tr>
                                      <tr>
                <td>Price multiplier</td>
                <td><strong>1</strong></td>
              </tr>
                                      <tr>
                <td>Quantity notation</td>
                <td><strong>Number of units</strong></td>
              </tr>
                                      <tr>
                <td>Shares outstanding</td>
                <td><strong>872,308,162</strong></td>
              </tr>
                                      <tr>
                <td>Trading group</td>
                <td><strong>P0</strong></td>
              </tr>
                                      <tr>
                <td>Trading type</td>
                <td><strong>Continuous</strong></td>
              </tr>
          </table>
        </div>
</div>'''

soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', class_="table table__group table-sm table-hover")
# The fourth <strong> tag in the table holds the "Shares outstanding" value
shares = table.find_all('strong')[3]
print(f'Shares outstanding: {shares.text}')
Output:
Shares outstanding: 872,308,162
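
Picking the value by position breaks if the site adds or reorders rows. A possible alternative, sketched here under the assumption that every row has exactly two td cells (label first, value second), is to build a dict keyed by label. The html string is a shortened copy of the table above, and the names info and n_shares are only illustrative.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table (assumption: label/value pairs per row)
html = '''
<table class="table table__group table-sm table-hover">
  <tr><td>Trading currency</td><td><strong>EUR</strong></td></tr>
  <tr><td>Shares outstanding</td><td><strong>872,308,162</strong></td></tr>
  <tr><td>Trading group</td><td><strong>P0</strong></td></tr>
</table>'''

soup = BeautifulSoup(html, 'html.parser')

# Map each row's label to its value, skipping rows without two cells
info = {}
for row in soup.find_all('tr'):
    cells = row.find_all('td')
    if len(cells) == 2:
        label = cells[0].get_text(strip=True)
        value = cells[1].get_text(strip=True)
        info[label] = value

# Strip the thousands separators to get a number
n_shares = int(info['Shares outstanding'].replace(',', ''))
print(n_shares)
```

Looking the value up by its label keeps working if the row moves, and a missing label raises a KeyError instead of silently returning the wrong row.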
#4
(Aug-23-2019, 11:24 AM)snippsat Wrote: First, you should always tell BS which parser to use; here html.parser, which comes with Python.

Thanks. I am not sure I fully understand why it is not possible to search for "Shares outstanding", but the solution provided works great.

Cheers
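
On the open question of searching for "Shares outstanding": it does seem possible, just not with that syntax. A minimal sketch, assuming the label cell contains exactly that text with no extra whitespace (on the live page a looser match may be needed), is to match on the cell's text with the string argument and then step to the next td in the same row. The html string is again a trimmed stand-in for the real page.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the table (assumption: label text matches exactly)
html = '''
<table>
  <tr><td>Price multiplier</td><td><strong>1</strong></td></tr>
  <tr><td>Shares outstanding</td><td><strong>872,308,162</strong></td></tr>
</table>'''

soup = BeautifulSoup(html, 'html.parser')

# string= matches on the tag's text instead of its attributes
label_cell = soup.find('td', string='Shares outstanding')

# The value sits in the next <td> of the same row
value_cell = label_cell.find_next_sibling('td')
shares = value_cell.get_text(strip=True)
print(shares)
```

If the label is missing, label_cell is None and the next line fails with the same 'NoneType' error, so a None check is still worth adding in the real loop.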