Apr-26-2018, 05:19 AM
That should be okay to parse.
from bs4 import BeautifulSoup

html_data = '''\
<table class='Foo'>
<thead>...</thead>
<tbody>
<tr>1</tr>
<tr>2</tr>
<tr>4</tr>
<tr>5</tr>
<tr>6</tr>
</tbody>
</table>'''

soup = BeautifulSoup(html_data, 'lxml')

Test:
>>> table = soup.find('table')
>>> tbody = table.find('tbody')
>>> tbody
<tbody>
<tr>1</tr>
<tr>2</tr>
<tr>4</tr>
<tr>5</tr>
<tr>6</tr>
</tbody>
>>> for item in tbody.find_all('tr'):
...     print(item.text)
...
1
2
4
5
6
>>> # CSS selector
>>> soup.select('tr')
[<tr>1</tr>, <tr>2</tr>, <tr>4</tr>, <tr>5</tr>, <tr>6</tr>]
>>> [int(i.text) for i in soup.select('tr')]
[1, 2, 4, 5, 6]
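If the real table has `<td>` cells and the odd non-numeric row, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: the sample markup and the `n/a` row are made up, `tbody_ints` is a hypothetical helper name, and it uses the built-in `html.parser` instead of `lxml` so it runs without the extra dependency.

```python
from bs4 import BeautifulSoup

# Made-up sample data: same shape as the table above, but with <td> cells
# and one non-numeric row to show the skip logic.
html_data = '''\
<table class='Foo'>
<thead><tr><th>n</th></tr></thead>
<tbody>
<tr><td>1</td></tr>
<tr><td>2</td></tr>
<tr><td>n/a</td></tr>
<tr><td>4</td></tr>
</tbody>
</table>'''


def tbody_ints(html, parser='html.parser'):
    """Return the integer in each <tbody> row, skipping non-numeric rows."""
    soup = BeautifulSoup(html, parser)
    values = []
    # 'tbody tr' leaves out the header row that sits inside <thead>.
    for row in soup.select('table.Foo tbody tr'):
        text = row.get_text(strip=True)
        try:
            values.append(int(text))
        except ValueError:
            # Row text is not an integer, so ignore it.
            continue
    return values


print(tbody_ints(html_data))  # [1, 2, 4]
```

Selecting `tbody tr` rather than plain `tr` matters once `<thead>` contains rows of its own; otherwise the header text would end up in the `int()` call too.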