Jul-04-2019, 02:00 AM
It's mighty difficult to give advice without looking at the page.
The usual layout for a table is multiple tr elements, each containing multiple td elements.
Here's an example that walks that layout on a simple page with only one table:
```python
table = soup.find('table', {'summary': 'This table displays Connecticut towns and the year of their establishment.'})
trs = table.tbody.find_all('tr')
for n, tr in enumerate(trs):
    for n1, td in enumerate(self.get_td(tr)):
        print(f'==================================== tr {n}, td: {n1} ====================================')
        print(f'{self.pp.prettify(td, 2)}')
```

This will give you a layout of the page and make it easier to determine how to proceed.
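If you want to try the same row/cell walk without the original page, here is a minimal standalone sketch. The inline HTML, the `table_cells` name, and the `tr.find_all(['td', 'th'])` call are all assumptions for illustration; in particular, the `get_td` helper used above is not shown in this post, so this only stands in for it.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the real page's table will differ.
HTML = """
<table summary="Sample towns">
  <tbody>
    <tr><td>Town A</td><td>1700</td></tr>
    <tr><td>Town B</td><td>1710</td></tr>
  </tbody>
</table>
"""


def table_cells(html, summary):
    """Return the text of every cell, grouped by row (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"summary": summary})
    if table is None:
        # No table with that summary attribute on the page.
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Assumed stand-in for get_td: collect the data and header cells of this row.
        cells = tr.find_all(["td", "th"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


if __name__ == "__main__":
    for n, row in enumerate(table_cells(HTML, "Sample towns")):
        for n1, text in enumerate(row):
            print(f"tr {n}, td {n1}: {text}")
```

Returning plain lists keeps the walk separate from the printing, so you can inspect the structure first and decide what to extract afterwards.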
The prettify method is in the module PrettifyPage.py, a modified version of BeautifulSoup's prettify that allows changing the indent size:
```python
from bs4 import BeautifulSoup
import requests
import pathlib


class PrettifyPage:
    def __init__(self):
        pass

    def prettify(self, soup, indent):
        pretty_soup = str()
        previous_indent = 0
        for line in soup.prettify().split("\n"):
            # The column of the first "<" is taken as the nesting depth of the line.
            current_indent = str(line).find("<")
            # Lines without a tag, or with a suspicious jump, sit one level below the previous line.
            if current_indent == -1 or current_indent > previous_indent + 2:
                current_indent = previous_indent + 1
            previous_indent = current_indent
            pretty_soup += self.write_new_line(line, current_indent, indent)
        return pretty_soup

    def write_new_line(self, line, current_indent, desired_indent):
        new_line = ""
        # The line already carries current_indent spaces; pad it up to the desired width.
        spaces_to_add = (current_indent * desired_indent) - current_indent
        if spaces_to_add > 0:
            for i in range(spaces_to_add):
                new_line += " "
        new_line += str(line) + "\n"
        return new_line
```
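To see what the indent change does, here is a simplified standalone sketch of the same idea. It is not the module above: the `reindent` name is made up, it counts leading spaces instead of searching for "<", and it assumes the input is text in the shape soup.prettify() returns, with one space per nesting level. The sample string is hand-written for illustration.

```python
def reindent(pretty_text, width):
    """Rescale one-space-per-level indentation to `width` spaces per level."""
    out = []
    for line in pretty_text.split("\n"):
        stripped = line.lstrip(" ")
        # Assumption: every leading space is one nesting level.
        depth = len(line) - len(stripped)
        out.append(" " * (depth * width) + stripped)
    return "\n".join(out)


# Hand-written sample in the one-space-per-level shape.
SAMPLE = "<tr>\n <td>\n  Town A\n </td>\n</tr>"

if __name__ == "__main__":
    print(reindent(SAMPLE, 2))
```

Because it treats every leading space as indentation, this version would also rescale whitespace inside preformatted text, so take it as a way to see the effect rather than a replacement for the class.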