Dec-16-2018, 06:50 PM
(Dec-16-2018, 05:02 PM)jlkmb Wrote: and how did the headers get there?
The column headings sit in a <tr> </tr> row of their own, so find_all("tr") returns them together with the data rows.
To save it as CSV (change the delimiter to whatever you need):
from urllib.request import urlopen
from bs4 import BeautifulSoup as bsoup
import csv

url = 'http://www.cfbstats.com/2018/team/234/index.html'
ofile = urlopen(url)
soup = bsoup(ofile, "html.parser", from_encoding='utf-8')

table = soup.find("table", attrs={"class": "team-schedule"})

# every <tr>, including the heading row; add [1:] to drop the headings
datasets = []
mytable = table.find_all("tr")  # [1:]
for row in mytable:
    text = str(row.get_text()).split('\n')
    datasets.append(text)

mypath = '/tmp/test_cfbstats.csv'
# newline='' lets the csv module control the line endings
with open(mypath, 'w', newline='') as stream:
    writer = csv.writer(stream, delimiter='\t')
    _len = len(datasets)
    for x in range(_len - 1):  # note: this skips the last row
        t = datasets[x]
        myrow = [t[1], t[2], t[5]]
        writer.writerow(myrow)
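If you want to try out a delimiter, or see what dropping the heading row does, without hitting the site, something like this should work. It is only a sketch: the rows are made up (they are not the real table contents), and rows_to_text is a hypothetical helper name, not part of the script above. It writes to an in-memory buffer instead of a file.

```python
import csv
import io

# Made-up rows standing in for the scraped table (not real data);
# the first row plays the role of the heading <tr>.
rows = [
    ["Date", "Opponent", "Result"],
    ["09/01/18", "Team A", "W 10-7"],
    ["09/08/18", "Team B", "L 3-14"],
]


def rows_to_text(rows, delimiter=",", skip_header=False):
    """Render rows with csv.writer into a string (hypothetical helper)."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter)
    # rows[1:] has the same effect as the commented-out [1:] on find_all("tr")
    writer.writerows(rows[1:] if skip_header else rows)
    return buffer.getvalue()


print(rows_to_text(rows, delimiter=";"))
print(rows_to_text(rows, delimiter="\t", skip_header=True))
```

The first call keeps the headings and separates fields with semicolons; the second uses tabs and leaves the heading row out.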