Thanks Axel - Can you explain what the following code does?
```python
_len = len(datasets)
for x in range(_len - 1):
    t = datasets[x]
    myrow = [t[1], t[2], t[5]]
```
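A rough sketch of what that loop does, run on made-up rows (the contents of `datasets` here are invented for illustration, not taken from the real page). As far as I can tell, `range(_len - 1)` visits every index except the last one, and each pass keeps only fields 1, 2 and 5 of the row:

```python
# Made-up rows standing in for the `datasets` list built by the scraper.
datasets = [
    ["", "Date", "Opponent", "", "", "Result", ""],
    ["", "09/01/18", "Team A", "", "", "W 10-7", ""],
    ["", "09/08/18", "Team B", "", "", "L 3-14", ""],
    ["", "footer", "footer", "", "", "footer", ""],
]

picked = []
_len = len(datasets)
for x in range(_len - 1):        # x runs 0 .. _len - 2, so the last row is skipped
    t = datasets[x]              # t is one row: a list of strings
    myrow = [t[1], t[2], t[5]]   # keep only the 2nd, 3rd and 6th fields
    picked.append(myrow)

# The same selection written with a slice instead of an index loop.
same = [[t[1], t[2], t[5]] for t in datasets[:-1]]

print(picked)
```

The slice version `datasets[:-1]` says "everything but the last row" directly, which may be easier to read than counting indexes.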
(Dec-16-2018, 06:50 PM) Axel_Erfurt Wrote:
(Dec-16-2018, 05:02 PM) jlkmb Wrote: and how did the headers get there?
The column headings are in the first <tr> </tr> row of the table, so find_all("tr") returns them along with the data rows.
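To make the header point concrete, here is a small sketch that uses the standard library's `html.parser` instead of BeautifulSoup, only so it runs without extra installs. The HTML snippet and the `RowCollector` class are made up for illustration; the real page's markup may differ.

```python
from html.parser import HTMLParser

# Invented sample table: the header cells are <th> elements inside the first <tr>.
SAMPLE_HTML = """
<table class="team-schedule">
  <tr><th>Date</th><th>Opponent</th><th>Result</th></tr>
  <tr><td>09/01/18</td><td>Team A</td><td>W 10-7</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Hypothetical helper: collects (cell tag, text) pairs for each <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of (tag, text) per <tr>
        self._cell = None   # tag of the cell currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td"):
            self._cell = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None

    def handle_data(self, data):
        # Only keep text that sits inside a cell of an open row.
        if self._cell and self.rows and data.strip():
            self.rows[-1].append((self._cell, data.strip()))


collector = RowCollector()
collector.feed(SAMPLE_HTML)
header_row, first_data_row = collector.rows
print([text for tag, text in header_row])
```

Because the header is just another `<tr>`, anything that walks all rows will include it unless the first row is sliced off.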
To save it as CSV (change the delimiter to what you need):

```python
from urllib.request import urlopen
from bs4 import BeautifulSoup as bsoup
import csv

url = 'http://www.cfbstats.com/2018/team/234/index.html'
ofile = urlopen(url)
soup = bsoup(ofile, "html.parser", from_encoding='utf-8')

table = soup.find("table", attrs={"class": "team-schedule"})

# Collect the text of every row, header row included (the [1:] slice is disabled).
datasets = []
mytable = table.find_all("tr")  # [1:]
for row in mytable:
    text = str(row.get_text()).split('\n')
    datasets.append(text)

mypath = '/tmp/test_cfbstats.csv'
# newline='' is what the csv module documentation recommends for output files.
with open(mypath, 'w', newline='') as stream:
    writer = csv.writer(stream, delimiter='\t')
    _len = len(datasets)
    # range(_len - 1) stops one short, so the last row is not written.
    for x in range(_len - 1):
        t = datasets[x]
        # Keep only fields 1, 2 and 5 of the split row text.
        myrow = [t[1], t[2], t[5]]
        writer.writerow(myrow)
```
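On "change the delimiter to what you need": a minimal sketch of how the `delimiter` argument of `csv.writer` changes the output. It writes to an in-memory buffer rather than a file, and the rows and the `to_csv_text` helper are made up for illustration.

```python
import csv
import io

# Invented rows, not real schedule data.
rows = [["Date", "Opponent", "Result"], ["09/01/18", "Team A", "W 10-7"]]


def to_csv_text(rows, delimiter):
    """Hypothetical helper: render rows as delimited text in memory."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter)
    writer.writerows(rows)
    return buffer.getvalue()


tab_text = to_csv_text(rows, "\t")    # tab-separated, as in the post above
comma_text = to_csv_text(rows, ",")   # the csv module's default delimiter

print(tab_text)
print(comma_text)
```

Only the separator between fields changes; the rows themselves are written the same way.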
Thanks Larz. That makes sense. A few questions just to make sure I understand.
I hadn't seen enumerate yet. Interesting. Where is n defined? How about item? Can you explain the print statement?
```python
trs = table.find_all('tr')
header = []
for n, tr in enumerate(trs):
    if n == 0:
        # Get Header
        ths = tr.find_all('th')
        for th in ths:
            header.append(th.text.strip())
        for item in header:
            print('{:22}'.format(item), end='')
        print()
```

How would I get the last line to not appear?
Larz - forgot to ask - do you implement csv the same way as Axel?
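On the enumerate and print questions: neither `n` nor `item` is defined ahead of time. The `for` statement itself creates those names and rebinds them on every pass. A small sketch with a made-up `header` list (not the real column names):

```python
header = ["Date", "Opponent", "Result"]   # invented header cells

# enumerate() yields (index, value) pairs; `for n, tr in enumerate(...)`
# unpacks each pair into the two loop variables.
pairs = list(enumerate(header))

# '{:22}' pads each string to a width of 22 characters (strings are
# left-aligned by default). end='' stops print() from adding a newline,
# so all cells land on one line; the bare print() then ends that line.
for item in header:
    print('{:22}'.format(item), end='')
print()

# The same line built as a single string, for comparison.
line = ''.join('{:22}'.format(item) for item in header)
```

So `n == 0` in the quoted code is simply "the first row", which is where the header cells live.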