Need help with Beautiful Soup - table

jlkmb · (This post was last modified: Dec-17-2018, 02:47 PM by jlkmb.)

Thanks, Axel. Can you explain what the following code does?

    _len = len(datasets)
    for x in range(_len -1):
        t = datasets[x]
        myrow = [t[1], t[2], t[5]]

(Dec-16-2018, 06:50 PM)Axel_Erfurt Wrote:

(Dec-16-2018, 05:02 PM)jlkmb Wrote: and how did the headers get there?

The column headings are in the first <tr> </tr> row of the table, as <th> cells.

To save it as CSV (change the delimiter to whatever you need):

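from urllib.request import urlopen
from bs4 import BeautifulSoup as bsoup
import csv

url = 'http://www.cfbstats.com/2018/team/234/index.html'

# Download the page and parse it
ofile = urlopen(url)
soup = bsoup(ofile, "html.parser", from_encoding='utf-8')

# The schedule table; its first <tr> holds the column headings
table = soup.find("table", attrs={"class": "team-schedule"})

# One list per table row: the row's text split on newlines
datasets = []
mytable = table.find_all("tr")  # add [1:] here to skip the header row
for row in mytable:
    text = str(row.get_text()).split('\n')
    datasets.append(text)

mypath = '/tmp/test_cfbstats.csv'
# newline='' is what the csv module recommends for files it writes to
with open(mypath, 'w', newline='') as stream:
    writer = csv.writer(stream, delimiter='\t')
    _len = len(datasets)
    # range(_len -1) stops one short, so the last table row is left out
    for x in range(_len -1):
        t = datasets[x]
        # keep only fields 1, 2 and 5 of the split row text
        myrow = [t[1], t[2], t[5]]
        writer.writerow(myrow)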
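
To see what that final loop does, here is a small sketch with made-up rows. The sample strings and the helper name pick_columns are assumptions for illustration only; the real values come from the page. Each entry in datasets is the list you get from splitting one row's text on newlines, and the loop picks fields 1, 2 and 5 from every row except the last one.

```python
# Made-up rows, shaped like str(row.get_text()).split('\n') output (an assumption).
datasets = [
    ['', 'Date', 'Opponent', 'Surface', '', 'Result', ''],
    ['', '09/01/18', 'Team A', 'Grass', '', 'W 24-17', ''],
    ['', '09/08/18', 'Team B', 'Turf', '', 'L 10-13', ''],
    ['', 'footer', 'note', '', '', 'x', ''],
]


def pick_columns(rows):
    """Keep fields 1, 2 and 5 of every row except the last."""
    picked = []
    # range(len(rows) - 1) yields 0 .. len(rows) - 2, so the final row is skipped
    for x in range(len(rows) - 1):
        t = rows[x]
        picked.append([t[1], t[2], t[5]])
    return picked


for myrow in pick_columns(datasets):
    print(myrow)
```

If every row should be written, range(len(rows)) or a plain "for t in rows:" would probably be the simpler choice.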

Thanks Larz. That makes sense. A few questions just to make sure I understand.

I hadn't seen enumerate yet. Interesting. Where is n defined? How about item? Can you explain the print statement?

  
trs = table.find_all('tr')
header = []
for n, tr in enumerate(trs):
    if n == 0:
        # Get Header
        ths = tr.find_all('th')
        for th in ths:
            header.append(th.text.strip())
        for item in header:
            print('{:22}'.format(item), end='')
        print()
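
On the questions above, as far as I can tell: n and tr are both created by the for statement itself. enumerate yields (index, value) pairs, and "for n, tr in ..." unpacks each pair into the two names, the same way "for item in header" creates item. In the print call, '{:22}' pads each heading to a width of 22 characters and end='' stops print from starting a new line. A small sketch with a made-up header list (the headings are assumptions):

```python
header = ['Date', 'Opponent', 'Result']  # made-up headings for illustration

# enumerate pairs each item with its position; the for statement names them
pairs = list(enumerate(header))
for n, item in pairs:
    print(n, item)

# '{:22}' left-aligns a string in a field 22 characters wide
cells = ['{:22}'.format(item) for item in header]
line = ''.join(cells)

for cell in cells:
    print(cell, end='')  # end='' suppresses the newline after each cell
print()                  # finish the line
```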

How would I get the last line to not appear?
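
Assuming "the last line" means the last row of the table, one option is to slice it off before looping. This sketch uses plain strings as stand-ins for the <tr> tags; find_all returns a list-like result, so the same slice should work on trs.

```python
rows = ['header', 'game 1', 'game 2', 'footer']  # stand-ins for the <tr> tags

# rows[:-1] is a copy of the list without its final element
kept = rows[:-1]

for n, row in enumerate(kept):
    print(n, row)
```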

Larz, I forgot to ask: do you write the CSV the same way as Axel?
