Need help with Beautiful Soup - table

jlkmb · (This post was last modified: Dec-16-2018, 05:44 PM by jlkmb.)

Axel - Very interesting. Thank you!

Do you mind stepping through some questions/assumptions?

As I understand it, this builds a list from the table: it takes every row in the table, splits the row's text at each newline ('\n'), and appends the resulting list of strings to datasets.

datasets = []
mytable = table.find_all("tr")#[1:]
for row in mytable:
    text = str(row.get_text()).split('\n')
    datasets.append(text)

I'm having a real hard time following this one - and how did the headers get there?

_len = len(datasets)
for x in range(_len -1):
    t = datasets[x]
    print((t[1] + '\t' + t[2] + '\t' + t[5]).expandtabs(30))
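To check my understanding of the two snippets above, here is a small sketch that runs without the web page. The three strings are my assumption of what row.get_text() returns for a header row, a game row and the footer row (each cell's text on its own line); the real page may have different whitespace. If that assumption holds, the headers are there simply because the header is itself a <tr>, and with the [1:] slice commented out it becomes datasets[0].

```python
# Assumed shape of row.get_text() for three <tr> rows (not copied from the real page).
header_text = '\nDate\nOpponent\nResult\nGame Time\nAttendance\n'
game_text = '\n09/03/18\nVirginia Tech\nL 3-24\n3:12\n75,237\n'
footer_text = '\n@ : Away, + : Neutral Site\n'

# Same idea as the first snippet: one list of strings per row.
datasets = [text.split('\n') for text in (header_text, game_text, footer_text)]

# Same idea as the second snippet: skip the last row, pick three cells,
# and let expandtabs(30) pad each tab out to the next 30-character column.
lines = []
for x in range(len(datasets) - 1):
    t = datasets[x]
    lines.append((t[1] + '\t' + t[2] + '\t' + t[5]).expandtabs(30))

for line in lines:
    print(line)
```

Under these assumptions t[0] is an empty string (the text before the first newline), so t[1], t[2] and t[5] are the date, opponent and attendance. The range(_len - 1) presumably skips the footer row, which has too few cells for t[5].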

I have learned some code for csv.writer. Below is a sample.

with open('test_cfbstats.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Date', 'Opponent'])
    writer.writerows(data)

How would you suggest modifying this for use in your code? I'm not sure whether the writerow call would still be necessary, and would writerows take datasets instead of data?
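My rough guess at how the pieces might fit together is sketched below. write_rows is a helper name I made up, and the two game rows are copied from the output further down; it writes to an in-memory buffer so it can be run without touching the disk. As far as I can tell, writerow writes a single row (useful for a header) and writerows writes a list of rows, so the header call would only be needed if the header is not already the first row of the data.

```python
import csv
import io


def write_rows(f, rows, header=None):
    """Write an optional header row, then every row in rows, to file object f."""
    writer = csv.writer(f)
    if header:
        writer.writerow(header)   # one row
    writer.writerows(rows)        # a list of rows


header = ['Date', 'Opponent', 'Result', 'Game Time', 'Attendance']
games = [
    ['09/03/18', 'Virginia Tech', 'L 3-24', '3:12', '75,237'],
    ['09/08/18', 'Samford', 'W 36-26', '3:51', '72,239'],
]

# In-memory stand-in for open('test_cfbstats.csv', 'w', newline='')
buf = io.StringIO(newline='')
write_rows(buf, games, header)
csv_text = buf.getvalue()
print(csv_text)
```

With a real file, the same call would sit inside the with open(...) block from my sample. Note that csv.writer quotes the attendance figures because they contain a comma.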

File "cfbstats_larz.py", line 9, in <module>
soup = BeautifulSoup(page, 'html.parser')
NameError: name 'page' is not defined

Is it something I did?
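Looking at it again, the traceback points at the line that passes page to BeautifulSoup, but the quoted script only ever assigns the response to r, so the name page is never defined. A minimal sketch of one possible fix, assuming the intent was to parse the body of the response (r.text). make_soup and the sample HTML string are made up for illustration so the snippet runs without a network call:

```python
from bs4 import BeautifulSoup


def make_soup(html_text):
    """Parse an HTML string with the built-in html.parser backend."""
    return BeautifulSoup(html_text, 'html.parser')


# In the quoted script, the two lines would presumably become:
#     r = requests.get(url)
#     soup = make_soup(r.text)   # r.text instead of the undefined name 'page'

# Made-up stand-in for the downloaded page.
sample = '<table class="team-schedule"><tr><th>Date</th><th>Opponent</th></tr></table>'
soup = make_soup(sample)
table = soup.find_all('table', {'class': 'team-schedule'})[0]
headers = [th.text.strip() for th in table.find_all('th')]
print(headers)
```

Assigning page = r.text before the BeautifulSoup call should work equally well.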

(Dec-15-2018, 11:10 PM) Larz60+ wrote: I did it a bit differently, same results:

import requests
from bs4 import BeautifulSoup
import csv
import os


url = 'http://www.cfbstats.com/2018/team/234/index.html'
r = requests.get(url)
soup = BeautifulSoup(page, 'html.parser')
 
table = soup.findAll("table",{"class": "team-schedule"})[0]
trs = table.find_all('tr')
header = []
for n, tr in enumerate(trs):
    if n == 0:
        # Get Header
        ths = tr.find_all('th')
        for th in ths:
            header.append(th.text.strip())
        for item in header:
            print('{:22}'.format(item), end='')
        print()

        continue
    else:
        game_item = []
        tds = tr.find_all('td')
        for td in tds:
            game_item.append(td.text.strip())
    for item in game_item:
        print('{:22}'.format(item), end='')
    print()

Output:
Date                  Opponent              Result                Game Time             Attendance
09/03/18              Virginia Tech         L 3-24                3:12                  75,237
09/08/18              Samford               W 36-26               3:51                  72,239
09/15/18              @ 17 Syracuse         L 7-30                3:37                  37,457
09/22/18              Northern Ill.         W 37-19               3:34                  65,633
09/29/18              @ Louisville          W 28-24               3:27                  52,798
10/06/18              @ Miami (Fla.)        L 27-28               4:01                  65,490
10/20/18              Wake Forest           W 38-17               3:34                  67,274
10/27/18              2 Clemson             L 10-59               3:47                  68,403
11/03/18              @ North Carolina St.  L 28-47               3:33                  57,600
11/10/18              @ 3 Notre Dame        L 13-42               3:22                  77,622
11/17/18              Boston College        W 22-21               3:31                  57,274
11/24/18              10 Florida            L 14-41               3:27                  71,953
@ : Away, + : Neutral Site

My apologies for combining the replies; I don't know what happened.
