Python Forum
Reading Baseball Box Scores
Here is a sample of 100 baseball box scores. They are generated automatically
by another app, and a single file may contain hundreds of them chained together.
This file contains 100, each separated by a date/time line.
Concentrate on two lines of each box score.

They are the 2 lines immediately following the line that contains "R  H  E".
These 2 lines are called the linescore lines. Each linescore is arranged in groups of 3 innings
played, followed by a space, and then 3 more innings, etc. A typical game has 3 groups of 3 innings
(for a total of 9 innings). If the HOME team (the 2nd line) has a dash ("-") where the 9th inning
would appear, the HOME team did not need to bat in its half of the 9th inning,
because it was already leading and had won the game.

Extra-inning games will have additional groups of 3. On the same line, following the inning-by-inning
linescore (regardless of the number of innings played), are the game totals of "Runs", "Hits", and
"Errors" for the given team, i.e. the total number of runs, hits, and errors accumulated by that team in the game.

Each of the two linescore lines begins with the 2-digit year of the team playing, followed by the team
name, then an inning-by-inning tally of runs scored by that team, and finally 3 entries giving the
total number of runs scored, hits made, and errors made by the team. The visiting team is on the first
line; the HOME team's stats are on the next (second) line of the linescore.
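
The layout described above could be parsed roughly as follows. This is only a sketch: the helper name parse_linescore and the sample line are hypothetical, and it assumes the fields are whitespace-separated, that the last 3 entries are always R, H and E, and that no team scores 10 or more runs in a single inning (the attachment may format that case differently).

```python
import re


def parse_linescore(line):
    """Split one linescore line into year, team, innings and R/H/E totals.

    Assumed layout (not confirmed against the attachment):
    '<yy> <team name> <inning groups ...> <R> <H> <E>'
    """
    tokens = line.split()
    if len(tokens) < 6:
        raise ValueError(f"not a linescore line: {line!r}")

    year = tokens[0]
    runs, hits, errors = (int(t) for t in tokens[-3:])

    # Everything between the year and the totals is team name + inning groups.
    middle = tokens[1:-3]
    groups = []
    # Walk back from the right: inning groups contain only digits and '-'.
    # Always leave at least one token for the team name.
    while len(middle) > 1 and re.fullmatch(r"[0-9-]+", middle[-1]):
        groups.insert(0, middle.pop())

    team = " ".join(middle)
    # One character per inning (assumes single-digit runs per inning);
    # '-' means the half inning was not played and is stored as None.
    innings = [None if ch == "-" else int(ch) for ch in "".join(groups)]

    return {
        "year": year,
        "team": team,
        "innings": innings,
        "runs": runs,
        "hits": hits,
        "errors": errors,
    }


# Hypothetical sample line, made up for illustration only.
sample = "27 New York      010 200 30-    6 11  1"
parsed = parse_linescore(sample)
```

Reading the totals from the right-hand end means the number of inning groups does not matter, so extra-inning games need no special handling.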

In this example, the lines of interest for the first 3 games listed are:
29 & 30 (for the 1st game)
58 & 59 (for the 2nd game)
87 & 88 (for the 3rd game)

NOTE: These line numbers will vary with the number of players/pitchers who played, etc.
You will have to "find" the lines of interest with code.
The code should find the line in each box score that contains the string "R  H  E"
(2 spaces between the "R" and the "H", and 2 spaces between the "H" and the "E").
Once you find that string, the next 2 lines are the ones you're looking for.
The AWAY (or visiting) team is on the first line, and the HOME team is on the 2nd line.
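
A minimal sketch of that search is shown here. The function name find_linescores and the sample lines are hypothetical. The pattern accepts any run of whitespace between the letters rather than exactly 2 spaces, which is an assumption made so that files with slightly different spacing still match.

```python
import re

# Header marker; the post specifies exactly 2 spaces between the letters,
# this pattern is deliberately a little more tolerant (an assumption).
MARKER = re.compile(r"\bR\s+H\s+E\b")


def find_linescores(lines):
    """Return a list of (away_line, home_line) pairs, one per box score."""
    lines = [ln.rstrip("\n") for ln in lines]
    pairs = []
    for i, line in enumerate(lines):
        # Only accept the marker if two more lines actually follow it.
        if MARKER.search(line) and i + 2 < len(lines):
            pairs.append((lines[i + 1], lines[i + 2]))
    return pairs


# Made-up lines for illustration; the real file layout may differ.
sample_lines = [
    "5/15/2024 5:32 PM   game #1\n",
    "  ...batting and pitching lines...\n",
    "                    R  H  E\n",
    "27 Away  000 000 000    0  3  1\n",
    "27 Home  100 000 00-    1  5  0\n",
    "5/15/2024 5:40 PM   game #2\n",
]
pairs = find_linescores(sample_lines)
```

With a real file, the same function can be given the open file object, for example find_linescores(open(path)), since iterating over a file yields its lines.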

Here are 3 sample game box scores. Each new box score has a game number at the end of its
first line, marked by the "#" (pound sign):

[See File Attachment]

Write a Python program that:
Prompts the user for the filename/location of the data file containing the box scores.
Opens that file and finds all linescore lines (the 2 lines you're looking for in each box score).

Identifies and counts the teams participating in the entire list of box scores.
Counts how many games each team won and lost.
Totals the Runs, Hits, and Errors accumulated by each team.
Saves all readable output to "SBSMatrix.dat".
Attachment: x1.txt (Size: 174.89 KB / Downloads: 32)
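
One possible shape for the tallying and output step, as a sketch only. The names tally, format_report and write_report are hypothetical. It assumes each game has already been reduced to an (away, home) pair of (team, runs, hits, errors) tuples, that a team is identified by whatever string the caller passes (here year plus name), and that a game with equal run totals is counted as neither a win nor a loss.

```python
from collections import defaultdict


def tally(games):
    """Accumulate W/L and R/H/E per team.

    games: iterable of (away, home); each side is (team, runs, hits, errors).
    """
    totals = defaultdict(lambda: {"W": 0, "L": 0, "R": 0, "H": 0, "E": 0})
    for away, home in games:
        for team, runs, hits, errors in (away, home):
            totals[team]["R"] += runs
            totals[team]["H"] += hits
            totals[team]["E"] += errors
        # More runs wins; equal totals are skipped (an assumption).
        if away[1] != home[1]:
            winner, loser = (away, home) if away[1] > home[1] else (home, away)
            totals[winner[0]]["W"] += 1
            totals[loser[0]]["L"] += 1
    return dict(totals)


def format_report(totals):
    """Build a plain-text table, one row per team, sorted by team name."""
    rows = [f"{'Team':<20}{'W':>4}{'L':>4}{'R':>6}{'H':>6}{'E':>6}"]
    for team in sorted(totals):
        t = totals[team]
        rows.append(
            f"{team:<20}{t['W']:>4}{t['L']:>4}{t['R']:>6}{t['H']:>6}{t['E']:>6}"
        )
    return "\n".join(rows) + "\n"


def write_report(totals, path="SBSMatrix.dat"):
    """Save the readable table to the output file named in the task."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(format_report(totals))


# Made-up results for illustration only.
games = [
    (("27 Yankees", 6, 11, 1), ("27 Senators", 3, 7, 2)),
    (("27 Senators", 5, 9, 0), ("27 Yankees", 4, 8, 1)),
]
totals = tally(games)
report = format_report(totals)
```

In a full program the filename would come from input(), and write_report(totals) would then create "SBSMatrix.dat" in the current directory.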


Messages In This Thread
Reading Baseball Box Scores - by tommythumb - May-15-2024, 05:32 PM
RE: Reading Baseball Box Scores - by Pedroski55 - May-16-2024, 09:13 AM
RE: Reading Baseball Box Scores - by tommythumb - May-19-2024, 06:25 AM
RE: Reading Baseball Box Scores - by Pedroski55 - May-21-2024, 09:13 AM
