Jun-06-2019, 03:01 PM
That makes sense, but for whatever reason I didn't think it was the proper approach, based on what I've read in similar studies done in R. Maybe I need a different method? Here is a quote from an article that discusses the same thing I'm attempting to do; it probably explains what I'm trying to accomplish better than I can.
Quote:Our goal is to use GAMs to learn about each umpire. To start, we grabbed pitch-level data from BS. Next, we fit a GAM for each umpire to identify the likelihood of taken pitches being called a strike and extrapolated from this model the percent chance a taken pitch is called a strike on each part of the plate. Finally, we compared each umpire’s estimated zone with one estimated on all umpires across the major leagues to roughly identify where each umpire has called either fewer or more strikes.
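To make the quoted workflow concrete, here is a rough standard-library sketch of its three steps: estimate a smooth strike probability over the plate for one umpire, do the same for all umpires pooled, and subtract the two on a grid. It is not a GAM. A simple Gaussian kernel smoother (Nadaraya-Watson) stands in for the fitted model so the example runs without extra packages; in practice a GAM library (for example pyGAM's `LogisticGAM`) would presumably replace `strike_prob`. The data are simulated, and every name here (`simulate_pitches`, `strike_prob`, `zone_difference`, the zone dimensions, the bandwidth) is a hypothetical choice made for illustration, not something taken from the article.

```python
import math
import random


def simulate_pitches(n, shift_x=0.0, seed=0):
    """Return n synthetic taken pitches as (px, pz, called_strike) tuples.

    ASSUMPTION: made-up data. px is the horizontal location and pz the height,
    both in feet; shift_x moves the centre of this umpire's zone sideways.
    """
    rng = random.Random(seed)
    pitches = []
    for _ in range(n):
        px = rng.uniform(-2.0, 2.0)
        pz = rng.uniform(0.5, 4.5)
        # Scaled distance from the zone centre; about 1.0 at the zone edge.
        d = math.hypot((px - shift_x) / 0.83, (pz - 2.5) / 1.0)
        # Strike probability falls off smoothly around the edge.
        p = 1.0 / (1.0 + math.exp(6.0 * (d - 1.0)))
        pitches.append((px, pz, 1 if rng.random() < p else 0))
    return pitches


def strike_prob(pitches, x, z, bandwidth=0.35):
    """Kernel-weighted share of called strikes near (x, z).

    Stand-in for a GAM prediction: nearby pitches get more weight.
    """
    num = 0.0
    den = 0.0
    for px, pz, called in pitches:
        w = math.exp(-((px - x) ** 2 + (pz - z) ** 2) / (2.0 * bandwidth ** 2))
        num += w * called
        den += w
    return num / den if den > 0 else float("nan")


def zone_difference(umpire, league, xs, zs):
    """Umpire minus league strike probability at every (x, z) grid point."""
    return {
        (x, z): strike_prob(umpire, x, z) - strike_prob(league, x, z)
        for x in xs
        for z in zs
    }


if __name__ == "__main__":
    league = simulate_pitches(3000, shift_x=0.0, seed=1)
    umpire = simulate_pitches(1500, shift_x=0.4, seed=2)  # zone shifted sideways
    diff = zone_difference(umpire, league, xs=[-1.0, 0.0, 1.0], zs=[1.5, 2.5, 3.5])
    for (x, z), d in sorted(diff.items()):
        print(f"x={x:+.1f} z={z:.1f} umpire - league = {d:+.2f}")
```

Positive values mark spots where this umpire calls more strikes than the pooled estimate, negative values fewer. With real data the `(px, pz, called_strike)` tuples would come from the pitch-level file, filtered to taken pitches and grouped by umpire.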