Python Forum
I cannot figure out this error
#1
#!/usr/bin/python
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import xgboost as xgb

from sklearn import metrics

df = pd.read_csv('plays.csv')

print(len(df))
print(df.head())



# drop st plays

df = df[~df['isSTPlay']]
print (len(df))

# drop kneels

df = df[~df['playDescription'].str.contains('kneels')]
print(len(df))


#drop overtime

df = df[~(df['quarter']  == 5)]
print(len(df))

#convert time/quarters 
def translate_game_clock(row):
    raw_game_clock = row['GameClock']
    quarter = row['quarter']
    minutes, seconds_raw = raw_game_clock.partition(':')[::2]
    
    seconds = seconds_raw.partition(':')[0]
    
    total_seconds_left_in_quarter = int(seconds) + (int(minutes) * 60)
    
    if quarter == 3 or quarter == 1:
         return total_seconds_left_in_quarter + 900
    elif quarter == 4 or quarter == 2: 
         return total_seconds_left_in_quarter
            
if 'GameClock' in list (df.columns):
    df["secondsLeftInHalf"] = df.apply(translate_game_clock, axis=1)

if 'quarter' in list (df.columns): 
    df["half"] = df['quarter'].map(lambda q: 2 if q > 2 else 1)
    
    
    
    

def yards_to_endzone(row):
    if row['possessionTeam'] == row['yardlineSide']:
        
        return 100 - row['yardlineNumber']
    
    else : 
                                    
        return row['yardlineNumber']
                                
df['yardsToEndzone'] = df.apply(yards_to_endzone, axis = 1)

def transform_off_personnel(row):

    rb_count = 0

    te_count = 0

    wr_count = 0

    ol_count = 0

    dl_count = 0

    db_count = 0

    if not pd.isna(row['personnel.offense']):
        personnel = row['personnel.offense'].split(',')

    for p in personnel:

        if p[2:4] == 'RB':

            rb_count = int(p[0])
        elif p[2:4] == 'TE':

            te_count = int(p[0])
        elif p[2:4] == 'WR':

            wr_count = int(p[0])
        elif p[2:4] == 'OL':

            ol_count = int(p[0])
        elif p[2:4] == 'DL':

            dl_count = int(p[0])
        elif p[2:4] == 'DB':

            db_count = int(p[0])


    return pd.Series([
            rb_count,
            te_count,
            wr_count,
            ol_count,
            dl_count,
            db_count,
    ])

df[[
    'rb_count',
    'te_count',
    'wr_count',
    'ol_count',
    'dl_count',
    'db_count',
    ]] = df.apply(transform_off_personnel, axis=1)

df['offenseFormation'] = df['offenseFormation'].map(lambda f: ('EMPTY' if pd.isna(false) else f))


def formation(row):

    form = row['offenseFormation'].strip()

    if form == 'SHOTGUN':

        return 0
    elif form == 'SINGLEBACK':

        return 1
    elif form == 'EMPTY':

        return 2
    elif form == 'I_FORM':

        return 3
    elif form == 'PISTOL':

        return 4
    elif form == 'JUMBO':

        return 5
    elif form == 'WILDCAT':

        return 6
    elif form == 'ACE':

        return 7
    else:

        return -1


df['numericFormation'] = df.apply(formation, axis=1)

print(df.yardlineNumber.unique())

def play_type(row):
    if row['PassResult'] == 'I' or row['PassResult'] == 'C' or row['PassResult'] == 'S':
        
        return 'Passing'
                                       
    else:
                                       
        return 'Rushing' 

df['play_type'] = df.apply(play_type, axis = 1)
df['numeric_PlayType'] = df['play_type'] .map(lambda p : 1 if p == 'Passing' else 0)                


df_final= df[['down','yardsToGo','yarsdtoEndzone','rb_count','te_count','wr_count','ol_count','db_count','secondsLeftInHalf',
             'half','numericPlayType', 'numericFormation','play_type']]

print(df.final.describe(include='all'))

print(df.yardlineNumber.unique())

df['yardlineNumber'] = df['yardlineNumber'].fillna(50)

sns.catplot(x='play_type', kind='count', data=df_final, orient='h')

plt.show()

sns.catplot(x="down", kind="count", hue='play_type', data=df_final)

plt.show()

sns.lmplot(x="yrdsToGo", y="numericPlayType", data=df_final, y_jitter=0.03, Logistic=True, aspect=2);

plt.show()

train_df, validation_df, test_df = np.split(df_final.sample(frac=1),[int(0.7 * len(df)),int(0.9 * len(df))])

print("Training size is %d, validation size is %d, test_size is %d" % (len(train_df), len(validation_df),len(test_df)))
                                                                       
                                                                       

train_clean_df = train_df.drop(columns=['numericPlayType'])

d_train = xgb.DMatrix(train_clean_df, label=train_df['numericPlayType'],feature_names=list(train_clean_df))

val_clean_df = train_df.drop(columns=['numericPlayType'])

d_val = xgb.DMatrix(val_clean_df, label=valiation_df['numericPlayType'],feature_names=list(val_clean_df))

eval_list [(d_train, 'train'), (d_val, 'eval')]
            
results ={}            

parm = {
    
   'objective': 'binary:logistic',
    
    'eval_metric':  'auc',
    
    'max_depth': 5,
    
    'eta': 0.2,
    
    'rate_drop': 0.2,
    
    'min_child_weight': 6,
    
    'gama' : 4,
    
    'subsample': 0.8,
    
    'alpha': 0.1
    
}

num_round = 250
xgb_model = xgb.train(param, d_train, num_round, eval_list, early_stopping_rounds=8)

test_clean_df = test_df.drop(columns=['numericPlayType'])
d_test = xgb.DMatrix(test_clean_df, label=test_df['numericPlayType'], feature_names=list(test_clean_df))

actual = test_df['numericPlayType']
predictions = xgb_model.predict(d_test)
print(predictions[:5])

accuracy_predictions = np.round(predictions)
accuracy = metrics.accuracy_score(actual, rounded_predictions)
print("Metrics:\nAccuracy: % 4f" % (accuracy))
I am not sure what is going on. My Python is not good enough to detect the error.


Any help appreciated. Thanks in advance.

the link is:

https://opensource.com/article/19/10/pre...%20More%20

Error:
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-1-bceecf1fca01> in <module>
    123     'dl_count',
    124     'db_count',
--> 125     ]] = df.apply(transform_off_personnel, axis=1)
    126
    127 df['offenseFormation'] = df['offenseFormation'].map(lambda f: ('EMPTY' if pd.isna(false) else f))

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
   6926             kwds=kwds,
   6927         )
-> 6928         return op.get_result()
   6929
   6930     def applymap(self, func):

/opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in get_result(self)
    184             return self.apply_raw()
    185
--> 186         return self.apply_standard()
    187
    188     def apply_empty_result(self):

/opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in apply_standard(self)
    290
    291         # compute the result using the series generator
--> 292         self.apply_series_generator()
    293
    294         # wrap results

/opt/conda/lib/python3.7/site-packages/pandas/core/apply.py in apply_series_generator(self)
    319             try:
    320                 for i, v in enumerate(series_gen):
--> 321                     results[i] = self.f(v)
    322                     keys.append(v.name)
    323             except Exception as e:

<ipython-input-1-bceecf1fca01> in transform_off_personnel(row)
     85         personnel = row['personnel.offense'].split(',')
     86
---> 87     for p in personnel:
     88
     89         if p[2:4] == 'RB':

UnboundLocalError: ("local variable 'personnel' referenced before assignment", 'occurred at index 12807')
Respectfully,

ErnestTBass

Attached Files

.zip   plays-m.zip (Size: 17.2 KB / Downloads: 330)
Reply
#2
Hint: what do you think happens if the expression not pd.isna(row['personnel.offense']) on line 84 evaluates to False?

Reply
#3
personnel is defined on line 85.

That should be it.

I am sorry for taking so long to get back to you. But other things intervened.

As I said

personnel

is defined right before it is used.

There should be no error. This is not my Python code, but somebody else's. I left the reference in the first or second post. I have programmed a lot, but not in Python. A variable must be defined before it is used in any computer language.

Please drop another hint.

Respectfully,

ErnestTBass
Reply
#4
I will admit it. I am confused. What is the Python code trying to do from line 84 to 106?

My python 3+ is just not that good.

Please explain.

Any help appreciated. Thanks in advance.

Respectfully,

ErnestTBass
Reply
#5
When this evaluates to False
if not pd.isna(row['personnel.offense']):
this is not executed
personnel = row['personnel.offense'].split(',')
so when you get here
for p in personnel:
personnel is not defined.

Your code only works if the "if" statement evaluates to True.
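The same failure can be reproduced without pandas. What follows is only a minimal sketch: the function name `count_items` and its inputs are made up for illustration and are not part of the original program.

```python
def count_items(value):
    # `items` is only assigned when the condition is True.
    if value is not None:
        items = value.split(',')

    # When value is None the assignment above was skipped, so this
    # line raises UnboundLocalError (a subclass of NameError).
    return len(items)


# The if-branch runs here, so `items` exists and the call succeeds.
print(count_items('1 RB, 1 TE, 3 WR'))

# The if-branch is skipped here, so the function fails at `len(items)`.
try:
    count_items(None)
except UnboundLocalError as err:
    error_message = str(err)
    print('failed:', error_message)
```

The variable is "defined" in the source text, but Python only creates it when the assignment actually executes.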

You are long past being able to use "My python is not that good" as an excuse. Your Python is fine; your debugging is weak. If you don't understand what is going on in your program, add some code to help you understand better. If I had your problem, I would add a few print statements to help me understand:
    try:  # Check if personnel is defined
        print(personnel)
    except NameError:  # UnboundLocalError is a subclass of NameError
        print('personnel not defined')
    print('checking if personnel.offense is available')
    if not pd.isna(row['personnel.offense']):
        print('it is')
        personnel = row['personnel.offense'].split(',')

    try:  # Check if personnel is defined now
        print(personnel)
    except NameError:
        print('personnel still not defined')
    for p in personnel:
I would run the code and look at what is printed. I would see that personnel is not defined before the if statement. I would see that the if statement evaluates to False because it does not print "it is", and I would see that personnel is still not defined after the if statement. Now I know that I need to protect the code below, maybe by moving it inside the if statement. After I get the code working, I remove the extra debugging statements.
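One way to protect the loop is to give personnel a default value so the loop is always safe to run. This is a sketch rather than the original author's fix. The two-row DataFrame is a hypothetical stand-in for plays.csv, the value format '1 RB, 1 TE, 3 WR' is an assumption about the data, and the `strip()` call is added on that assumption, because a space after each comma would otherwise shift the `p[2:4]` slice.

```python
import pandas as pd

POSITIONS = ('RB', 'TE', 'WR', 'OL', 'DL', 'DB')


def transform_off_personnel(row):
    # Start every count at zero so a missing value yields all zeros.
    counts = dict.fromkeys(POSITIONS, 0)

    value = row['personnel.offense']
    # Default to an empty list so the loop below never sees an
    # unassigned variable, even when the value is NaN.
    personnel = [] if pd.isna(value) else value.split(',')

    for p in personnel:
        p = p.strip()  # assumed format: ' 1 TE' becomes '1 TE'
        position = p[2:4]
        if position in counts:
            counts[position] = int(p[0])

    return pd.Series([counts[pos] for pos in POSITIONS])


# Hypothetical stand-in for plays.csv; the second row has no personnel.
df = pd.DataFrame({'personnel.offense': ['1 RB, 1 TE, 3 WR', float('nan')]})
result = df.apply(transform_off_personnel, axis=1)
print(result)
```

Moving the `for` loop inside the `if` block would work just as well; the default value simply keeps the function flat.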
Reply
#6
I would like to add one tidbit to deanhystad's excellent answer: starting with Python 3.8, f-strings support = for self-documenting expressions and debugging. I have found it to be very useful and addictive :-)

>>> personnel = ['very', 'confidential']
>>> f'{personnel=}'
"personnel=['very', 'confidential']"
As a side effect, one can now get a variable's name as a string without any magic or hacks:

>>> f'{personnel=}'.split('=')[0]
'personnel'
Reply
#7
Thank you for your replies on all fronts. Yes, they were helpful.

Now to my concern: yes it seems personnel is not defined.

But what do you do then? It has to be defined! Must be.

I mean if it does not go through the loop with personnel defined
then I am up the creek without a paddle.

This seems to be a formatting problem. That is why it is not so obvious to me.

If it were a syntax problem then it is easy, even a logic problem is easy.

Python is very big on formatting (other computer languages use formatting, but in a very simple way), which is why I use PEP8 or Black to format all of my Python code. I just do not get it.

Some of it makes sense and much of it (formatting) does not. I got that piece of code from another app that I did not write.

So if it is a formatting problem then it is beyond me. That is probably why I have not fixed the bug.

In python one must worry about syntax, logic and formatting in no particular order.

This is why I am hitting this error.

Any help appreciated.

Respectfully,

ErnestTBass
Reply
#8
I guess you had better check what pd.isna(row['personnel.offense']) returns and compare it to your expectations.
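A quick way to run that check on the whole column at once is sketched below. The three-row DataFrame is a hypothetical stand-in for the real plays.csv; only the column name comes from the thread.

```python
import pandas as pd

# Hypothetical stand-in for the real plays.csv.
df = pd.DataFrame({
    'personnel.offense': ['1 RB, 1 TE, 3 WR', None, '2 RB, 1 TE, 2 WR'],
})

# Boolean mask: True wherever the value is missing (NaN or None).
missing = df['personnel.offense'].isna()
missing_count = int(missing.sum())

print('rows with no personnel.offense:', missing_count)
print(df[missing])  # the offending rows, with their index labels
```

Any row listed here is one where the `if` branch in transform_off_personnel is skipped.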
Reply
#9
Yes, I have done that. Right out of the gate, when the program reaches that line of code, "personnel.offense" is NaN.


Somehow that must be dealt with: either skip the if statement and the counting of offensive positions, or eliminate all rows with NaN under personnel.offense.

I chose to do the latter by inserting a line in the Python code right after the three-line section dealing with "kneels".

I am a little unsure about this because that would eliminate the whole row, which might have useful info for some other parts of the program.

I could always program it to detect NaN and skip the rest of the if statement, but it seems to be doing that now and it is still crashing.

If not pd.isna(row['personnel.offense']) is False, then it should just skip the if statement; yet it crashes. That is curious and could probably be answered by someone whose Python is better than mine.
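As explained in post #5, the crash happens because the `for` loop sits outside the `if` block, so it still runs after the `if` is skipped. The two options weighed above could look roughly like the sketch below. The DataFrame is hypothetical sample data, and the choice between the options depends on whether the other columns of those rows are still needed.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the thread.
df = pd.DataFrame({
    'down': [1, 2, 3],
    'personnel.offense': ['1 RB, 1 TE, 3 WR', None, '2 RB, 1 TE, 2 WR'],
})

# Option 1: drop only the rows where this one column is missing.
dropped = df.dropna(subset=['personnel.offense'])

# Option 2: keep every row and substitute an empty string, so that
# ''.split(',') gives [''] and no position code matches in the loop.
filled = df.assign(**{'personnel.offense': df['personnel.offense'].fillna('')})

print(len(df), len(dropped), len(filled))
```

Option 2 keeps the row's other columns (down, yardsToGo and so on) available to the rest of the program, at the cost of all-zero personnel counts for that play.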

Any help appreciated. Thanks in advance.

Respectfully,

ErnestTBass    
Reply

