Python Forum
AUC and other training/validation metrics coming in at 1.000... is this overfitting?
#1
I'm working on a homework assignment to predict hospital readmissions from a dataset my school provided. I found a walkthrough (https://medium.com/@awlong20/i-added-the...006defd960) that uses a different dataset, but I believe it is similar enough to help.

However, K-nearest neighbors and logistic regression (I haven't tried other models yet) are both showing 1.000 for AUC and some other metrics on training and validation. Is this overfitting? If so, what can I do to fix it? I'm happy to share more code if it's helpful. Also, I've tried thresh values of 0.50 and 0.36.

KNN
Training:
AUC:1.000
accuracy:0.851
recall:1.000
precision:0.770
specificity:0.664
prevalence:0.500

Validation:
AUC:1.000
accuracy:0.811
recall:1.000
precision:0.655
specificity:0.670
prevalence:0.360



from sklearn.metrics import roc_auc_score, accuracy_score, precision_score, recall_score

def calc_prevalence(y_actual):
    # assumed definition (helper not shown in this post): fraction of positive samples
    return sum(y_actual) / len(y_actual)

def calc_specificity(y_actual, y_pred, thresh):
    # specificity = true negatives / all actual negatives
    return sum((y_pred < thresh) & (y_actual == 0)) / sum(y_actual == 0)

def print_report(y_actual, y_pred, thresh):
    # y_pred holds predicted probabilities; thresh turns them into class labels
    y_label = y_pred > thresh
    auc = roc_auc_score(y_actual, y_pred)  # threshold-independent, uses raw scores
    accuracy = accuracy_score(y_actual, y_label)
    recall = recall_score(y_actual, y_label)
    precision = precision_score(y_actual, y_label)
    specificity = calc_specificity(y_actual, y_pred, thresh)
    print('AUC:%.3f' % auc)
    print('accuracy:%.3f' % accuracy)
    print('recall:%.3f' % recall)
    print('precision:%.3f' % precision)
    print('specificity:%.3f' % specificity)
    print('prevalence:%.3f' % calc_prevalence(y_actual))
    print(' ')
    return auc, accuracy, recall, precision, specificity
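
For context on how the numbers above can fit together: AUC depends only on how the scores rank positives against negatives, while accuracy, recall and specificity depend on thresh. The sketch below is plain Python and does not use the sklearn calls above. It uses the pairwise definition of AUC (the fraction of positive/negative pairs ranked correctly, ties counted as half). The toy labels, scores and the helper names pairwise_auc and thresholded_metrics are made up for illustration and are not from my assignment data.

```python
def pairwise_auc(y_actual, y_pred):
    # AUC as the fraction of (positive, negative) pairs where the positive
    # gets the higher score; ties count as half a win.
    pos = [p for y, p in zip(y_actual, y_pred) if y == 1]
    neg = [p for y, p in zip(y_actual, y_pred) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def thresholded_metrics(y_actual, y_pred, thresh):
    # A score strictly above thresh is treated as a positive prediction.
    pairs = list(zip(y_actual, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p > thresh)
    fn = sum(1 for y, p in pairs if y == 1 and p <= thresh)
    tn = sum(1 for y, p in pairs if y == 0 and p <= thresh)
    fp = sum(1 for y, p in pairs if y == 0 and p > thresh)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, recall, specificity


# Hypothetical toy data: every positive scores higher than every negative,
# but two negatives still sit above a threshold of 0.36.
y_actual = [0, 0, 0, 1, 1]
y_pred = [0.10, 0.40, 0.45, 0.60, 0.90]

auc = pairwise_auc(y_actual, y_pred)                      # 1.0
low = thresholded_metrics(y_actual, y_pred, thresh=0.36)  # accuracy 0.6, recall 1.0, specificity 1/3
mid = thresholded_metrics(y_actual, y_pred, thresh=0.50)  # all three are 1.0

print('AUC:%.3f' % auc)
print('thresh 0.36 -> accuracy:%.3f recall:%.3f specificity:%.3f' % low)
print('thresh 0.50 -> accuracy:%.3f recall:%.3f specificity:%.3f' % mid)
```

In this toy case the ranking is perfect, so AUC is 1.000 at any threshold, but a low threshold still lets negatives through, which gives recall 1.000 with lower accuracy and specificity. That is the same pattern as in my KNN report.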