Python Forum

data = pd.read_csv('data.csv')

X = data[:,:-1]
Y = data['Outcome']

X_train,X_test,Y_train,Y_test = train_test_split(X,Y, test_size=0.3)

model = GaussianNB()

model.fit(X_train,Y_train)

y_pred = model.predict(X_test)

acc = accuracy_score(Y_test,y_pred)
cm = confusion_matrix(y_pred,Y_test)

print(cm)
print(acc)


For starters, you're not loading the file diabetes.csv (your script reads 'data.csv').

Then, with print diagnostics added (some errors remain, left for you to fix):

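import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix
import os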


# On my system the next line is needed so the input file is found (same dir as script)
os.chdir(os.path.abspath(os.path.dirname(__file__)))
data = pd.read_csv('diabetes.csv')

# This will print entire dataframe (with ellipsis)
print(f"All data:\n{data}")

X = data[1:-1]
print(f"\nAll but last row, X:\n{X}")

Y = data['Outcome']
print(f"\nLast column:Y\n{Y}")

X_train,X_test,Y_train,Y_test = train_test_split(X,Y, test_size=0.3)
 
model = GaussianNB()
 
model.fit(X_train,Y_train)
 
y_pred = model.predict(X_test)
 
acc = accuracy_score(Y_test,y_pred)
cm = confusion_matrix(y_pred,Y_test)
 
print(cm)
print(acc)

Output:
All data:
     Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
0              6      148             72             35        0  33.6                     0.627   50        1
1              1       85             66             29        0  26.6                     0.351   31        0
2              8      183             64              0        0  23.3                     0.672   32        1
3              1       89             66             23       94  28.1                     0.167   21        0
4              0      137             40             35      168  43.1                     2.288   33        1
..           ...      ...            ...            ...      ...   ...                       ...  ...      ...
763           10      101             76             48      180  32.9                     0.171   63        0
764            2      122             70             27        0  36.8                     0.340   27        0
765            5      121             72             23      112  26.2                     0.245   30        0
766            1      126             60              0        0  30.1                     0.349   47        1
767            1       93             70             31        0  30.4                     0.315   23        0

[768 rows x 9 columns]

All but last row, X:
     Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
1              1       85             66             29        0  26.6                     0.351   31        0
2              8      183             64              0        0  23.3                     0.672   32        1
3              1       89             66             23       94  28.1                     0.167   21        0
4              0      137             40             35      168  43.1                     2.288   33        1
5              5      116             74              0        0  25.6                     0.201   30        0
..           ...      ...            ...            ...      ...   ...                       ...  ...      ...
762            9       89             62              0        0  22.5                     0.142   33        0
763           10      101             76             48      180  32.9                     0.171   63        0
764            2      122             70             27        0  36.8                     0.340   27        0
765            5      121             72             23      112  26.2                     0.245   30        0
766            1      126             60              0        0  30.1                     0.349   47        1

[766 rows x 9 columns]

Last column:Y
0      1
1      0
2      1
3      0
4      1
      ..
763    0
764    0
765    0
766    1
767    0
Name: Outcome, Length: 768, dtype: int64
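
The shapes in the printout already hint at the problem: X has 766 rows and still contains the Outcome column, because plain slicing such as data[1:-1] selects rows of a DataFrame, not columns. A minimal sketch of the difference, using a tiny made-up DataFrame (the values and the reduced column set are assumptions for illustration, not the real diabetes.csv):

```python
import pandas as pd

# Tiny stand-in for diabetes.csv (hypothetical values, fewer columns)
data = pd.DataFrame({
    "Glucose": [148, 85, 183, 89],
    "BMI": [33.6, 26.6, 23.3, 28.1],
    "Outcome": [1, 0, 1, 0],
})

# Plain slicing on a DataFrame selects rows: this drops the first and
# last row and keeps every column, including Outcome
rows_sliced = data[1:-1]

# Positional column selection: all rows, every column except the last
features = data.iloc[:, :-1]

# Same selection by name, which does not depend on column order
features_by_name = data.drop(columns="Outcome")

print(rows_sliced.shape)       # (2, 3)
print(features.shape)          # (4, 2)
print(list(features.columns))  # ['Glucose', 'BMI']
```

Either iloc or drop keeps the row count equal to that of data['Outcome'].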

New error:

Error:
Traceback (most recent call last):
  File "/media/larz/Projects/projects/QRST/T/TryStuffNew/src/Jul_15_2022_1.py", line 21, in <module>
    X_train,X_test,Y_train,Y_test = train_test_split(X,Y, test_size=0.3)
  File "/media/larz/Projects/projects/QRST/T/TryStuffNew/venv/lib/python3.10/site-packages/sklearn/model_selection/_split.py", line 2430, in train_test_split
    arrays = indexable(*arrays)
  File "/media/larz/Projects/projects/QRST/T/TryStuffNew/venv/lib/python3.10/site-packages/sklearn/utils/validation.py", line 433, in indexable
    check_consistent_length(*result)
  File "/media/larz/Projects/projects/QRST/T/TryStuffNew/venv/lib/python3.10/site-packages/sklearn/utils/validation.py", line 387, in check_consistent_length
    raise ValueError(
ValueError: Found input variables with inconsistent numbers of samples: [766, 768]
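
The ValueError means X has 766 rows while Y has 768; train_test_split needs the same number of samples in both. One possible fix is sketched below, assuming Outcome is the target and every other column is a feature. To stay self-contained it builds a small random DataFrame instead of reading diabetes.csv (swap in pd.read_csv('diabetes.csv') for the real data). The random_state value is my own addition for repeatability. Note also that scikit-learn documents confusion_matrix as taking y_true first, so the arguments are swapped relative to the script above.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for diabetes.csv (assumption: the real file has more columns)
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "Glucose": rng.integers(60, 200, size=100),
    "BMI": rng.uniform(18.0, 45.0, size=100),
    "Outcome": rng.integers(0, 2, size=100),
})

X = data.drop(columns="Outcome")  # all rows, feature columns only
Y = data["Outcome"]               # all rows, target column only
assert len(X) == len(Y)           # the check that failed in the traceback

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=0
)

model = GaussianNB()
model.fit(X_train, Y_train)
y_pred = model.predict(X_test)

acc = accuracy_score(Y_test, y_pred)
# True labels first, predictions second; labels= forces a 2x2 matrix
cm = confusion_matrix(Y_test, y_pred, labels=[0, 1])

print(cm)
print(acc)
```

On random data like this the accuracy is meaningless; the point is only that the script runs once X and Y have the same length.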

SuperNinja3I3

Larz60+