Hello!!
I have a dataset with dimensions [38355, 257], and it has 14 classes. I tried to split the data into 70%, 15% and 15% for the training, validation and testing sets. At the same time I used stratify=y to make sure each class keeps the same proportion in every set.
The split code I used:
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.15, random_state=1, stratify=y)
X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, test_size=0.1764705, random_state=1, stratify=y)

The problem is that when applying the second split I got this error:
ValueError: Found input variables with inconsistent numbers of samples: [32601, 38355]

How can I fix the problem?
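A note on the likely cause, as far as I can tell: the two numbers in the error match the sizes of the arrays passed to the second call. After the first split X_train has 32601 rows, while stratify=y still points at the full label array with 38355 entries. Below is a minimal sketch of the second split stratified on y_train instead. The data here is synthetic (an assumption, 1400 samples with 14 balanced classes standing in for the real X and y), and 0.15 / 0.85 is just the unrounded form of 0.1764705.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption, not the original data):
# 1400 samples, 5 features, 14 classes with 100 samples each.
rng = np.random.default_rng(0)
X = rng.normal(size=(1400, 5))
y = np.repeat(np.arange(14), 100)

# First split: hold out 15% of all samples for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.15, random_state=1, stratify=y
)

# Second split: take 15% of the original total (0.15 / 0.85 of what is left)
# for testing. The stratify labels must have the same length as the arrays
# being split, so y_train is used here rather than the full y.
X_train, X_test, y_train, y_test = train_test_split(
    X_train, y_train, test_size=0.15 / 0.85, random_state=1, stratify=y_train
)

print(X_train.shape, X_val.shape, X_test.shape)
```

The only change from the code in the question is stratify=y_train in the second call; everything else follows the original two-step approach.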

