Python Forum
sklearn and train_test_split
#1
Hey everyone,

Could someone better explain train_test_split and what it's actually doing?

So from my understanding:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
When you give TTS an X variable, that is the dataset, and y is what you are trying to predict, correct? TTS then splits up the dataset on its own and assigns the pieces to X_train, X_test, y_train, and y_test, so you can train a model on them? And test_size is the fraction of the dataset held out as the test sample? Or am I completely off the mark?
#2
Your dataset has X and y variables in it, say days (X) and COVID-19 cases (y). These are all known values.
You send it to TTS, which by default shuffles the rows and splits them into a training portion and a test portion, keeping each X row paired with its y value. With test_size=0.33, about a third of the rows are held out for testing. You then train on the train values. Once trained, you can validate your model by testing it against the test set. So, if it predicts 500 cases for day 27 and the actual value is 600, you have an error of 100. You can then compute statistics (typically the MSE, mean squared error) on how closely your model predicts reality.
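
Here is a rough sketch of that split / train / test cycle. The day and case numbers are made up for illustration (not real data), and LinearRegression is just an assumed stand-in for whatever model you actually use:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Made-up data for illustration only: 100 days and a noisy linear case count.
rng = np.random.default_rng(0)
days = np.arange(1, 101)
cases = 20 * days + rng.normal(0, 50, size=days.size)

X = days.reshape(-1, 1)  # features must be 2-D: (n_samples, n_features)
y = cases                # target: one known value per row of X

# Rows are shuffled, then 33% are held out; X and y stay aligned row by row.
# random_state only makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
print(X_train.shape, X_test.shape)  # (67, 1) (33, 1)

# Train on the training rows only.
model = LinearRegression().fit(X_train, y_train)

# Compare predictions with the known answers the model never saw.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"test MSE: {mse:.1f}")
```

The test rows play no part in fitting, which is why the MSE on them is a fair estimate of how the model handles data it has not seen.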

Once you are comfortable with your model you can use it to predict new values - what will we be looking at on day 300?
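
A minimal sketch of that last step, again with invented numbers (exactly 20 cases per day) and LinearRegression as an assumed model. The one thing to watch is that predict() wants a 2-D array even for a single new value:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data (not real case counts): exactly 20 cases per day for 100 days.
days = np.arange(1, 101).reshape(-1, 1)
cases = 20 * days.ravel()

model = LinearRegression().fit(days, cases)

# A single new sample still has to be shaped (n_samples, n_features).
day_300 = np.array([[300]])
prediction = model.predict(day_300)
print(prediction[0])
```

Keep in mind day 300 is far outside the days the model was trained on, so the prediction is only as good as the assumption that the same trend continues.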