Python Forum
Basic data analysis and predictions
#4
With small datasets, the standard is an 80-20 train/test split. If you want train, validation, and test sets, it would be more like 60-20-20. Keep in mind that you are not supposed to adjust parameters to fix predictions on your test set. Instead, train on the training set, look at the results on the validation set, and go back to adjust (to avoid overfitting, etc.). When you are done, prove you did a good job by running the predictions on your test set. With a small dataset this may be hard, so you may have to compromise and use only a validation set or only a test set, though you will need to explain that in your paper.
So here is an example from one of my projects:
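# df is the full pandas DataFrame
# Hold out 20% as the test set; the sampled 80% is train + validation
trainval_dataset = df.sample(frac=0.8, random_state=42)
test_dataset = df.drop(trainval_dataset.index)
# Split that 80% again: 80% of it for training, the rest for validation
train_dataset = trainval_dataset.sample(frac=0.8, random_state=42)
validate_dataset = trainval_dataset.drop(train_dataset.index)
print(f"Train {train_dataset.shape} Validate {validate_dataset.shape} Test {test_dataset.shape}")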
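trainval_dataset holds the training and validation rows together, and test_dataset is the test set (what remains of the full dataframe after removing trainval_dataset). trainval_dataset is then split into training and validation sets, so you get 3 sets. Note that with frac=0.8 at both steps the proportions come out to roughly 64-16-20 rather than 60-20-20; use frac=0.75 in the second split if you want 60-20-20. Dropping by index also assumes the index values of df are unique.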
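To check the sizes, here is a minimal sketch that runs the same split on a made-up 100-row dataframe. The column values are placeholders I invented for illustration, not real data.

```python
import pandas as pd

# Hypothetical stand-in for the real data: 100 rows with a default, unique index
df = pd.DataFrame({"year": range(1900, 2000), "population": range(100)})

# Same two-step split as above
trainval_dataset = df.sample(frac=0.8, random_state=42)
test_dataset = df.drop(trainval_dataset.index)
train_dataset = trainval_dataset.sample(frac=0.8, random_state=42)
validate_dataset = trainval_dataset.drop(train_dataset.index)

sizes = (len(train_dataset), len(validate_dataset), len(test_dataset))
print(sizes)  # (64, 16, 20)

# The three sets should not share rows, and together they should cover df
all_index = train_dataset.index.union(validate_dataset.index).union(test_dataset.index)
no_overlap = (
    train_dataset.index.intersection(validate_dataset.index).empty
    and train_dataset.index.intersection(test_dataset.index).empty
    and validate_dataset.index.intersection(test_dataset.index).empty
)
print(len(all_index), no_overlap)  # 100 True
```

With 100 rows, 0.8 x 0.8 gives 64 training rows, which is where the 64-16-20 comes from.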
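A seed of 42 is traditional. Besides being the answer to life, the universe, and everything, it carries no meaning. Any fixed integer works; the point is that the same seed gives the same split every time you run it.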
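A quick sketch of what the seed buys you, again on a made-up dataframe: sampling twice with the same random_state picks the same rows.

```python
import pandas as pd

# Hypothetical 20-row dataframe, only used to show reproducibility
df = pd.DataFrame({"year": range(2000, 2020)})

first = df.sample(frac=0.8, random_state=42)
second = df.sample(frac=0.8, random_state=42)

# Same seed, same rows in the same order
same_rows = first.index.equals(second.index)
print(same_rows)  # True
```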

So for you, you really just have 2 columns in your dataframe: year and population. Do the split, then take the year column as X and the population column as Y, and plot it. If it looks linear, do a linear regression. If it does not look linear, consider a polynomial fit.
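As a rough sketch of that workflow, the code below fits a straight line and a degree-2 polynomial on the training set, picks between them on the validation set, and only then scores the winner on the test set. The data is made up (an exact quadratic with no noise), NumPy's polyfit is just one way to do the fit, and the rmse helper is my own name, so treat all of it as an assumption rather than a recipe. Plotting is left out to keep it short.

```python
import numpy as np
import pandas as pd

# Hypothetical data: population follows a gentle curve over the years (invented numbers)
years = np.arange(1950, 2020)
t = years - 1950
df = pd.DataFrame({"year": years, "population": 1000 + 20 * t + 0.5 * t**2})

# Train / validation / test split, same pattern as earlier in this post
trainval = df.sample(frac=0.8, random_state=42)
test = df.drop(trainval.index)
train = trainval.sample(frac=0.8, random_state=42)
validate = trainval.drop(train.index)

# Centre the years so the polynomial fit is numerically better behaved
x_offset = train["year"].mean()

def rmse(coeffs, frame):
    """Root mean squared error of a fitted polynomial on one of the sets."""
    x = frame["year"].to_numpy() - x_offset
    y = frame["population"].to_numpy()
    return float(np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2)))

# Fit degree 1 (linear) and degree 2 (polynomial) on the training set only
x_train = train["year"].to_numpy() - x_offset
y_train = train["population"].to_numpy()
fits = {deg: np.polyfit(x_train, y_train, deg) for deg in (1, 2)}

# Choose the degree on the validation set, never on the test set
val_rmse = {deg: rmse(coeffs, validate) for deg, coeffs in fits.items()}
best_degree = min(val_rmse, key=val_rmse.get)

# Final, one-time check on the held-out test set
test_rmse = rmse(fits[best_degree], test)
print(best_degree, val_rmse, test_rmse)
```

Because the made-up data is exactly quadratic, degree 2 wins here. With real, noisy population data the gap will be smaller, and looking at the plot first is still worth doing.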


