Jul-21-2021, 10:15 AM
Hi,
I am training a neural network for a classification task. I used 10-fold cross-validation, so I have a good indication of out-of-sample (test) performance across all folds. I now want to train the final model on all the data, but I am not sure of the best/recommended strategy, or what is typically done. I am sure there is no strict right or wrong here, so I would like to get your thoughts... a couple of ideas I have below -
1. train on 100% of the data (no test or validation split at all), choosing the number of epochs from a separate run with validation data (i.e., the epoch where overfitting starts to occur), then save the model/weights
2. train on 90% of the data, keep the remaining 10% as a validation set for early stopping, then save the model/weights
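For reference, option 2 can be sketched in a few lines. This is only an illustration, assuming scikit-learn's `MLPClassifier` as a stand-in for your actual network and a synthetic dataset in place of the real one; the `early_stopping`/`validation_fraction` arguments implement exactly the "hold out 10% for early stopping" idea:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Toy stand-in for the full dataset (purely illustrative).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Option 2: fit on all available data, letting the library carve off
# 10% internally as a validation set used only for early stopping.
clf = MLPClassifier(
    hidden_layer_sizes=(32,),
    early_stopping=True,       # stop when the validation score stops improving
    validation_fraction=0.1,   # the 10% hold-out from option 2
    n_iter_no_change=10,       # patience before stopping
    max_iter=500,
    random_state=0,
)
clf.fit(X, y)
print(clf.n_iter_)  # epochs actually run before early stopping triggered
```

In Keras/PyTorch the equivalent would be an explicit 90/10 split plus an early-stopping callback with a patience parameter; the structure of the recipe is the same.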
In terms of reporting metrics - say confusion matrices, precision, recall, F1 - what is typically done? Is it the average of the metrics across the 10 cross-validation folds, plus a plot of a representative test prediction?
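On the metrics question, one common convention is to sum the per-fold confusion matrices (each sample appears exactly once as a test point across the folds) and to report scalar metrics as mean plus/minus standard deviation over folds. A minimal sketch, again assuming scikit-learn and using a simple `LogisticRegression` as a placeholder for the network and synthetic data in place of the real dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score

# Toy stand-in for the real data and model.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

cms, f1s = [], []
for train_idx, test_idx in StratifiedKFold(
    n_splits=10, shuffle=True, random_state=0
).split(X, y):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    cms.append(confusion_matrix(y[test_idx], pred, labels=[0, 1]))
    f1s.append(f1_score(y[test_idx], pred))

# Summing fold confusion matrices pools every test prediction once;
# scalar metrics are usually reported as mean +/- std across folds.
total_cm = np.sum(cms, axis=0)
print(total_cm)
print(f"F1: {np.mean(f1s):.3f} +/- {np.std(f1s):.3f}")
```

Note that summing confusion matrices and averaging per-fold F1 scores can give slightly different pictures when folds are imbalanced, so it is worth stating which convention you used.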
Any thoughts and advice are appreciated.
thank you