Random Forest Hyperparameter Optimization

donnertrud (Silly Frenchman) — Posts: 21, Threads: 14, Joined: Dec 2019, Reputation: 0, Likes received: 0 — #1, Jan-16-2020, 03:02 PM

Hello, I have been trying to optimize the Random Forest hyperparameters in order to lower the Mean Absolute Error of my regression model. I used Python to search automatically for the best parameters (n_estimators=, bootstrap=, min_samples_split=, min_samples_leaf=, max_features=, max_depth=). The code looked like this:

```
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Number of trees in random forest
n_estimators = [int(x) for x in np.linspace(start=200, stop=5000)]
# Number of features to consider at every split
max_features = ['auto', 'sqrt', 'log2']
# Maximum number of levels in tree
max_depth = [int(x) for x in np.linspace(10, 110)]
max_depth.append(None)
# Minimum number of samples required to split a node
min_samples_split = [2, 5, 10, 15, 20]
# Minimum number of samples required at each leaf node
min_samples_leaf = [1, 2, 5, 10, 15]
# Method of selecting samples for training each tree
bootstrap = [True, False]

# Create the random grid
random_grid = {'n_estimators': n_estimators,
               'max_features': max_features,
               'max_depth': max_depth,
               'min_samples_split': min_samples_split,
               'min_samples_leaf': min_samples_leaf,
               'bootstrap': bootstrap}

rf = RandomForestRegressor()
rf_random = RandomizedSearchCV(estimator=rf,
                               scoring="neg_mean_absolute_error",
                               param_distributions=random_grid,
                               cv=3, n_iter=100, verbose=2,
                               random_state=42, n_jobs=-1)
search = rf_random.fit(X_train, y_train)
print("best parameters for MAE: ", search.best_params_)
```

Eventually I plugged in the values the search gave me; however, the MAE is worse than if I simply put in values by trial and error. I wonder how that is possible, or in other words, what I am doing wrong. How can trial and error yield better results? Thanks in advance!
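One way to see why hand-picked values can beat the search result is to compare both parameter sets under the same cross-validation protocol: the search optimizes the CV score on the training folds, which need not track the MAE you measure afterwards. A minimal sketch, using synthetic data from make_regression; the two parameter dicts below are hypothetical placeholders, not the actual values from the thread:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic regression problem standing in for the poster's data
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)

# Hypothetical parameter sets: one "found by the search", one hand-tuned
searched = {'n_estimators': 300, 'max_depth': 20, 'min_samples_leaf': 5}
manual = {'n_estimators': 200, 'max_depth': None, 'min_samples_leaf': 1}

for name, params in [('searched', searched), ('manual', manual)]:
    rf = RandomForestRegressor(random_state=42, **params)
    # Same scoring and cv as the original RandomizedSearchCV call
    scores = cross_val_score(rf, X, y, cv=3,
                             scoring='neg_mean_absolute_error')
    print(f"{name}: CV MAE = {-scores.mean():.2f}")
```

If the hand-tuned set wins under this identical protocol, the search simply never sampled a comparable combination; if it only wins on a single held-out split, the difference may be evaluation noise rather than a tuning failure.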
scidam — Posts: 673, Threads: 1, Joined: Mar 2018, Reputation: 90, Likes received: 103 — #2, Jan-17-2020, 06:30 AM

The search space in your case is huge. np.linspace defaults to 50 points, so your grid contains 50 × 3 × 51 × 5 × 5 × 2 = 382,500 parameter combinations, but the search algorithm checks only 100 of them (n_iter=100). So it is quite possible that the search sampled only bad combinations. It would be better to reduce the volume of the search space, e.g.

```
n_estimators = [100, 150, 200, 300, 500]
max_depth = [5, 10, 20, 40]
```
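The coverage argument above can be checked directly by rebuilding the grid from the original post and counting combinations. A small sketch (the counts rely on np.linspace producing 50 evenly spaced points by default):

```python
import numpy as np
from math import prod

# Rebuild the grid sizes from the original post
n_estimators = [int(x) for x in np.linspace(start=200, stop=5000)]   # 50 values
max_features = ['auto', 'sqrt', 'log2']                              # 3 values
max_depth = [int(x) for x in np.linspace(10, 110)] + [None]          # 51 values
min_samples_split = [2, 5, 10, 15, 20]                               # 5 values
min_samples_leaf = [1, 2, 5, 10, 15]                                 # 5 values
bootstrap = [True, False]                                            # 2 values

grid = [n_estimators, max_features, max_depth,
        min_samples_split, min_samples_leaf, bootstrap]
total = prod(len(values) for values in grid)
print(total)          # 50 * 3 * 51 * 5 * 5 * 2 = 382500
print(100 / total)    # n_iter=100 covers roughly 0.026% of the grid
```

With coverage that sparse, a random sample of 100 points can easily miss every region a human would try by hand, which is consistent with trial and error winning here.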
