Can't make Random Forest Prediction work - donnertrud - May-23-2020

Hi guys,

I am trying to predict the target variable "Exchange Rate EURUSD" using a data set that I created myself. I included a lot of economic indicators and the monthly EURUSD closing price from 01/01/2000 to 01/12/2020, which gives me 240 rows (20 years * 12 months) and 34 columns (33 indicators and 1 EURUSD exchange rate). Here is a sample of the data set: https://imgur.com/a/iJuGtXw

I kept the date as the index and also replaced all "," with "." in Python. All in all, the code looks like this:

    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    # Load data set
    df = pd.read_csv("C:/merged.csv")
    df = df.set_index('Date')
    df["EURUSD Closing Price"] = df["EURUSD Closing Price"].replace(',', '.', regex=True).astype(float)

    # Define variables
    X = df.drop(["EURUSD Closing Price"], axis=1).values
    y = df["EURUSD Closing Price"].values

    # Split data into 75 train / 25 test
    # (the random split is overwritten by the chronological slices below)
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    X_train = X[:int(X.shape[0]*0.75)]
    X_test = X[int(X.shape[0]*0.75):]
    y_train = y[:int(X.shape[0]*0.75)]
    y_test = y[int(X.shape[0]*0.75):]

    # RF: train and predict
    rf = RandomForestRegressor(n_estimators = 1000, random_state = 42)
    rf.fit(X_train, y_train)
    rf.predict(y_train)  # this call raises the error below

First of all, I have to keep the time series nature of the data, which is why the data can't be split randomly into a train and a test set. Moreover, I know you should not predict on the training set, but on the test set. But I get an error even with that code:

    Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

If I do that, I get a new error:

    ValueError: Number of features of the model must match the input. Model n_features is 33 and input n_features is 1

I kind of understand what the problem is, but I have no idea how to fix it. I would appreciate any help a lot!
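Both error messages seem to point at the same cause: rf.predict(y_train) passes the 1-D target array, while predict() expects a 2-D feature matrix with the same 33 columns that were used in fit(). Below is a minimal sketch of the likely fix, predicting on X_test instead. It assumes synthetic random data in place of merged.csv, which is not available here. The array contents, the variable name split, and n_estimators=100 are illustrative choices, not taken from the original post.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for merged.csv (assumption): 240 monthly rows, 33 indicators.
rng = np.random.default_rng(42)
X = rng.normal(size=(240, 33))
y = 1.2 + 0.05 * X[:, 0] + rng.normal(scale=0.01, size=240)

# Chronological 75/25 split: no shuffling, so the test set is the latest 25% of rows.
split = int(X.shape[0] * 0.75)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Fewer trees than in the post, only to keep the sketch fast.
rf = RandomForestRegressor(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)

# predict() takes features (n_samples, 33), not the target vector.
y_pred = rf.predict(X_test)

print(X_train.shape, X_test.shape, y_pred.shape)
# (180, 33) (60, 33) (60,)
```

The predictions in y_pred can then be compared against y_test. If the manual slicing feels clumsy, train_test_split(X, y, test_size=0.25, shuffle=False) should give the same ordered split, which would also make the earlier random train_test_split(X, y) call unnecessary.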