Python Forum
How to predict with date as input for DecisionTreeRegressor

How to predict with date as input for DecisionTreeRegressor - sandeep_ganga - Dec-12-2019

Hi Forum,

How can I make predictions with a date as input to a DecisionTreeRegressor model?

source: student_mark_result_dec_hist.csv

name day subject percentage
john 12/1/2019 maths 30
john 12/2/2019 maths 40
john 12/3/2019 maths 33
john 12/4/2019 maths 32
john 12/5/2019 maths 31
john 12/6/2019 maths 38
john 12/7/2019 maths 35
john 12/8/2019 maths 38
john 12/9/2019 maths 39
john 12/10/2019 maths 55
john 12/11/2019 maths 65
john 12/12/2019 maths 68
john 12/13/2019 maths 62
john 12/14/2019 maths 70
john 12/15/2019 maths 64
john 12/16/2019 maths 82
john 12/17/2019 maths 80
john 12/18/2019 maths 55
john 12/19/2019 maths 68
john 12/20/2019 maths 79
john 12/21/2019 maths 88
john 12/22/2019 maths 87
john 12/23/2019 maths 80
john 12/24/2019 maths 75


Now I want to predict John's maths marks for 12/25/2019 and 11/30/2019. Any ideas?
I tried the code below, but I suspect it is incorrect:

import pandas as pd
from sklearn.tree import DecisionTreeRegressor
from sklearn import preprocessing

# Load the data and blank out the index so it does not show when printing.
raw_data = pd.read_csv('student_mark_result_dec_hist.csv', index_col=False)
raw_data.index = [''] * len(raw_data)

le = preprocessing.LabelEncoder()

# Label-encode every text column (name, day, subject) and print each mapping.
for column_name in raw_data.columns:
    if raw_data[column_name].dtype == object:
        raw_data[column_name] = le.fit_transform(raw_data[column_name])
        le_name_mapping = dict(zip(le.classes_, le.transform(le.classes_)))
        print(le_name_mapping)
print('---->', raw_data[:])

X = raw_data[['name', 'day', 'subject']]
y = raw_data['percentage']

model = DecisionTreeRegressor()
model.fit(X, y)

# Intended as [name, day, subject] rows for 12/25/2019 and 11/30/2019.
predictions = model.predict([[0, 24, 0], [0, -1, 0]])
print(predictions)

Here, I am not sure whether [0,24,0] and [0,-1,0] actually correspond to the dates 12/25/2019 and 11/30/2019. Any ideas?

Best Regards,
Sandeep

GANGA SANDEEP KUMAR
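
One possible way to approach the question above, offered as a minimal sketch rather than a definitive answer. LabelEncoder sorts its classes, and the date values here are plain strings, so the codes follow text order instead of calendar order. The values 24 and -1 therefore need not correspond to 12/25/2019 and 11/30/2019. The sketch below instead parses the dates and uses "days since the first date in the data" as the feature. The data frame is rebuilt inline from the table in the post so the example runs without the CSV file. The names day_number, origin and to_day_number are hypothetical helpers introduced only for illustration, and name and subject are left out on the assumption that they are constant in this data.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeRegressor

# Percentages copied from the table in the post, for 12/1/2019 to 12/24/2019.
percentages = [30, 40, 33, 32, 31, 38, 35, 38, 39, 55, 65, 68,
               62, 70, 64, 82, 80, 55, 68, 79, 88, 87, 80, 75]
days = [f"12/{d}/2019" for d in range(1, 25)]
df = pd.DataFrame({"name": "john", "day": days,
                   "subject": "maths", "percentage": percentages})

# 1) Check how LabelEncoder numbers the date strings: it sorts them as text,
#    so the codes do not follow calendar order.
codes = LabelEncoder().fit_transform(df["day"])
code_of = dict(zip(df["day"], codes))
print("code for 12/2/2019:", code_of["12/2/2019"])
print("code for 12/10/2019:", code_of["12/10/2019"])

# 2) Build a calendar-based feature instead: days since the first date.
dates = pd.to_datetime(df["day"], format="%m/%d/%Y")
origin = dates.min()  # hypothetical reference point: earliest date in the data
df["day_number"] = (dates - origin).dt.days


def to_day_number(date_string):
    """Convert an m/d/Y date string to days since `origin` (hypothetical helper)."""
    return (pd.to_datetime(date_string, format="%m/%d/%Y") - origin).days


# name and subject are constant here, so only the date feature is used.
model = DecisionTreeRegressor(random_state=0)
model.fit(df[["day_number"]], df["percentage"])

# Query the two dates from the question using the same conversion.
queries = pd.DataFrame({"day_number": [to_day_number("12/25/2019"),
                                       to_day_number("11/30/2019")]})
predictions = model.predict(queries)
print(predictions)
```

With this feature, 12/25/2019 maps to 24 and 11/30/2019 maps to -1, which matches the intent of the original query rows. Note that a decision tree cannot extrapolate a trend: dates outside the training range fall into the outermost leaves, so the predictions simply repeat the marks observed at the nearest end of the data. If a trend beyond the observed dates matters, a different kind of model may be a better fit.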