Linear Regression on Time Series (Python Forum, Data Science: /thread-24010.html)
Linear Regression on Time Series - karlito - Jan-27-2020

Hi, I'm trying this time to use a simple linear regression on my time series dataset to linearly predict data. But I got this error and I don't know how to handle it. Any ideas?

    # print df.head()
                     eie
    Date_Time
    2017-11-10   4470.76
    2017-11-11   5465.72
    2017-11-12  15465.72
    2017-11-13  25465.72
    2017-11-14  21480.59

    y = np.array(df.values, dtype=float)
    x = np.array(pd.to_datetime(df['eie']).index.values, dtype=float)
    slope, intercept, r_value, p_value, std_err = sp.linregress(x, y)
    xf = np.linspace(min(x), max(x), 100)
    xf1 = xf.copy()
    xf1 = pd.to_datetime(xf1)
    yf = (slope * xf) + intercept
    print('r = ', r_value, '\n', 'p = ', p_value, '\n', 's = ', std_err)

    # Error

RE: Linear Regression on Time Series - buran - Jan-27-2020

The error is clear - string '2017-11-10' could not be converted to float (obviously)
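
To make that message concrete, here is a minimal sketch using only the standard library. It assumes the failing step boils down to calling float() on a date string, which is what the error text suggests; the variable names are illustrative and do not come from the thread.

```python
from datetime import date

raw = "2017-11-10"

# float() cannot parse a date string; this is the conversion that fails
try:
    float(raw)
    converted = True
except ValueError:
    converted = False

# Parsing the string as a date first yields something numeric:
# toordinal() counts days since 0001-01-01
ordinal = date.fromisoformat(raw).toordinal()
next_ordinal = date.fromisoformat("2017-11-11").toordinal()

print(converted)
print(next_ordinal - ordinal)
```

The point is only that the string has to be parsed as a date before it can be turned into a number that a regression can use.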

RE: Linear Regression on Time Series - karlito - Jan-27-2020

(Jan-27-2020, 02:22 PM)buran Wrote: The error is clear - string

Yes, I can read :) but for regression purposes, I read that all dates should be passed through pandas' to_datetime() function to convert them to float, because the corresponding dates will be saved in the 'x' variable. NB: before setting Date_Time as the index, it had already been converted with to_datetime(). I'm kind of lost. Any ideas?

RE: Linear Regression on Time Series - buran - Jan-27-2020

maybe

    import pandas as pd
    import numpy as np

    df = pd.DataFrame([['2017-11-10', 4470.76], ['2017-11-11', 5465.72], ['2017-11-12', 15465.72]],
                      columns=['Date_Time', 'eie'])
    y = np.array(df['eie'], dtype=float)
    x = np.array(pd.to_datetime(df['Date_Time'], format='%Y-%m-%d'), dtype=float)
    print(y)
    print(x)

RE: Linear Regression on Time Series - karlito - Jan-28-2020

(Jan-27-2020, 03:00 PM)buran Wrote: maybe

Thanks for your effort, but it doesn't really help me, sorry. I wish something like or even better

RE: Linear Regression on Time Series - buran - Jan-28-2020

I think there is some confusion in your understanding, but anyway

    import pandas as pd
    import numpy as np

    df = pd.DataFrame([['2017-11-10', 4470.76], ['2017-11-11', 5465.72], ['2017-11-12', 15465.72]],
                      columns=['Date_Time', 'eie'])
    y = np.array(df['eie'], dtype=float)
    x = np.array(pd.to_datetime(df['Date_Time'], format='%Y-%m-%d'), dtype='datetime64[D]')
    print(y)
    print(x)

or

    import pandas as pd
    import numpy as np

    df = pd.DataFrame([['2017-11-10', 4470.76], ['2017-11-11', 5465.72], ['2017-11-12', 15465.72]],
                      columns=['Date_Time', 'eie'])
    y = np.array(df['eie'], dtype=float)
    x = np.array(pd.to_datetime(df['Date_Time'].index.values+1, format='%Y-%m-%d'), dtype=int)
    print(y)
    print(x)
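
Putting the thread together, one possible way to run the regression is sketched below. It rests on several assumptions that the posters do not confirm: that Date_Time is a string index (as the error suggests), that `sp` in the original post refers to scipy.stats, and that counting days since the first date is an acceptable numeric x. The names `dates` and `xf_dates` are illustrative.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Same sample rows as in the first post, with Date_Time as a string index (assumed)
df = pd.DataFrame(
    {"eie": [4470.76, 5465.72, 15465.72, 25465.72, 21480.59]},
    index=pd.Index(
        ["2017-11-10", "2017-11-11", "2017-11-12", "2017-11-13", "2017-11-14"],
        name="Date_Time",
    ),
)

# Parse the index as dates, then express each date as days since the first one
dates = pd.to_datetime(df.index, format="%Y-%m-%d")
x = np.asarray((dates - dates[0]).days, dtype=float)
y = df["eie"].to_numpy(dtype=float)  # 1-D array, unlike df.values

result = stats.linregress(x, y)

# Evaluate the fitted line on a grid and map the grid back to timestamps,
# so the line can be plotted against a date axis
xf = np.linspace(x.min(), x.max(), 100)
yf = result.slope * xf + result.intercept
xf_dates = dates[0] + pd.to_timedelta(xf, unit="D")

print("r =", result.rvalue, "p =", result.pvalue, "s =", result.stderr)
```

With x measured in days, the slope reads as the change in eie per day. Any other monotonic numeric encoding of the dates would give the same fitted line, only with a rescaled slope and a shifted intercept.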