I have sets of data, which look like this:
I do the following:
    import pandas as pd

    df = pd.read_csv('data.csv')
    timestamp_fields = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']
    df['Date'] = pd.to_datetime(df[timestamp_fields])
    df = df.dropna()

I'd like to plot the data and fit a straight line. Here is my approach:
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.dates as mdates

    # fit
    x = np.arange(df.iloc[:, -1:].size)  # df.iloc[:, -1:] is the Date column added above
    fit = np.polyfit(x, df.iloc[:, -2:-1], 1)  # df.iloc[:, -2:-1] is the column of data
    fit_fn = np.poly1d(fit[0])

    # plotting
    plt.rcParams["figure.figsize"] = (20, 5)
    plt.plot(df.iloc[:, -1:], df.iloc[:, -2:-1], 'x', color='k')
    plt.plot(df.iloc[:, -1:], fit_fn(x), 'k-')
    plt.title('Isoprene and MBO')
    ax = plt.gca()
    ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
    plt.xticks(rotation=45)
    plt.yscale('log')
    plt.ylabel('Concentration [ppb]')

Output:
As you can see, there seems to be something wrong. I would expect the fit to go through the data. Any help is greatly appreciated.
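
One possible cause, sketched under assumptions rather than confirmed against the real data: `df.iloc[:, -2:-1]` is a two-dimensional, one-column selection, so `np.polyfit` returns coefficients of shape `(2, 1)`. `fit[0]` then holds only the slope, and `np.poly1d(fit[0])` is a constant polynomial. The snippet below reproduces this with synthetic data (the values and the names `y2d`, `constant_fn`, `linear_fn`, `log_fit` are invented for illustration), and also shows a fit of `log10(y)`, which is what appears as a straight line on a logarithmic y-axis.

```python
import numpy as np

# Synthetic stand-in for the real data (values invented for illustration):
# an exponential trend, which looks like a straight line on a log y-axis.
x = np.arange(50)
y = 2.0 * np.exp(0.03 * x)

# A one-column 2-D array mimics what df.iloc[:, -2:-1] hands to polyfit.
y2d = y.reshape(-1, 1)
coeffs_2d = np.polyfit(x, y2d, 1)      # shape (2, 1): one row per coefficient
constant_fn = np.poly1d(coeffs_2d[0])  # built from the slope only, so order 0

# Passing 1-D data gives the flat (slope, intercept) array poly1d expects.
coeffs = np.polyfit(x, y, 1)           # shape (2,)
linear_fn = np.poly1d(coeffs)

# For a line that is straight under plt.yscale('log'), fit log10(y)
# and transform back before plotting.
log_coeffs = np.polyfit(x, np.log10(y), 1)
log_fit = 10 ** np.poly1d(log_coeffs)(x)
```

In the code above this would probably mean selecting the data column as a Series, for example `df.iloc[:, -2]`, and passing the whole `fit` array to `np.poly1d` instead of `fit[0]`.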