Python Forum
Fit straight line to pandas time series data with semilog plot
#1
I have sets of data, which look like this:

   

I do the following:

import pandas as pd

df = pd.read_csv('data.csv')
timestamp_fields = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']
df['Date'] = pd.to_datetime(df[timestamp_fields])
df = df.dropna()
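For readers without the original data.csv, here is a minimal sketch of the same loading step on made-up data. The column name Conc and every value are assumptions for illustration; only the six timestamp column names come from the post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv: the 'Conc' column and all values are invented.
df = pd.DataFrame({
    "Year": [2020, 2020, 2020],
    "Month": [1, 1, 1],
    "Day": [2, 3, 4],
    "Hour": [3, 0, 0],
    "Minute": [4, 0, 0],
    "Second": [5, 0, 0],
    "Conc": [1.5, np.nan, 2.5],
})

timestamp_fields = ["Year", "Month", "Day", "Hour", "Minute", "Second"]
# pd.to_datetime assembles one timestamp per row from the component columns.
df["Date"] = pd.to_datetime(df[timestamp_fields])
# Drop rows with missing values (here, the row whose 'Conc' is NaN).
df = df.dropna()
```

With this layout, Date is the last column and the data column is the second to last, which matches the df.iloc[:,-1:] and df.iloc[:,-2:-1] selections used in the post.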
I'd like to plot the data and fit a straight line. Here is my approach:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

#fit
x = np.arange(df.iloc[:,-1:].size) #df.iloc[:,-1:] is the column Date added above
fit = np.polyfit(x, df.iloc[:,-2:-1], 1) #df.iloc[:,-2:-1] is the column of data
fit_fn=np.poly1d(fit[0])

#plotting
plt.rcParams["figure.figsize"] = (20,5) 
plt.plot(df.iloc[:,-1:],df.iloc[:,-2:-1],'x',color='k')
plt.plot(df.iloc[:,-1:], fit_fn(x), 'k-')
plt.title('Isoprene and MBO')
ax = plt.gca()
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
plt.xticks(rotation=45)
plt.yscale('log')
plt.ylabel('Concentration [ppb]');
Output:
   

As you can see, there seems to be something wrong. I would expect the fit to go through the data. Any help is greatly appreciated.
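One possible cause, not confirmed anywhere in this thread: when y is passed to np.polyfit as a single-column 2-D object, the coefficients come back with shape (2, 1), so fit[0] holds only the slope. np.poly1d(fit[0]) is then a constant polynomial rather than the fitted line. A small sketch on synthetic data (the data and the names const_fn and line_fn are assumptions for illustration):

```python
import numpy as np

# Synthetic stand-in data (assumption): 50 points on an exact line y = 3 + 0.5*x.
x = np.arange(50)
y_2d = (3.0 + 0.5 * x).reshape(-1, 1)  # shape (50, 1), like df.iloc[:, -2:-1]

fit = np.polyfit(x, y_2d, 1)     # shape (2, 1): [[slope], [intercept]]
const_fn = np.poly1d(fit[0])     # built from the slope only: a degree-0 polynomial
line_fn = np.poly1d(fit[:, 0])   # built from [slope, intercept]: the fitted line

flat = const_fn(x)     # every value equals the slope
fitted = line_fn(x)    # follows the data
```

If this is what happens with the real data, selecting the column as a 1-D Series (for example df.iloc[:, -2]) or using fit[:, 0] would give np.poly1d both coefficients.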
#2
For some reason the fit became more accurate when I changed x = np.arange(df.iloc[:,-1:].size) to x = np.linspace(0,1,len(df.iloc[:,-1:])). Grateful for an explanation if this really is the solution.
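A hedged reading of this observation: for a degree-1 least-squares fit, switching x from np.arange(n) to np.linspace(0, 1, n) only rescales the slope by a factor of n - 1; the fitted line itself is unchanged. The sketch below checks this on synthetic data (the data and variable names are assumptions, not taken from the thread).

```python
import numpy as np

n = 50
# Synthetic data (assumption): a noisy linear trend.
rng = np.random.default_rng(0)
y = 3.0 + 0.5 * np.arange(n) + rng.normal(scale=0.1, size=n)

x_index = np.arange(n)          # 0, 1, ..., n-1
x_unit = np.linspace(0, 1, n)   # n evenly spaced points from 0 to 1

slope_i, icpt_i = np.polyfit(x_index, y, 1)
slope_u, icpt_u = np.polyfit(x_unit, y, 1)

# Evaluate each fit on its own x; both describe the same line through the data.
line_index = slope_i * x_index + icpt_i
line_unit = slope_u * x_unit + icpt_u
```

So if a plotted curve changes noticeably after this switch, the difference probably comes from how the coefficients are used afterwards rather than from the fit itself.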
#3
If I am reading it correctly, your np.arange(value) call uses the default start (0) and the default step (1) and stops just before df.iloc[].size, so x simply counts the samples: 0, 1, 2, and so on. The np.linspace call specifies the start, the stop, and the number of points, which I think is what you want.

Also, according to the docs
Quote: When using a non-integer step, such as 0.1, it is often better to use numpy.linspace.
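Separately from the choice of x, a line that looks straight on a logarithmic y axis corresponds to a linear fit of log10(y), not of y itself. The following is a minimal sketch on synthetic data, assuming all concentrations are positive; the data and variable names are invented for illustration.

```python
import numpy as np

# Synthetic data (assumption): exact exponential trend y = 2 * 10**(0.01 * x).
x = np.arange(200)
y = 2.0 * 10 ** (0.01 * x)

# Fit a straight line to log10(y); this requires y > 0.
slope, intercept = np.polyfit(x, np.log10(y), 1)

# Transform back to the original units for plotting on a log-scaled axis.
y_fit = 10 ** (slope * x + intercept)
```

Plotting y_fit against the dates with plt.yscale('log') should then give a straight line, provided the samples are evenly spaced in time. For unevenly spaced samples, a numeric time value such as mdates.date2num applied to the dates would be a more natural x than the sample index.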