Python Forum
fit each group and extract coefficients - Printable Version

fit each group and extract coefficients - Progressive - Jul-19-2019

I have a data frame df as follows:

    Board      Time              CO      H2
0   B000653BE  05.11.2018 13:28  89720   20320
1   B000653BE  05.11.2018 13:32  112760  35070
2   B000653BE  05.11.2018 13:36  130783  47063
(...)
    B000653BF  08.11.2018 14:04  261254  217003
    B000653BF  08.11.2018 14:08  261395  216402
    B000653C2  05.11.2018 13:28  95564   49094
    B000653C2  05.11.2018 13:32  90978   73274
    B000653C2  05.11.2018 13:36  87743   93204
(...)
I want to fit each Board group separately using

import numpy as np
from scipy.optimize import curve_fit

def func_exp(x, a, b, c):
    # c = 0
    return a * np.exp(b * x) + c

def exponential_regression(x_data, y_data):
    popt, pcov = curve_fit(func_exp,
                           x_data,
                           y_data,
                           p0=(1000.1, 0.01, 1000000),
                           maxfev=5000)
    print(popt)
    return func_exp(x_data, *popt)
but I can't figure out how to proceed. I tried

df.groupby('Board').apply(exponential_regression(df.index, df["H2"]))
but however I adjust the syntax, I just get a different error.
I'm used to R and would know how to do this there, but I don't know how to do the same in Python. I need to fit each group for both H2 and CO and extract the corresponding regression coefficients.

Can somebody please help me?


RE: fit each group and extract coefficients - scidam - Jul-20-2019

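The exponential_regression function should return the array of fitted coefficients (popt), but yours returns the estimated y-values. Also, apply needs to be given a function that it can call once per group, not the result of calling exponential_regression on the whole data frame.

I restructured the code slightly: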

from scipy.optimize import curve_fit
import numpy as np
import pandas as pd

# Sample data: the random group label (1..4) is also the true coefficient a
c = np.random.choice(range(1, 5), 1000)
x = np.linspace(2, 10, 1000)
df = pd.DataFrame({'Board': c, 'x': x, 'y': 2 + c * np.exp(x)})
df is a sample data frame. Its 'Board' column takes random values from 1 to 4 and stands in for your board IDs.

def func_exp(x, a, b, c):
    return a * np.exp(b * x) + c

def exponential_regression(x_data, y_data):
    popt, pcov = curve_fit(func_exp,
                           x_data,
                           y_data,
                           p0=(1, 1, 1),
                           maxfev=5000)
    # Return the fitted coefficients (a, b, c), not the fitted y-values
    return popt

# The lambda is called once per group with that group's rows
res = df.groupby('Board').apply(lambda g: exponential_regression(g['x'], g['y']))
res
Output:
Board
1                       [1.0, 1.0, 2.0000000000021814]
2    [1.9999999999999998, 0.9999999999999998, 1.999...
3    [3.000000000000001, 0.9999999999999997, 2.0000...
4    [3.9999999999999956, 1.0000000000000002, 2.000...
dtype: object
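So the first value in each row is approximately 1, 2, 3 and 4. These are the coefficients a for the four groups, which matches how the sample data was built (y = 2 + Board * exp(x), so b is about 1 and c is about 2 in every group). Everything
works as expected.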
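
Since the question asks for coefficients for two measurement columns (CO and H2), here is a minimal sketch of one way to extend the idea. It is an illustration, not the only approach. It assumes that returning a pd.Series from the function passed to apply gives one labelled row per group. The helper name fit_columns and the synthetic columns x, y1 and y2 are hypothetical stand-ins. With the real data, y1 and y2 would be CO and H2, and x would probably have to be a numeric time axis (for example elapsed minutes derived from Time) with a starting guess p0 suited to the data.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def func_exp(x, a, b, c):
    return a * np.exp(b * x) + c


def fit_columns(group, x_col, y_cols, p0=(1.0, 1.0, 1.0)):
    """Hypothetical helper: fit func_exp to every column in y_cols for one group
    and return all coefficients as a single labelled row."""
    x = group[x_col].to_numpy(dtype=float)
    coeffs = {}
    for y_col in y_cols:
        y = group[y_col].to_numpy(dtype=float)
        popt, _ = curve_fit(func_exp, x, y, p0=p0, maxfev=5000)
        for name, value in zip("abc", popt):
            coeffs[f"{y_col}_{name}"] = value
    return pd.Series(coeffs)


# Synthetic, noise-free sample data with known coefficients (a, b, c) per group.
true_params = {
    "A": {"y1": (2.0, 1.0, 5.0), "y2": (3.0, 0.8, 1.0)},
    "B": {"y1": (4.0, 1.2, 2.0), "y2": (1.5, 0.9, 0.5)},
}
x = np.linspace(0.0, 2.0, 50)
frames = []
for board, params in true_params.items():
    frames.append(pd.DataFrame({
        "Board": board,
        "x": x,
        "y1": func_exp(x, *params["y1"]),
        "y2": func_exp(x, *params["y2"]),
    }))
df = pd.concat(frames, ignore_index=True)

# Select only the columns the fit needs, then fit each Board group.
coeffs = (
    df.groupby("Board")[["x", "y1", "y2"]]
      .apply(lambda g: fit_columns(g, "x", ["y1", "y2"]))
)
print(coeffs)
```

The result is a data frame indexed by Board with columns such as y1_a, y1_b and y1_c, so individual coefficients can be looked up with coeffs.loc["A", "y1_a"] instead of unpacking arrays stored in an object column.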