Python Forum
Fitting Lognormal Data - Printable Version




Fitting Lognormal Data - Carolyn - May-19-2018

Hi!

I have some x- and y-data, and I need to find the best-fitting lognormal function to obtain its mu and sigma.

Plotted, the data looks like this:

[Image: Gm1Sv85]

I am really struggling with the stats.lognorm.fit() function. My code looks like this:

import numpy as np
from scipy import stats

# x0 is the raw data (x-axis values)
s, loc, scale = stats.lognorm.fit(x0, floc=0)
estimated_mu = np.log(scale)
estimated_sigma = s

print("estimated mu =", estimated_mu)
print("estimated sigma =", estimated_sigma)

Output:
estimated mu = 1.4968829026551267
estimated sigma = 0.8699922377581952

Unfortunately the calculated mu and sigma have nothing in common with my data.
Any help is appreciated, thanks!


RE: Fitting Lognormal Data - scidam - May-20-2018

Let's consider, for example, the following piece of code:

import numpy as np
from scipy import stats

x = 2 * np.random.randn(10000) + 7.0  # normally distributed values (mu = 7, sigma = 2)
y = np.exp(x)  # these values have a lognormal distribution
stats.lognorm.fit(y, floc=0)
# returned (1.9780155814544627, 0, 1070.4207866985835) in this run, so sigma = 1.9780155814544627, approx. 2.0
np.log(1070.4207866985835)
# yields 6.9758071087468636, approx. 7.0

So, everything works fine: sigma and mu are estimated correctly.
What is the sample size in your case? Even with 10,000 samples, the estimates above
agree with the true values only to within about 1%, so high-accuracy estimates require very large samples.
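
To make the sample-size point concrete, here is a minimal sketch that repeats the fit for several sample sizes and records the absolute error in mu and sigma. The seed, the sample sizes, and the `errors` dictionary are arbitrary choices for illustration and are not from the posts above. For samples drawn this way, the standard error of the estimated mu is sigma / sqrt(n), which is about 0.02 for n = 10,000 and sigma = 2.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed (arbitrary) so the run is repeatable
true_mu, true_sigma = 7.0, 2.0  # same parameters as in the example above

errors = {}
for n in (100, 10_000, 100_000):
    # lognormal samples: exp of normally distributed values
    y = np.exp(true_sigma * rng.standard_normal(n) + true_mu)
    s, loc, scale = stats.lognorm.fit(y, floc=0)
    # with loc fixed at 0: mu = log(scale), sigma = s
    errors[n] = (abs(np.log(scale) - true_mu), abs(s - true_sigma))
    print(n, "error in mu:", errors[n][0], "error in sigma:", errors[n][1])
```

The errors typically shrink roughly like 1 / sqrt(n), although any single run can deviate from that trend.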


RE: Fitting Lognormal Data - Carolyn - May-20-2018

Thank you for your answer!

I think I have to ask my question more precisely.

I have 2 arrays, one of x-data, one of y-data:

x-data array:
Plotting this, I get the graph that I posted yesterday. It's clearly a lognormal function, so what I need now is a (lognormal) function that fits my data best, so that I can obtain the median and the sigma. I thought that "stats.lognorm.fit" would give me this information. Unfortunately, when I run the following code, I get a median of approximately 1.4, but looking at my graph, it clearly should be somewhere around 6.

import numpy as np
from scipy import stats

# x0 is the raw data (x-axis values)
s, loc, scale = stats.lognorm.fit(x0, floc=0)
estimated_mu = np.log(scale)
estimated_sigma = s

print("estimated mu =", estimated_mu)
print("estimated sigma =", estimated_sigma)

Output:
estimated mu = 1.4968829026551267
estimated sigma = 0.8699922377581952

I am really sorry for posting this wall of numbers and clearly dumb beginner questions, but I'm quite desperate right now. Any help is appreciated!

Thx, Carol


RE: Fitting Lognormal Data - scidam - May-21-2018

The example below should clarify what happens when we try to calculate the mean value
of data that come from a lognormal distribution:



import numpy as np
from scipy import stats as st

x = [[19.8815 ],[19.0141 ],[18.1857 ],[17.3943 ],[16.6382 ],[15.9158 ],[15.2254 ],[14.5657 ],[13.9352 ],[13.3325 ],[12.7564 ],[12.2056 ],[11.679 ],[11.1755 ],
[10.6941 ],[10.2338 ],[ 9.79353],[ 9.37249],[ 8.96979],[ 8.58462],[ 8.21619],[ 7.86376],[ 7.52662],[ 7.20409],[ 6.89552],[ 6.6003 ],
[ 6.31784],[ 6.04757],[ 5.78897],[ 5.54151],[ 5.30472],[ 5.07812],[ 4.86127],[ 4.65375],[ 4.45514],[ 4.26506],[ 4.08314],[ 3.90903],[ 3.74238],
[ 3.58288],[ 3.4302 ],[ 3.28407],[ 3.14419],[ 3.01029],[ 2.88212],[ 2.75943],[ 2.64198],[ 2.52955],[ 2.42192],[ 2.31889],[ 2.22026],[ 2.12583],
[ 2.03543],[ 1.94889],[ 1.86604],[ 1.78671],[ 1.71077],[ 1.63807],[ 1.56845],[ 1.50181],[ 1.43801],[ 1.37691],[ 1.31842],[ 1.26242],[ 1.2088 ],
[ 1.15746],[ 1.10832],[ 1.06126],[ 1.01619]]


x = np.array(x).ravel()  # flatten the nested list into a 1-D array

# Testing for lognormality:
# the data are lognormal if np.log(data) is normal
pvalx = st.shapiro(np.log(x))[-1]
print("p-value for `accepting` lognormality of x-data = ", pvalx)
print("OK: the array is consistent with a lognormal distribution" if pvalx > 0.01 else "Hm... the array isn't lognormal")

# Note: the median and mean can differ significantly for a lognormal distribution
print('Raw mean value of x-data is: ', np.mean(x))
print('Mean value of related normal distribution: ', np.mean(np.log(x)))
print('Mapped mean value: ', np.exp(np.mean(np.log(x))))

s, loc, scale = st.lognorm.fit(x, floc=0)  # x is the raw x-axis data
estimated_mu = np.log(scale)

# exp(estimated_mu) equals the fitted scale; the difference printed here should be close to zero
print("Difference between exp(estimated mu) and the mapped mean value (above): ", abs(np.exp(estimated_mu) - np.exp(np.mean(np.log(x)))))
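
One thing worth noting: stats.lognorm.fit() treats its input as a sample of observations drawn from the distribution. It never sees the y-data, so passing only the x-axis values describes how the x grid is spread rather than the shape of the plotted curve. If the y-values are heights of a lognormal-shaped curve over x (an assumption, since the y-data are not shown in this thread), a least-squares curve fit may be closer to what was asked for. The sketch below uses scipy.optimize.curve_fit on synthetic stand-in data. The function name `lognormal_curve`, the free `amplitude` parameter, the starting guess, and all numbers are illustrative choices, not values from the thread.

```python
import numpy as np
from scipy.optimize import curve_fit


def lognormal_curve(x, amplitude, mu, sigma):
    """Lognormal density in x, multiplied by a free amplitude."""
    return (amplitude / (x * sigma * np.sqrt(2.0 * np.pi))
            * np.exp(-(np.log(x) - mu) ** 2 / (2.0 * sigma ** 2)))


# Synthetic stand-in for a measured curve (NOT the data from the thread):
# a lognormal shape with median 6 and sigma 0.4, plus a little noise.
rng = np.random.default_rng(0)
x = np.geomspace(1.0, 20.0, 69)
y = lognormal_curve(x, 5.0, np.log(6.0), 0.4) + rng.normal(scale=0.01, size=x.size)

# Rough starting guess: peak height, log of the peak position, moderate width.
p0 = (y.max(), np.log(x[np.argmax(y)]), 0.5)
# Bounds keep amplitude and sigma positive.
bounds = ([0.0, -np.inf, 1e-6], [np.inf, np.inf, np.inf])

params, covariance = curve_fit(lognormal_curve, x, y, p0=p0, bounds=bounds)
amplitude, mu, sigma = params

print("mu =", mu)
print("sigma =", sigma)
print("median = exp(mu) =", np.exp(mu))
```

Keep in mind that the median of a lognormal distribution is exp(mu), while the peak of its density sits at exp(mu - sigma**2), so the location of the maximum in a plot is not the median.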