Python Forum
Chi2 minimization with scipy
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Chi2 minimization with scipy
#1
Hello everyone,

I am trying to test the minimize function of scipy using a user-defined chi2 function. The idea is to find the optimal parameter r that will shift the mean of one of the histograms to match the mean of the other one. The optimal value being found by finding the smallest chi2.

First, I define two gaussian histograms with the same standard deviation but with different means.

data1 = np.random.normal(loc=5, scale=1.5, size=1000)
data2 = np.random.normal(loc=8, scale=1.5, size=1000)

h1, _ = np.histogram(data1, bins=30)
h2, _ = np.histogram(data2, bins=30)
Naturally, I would expect that the optimal parameter r in this case would be 3 or -3, depending on which histogram I decide to shift. So I define my chi2 function such as:

def chi2F(r):
    chi2_stat = 0.0

    for bin1, bin2 in zip(h1, h2):
    	chi2_stat += ((bin2+r) - bin1)**2 / (bin1+ 1e-10)  # 1e-10 is there to avoid dividing by 0
    
    return chi2_stat
where I decide to shift h2.

Then I minimize the function by guessing a value for r:

guess = 0.0
result = minimize(chi2F,guess)
and I return the optimal value of r that was found following the minimization.

optimal_r = result.x[0]
However, when I run this short code, the optimal_r I find is around -2, instead of -3. I find it very strange that somehow, the chi2 minimization process works without error but returns to me a value far from what it should be. I feel like I am missing something regarding the way the minimize function works and I would appreciate any insight you can provide me with :)

Cheers.

EDIT:

I realize now that shifting bin1 and bin2 only shift the number of events in the bin. Instead, I would like to shift the value around each bin is centered. So a shift along the x-axis if you plot the histogram.
Reply
#2
Well, I guess I found the solution to my problem (it's always after one posts something that the solution somehow reveals itself). So here is the solution in case anyone has to deal with it in the future.

Save the edges of the bins and get the center of each bin:
h1, binE1 = np.histogram(data1, bins=30)
h2, binE2 = np.histogram(data2, bins=30)

bin_centers1 = (binE1[:-1] + binE1[1:]) / 2.0
bin_centers2 = (binE2[:-1] + binE2[1:]) / 2.0
and simply loop over the bin centers:
for bin1, bin2 in zip(bin_centers1, bin_centers2):
    	chi2_stat += ((bin2+r) - bin1)**2 / (bin1+ 1e-10)
    
    return chi2_stat
Reply


Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020