Help deciphering code

mmoe · Jul-23-2019, 04:35 PM

Hello,

I have been given some python code that I am trying to understand, and I am not a python user (I know some R, but zero python). I would really appreciate a 'plain English' reading of what the code is doing exactly, so I can make modifications or be able to explain it to others.

Here is the code:

def calcPrior(n, N):
    l = [N/i for i in range(1, N+1)]
    tot = sum(l)
    l = [i/tot for i in l]
    return l[int(n*N)]

def calcL(s, initialF, finalF):
    N = 180
    eps = 0.0005
    mu = initialF
    sd = (mu*(1-mu))**0.5#/N
    for i in range(50):
        old_mu = mu
        mu = (1+s)*mu/(1+s*mu)
        sd = (1.0/N)*(mu*(1-mu)) + (((1+s)/(1+s*old_mu))**2)*sd
    prob = quad(norm(loc=mu, scale=sd).pdf, finalF-eps, finalF+eps)
    return prob[0]*calcPrior(initialF, N)

p value of s=0.5
p = 1-chi2.cdf(2*(np.log10(calcL(0.5, start, 1))-np.log10(findS(0, start, 1))), 1)

The last line is especially important, thanks for your time.

**scidam** · (This post was last modified: Jul-29-2019, 04:37 PM by Yoriz.)

I don't understand this code too, but I made some comments, hope that helps:

def calcPrior(n, N):
    # this line produces a list [N/1, N/2, ... , 1], i.e. a part of harmonic sequence.
    # in Python3 would consist of floats, in Python2 it would consist of integers
    l = [N/i for i in range(1, N+1)]
    
    # this is sum of the harmonic sequence 
    tot = sum(l) # tot is a number
    
    #  each value in harmonic series was divided onto
    # the sum of all values; l variable is overridden
    l = [i/tot for i in l]
    
    # this line assumes that 0<=n<=1; e.g. if N = 10, n=0.5, then
    # we return l[int(10*0.5)] = l[int(5.0)] = l[5], i.e. fifth
    # element of the array l. int(....) just drops mantissa of a float number, e.g.
    # int(1.919) = 1;
    return l[int(n*N)]
 
def calcL(s, initialF, finalF):
    N = 180  
    eps = 0.0005  # some magic numbers here
    mu = initialF  # this string wouldn't be needed, if finalF was declared as mu. (this is for convenience, mu is shorter than finalF)
    sd = (mu*(1-mu))**0.5#/N
    
    # This is likely iterative method of finding a solution of some equation (system of two equations)
    for i in range(50):
        old_mu = mu  
        mu = (1+s)*mu/(1+s*mu)
        sd = (1.0/N)*(mu*(1-mu)) + (((1+s)/(1+s*old_mu))**2)*sd
    # these lines could be rewritten, e.g. as
    # for i in range(50): # 50 is another magic number
    #    mu, sd =(1+s)*mu/(1+s*mu), (1.0/N)*(mu*(1-mu)) + (((1+s)/(1+s*old_mu))**2)*sd
    # e.g. without introducing old_mu
        
    
    # As we found a solution of equations, we integrate probability density function of normal
    # distribution over interval (finalF-eps, finalF+eps)
    #I completely don't understand why we do so. integral of pdf is cdf, we could just use
    # cdf instead. eps seems to be small, so this integral is almost equal to 
    #something like `norm(loc=mu, scale=sd).cdf(finalF)*2*eps`
    prob = quad(norm(loc=mu, scale=sd).pdf, finalF-eps, finalF+eps)
    # ok, integration is done, as a result we got some probability
    
    # Finally, we returns that probability being multiplied on [initialF*N] - index of normalized harmonic sequence
    return prob[0]*calcPrior(initialF, N)

# to get this code worked some import should be included at the beginning of the script,
# import numpy as np
# from scipy.stats import norm, chi2
# from scipy.integrate import quad
# function findS is not defined in this code., start variable is not defined too.
p value of s=0.5
p = 1-chi2.cdf(2*(np.log10(calcL(0.5, start, 1))-np.log10(findS(0, start, 1))), 1)

jefsummers · Jul-24-2019, 01:24 AM

The code is incomplete. The use of np. suggests that numpy has been imported, as well as the chi2 module. Given that the last line is most important to you, I suggest reading the docs on both of these modules as a place to start.

mmoe · Jul-24-2019, 02:21 PM

Thank you scidam and jefsummers, this helps greatly! Having a target to read about (the modules) and the step by step explanation from scidam is just what I needed - thanks again!

Help deciphering code

User Panel Messages

Announcements