Python Forum

Full Version: Regarding Defined Functions
I wrote some Python code for what's called rerandomization in statistics. See below:

number_data = int(input("Enter size of data for each group (they will be same): "))
sim_size = int(input("Enter number of sims: "))
alpha = float(input("Set alpha/significance level: "))
print()

list_treatment = []
list_control = []

def collect_data(number_data):
    print("Enter treatment data points")
    for i in range(number_data):
        entry = float(input("Enter treatment data point: "))
        list_treatment.append(entry)
    print()
    print("Enter control data points")
    for i in range(number_data):
        entry = float(input("Enter control data point: "))
        list_control.append(entry)
    return list_treatment, list_control
So the above function returns list_treatment and list_control.

I call this function somewhere downstream and then pass what it returns to the function compute_mean(list_treatment, list_control):

collect_data(number_data)
print()

difference = compute_mean(list_treatment, list_control)
I was really worried that I'd get an error message, but I didn't. It works. I'm just wondering whether this is OK by the standards of good coding practice or, worse, whether it can lead to errors that I haven't encountered because my code is simple.

Gracias.
(Nov-05-2024, 01:10 AM) Hudjefa wrote: Just wondering if this is ok per the standards of good coding practice or worse, it can lead to errors that I haven't encountered because my code was simple.
This code never uses the values returned by collect_data(). Instead, it relies on the global variables list_treatment and list_control. That would become a problem if your code called collect_data() more than once, because each call would append to the same two lists and the entries from earlier calls would still be there. For this reason, it is better to use local variables and return them:
number_data = int(input("Enter size of data for each group (they will be same): "))
sim_size = int(input("Enter number of sims: "))
alpha = float(input("Set alpha/significance level: "))
print()
 
def collect_data(number_data):
    print("Enter treatment data points")
    list_treatment = []
    for i in range(number_data):
        entry = float(input("Enter treatment data point: "))
        list_treatment.append(entry)
    print()
    print("Enter control data points")
    list_control = []
    for i in range(number_data):
        entry = float(input("Enter control data point: "))
        list_control.append(entry)
    return list_treatment, list_control
Later in your code, use the returned values:
list_treatment, list_control = collect_data(number_data)
print()

difference = compute_mean(list_treatment, list_control)
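To see why the global lists cause trouble on a second call, here is a minimal sketch. The functions collect_global() and collect_local() are hypothetical, simplified stand-ins for collect_data(): they take the values as an argument instead of prompting with input(), so the effect is easy to reproduce.

```python
# Hypothetical stand-ins for collect_data(); values are passed in
# instead of being typed at a prompt.

shared_list = []  # module-level (global) list, as in the original code


def collect_global(values):
    """Append to the module-level list and return it."""
    for v in values:
        shared_list.append(float(v))
    return shared_list


def collect_local(values):
    """Build a fresh list on every call and return it."""
    result = []
    for v in values:
        result.append(float(v))
    return result


first_global = collect_global([1, 2])
second_global = collect_global([3, 4])
first_local = collect_local([1, 2])
second_local = collect_local([3, 4])

print(second_global)                  # [1.0, 2.0, 3.0, 4.0]
print(first_global is second_global)  # True: both names refer to the same list
print(first_local, second_local)      # [1.0, 2.0] [3.0, 4.0]
```

With the global version, the second call returns the first call's data as well, and the result of the first call changes behind your back. With the local version, each call gives an independent list.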
When number_data is large, it will very soon become tedious to enter the values interactively. A much more flexible solution is to read the values from a file containing the data points, such as a csv file or a numpy data file.
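As a rough sketch of the file-based approach, the standard csv module could be used as follows. The column names "treatment" and "control", the function name read_groups(), and the sample numbers are assumptions made up for illustration; adapt them to however your data is actually laid out.

```python
import csv
import io


def read_groups(csv_file):
    """Read columns named 'treatment' and 'control' from an open CSV file.

    The column names are an assumed layout, not something from the
    original code.
    """
    list_treatment = []
    list_control = []
    for row in csv.DictReader(csv_file):
        list_treatment.append(float(row["treatment"]))
        list_control.append(float(row["control"]))
    return list_treatment, list_control


# In-memory stand-in for a real file, with made-up numbers. For a file on
# disk, use: with open("data.csv", newline="") as f: read_groups(f)
sample = io.StringIO("treatment,control\n5.1,4.0\n6.3,4.8\n5.9,5.2\n")
list_treatment, list_control = read_groups(sample)

print(list_treatment)  # [5.1, 6.3, 5.9]
print(list_control)    # [4.0, 4.8, 5.2]
```

The group size no longer has to be entered separately, since it is simply len(list_treatment).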