Python Forum

Full Version: Help with correlation coefficient
I'm trying to calculate the correlation coefficient between the two lists, but so far no luck. Any suggestions or help would be greatly appreciated.

from table_utils import *
from stats_utils import *

cars = [29.9, 30.2, 30.3, 30.1, 30.3, 30.4, 30.5, 30.9, 31.1, 31.5, 31.7]
papers = [337, 322, 311, 299, 264, 242, 213, 191, 190, 174, 164]

car_list = []
papers_list = []

for car_row in cars:
    area = car_row[0]
    for papers_row in papers:
        if area == papers_row[1]: 
             car_list.append(float(car_row[1]))
             papers.append(float(papers_row[2])) 

corr = round(corr_coef(car_list, papers_list), 2)
print('Correlation coefficient =', corr)
What is the problem you are facing... Do you get an error? Is the result wrong?
I assume you're getting an error on line 11. I don't understand what you are trying to do with car_list and papers_list. Why not just run corr_coef on cars and papers?
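To illustrate the suggestion above: cars and papers are already flat lists of numbers, so indexing into each element (car_row[0]) fails and no matching loop is needed. Since corr_coef lives in a module that isn't shown here, the sketch below uses a hypothetical stand-in, pearson_r, written with only the standard library. Treat it as one possible approach, not as what stats_utils actually does.

```python
import math


def pearson_r(xs, ys):
    """Hypothetical stand-in for corr_coef: Pearson correlation of two lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    x_mean = math.fsum(xs) / len(xs)
    y_mean = math.fsum(ys) / len(ys)
    # Sum of products of deviations (numerator of r).
    cov = math.fsum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Sums of squared deviations (used in the denominator of r).
    x_ss = math.fsum((x - x_mean) ** 2 for x in xs)
    y_ss = math.fsum((y - y_mean) ** 2 for y in ys)
    return cov / math.sqrt(x_ss * y_ss)


cars = [29.9, 30.2, 30.3, 30.1, 30.3, 30.4, 30.5, 30.9, 31.1, 31.5, 31.7]
papers = [337, 322, 311, 299, 264, 242, 213, 191, 190, 174, 164]

# Pass the lists straight in; no intermediate car_list / papers_list required.
corr = round(pearson_r(cars, papers), 2)
print('Correlation coefficient =', corr)  # prints: Correlation coefficient = -0.91
```

The two lists are paired by position with zip, which assumes the n-th car value belongs with the n-th paper value.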
Maybe this can help you:
Finding-correlation-coefficient-from-2-lists

With this solution the Correlation Coefficient is -0.9068908297699143.
Sorry, this is my first time working with correlation in Python. I was just working from the example I had. The errors I kept getting were:

Traceback (most recent call last):
  File "C:\Users\User\Desktop\q6b.py", line 4, in <module>
    from table_utils import *
ModuleNotFoundError: No module named 'table_utils'

The line 11 error would have shown up in syntax, but I guess it's good practice to proofread my code before posting. I was thinking I had to write my own functions since the imports failed?

I'll try using the example in the link provided, thanks.
The imports in your code are not being used. You can just remove them.

The link that I suggested implements the solution like this:

def mean(someList):
    total = 0
    for a in someList:
        total += float(a)
    mean = total / len(someList)
    return mean


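def standDev(someList):
    # Despite the name, this returns the square root of the sum of squared
    # deviations (no division by n). correlCo below relies on that scaling.
    listMean = mean(someList)
    dev = 0.0
    for i in range(len(someList)):
        dev += (someList[i] - listMean)**2
    dev = dev**(1 / 2.0)
    return dev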


def correlCo(someList1, someList2):

    # First compute the means and the root-sum-of-squares deviations for both lists.
    xMean = mean(someList1)
    yMean = mean(someList2)
    xStandDev = standDev(someList1)
    yStandDev = standDev(someList2)
    # r numerator
    rNum = 0.0
    for i in range(len(someList1)):
        rNum += (someList1[i] - xMean) * (someList2[i] - yMean)

    # r denominator
    rDen = xStandDev * yStandDev

    r = rNum / rDen
    return r


cars = [29.9, 30.2, 30.3, 30.1, 30.3, 30.4, 30.5, 30.9, 31.1, 31.5, 31.7]
papers = [337, 322, 311, 299, 264, 242, 213, 191, 190, 174, 164]

print(correlCo(cars, papers))
PS: This is not my code, just added your 2 lists.
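For reference, here is a sketch of the formula that correlCo appears to implement, namely Pearson's r. The symbols are introduced here for illustration and are not from the linked code: x_i and y_i are the entries of the two lists, n is their length, and the barred values are the means. Note that standDev returns the square root of the sum of squared deviations rather than a true standard deviation. The numerator is not divided by n either, so the missing factors cancel and the result is still r:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

For the cars and papers lists this evaluates to roughly -0.907, which matches the value quoted earlier in the thread.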
Those are not standard modules, so you would have to install them before you could import them. There should be a tutorial about how to do that in the tutorials section of this board. Or you could use the example gontajones showed.
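If it helps, here is a small sketch for checking whether a module can be imported before relying on it. The helper name module_available is made up for this example. It only reports whether Python can locate the module on its import path and says nothing about where table_utils or stats_utils are supposed to come from.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    # find_spec returns None when no module of that name is on the import path.
    return importlib.util.find_spec(name) is not None


# "math" ships with Python; the other two are the modules from the traceback.
for name in ("math", "table_utils", "stats_utils"):
    status = "found" if module_available(name) else "missing"
    print(name, "->", status)
```

A module reported as missing has to be installed, or its .py file placed next to the script, before the import will work.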