Nov-25-2018, 02:48 PM
I have the following numbers:
498 50 67 1069 22 145 182 348 580 906 753 23 4 782 818 26 85 863 762 791 319 94 17 650 969 752 423 720 26 804 640 5 134 868 249 23 131 939 210 479 149 328 938 887 28 573 679 729 693 159
I want to test normality of this dataset.
I want to do this with a chi-squared test.
I performed it in Excel and got the following results (which I want to reproduce in Python):
chisquare = 5.53
p-value = 0.14
This is how I did it in Excel:
I first calculated the z-value for every value in each column.
I then calculated the expected counts for four ranges (below -1, between -1 and 0, between 0 and 1, and above +1) based on the standard normal distribution N(0,1). I then counted the observed values (based on their z-values) in each of these ranges.
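A minimal Python sketch of the z-value and counting steps, assuming NumPy and SciPy are available. The sample standard deviation (ddof=1, as in Excel's STDEV) is an assumption, since the post does not say which one was used, and the variable names are only illustrative.

```python
import numpy as np
from scipy.stats import norm

data = np.array([
    498, 50, 67, 1069, 22, 145, 182, 348, 580, 906, 753, 23, 4, 782, 818,
    26, 85, 863, 762, 791, 319, 94, 17, 650, 969, 752, 423, 720, 26, 804,
    640, 5, 134, 868, 249, 23, 131, 939, 210, 479, 149, 328, 938, 887, 28,
    573, 679, 729, 693, 159,
])

# z-value of every number (assumption: sample standard deviation, ddof=1)
z = (data - data.mean()) / data.std(ddof=1)

# Expected counts under N(0,1) for the ranges
# below -1, -1 to 0, 0 to 1, above +1
edges = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf])
expected = data.size * np.diff(norm.cdf(edges))

# Observed counts: np.digitize returns 0..3 for the four ranges,
# np.bincount then counts how many z-values fall in each range
observed = np.bincount(np.digitize(z, [-1.0, 0.0, 1.0]), minlength=4)

print(observed)
print(expected.round(2))
```

For this dataset the observed counts come out as 13, 12, 15 and 10, against expected counts of about 7.93, 17.07, 17.07 and 7.93.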
I could then calculate the chi-squared contribution of each range as (observed count - expected count)^2 / expected count.
Summing these contributions gives 5.53.
Because I used 4 ranges, I used 3 degrees of freedom. A chi-squared value of 5.53 with 3 degrees of freedom gives a p-value of 0.14 (the Excel formula is CHIDIST(chi-squared value, degrees of freedom)).
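The summation and the CHIDIST step could be sketched in Python as follows. The observed counts are the ones obtained by sorting the z-values of the 50 numbers into the four ranges (assuming the dataset's own mean and standard deviation were used for the z-values). `scipy.stats.chi2.sf` is the upper-tail probability of the chi-squared distribution, which is what CHIDIST returns.

```python
import numpy as np
from scipy.stats import chi2, norm

# Observed counts in the ranges: below -1, -1 to 0, 0 to 1, above +1
observed = np.array([13, 12, 15, 10])

# Expected counts under N(0,1) for the same four ranges
probs = np.diff(norm.cdf([-np.inf, -1.0, 0.0, 1.0, np.inf]))
expected = observed.sum() * probs

# Sum of (observed - expected)^2 / expected over the four ranges
stat = ((observed - expected) ** 2 / expected).sum()

# 4 ranges -> 3 degrees of freedom, as in the Excel calculation
dof = observed.size - 1

# Upper-tail probability, the equivalent of CHIDIST(stat, dof)
p_value = chi2.sf(stat, dof)

print(f"{stat:.2f} {p_value:.2f}")  # prints "5.53 0.14"
```

`scipy.stats.chisquare(observed, expected)` should give the same statistic and p-value, since its default `ddof=0` also uses the number of ranges minus 1 as degrees of freedom.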
The question is: how can I perform this calculation in Python and do it with multiple rows?
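One possible way to handle multiple rows is to wrap the whole calculation in a function and call it once per row. This is only a sketch: the function name `chi2_normality`, the 2-D array layout (one dataset per row) and the synthetic second row are assumptions made for illustration, as is the use of the sample standard deviation.

```python
import numpy as np
from scipy.stats import chi2, norm

def chi2_normality(row):
    """Return (chi-squared statistic, p-value) for one row of numbers."""
    row = np.asarray(row, dtype=float)
    # z-values (assumption: sample standard deviation, ddof=1)
    z = (row - row.mean()) / row.std(ddof=1)
    # Observed counts below -1, -1 to 0, 0 to 1, above +1
    observed = np.bincount(np.digitize(z, [-1.0, 0.0, 1.0]), minlength=4)
    # Expected counts under N(0,1) for the same ranges
    expected = row.size * np.diff(norm.cdf([-np.inf, -1.0, 0.0, 1.0, np.inf]))
    stat = ((observed - expected) ** 2 / expected).sum()
    # 4 ranges -> 3 degrees of freedom; chi2.sf corresponds to CHIDIST
    return stat, chi2.sf(stat, observed.size - 1)

first_row = [
    498, 50, 67, 1069, 22, 145, 182, 348, 580, 906, 753, 23, 4, 782, 818,
    26, 85, 863, 762, 791, 319, 94, 17, 650, 969, 752, 423, 720, 26, 804,
    640, 5, 134, 868, 249, 23, 131, 939, 210, 479, 149, 328, 938, 887, 28,
    573, 679, 729, 693, 159,
]

# Second row is synthetic, generated only to show the multi-row case
rng = np.random.default_rng(0)
rows = np.vstack([first_row, rng.normal(450, 350, size=len(first_row))])

# One (statistic, p-value) pair per row
results = [chi2_normality(row) for row in rows]
for i, (stat, p_value) in enumerate(results):
    print(f"row {i}: chisquare = {stat:.2f}, p-value = {p_value:.2f}")
```

If the rows live in a pandas DataFrame instead, the same function could be passed to `DataFrame.apply` with `axis=1`.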