Python Forum
Comparing means in dataFrames
#1
Dear community,

We are working on an assignment with several statistical exercises that have to be programmed in Python.
It's going pretty well; we are only stuck on one technical question.
We imported a dataset from sklearn and defined data frames. Now we want to test the means of two different columns against each other (knowing which statistical test to use is definitely not our problem). We simply do not know how to compare two columns that come from different data frames. Can you please help us? Our attempt was:
Chas_0 = regressors[regressors['CHAS'] == 0.0]['DIS']
Chas_1 = regressors[regressors['CHAS'] == 1.0][outcome[outcome'MEDV']]
print(Chas_0.mean(),Chas_1.mean())
We learned this pattern in an earlier exercise, but there both columns were in the same data frame.

For more background, our whole answer to the exercise:

from sklearn import datasets 
import pandas as pd

boston = datasets.load_boston()
regressors = pd.DataFrame(boston.data, columns=boston.feature_names)
outcome = pd.DataFrame(boston.target, columns=["MEDV"]).values[:]
NOX = pd.DataFrame(boston.target, columns=["NOX"]).values[:]

## 2 Two sided T-Test

import numpy as np
import scipy.stats as stats

Chas_0 = regressors[regressors['CHAS'] == 0.0]['DIS']
Chas_1 = regressors[regressors['CHAS'] == 1.0][outcome[outcome'MEDV']]
print(Chas_0.mean(),Chas_1.mean())

## 3 Wilcoxon Test

stats.wilcoxon(regressors[regressors['DIS']],outcome[outcome['MEDV']])
Thank you in advance,

Holly
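
One possible way to approach this, as a sketch under stated assumptions rather than a definitive answer: in the code above, `.values[:]` turns `outcome` into a NumPy array, so label lookups such as `outcome['MEDV']` no longer work, and `regressors[regressors['DIS']]` tries to use the DIS values themselves as column labels. If `outcome` stays a DataFrame and both frames share the same row index, a boolean mask built from one frame can select rows in the other. Because `load_boston` was removed from scikit-learn in version 1.2, the sketch uses synthetic data with the same column names; the values, the sample size `n`, and the variable names `dis_chas0` and `medv_chas1` are made up for illustration. The choice of `ttest_ind` with `equal_var=False` is also an assumption, so swap in whichever test the exercise calls for.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the Boston data (assumption: the real frames
# have one row per observation, in the same order, with the same index).
rng = np.random.default_rng(0)
n = 200
regressors = pd.DataFrame({
    "CHAS": rng.integers(0, 2, size=n).astype(float),
    "DIS": rng.uniform(1.0, 12.0, size=n),
})
# Keep the outcome as a DataFrame (no .values) so column labels still work.
outcome = pd.DataFrame({"MEDV": rng.normal(22.0, 9.0, size=n)})

# A boolean mask from one frame can index the other, because pandas
# aligns the mask with the target frame on the shared row index.
mask = regressors["CHAS"] == 1.0
dis_chas0 = regressors.loc[~mask, "DIS"]   # DIS where CHAS == 0
medv_chas1 = outcome.loc[mask, "MEDV"]     # MEDV where CHAS == 1
print(dis_chas0.mean(), medv_chas1.mean())

# Two-sided t-test on two independent samples (Welch variant, assumed here).
t_stat, t_p = stats.ttest_ind(dis_chas0, medv_chas1, equal_var=False)
print(t_stat, t_p)

# The Wilcoxon signed-rank test expects paired samples of equal length,
# so it is given the two full columns rather than masked subsets.
w_stat, w_p = stats.wilcoxon(regressors["DIS"], outcome["MEDV"])
print(w_stat, w_p)
```

An alternative with the same effect is to join the frames once with `pd.concat([regressors, outcome], axis=1)` and then filter a single data frame, as in the earlier exercise.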