help for Kaggle Titanic Set fill the missing Age by median age of Pclass and Sex


scidam


Nov-21-2018, 01:30 AM (This post was last modified: Nov-21-2018, 01:31 AM by scidam.)

You are probably looking for this:

train_df['Age'] = train_df['Age'].fillna(train_df.groupby(['Sex', 'Pclass'])['Age'].transform('median'))

# from now on train_df.Age contains no NaNs, provided every (Sex, Pclass) group has at least one known age
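To see what the per-group median fill does, here is a small self-contained sketch. The rows in toy_df are made up for illustration (they are not real passenger data), and the names toy_df and group_median are my own.

```python
import numpy as np
import pandas as pd

# Made-up rows in the Titanic column layout; not real passenger data.
toy_df = pd.DataFrame({
    "Sex":    ["male", "male", "male", "female", "female", "female"],
    "Pclass": [1, 1, 1, 3, 3, 3],
    "Age":    [40.0, 50.0, np.nan, 18.0, np.nan, 30.0],
})

# transform('median') returns one value per row: the median Age of that
# row's (Sex, Pclass) group, computed while ignoring NaNs.
group_median = toy_df.groupby(["Sex", "Pclass"])["Age"].transform("median")

# fillna aligns on the index, so only the missing ages are replaced.
toy_df["Age"] = toy_df["Age"].fillna(group_median)

print(toy_df)
# Row 2 (male, class 1) gets 45.0, row 4 (female, class 3) gets 24.0.
```

Assigning the result back to the column, rather than calling fillna with inplace=True on train_df.Age, avoids relying on in-place modification through an attribute, which may not update the frame in recent pandas versions.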

I would also suggest taking the passenger's 'title' into account (it can be parsed from the Name column), e.g. passengers titled 'Master' are young.
Another suggestion is to compute the medians on the combined dataset (train and test together), i.e.
something like this:

combined_median = pd.concat([train_df, test_df]).groupby(['Sex', 'Pclass'])['Age'].transform('median')
train_df['Age'] = train_df['Age'].fillna(combined_median.iloc[:len(train_df)])
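The 'title' suggestion above could look roughly like the sketch below. It assumes the title is the word between the comma and the first period in Name (the layout used in the Kaggle file); the sample rows, the Title column name, and the regular expression are my own choices for illustration, not something from the original question.

```python
import numpy as np
import pandas as pd

# Made-up names and ages in the Kaggle Titanic column layout.
sample_df = pd.DataFrame({
    "Name": [
        "Alpha, Mr. Adam",
        "Bravo, Mr. Brian",
        "Charlie, Master. Colin",
        "Delta, Master. David",
        "Echo, Miss. Emma",
        "Foxtrot, Miss. Fiona",
    ],
    "Sex": ["male", "male", "male", "male", "female", "female"],
    "Pclass": [3, 3, 3, 3, 3, 3],
    "Age": [22.0, np.nan, 2.0, np.nan, 26.0, np.nan],
})

# Pull out the text between ", " and the first "." (e.g. "Mr", "Master").
sample_df["Title"] = (
    sample_df["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
)

# Median age per title, and per (Sex, Pclass) as a fallback for titles
# that have no known age at all.
title_median = sample_df.groupby("Title")["Age"].transform("median")
group_median = sample_df.groupby(["Sex", "Pclass"])["Age"].transform("median")

sample_df["Age"] = sample_df["Age"].fillna(title_median).fillna(group_median)

print(sample_df[["Name", "Title", "Age"]])
```

With only Sex and Pclass, the missing 'Master' age here would have been filled with the median of all third-class males; grouping by title keeps it at the much lower age typical for that title.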


Messages In This Thread

help for Kaggle Titanic Set fill the missing Age by median age of Pclass and Sex - by Parthasarathi009 - Nov-21-2018, 12:32 AM

RE: help for Kaggle Titanic Set fill the missing Age by median age of Pclass and Sex - by scidam - Nov-21-2018, 01:30 AM

RE: help for Kaggle Titanic Set fill the missing Age by median age of Pclass and Sex - by Parthasarathi009 - Nov-21-2018, 06:50 PM
