Thread: 'Age' categorical (years -months -days ) to numeric (/thread-21761.html) - Python Forum (https://python-forum.io)
'Age' categorical (years -months -days ) to numeric - Smiling29 - Oct-13-2019

I have a dataset with an Age column that contains values like these:

    df_s7['Age'].unique()
    array(['28 Years', '10 Month(s) 15 Day(s)', '46 Years', '65 Years',
           '45 Years', '30 Years', '47 Years', '17 Years', '55 Years',
           '50 Years', '39 Years', '42 Years', '38 Years', '40 Years',
           '20 Years', ' < 1 Year', '29 Years', '43 Years', '31 Years',
           '36 Years', '11 Years', '48 Years', '23 Years', '25 Years',
           '32 Years', '82 Years', '44 Years', '37 Years', '52 Years',
           '35 Years', '18 Years', '19 Years', '49 Years', '62 Years',
           '51 Years', '72 Years', '26 Years', '54 Years', '24 Years',
           '59 Years', '34 Years', '53 Years', '14 Years', '71 Years',
           '27 Years', '66 Years', '33 Years', '22 Years', '70 Years',
           '60 Years', '21 Years', '3 Month(s) 11 Day(s)', '58 Years',
           '56 Years', '63 Years', '5 Years', '64 Years', '10 Years',
           '16 Years', '15 Years', '75 Years', '57 Years', '2 Years',
           '83 Years', '77 Years', '74 Years', '13 Years', '41 Years',
           '69 Years', '1 Month(s) 29 Day(s)', '8 Years',
           '7 Month(s) 16 Day(s)', '61 Years', '67 Years',
           '1 Month(s) 30 Day(s)', '84 Years', '1 Month(s) 12 Day(s)',
           '6 Month(s) 26 Day(s)', '12 Years', '5 Month(s) 18 Day(s)',
           '68 Years', '80 Years', '3 Month(s) 19 Day(s)', '76 Years',
           '86 Years', '7 Month(s) 2 Day(s)', '1 Years', '73 Years',
           '90 Years', '6 Month(s) 20 Day(s)', '79 Years', '89 Years',
           '9 Years', '3 Month(s) 29 Day(s)', '8 Month(s) 21 Day(s)',
           '4 Years', '6 Month(s) 8 Day(s)', '78 Years', '6 Years',
           '87 Years', '7 Years', '6 Month(s) 9 Day(s)',
           '4 Month(s) 20 Day(s)', '10 Month(s) 16 Day(s)',
           '4 Month(s) 11 Day(s)', '6 Month(s) 18 Day(s)',
           '4 Month(s) 13 Day(s)'], dtype=object)

I want to create age groups and then visualize them as a histogram, something like below:

    def age_buckets(x):
        # x is the age in years as a number
        if x < 1:
            return '0-1'
        elif x < 18:
            return '1-17'
        elif x < 30:
            return '18-29'
        elif x < 40:
            return '30-39'
        elif x < 50:
            return '40-49'
        elif x < 60:
            return '50-59'
        elif x < 70:
            return '60-69'
        elif x >= 70:
            return '70+'
        else:
            # only reached for values that cannot be compared, such as NaN
            return 'other'

There is also a "Sex" column with the values Male, Female, Transgender.

1) I want to plot a 1D histogram of the Age column only, by age group, with color coding.
2) I want to plot by the Age and Sex columns for the different age groups, with color coding.

Please advise.

RE: 'Age' categorical (years -months -days ) to numeric - Smiling29 - Oct-14-2019

Additional details: what I expect in the output: [image missing] or [image missing]. Whichever is a good approach to plot. Please advise.
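The grouping described in the question could be sketched as follows once the ages are numeric. This is only a sketch using the standard library: the boundary list, the `age_bucket` name, and the sample `rows` are assumptions made for illustration and are not taken from the poster's dataset. It assumes the intended boundary between the second and third group is 18, as the labels '1-17' and '18-29' suggest.

```python
from bisect import bisect_right
from collections import Counter

# Upper-exclusive bucket boundaries in years (assumed from the labels in the question).
EDGES = [1, 18, 30, 40, 50, 60, 70]
LABELS = ['0-1', '1-17', '18-29', '30-39', '40-49', '50-59', '60-69', '70+']

def age_bucket(years):
    """Return the bucket label for a numeric age in years."""
    # bisect_right counts how many boundaries are <= years,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(EDGES, years)]

# Hypothetical (age, sex) rows, for illustration only.
rows = [(0.37, 'Female'), (17, 'Male'), (28, 'Female'), (46, 'Male'), (82, 'Female')]

# 1) counts per age group
counts_by_age = Counter(age_bucket(age) for age, _ in rows)

# 2) counts per (age group, sex) pair
counts_by_age_and_sex = Counter((age_bucket(age), sex) for age, sex in rows)

for label in LABELS:
    print(label, counts_by_age[label])
```

The two counters hold the bar heights for the two plots asked about; they could then be handed to whatever plotting library is in use, for example as the heights of a bar chart with one color per group.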
RE: 'Age' categorical (years -months -days ) to numeric - Smiling29 - Oct-16-2019

I was thinking of splitting the Age column into separate columns for years, months, and days, but I am not able to work out what to split on. Can someone please help with this?

RE: 'Age' categorical (years -months -days ) to numeric - perfringo - Oct-17-2019

If you need years, then the simplest way is:

    lst = ['28 Years', '10 Month(s) 15 Day(s)', '46 Years', '65 Years',
           '45 Years', '30 Years', '47 Years', '17 Years', '55 Years',
           '50 Years', '39 Years', '42 Years', '38 Years', '40 Years',
           '20 Years', ' < 1 Year', '29 Years', '43 Years', '31 Years',
           '36 Years', '11 Years', '48 Years', '23 Years', '25 Years',
           '32 Years', '82 Years', '44 Years', '37 Years', '52 Years',
           '35 Years', '18 Years', '19 Years', '49 Years', '62 Years',
           '51 Years', '72 Years', '26 Years', '54 Years', '24 Years',
           '59 Years', '34 Years', '53 Years', '14 Years', '71 Years',
           '27 Years', '66 Years', '33 Years', '22 Years', '70 Years',
           '60 Years', '21 Years', '3 Month(s) 11 Day(s)', '58 Years',
           '56 Years', '63 Years', '5 Years', '64 Years', '10 Years',
           '16 Years', '15 Years', '75 Years', '57 Years', '2 Years',
           '83 Years', '77 Years', '74 Years', '13 Years', '41 Years',
           '69 Years', '1 Month(s) 29 Day(s)', '8 Years',
           '7 Month(s) 16 Day(s)', '61 Years', '67 Years',
           '1 Month(s) 30 Day(s)', '84 Years', '1 Month(s) 12 Day(s)',
           '6 Month(s) 26 Day(s)', '12 Years', '5 Month(s) 18 Day(s)',
           '68 Years', '80 Years', '3 Month(s) 19 Day(s)', '76 Years',
           '86 Years', '7 Month(s) 2 Day(s)', '1 Years', '73 Years',
           '90 Years', '6 Month(s) 20 Day(s)', '79 Years', '89 Years',
           '9 Years', '3 Month(s) 29 Day(s)', '8 Month(s) 21 Day(s)',
           '4 Years', '6 Month(s) 8 Day(s)', '78 Years', '6 Years',
           '87 Years', '7 Years', '6 Month(s) 9 Day(s)',
           '4 Month(s) 20 Day(s)', '10 Month(s) 16 Day(s)',
           '4 Month(s) 11 Day(s)', '6 Month(s) 18 Day(s)',
           '4 Month(s) 13 Day(s)']

    # take the number before ' Years'; anything without 'Years' becomes 0
    [int(age.split(' Years')[0]) if 'Years' in age else 0 for age in lst]

Every age that doesn't contain 'Years' becomes zero years; for all the others the year is taken as an int.

RE: 'Age' categorical (years -months -days ) to numeric - Smiling29 - Oct-17-2019

This is great! I am able to get the years now, but I am struggling with the rows that contain months + days, or only days. For all the rows that currently come out as zero, I want the age as a fraction of a year instead of months/days. For example, if the row has 4 Months 13 Days, then (4/12) + (13/365) = 0.3689497716894977 should be my row value. I am trying, but I have not been able to get this result with a function yet.
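One possible way to get the fractional years described in the last post is a small parsing function built on a regular expression. This is a sketch, not a tested solution for the full dataset: the names `age_to_years`, `PATTERN`, and `DIVISOR` are made up for illustration, the 12 and 365 divisors follow the poster's own example, and mapping ' < 1 Year' to 0.0 is an assumption, since that string carries no exact value.

```python
import re

# A number followed by a unit, e.g. '28 Years' or '4 Month(s)'.
PATTERN = re.compile(r'(\d+)\s*(Year|Month|Day)')

# How many of each unit make up one year (365 days, as in the poster's example).
DIVISOR = {'Year': 1, 'Month': 12, 'Day': 365}

def age_to_years(text):
    """Convert strings like '28 Years' or '4 Month(s) 13 Day(s)' to years."""
    if '<' in text:
        # ' < 1 Year' has no exact age; 0.0 is an assumed placeholder.
        return 0.0
    parts = PATTERN.findall(text)
    if not parts:
        raise ValueError(f'unrecognised age string: {text!r}')
    # Add up every (number, unit) pair, each expressed in years.
    return sum(int(number) / DIVISOR[unit] for number, unit in parts)

print(age_to_years('28 Years'))              # whole years stay as they are
print(age_to_years('4 Month(s) 13 Day(s)'))  # 4/12 + 13/365, about 0.369
print(age_to_years(' < 1 Year'))             # assumed placeholder
```

With pandas, the same function could be applied to the whole column with something like `df_s7['Age'].map(age_to_years)`, and the numeric result could then be passed to `age_buckets`.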