Python Forum
'Age' categorical (years - months - days) to numeric
#1
I have a dataset with an Age column whose values look like this:

df_s7['Age'].unique()
array(['28 Years', '10 Month(s) 15 Day(s)', '46 Years', '65 Years', '45 Years', '30 Years', '47 Years', '17 Years', '55 Years', '50 Years', '39 Years', '42 Years', '38 Years', '40 Years', '20 Years', ' < 1 Year', '29 Years', '43 Years', '31 Years', '36 Years', '11 Years', '48 Years', '23 Years', '25 Years', '32 Years', '82 Years', '44 Years', '37 Years', '52 Years', '35 Years', '18 Years', '19 Years', '49 Years', '62 Years', '51 Years', '72 Years', '26 Years', '54 Years', '24 Years', '59 Years', '34 Years', '53 Years', '14 Years', '71 Years', '27 Years', '66 Years', '33 Years', '22 Years', '70 Years', '60 Years', '21 Years', '3 Month(s) 11 Day(s)', '58 Years', '56 Years', '63 Years', '5 Years', '64 Years', '10 Years', '16 Years', '15 Years', '75 Years', '57 Years', '2 Years', '83 Years', '77 Years', '74 Years', '13 Years', '41 Years', '69 Years', '1 Month(s) 29 Day(s)', '8 Years', '7 Month(s) 16 Day(s)', '61 Years', '67 Years', '1 Month(s) 30 Day(s)', '84 Years', '1 Month(s) 12 Day(s)', '6 Month(s) 26 Day(s)', '12 Years', '5 Month(s) 18 Day(s)', '68 Years', '80 Years', '3 Month(s) 19 Day(s)', '76 Years', '86 Years', '7 Month(s) 2 Day(s)', '1 Years', '73 Years', '90 Years', '6 Month(s) 20 Day(s)', '79 Years', '89 Years', '9 Years', '3 Month(s) 29 Day(s)', '8 Month(s) 21 Day(s)', '4 Years', '6 Month(s) 8 Day(s)', '78 Years', '6 Years', '87 Years', '7 Years', '6 Month(s) 9 Day(s)', '4 Month(s) 20 Day(s)', '10 Month(s) 16 Day(s)', '4 Month(s) 11 Day(s)', '6 Month(s) 18 Day(s)', '4 Month(s) 13 Day(s)'], dtype=object)

I want to create age groups and then plot a histogram of those groups, using something like the function below:

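def age_buckets(x):
    # x is the age in years as a number (fractions allowed, e.g. 0.37)
    if x < 1:
        return '0-1'
    elif x < 18:  # was x < 17, which put age 17 into '18-29'
        return '1-17'
    elif x < 30:
        return '18-29'
    elif x < 40:
        return '30-39'
    elif x < 50:
        return '40-49'
    elif x < 60:
        return '50-59'
    elif x < 70:
        return '60-69'
    elif x >= 70:
        return '70+'
    else:
        # only reached when x is not comparable, e.g. NaN
        return 'other'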
There is also a "Sex" column with the values Male, Female, Transgender. I want to:
1) plot a (1D) histogram on the Age column only, based on the different age groups and color coded
2) plot based on the Age and Sex columns for the different age groups, also color coded

Please advise
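A possible sketch for the two plots, assuming the DataFrame already has an 'Age_group' column holding the labels returned by age_buckets. That column name and the helper names counts_by_group, counts_by_group_and_sex and plot_age_groups are made up for this illustration; only 'Sex' comes from the post. Because the groups are already categories, a bar chart of counts per group plays the role of the histogram here.

```python
import pandas as pd

# Labels in the order they should appear on the x axis.
AGE_GROUPS = ['0-1', '1-17', '18-29', '30-39', '40-49', '50-59', '60-69', '70+']


def counts_by_group(df):
    """Rows per age group, in AGE_GROUPS order (empty groups count as 0)."""
    return df['Age_group'].value_counts().reindex(AGE_GROUPS, fill_value=0)


def counts_by_group_and_sex(df):
    """Rows per age group (index) and Sex (columns)."""
    table = df.groupby(['Age_group', 'Sex']).size().unstack(fill_value=0)
    return table.reindex(AGE_GROUPS, fill_value=0)


def plot_age_groups(df):
    """Draw both charts side by side; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

    # 1) one bar per age group, each bar in its own color
    counts = counts_by_group(df)
    ax1.bar(counts.index, counts.values, color=plt.cm.tab10.colors[:len(counts)])
    ax1.set_xlabel('Age group')
    ax1.set_ylabel('Count')

    # 2) grouped bars: one cluster per age group, one color per Sex value
    counts_by_group_and_sex(df).plot.bar(ax=ax2)
    ax2.set_xlabel('Age group')
    ax2.set_ylabel('Count')

    fig.tight_layout()
    return fig
```

Calling plot_age_groups(df_s7) followed by matplotlib's plt.show() should display both charts once the 'Age_group' column exists.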
#2
Additional details:
What I expect in the output:
Output:
Age(years)  Age_group
1           0-1
0.2         0-1
16          1-17
or
Output:
Age(years)  Age_group
1           Infant
0           Infant
16          Teen
Whichever is the better approach for plotting. Please advise.
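One way to get such an Age_group column from numeric ages is pandas.cut. The sketch below is only an illustration: the bin edges follow the age_buckets function from the first post (with the '1-17' group ending just below 18), and the three sample ages are the ones from the expected output. Note that with these edges an age of exactly 1 lands in '1-17', not '0-1', because age_buckets uses x < 1 for the first group.

```python
import numpy as np
import pandas as pd

# Left-closed bins: [0, 1), [1, 18), [18, 30), ... [70, inf)
bins = [0, 1, 18, 30, 40, 50, 60, 70, np.inf]
labels = ['0-1', '1-17', '18-29', '30-39', '40-49', '50-59', '60-69', '70+']

# Sample ages taken from the expected output above.
ages = pd.DataFrame({'Age(years)': [1, 0.2, 16]})

# right=False makes each bin include its left edge and exclude its right edge.
ages['Age_group'] = pd.cut(ages['Age(years)'], bins=bins, labels=labels, right=False)
print(ages)
```

Swapping the labels list for names such as 'Infant' or 'Teen' would give the second variant of the output; the result is an ordered categorical, so the groups keep their order when plotted.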
#3
I was thinking of splitting the Age column into separate columns for years, months, and days, but I am not able to identify what to use for the split. Can someone please help with this?
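Since the strings do not share one delimiter, a regular expression per unit may be easier than a plain split. The following is a sketch using Series.str.extract on a few sample values from the first post; the column names years, months and days are made up here. The pattern for years is anchored to the start of the string so that ' < 1 Year' is not read as 1 year.

```python
import pandas as pd

# A few representative values from df_s7['Age'].unique()
ages = pd.Series(['28 Years', '10 Month(s) 15 Day(s)', ' < 1 Year', '1 Years'])

parts = pd.DataFrame({
    # number directly at the start, followed by 'Year' or 'Years'
    'years': ages.str.extract(r'^\s*(\d+)\s+Years?', expand=False),
    # number followed by 'Month' (covers 'Month(s)')
    'months': ages.str.extract(r'(\d+)\s+Month', expand=False),
    # number followed by 'Day' (covers 'Day(s)')
    'days': ages.str.extract(r'(\d+)\s+Day', expand=False),
}).apply(pd.to_numeric)  # extracted text -> numbers, missing parts stay NaN

print(parts)
```

Rows where a unit is absent get NaN in that column, so ' < 1 Year' ends up with all three columns empty and has to be handled separately.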
Reply
#4
If you need years, then the simplest way is:

In [1]: lst = [
   ...:     '28 Years', '10 Month(s) 15 Day(s)', '46 Years', '65 Years', '45 Years', '30 Years',
   ...:     '47 Years', '17 Years', '55 Years', '50 Years', '39 Years', '42 Years', '38 Years',
   ...:     '40 Years', '20 Years', ' < 1 Year', '29 Years', '43 Years', '31 Years', '36 Years',
   ...:     '11 Years', '48 Years', '23 Years', '25 Years', '32 Years', '82 Years', '44 Years',
   ...:     '37 Years', '52 Years', '35 Years', '18 Years', '19 Years', '49 Years', '62 Years',
   ...:     '51 Years', '72 Years', '26 Years', '54 Years', '24 Years', '59 Years', '34 Years',
   ...:     '53 Years', '14 Years', '71 Years', '27 Years', '66 Years', '33 Years', '22 Years',
   ...:     '70 Years', '60 Years', '21 Years', '3 Month(s) 11 Day(s)', '58 Years', '56 Years',
   ...:     '63 Years', '5 Years', '64 Years', '10 Years', '16 Years', '15 Years', '75 Years',
   ...:     '57 Years', '2 Years', '83 Years', '77 Years', '74 Years', '13 Years', '41 Years',
   ...:     '69 Years', '1 Month(s) 29 Day(s)', '8 Years', '7 Month(s) 16 Day(s)', '61 Years',
   ...:     '67 Years', '1 Month(s) 30 Day(s)', '84 Years', '1 Month(s) 12 Day(s)',
   ...:     '6 Month(s) 26 Day(s)', '12 Years', '5 Month(s) 18 Day(s)', '68 Years', '80 Years',
   ...:     '3 Month(s) 19 Day(s)', '76 Years', '86 Years', '7 Month(s) 2 Day(s)', '1 Years',
   ...:     '73 Years', '90 Years', '6 Month(s) 20 Day(s)', '79 Years', '89 Years', '9 Years',
   ...:     '3 Month(s) 29 Day(s)', '8 Month(s) 21 Day(s)', '4 Years', '6 Month(s) 8 Day(s)',
   ...:     '78 Years', '6 Years', '87 Years', '7 Years', '6 Month(s) 9 Day(s)',
   ...:     '4 Month(s) 20 Day(s)', '10 Month(s) 16 Day(s)', '4 Month(s) 11 Day(s)',
   ...:     '6 Month(s) 18 Day(s)', '4 Month(s) 13 Day(s)',
   ...: ]
In [2]: [int(age.split(' Years')[0]) if 'Years' in age else 0 for age in lst] 
Every age that has no 'Years' in it becomes zero years; for all the others, the number of years is taken as an int.
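To make that behaviour concrete, here is the same expression wrapped in a small function and run on four of the values from the list. The name years_only is made up for this example. Note that ' < 1 Year' contains 'Year' but not 'Years', so it falls into the zero branch as well.

```python
def years_only(age):
    """Whole years from strings like '28 Years'; anything else becomes 0."""
    return int(age.split(' Years')[0]) if 'Years' in age else 0


sample = ['28 Years', '1 Years', '10 Month(s) 15 Day(s)', ' < 1 Year']
print([years_only(age) for age in sample])  # [28, 1, 0, 0]
```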
#5
This is great! I can now always get the years, but I am struggling with the rows that hold months + days or only days. For all those rows, which currently come out as zero, I want the value in years instead of months/days.

For example, if the row has 4 Months 13 Days => (4/12) + (13/365) = 0.3689497716894977 should be my row value.

I have been trying, but I have not been able to get this result with a function yet.
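A minimal sketch of such a function, using the same conversion as the example above (months / 12 + days / 365). The name age_to_years is made up, and two choices in it are assumptions rather than anything stated in the thread: ' < 1 Year' is mapped to 0.0, and a string with no recognisable unit raises an error.

```python
import re

# A number followed by one of the unit words seen in the Age column.
_PART = re.compile(r'(\d+)\s*(Year|Month|Day)')

# How many of each unit make up one year (365-day year, as in the post).
_PER_YEAR = {'Year': 1, 'Month': 12, 'Day': 365}


def age_to_years(text):
    """Convert e.g. '28 Years' or '4 Month(s) 13 Day(s)' to fractional years."""
    text = text.strip()
    if text.startswith('<'):
        # ASSUMPTION: ' < 1 Year' carries no exact value; treat it as 0.0
        return 0.0
    parts = _PART.findall(text)
    if not parts:
        raise ValueError(f'unrecognised age: {text!r}')
    return sum(int(number) / _PER_YEAR[unit] for number, unit in parts)


print(age_to_years('4 Month(s) 13 Day(s)'))  # about 0.369, i.e. (4/12) + (13/365)
print(age_to_years('28 Years'))              # 28.0
```

On the DataFrame this could be applied with df_s7['Age'].map(age_to_years), and the numeric result can then be passed through age_buckets to get the group labels.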