Posts: 1
Threads: 1
Joined: Mar 2019
Hello all,
Brand new to Python! I've tried search keywords for this solution but came up with unrelated examples with commas / delimiters. Any assistance would be greatly appreciated.
Loading from csv into a df,
Data set column = "Month of Year" - values (201801,201802,201803)
I would like to split up this column into a Month and a Year column ( Month = 01, Year = 2018 ).
In excel, this would be a text to column function, any idea on what the equivalent would be in python?
Thank you
Posts: 1,950
Threads: 8
Joined: Jun 2018
I suspect that Excel 'text-to-columns' function doesn't split value to year and month without delimiter as well.
If somebody is brand new to Python it's not good to start with Pandas right away. Learning Python basics will serve you well in future.
There appears to be no delimiter to split value. However, it seems that there is pattern: first four represent year, next two represent month. If values coming from csv then by default they are strings and slicing can retrieve needed data:
>>> values = '201801,201802,201803'
>>> [(el[4:], el[:4]) for el in values.split(',')]
[('01', '2018'), ('02', '2018'), ('03', '2018')]
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 7,324
Threads: 123
Joined: Sep 2016
Mar-11-2019, 10:37 AM
(This post was last modified: Mar-11-2019, 10:38 AM by snippsat.)
To take @ perfringo code a step further and make it datetime object.
>>> values = '201801,201802,201803'
>>> d = [(el[4:], el[:4]) for el in values.split(',')]
>>> d
[('01', '2018'), ('02', '2018'), ('03', '2018')]
>>> date_str = ['-'.join(i) for i in d]
>>> date_str
['01-2018', '02-2018', '03-2018']
>>> date_str[0]
'01-2018' So a this point has a delimiter( - ) and now can use strftime().
>>> from datetime import datetime, timedelta
>>>
>>> d = date_str[0]
>>> d
'01-2018'
>>> my_date = datetime.strptime(d, '%m-%Y')
>>> my_date
datetime.datetime(2018, 1, 1, 0, 0)
>>> # So at this point is a datetime object
>>> type(my_date)
<class 'datetime.datetime'>
>>> my_date.year
2018
>>> my_date.month
1
>>> # 5 days from my_date
>>> my_date + timedelta(days=5)
datetime.datetime(2018, 1, 6, 0, 0)
# With print() don't show "repr" object info
>>> print(my_date + timedelta(days=5))
2018-01-06 00:00:00 Pandas do of course have own Time Series / Date functionality which has a lot functionally.
Posts: 1,950
Threads: 8
Joined: Jun 2018
Just because it can be done: based on snippsat code one can develop following dense and 'nice' code:
>>> from datetime import datetime
>>> values = '201801,201802,201803'
>>> [datetime.strptime(f'{el[4:]}-{el[:4]}', '%m-%Y') for el in values.split(',')]
[datetime.datetime(2018, 1, 1, 0, 0),
datetime.datetime(2018, 2, 1, 0, 0),
datetime.datetime(2018, 3, 1, 0, 0)]
>>> [(date.month, date.year) for date in [datetime.strptime(f'{el[4:]}-{el[:4]}', '%m-%Y') for el in values.split(',')]]
[(1, 2018), (2, 2018), (3, 2018)]
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
|