Split Column Text by Number of Characters

cgoldstein · Mar-11-2019, 02:34 AM

Hello all,

Brand new to Python! I've tried search keywords for this solution but came up with unrelated examples with commas / delimiters. Any assistance would be greatly appreciated.

Loading from csv into a df,

Data set column = "Month of Year" - values (201801,201802,201803)
I would like to split up this column into a Month and a Year column ( Month = 01, Year = 2018 ).

In excel, this would be a text to column function, any idea on what the equivalent would be in python?

Thank you

**perfringo** · Mar-11-2019, 09:21 AM

I suspect that Excel 'text-to-columns' function doesn't split value to year and month without delimiter as well.

If somebody is brand new to Python it's not good to start with Pandas right away. Learning Python basics will serve you well in future.

There appears to be no delimiter to split value. However, it seems that there is pattern: first four represent year, next two represent month. If values coming from csv then by default they are strings and slicing can retrieve needed data:

>>> values = '201801,201802,201803'
>>> [(el[4:], el[:4]) for el in values.split(',')]
[('01', '2018'), ('02', '2018'), ('03', '2018')]

***snippsat*** · (This post was last modified: Mar-11-2019, 10:38 AM by snippsat.)

To take @perfringo code a step further and make it datetime object.

>>> values = '201801,201802,201803'
>>> d = [(el[4:], el[:4]) for el in values.split(',')]
>>> d
[('01', '2018'), ('02', '2018'), ('03', '2018')]
>>> date_str = ['-'.join(i) for i in d]
>>> date_str
['01-2018', '02-2018', '03-2018']
>>> date_str[0]
'01-2018'

So a this point has a delimiter(-) and now can use strftime().

>>> from datetime import datetime, timedelta
>>> 
>>> d = date_str[0]
>>> d
'01-2018'
>>> my_date = datetime.strptime(d, '%m-%Y')
>>> my_date
datetime.datetime(2018, 1, 1, 0, 0)

>>> # So at this point is a datetime object
>>> type(my_date)
<class 'datetime.datetime'>
>>> my_date.year
2018
>>> my_date.month
1

>>> # 5 days from my_date
>>> my_date + timedelta(days=5)
datetime.datetime(2018, 1, 6, 0, 0)

# With print() don't show "repr" object info
>>> print(my_date + timedelta(days=5))
2018-01-06 00:00:00

Pandas do of course have own Time Series / Date functionality which has a lot functionally.

**perfringo** · Mar-11-2019, 01:45 PM

Just because it can be done: based on snippsat code one can develop following dense and 'nice' code:

>>> from datetime import datetime
>>> values = '201801,201802,201803'
>>> [datetime.strptime(f'{el[4:]}-{el[:4]}', '%m-%Y') for el in values.split(',')]
[datetime.datetime(2018, 1, 1, 0, 0),
 datetime.datetime(2018, 2, 1, 0, 0),
 datetime.datetime(2018, 3, 1, 0, 0)]
>>> [(date.month, date.year) for date in [datetime.strptime(f'{el[4:]}-{el[:4]}', '%m-%Y') for el in values.split(',')]]
[(1, 2018), (2, 2018), (3, 2018)]

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Number stored as text with openpyxl	CAD79	2	5,050	Apr-17-2024, 10:17 AM Last Post: CAD79
	doing string split with 2 or more split characters	Skaperen	22	6,429	Aug-13-2023, 01:57 AM Last Post: Skaperen
	How to split a String from Text Input into 40 char chunks?	lastyle	7	2,677	Aug-01-2023, 09:36 AM Last Post: Pedroski55
	How to remove patterns of characters from text	aaander	4	2,227	Nov-19-2022, 03:34 PM Last Post: snippsat
	Two text files, want to add a column value	zxcv101	8	3,467	Jun-20-2022, 03:06 PM Last Post: deanhystad
	Split a number to list and list sum must be number	sunny9495	5	3,877	Apr-28-2022, 09:32 AM Last Post: Dexty
	How to split file by same values from column from imported CSV file?	Paqqno	5	5,802	Mar-24-2022, 05:25 PM Last Post: Paqqno
	Find if chain of characters or number	Frankduc	4	2,743	Feb-11-2022, 01:55 PM Last Post: Frankduc
	Search text in PDF and output its page number.	atomxkai	21	17,169	Jan-21-2022, 06:20 AM Last Post: snippsat
	Split single column to multiple columns	SriRajesh	1	1,930	Jan-07-2022, 06:43 PM Last Post: jefsummers

Split Column Text by Number of Characters

User Panel Messages

Announcements