Python Forum
Dividing a single column of dataframe into multiple columns based on char length
#1
Question 
Hi Guys,
I am importing a dataset from a URL using the code below:
# Import from the URL
import pandas as pd
dftemp = pd.read_csv("http://users.stat.ufl.edu/~winner/data/agedeath.dat")
print(dftemp)
print(type(dftemp))
It works fine and gives me a result like:
Output:
      aris km 21 1
0     aris km 21 2
1     aris km 21 3
2     aris km 21 4
6183  sovr va 100 1439
6184  sovr va 101 1440
[6185 rows x 1 columns]
<class 'pandas.core.frame.DataFrame'>
My objective:
I want to split this single column into multiple columns. For example, in the first row the first column should hold "aris km", the second 21, and the third 1. The split should be based on character positions: the first value ("aris km" or "sovr va") is confined within 6 characters, the second (21) occupies characters 8-14, and the last one occupies characters 16 to 18.

If I use the split function, it breaks the text on every space, so "aris" and "km" end up in different columns when they should stay in the same column.

How can I split the column this way?
Reply
#2
You posted a while ago and got no responses, so I am going to take a stab at it. First, you need to add header=None to your read_csv call, because the first row of data is being read as the column names, which it isn't. Next, for your question, I would use the "one hot encoding" technique (you can read about it on multiple sites) using .map. Others will hopefully be able to give you better ideas, but this can give you a start.
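To illustrate the header=None point, here is a minimal sketch. It reads a small in-memory sample instead of the remote file; the sample rows and their spacing are assumptions made up for the illustration, and the names with_header and no_header are hypothetical.

```python
import io
import pandas as pd

# Hypothetical three-line sample standing in for the remote file;
# the exact spacing of the real file is an assumption.
sample = "aris km 21 1\naris km 21 2\naris km 21 3\n"

# Without header=None, the first data row is consumed as the column name.
with_header = pd.read_csv(io.StringIO(sample))

# With header=None, every line is kept as data and the column is labelled 0.
no_header = pd.read_csv(io.StringIO(sample), header=None)

print(with_header.shape)
print(no_header.shape)
```

The second frame has one more row than the first, because the first line of the file is no longer used up as a header.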
Reply
#3
There is a Series.str.extract method in pandas which can split data into columns. In your case it could be applied as follows:

# \s+ between the groups keeps multi-digit numbers such as 21 in one column
dftemp.iloc[:, 0].str.extract(r'([a-zA-Z\s]+?)\s+(\d+)\s+(\d+)')
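A self-contained sketch of the str.extract idea follows, next to a purely position-based split with Series.str.slice, which is closer to what the question describes. The sample rows, the field widths (7, 7 and 5 characters) and the column names group, age and row_id are all assumptions chosen for the illustration, not the real file layout. The pattern used here puts explicit whitespace between the capture groups so that multi-digit numbers stay in one piece.

```python
import pandas as pd

# Hypothetical rows, formatted into fixed-width text with assumed widths:
# 7 characters for the label, 7 for the second field, 5 for the third.
rows = [("aris km", 21, 1), ("sovr va", 100, 1439)]
raw = pd.Series([f"{g:<7}{a:>7}{i:>5}" for g, a, i in rows])

# Regex approach: a lazy text group, then two numbers separated by whitespace.
parts = raw.str.extract(r'([a-zA-Z\s]+?)\s+(\d+)\s+(\d+)')
parts.columns = ["group", "age", "row_id"]  # hypothetical column names
parts["age"] = pd.to_numeric(parts["age"])
parts["row_id"] = pd.to_numeric(parts["row_id"])

# Position-based approach: slice by character offsets, then strip padding.
by_pos = pd.DataFrame({
    "group": raw.str.slice(0, 7).str.strip(),
    "age": pd.to_numeric(raw.str.slice(7, 14).str.strip()),
    "row_id": pd.to_numeric(raw.str.slice(14, 19).str.strip()),
})

print(parts)
print(by_pos)
```

Both approaches give the same three columns on this sample. The slicing version depends only on character positions, so it keeps working when the text field contains digits or the numbers contain other characters. When reading straight from a file, pd.read_fwf with explicit colspecs is another option for fixed-width data.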
Reply

