Python Forum

substring function to create new column
Hi There,
I am new to Python, so please be kind to me.
Below is the data frame and the requirement:

import pandas as pd

data = {'id': ['aa11bc', 'bb22cd', 'cc33ef', 'dd44gh', 'ee55ij','ff66kl','gg77mn','hh88op'], 
        'direction': ["north, south, east, west", "north, south, east, west", "north, south, east, west", "north, south, east, west",
                      "north, south, east, west","north, south, east, west","north, south, east, west"
                     ,"north, south, east, west"]}
df = pd.DataFrame(data, columns = ['id','direction'])
df

Requirement:

#if id (2nd and 3rd letter) in ('a1','b2') then new_col should have north as an observation from direction column
#else if id (2nd and 3rd letter) in ('c3','d4') then new_col should have south as an observation from direction column
#else if id (2nd and 3rd letter) in ('e5','f6') then new_col should have east as an observation from direction column
#else if id (2nd and 3rd letter) in ('g7','h8') then new_col should have west as an observation from direction column
I would appreciate it if you could assist me with this.
What have you tried so far?
Please post your latest code, working or not, and specify where it is failing.
Thank you
Hi There,
I tried with the below code, but the results are not what I am expecting.

new_col=[]

for i in df["id"]:
    if i[1:3].lower()in ('a1','b2'):
        new_col=df["direction"].str.split(',').str[0]
    elif i[1:3].lower()in ('c3','d4'):
        new_col=df["direction"].str.split(',').str[1]
    elif i[1:3].lower()in ('e5','f6'):
        new_col=df["direction"].str.split(',').str[2]
    elif i[1:3].lower()in ('g7','h8'):
        new_col=df["direction"].str.split(',').str[3]
        
df["position"]=new_col
print(df)
Output:
       id                 direction position
0  aa11bc  north, south, east, west     west
1  bb22cd  north, south, east, west     west
2  cc33ef  north, south, east, west     west
3  dd44gh  north, south, east, west     west
4  ee55ij  north, south, east, west     west
5  ff66kl  north, south, east, west     west
6  gg77mn  north, south, east, west     west
7  hh88op  north, south, east, west     west
Please advise.
What were you expecting?
Hi There,

I am expecting the final table to look like this:
Output:
    id                 direction position
aa11bc  north, south, east, west    north
bb22cd  north, south, east, west    north
cc33ef  north, south, east, west    south
dd44gh  north, south, east, west    south
ee55ij  north, south, east, west     east
ff66kl  north, south, east, west     east
gg77mn  north, south, east, west     west
hh88op  north, south, east, west     west
Please advise.
Well, I'm no expert at pandas, but you can make your code more readable by adding, just after the for statement (line 3 of your code):
    key = i[1:3].lower()
Then, for your if and elif tests:
    if key in ...
Another moderator can help with the pandas part.
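For what it's worth, the loop above most likely prints "west" everywhere because every branch assigns the whole split column to new_col, so whichever id is processed last ('hh88op') decides the result for all rows. A minimal sketch of a row-by-row fix follows; the index_for_key dictionary and the variable names are illustrative, not from the original post:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["aa11bc", "bb22cd", "cc33ef", "dd44gh",
           "ee55ij", "ff66kl", "gg77mn", "hh88op"],
    "direction": ["north, south, east, west"] * 8,
})

# Illustrative lookup: 2nd and 3rd characters of id -> index into the split direction list
index_for_key = {"a1": 0, "b2": 0, "c3": 1, "d4": 1,
                 "e5": 2, "f6": 2, "g7": 3, "h8": 3}

new_col = []
for id_value, direction in zip(df["id"], df["direction"]):
    key = id_value[1:3].lower()
    # Split this row's direction string and drop the space after each comma
    parts = [part.strip() for part in direction.split(",")]
    idx = index_for_key.get(key)
    # Append one value per row; None when the key is not in the lookup
    new_col.append(parts[idx] if idx is not None else None)

df["position"] = new_col
print(df)
```

The key difference is that new_col stays a list and receives exactly one value per row, instead of being overwritten with a full column on each iteration.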
I would solve the problem as follows:

import pandas as pd
data = {'id': ['aa11bc', 'bb22cd', 'cc33ef', 'dd44gh', 'ee55ij','ff66kl','gg77mn','hh88op'], 
        'direction': ["north, south, east, west", "north, south, east, west", "north, south, east, west", "north, south, east, west",
                      "north, south, east, west","north, south, east, west","north, south, east, west"
                     ,"north, south, east, west"]}
df = pd.DataFrame(data, columns = ['id','direction'])

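# Map the 2nd and 3rd characters of id to an index into the split direction list:
# a1, b2 -> 0; c3, d4 -> 1; e5, f6 -> 2; g7, h8 -> 3
mapper = {'%s%s' % (a, n): k // 2 for k, (a, n) in enumerate(zip('abcdefgh', '12345678'))}

def get_direction(row):
    index = mapper.get(row['id'][1:3].lower())
    if index is not None:
        # strip() removes the space that follows each comma
        return row['direction'].split(',')[index].strip()

df['position'] = df.apply(get_direction, axis=1)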
However, the direction column is the same in every row; this could probably be used to get a completely vectorized solution of the problem,
which would be faster.
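A possible vectorized version is sketched below. It assumes, as in the sample data, that every row holds the same direction string, so the string only needs to be split once; key_to_index and key_to_name are illustrative names, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["aa11bc", "bb22cd", "cc33ef", "dd44gh",
           "ee55ij", "ff66kl", "gg77mn", "hh88op"],
    "direction": ["north, south, east, west"] * 8,
})

# Illustrative lookup: 2nd and 3rd characters of id -> index into the direction list
key_to_index = {"a1": 0, "b2": 0, "c3": 1, "d4": 1,
                "e5": 2, "f6": 2, "g7": 3, "h8": 3}

# Assumption: all rows share the same direction string, so split the first one only
parts = [part.strip() for part in df["direction"].iloc[0].split(",")]
key_to_name = {key: parts[i] for key, i in key_to_index.items()}

# Slice and lowercase the whole id column at once, then look every key up in one pass
keys = df["id"].str[1:3].str.lower()
df["position"] = keys.map(key_to_name)  # ids without a matching key become NaN
print(df)
```

If the direction strings can differ between rows, this shortcut no longer applies and a per-row approach such as apply is the safer choice.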