Nov-19-2020, 01:15 PM
Changed the title to be more specific. I have an update if anyone else stumbles across this searching for an answer.
First, here is what is still NOT working:
def fix(x):
    x.replace('€', '')
    if 'M' in x:
        x.replace('M', '')
        return x
    if 'K' in x:
        x.replace('K', '')
        return x
df1 = pd.DataFrame(df, columns=['Name', 'Value', 'Wage'])
df1['Wage'] = df1['Wage'].apply(fix)
df1['Value'] = df1['Value'].apply(fix)
df1.head(6)
I have not added the numeric conversion here because the replacement does not even work on its own. The code runs and head() prints the rows, but the '€', 'M' and 'K' characters are still there. However, if I write the same logic as a lambda function, it works. Pandas is really annoying me.
Now for what DOES work:
df1['Wage'] = df1['Wage'].replace('[€MK]', '', regex=True).astype(float)*1000
df1['Value'] = df1['Value'].replace('[€MK]', '', regex=True).astype(float)*1000000
I guess it is nicer to get it all done in two lines, and it is good to know these pandas-specific methods, but I still don't understand what went wrong with the other version.
If someone could explain why one of these works and the other does not, it would be very helpful to my understanding and sanity.
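In case it helps anyone reproduce this, here is a minimal self-contained sketch of the working version. The sample rows are made up for illustration (they are not my real data), and it assumes every Wage is given in K and every Value in M, as the two lines above do.

```python
import pandas as pd

# Made-up sample rows standing in for the real dataset (assumption).
df1 = pd.DataFrame({
    'Name': ['Player A', 'Player B'],
    'Value': ['€110.5M', '€77M'],
    'Wage': ['€565K', '€405K'],
})

# Strip the currency sign and the unit suffix with a regex, convert to float,
# then scale. Assumes all wages are in thousands and all values in millions.
df1['Wage'] = df1['Wage'].replace('[€MK]', '', regex=True).astype(float) * 1000
df1['Value'] = df1['Value'].replace('[€MK]', '', regex=True).astype(float) * 1000000

print(df1.head(6))
```

If a column mixed K and M suffixes, a single multiplier per column would give wrong numbers, so this only fits data shaped like the sample.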