Python Forum
Pandas - Creating additional column in dataframe from another column

Pandas - Creating additional column in dataframe from another column - Azureaus - Jan-11-2021

I currently have some financial data (Date, Price, Return). Assume the number of rows is 20,000.

Date        Price      Return
03/01/1950  16.66
04/01/1950  16.85       0.01140
05/01/1950  16.93       0.00475
06/01/1950  16.98       0.00295
09/01/1950  17.08       0.00589
10/01/1950  17.030001  -0.00293
11/01/1950  17.09       0.00352

The Return column is calculated using the formula (today's price - yesterday's price) / (yesterday's price).

In Excel, for example, the first Return cell (0.01140) would be =(B3-B2)/B2, i.e. (16.85-16.66)/16.66.
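
As a quick sanity check of the formula, the first return can be reproduced from the first two prices in the table. This is only a minimal sketch in plain Python; the variable names are illustrative and not part of any dataframe.

```python
# First two prices from the table above
prev_price = 16.66  # 03/01/1950
price = 16.85       # 04/01/1950

# Simple return: (today's price - yesterday's price) / yesterday's price
ret = (price - prev_price) / prev_price
print(round(ret, 5))
```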

Let's assume I only have the Date and Price in a Pandas data frame and I want to calculate the "Return" column and add it to the dataframe. What is the best way to do this?

I could do something like

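prices = df['Price']
temp = [None]  # the first row has no previous price, so no return
for i in range(1, len(prices)):
    # .iloc indexes by position, so this still works if the index is not 0..n-1
    result = (prices.iloc[i] - prices.iloc[i-1]) / prices.iloc[i-1]
    temp.append(result)
df['Return'] = temp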
But that seems like a really messy solution. I worry that for a large data set the for loop could be really slow. I think I should be able to define some sort of transformation function that could be applied more efficiently?

Apologies if this is a really beginner question. Happy to read any documentation you could point me towards. I'm currently working through a Python book.

Thanks for your help

P.S. Is there a way to paste the data as a nice table? I'm sorry about the formatting.


RE: Pandas - Creating additional column in dataframe from another column - buran - Jan-11-2021

print(df)
df['Return'] = df.Price.pct_change()
df['Return2'] = (df.Price - df.Price.shift(1))/df.Price.shift(1)
print(df)
Output:
         Date      Price
0  03/01/1950  16.660000
1  04/01/1950  16.850000
2  05/01/1950  16.930000
3  06/01/1950  16.980000
4  09/01/1950  17.080000
5  10/01/1950  17.030001
6  11/01/1950  17.090000
         Date      Price    Return   Return2
0  03/01/1950  16.660000       NaN       NaN
1  04/01/1950  16.850000  0.011405  0.011405
2  05/01/1950  16.930000  0.004748  0.004748
3  06/01/1950  16.980000  0.002953  0.002953
4  09/01/1950  17.080000  0.005889  0.005889
5  10/01/1950  17.030001 -0.002927 -0.002927
6  11/01/1950  17.090000  0.003523  0.003523
You may want to use Date as the index.
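
A minimal sketch of what using Date as the index could look like. It assumes the dates are day/month/year strings (the sample rows are consistent with that, but the thread does not say so) and rebuilds a few rows by hand instead of loading the real file.

```python
import pandas as pd

# A few sample rows copied from the thread.
# Assumption: the dates are day/month/year strings.
df = pd.DataFrame({
    "Date": ["03/01/1950", "04/01/1950", "05/01/1950", "06/01/1950"],
    "Price": [16.66, 16.85, 16.93, 16.98],
})

# Parse the strings into datetimes, then make them the index
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
df = df.set_index("Date")

# pct_change works the same way on a date-indexed frame
df["Return"] = df["Price"].pct_change()
print(df)
```

With a datetime index, rows can then be selected by date, for example df.loc["1950-01-04"].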


RE: Pandas - Creating additional column in dataframe from another column - Azureaus - Jan-11-2021

Thank you - so simple when you know how. This is exactly what I needed and also answered one of my follow-up questions.

What would you do if mapping from one column to another was a more complex custom function?
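
One possible approach to the question above, offered only as a sketch: Series.apply calls an arbitrary Python function once per value, and many rules can also be written in vectorized form with NumPy. The label_move function and the Move/Move2 columns are hypothetical, invented here for illustration.

```python
import numpy as np
import pandas as pd

# Prices copied from the thread
df = pd.DataFrame({"Price": [16.66, 16.85, 16.93, 16.98, 17.08, 17.030001, 17.09]})
df["Return"] = df["Price"].pct_change()

# Hypothetical custom rule, made up for this example: label each day's move
def label_move(ret):
    if pd.isna(ret):
        return "n/a"
    return "up" if ret > 0 else "down"

# Series.apply runs the function once per element: flexible, but a Python-level loop
df["Move"] = df["Return"].apply(label_move)

# The same rule in vectorized form, which is usually faster on large frames
df["Move2"] = np.select(
    [df["Return"].isna(), df["Return"] > 0],
    ["n/a", "up"],
    default="down",
)
print(df)
```

For a function that needs several columns at once, DataFrame.apply with axis=1 passes each row to the function, at a similar per-row cost.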