Python Forum
Pandas - Creating additional column in dataframe from another column
#1
Question 
I currently have some financial data (Date, Price, Return). Assume the number of rows is 20,000.

Date Price Return
03/01/1950 16.66
04/01/1950 16.85 0.01140
05/01/1950 16.93 0.00475
06/01/1950 16.98 0.00295
09/01/1950 17.08 0.00589
10/01/1950 17.030001 -0.00293
11/01/1950 17.09 0.00352

The Return column is calculated using the formula (today's price - yesterday's price) / (yesterday's price).

In Excel, for example, the first Return cell (0.01140) would be =(B3-B2)/B2, i.e. (16.85-16.66)/16.66.
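As a quick sanity check of that formula, here is a minimal plain-Python sketch using the first three prices from the table above. The variable names are only illustrative and no pandas is involved yet.

```python
# First three prices from the sample data above
prices = [16.66, 16.85, 16.93]

# The first row has no previous price, so its return is left as None.
# For every later row: (today's price - yesterday's price) / yesterday's price
returns = [None] + [
    (today - prev) / prev
    for prev, today in zip(prices, prices[1:])
]

print(returns)
```

The second value comes out at roughly 0.0114, which matches the first Return cell in the table.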

Let's assume I only have the Date and Price in a Pandas data frame and I want to calculate the "Return" column and add it to the dataframe. What is the best way to do this?

I could do something like

temp = [None]
for i in range(1, len(df['Price'])):
    result = (df['Price'][i]-df['Price'][i-1])/df['Price'][i-1]
    temp.append(result)
df['Return'] = temp
But that seems like a really messy solution. I worry that for a large data set the for loop could be really slow. I think I should be able to define some sort of transformation function that could be applied more efficiently?

Apologies if this is a really beginner question. Happy to read any documentation you could point me towards. I'm currently going through a Python book.

Thanks for your help

P.S. Is there a way to paste the data as a nice table? I'm sorry about the formatting.
#2
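print(df)
# pct_change() computes (current - previous) / previous for each row
df['Return'] = df.Price.pct_change()
# the same calculation written out by hand; shift(1) moves each price down one row
df['Return2'] = (df.Price - df.Price.shift(1))/df.Price.shift(1)
print(df)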
Output:
         Date      Price
0  03/01/1950  16.660000
1  04/01/1950  16.850000
2  05/01/1950  16.930000
3  06/01/1950  16.980000
4  09/01/1950  17.080000
5  10/01/1950  17.030001
6  11/01/1950  17.090000

         Date      Price    Return   Return2
0  03/01/1950  16.660000       NaN       NaN
1  04/01/1950  16.850000  0.011405  0.011405
2  05/01/1950  16.930000  0.004748  0.004748
3  06/01/1950  16.980000  0.002953  0.002953
4  09/01/1950  17.080000  0.005889  0.005889
5  10/01/1950  17.030001 -0.002927 -0.002927
6  11/01/1950  17.090000  0.003523  0.003523
You may want to use Date as the index.
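A minimal sketch of what that could look like, assuming the Date strings are day/month/year (the gap between 06/01 and 09/01 looks like a weekend, but that is a guess about the data). The small frame below is rebuilt from the sample rows purely for illustration.

```python
import pandas as pd

# Small stand-in for the real data, taken from the sample rows
df = pd.DataFrame({
    "Date": ["03/01/1950", "04/01/1950", "05/01/1950"],
    "Price": [16.66, 16.85, 16.93],
})

# Assumption: dates are day/month/year, so the format is given explicitly
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Use the parsed dates as the index, then compute the return as before
df = df.set_index("Date")
df["Return"] = df["Price"].pct_change()

print(df)
```

With a datetime index, rows can be selected by date, for example df.loc["1950-01-04"].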
#3
Thank you - so simple when you know how. This is exactly what I needed and also answered one of my follow-up questions.

What would you do if mapping from one column to another was a more complex custom function?
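For reference, two common options are sketched below under some assumptions. If the custom function can be written with array operations, it can stay vectorised (the log return here is just an example). Otherwise Series.apply calls a Python function once per value, which is flexible but slower. The function label_move and the column names LogReturn and Move are hypothetical, made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Price": [16.66, 16.85, 16.93, 16.98, 17.08, 17.030001, 17.09]}
)
df["Return"] = df["Price"].pct_change()

# Option 1: keep it vectorised by combining whole columns.
# shift(1) lines up each price with the previous row's price.
df["LogReturn"] = np.log(df["Price"] / df["Price"].shift(1))


# Option 2: an arbitrary Python function applied value by value.
def label_move(ret):
    """Hypothetical rule: label each return by its direction."""
    if pd.isna(ret):
        return "n/a"
    return "up" if ret > 0 else "flat/down"


df["Move"] = df["Return"].apply(label_move)

print(df)
```

On 20,000 rows either option should be fine; the vectorised form tends to matter more as the data grows.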