Python Forum
Pandas - Creating additional column in dataframe from another column
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Pandas - Creating additional column in dataframe from another column
#1
Question 
I currently have a some financial data (Date, Price, Return). Assume the number of rows is 20,000.

Date Price Return
03/01/1950 16.66
04/01/1950 16.85 0.01140
05/01/1950 16.93 0.00475
06/01/1950 16.98 0.00295
09/01/1950 17.08 0.00589
10/01/1950 17.030001 -0.00293
11/01/1950 17.09 0.00352

The Return column is calculated using the formula (today price - yesterdays price)/(yesterdays Price)

In excel for example the first Return cell (0.01140) would be =(B3-B2)/B2. (16.85-16.66)/16.66

Let's assume I only have the Date and Price in a Pandas data frame and I want to calculate the "Return" column and add it to the dataframe. What is the best way to do this?

I could do something like

temp = [None]
for i in range(1, len(df['Price'])):
    result = (df['Price'][i]-df['Price'][i-1])/df['Price'][i-1]
    temp.append(result)
df['Return'] = temp
But that seems look a really messy solution. I worry that for a a large data set the for loop could be real slow. I'm think I should be able to define some sort of transformation function that could be applied more efficiently?

Apologies if this is a really beginner question. Happy to read any documentation you could point me towards. Currently going through a python book as we speak.

Thanks for your help

P.S. is there a way to paste the data as a nice table? I'm sorry about the formatting
Reply
#2
print(df)
df['Return'] = df.Price.pct_change()
df['Return2'] = (df.Price - df.Price.shift(1))/df.Price.shift(1)
print(df)
Output:
Date Price 0 03/01/1950 16.660000 1 04/01/1950 16.850000 2 05/01/1950 16.930000 3 06/01/1950 16.980000 4 09/01/1950 17.080000 5 10/01/1950 17.030001 6 11/01/1950 17.090000 Date Price Return Return2 0 03/01/1950 16.660000 NaN NaN 1 04/01/1950 16.850000 0.011405 0.011405 2 05/01/1950 16.930000 0.004748 0.004748 3 06/01/1950 16.980000 0.002953 0.002953 4 09/01/1950 17.080000 0.005889 0.005889 5 10/01/1950 17.030001 -0.002927 -0.002927 6 11/01/1950 17.090000 0.003523 0.003523
you may want to use Date as index
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#3
Thank you - so simple when you know how. This is exactly what I needed and also answered one of my follow-up questions.

What would you do if mapping from one column to another was a more complex custom function?
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  New Dataframe Column Based on Several Conditions nb1214 1 164 Nov-16-2021, 10:52 PM
Last Post: jefsummers
  pandas: Compute the % of the unique values in a column JaneTan 1 300 Oct-25-2021, 07:55 PM
Last Post: jefsummers
  Putting column name to dataframe, can't work. jonah88888 1 367 Sep-28-2021, 07:45 PM
Last Post: deanhystad
  What if a column has about 90% of data as outliers? Asahavey17 1 439 Aug-23-2021, 04:55 PM
Last Post: jefsummers
  update values in one dataframe based on another dataframe - Pandas iliasb 2 1,383 Aug-14-2021, 12:38 PM
Last Post: jefsummers
  Pandas Data frame column condition check based on length of the value aditi06 1 641 Jul-28-2021, 11:08 AM
Last Post: jefsummers
  empty row in pandas dataframe rwahdan 3 762 Jun-22-2021, 07:57 PM
Last Post: snippsat
  How to move each team row to a new column. Pandas vladiwnl 0 577 Jun-13-2021, 08:10 AM
Last Post: vladiwnl
  first project read from xslx get only 1 column\row? korenron 9 1,161 Jun-13-2021, 07:21 AM
Last Post: korenron
  Adding a new column to a Panda Data Frame rsherry8 2 653 Jun-06-2021, 06:49 PM
Last Post: jefsummers

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020