Python Forum
Pandas - Creating additional column in dataframe from another column
#1
I currently have some financial data (Date, Price, Return). Assume the number of rows is 20,000.

Date        Price      Return
03/01/1950  16.66
04/01/1950  16.85       0.01140
05/01/1950  16.93       0.00475
06/01/1950  16.98       0.00295
09/01/1950  17.08       0.00589
10/01/1950  17.030001  -0.00293
11/01/1950  17.09       0.00352

The Return column is calculated using the formula (today's price - yesterday's price) / (yesterday's price).

In Excel, for example, the first Return cell (0.01140) would be =(B3-B2)/B2, i.e. (16.85-16.66)/16.66.

Let's assume I only have the Date and Price in a Pandas data frame and I want to calculate the "Return" column and add it to the dataframe. What is the best way to do this?

I could do something like

temp = [None]
for i in range(1, len(df['Price'])):
    result = (df['Price'][i]-df['Price'][i-1])/df['Price'][i-1]
    temp.append(result)
df['Return'] = temp
But that seems like a really messy solution. I worry that for a large data set the for loop could be really slow. I think I should be able to define some sort of transformation function that could be applied more efficiently?

Apologies if this is a really beginner question. Happy to read any documentation you could point me towards. I'm currently working through a Python book.

Thanks for your help

P.S. Is there a way to paste the data as a nice table? I'm sorry about the formatting.
#2
print(df)
df['Return'] = df.Price.pct_change()
df['Return2'] = (df.Price - df.Price.shift(1))/df.Price.shift(1)
print(df)
Output:
         Date      Price
0  03/01/1950  16.660000
1  04/01/1950  16.850000
2  05/01/1950  16.930000
3  06/01/1950  16.980000
4  09/01/1950  17.080000
5  10/01/1950  17.030001
6  11/01/1950  17.090000
         Date      Price    Return   Return2
0  03/01/1950  16.660000       NaN       NaN
1  04/01/1950  16.850000  0.011405  0.011405
2  05/01/1950  16.930000  0.004748  0.004748
3  06/01/1950  16.980000  0.002953  0.002953
4  09/01/1950  17.080000  0.005889  0.005889
5  10/01/1950  17.030001 -0.002927 -0.002927
6  11/01/1950  17.090000  0.003523  0.003523
You may want to use Date as the index.
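Using Date as the index might look something like the sketch below. It rebuilds the sample data from the question by hand and assumes the dates are day-first (dd/mm/yyyy), which is a guess based on the sequence skipping from 06/01 to 09/01; adjust the `format` string if your data is month-first.

```python
import pandas as pd

# Sample rows copied from the question; a real data set would be loaded from a file.
df = pd.DataFrame({
    "Date": ["03/01/1950", "04/01/1950", "05/01/1950", "06/01/1950",
             "09/01/1950", "10/01/1950", "11/01/1950"],
    "Price": [16.66, 16.85, 16.93, 16.98, 17.08, 17.030001, 17.09],
})

# Assumption: dates are dd/mm/yyyy. Parse them into real datetimes.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

# Make Date the index so rows are labelled by date instead of 0, 1, 2, ...
df = df.set_index("Date")

# Same vectorised calculation as above: (price - previous price) / previous price.
# The first row has no previous price, so its Return is NaN.
df["Return"] = df["Price"].pct_change()

print(df)
```

With a datetime index, rows can then be selected by date label, for example `df.loc["1950-01-04"]`.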
#3
Thank you - so simple when you know how. This is exactly what I needed and also answered one of my follow-up questions.

What would you do if mapping from one column to another was a more complex custom function?