Python Forum
Average values on duplicate records
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Average values on duplicate records
#1
I have the following input table

| Custid | transactions | price | revenue |
| ------ | ------------ | ----- | ------- |
| 12456 | 2300 | 21 | 48300 |
| 12456 | 5200 | 29 | 150800 |
| 78901 | 3000 | 30 | 90000 |
| 676512 | 5200 | 24 | 124800 |
| 676512 | 7400 | 18 | 133200 |

where the custid has duplicates . There is a need to remove the duplicates and find the average of the transactions field and the price field for the custid that has more than 1 record and recalculate the revenue as ((new avg)transactions *(newavg) price)

Output table

| Custid | transactions | price | revenue |
| ------ | ------------ | ----- | ------- |
| 12456 | 3750 | 25 | 93750 |
| 78901 | 3000 | 30 | 90000 |
| 676512 | 6300 | 21 | 132300 |

[b]Python code tried


import pandas as pd

df1 .pd.read_csv("file1.csv")

df2=df1[custid].value_counts()

df1[transactions]=np.where(df2[count]>1,df1[transactions].mean())
df1[price]=np.where(df2[count]>1,df1[price].mean())
df1[revenue]=df1[transactions]*df1[price]

[/b]
Reply


Messages In This Thread
Average values on duplicate records - by chandramouliarun - Jul-26-2022, 09:52 PM

Possibly Related Threads…
Thread Author Replies Views Last Post
  Get an average of the unique values of a column with group by condition and assign it klllmmm 0 578 Feb-17-2024, 05:53 PM
Last Post: klllmmm
  What is the better way of avoiding duplicate records after aggregation in pandas ? jagasrik 0 1,780 Aug-30-2020, 05:26 PM
Last Post: jagasrik
  Getting duplicate column values but same row values from a csv smidtol 1 1,944 Feb-10-2020, 08:43 PM
Last Post: Larz60+
  Combine Duplicate CVS Column Values Jay 3 5,147 Nov-03-2016, 07:25 PM
Last Post: snippsat

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020