Python Forum
Get an average of the unique values of a column with group by condition and assign it
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Get an average of the unique values of a column with group by condition and assign it
#1
I'm trying to get an average of the unique values of a column with a group by condition.

In the method, I followed I had to create a new data frame for unique values and then merge the results into the main data frame.

Is there a better way to use the pandas transform method to achieve my result?

In below reproducible example, I'm getting 1. the average of the count column unique values under each user and 2. then getting the average of the count column unique values under each user for value in category column-wise.

df1 = pd.DataFrame({'user_id':['A','A','A', 'B','B','B', 'D','D','D', 'E','E'],
                'Category':['ABC','ABC','ABC','ABC','ABC','ABC','XYZ','XYZ','XYZ','XYZ','XYZ'],
                'counts':[3,3,3,2,2,1,2,1,2,2,2]})

df1_unique = df1.drop_duplicates(["user_id","Category",'counts'], keep='first')

df1["Overall_average__Unique_Counts"] = df1_unique["Overall_average__Unique_Counts"]= df1_unique['counts'].mean()

df1_unique["Categorywise_average_Unique_counts"] = df1_unique.groupby(["Category"])['counts'].transform('mean')

df2 = pd.merge(df1,df1_unique[["user_id","Category","counts","Categorywise_average_Unique_counts"]],on=["user_id","Category","counts"],how='left')
This the resultant table

Output:
print(df2) user_id Category counts Overall_average__Unique_Counts \ 0 A ABC 3 1.83 1 A ABC 3 1.83 2 A ABC 3 1.83 3 B ABC 2 1.83 4 B ABC 2 1.83 5 B ABC 1 1.83 6 D XYZ 2 1.83 7 D XYZ 1 1.83 8 D XYZ 2 1.83 9 E XYZ 2 1.83 10 E XYZ 2 1.83 Categorywise_average_Unique_counts 0 2.00 1 2.00 2 2.00 3 2.00 4 2.00 5 2.00 6 1.67 7 1.67 8 1.67 9 1.67 10 1.67
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Converting column of values into muliple columns of counts highland44 0 257 Feb-01-2024, 12:48 AM
Last Post: highland44
  Python Alteryx QS-Passing pandas dataframe column inside SQL query where condition sanky1990 0 747 Dec-04-2023, 09:48 PM
Last Post: sanky1990
  Add a row per group and depending of values Menthix 0 519 Mar-20-2023, 11:20 AM
Last Post: Menthix
  Replacing values ​​in Mysql with a condition stsxbel 0 642 Mar-05-2023, 08:20 PM
Last Post: stsxbel
  create new column based on condition arvin 12 2,255 Dec-13-2022, 04:53 PM
Last Post: jefsummers
  PowerBI: Using Python Regex to look for values in column MarcusR44 1 980 Oct-14-2022, 01:03 PM
Last Post: ibreeden
  How to assign a value to pandas dataframe column rows based on a condition klllmmm 0 849 Sep-08-2022, 06:32 AM
Last Post: klllmmm
  How to combine multiple column values into 1? cubangt 15 2,872 Aug-11-2022, 08:25 PM
Last Post: cubangt
  Average values on duplicate records chandramouliarun 1 837 Jul-27-2022, 03:13 AM
Last Post: deanhystad
  df column mean and count of unique SriRajesh 0 1,121 May-07-2022, 08:19 AM
Last Post: SriRajesh

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020