What is the better way of avoiding duplicate records after aggregation in pandas? - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: What is the better way of avoiding duplicate records after aggregation in pandas? (/thread-29373.html)
What is the better way of avoiding duplicate records after aggregation in pandas? - jagasrik - Aug-30-2020

I want to know a better way of selecting the top revenue-generating groups. This is the data I am using. Here is my code; I want to see which genres have the highest revenue:

import pandas as pd

# Use a raw string so the backslashes in the Windows path are not
# treated as escape sequences.
df = pd.read_csv(r'Downloads\gpdset\google-play-store-11-2018.csv')
df['Top_revenue'] = df.groupby('genre_id')['price'].transform('sum')
df[['genre_id', 'Top_revenue']].drop_duplicates().sort_values(by=['Top_revenue'], ascending=False)

I am able to get the correct and intended results, but I feel this is not the right way to do it, because I am doing an aggregation with transform('sum') and then dropping the duplicates, which seems like bad design. If there is a better way of doing it, please do let me know. Thanks in advance.
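One common alternative is to aggregate directly with groupby().sum() instead of transform('sum'): a plain aggregation collapses each group to a single row, so there are no duplicates to drop afterwards. Below is a minimal sketch using a small toy DataFrame, since the original CSV is not available here; the column names genre_id and price are taken from the question.

```python
import pandas as pd

# Toy data standing in for the Google Play Store CSV
# (genre_id and price are the column names from the question).
df = pd.DataFrame({
    'genre_id': ['game', 'game', 'tools', 'tools', 'music'],
    'price':    [3.0,    2.0,    4.0,     0.5,     0.99],
})

# One aggregation, no duplicates: groupby + sum yields a single row
# per genre, then sort the totals in descending order.
top_revenue = (
    df.groupby('genre_id')['price']
      .sum()
      .sort_values(ascending=False)
      .reset_index(name='Top_revenue')
)
print(top_revenue)
```

If only the top N genres are needed, .nlargest(N) can replace the .sort_values(ascending=False) step and avoids sorting the full result.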