SORTED.group by and count average of two columns [ sum of col 1 / sum of col 2 ]

BSDevo · (This post was last modified: Oct-26-2023, 04:30 PM by BSDevo.)

Hi, im trying to group by and get an average of two columns in Pandas then display inside Streamlit_Folium chropleth.geojson map but my KPI and Map averages are different.Trying to group by states and get each state average.
I have a RPM column which is only for dataframe purpose and i created by doing this:

        
              df["RPM"] = df["Rate"]/df["Miles"]

for KPI i did this:

        
              filtered_average_rpm = round(filtered_total_earnings / filtered_total_miles, 3)

For chorplet map my code is like this:

        
              filtered_df["StateRPM"] = filtered_df["RPM"].groupby(filtered_df["PuState"]).transform('mean')

        
              for feature in choropleth.geojson.data["features"]:
    state_name = feature["properties"]["stusab"]
    feature["properties"]["rpm"] = "Average RPM: $" + str(round(indexed_df.loc[state_name, "StateRPM"].mean(), 2)) if state_name in list(indexed_df.index) else ''

But this is no go. KPI average 2.212 and Map average 2.26
KPI average is money / miles and Map average is sum of values and divides by the count and i need sum of money / sum of miles - same way as KPI

If i do:

        
              filtered_df["StateRPM"] = (filtered_df["Rate"]/filtered_df["Miles"]).groupby(filtered_df["PuState"]).transform('mean')

I get this:

Error:
TypeError: ufunc 'divide' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

If i do:

        
              filtered_df["StateRPM"] = filtered_df["Rate"]/filtered_df["Miles"].groupby(filtered_df["PuState"])

i get this:

Error:
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (44, 2) + inhomogeneous part.

I also tried like this:

        
              filtered_df["RPM"] = filtered_df["RPM"].groupby(filtered_df["PuState"])

got this:

Error:
ValueError: Length of values (1) does not match length of index (58)

How do i get KPI`s average ?
Thank You.

BSDevo · (This post was last modified: Oct-23-2023, 09:47 PM by BSDevo.)

Solved.

        
              filtered_df["RPM"] = filtered_df["Rate"]/filtered_df["Miles"].groupby(filtered_df["PuState"]).transform('mean')

Turns out i solved it by removing ()

EDIT: No , it makes incorrect calculations

EDIT#2 - sorted.

        
              filtered_df["StateRate"] = filtered_df["Rate"].groupby(filtered_df["PuState"]).transform('mean')
filtered_df["StateRPM"] = filtered_df["StateRate"]/filtered_df["Miles"].groupby(filtered_df["PuState"]).transform('mean')

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Remove extra count columns created by pandas groupby	spyf8	1	3,829	Feb-10-2021, 09:19 AM Last Post: Naheed
	Merging sorted dataframes using Pandas	Robotguy	1	2,952	Aug-12-2020, 07:11 PM Last Post: jefsummers
	How to group variables & check correlation of group variables wrt single variable	SriRajesh	2	3,706	May-23-2018, 03:01 PM Last Post: SriRajesh

SORTED.group by and count average of two columns [ sum of col 1 / sum of col 2 ]

User Panel Messages

Announcements