Python Forum
Feature Scaling with Partitions
#1
I would like to normalize a column in a Pandas DataFrame. However, I would like to partition the table by shop_id and normalize item_cnt_day separately within each shop_id. Here's the dataset link if you're interested.

Does anyone know a method to achieve this result? Custom code is welcome! Thanks.

rocketfish
#2
Something like the following should work:

df['item_cnt_day'] = df.groupby('shop_id')['item_cnt_day'].transform(lambda x: (x-x.mean())/x.std())

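However, if a group (the set of records with the same shop_id value) consists of only one element, this per-group scaling
will yield NaN, since x.std() is NaN for a single-element Series (pandas computes the sample standard deviation, ddof=1, by default). A group whose values are all identical also yields NaN, because x.std() is 0 there and the result is 0/0.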
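
One possible way to guard against those NaN values is sketched below. The helper name zscore_or_zero, the output column item_cnt_scaled, the toy data, and the choice to map degenerate groups to 0.0 are all assumptions made for illustration; they are not part of the original suggestion, and a different fallback may suit your data better.

```python
import numpy as np
import pandas as pd

def zscore_or_zero(x: pd.Series) -> pd.Series:
    """Standardize one group; fall back to zeros when std is NaN or 0.

    Hypothetical helper: the 0.0 fallback is an assumption, not a rule.
    """
    std = x.std()  # sample std (ddof=1): NaN for a single-row group
    if pd.isna(std) or std == 0:
        return pd.Series(0.0, index=x.index)
    return (x - x.mean()) / std

# Toy data (made up): shop 2 has one row, shop 3 has constant values.
df = pd.DataFrame({
    "shop_id": [1, 1, 1, 2, 3, 3],
    "item_cnt_day": [1.0, 2.0, 3.0, 5.0, 4.0, 4.0],
})

# transform() keeps the original row order and index of df.
df["item_cnt_scaled"] = (
    df.groupby("shop_id")["item_cnt_day"].transform(zscore_or_zero)
)
print(df)
```

Writing to a new column rather than overwriting item_cnt_day keeps the raw counts available for checking the result.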
#3
@scidam - Thank you for your quick response! This is exactly the elegant solution I was hoping for.
#4
For any others who are interested, here's a helpful explanation of how transform() works.

https://pbpython.com/pandas_transform.html
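
As a rough illustration of the idea (my own toy example, not taken from the linked article): an aggregation such as mean() returns one value per group, while transform() broadcasts that value back to every row of the group, so the result lines up with the original DataFrame. The data and the centered column below are made up.

```python
import pandas as pd

# Toy data (made up for illustration).
df = pd.DataFrame({
    "shop_id": [1, 1, 2, 2, 2],
    "item_cnt_day": [2.0, 4.0, 1.0, 2.0, 3.0],
})
grouped = df.groupby("shop_id")["item_cnt_day"]

per_group = grouped.mean()           # one value per shop_id, indexed by shop_id
per_row = grouped.transform("mean")  # group mean repeated per row, indexed like df

# Because per_row shares df's index, it can be used directly in row-wise math.
df["centered"] = df["item_cnt_day"] - per_row
print(per_group)
print(df)
```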