Python Forum
Monthly sales, standard deviation
#1
Hi, there. I am back to the knowledge well!

Using pandas, I have a dataframe with Date, Item# and Qty columns covering 12 months of data. I want to calculate the standard deviation of monthly sales for each Item# to understand its month-to-month variability.

The problem is that for many Item#s the sales are very sporadic: I may sell Item XYZ only in Feb and Mar, and nothing in the other months of the year.

I am using this code to do the heavy lifting:

df.groupby('Item#').resample('M').sum()

For some Item#s it fills the empty months with zeros (GOOD!), but in many cases it returns only, say, two months of data instead of 12, which makes the std dev calculation incorrect.

Can someone help me understand why the resample method behaves so erratically? How can I work around this problem?

Many thanks in advance for your help!
#2
uh-oh. no responses. Was this a really dumb question? :(
#3
Maybe they are waiting for an example?
#4
Right, makes sense. It's just tough to make a concise example b/c the data is so lengthy.

I did some more experimenting and I've figured out what's going on. Within each group, resample() will interpolate, but not extrapolate: it fills in the months between an item's first and last sale, but adds nothing before the first or after the last. So if I have sales of an Item in Jan and Dec, I will get 12 values, even if there are zero sales in Feb through Nov. If I have sales in Mar and Jun only, but none in Jan-Feb or Jul-Dec, I will get 4 values (Mar, Apr, May, Jun).
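To make that behaviour concrete, here is a minimal sketch with made-up data (the dates, item names and quantities are invented for illustration, not taken from the real dataset). It uses the "MS" (month start) alias instead of "M", on the assumption that this sidesteps the month-end alias differing between pandas versions ("M" in older releases, "ME" in newer ones); the row counts come out the same either way.

```python
import pandas as pd

# Made-up sample data: item A sells in Jan and Dec, item B in Mar and Jun.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-15", "2023-12-03", "2023-03-10", "2023-06-21"]),
    "Item#": ["A", "A", "B", "B"],
    "Qty": [5, 7, 2, 4],
})

# Resample each item separately; each group only spans its own first..last month.
monthly = df.set_index("Date").groupby("Item#")["Qty"].resample("MS").sum()

# Number of monthly rows produced per item.
counts = monthly.groupby(level="Item#").size()
print(counts)
```

Item A, whose sales bracket the whole year, gets a row for every month, while item B only gets the months from its first sale to its last.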

So, because I have sales of something in every month of the year, my solution was to unstack() the dataframe after the resample(). That put the 12 resampled months along the top as columns, with a ton of NaNs. I filled the NaNs with zeros, then called stack(), and presto: 12 values for every item.

Seems a little kludgy, maybe there's a better way?
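One possible alternative, sketched here under assumptions rather than tested against the real data: sum by calendar month, then reindex() against every (item, month) combination with fill_value=0, so each item gets exactly 12 rows before the std dev is taken. The sample data, the 2023 date range and the names `full_index`, `monthly_full` and `monthly_std` are all invented for illustration.

```python
import pandas as pd

# Made-up sample data: item A sells in Jan and Dec, item B in Mar and Jun.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-15", "2023-12-03", "2023-03-10", "2023-06-21"]),
    "Item#": ["A", "A", "B", "B"],
    "Qty": [5, 7, 2, 4],
})

# Total quantity per item per calendar month (only months with sales appear).
month = df["Date"].dt.to_period("M")
monthly = df.groupby(["Item#", month])["Qty"].sum()

# Every (item, month) pair for the year; the year is an assumption here.
full_index = pd.MultiIndex.from_product(
    [df["Item#"].unique(), pd.period_range("2023-01", "2023-12", freq="M")],
    names=["Item#", "Month"],
)

# Missing months become explicit zeros, giving 12 rows per item.
monthly_full = monthly.reindex(full_index, fill_value=0)

# Standard deviation of monthly sales per item (pandas defaults to ddof=1).
monthly_std = monthly_full.groupby(level="Item#").std()
print(monthly_std)
```

This skips the unstack()/stack() round trip and also covers the case where some month has no sales of any item, since the month range is spelled out explicitly instead of being inferred from the data.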