Python Forum

Full Version: resample grouping problem
I have a DataFrame with 3 columns (date, product_id and sales_amt).

The dates are supposed to span a whole month (e.g. Nov 2019), but some days are missing from the DataFrame for each product_id.

Does anyone have tips on Python code that can loop through the dates of a particular month for each product and add a new row to the DataFrame with the missing date, the product_id and a sales_amt of zero?

The goal is to end up with an entry for every day of that month per product_id. I believe resample can handle this, but what would the code look like?

Thanks for any guidance on this.
Setting the DataFrame manipulation aside, you can find the missing dates with something like:
from datetime import date, timedelta

# Dates that are already present in the data
datelist = [date(2019, 10, 1), date(2019, 10, 3), date(2019, 10, 15), date(2019, 10, 29)]
beg_of_month = date(2019, 10, 1)
end_of_month = date(2019, 10, 31)
# Every day of the month; the + 1 makes sure end_of_month itself is included
n_days = (end_of_month - beg_of_month).days + 1
date_set = {beg_of_month + timedelta(days=x) for x in range(n_days)}
# Days of the month that have no entry, in calendar order
nodates = sorted(date_set - set(datelist))
for dt in nodates:
    print(dt)
which yields:
Output:
2019-10-02
2019-10-04
2019-10-05
2019-10-06
2019-10-07
2019-10-08
2019-10-09
2019-10-10
2019-10-11
2019-10-12
2019-10-13
2019-10-14
2019-10-16
2019-10-17
2019-10-18
2019-10-19
2019-10-20
2019-10-21
2019-10-22
2019-10-23
2019-10-24
2019-10-25
2019-10-26
2019-10-27
2019-10-28
2019-10-30
2019-10-31
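For the DataFrame side of the question, here is a minimal sketch of one way to do it in pandas. It assumes the column names from the original post (date, product_id, sales_amt), at most one row per product per day, and made-up sample data for November 2019. It uses reindex against a full product/day index rather than resample, since a per-group resample would only fill gaps between each product's first and last recorded date, not out to the month boundaries.

```python
import pandas as pd

# Hypothetical sample data: two products, most days of Nov 2019 missing
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-11-01", "2019-11-03", "2019-11-02", "2019-11-30"]),
    "product_id": ["A", "A", "B", "B"],
    "sales_amt": [10.0, 5.0, 7.5, 2.0],
})

# Every calendar day of the month we want covered
all_days = pd.date_range("2019-11-01", "2019-11-30", freq="D")

# Every (product_id, date) combination that should exist in the result
full_index = pd.MultiIndex.from_product(
    [df["product_id"].unique(), all_days],
    names=["product_id", "date"],
)

# Reindex onto the full grid; combinations with no row get sales_amt = 0
filled = (
    df.set_index(["product_id", "date"])
      .reindex(full_index, fill_value=0)
      .reset_index()
)

print(filled.head())
```

If the data can hold several rows for the same product and day, reindex will raise on the duplicate index, so aggregate first (for example with groupby(["product_id", "date"]).sum()) before reindexing.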