(Jul-24-2022, 12:41 PM) jefsummers wrote: I hate dates. OK, that's off my chest.
I would create new columns based on the timestamp for year, month, and day, and make those integers.
I would then use groupby on the month column rather than trying to use the timestamp.
Add 'em up, calculate 90th percentile, then select the records that match 90th percentile or above and calculate the average of that subset.
Just my idea.
J
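A minimal sketch of the approach quoted above, assuming a DataFrame with a `timestamp` column and a value column called `var1`. The sample data and every variable name here are made up for illustration; grouping on year and month together (rather than month alone) is also an assumption, made so that the same month in different years is not pooled.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one reading every 2 days, spanning January and February 2022.
idx = pd.date_range("2022-01-01", periods=30, freq="2D")
df = pd.DataFrame({"timestamp": idx, "var1": np.arange(30, dtype=float)})

# Integer calendar columns derived from the timestamp.
df["year"] = df["timestamp"].dt.year
df["month"] = df["timestamp"].dt.month
df["day"] = df["timestamp"].dt.day

# 90th percentile of var1 for each (year, month).
keys = ["year", "month"]
p90 = df.groupby(keys)["var1"].quantile(0.9).rename("p90")

# Attach each row's monthly threshold, keep the rows at or above it,
# then average that subset per month.
with_p90 = df.join(p90, on=keys)
top = with_p90[with_p90["var1"] >= with_p90["p90"]]
monthly_mean_top = top.groupby(keys)["var1"].mean()

print(monthly_mean_top)
```

Joining the per-month percentile back onto the rows is what makes the `>=` comparison row-by-row, so both sides have the same length.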
Hi Jeff,
I could group the data by month as follows:
dataP=dataH[var1].groupby(pd.Grouper(freq='M'))
dataQ=dataP.quantile(0.9)
but when I filter the data for values greater than or equal to the 90th percentile
dataX=dataP[dataP >= dataQ]
I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-128-92cfb00c3619> in <module>
3 #dataP.reset_index()
4 dataQ=dataP.quantile(0.9)
----> 5 dataX=dataP[dataP >= dataQ]
6 #for val in dataP:
7 # if (val > dataP.quantile(0.9)):
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
62 other = item_from_zerodim(other)
63
---> 64 return method(self, other)
65
66 return new_method
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops\__init__.py in wrapper(self, other)
524 rvalues = extract_array(other, extract_numpy=True)
525
--> 526 res_values = comparison_op(lvalues, rvalues, op)
527
528 return _construct_result(self, res_values, index=self.index, name=res_name)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops\array_ops.py in comparison_op(left, right, op)
251 method = getattr(lvalues, op_name)
252 with np.errstate(all="ignore"):
--> 253 res_values = method(rvalues)
254
255 if res_values is NotImplemented:
ValueError: operands could not be broadcast together with shapes (555,) (555,2)
Thank you for the help
Nuncio
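A possible way around the broadcast error, sketched under some assumptions. In the code above `dataP` is a groupby object rather than a Series, while `dataQ` holds one value per month, so the two cannot be compared element by element; the 555 in the error message is presumably the number of months. One option is `transform`, which returns the per-month quantile repeated for every original row. The series `s` below is a hypothetical stand-in for `dataH[var1]`, and `to_period("M")` is used as the grouping key in place of `pd.Grouper(freq='M')` only because the `'M'` frequency alias was deprecated in favor of `'ME'` in pandas 2.2.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dataH[var1]: a Series on a DatetimeIndex.
idx = pd.date_range("2022-01-01", periods=30, freq="2D")
s = pd.Series(np.arange(30, dtype=float), index=idx, name="var1")

# Group by calendar month.
grouped = s.groupby(s.index.to_period("M"))

# One value per month: its length is the number of months, not the number of rows.
q90 = grouped.quantile(0.9)

# transform returns a Series aligned with s, where each row carries
# the 90th percentile of its own month.
threshold = grouped.transform(lambda g: g.quantile(0.9))

# Both sides now have the same shape, so the comparison works row by row.
above = s[s >= threshold]

# Monthly average of the values at or above the 90th percentile.
mean_above = above.groupby(above.index.to_period("M")).mean()

print(mean_above)
```

With the original variables this would amount to filtering `dataH[var1]` itself rather than indexing into `dataP`.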