Python Resample: How do I keep NaN as NaN?

Python Resample: How do I keep NaN as NaN? - JaneTan - Dec-07-2022

When resampling monthly data to quarterly, I want a quarter whose last value is NaN to remain NaN. How should I tweak my code?

Thank you

[Image: 4NQGL.png]

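import pandas as pd

# 'ND' cells are read as NaN; the first column becomes the index
df = pd.read_excel(input_file, sheet_name='Sheet1', usecols='A:D', na_values='ND', index_col=0, header=0)
df.index.names = ['Period']
df.index = pd.to_datetime(df.index)

q0 = pd.Series(df['HS6P1'], index=df.index)

# Quarterly sum; NaN values are skipped by default
m1 = q0.resample('Q').sum()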

Current Output
Period
1989-03-31 212.7
1989-06-30 302.1
1989-09-30 272.1
1989-12-31 163.9

Desired Output
Period
1989-03-31 212.7
1989-06-30 302.1
1989-09-30 272.1
1989-12-31 NaN

RE: Python Resample: How do I keep NaN as NaN? - deanhystad - Dec-07-2022

Try

m1 = q0.resample('Q').sum(skipna=False)

RE: Python Resample: How do I keep NaN as NaN? - JaneTan - Dec-08-2022

(Dec-07-2022, 01:50 PM)deanhystad Wrote: Try
m1 = q0.resample('Q').sum(skipna=False)

I get error:

Error:
UnsupportedFunctionCall: numpy operations are not valid with resample. Use .resample(...).sum() instead

Thanks in advance if you could help!

RE: Python Resample: How do I keep NaN as NaN? - deanhystad - Dec-08-2022

I think that answers your question. resample().sum() does not accept skipna, so it always ignores NaNs. Looks like you'll have to do most of the work yourself.

resample() is really just a special version of groupby(). The primary difference is that resample only groups by date/time. Calling series.resample() returns a DatetimeIndexResampler object which you can iterate over to access the groups. Each group is a pair: the timestamp of the bin and a series of the values that fall in it. That series can be summed, and when summing a series you can set skipna=False.

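import pandas as pd
from numpy import nan

# Six one-minute samples; float dtype so the last value can hold NaN.
# Newer pandas versions spell the 'T' (minute) alias as 'min'.
series = pd.Series(range(6), dtype=float, index=pd.date_range('1/1/2013', periods=6, freq='T'))
series.iloc[5] = nan

# Iterating a resampler yields (bin timestamp, sub-series) pairs
groups = series.resample('2T')
for timestamp, group in groups:
    print(group.sum(skipna=False))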

Output:
1.0
5.0
nan

Using this info it is easy to write a resampler that doesn't ignore NaNs.

import pandas as pd
import numpy as np

def no_skip_resampler(series, period):
    """Resample and sum, returning NaN for any bin that contains a NaN."""
    groups = series.resample(period)
    # Each item is a (bin timestamp, sub-series) pair
    sums = {timestamp: group.sum(skipna=False) for timestamp, group in groups}
    return pd.Series(sums)

# Float dtype so the last value can hold NaN
series = pd.Series(range(6), dtype=float, index=pd.date_range('1/1/2013', periods=6, freq='T'))
series.iloc[5] = np.nan

print("Resampled")
print(series.resample('2T').sum())
print("\n Reconstructed")
print(no_skip_resampler(series, '2T'))

Output:
Resampled
2013-01-01 00:00:00    1.0
2013-01-01 00:02:00    5.0
2013-01-01 00:04:00    4.0
Freq: 2T, dtype: float64

 Reconstructed
2013-01-01 00:00:00    1.0
2013-01-01 00:02:00    5.0
2013-01-01 00:04:00    NaN
dtype: float64
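
For comparison, here is a shorter sketch that should give the same result without a hand-written loop. It is not from the posts above and rests on two assumptions about pandas: that the resampler's apply() hands each bin to the function as a Series, and that the resampler's sum() accepts min_count. The names strict and counted are made up for the example, and the minute alias is spelled 'min' here.

```python
import numpy as np
import pandas as pd

# Same toy data as above: six one-minute samples, the last one missing.
series = pd.Series(
    [0.0, 1.0, 2.0, 3.0, 4.0, np.nan],
    index=pd.date_range("2013-01-01", periods=6, freq="min"),
)

resampled = series.resample("2min")

# Option 1: sum each bin as a plain Series, without skipping NaN.
strict = resampled.apply(lambda group: group.sum(skipna=False))

# Option 2: require at least 2 non-NaN values per bin, otherwise return NaN.
counted = resampled.sum(min_count=2)

print(strict)
print(counted)
```

The two options are not quite the same thing. Option 1 returns NaN only when a bin actually contains a NaN. Option 2 returns NaN whenever a bin has fewer valid values than min_count, which also covers bins where rows are missing entirely. For monthly data resampled to quarters, min_count=3 would be the matching setting.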