Python Forum

Function won't apply dynamically in timeseries
Hello,

I have the following function:
import sys

import numpy as np
import pandas as pd

def custom_hurst(timeseries):
    series = timeseries.iloc[-360:,0]
    max_window = len(series)
    min_window = 15
    
    ndarray_likes = [np.ndarray]
    if "pandas.core.series" in sys.modules.keys():
        ndarray_likes.append(pd.core.series.Series)
    
    if type(series) not in ndarray_likes:
        series = np.array(series)
    
    if "pandas.core.series" in sys.modules.keys() and type(series) == pd.core.series.Series:
        if series.isnull().values.any():
            raise ValueError("Series contains NaNs")
        series = series.values
    elif np.isnan(np.min(series)):
        raise ValueError("Series contains NaNs")
    
    def to_inc(x):
        incs = x[1:] - x[:-1]
        return incs
    
    def to_pct(x):
        pcts = x[1:] / x[:-1] - 1.
        return pcts
    
    def RS_func(series):
        incs = to_pct(series)
        mean_inc = np.sum(incs) / len(incs)
        deviations = incs - mean_inc
        Z = np.cumsum(deviations)
        R = max(Z) - min(Z)
        S = np.std(incs, ddof=1)
        return R / S
    
    err = np.geterr()
    np.seterr(all='raise')
    
    max_window = max_window or len(series)-1
    window_sizes = [15,30,45,90,180,360]
    
    RS = []
    
    for w in window_sizes:
        rs = []
        for start in range(0, len(series), w):
            if (start+w)>len(series):
                break
            _ = RS_func(series[start:start+w])
            if _ != 0:
                rs.append(_)
        RS.append(np.mean(rs))
    
    A = np.vstack([np.log10(window_sizes), np.ones(len(RS))]).T
    H, c = np.linalg.lstsq(A, np.log10(RS), rcond=-1)[0]
    np.seterr(**err)
    
    c = 10**c
       
    return H
But when I use it to produce a column of H values alongside the time series, the result is static: every row gets the same value.

df['Range'] = df.apply(lambda x: custom_hurst(df), axis=1)
Output:
                      Range
Date
1983-03-30  29.40  0.672943
1983-03-31  29.29  0.672943
1983-04-04  29.44  0.672943
1983-04-05  29.71  0.672943
1983-04-06  29.90  0.672943
...           ...       ...
2020-12-30  48.31  0.672943
2020-12-31  48.42  0.672943
2021-01-04  47.35  0.672943
2021-01-05  49.80  0.672943
2021-01-06  50.52  0.672943
How can I apply it dynamically, so that each row uses the 360 values preceding it in a rolling fashion instead of just the last 360 rows of the df? I've tried rolling functions, for loops, and changing the code, but nothing seems to work.

Any help would be appreciated
Thanks
Matt
The core problem results from the function parameter and the lambda.

df['Range'] = df.apply(lambda x: custom_hurst(df), axis=1)
The function takes the full dataframe as its argument, and its return value is computed from that dataframe. Because the lambda passes the same dataframe for every row, it produces the same output every time, which is the expected behavior. Nothing is done with 'x' (with axis=1, 'x' is the current row). To calculate a separate value for each row, you'll have to refactor the function so that it accepts 'x' as the argument and operates on that.
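Since a single row cannot carry the previous 360 values, one possible way to do that refactoring is to have the function accept a plain 1-D array and let pandas pass it each 360-value window through `rolling(...).apply(..., raw=True)`. The following is only a minimal sketch under some assumptions: the column name `Close`, the synthetic price data, and the helper names `rescaled_range` and `hurst_from_values` are made up for illustration, and the NaN checks and `np.seterr` handling from the original function are left out for brevity.

```python
import numpy as np
import pandas as pd

WINDOW = 360                               # look-back length from the question
WINDOW_SIZES = [15, 30, 45, 90, 180, 360]  # sub-window sizes from the original function


def rescaled_range(values):
    """R/S statistic of one chunk, computed on percentage changes."""
    pcts = values[1:] / values[:-1] - 1.0
    cumulative = np.cumsum(pcts - pcts.mean())
    spread = cumulative.max() - cumulative.min()
    return spread / np.std(pcts, ddof=1)


def hurst_from_values(values):
    """Estimate H from a 1-D array holding one full window of prices."""
    values = np.asarray(values, dtype=float)
    mean_rs = []
    for w in WINDOW_SIZES:
        # Non-overlapping chunks of length w, as in the original loop.
        starts = range(0, len(values) - w + 1, w)
        rs = [rescaled_range(values[s:s + w]) for s in starts]
        mean_rs.append(np.mean([r for r in rs if r != 0]))
    # Fit log10(R/S) = H * log10(w) + c and keep the slope H.
    A = np.vstack([np.log10(WINDOW_SIZES), np.ones(len(mean_rs))]).T
    H, c = np.linalg.lstsq(A, np.log10(mean_rs), rcond=None)[0]
    return H


# Hypothetical stand-in data: a synthetic price series in a column named "Close".
rng = np.random.default_rng(0)
dates = pd.bdate_range("2015-01-01", periods=500)
prices = 30 * np.exp(np.cumsum(rng.normal(0, 0.01, size=500)))
df = pd.DataFrame({"Close": prices}, index=dates)

# Each row receives H computed from the 360 values ending at that row.
# raw=True hands the function a NumPy array instead of a Series.
df["Hurst"] = df["Close"].rolling(WINDOW).apply(hurst_from_values, raw=True)
print(df.tail())
```

With the default `min_periods`, the first 359 rows come out as NaN because they do not yet have a full window behind them. The function is called once per row, so on a long series this can be slow.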