Identifying consecutive masked values in a 3D data array

chai0404 · Jan-12-2020, 10:55 PM

I have a large 3-dimensional (time, longitude, latitude) input array of daily tmax values. I have masked the values that exceed a certain threshold. I need to find the entries where the mask is True for at least a given number (3) of consecutive time steps. The result should be a data array with 0s for days outside such runs and, for days inside them, the length of the run (the duration of the event).

Below is some pseudo-code to make myself clearer:

events = find_consecutive(input_array, duration=3)

input_array = [1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1]

events = [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0, 0, 5, 5, 5, 5, 5, 0, 0, 0]

I've had a look at scipy.ndimage but haven't been able to completely figure out how to use it.

Any help is appreciated :)

**perfringo** · (This post was last modified: Jan-13-2020, 08:08 AM by perfringo.)

One way of doing it (not very elegant).

label returns a tuple of (labelled array, number of groups). The line labels, _ = label(arr) unpacks that tuple and discards the number of groups in _, because it is not needed here. The same result can be obtained without unpacking: labels = label(arr)[0]

The label and find_objects lines could be merged into one, slices = find_objects(label(arr)[0]), but then it is less explicit what is going on.

import numpy as np
from scipy.ndimage import label, find_objects


arr = np.array([1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1])
labels, _ = label(arr)          # number each run of consecutive non-zero values
slices = find_objects(labels)   # one tuple of slices per labelled run

for interval in slices:
    # overwrite each run with its length, or with 0 if it is shorter than 3
    if 3 <= arr[interval].size:
        arr[interval] = arr[interval].size
    else:
        arr[interval] = 0

arr will be:

Output:
[0 0 0 0 0 3 3 3 0 0 0 0 5 5 5 5 5 0 0 0]
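
The loop above overwrites arr in place. A minimal sketch of the same idea wrapped in a function follows. The name find_consecutive and the duration parameter are taken from the pseudo-code in the question; returning a new array instead of modifying the input is an assumption about what is wanted, and the sketch only covers the 1D case.

```python
import numpy as np
from scipy.ndimage import label, find_objects


def find_consecutive(mask, duration=3):
    """1D only: return the run length for every run of non-zero values
    lasting at least `duration` steps, and 0 everywhere else."""
    mask = np.asarray(mask)
    labels, _ = label(mask)                  # number each run of non-zero values
    events = np.zeros(mask.shape, dtype=int)
    for interval in find_objects(labels):    # one tuple of slices per run
        length = mask[interval].size
        if length >= duration:
            events[interval] = length
    return events


input_array = [1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1]
events = find_consecutive(input_array, duration=3)
print(events)
# [0 0 0 0 0 3 3 3 0 0 0 0 5 5 5 5 5 0 0 0]
```

Because the result is built in a separate array, the original mask stays available for other calculations.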

chai0404 · Jan-14-2020, 12:30 AM

Thanks for your help!
I created a function using the code you shared. However, I get the following error - TypeError: list indices must be integers or slices, not tuple

def consecutive(masked_array):
    labels, _ = label(masked_array)
    slices = find_objects(labels)
    
    for interval in slices:
        xr.where(masked_array[interval].size >= 3, masked_array[interval].size, 0)
        
    return masked_array

Do you know why this may be?

**perfringo** · (This post was last modified: Jan-14-2020, 09:10 AM by perfringo.)

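I don't consider myself a numpy person. However (I assume that xr.where behaves like np.where), np.where does not change the array it is given. Called with only a condition, it returns a tuple of index arrays; called with three arguments, it returns a new array of the selected values: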

>>> input_list = [1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1]
>>> arr = np.array(input_list)
>>> np.where(arr==0)
(array([ 1,  2,  3,  4,  8,  9, 11, 17, 18]),)
>>> arr
array([1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1])   # array unchanged
>>> np.where(arr[slice(5, 8, None)].size >= 3, 3, 0)
array(3)                                                              # the chosen value is returned, not assigned
>>> arr
array([1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1])   # array unchanged

You want to give each slice a new value based on its length, so the result has to be assigned back to the slice (as the loop in my earlier post does):

>>> for interval in slices: 
...     arr[interval] = np.where(3 <= arr[interval].size, arr[interval].size, 0) 
...
>>> arr
array([0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0, 0, 5, 5, 5, 5, 5, 0, 0, 0])

For better readability, the size can be assigned to a meaningful name:

for interval in slices: 
    interval_length = arr[interval].size 
    arr[interval] = np.where(3 <= interval_length, interval_length, 0)

Performance considerations aside, I feel that a 'pure' Python conditional expression is more understandable:

for interval in slices: 
    length = arr[interval].size 
    arr[interval] = length if 3 <= length else 0

With the Python 3.8 walrus operator it becomes even more concise:

for interval in slices:
    arr[interval] = length if 3 <= (length := arr[interval].size) else 0
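
As for the TypeError itself, one likely cause (an assumption, since the call that raised it is not shown) is that a plain Python list was passed to the function. find_objects returns a list of tuples of slice objects, and a list, unlike a NumPy array, cannot be indexed with a tuple. A short sketch with made-up sample data:

```python
import numpy as np
from scipy.ndimage import label, find_objects

data = [1, 0, 0, 0, 0, 1, 1, 1, 0, 0]    # plain list, not an ndarray
labels, _ = label(data)                   # label() converts the list internally
slices = find_objects(labels)
print(slices)
# [(slice(0, 1, None),), (slice(5, 8, None),)]

try:
    data[slices[1]]                       # indexing a list with a tuple fails
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

arr = np.asarray(data)                    # converting first avoids the error
run = arr[slices[1]]
print(run)
# [1 1 1]
```

Calling np.asarray on the input at the top of the function is a cheap way to accept lists and arrays alike.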

chai0404 · Jan-16-2020, 12:27 AM

Thanks for the explanations!

**perfringo** · Jan-16-2020, 07:03 AM

(Jan-16-2020, 12:27 AM)chai0404 Wrote: Thanks for the explanations!

You are welcome.

chai0404 · Jan-16-2020, 10:35 PM

How do I make this work for a 3D array such as the one given below?

input_array([[[1, 1, 0, 0, 0, 1, 1, 1, 1, 0],
[1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
[1, 0, 1, 1, 0, 1, 0, 0, 1, 0]],

[[0, 0, 0, 0, 0, 1, 0, 1, 0, 1],
[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 1, 0, 1, 1, 1, 1, 1]]])

output_array([[[1, 1, 0, 0, 0, 4, 4, 4, 4, 0],
[1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
[1, 0, 1, 1, 0, 1, 0, 0, 1, 0]],

[[0, 0, 0, 0, 0, 1, 0, 1, 0, 1],
[0, 3, 3, 3, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 1, 0, 5, 5, 5, 5, 5]]])

I tried this, but it doesn't seem to work -

def consec(temps):
    labels, _ = label(temps) # labels the occurrence of 1s and gives it an 'event' number 
    print (label(temps))
    slices = find_objects(labels) # find_objects - what does it do? 
    print(slices)
    new_temps = np.zeros(len(temps))
    
    for i in slices:
        if temps[i].size >= 3:
            new_temps[i] = new_temps[i].size
        else:
            new_temps[i] = 0
    return new_temps

input_array_3D=input_array
eventss_3D=np.zeros([len(input_array_3D),len(input_array_3D[0]),len(input_array_3D[0][0])]) # check numpy for a more concise 
for i in range(len(input_array_3D[0])):
    for j in range(len(input_array_3D[0][0])):    
        #### get the tmax time series at the pixel with (i, j) coordinates:
        input_array_1D=input_array_3D[:,i,j]
        ##### time series of events for the pixel at (i, j)
        eventss = consec(input_array_1D)
        #### gather all pixels together in one array
        eventss_3D[:,i,j]=eventss

**perfringo** · Jan-17-2020, 05:57 AM

Is the output example correct? Previously, runs shorter than 3 were set to 0, but here the output still contains single 1s and pairs of 1, 1.

chai0404 · Jan-19-2020, 10:05 PM

Sorry, you're right. It should be:

input_array([[[1, 1, 0, 0, 0, 1, 1, 1, 1, 0],
[1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
[1, 0, 1, 1, 0, 1, 0, 0, 1, 0]],

[[0, 0, 0, 0, 0, 1, 0, 1, 0, 1],
[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 1, 0, 1, 1, 1, 1, 1]]])

output_array([[[0, 0, 0, 0, 0, 4, 4, 4, 4, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]],

[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 3, 3, 3, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 5, 5, 5, 5, 5]]])
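
For the 3D case, one possible approach (a sketch, not tested on real tmax data) is to pass label a structure that connects neighbours along a single axis only. Every run along that axis then gets its own label, and the loop from the 1D answer carries over unchanged. The function name consecutive_along_axis and its axis parameter are made up for illustration. Note that the example arrays above have their runs along the last axis, whereas the nested loops in the earlier attempt take each series along the first axis; for (time, longitude, latitude) data the time axis would be axis=0.

```python
import numpy as np
from scipy.ndimage import label, find_objects


def consecutive_along_axis(mask, duration=3, axis=0):
    """Run lengths (>= duration) of non-zero values along one axis, 0 elsewhere."""
    mask = np.asarray(mask)
    # Structuring element that links each cell only to its two neighbours
    # along `axis`, so runs belonging to different pixels are never merged.
    structure = np.zeros((3,) * mask.ndim, dtype=int)
    centre = [1] * mask.ndim
    centre[axis] = slice(None)
    structure[tuple(centre)] = 1

    labels, _ = label(mask, structure=structure)
    events = np.zeros(mask.shape, dtype=int)
    for interval in find_objects(labels):
        length = mask[interval].size    # the bounding box of a run is the run itself
        if length >= duration:
            events[interval] = length
    return events


input_array = np.array([[[1, 1, 0, 0, 0, 1, 1, 1, 1, 0],
                         [1, 0, 1, 0, 0, 1, 0, 1, 1, 0],
                         [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]],

                        [[0, 0, 0, 0, 0, 1, 0, 1, 0, 1],
                         [0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                         [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]]])

# The runs in this example lie along the last axis.
output_array = consecutive_along_axis(input_array, duration=3, axis=-1)
print(output_array)
```

The Python loop runs once per labelled run, so on a large grid with many events it may be slow; whether that matters depends on the size of the data.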
