 Need help for finding cumulative values in a loop
#1
Greetings,

I would like to compute all annual cumulative 5-consecutive day maximum values in a 3D matrix using a for loop. To be more specific, the variable "Quantity" is a 3D matrix: the first dimension is time (in days), the second is latitude (64 lines of latitude), and the third is longitude (128 lines of longitude). The total number of days is 51,100, making up 140 years. Previously, I converted the day index into a year index (by integer-dividing by 365, shown in the code below) in order to simply derive the maximum value every year for every grid cell.
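
For reference, the one-day annual maximum described above could be sketched as follows. The array sizes and the random data are assumptions made only to keep the example small; the real Q is 51,100 x 64 x 128. The sketch also assumes every year has exactly 365 days, as in the post.

```python
import numpy as np

# Small synthetic stand-in for Q: 3 years of daily data on a 2 x 4 grid (assumed sizes).
rng = np.random.default_rng(0)
n_years, n_lat, n_lon = 3, 2, 4
quantity = rng.random((n_years * 365, n_lat, n_lon))

# Integer division maps each day index to its year index (no leap days assumed).
year = np.arange(quantity.shape[0]) // 365

# Loop version: one boolean mask per year, maximum taken along the time axis.
onedaymax = np.empty((n_years, n_lat, n_lon))
for y in range(n_years):
    onedaymax[y] = quantity[year == y].max(axis=0)

# Equivalent reshape version, valid only because every year has exactly 365 days.
onedaymax_reshaped = quantity.reshape(n_years, 365, n_lat, n_lon).max(axis=1)
```

Both versions should give the same (years, latitude, longitude) array; the reshape form simply avoids the Python-level loop.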

Ultimately, the goal is now to still similarly obtain maximum values at an annual time scale, as before (which the below code will do already), but now I want to try to specifically obtain the maximum cumulative values for every year for every grid cell based on 5 consecutive day periods. So, basically, 140 maximum values (because there are 140 years) for each grid cell, but the maximums would be based on 5-consecutive day periods, rather than a single day (as I did already as my first task).

The way I define the 5-consecutive days is as follows. The first set of 5 consecutive days is Day 0+Day 1+Day 2+Day 3+Day 4. For my particular case, what I was envisioning would be to begin by deriving the cumulative value over Day 0 to Day 4 (so, the first 5 days in year 0), and then move to the next group of 5 consecutive days, from Day 1 to Day 5 (i.e. Day 1+Day 2+Day 3+Day 4+Day 5), and then Day 2 to Day 6 (i.e. Day 2+Day 3+Day 4+Day 5+Day 6), and then Day 3 to Day 7, and then Day 4 to Day 8, and so on, all the way to the end of the 51,100 day period (140 years). In this way, there would be 51,096 5-consecutive day periods over the 140 years (51,100 - 5 + 1), and the existing np.max function (in the code below) would allow Python to return the maximum cumulative values for each year for each grid cell. In this case, would you use the np.sum function to derive cumulative values, and then have the previously generated code (see below) produce the annual maximum cumulative values?
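
One possible way to get every 5-consecutive-day total without an explicit loop is a cumulative sum, sketched below. The helper name rolling_sum and the tiny 8-day array are hypothetical, chosen only for illustration; this is an alternative to calling np.sum on each slice, not the code from the post.

```python
import numpy as np

def rolling_sum(data, window=5):
    """Sum of every run of `window` consecutive days along axis 0."""
    # Prepend a zero slab so that csum[k] is the total of days 0..k-1.
    zero = np.zeros((1,) + data.shape[1:], dtype=data.dtype)
    csum = np.concatenate([zero, np.cumsum(data, axis=0)], axis=0)
    # The window starting at day i covers days i..i+window-1.
    return csum[window:] - csum[:-window]

# Toy series: days 0..7 on a 1 x 1 grid, with the day number as the value.
toy = np.arange(8, dtype=float).reshape(8, 1, 1)
sums = rolling_sum(toy)
print(sums[:, 0, 0])  # [10. 15. 20. 25.]
```

An 8-day series yields 8 - 5 + 1 = 4 windows, so by the same count a 51,100-day series yields 51,096. With long floating-point records, a cumulative sum can accumulate rounding error, so it may be worth comparing a few windows against a direct np.sum.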

Finally, the "time-2" and "time+2" in the below code give a range of days relative to a specified day. For example, at Day 2, the range would be Day 0 to Day 4 (because 2-2=0 and 2+2=4, so Day 0 to Day 4, relative to Day 2). Using that approach, though, I cannot use Day 0 and Day 1, as well as the final two days in the period, since that would lead to a range that contains days that do not exist. For instance, if you subtract 2 from Day 0 or Day 1, you will end up with a negative day (-2 and -1, respectively). Likewise, for the final two days, time+2 points past the end of the record (Day 51,100 and Day 51,101, which don't exist, since the last day is Day 51,099). Thus, some kind of condition would need to be used to tell the loop to ignore those particular days (the very first two days and the very last two days), but I'm just uncertain how to specify the conditions in the for loop and wanted to know of a way to use them for this case?
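
A small sketch of how NumPy slicing treats those edge cases may help here. The 10-day array is a hypothetical stand-in for the time axis. Note in particular that the end of a slice is exclusive, so time-2:time+2 covers four days, not five, and that out-of-range slice bounds do not raise an error.

```python
import numpy as np

days = np.arange(10)  # hypothetical 10-day series standing in for the time axis

# The end of a slice is exclusive: time-2:time+2 spans only 4 days.
time = 2
four_days = days[time - 2:time + 2]   # days 0..3
five_days = days[time - 2:time + 3]   # days 0..4

# A negative start does not raise; it counts from the end, so this slice is empty.
wrapped = days[0 - 2:0 + 3]           # same as days[8:3]

# A stop past the end is silently clipped, giving a short window.
clipped = days[9 - 2:9 + 3]           # days 7..9 only
```

Because none of these cases raises an exception, the first two and last two days would quietly produce wrong sums rather than an error, which is why they need to be excluded explicitly.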

Quantity=Q
Quantity2=np.zeros(Quantity.shape)
Year=np.arange(Quantity.shape[0])//365
n_year = max(Year)+1
onedaymax=np.empty((n_year, )+Quantity.shape[1:3])
fivedaymax=np.empty((n_year, )+Quantity2.shape[1:3])
for time in range(Quantity.shape[0]):
                Quantity2[time,...]
                fiveconsecutivedays=np.sum(Quantity[time-2:time+2,...], axis=0) 
                        if time in Quantity.shape[0]>1 and <50,099
                                continue
                                        for y in range(n_year):
                                                onedaymax[y, ...]=np.max(Quantity[Year==y, ...], axis=0)
                                                fivedaymax[y, ...]=np.max(Quantity2[Year==y, ...], axis=0)
The problem is within the first for loop, and I'm not sure that what is specified within it is the right approach. The "if time" portion is intended to omit the first two days and last two days of the period, but I'm unsure if it is specified correctly for my purposes, or if it is even relevant. What would you modify there?

Many thanks, and I would GREATLY appreciate any feedback!
#2
Try this:
Quantity = Q
Quantity2 = np.zeros(Quantity.shape)
Year = np.arange(Quantity.shape[0]) // 365
n_year = max(Year) + 1
onedaymax = np.empty((n_year,) + Quantity.shape[1:3])
fivedaymax = np.empty((n_year,) + Quantity2.shape[1:3])
for time in range(Quantity.shape[0]):
    Quantity2[time, ...]
    fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
    if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
        continue
        for y in range(n_year):
            onedaymax[y, ...] = np.max(Quantity[Year == y, ...], axis=0)
            fivedaymax[y, ...] = np.max(Quantity2[Year == y, ...], axis=0)
#3
Thank you so much, Larz60!

I tried running that and received the following error:

Error:
TypeError                                 Traceback (most recent call last)
     91     Quantity2[time, ...]
     92     fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
---> 93     if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
     94         continue
     95         for y in range(n_year):

TypeError: argument of type 'int' is not iterable
Could it be that the first element of the matrix "Quantity" is expressed as integers (Days are expressed as integers)? If that is the case, could that part of the code still work, or is there another way around that error?

Edit: Also, does the above code, in its entirety, make sense in accordance with what I would like to do altogether? I am sure that there are multiple ways to do the same thing, but I just wanted to know if this is a good approach. :)
#4
Since you don't show enough code to know what Quantity.shape's format is, can't really help.
#5
Larz60+ is right that you don't show enough code, but to me time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099 does not make sense.
Quantity.shape[0] is an integer, because otherwise for time in range(Quantity.shape[0]): would raise an exception, so you cannot use time in Quantity.shape[0]: an integer is not something you can iterate over.
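
To illustrate the point, here is a minimal sketch in plain Python. The name n_days is an assumption standing in for Quantity.shape[0], and the bounds shown are one possible reading of "skip the first two and last two days".

```python
n_days = 51100  # stands in for Quantity.shape[0], which is a plain int

# `in` needs a container on its right-hand side, so testing against an int fails.
try:
    5 in n_days
except TypeError as exc:
    message = str(exc)  # the interpreter's explanation of the failure

# A membership test against a range object works, as does a chained comparison.
time = 5
in_range = time in range(2, n_days - 2)
in_bounds = 2 <= time < n_days - 2
```

Either of the last two forms expresses "time lies between Day 2 and the third-to-last day" without hard-coding the length of the record.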
#6
Hi Larz60 and buran,

Thank you for your responses, and I apologize for not formatting the error above with tags!

The variable "Quantity" is a 3D matrix with 3 dimensions, in the form [Days, latitude, longitude]. "Days" is an integer index, from Day 0 to Day 51,099 (so, 51,100 days in total). There are 64 lines of latitude and 128 lines of longitude.

So, yes, Days would be in the form of integers. If that is the case, would there be a way to reorganize that line where the error is occurring?

Thank you, once again.
#7
Apologies for the double message.

In addition to post #6, what I am really trying to do in that line is essentially tell Python to only consider Day 2 to Day 51,097 (the third day through the third-to-last day). Is there a way to specify that range of days?

Is the opening of that loop okay, as well?
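
One way the whole calculation could be arranged is sketched below, under several assumptions: the data are a small random stand-in for Q, each 5-day total is stored at its centre day, a window that straddles a year boundary is attributed to the year of its centre day, and days without a full window are marked NaN and skipped with np.nanmax. None of these choices comes from the thread; they are only one reasonable reading of the question.

```python
import numpy as np

# Synthetic stand-in for Q: 2 years of daily data on a 2 x 3 grid (assumed sizes).
rng = np.random.default_rng(1)
quantity = rng.random((2 * 365, 2, 3))
n_days = quantity.shape[0]

year = np.arange(n_days) // 365
n_year = int(year.max()) + 1

# NaN marks the first two and last two days, which have no full centred window.
quantity2 = np.full(quantity.shape, np.nan)

# Restricting the range replaces the `if ... continue` test entirely.
for time in range(2, n_days - 2):
    # time + 3 because the slice end is exclusive; this covers days time-2..time+2.
    quantity2[time] = quantity[time - 2:time + 3].sum(axis=0)

# The yearly loop runs once, after all 5-day sums exist, not inside the daily loop.
onedaymax = np.empty((n_year,) + quantity.shape[1:])
fivedaymax = np.empty((n_year,) + quantity.shape[1:])
for y in range(n_year):
    onedaymax[y] = quantity[year == y].max(axis=0)
    fivedaymax[y] = np.nanmax(quantity2[year == y], axis=0)
```

Deriving the loop bounds from quantity.shape[0] rather than typing a day number keeps the code correct if the length of the record changes.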

Thanks, again.
#8
Hi there,

I would like to compute all annual cumulative 5-consecutive day maximum values for a 3D matrix using a for loop. To be more specific, the variable "Quantity" is a 3D matrix with 3 dimensions, structured as follows:

[Days, Latitude, longitude]

The first dimension is an integer unit of time (in days), the second is latitude (64 lines of latitude), and the third is longitude (128 lines of longitude). The total number of days is 51,100, making up 140 years. Previously, I converted the day index into a year index (by integer-dividing by 365, shown in the code below) in order to simply derive the maximum value every year for every grid cell.

Ultimately, the goal is now to still similarly obtain maximum values at an annual time scale, as before (which the below code will do already), but now I want to try to specifically obtain the maximum cumulative values for every year for every grid cell based on 5 consecutive day periods. So, basically, 140 maximum values (because there are 140 years) for each grid cell, but the maximums would be based on 5-consecutive day periods, rather than a single day (as I did already as my first task).

The way I define the 5-consecutive days is as follows. The first set of 5 consecutive days is Day 0+Day 1+Day 2+Day 3+Day 4. For my particular case, what I was envisioning would be to begin by deriving the cumulative value over Day 0 to Day 4 (so, the first 5 days in year 0), and then move to the next group of 5 consecutive days, from Day 1 to Day 5 (i.e. Day 1+Day 2+Day 3+Day 4+Day 5), and then Day 2 to Day 6 (i.e. Day 2+Day 3+Day 4+Day 5+Day 6), and then Day 3 to Day 7, and then Day 4 to Day 8, and so on, all the way to the end of the 51,100 day period (140 years). In this way, there would be 51,096 5-consecutive day periods over the 140 years (51,100 - 5 + 1), and the existing np.max function (in the code below) would allow Python to return the maximum cumulative values for each year for each grid cell. In this case, would you use the np.sum function to derive cumulative values, and then have the previously generated code (see below) produce the annual maximum cumulative values?

Finally, the "time-2" and "time+2" in the below code give a range of days relative to a specified day. For example, at Day 2, the range would be Day 0 to Day 4 (because 2-2=0 and 2+2=4, so Day 0 to Day 4, relative to Day 2). Using that approach, though, I cannot use Day 0 and Day 1, as well as the final two days in the period, since that would lead to a range that contains days that do not exist. For instance, if you subtract 2 from Day 0 or Day 1, you will end up with a negative day (-2 and -1, respectively). Likewise, for the final two days, time+2 points past the end of the record (Day 51,100 and Day 51,101, which don't exist, since the last day is Day 51,099). Thus, some kind of condition would need to be used to tell the loop to ignore those particular days (the very first two days and the very last two days), but I'm just uncertain how to specify the conditions in the for loop and wanted to know of a way to use them for this case?

This is what I have tried already:

Quantity = Q
Quantity2 = np.zeros(Quantity.shape)
Year = np.arange(Quantity.shape[0]) // 365
n_year = max(Year) + 1
onedaymax = np.empty((n_year,) + Quantity.shape[1:3])
fivedaymax = np.empty((n_year,) + Quantity2.shape[1:3])
for time in range(Quantity.shape[0]):
    Quantity2[time, ...]
    fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
    if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
        continue
        for y in range(n_year):
            onedaymax[y, ...] = np.max(Quantity[Year == y, ...], axis=0)
            fivedaymax[y, ...] = np.max(Quantity2[Year == y, ...], axis=0)
But I end up with error:

Error:
TypeError                                 Traceback (most recent call last)
     91     Quantity2[time, ...]
     92     fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
---> 93     if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
     94         continue
     95         for y in range(n_year):

TypeError: argument of type 'int' is not iterable
I suspect that the error is due to the first dimension of the 3D matrix "Quantity" being in integers, from 0-51099 (i.e. from Day 0 to Day 51099, so a grand total of 51,100 days). If so, how would you reorganize that section in order for the error to disappear?

What I am really trying to do in that line is essentially tell Python to only consider Day 2 to Day 51,097 (the third day through the third-to-last day). Is there a way to specify that range of days?

Is the opening of that loop okay, as well?

Any assistance would be immensely appreciated!!!!
#9
Please, don't start new threads. Keep the discussion in the original thread.
To be honest, I have a problem understanding what you want to achieve. You don't show sample data or a full runnable snippet, and your long explanations at the beginning of each thread are unclear, at least to me (probably it's my problem). Sorry.
#10
(May-23-2018, 07:03 PM)Lightning1800 Wrote:
time in Quantity.shape[0] > 1

What do you think that's doing?

Print out Quantity.shape[0], so we can see what it is. But I don't see how time in True could ever make sense, tbh
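
As a side note on how that expression is read: Python treats `in` as a comparison operator and chains comparisons, so `a in b > c` is evaluated as `(a in b) and (b > c)`. The sketch below uses a hypothetical list in place of Quantity.shape[0], only so that `in` is legal and the chaining can be observed.

```python
shape0 = [1, 2, 3]   # hypothetical container, used only so that `in` is legal
time = 2

# Comparison operators chain: a in b > c means (a in b) and (b > c).
chained = time in shape0 > [0]
spelled_out = (time in shape0) and (shape0 > [0])

# Parenthesising the right-hand side instead gives `time in True`, which raises.
try:
    time in (5 > 1)
except TypeError:
    raised = True
```

Read this way, the original line fails at the very first step, `time in Quantity.shape[0]`, because the right-hand side is an int; a test such as `2 <= time < Quantity.shape[0] - 2` avoids `in` altogether.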