Python Forum
Need help for finding cumulative values in a loop
#1
Greetings,

I would like to compute all annual cumulative 5-consecutive day maximum values in a 3D matrix using a for loop. To be more specific, the variable "Quantity" is a 3D matrix whose dimensions are as follows: the first is time (in days), the second is latitude (64 lines of latitude), and the third is longitude (128 lines of longitude). The total number of days is 51,100, making up 140 years. Previously, I converted the day index into years (by integer-dividing by 365, shown in the code below) in order to simply derive the maximum value every year for every grid cell.

Ultimately, the goal is now to still similarly obtain maximum values at an annual time scale, as before (which the below code will do already), but now I want to try to specifically obtain the maximum cumulative values for every year for every grid cell based on 5 consecutive day periods. So, basically, 140 maximum values (because there are 140 years) for each grid cell, but the maximums would be based on 5-consecutive day periods, rather than a single day (as I did already as my first task).

The way I define the 5-consecutive days is as follows. For example, the first set of 5 consecutive days is as follows: Day 0+Day 1+Day 2+Day 3+Day 4. For my particular case, what I was envisioning would be to begin deriving the cumulative value over Day 0 to Day 4 (so, the first 5 days in year 0), and then move to the next group of 5 consecutive days, from Day 1 to Day 5 (i.e. Day 1+Day 2+Day 3+Day 4+Day 5), and then Day 2 to Day 6 (i.e. Day 2+Day 3+Day 4+Day 5+Day 6), and then Day 3 to Day 7, and then Day 4 to Day 8.....etc....all the way to the end of the 51,100 day period (140 years). In this way, there would be about 50,000-51,000 5-consecutive day periods over the 140 years, but the existing np.max function (in the code below) would allow Python to return the maximum cumulative values for each year for each grid cell. In this case, would you use the np.sum function to derive cumulative values, and then have the previously generated code (see below) to produce the annual maximum cumulative values?

Finally, the "time-2" and "time+2" in the below code give a range of days relative to a specified day. For example, at Day 2, the range would be Day 0 to Day 4 (because 2-2=0 and 2+2=4, so Day 0 to Day 4, relative to Day 2). Using that approach, though, I cannot use Day 0 and Day 1, as well as the final two days in the period, since that would lead to a range that contains days that do not exist. For instance, if you subtract 2 from Day 0 or Day 1, you will end up with a negative day (-2 and -1, respectively). Likewise, for the final two days (Day 51,098 and Day 51,099), time+2 gives days beyond the end of the record (Day 51,100 and Day 51,101, which don't exist). Thus, some kind of condition would need to be used to tell the loop to ignore those particular days (the very first two days and the very last two days), but I'm uncertain how to specify the conditions in the for loop and wanted to know of a way to use them for this case.

Quantity=Q
Quantity2=np.zeros(Quantity.shape)
Year=np.arange(Quantity.shape[0])//365
n_year = max(Year)+1
onedaymax=np.empty((n_year, )+Quantity.shape[1:3])
fivedaymax=np.empty((n_year, )+Quantity2.shape[1:3])
for time in range(Quantity.shape[0]):
                Quantity2[time,...]
                fiveconsecutivedays=np.sum(Quantity[time-2:time+2,...], axis=0) 
                        if time in Quantity.shape[0]>1 and <50,099
                                continue
                                        for y in range(n_year):
                                                onedaymax[y, ...]=np.max(Quantity[Year==y, ...], axis=0)
                                                fivedaymax[y, ...]=np.max(Quantity2[Year==y, ...], axis=0)
The problem is within the first for loop, and I'm not sure that what is specified within it is the right approach. The "if time" portion is intended to omit the first two days and last two days of the period, but I'm unsure if it is specified correctly for my purposes, or if it is even relevant. What would you modify there?

Many thanks, and I would GREATLY appreciate any feedback!
Reply
#2
Try this:
Quantity = Q
Quantity2 = np.zeros(Quantity.shape)
Year = np.arange(Quantity.shape[0]) // 365
n_year = max(Year) + 1
onedaymax = np.empty((n_year,) + Quantity.shape[1:3])
fivedaymax = np.empty((n_year,) + Quantity2.shape[1:3])
for time in range(Quantity.shape[0]):
    Quantity2[time, ...]
    fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
    if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
        continue
        for y in range(n_year):
            onedaymax[y, ...] = np.max(Quantity[Year == y, ...], axis=0)
            fivedaymax[y, ...] = np.max(Quantity2[Year == y, ...], axis=0)
Reply
#3
Thank you so much, Larz60!

I tried running that and received the following error:

Error:
TypeError                                 Traceback (most recent call last)
     91     Quantity2[time, ...]
     92     fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
---> 93     if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
     94         continue
     95         for y in range(n_year):

TypeError: argument of type 'int' is not iterable
Could it be that the first element of the matrix "Quantity" is expressed as integers (Days are expressed as integers)? If that is the case, could that part of the code still work, or is there another way around that error?

Edit: Also, does the above code, in its entirety, make sense in accordance with what I would like to do altogether? I am sure that there are multiple ways to do the same thing, but I just wanted to know if this is a good approach. :)
Reply
#4
Since you don't show enough code to know what Quantity.shape's format is, can't really help.
Reply
#5
Larz60+ is right that you don't show enough code, but to me time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099 does not make sense.
Quantity.shape[0] is an integer, because otherwise for time in range(Quantity.shape[0]): would raise an exception, so you cannot test time in Quantity.shape[0].
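To make the point above concrete, here is a minimal sketch. The name n_days and its value of 20 are assumptions standing in for Quantity.shape[0], not the poster's data. Python chains `time in n_days > 1` as `(time in n_days) and (n_days > 1)`, so the membership test runs against a plain int and raises the TypeError. An ordinary chained comparison is one way to express the bounds the poster seems to intend.

```python
n_days = 20  # hypothetical stand-in for Quantity.shape[0]
time = 5

# `in` needs a container on its right-hand side, so testing
# membership in an int raises TypeError.
try:
    time in n_days > 1
    raised = False
except TypeError:
    raised = True

# A window covering time - 2 .. time + 2 fits inside the data
# only when 2 <= time < n_days - 2.
valid_days = [t for t in range(n_days) if 2 <= t < n_days - 2]
```

With this condition the loop could equally be written as `for time in range(2, n_days - 2):`, which skips the first two and last two days without any `if`.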
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#6
Hi Larz60 and buran,

Thank you for your responses, and I apologize for not formatting the error above with tags!

The variable "Quantity" is a 3D matrix with three dimensions, in the form [Days, latitude, longitude]. "Days" is in the form of integers, from Day 0 to Day 51,099, or 0-51,099 (so, 51,100 days in total). There are 64 lines of latitude and 128 lines of longitude.

So, yes, Days would be in the form of integers. If that is the case, would there be a way to reorganize that line where the error is occurring?

Thank you, once again.
Reply
#7
Apologies for the double message.

In addition to post #6, what I am really trying to do in that line is essentially tell Python to only consider Day 2 to Day 51,097 (that is, to skip the first two and last two days). Is there a way to specify that range of days?

Is the opening of that loop okay, as well?

Thanks, again.
Reply
#8
Hi there,

I would like to compute all annual cumulative 5-consecutive day maximum values for a 3D matrix using a for loop. To be more specific, the variable "Quantity" is a 3D matrix with three dimensions, structured as follows:

[Days, Latitude, longitude]

The first dimension is an integer unit of time (in days), the second is latitude (64 lines of latitude), and the third is longitude (128 lines of longitude). The total number of days is 51,100, making up 140 years. Previously, I converted the day index into years (by integer-dividing by 365, shown in the code below) in order to simply derive the maximum value every year for every grid cell.

Ultimately, the goal is now to still similarly obtain maximum values at an annual time scale, as before (which the below code will do already), but now I want to try to specifically obtain the maximum cumulative values for every year for every grid cell based on 5 consecutive day periods. So, basically, 140 maximum values (because there are 140 years) for each grid cell, but the maximums would be based on 5-consecutive day periods, rather than a single day (as I did already as my first task).

The way I define the 5-consecutive days is as follows. For example, the first set of 5 consecutive days is as follows: Day 0+Day 1+Day 2+Day 3+Day 4. For my particular case, what I was envisioning would be to begin deriving the cumulative value over Day 0 to Day 4 (so, the first 5 days in year 0), and then move to the next group of 5 consecutive days, from Day 1 to Day 5 (i.e. Day 1+Day 2+Day 3+Day 4+Day 5), and then Day 2 to Day 6 (i.e. Day 2+Day 3+Day 4+Day 5+Day 6), and then Day 3 to Day 7, and then Day 4 to Day 8.....etc....all the way to the end of the 51,100 day period (140 years). In this way, there would be about 50,000-51,000 5-consecutive day periods over the 140 years, but the existing np.max function (in the code below) would allow Python to return the maximum cumulative values for each year for each grid cell. In this case, would you use the np.sum function to derive cumulative values, and then have the previously generated code (see below) to produce the annual maximum cumulative values?

Finally, the "time-2" and "time+2" in the below code give a range of days relative to a specified day. For example, at Day 2, the range would be Day 0 to Day 4 (because 2-2=0 and 2+2=4, so Day 0 to Day 4, relative to Day 2). Using that approach, though, I cannot use Day 0 and Day 1, as well as the final two days in the period, since that would lead to a range that contains days that do not exist. For instance, if you subtract 2 from Day 0 or Day 1, you will end up with a negative day (-2 and -1, respectively). Likewise, for the final two days (Day 51,098 and Day 51,099), time+2 gives days beyond the end of the record (Day 51,100 and Day 51,101, which don't exist). Thus, some kind of condition would need to be used to tell the loop to ignore those particular days (the very first two days and the very last two days), but I'm uncertain how to specify the conditions in the for loop and wanted to know of a way to use them for this case.

This is what I have tried already:

Quantity = Q
Quantity2 = np.zeros(Quantity.shape)
Year = np.arange(Quantity.shape[0]) // 365
n_year = max(Year) + 1
onedaymax = np.empty((n_year,) + Quantity.shape[1:3])
fivedaymax = np.empty((n_year,) + Quantity2.shape[1:3])
for time in range(Quantity.shape[0]):
    Quantity2[time, ...]
    fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
    if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
        continue
        for y in range(n_year):
            onedaymax[y, ...] = np.max(Quantity[Year == y, ...], axis=0)
            fivedaymax[y, ...] = np.max(Quantity2[Year == y, ...], axis=0)
But I end up with error:

Error:
TypeError                                 Traceback (most recent call last)
     91     Quantity2[time, ...]
     92     fiveconsecutivedays = np.sum(Quantity[time - 2:time + 2, ...], axis=0)
---> 93     if time in Quantity.shape[0] > 1 and time in Quantity.shape[0] < 50099:
     94         continue
     95         for y in range(n_year):

TypeError: argument of type 'int' is not iterable
I suspect that the error is due to the first dimension of the 3D matrix "Quantity" being in integers, from 0-51099 (i.e. from Day 0 to Day 51099, so a grand total of 51,100 days). If so, how would you reorganize that section in order for the error to disappear?

What I am really trying to do in that line is essentially tell Python to only consider Day 2 to Day 51097 (that is, to skip the first two and last two days). Is there a way to specify that range of days?

Is the opening of that loop okay, as well?

Any assistance would be immensely appreciated!!!!
Reply
#9
Please don't start new threads; keep the discussion in the original thread.
To be honest, I have a problem understanding what you want to achieve. You don't show sample data or a full runnable snippet, and your long explanations at the beginning of each thread are unclear, at least to me (probably it's my problem). Sorry.
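For readers following along, a small runnable snippet of the kind requested here might look like the sketch below. It uses synthetic data (2 years of 365 days on a 2 x 3 grid, all hypothetical) rather than the poster's array, and it assumes the intended window is the five days from time-2 through time+2. Note that a slice's stop index is exclusive, so that window is `time - 2:time + 3`; the `time - 2:time + 2` slice in the posted code covers only four days.

```python
import numpy as np

# Synthetic stand-in for the poster's data: 2 years x 365 days, 2 x 3 grid.
rng = np.random.default_rng(0)
quantity = rng.random((2 * 365, 2, 3))

n_days = quantity.shape[0]
year = np.arange(n_days) // 365
n_year = year.max() + 1

# Five-day totals, stored at the centre day of each window.
# Edge days without a full window stay at -inf so they never win the max.
five_day_sum = np.full(quantity.shape, -np.inf)
for time in range(2, n_days - 2):
    # Stop index is exclusive: this slice is days time - 2 .. time + 2.
    five_day_sum[time] = quantity[time - 2:time + 3].sum(axis=0)

one_day_max = np.empty((n_year,) + quantity.shape[1:])
five_day_max = np.empty((n_year,) + quantity.shape[1:])
for y in range(n_year):
    one_day_max[y] = quantity[year == y].max(axis=0)
    five_day_max[y] = five_day_sum[year == y].max(axis=0)
```

In this sketch each window is assigned to the year of its centre day, so a window that straddles a year boundary counts toward whichever year holds its middle day. Whether that matches the poster's intent is an open question.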

Reply
#10
(May-23-2018, 07:03 PM)Lightning1800 Wrote:
time in Quantity.shape[0] > 1

What do you think that's doing?

Print out Quantity.shape[0], so we can see what it is. But I don't see how time in True could ever make sense, tbh
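Once the condition is sorted out, the explicit day loop can also be avoided. As a sketch only (this was not proposed in the thread), the five-day totals can be taken from a cumulative sum along the time axis. The helper name rolling_sum and the synthetic array are assumptions for illustration, and windows are again labelled by their centre day.

```python
import numpy as np

def rolling_sum(values, window=5):
    """Sum of every run of `window` consecutive entries along axis 0."""
    csum = np.cumsum(values, axis=0)
    # Prepend a zero slice so that csum[i] is the total of values[:i].
    zero = np.zeros((1,) + values.shape[1:], dtype=csum.dtype)
    csum = np.concatenate([zero, csum], axis=0)
    return csum[window:] - csum[:-window]

rng = np.random.default_rng(1)
quantity = rng.random((2 * 365, 2, 3))  # synthetic stand-in data

sums = rolling_sum(quantity)  # one entry per full 5-day window
# sums[i] covers days i .. i + 4; label each window by its centre day i + 2.
centre_year = (np.arange(sums.shape[0]) + 2) // 365
five_day_max = np.stack(
    [sums[centre_year == y].max(axis=0) for y in range(centre_year.max() + 1)]
)
```

Because only full windows are produced, the first two and last two days never appear as window centres, so no edge condition is needed. Differences of a long cumulative sum can pick up small floating-point rounding, which may or may not matter for the poster's data.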
Reply

