Python Forum
Get max values based on unique values in another list - python
#1
In a 2D numpy.ndarray, I want to find, for each repeated value in the first column, the maximum of the corresponding values in the second column. For example, if the array is this:

import numpy as np

sys_func = np.array([[126.,   4],
                     [126.,  11],
                     [126.,   2],
                     [126.,  12],
                     [126.,  23],
                     [126.,   1],
                     [129.,  11],
                     [129.,  45],
                     [129.,   3],
                     [129., 125],
                     [129.,  54],
                     [129.,   1],
                     [129.,   1],
                     [129.,  53],
                     [132.,  41],
                     [132.,   1],
                     [132.,   2],
                     [142.,   6],
                     [142.,  76]])

unique_days = [int(x) for x in np.unique(sys_func[:,0])]
I want to get this:

[126 23;
129 125;
132 41;
142 76]
I have tried the following:

max_sr = []
for i in range(len(unique_days)):
    s = [max(sys_func[:,1]) for x in np.where(sys_func[:,0] == unique_days[i])]
    max_sr.append(s)
and it's obviously giving me the wrong answer! Any ideas how to fix this?
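For reference, the attempt above takes max(sys_func[:,1]) over the whole second column and never uses the indices returned by np.where, so every group reports the global maximum. A minimal sketch of the same loop with the indices actually applied (the name idx and the compact way of building the array are just for illustration):

```python
import numpy as np

# Same data as in the question, built compactly for the example.
days = [126] * 6 + [129] * 8 + [132] * 3 + [142] * 2
vals = [4, 11, 2, 12, 23, 1, 11, 45, 3, 125, 54, 1, 1, 53, 41, 1, 2, 6, 76]
sys_func = np.column_stack((days, vals)).astype(float)

unique_days = [int(x) for x in np.unique(sys_func[:, 0])]

max_sr = []
for day in unique_days:
    # np.where returns a tuple with one index array for a 1D condition
    (idx,) = np.where(sys_func[:, 0] == day)
    # restrict the second column to this day's rows before taking the max
    max_sr.append([day, int(sys_func[idx, 1].max())])

print(max_sr)
```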
#2
There may be a more clever numpy way to get here but this seems to be about what you want:
The key is this line:
valid = x[x[:,0] == unique]
Here we use boolean-mask indexing (a form of advanced indexing) to pull out only the rows whose first column equals the particular unique value.
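The full code from this post is not shown above, so the following is only a sketch of how a loop built around that line might look. The names x, unique and valid come from the quoted line; results and the rest of the loop are assumptions.

```python
import numpy as np

# Same data as in the question, built compactly for the example.
days = [126] * 6 + [129] * 8 + [132] * 3 + [142] * 2
vals = [4, 11, 2, 12, 23, 1, 11, 45, 3, 125, 54, 1, 1, 53, 41, 1, 2, 6, 76]
x = np.column_stack((days, vals)).astype(float)

results = []
for unique in np.unique(x[:, 0]):
    # boolean mask keeps only the rows belonging to this group
    valid = x[x[:, 0] == unique]
    results.append([unique, valid[:, 1].max()])

results = np.array(results)
print(results)
```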
#3
You can obtain it in one line:
>>> m = np.array([[126.        ,   4],...)
>>> list((x, max(m[m[:,0]==x, 1])) for x in np.unique(m[:, 0]))
[(126.0, 23.0), (129.0, 125.0), (132.0, 41.0), (142.0, 76.0)]
# Or as a numpy array
>>> np.array(list((x, max(m[m[:,0]==x, 1])) for x in np.unique(m[:, 0])))
array([[126.,  23.],
       [129., 125.],
       [132.,  41.],
       [142.,  76.]])
But it might look too much like black magic with that many nested parentheses and brackets... In real code I would definitely add some comments explaining what I want to obtain.
#4
Of course you can crunch it into a list comp, but it is still the same thing.
r=[list(x[x[:,0]==u].max(0))for u in set(x[:,0])]
Written more sanely (and keeping as a numpy array) though it is more like:
results = np.array([x[x[:,0] == unique].max(0) for unique in set(x[:,0])], dtype=int)
which ends up being a little arcane and long for my taste.
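One caveat with the versions above: a Python set has no guaranteed ordering, so the rows may not come out sorted by the first column. A sketch of the same comprehension using np.unique, which returns the distinct values in sorted order (x is assumed to hold the array from the question):

```python
import numpy as np

# Same data as in the question, built compactly for the example.
days = [126] * 6 + [129] * 8 + [132] * 3 + [142] * 2
vals = [4, 11, 2, 12, 23, 1, 11, 45, 3, 125, 54, 1, 1, 53, 41, 1, 2, 6, 76]
x = np.column_stack((days, vals)).astype(float)

# np.unique yields the group keys in ascending order, unlike set()
results = np.array(
    [x[x[:, 0] == unique].max(axis=0) for unique in np.unique(x[:, 0])],
    dtype=int,
)
print(results)
```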
#5
Maybe this is less numpy-way, but it worked for me

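import itertools
from operator import itemgetter
import numpy as np

# groupby only merges consecutive rows, so sys_func must already be sorted by column 0
np.array([np.array(list(grp)).max(0) 
          for _, grp in itertools.groupby(sys_func, key=itemgetter(0))])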
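itertools.groupby starts a new group every time the key changes, so this works on the array from the question only because its rows are already ordered by the first column. If the rows could arrive in any order, one option is to sort them first. A sketch with made-up unsorted rows (the names unsorted, order and sorted_rows are illustrative):

```python
import itertools
from operator import itemgetter

import numpy as np

# Illustrative data: same kind of rows as in the question, deliberately shuffled.
unsorted = np.array([[129., 45], [126., 23], [129., 125],
                     [126., 4], [142., 76], [132., 41]])

# Sort rows by the first column so equal keys become consecutive.
order = np.argsort(unsorted[:, 0], kind="stable")
sorted_rows = unsorted[order]

result = np.array([np.array(list(grp)).max(0)
                   for _, grp in itertools.groupby(sorted_rows, key=itemgetter(0))])
print(result)
```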
Test everything in a Python shell (IPython, Azure Notebook, etc.)
  • Someone gave you advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
#6
Hmm, I was looking for something like that and didn't find what I need.
I'm surprised there isn't a sort of findgroups in numpy itself (and there may well be), but I didn't have any luck finding it.
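For what it's worth, a grouped maximum can be pieced together from plain NumPy building blocks, even without a dedicated group-by function. The sketch below is one possible approach, not something posted in this thread: np.unique with return_inverse=True labels each row with its group index, and np.maximum.at accumulates the per-group maximum. The names keys, inverse and group_max are illustrative.

```python
import numpy as np

# Same data as in the question, built compactly for the example.
days = [126] * 6 + [129] * 8 + [132] * 3 + [142] * 2
vals = [4, 11, 2, 12, 23, 1, 11, 45, 3, 125, 54, 1, 1, 53, 41, 1, 2, 6, 76]
sys_func = np.column_stack((days, vals)).astype(float)

# keys: sorted distinct days; inverse: for each row, the index of its day in keys
keys, inverse = np.unique(sys_func[:, 0], return_inverse=True)

# Start every group at -inf, then take an unbuffered element-wise maximum per group.
group_max = np.full(keys.shape, -np.inf)
np.maximum.at(group_max, inverse, sys_func[:, 1])

result = np.column_stack((keys, group_max))
print(result)
```

This avoids the Python-level loop over groups, which may matter for arrays with many distinct keys.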
#7
(Jun-12-2018, 09:32 AM)Mekire Wrote: Hmm, I was looking for something like that and didn't find what I need.
I'm surprised there isn't a sort of findgroups in numpy itself (and there may well be), but I didn't have any luck finding it.

There's pandas.DataFrame.groupby - so that will work too

import pandas as pd
df = pd.DataFrame(sys_func)
np.array([g.max() for _, g in df.groupby(df[0])])
Closer to home, but still not a pure-numpy solution
#8
(Jun-12-2018, 10:00 AM)volcano63 Wrote:

There's pandas.DataFrame.groupby - so that will work too


I actually love this one! Thanks y'all!
#9
One last note here as I have been experimenting with pandas since Volcano pointed us in that direction.
https://pandas.pydata.org/pandas-docs/st...ggregation

It is designed such that you don't even need the loop to apply functions to the dataframe:
df = pd.DataFrame(x)
print(df.groupby(df[0]).agg(max))
Output:
           1
0
126.0   23.0
129.0  125.0
132.0   41.0
142.0   76.0
In fact you can apply multiple functions in one operation and it even names the columns for you automatically:
df = pd.DataFrame(x)
print(df.groupby(df[0]).agg([max, min, sum, np.mean]))
Output:
           1
         max  min    sum       mean
0
126.0   23.0  1.0   53.0   8.833333
129.0  125.0  1.0  293.0  36.625000
132.0   41.0  1.0   44.0  14.666667
142.0   76.0  6.0   82.0  41.000000
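A variation on the same idea, as a sketch: naming the columns and passing the aggregations as strings gives a flat, labelled result, and recent pandas versions may emit warnings when builtins such as max or np.mean are passed to agg. The column names "day" and "value" are made up for this example.

```python
import numpy as np
import pandas as pd

# Same data as in the question, built compactly for the example.
days = [126] * 6 + [129] * 8 + [132] * 3 + [142] * 2
vals = [4, 11, 2, 12, 23, 1, 11, 45, 3, 125, 54, 1, 1, 53, 41, 1, 2, 6, 76]
sys_func = np.column_stack((days, vals)).astype(float)

# "day" and "value" are illustrative column names, not from the original post.
df = pd.DataFrame(sys_func, columns=["day", "value"])

# One row per day, one column per aggregation.
summary = df.groupby("day")["value"].agg(["max", "min", "sum", "mean"])
print(summary)
```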