Python Forum
Pandas question
#1
Hi

I have a dataframe that looks like this:

Output:
      1     2     4  5  3
1  0.25     0  0.75  0  0
2     0  0.75     0  0  0
4  0.75     0  0.25  0  0
5     0     0     0  1  0
3     0     0     0  0  1
Now I want to know, for each index, which column contains the highest score and what that score is.

    def matchResult():
        match = df.max(axis=1) # shows the highest score
        match1 = df.idxmax(axis=1) # shows the column containing the highest score
        print(match)
        print(match1)
Output:
1    0.75
2    0.75
4    0.75
5    1.00
3    1.00
dtype: float64
1    4
2    2
4    1
5    5
3    3
dtype: object
Does anybody know how I can combine them, so I get one output looking like:

index - column - score
1 4 0.75
2 2 0.75
3 3 1
4 1 0.75
5 5 1

thanks!
#2
In case somebody is interested, the following is the answer to my question above.

    match = df.max(axis=1).to_frame()  # highest score in each row
    match1 = df.idxmax(axis=1).to_frame()  # column holding the highest score
    result = pd.concat([match1, match], axis=1)  # combines both side by side
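
For anyone who wants to run this end to end, here is a minimal sketch of the same idea. It rebuilds the frame from post #1 by hand and assembles the result from a dict instead of `pd.concat`, because both `to_frame()` calls above produce a column labelled `0`. The column names `column` and `score` and the final `sort_index()` are my own choices for illustration, not part of the original answer.

```python
import pandas as pd

# Rebuild the frame from post #1; row and column labels are 1, 2, 4, 5, 3.
labels = [1, 2, 4, 5, 3]
df = pd.DataFrame(
    [
        [0.25, 0, 0.75, 0, 0],
        [0, 0.75, 0, 0, 0],
        [0.75, 0, 0.25, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
    ],
    index=labels,
    columns=labels,
)

# One row per index: the column holding the row maximum, and the maximum itself.
# "column" and "score" are illustrative names.
result = pd.DataFrame(
    {
        "column": df.idxmax(axis=1),
        "score": df.max(axis=1),
    }
).sort_index()  # sort so the rows appear as 1, 2, 3, 4, 5

print(result)
```

Note that `idxmax` reports only the first column when a row has several equal maxima.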
New question, 

Does anybody know how to return the max value as stated above, with a minimum value condition? (e.g. exclude zeros or values below a certain amount?)
#3
It's much better if you post code that can be run,
especially when it comes to pandas and similar libraries that many of us use only sporadically.
Here is how it could be done.
import pandas as pd
from io import StringIO

data = StringIO('''\
1,2,4,5,3
0.25,0,0.75,0,0
0,0.75,0,0,0
0.75,0,0.25,0,0
0,0,0,1,0
0,0,0,0,1''')

df = pd.read_csv(data, sep=",")
print(df)
print('------------------')
# Smallest value in each row that is greater than 0.01
print(df[df > .01].min(axis=1))
Output:
G:\Anaconda3
λ python pd_test.py
      1     2     4  5  3
0  0.25  0.00  0.75  0  0
1  0.00  0.75  0.00  0  0
2  0.75  0.00  0.25  0  0
3  0.00  0.00  0.00  1  0
4  0.00  0.00  0.00  0  1
------------------
0    0.25
1    0.75
2    0.25
3    1.00
4    1.00
dtype: float64
#4
(Dec-04-2017, 04:04 PM)snippsat Wrote: It's much better if you post code that can be run,
especially when it comes to pandas and similar libraries that many of us use only sporadically.
Here is how it could be done.

Good point, I'll take it into account. Thank you for the help, much appreciated! I only added .dropna(), since I got NaN values when the requirements were not met.
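
To illustrate where the NaN values come from, here is a small sketch using the data from post #1. The cut-off of 0.8 is a hypothetical value picked so that some rows have no qualifying score, and it uses `.max` because the question asked for the highest score. Indexing with a boolean mask turns every value that fails the condition into NaN, so a row where nothing passes ends up as NaN after `max`, and `.dropna()` removes it.

```python
import pandas as pd

# Same data as in post #1; row and column labels are 1, 2, 4, 5, 3.
labels = [1, 2, 4, 5, 3]
df = pd.DataFrame(
    [
        [0.25, 0, 0.75, 0, 0],
        [0, 0.75, 0, 0, 0],
        [0.75, 0, 0.25, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
    ],
    index=labels,
    columns=labels,
)

threshold = 0.8  # hypothetical cut-off, chosen only for this example

# Values at or below the threshold become NaN.
masked = df[df > threshold]

# Row-wise max skips NaN; a row with no qualifying value yields NaN.
best = masked.max(axis=1)

# Keep only the rows where at least one value passed the threshold.
best = best.dropna()

# Look up the matching column only for the rows that survived.
best_column = masked.loc[best.index].idxmax(axis=1)

print(best)
print(best_column)
```

Calling `idxmax` only on the surviving rows avoids asking pandas for the position of a maximum in a row that is entirely NaN.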