Python Forum
Pandas question
#1
Hi

I have a dataframe that looks like this:

Output:
      1     2     4  5  3
1  0.25     0  0.75  0  0
2     0  0.75     0  0  0
4  0.75     0  0.25  0  0
5     0     0     0  1  0
3     0     0     0  0  1
Now I want to know, for each index, which column contains the highest score, and what that score is.

    def matchResult():
        match = df.max(axis=1) # shows the highest score
        match1 = df.idxmax(axis=1) # shows the column containing the highest score
        print(match)
        print(match1)
Output:
1    0.75
2    0.75
4    0.75
5    1.00
3    1.00
dtype: float64
1    4
2    2
4    1
5    5
3    3
dtype: object
Does anybody know how I can combine them, so I get one output looking like:

index - column - score
1 4 0.75
2 2 0.75
3 3 1
4 1 0.75
5 5 1

thanks!
#2
In case somebody is interested, the following is the answer to my question above.

    import pandas as pd

    match = df.max(axis=1).to_frame() # shows the highest score
    match1 = df.idxmax(axis=1).to_frame() # shows the column of the highest score
    result = pd.concat([match1, match], axis=1) # combines both
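A variation on the same idea, in case it helps later readers: building the frame from a dict gives the two columns readable names, and sort_index() puts the rows in the order shown in the first post. This is only a sketch; the frame is rebuilt by hand from the printed output in post #1, and the names "column" and "score" are taken from the desired output there, not from any pandas default.

```python
import pandas as pd

# Scores retyped from the output in post #1; rows and columns are labelled 1, 2, 4, 5, 3.
labels = [1, 2, 4, 5, 3]
df = pd.DataFrame(
    [
        [0.25, 0.00, 0.75, 0.0, 0.0],
        [0.00, 0.75, 0.00, 0.0, 0.0],
        [0.75, 0.00, 0.25, 0.0, 0.0],
        [0.00, 0.00, 0.00, 1.0, 0.0],
        [0.00, 0.00, 0.00, 0.0, 1.0],
    ],
    index=labels,
    columns=labels,
)

# One row per index: the label of the best column and the score found there.
result = pd.DataFrame({
    "column": df.idxmax(axis=1),
    "score": df.max(axis=1),
}).sort_index()

print(result)
```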
New question:

Does anybody know how to return the max value as described above, but only among values that meet a minimum condition (e.g. excluding zeros or values below a certain amount)?
#3
It's much better if you post code that can be run,
especially when it comes to pandas and similar libraries that many of us use only sporadically.
Here is how it could be done.
import pandas as pd
from io import StringIO

data = StringIO('''\
1,2,4,5,3
0.25,0,0.75,0,0
0,0.75,0,0,0
0.75,0,0.25,0,0
0,0,0,1,0
0,0,0,0,1''')

df = pd.read_csv(data, sep=",")
print(df)
print('------------------')
# Keep only values above 0.01 (the rest become NaN), then take the row-wise minimum
print(df[df > .01].min(axis=1))
Output:
G:\Anaconda3 λ python pd_test.py
      1     2     4  5  3
0  0.25  0.00  0.75  0  0
1  0.00  0.75  0.00  0  0
2  0.75  0.00  0.25  0  0
3  0.00  0.00  0.00  1  0
4  0.00  0.00  0.00  0  1
------------------
0    0.25
1    0.75
2    0.25
3    1.00
4    1.00
dtype: float64
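For the row-wise maximum asked about in post #2, the same masking idea should work with max in place of min. A minimal sketch, assuming the data from this post and a hypothetical cutoff of 0.8, chosen only so that some rows fail the condition:

```python
import pandas as pd
from io import StringIO

data = StringIO('''\
1,2,4,5,3
0.25,0,0.75,0,0
0,0.75,0,0,0
0.75,0,0.25,0,0
0,0,0,1,0
0,0,0,0,1''')
df = pd.read_csv(data, sep=",")

threshold = 0.8  # hypothetical cutoff, not from the thread

# where() keeps values that satisfy the condition and replaces the rest with NaN.
masked = df.where(df > threshold)

# max() skips NaN by default; a row with nothing above the cutoff comes back as NaN.
score = masked.max(axis=1)
print(score)
```

Rows where no value passes the cutoff show up as NaN in the result, so they still need to be handled afterwards.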
#4
(Dec-04-2017, 04:04 PM)snippsat Wrote: It's much better if you post code that can be run,
especially when it comes to pandas and similar libraries that many of us use only sporadically.
Here is how it could be done.

Good point, I'll take it into account. Thank you for the help, much appreciated! The only thing I added is .dropna(), since I got NaN values for rows where the condition was not met.
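The .dropna() step can also be applied before looking up the column, so that rows with nothing above the cutoff are removed first and idxmax never sees an all-NaN row. A sketch under stated assumptions: the data is the CSV from post #3, the cutoff of 0.8 is hypothetical, and the names "column" and "score" are borrowed from the first post.

```python
import pandas as pd
from io import StringIO

data = StringIO('''\
1,2,4,5,3
0.25,0,0.75,0,0
0,0.75,0,0,0
0.75,0,0.25,0,0
0,0,0,1,0
0,0,0,0,1''')
df = pd.read_csv(data, sep=",")

threshold = 0.8  # hypothetical cutoff, not from the thread

# Mask values at or below the cutoff, then drop rows where every value was masked.
masked = df.where(df > threshold).dropna(how="all")

# For the remaining rows, report the best column and its score side by side.
result = pd.DataFrame({
    "column": masked.idxmax(axis=1),
    "score": masked.max(axis=1),
})
print(result)
```

With this cutoff only the last two rows survive, because the other rows top out at 0.75.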