Python Forum
Code improvement: groupby and operation on conditionals
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Code improvement: groupby and operation on conditionals
#1
Hello. I have data in the following format. I am looking to groupby the location, and then find the maximum value for each compound where the corresponding flag is 'U'.

Quote: location compound1 compound1_flag compound2 compound2_flag compound3 compound3_flag
A 0.36 NaN 0.4 U 0.2 U
B 5 U 2.2 NaN 8 U
B 9.8 J 3.2 U 3300 NaN
B 7.2 J 1800 NaN 280 NaN
C 9 J 3200 NaN 700 NaN
C 4.5 NaN 6.1 NaN 1000 NaN
C 7.8 NaN 0.16 U 0.47 J
D 0.89 NaN 0.30 U 6.1 NaN
D 14 NaN 0.16 U 0.59 NaN
E 2 U 5.6 NaN 200 NaN

I have a half-way working solution, but I think it could be more readable, and hopefully there's a better way to get it into tabular format.

The following will print out the desired data, but I'm guessing there's a clearer method rather than nested for loops or the method of finding the values associated with 'U' flags.
analytes = ['compound1', 'compound2', 'compound3']
flags = [''.join([x, '_flag']) for x in analytes]
nondetect_dict = {k:v for (k,v) in zip(flags, analytes)}
by_location = data.groupby('location')
for key, group in by_location:
    for x in flags:
        print(key, nondetect_dict[x], by_location.get_group(key).loc[data[x].str.contains('U', regex=False, na=False), nondetect_dict[x]].max())
I tried to convert this into a dictionary with a list comprehension for the second for loop. I thought I could turn into a dataframe, but this throws a syntax error:
max_nondetect = {}
for key, group in by_location:
        max_nondetect[key] = [nondetect_dict[x], by_location.get_group(key).loc[data[x].str.contains('U', regex=False, na=False), nondetect_dict[x]].max() for x in flags]
pd.DataFrame.from_dict(max_nondetect, orient='index')
Any thoughts on a more efficient approach? Thank you in advance.
Reply


Messages In This Thread
Code improvement: groupby and operation on conditionals - by pythonidae - Dec-06-2019, 06:02 AM

Possibly Related Threads…
Thread Author Replies Views Last Post
  pandas change row value an existing column with conditionals Gigux 1 2,963 Jun-22-2019, 08:04 PM
Last Post: Gigux
  Genetic Algorithm improvement Alberto 0 4,295 Oct-19-2017, 02:13 PM
Last Post: Alberto

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020