DataFrame .xlsx max()

bnadir55 · (This post was last modified: Jan-22-2022, 06:28 PM by bnadir55.)

(Jan-22-2022, 11:47 AM)snippsat Wrote: Try:
df = df[df.groupby(['order id'],as_index=False)[['order date']].max()]

(Jan-22-2022, 05:46 PM)snippsat Wrote: You should post a sample of the DataFrame so we can run it; then it is easier to test things out.
Here I make up an example so that your line of code can be run.

import pandas as pd

mydataset = {
  'order id': [3, 7, 2],
  'order date': ['2020-01-01', '2020-01-02', '2020-01-03'],
  'order requester': ['a', 'b', 'c'],
  'order urgency': ['slow', 'fast', 'now']
}

df_1 = pd.DataFrame(mydataset)
df_1["order date"] = pd.to_datetime(df_1["order date"])
df_2 = df_1.groupby(['order id'],as_index=False)[['order date']].max()

print(df_1)
print('-' * 40)
print(df_2)

Output:
   order id order date order requester order urgency
0         3 2020-01-01               a           slow
1         7 2020-01-02               b           fast
2         2 2020-01-03               c            now
----------------------------------------
   order id order date
0         2 2020-01-03
1         3 2020-01-01
2         7 2020-01-02

So this is just a guess at your data. You can try combining the grouped dates back into the original frame (this may not be what you want).

>>> df_2.combine_first(df_1)
  order date  order id order requester order urgency
0 2020-01-03         2               a          slow
1 2020-01-01         3               b          fast
2 2020-01-02         7               c           now
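
Note that combine_first aligns on the row index, not on 'order id', so in the output above order id 2 ends up next to requester 'a' even though it belongs to 'c' in the original frame. As a rough sketch of one alternative, assuming the same made-up sample data, the grouped result can be merged back on the key columns instead (the name `merged` is only for illustration):

```python
import pandas as pd

# Same made-up sample data as in the example above
df_1 = pd.DataFrame({
    'order id': [3, 7, 2],
    'order date': pd.to_datetime(['2020-01-01', '2020-01-02', '2020-01-03']),
    'order requester': ['a', 'b', 'c'],
    'order urgency': ['slow', 'fast', 'now'],
})

# Latest order date per order id
df_2 = df_1.groupby(['order id'], as_index=False)[['order date']].max()

# Merge on the key columns so each row keeps its own requester and urgency
merged = df_2.merge(df_1, on=['order id', 'order date'], how='left')
print(merged)
```

With this sample every order id appears only once, so the merge simply returns the original rows sorted by order id; it becomes useful once an order id has several dates.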

Thanks,
here is what I need: return the rows of the left table, with all their columns, that carry the max(Order_date) for each order_ID (see the results in the right table).
See the attachment:
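
The attachment is not reproduced here, so the following is only a sketch of one way to get that result, assuming a table where an order id can appear on several rows. The sample values and the names `orders`, `latest_per_order` and `latest_rows` are hypothetical. It uses groupby(...).transform('max') to broadcast each group's latest date back onto every row and then keeps the rows that match it:

```python
import pandas as pd

# Hypothetical sample: order ids 3 and 2 each appear twice
orders = pd.DataFrame({
    'order id': [3, 3, 7, 2, 2],
    'order date': pd.to_datetime([
        '2020-01-01', '2020-01-05', '2020-01-02', '2020-01-03', '2020-01-01',
    ]),
    'order requester': ['a', 'b', 'c', 'd', 'e'],
    'order urgency': ['slow', 'fast', 'now', 'slow', 'fast'],
})

# For every row, the latest order date within its own order id
latest_per_order = orders.groupby('order id')['order date'].transform('max')

# Keep the rows whose date equals that maximum; all columns are preserved
latest_rows = orders[orders['order date'] == latest_per_order]
print(latest_rows)
```

If two rows of the same order id share the latest date, this keeps both of them. To keep exactly one row per order id, `orders.loc[orders.groupby('order id')['order date'].idxmax()]` is a possible variant.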
