(Jan-22-2022, 11:47 AM)snippsat Wrote: Try:
df = df[df.groupby(['order id'],as_index=False)[['order date']].max()]
(Jan-22-2022, 05:46 PM)snippsat Wrote: You should post a sample of the DataFrame so we can run it; then it is easier to test things out.
If I make an example, I can run your line of code.
import pandas as pd

mydataset = {
    'order id': [3, 7, 2],
    'order date': ['2020-01-01', '2020-01-02', '2020-01-03'],
    'order requester': ['a', 'b', 'c'],
    'order urgency': ['slow', 'fast', 'now']
}
df_1 = pd.DataFrame(mydataset)
df_1["order date"] = pd.to_datetime(df_1["order date"])
df_2 = df_1.groupby(['order id'], as_index=False)[['order date']].max()
print(df_1)
print('-' * 40)
print(df_2)

So this is just a guess at your data. You can try combining the grouped dates back into the original frame (this may not be what you want).
Output:
   order id order date order requester order urgency
0         3 2020-01-01               a          slow
1         7 2020-01-02               b          fast
2         2 2020-01-03               c           now
----------------------------------------
   order id order date
0         2 2020-01-03
1         3 2020-01-01
2         7 2020-01-02
>>> df_2.combine_first(df_1)
  order date  order id order requester order urgency
0 2020-01-03         2               a          slow
1 2020-01-01         3               b          fast
2 2020-01-02         7               c           now
Thanks,
here is what I need: from the left table, return all the rows and columns that carry the max(Order_date) grouped by order_ID (see the results in the right table).
See the attachment.
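One possible way to get that result, as a minimal sketch: compute the latest order date per order id with groupby(...).transform('max'), which returns a Series aligned to the original rows, and keep only the rows whose date equals it. The sample data below is made up for illustration (the attachment is not available here), and the names orders, latest_per_order and result are hypothetical.

```python
import pandas as pd

# Made-up sample data (an assumption, not the poster's real table);
# order ids 3 and 7 appear more than once so the filter has something to do.
orders = pd.DataFrame({
    "order id": [3, 3, 7, 7, 2],
    "order date": pd.to_datetime(
        ["2020-01-01", "2020-01-05", "2020-01-02", "2020-01-02", "2020-01-03"]
    ),
    "order requester": ["a", "b", "c", "d", "e"],
    "order urgency": ["slow", "fast", "now", "slow", "fast"],
})

# transform('max') returns one value per original row:
# the latest order date within that row's order id group.
latest_per_order = orders.groupby("order id")["order date"].transform("max")

# Keep every row (with all of its columns) whose date is the group maximum.
result = orders[orders["order date"] == latest_per_order]
print(result)
```

Note that rows tied on the maximum date are all kept (both rows for order id 7 in this sample). If only one row per order id is wanted, something like orders.loc[orders.groupby("order id")["order date"].idxmax()] may fit better.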