Jun-30-2023, 12:46 PM
I have the following code:
    import pandas as pd
    from datetime import datetime

    # Create the DataFrame
    data = {'dt': ['1990-06-05 15:59:57', '1990-06-05 21:59:54', '1990-02-20 14:19:08',
                   '1990-03-01 23:01:44', '1990-05-27 21:59:55', '1990-05-27 22:59:55',
                   '1990-05-28 02:59:57', '1990-05-28 22:59:56', '1990-05-29 02:59:56',
                   '1990-05-30 21:59:54', '1990-06-10 22:59:54', '1990-06-11 02:59:55',
                   '1990-06-11 09:59:55', '1990-06-11 22:59:55'],
            'type': ['event'] * 14,
            'value': [1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0]}
    df = pd.DataFrame(data)

    # Identify transitions from 1 to 0
    m = (df['value'].diff() == -1)
    df['dt'] = pd.to_datetime(df['dt'])
    df.sort_values(['dt'], inplace=True)
    df.reset_index(drop=True, inplace=True)

    # Identify transitions from 1 to 0
    # Count transitions per group
    df['Transition_Count'] = m.groupby((df['type'] != df['type'].shift()).cumsum()).cumsum()

    # Calculate difference between 'dt' values for increasing transitions
    df['Transition_Difference'] = df.groupby('Transition_Count')['dt'].diff()

    # Filter rows where transition occurs and 'Transition_Difference' is not null
    transition_df = df[(df['Transition_Count'] > 0) & ~df['Transition_Difference'].isnull()]
    print(transition_df[['Transition_Count', 'dt', 'Transition_Difference']])

I want a DataFrame (transition_df) containing all the rows where the value in the 'value' column changes from 1 to 0. There is one exception: whenever there are two consecutive 0s in the column, it should take the last occurrence of 0, not the one directly below the 1. The resulting DataFrame I get is missing two entries: '1990-05-28 22:59:56' and '1990-05-30 21:59:54'. How can I write the mask condition to achieve the required result?
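To make the rule concrete, here is a minimal sketch of one way the mask might be written. It is not a definitive answer and rests on several assumptions: the rows are sorted by 'dt' before the mask is computed, 'value' only ever holds 0 or 1, all rows belong to a single 'type', and a run of 0s at the very start (with no 1 before it) is ignored. The names v, is_zero, next_is_not_zero and seen_one are made up for illustration.

```python
import pandas as pd

data = {'dt': ['1990-06-05 15:59:57', '1990-06-05 21:59:54', '1990-02-20 14:19:08',
               '1990-03-01 23:01:44', '1990-05-27 21:59:55', '1990-05-27 22:59:55',
               '1990-05-28 02:59:57', '1990-05-28 22:59:56', '1990-05-29 02:59:56',
               '1990-05-30 21:59:54', '1990-06-10 22:59:54', '1990-06-11 02:59:55',
               '1990-06-11 09:59:55', '1990-06-11 22:59:55'],
        'type': ['event'] * 14,
        'value': [1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0]}
df = pd.DataFrame(data)

# Assumption: sort by time first, so the mask sees the rows in chronological order
df['dt'] = pd.to_datetime(df['dt'])
df = df.sort_values('dt').reset_index(drop=True)

v = df['value']
is_zero = v.eq(0)
# True when the next row is not 0 (a 1, or the end of the data),
# i.e. this row is the last 0 of its run
next_is_not_zero = v.shift(-1).ne(0)
# True once a 1 has appeared at or before this row (assumes values are only 0/1),
# which excludes a leading run of 0s that no 1 precedes
seen_one = v.cummax().eq(1)

mask = is_zero & next_is_not_zero & seen_one
transition_df = df[mask]
print(transition_df[['dt', 'value']])
```

Under these assumptions the mask picks the last 0 of every run that follows a 1, which includes both '1990-05-28 22:59:56' and '1990-05-30 21:59:54'. If several 'type' values were present, the shift and cummax steps would presumably need to be done per group.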