Python Forum
Pandas dataframe comparing
#1
Hello,

I am reading a CSV file using pandas.

My dataframe consists of 3.7 million records and has two columns: date and Subscriber_ID.

The data is the list of active subscribers per day.

I want to find which Subscriber_ID values exist on day X but do not exist on day X + 1, so that I get a list of the subscribers who are no longer active on day X + 1. I want to do that for every day in the data.

Is there a comparison function that does this, or do I have to create a new dataframe for each day and compare the dataframes with each other? I have more than 75 days.

Here is a sample of my data and the result I want:

import pandas as pd

data = {'date':['22-Jan-22', '22-Jan-22', '22-Jan-22', '22-Jan-22', '23-Jan-22', '23-Jan-22', '23-Jan-22', '23-Jan-22', '23-Jan-22', '23-Jan-22', '23-Jan-22', '24-Jan-22', '24-Jan-22', '24-Jan-22', '24-Jan-22', '24-Jan-22', '24-Jan-22'], 'Subscriber_ID':['a', 'b', 'c', 'd', 'e', 'f', 'b', 'c', 'd', 'h', 'g', 'c', 'd', 'h', 'j', 'i', 'k']}

df = pd.DataFrame(data)

print(df)
I want to have the following result:

Subscribers_ID lost in 23-Jan-22 is/are: a
Subscribers_ID lost in 24-Jan-22 is/are: e, f, b, g
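
One possible way to get this result without building a separate dataframe per day is to collect the IDs of each day into a set and subtract consecutive sets. This is only a sketch built on the sample above. It assumes the dates always use the `22-Jan-22` format with English month abbreviations, and that "day X + 1" means the next day present in the data. The names `ids_per_day` and `lost` are made up for illustration.

```python
import pandas as pd

data = {
    'date': ['22-Jan-22'] * 4 + ['23-Jan-22'] * 7 + ['24-Jan-22'] * 6,
    'Subscriber_ID': ['a', 'b', 'c', 'd',
                      'e', 'f', 'b', 'c', 'd', 'h', 'g',
                      'c', 'd', 'h', 'j', 'i', 'k'],
}
df = pd.DataFrame(data)

# Parse the strings into real dates so the days sort chronologically
# (assumed format: day-abbreviated month-two digit year).
df['date'] = pd.to_datetime(df['date'], format='%d-%b-%y')

# One set of IDs per day; groupby yields the groups in sorted date order.
ids_per_day = {day: set(ids) for day, ids in df.groupby('date')['Subscriber_ID']}

# Compare each day with the next day present in the data.
days = list(ids_per_day)
lost = {}
for prev_day, next_day in zip(days, days[1:]):
    gone = ids_per_day[prev_day] - ids_per_day[next_day]
    lost[next_day.strftime('%d-%b-%y')] = sorted(gone)

for day, ids in lost.items():
    print(f"Subscribers_ID lost in {day} is/are: {', '.join(ids)}")
```

The IDs come out in alphabetical order because of `sorted`, so the second line lists b, e, f, g rather than e, f, b, g. If a calendar day is missing from the file, this sketch compares against the next day that does exist, which may or may not be what is wanted.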