Python Forum
Separating unique, stable, samples using pandas
#1
I'm using pandas to process a simple two-column table that holds a series of input and output integer data samples. Not all samples are useful, because I have to wait for the data to stabilize. I want to keep only the unique I/O pairs that repeat back-to-back for at least 3 rows.

My first thought is to treat this the way I would process a plain list: iterate through it, compare the current row with "previous_row" and "row_before_that", and if all three are equal, add the row to a separate list. Then deduplicate the separate list.
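For reference, the loop-based idea described above could be sketched roughly as follows. This is only an illustration: the function name stable_pairs_loop, the run_length parameter, and the sample data are made up for the example, and rows are assumed to be (input, output) tuples.

```python
def stable_pairs_loop(rows, run_length=3):
    """Return unique rows that repeat for at least run_length consecutive rows."""
    stable = []
    for i in range(run_length - 1, len(rows)):
        # The current row plus the (run_length - 1) rows before it.
        window = rows[i - run_length + 1 : i + 1]
        if all(r == rows[i] for r in window):
            stable.append(rows[i])
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(stable))


# Hypothetical (input, output) samples, assumed only for this example.
samples = [
    (1, 10), (1, 10), (1, 10),
    (2, 10),
    (2, 20), (2, 20), (2, 20), (2, 20),
    (1, 10), (1, 10), (1, 10),
]
result = stable_pairs_loop(samples)
print(result)
```

Tuples are used so that the rows are hashable and can be deduplicated with dict.fromkeys.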

But this doesn't seem to be the pandas way, given the vectorization principles I've been reading about. I thought about adding new columns that are shifted by one and by two rows, and then comparing across them.

The traditional way would work, but is there a better way?

Thanks
#2
For future googlers, here's the method I figured out:

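df['InputStable'] = (df['InputBus'] == df['InputBus'].shift(periods=1)) & (df['InputBus'] == df['InputBus'].shift(periods=2))

This creates a new boolean column that is True only where the current value equals both the previous value and the one before that. The first two rows have no full history, so shift() leaves NaN there and they come out False. I left out fill_value=0 on purpose, because a column that starts with 0 would otherwise be flagged as stable from its very first row.

Each comparison needs its own parentheses. In Python, & binds more tightly than ==, so without them the two shifted columns are bitwise ANDed as integers first and only that result is compared with the current value, which is not the test I wanted.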
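Since the goal is stable I/O pairs rather than a stable input alone, here is a rough sketch of the same shift() idea applied to both columns at once, with each comparison kept separate before the boolean AND. The column name OutputBus, the name Stable, and the sample values are assumptions for illustration; only InputBus appears in the thread.

```python
import pandas as pd

# Hypothetical sample data; "OutputBus" is an assumed column name.
df = pd.DataFrame({
    "InputBus":  [3, 3, 3, 5, 5, 5, 5],
    "OutputBus": [7, 7, 7, 7, 9, 9, 9],
})

pairs = df[["InputBus", "OutputBus"]]

# Compare every row with the row one and two positions earlier.
# shift() leaves NaN in the first rows, and NaN never compares equal,
# so rows without enough history come out False.
same_as_prev = (pairs == pairs.shift(1)).all(axis=1)
same_as_prev2 = (pairs == pairs.shift(2)).all(axis=1)

# Both comparisons are complete boolean Series before & combines them.
df["Stable"] = same_as_prev & same_as_prev2
print(df)
```

Building the two comparisons as separate variables sidesteps the precedence question, because & is only ever applied to boolean Series.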

This utilizes the shift() function in pandas.

I also used drop() and drop_duplicates() to get rid of repeated rows.
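The thread does not show how drop() and drop_duplicates() were actually called, so the following is just one plausible way to go from a stability flag to the final list of unique pairs. The column names OutputBus and Stable and the sample data are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data with a precomputed boolean flag; "OutputBus" and
# "Stable" are assumed names, not taken from the thread.
df = pd.DataFrame({
    "InputBus":  [3, 3, 3, 5, 5, 5, 5, 3, 3, 3],
    "OutputBus": [7, 7, 7, 7, 9, 9, 9, 7, 7, 7],
    "Stable":    [False, False, True, False, False,
                  False, True, False, False, True],
})

stable_pairs = (
    df[df["Stable"]]            # keep only rows flagged as stable
    .drop(columns="Stable")     # the helper column is no longer needed
    .drop_duplicates()          # each I/O pair should appear once
    .reset_index(drop=True)
)
print(stable_pairs)
```

Filtering with a boolean mask is used here instead of drop() on row labels; either should work, and the mask version avoids having to collect the index of the unstable rows first.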