Python Forum
Need help getting unique values across two columns of a dataframe
#1
Hi there, my question is pretty much what the title says. I have a pandas dataframe with four columns, and I want to be able to get the unique values across pairs of columns. Here's my code:

df_concat = pd.concat([df1, df2, df3, df4], axis=1)

len(df_concat['K-mers A'].unique().tolist())
This code works well for getting the number of unique values in one column, but I need to get the values across two columns. Since columns will have some of the same values, I can't just find them separately and add them together. I'd really appreciate any help, as I'm struggling to figure this out :)

Bonus question: How would I go about finding not the number of unique values, but the number of values that occur in both columns? :) Once I get this my work will be done :D
#2
(Jun-29-2019, 03:06 PM)a_real_phoenix Wrote: Since columns will have some of the same values, I can't just find them separately and add them together. I'd really appreciate any help, as I'm struggling to figure this out :)
You can convert each column to a list, concatenate the lists, and apply numpy.unique, e.g.
import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': ['a', 'b', 'c'], 'col2': ['c', 'b', 'd']})
# pd.np was deprecated in pandas 1.0 and later removed, so call numpy directly
np.unique(df['col1'].tolist() + df['col2'].tolist())
Hopefully you can solve the bonus question yourself.
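For anyone landing here later, here is a minimal sketch of both counts using pandas only. The sample data and the column name 'K-mers B' are assumptions made for illustration (only 'K-mers A' appears in the original post). Because pd.concat(..., axis=1) pads shorter columns with NaN, the sketch drops missing values before counting.

```python
import pandas as pd

# Hypothetical sample data: 'K-mers A' comes from the original post,
# 'K-mers B' and all of the values are made up for illustration.
df_concat = pd.DataFrame({
    'K-mers A': ['AAT', 'ATG', 'TGC', 'AAT'],
    'K-mers B': ['TGC', 'GCA', 'ATG', 'CAT'],
})

# Drop NaN padding that concat(axis=1) can introduce for unequal-length columns.
col_a = df_concat['K-mers A'].dropna()
col_b = df_concat['K-mers B'].dropna()

# Number of distinct values across both columns combined.
n_unique = pd.concat([col_a, col_b]).nunique()

# Bonus question: distinct values that appear in both columns.
common = set(col_a) & set(col_b)
n_common = len(common)

print(n_unique, n_common)
```

Stacking the two columns into one Series and calling nunique() counts each shared value once, and the set intersection gives the shared values themselves rather than just how many there are.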