Sep-10-2019, 08:22 AM
Good morning, this is my first message in the forum, so hi.

I have a question and I don't know how to proceed: I'm trying to compare two columns in two different data frames.
    import pandas as pd
    import numpy as np
    from sklearn import linear_model
    from sklearn import model_selection
    from sklearn.metrics import classification_report
    from sklearn.metrics import confusion_matrix
    from sklearn.metrics import accuracy_score
    import matplotlib.pyplot as plt
    import seaborn as sb

    df = pd.read_csv(r"Prueba1CSV.csv", sep=";")
    df1 = pd.read_csv(r"Prueba2CSV.csv", sep=";")

    def is_in_column(data, values):
        # True if the value appears in the list taken from the other frame
        return data in values

    # Flag each value of df[Cabeza] that also appears in df1[Cabeza2]
    df['Present_in_other'] = df[Cabeza].apply(is_in_column, values=df1[Cabeza2].tolist())
    # Keep only the rows with no match in the other frame
    df3 = df[df['Present_in_other'] == False]

The problem is that "is_in_column" currently returns True, and what I want it to return is the index at which the match is found.
Maybe it's very easy but I don't know how to do it.
Thanks,
Iván
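
One possible approach, as a minimal sketch: build a lookup from each value in df1 to the index of its first occurrence, then map the df column through it. The small frames below are made-up stand-ins for the two CSV files, and the quoted column names "Cabeza" and "Cabeza2" as well as the new column Index_in_other are assumptions, since the post does not show how Cabeza and Cabeza2 are defined.

```python
import pandas as pd

# Hypothetical stand-ins for Prueba1CSV.csv and Prueba2CSV.csv;
# the column names are assumptions, not taken from the real files.
df = pd.DataFrame({"Cabeza": ["a", "b", "c", "d"]})
df1 = pd.DataFrame({"Cabeza2": ["c", "x", "a", "c"]})

# Lookup Series: value in df1 -> index label of its first occurrence.
# reset_index() exposes the (unnamed) index as a column called "index".
first_index = (
    df1.reset_index()
       .drop_duplicates(subset="Cabeza2", keep="first")
       .set_index("Cabeza2")["index"]
)

# Look up every value of df; values with no match in df1 become NaN.
df["Index_in_other"] = df["Cabeza"].map(first_index)

# Rows of df that have no match in df1 (the role df3 plays in the post).
df3 = df[df["Index_in_other"].isna()]

print(df)
print(df3)
```

Because unmatched rows hold NaN, the new column has a float dtype. If a value can appear several times in df1 and every position is needed, a merge of the two frames would probably be a better fit than a first-occurrence lookup.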