Aug-26-2024, 03:24 PM
I don't think pandas is good for that kind of thing. I would convert to dictionaries.
import pandas as pd df1 = pd.DataFrame( data=[ { "id": "One", "served_sites": ["1", "2", "3"], }, { "id": "Two", "served_sites": ["4", "5", "6"], }, ] ) df2 = pd.DataFrame( data=[ {"site_id": "1", "name": "Toronto"}, {"site_id": "2", "name": "Calgary"}, {"site_id": "3", "name": "Chicago"}, {"site_id": "4", "name": "New York"}, {"site_id": "5", "name": "Los Angeles"}, ] ) site_names = dict(zip(df2.site_id, df2.name)) area_sites = dict(zip(df1.id, df1.served_sites)) for name, sites in area_sites.items(): area_sites[name] = [site_names.get(site, "undefined") for site in sites] print(area_sites) print("Is Calgary in area One?", "Calgary" in area_sites["One"]) print("Is Calgary in area Two?", "Calgary" in area_sites["Two"]) print("All sites common to areas One and Two", set(area_sites["One"]) & set(area_sites["Two"]))