Find a string from a column of one table in another table
Python Forum (https://python-forum.io), General Coding Help, thread /thread-40674.html
Find a string from a column of one table in another table - visedwings049 - Sep-05-2023

I am using Python 3.9. I have created two pandas dataframes, product and supplier, from CSV files.

1. I have created a product table that splits a description out into multiple columns.
2. I have created a supplier table that has the supplier product. Many times a supplier product code is in the description of a product.
3. I want to populate the product.supplier code column with any string that is contained in the supplier.product column. In this example we would have found the code widget in column 4 in the supplier table and returned the word widget in the supplier code table.
4. So I want to do this in a loop for column 0, then move on to column 1, and so on. There will never be two examples of a supplier code in the same string, so I am not worried about overwriting a first instance with a second.

I have tried the str.contains function, but this just returns True or False.

RE: Find a string from a column of one table in another table - deanhystad - Sep-05-2023

What is a table? Is it a spreadsheet? Is it a table in a PDF? Is it a CSV file? Is it a pandas DataFrame? Is it a table in a database?

RE: Find a string from a column of one table in another table - visedwings049 - Sep-05-2023

(Sep-05-2023, 03:03 PM)visedwings049 Wrote: I am using Python 3.9

I apologize, as this is my first post, but they are both being read as pandas dataframes from CSV.

RE: Find a string from a column of one table in another table - deanhystad - Sep-05-2023

Something like this maybe?
    import pandas as pd
    from string import ascii_letters as letters
    from random import choice, choices, randint


    def find_supplier(description):
        """Return word if word in description matches a supplier code, else None."""
        intersection = set(description.split()) & suppliers
        return list(intersection)[0] if intersection else None


    # Make some random table thing that we can use to search for words in the description
    # that match a supplier code.
    product_table = pd.DataFrame(
        [
            {
                "Product": i,
                "Supplier Code": choice("ABCDE"),
                "Description": " ".join(choices(letters, k=randint(5, 10))),
            }
            for i in range(100, 120)
        ]
    )

    # Get set of suppliers.
    suppliers = set(product_table["Supplier Code"].values)

    # Make supplier table. Supplier table contains rows from product_table
    # where one of the words in the description matches a supplier code.
    supplier_table = product_table[["Description"]]
    supplier_table["Product"] = supplier_table["Description"].map(find_supplier)
    supplier_table = supplier_table[~supplier_table["Product"].isna()][
        ["Product", "Description"]
    ]
    print(supplier_table)

This is easy to break up into individual supplier tables.

    for supplier in suppliers:
        print(
            supplier,
            supplier_table[supplier_table["Product"] == supplier].reset_index(drop=True),
            sep="\n",
            end="\n\n",
        )
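Editor's note: the original question mentions that str.contains only returns True or False. As a possible alternative to the set-intersection approach above (not the method used in the reply), pandas' str.extract can return the matched text itself. The sketch below is a minimal illustration under stated assumptions: the dataframe names, column names, and sample rows are hypothetical stand-ins for the two CSV files, and supplier codes are assumed to appear as whole words in the description.

```python
import re

import pandas as pd

# Hypothetical stand-ins for the two CSV-backed dataframes.
product = pd.DataFrame(
    {
        "description": [
            "blue widget 10mm",
            "steel bracket large",
            "gizmo replacement part",
        ]
    }
)
supplier = pd.DataFrame({"product": ["widget", "gizmo", "sprocket"]})

# Build one alternation pattern from the supplier codes.
# re.escape protects codes that contain regex metacharacters,
# and \b restricts matches to whole words (an assumption).
codes = supplier["product"].dropna().unique()
pattern = r"\b(" + "|".join(re.escape(code) for code in codes) + r")\b"

# str.extract returns the text captured by the group, or a missing
# value for rows where no supplier code is found.
product["supplier_code"] = product["description"].str.extract(pattern, expand=False)
print(product)
```

With this sample data, the first and third rows pick up "widget" and "gizmo", and the second row is left as a missing value. Dropping the \b anchors would also catch codes embedded inside longer strings, at the cost of more accidental matches.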
RE: Find a string from a column of one table in another table - visedwings049 - Sep-06-2023

Wow, that is an amazing way to do it. This is very helpful. I am going to play with this code and see if I can duplicate it with my data set. Thank you so much for this example. I had not done anything with defining a function prior to this, and that is extremely cool.

RE: Find a string from a column of one table in another table - visedwings049 - Sep-06-2023

(Sep-05-2023, 06:57 PM)deanhystad Wrote: Something like this maybe?

Worked beautifully, and it's way faster than sending everything to a different column. Had no idea you could do this in Python. I was able to compare thousands of rows of data in less than 30 seconds. Thanks again!!

RE: Find a string from a column of one table in another table - deanhystad - Sep-06-2023

Thousands of rows in 30 seconds is really slow. I modified my code to process 100,000 products and it did that in 0.1 seconds. So I doubled the number of suppliers, and that only increased the time by about 10% (0.11 seconds). Next I tripled the length of the description, and that doubled the time (0.22 seconds). What could be making your program run so slow?

RE: Find a string from a column of one table in another table - deanhystad - Sep-07-2023

I was getting a warning that I ignored in my code. It doesn't cause any problems in my example, but it has potential for causing very confusing behaviors. In my example I did this to make a new dataframe for suppliers:

    supplier_table = product_table[["Description"]]

pandas treats this as a slice of the product_table dataframe rather than an independent dataframe, which is why assigning a new column to it triggers the warning (SettingWithCopyWarning). What I should have done is make a copy of that slice so that supplier_table and product_table are independent.

    supplier_table = product_table[["Description"]].copy()
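Editor's note: a minimal sketch of the .copy() fix described above, using a tiny hand-written product_table (hypothetical data, not the random table from the earlier post). It only illustrates that a column added to the copy leaves the original dataframe untouched.

```python
import pandas as pd

# Small hypothetical product table for illustration.
product_table = pd.DataFrame(
    {
        "Product": [100, 101],
        "Description": ["a B c", "d e f"],
    }
)

# Take an explicit copy so supplier_table is independent of product_table.
supplier_table = product_table[["Description"]].copy()

# Adding a column to the copy does not modify product_table.
supplier_table["Product"] = ["B", None]

print(product_table)
print(supplier_table)
```

After this runs, product_table still holds its original numeric Product values, while supplier_table carries its own Product column.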