May-30-2024, 04:26 PM
Hi,
I have a dataset of relationships between old and new products, and I would like to group them into relationship groups. I have written a script with some sample data. The expected result is the number at the end of each row. That column is not part of the original data, which has only two columns: old product and new product.
As you can see, I don't get the expected result for this row: ("9000825_ENDOS025", "9000825_NEXO025", 3),
The relationship between these two products is put in group 5 (see the output below), but I want it in group 3, because the product key 9000825_NEXO025 appears on both the left and the right side.
I tried sorting the dataset, which gave me the right result, but I don't think I can rely on sorting a dataset of 134,000 rows. How can I change the code to get the desired result?
Best regards
Morten
def group_related_products(data):
    groups = []
    for row in data:
        old_product, new_product, group = row
        found_group = False
        for existing_group in groups:
            if old_product in existing_group:
                existing_group.add(new_product)
                found_group = True
                break
        if not found_group:
            groups.append({old_product, new_product})
    return groups

# Sample data
data = [
    ("9000002_88008621", "9000002_88008621", 1),
    ("9000002_88008621", "9000463_2526534", 1),
    ("9000002_88008625", "9000002_88008625", 2),
    ("9000002_88008625", "9000463_160159", 2),
    ("9000825_NEXO025", "9000756_13002", 3),
    ("9000756_13002", "9000756_13004", 3),
    ("9000756_42420", "9000756_42431", 4),
    ("9000002_88008621", "9006274_88008621", 1),
    ("9000825_ENDOS025", "9000825_NEXO025", 3),
    ("9032273_006899", "9000048_1000010123", 6),
    ("9032273_006899", "9000035_KZDC120003", 6),
    ("9032273_006899", "9000028_IV-9001B", 6),
    ("9032272_BH-EGF", "9000048_1000010123", 7),
    ("9032272_BH-EGF", "9000035_KZDC120003", 7),
    ("9032272_BH-EGF", "9000028_IV-9001B", 7),
]

# Group related products
related_groups = group_related_products(data)

# Print the groups
for i, group in enumerate(related_groups, 1):
    print(f"Group {i}: {group}")
Output:
Group 1: {'9000463_2526534', '9006274_88008621', '9000002_88008621'}
Group 2: {'9000463_160159', '9000002_88008625'}
Group 3: {'9000756_13002', '9000756_13004', '9000825_NEXO025'}
Group 4: {'9000756_42431', '9000756_42420'}
Group 5: {'9000825_ENDOS025', '9000825_NEXO025'}
Group 6: {'9000048_1000010123', '9000035_KZDC120003', '9000028_IV-9001B', '9032273_006899'}
Group 7: {'9000048_1000010123', '9032272_BH-EGF', '9000035_KZDC120003', '9000028_IV-9001B'}
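One order-independent way to get this grouping might be a union-find (disjoint-set) pass over the rows instead of a single scan. The sketch below is only that, a sketch. It assumes, based on the expected column in the sample data, that two rows belong together when they share the same old product or when one row's new product is another row's old product, and that sharing only a new product (as the rows in groups 6 and 7 do) does not merge groups. The names `group_rows` and `rows` are hypothetical. Group numbers are assigned in order of first appearance, so they can differ from the hand-written ones (here 5 and 6 instead of 6 and 7), while the partition of rows is the same.

```python
def group_rows(rows):
    """Return one group label per (old, new) row, independent of row order.

    Assumption: rows are linked through old products only. A new product
    links two rows only if it also occurs somewhere as an old product.
    """
    olds = {old for old, _new in rows}
    parent = {p: p for p in olds}  # disjoint-set forest over old products

    def find(p):
        # Walk up to the root, shortening the path along the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # First pass: connect old products that are chained via a new product.
    for old, new in rows:
        if new in olds:
            union(old, new)

    # Second pass: number the components in order of first appearance.
    ids = {}
    labels = []
    for old, _new in rows:
        root = find(old)
        labels.append(ids.setdefault(root, len(ids) + 1))
    return labels


# The sample data from the post, without the hand-written group column.
rows = [
    ("9000002_88008621", "9000002_88008621"),
    ("9000002_88008621", "9000463_2526534"),
    ("9000002_88008625", "9000002_88008625"),
    ("9000002_88008625", "9000463_160159"),
    ("9000825_NEXO025", "9000756_13002"),
    ("9000756_13002", "9000756_13004"),
    ("9000756_42420", "9000756_42431"),
    ("9000002_88008621", "9006274_88008621"),
    ("9000825_ENDOS025", "9000825_NEXO025"),
    ("9032273_006899", "9000048_1000010123"),
    ("9032273_006899", "9000035_KZDC120003"),
    ("9032273_006899", "9000028_IV-9001B"),
    ("9032272_BH-EGF", "9000048_1000010123"),
    ("9032272_BH-EGF", "9000035_KZDC120003"),
    ("9032272_BH-EGF", "9000028_IV-9001B"),
]

labels = group_rows(rows)
for row, label in zip(rows, labels):
    print(label, row)
```

Because all links are collected before any label is assigned, the row ("9000825_ENDOS025", "9000825_NEXO025") ends up with the same label as the other 9000825_NEXO025 rows no matter where it sits in the data, so no sorting should be needed.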