Python Forum

Full Version: Merge dicts without override
Just as an FYI, I'm still new to working with dicts, so the way I have these structured could probably be improved. Basically, I have 3 dicts, each containing different information about the same set of individuals. Each individual has an ID, which is used as the key. I've tried multiple ways of combining them, but each method results in the data being overwritten. Also, some ID keys may not be in all of the dicts.

Below are examples of each dict. If anyone has a way of accomplishing this task and/or advice on how to improve the structures I would appreciate it.

#alias dict
{
    "10084": {
        "aliases": {
            "first_last": "Loren Meyer",
            "last_first": "Meyer, Loren"
        }
    },
    "10111": {
        "aliases": {
            "first_last": "Cory Higgins",
            "last_first": "Higgins, Cory"
        }
    },
    "10163": {
        "aliases": {
            "first_last": "Antoine Wright",
            "last_first": "Wright, Antoine"
        }
    }
}

#vitals dict
{
    "10084": {
        "vitals": {
            "ht": 82,
            "wt": 257,
            "birth_date": "12/30/1972"
        }
    },
    "10111": {
        "vitals": {
            "ht": 77,
            "wt": 180,
            "birth_date": "6/14/1989"
        }
    },
    "10163": {
        "vitals": {
            "ht": 79,
            "wt": 210,
            "birth_date": "2/6/1984"
        }
    }
}

#combine dict
{
    "10084": {
        "combine": {
            "span": 81.0,
            "reach": 100.0,
            "body_fat": 4.1,
        }
    },
    "10111": {
        "combine": {
            "span": 81.0,
            "reach": 100.0,
            "body_fat": 9.1,
        }
    },
    "10163": {
        "combine": {
            "span": 75.0,
            "reach": 85.0,
            "body_fat": 6.1,
        }
    }
}
Loop through the keys (10084, 10111, and 10163). For each key do aliases[key].update(vitals[key]) and aliases[key].update(combine[key]). You can use 'if key in dict' or try/except to handle keys that aren't in vitals or combine.
ichabod801's algorithm expressed in Python code:

combined = dict()

for d in (alias, vitals, combine):
    for key, value in d.items():
        try:
            combined[key].update(value)
        except KeyError:
            combined[key] = dict(value)
If defaultdict is OK, then the code can be even simpler:

from collections import defaultdict

combined = defaultdict(dict)

for d in (alias, vitals, combine):
    for key, value in d.items():
        combined[key].update(value)
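One thing worth knowing about the defaultdict variant: a defaultdict creates an entry on every lookup of a missing key, so reading an unknown ID later silently adds an empty dict. Below is a small sketch of converting the result back to a plain dict once merging is done. The sample data is made up and trimmed for illustration, with "10163" deliberately missing from vitals.

```python
from collections import defaultdict

# Made-up, trimmed sample data; "10163" is missing from vitals.
alias = {"10084": {"aliases": {"first_last": "Loren Meyer"}},
         "10163": {"aliases": {"first_last": "Antoine Wright"}}}
vitals = {"10084": {"vitals": {"ht": 82}}}

combined = defaultdict(dict)
for d in (alias, vitals):
    for key, value in d.items():
        combined[key].update(value)

# Convert to a plain dict so a typo in an ID raises KeyError
# instead of quietly inserting an empty entry.
combined = dict(combined)
print(sorted(combined["10084"]))  # ['aliases', 'vitals']
print(sorted(combined["10163"]))  # ['aliases']
```

IDs that appear in only some of the source dicts still end up in the result, just with fewer sections.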
Here is a generalized version that accepts as many dicts as you want.
All values of the dicts must themselves be dicts, otherwise this code won't work.

import pprint


#alias dict
d1 = {
    "10084": {
        "aliases": {
            "first_last": "Loren Meyer",
            "last_first": "Meyer, Loren"
        }
    },
    "10111": {
        "aliases": {
            "first_last": "Cory Higgins",
            "last_first": "Higgins, Cory"
        }
    },
    "10163": {
        "aliases": {
            "first_last": "Antoine Wright",
            "last_first": "Wright, Antoine"
        }
    }
}
 
#vitals dict
d2 = {
    "10084": {
        "vitals": {
            "ht": 82,
            "wt": 257,
            "birth_date": "12/30/1972"
        }
    },
    "10111": {
        "vitals": {
            "ht": 77,
            "wt": 180,
            "birth_date": "6/14/1989"
        }
    },
    "10163": {
        "vitals": {
            "ht": 79,
            "wt": 210,
            "birth_date": "2/6/1984"
        }
    }
}
 
#combine dict
d3 = {
    "10084": {
        "combine": {
            "span": 81.0,
            "reach": 100.0,
            "body_fat": 4.1,
        }
    },
    "10111": {
        "combine": {
            "span": 81.0,
            "reach": 100.0,
            "body_fat": 9.1,
        }
    },
    "10163": {
        "combine": {
            "span": 75.0,
            "reach": 85.0,
            "body_fat": 6.1,
        }
    }
}


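def combine_subdicts(*dicts):
    if len(dicts) < 2:
        raise ValueError('Minimum two dicts are required to combine')
    result = {}
    # Only ids present in the first dict are considered.
    for uid, data in dicts[0].items():
        try:
            for subdict in dicts[1:]:
                # Note: update() modifies the sub-dict of dicts[0] in place.
                data.update(subdict[uid])
        except KeyError:
            # An id missing from any later dict is reported and left out of result.
            print(f'Inconsistent. Missing id {uid} in {subdict}')
            continue
        result[uid] = data
    return result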


combined_dict = combine_subdicts(d1, d2, d3)
pprint.pprint(combined_dict, indent=2)
@DeaD_EyE - I like this approach, but there is one issue. It works perfectly for the alias/vitals dicts (they have identical ID groups), but some individuals do not have combine data (and I would anticipate adding a few other dicts that may not have all IDs). I tried replacing continue with pass below the KeyError line, but that didn't seem to make a difference. Any ideas on how to alter the script to better handle missing IDs?
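One possible way to handle this, sketched under the assumption that a missing ID should simply contribute nothing: instead of iterating only over the IDs of the first dict, walk every dict and collect each ID as it appears. Swapping continue for pass does not help in the original function because the KeyError has already aborted the inner loop by then. The function name combine_subdicts_union and the trimmed sample data are hypothetical, not from the posts above.

```python
def combine_subdicts_union(*dicts):
    """Merge per-ID sub-dicts, keeping IDs that are missing from some inputs."""
    result = {}
    for d in dicts:
        for uid, data in d.items():
            # setdefault creates an empty entry the first time an ID is seen,
            # so no KeyError handling is needed and the input dicts are
            # not modified.
            result.setdefault(uid, {}).update(data)
    return result


# Hypothetical sample data: ID "10163" has no combine entry.
aliases = {"10084": {"aliases": {"first_last": "Loren Meyer"}},
           "10163": {"aliases": {"first_last": "Antoine Wright"}}}
combine = {"10084": {"combine": {"span": 81.0}}}

merged = combine_subdicts_union(aliases, combine)
print(merged["10163"])  # {'aliases': {'first_last': 'Antoine Wright'}}
```

With this shape, adding further dicts that cover only some of the IDs needs no extra handling. Each individual ends up with whichever sections exist for them.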