Python Forum
Merging Dictionaries - Optimum Style?
#1
From what I can find online, dictionaries can be merged either with the double asterisk (unpacking) operator or with the update() method.

The double asterisk approach is a simple one-liner, whereas update() is usually wrapped in a function, which can be a plain function or a method of a class.

At first glance, the double asterisk method looks the most tempting. Could experienced members advise whether this style can be adopted as the general preference, or whether there are reasons to use update() instead?

For reference, sample code is given below. Sample-1 uses the double asterisk method, while Sample-2 uses a plain function built on update().

Sample-3 puts an update()-based method in a dict subclass. This allows the dictionaries to be merged with the + operator, as seen in the second-to-last line (just before the print statement).

# Merging Dictionaries: Different Styles
a = {'de': 'Germany'}
b = {'sk': 'Slovakia'}
c = {'fr': 'France'}

# Sample-1: Double Asterisk Method
md = {**a, **b, **c}
print(md)

# Sample-2: Update Method
def merge_dics(diclist):
    td = {}
    for d in diclist:
        td.update(d)
    return td

md = merge_dics([a, b, c])
print(md)

# Sample-3: Update Method Via Class
# (Defines __add__ so that the + operator can be used)
class MergeDics(dict):
    def __add__(self, other):
        self.update(other)
        return MergeDics(self)

md = MergeDics(a)+b+c
print(md)
The output in each case is as follows:
Output:
{'de': 'Germany', 'sk': 'Slovakia', 'fr': 'France'}
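For comparison, on Python 3.9 and later the built-in dict type also supports the | and |= operators (PEP 584), which cover the same ground as Sample-1 and Sample-2 without a helper class. A minimal sketch, assuming an interpreter that is at least 3.9:

```python
a = {'de': 'Germany'}
b = {'sk': 'Slovakia'}
c = {'fr': 'France'}

# Requires Python 3.9+ (PEP 584): | builds a new dict, the operands are unchanged
md = a | b | c
print(md)

# |= updates the left-hand dict in place, like update()
td = {}
td |= a
td |= b
td |= c
print(td)
```

As with the other styles, the right-most value wins when a key appears more than once.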
A.D.Tejpal
#2
Choosing between 1 and 2 is more or less a matter of preference and will depend on the use case.
The third one is tricky, as at least the first term must be a MergeDics instance; I think it's overkill.
Also note that __add__() can return self; there is no need for MergeDics(self), since self is already an instance of the MergeDics class.
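One thing to watch with Sample-3: self.update(other) changes the left operand in place, whether __add__() returns self or a copy. If that side effect is unwanted, a possible variant (a sketch only, not the original poster's code) copies first and also defines __radd__(), so a plain dict may appear as the first term:

```python
class MergeDics(dict):
    def __add__(self, other):
        merged = MergeDics(self)   # work on a copy so self is left untouched
        merged.update(other)
        return merged

    def __radd__(self, other):
        # Called for plain_dict + MergeDics, since dict itself has no __add__
        merged = MergeDics(other)
        merged.update(self)
        return merged

a = {'de': 'Germany'}
b = {'sk': 'Slovakia'}
c = {'fr': 'France'}

first = MergeDics(a)
md = first + b + c
print(md)      # {'de': 'Germany', 'sk': 'Slovakia', 'fr': 'France'}
print(first)   # {'de': 'Germany'}

md2 = a + MergeDics(b) + c   # plain dict as the first term
print(md2)     # {'de': 'Germany', 'sk': 'Slovakia', 'fr': 'France'}
```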
Just to add timing for methods 1 and 2:
# Merging Dictionaries: Different Styles
a = {'de': 'Germany'}
b = {'sk': 'Slovakia'}
c = {'fr': 'France'}

from timeit import timeit
# Sample-2: Update Method
def merge_dicts(diclist):
    td = {}
    for d in diclist:
        td.update(d)
    return td
 

print(timeit(stmt='md = {**a, **b, **c}', setup='from __main__ import a, b, c', number=100000))
print(timeit(stmt='md = merge_dicts([a, b, c])', setup='from __main__ import a, b, c, merge_dicts', number=100000))
Output:
0.05819004499994662
0.24759990400002607
clearly, star unpacking is faster
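With three one-item dictionaries, much of the measured time is likely fixed overhead (the function call and building the list argument) rather than the merge itself. A rough sketch for repeating the comparison on larger inputs follows; the dictionary sizes and repeat counts are arbitrary assumptions, and no particular result is claimed:

```python
from timeit import repeat

# Hypothetical larger inputs: three dicts of 1,000 distinct keys each
a = {f"a{i}": i for i in range(1000)}
b = {f"b{i}": i for i in range(1000)}
c = {f"c{i}": i for i in range(1000)}

def merge_dicts(diclist):
    td = {}
    for d in diclist:
        td.update(d)
    return td

# Names the timed statements may use, passed explicitly instead of a setup import
ns = {'a': a, 'b': b, 'c': c, 'merge_dicts': merge_dicts}

# Take the best of 5 runs of 200 merges each to reduce noise
t_star = min(repeat('{**a, **b, **c}', globals=ns, number=200, repeat=5))
t_update = min(repeat('merge_dicts([a, b, c])', globals=ns, number=200, repeat=5))

print(f"star unpacking: {t_star:.4f} s")
print(f"update loop:    {t_update:.4f} s")
```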
#3
Keep in mind that dictionary keys must be unique. If the dictionaries being merged share a key, its value is overwritten without warning and only the last one remains.

>>> a = {'fr': 'France'}
>>> b = {'sk': 'Slovakia'}
>>> c = {'fr': 'Francia'}
>>> {**a, **b, **c}
{'fr': 'Francia', 'sk': 'Slovakia'}
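Shared keys can be spotted before merging, because dict.keys() returns a set-like view that supports the & operator. A small sketch using the same three dictionaries:

```python
a = {'fr': 'France'}
b = {'sk': 'Slovakia'}
c = {'fr': 'Francia'}

# dict.keys() is a set-like view, so & gives the keys two dicts have in common
clash = a.keys() & c.keys()
print(clash)                 # {'fr'}
print(a.keys() & b.keys())   # set()
```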
#4
(Oct-09-2019, 07:07 AM)buran Wrote: clearly, star unpacking is faster

Thanks for your kind confirmation along with the speed test. It is nice to know that star unpacking, apart from being the simplest, is faster too.

As you rightly pointed out, __add__() works fine when it returns self directly, without wrapping it in the class name.

(Oct-09-2019, 07:24 AM)perfringo Wrote: if there are same keys in dictionaries to be merged they will be overwritten without warning and only the last will remain.

Thanks for your kind input. In such a situation, if silent overwriting cannot be permitted, a custom function may be needed, since both star unpacking and update() are ruled out.
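One possible shape for such a custom function is a merge that refuses to overwrite and raises an error instead. The name merge_strict and the choice of ValueError are assumptions made for this sketch:

```python
def merge_strict(*dicts):
    """Hypothetical helper: merge dicts, refusing to overwrite any key."""
    merged = {}
    for d in dicts:
        clash = merged.keys() & d.keys()   # keys already seen in earlier dicts
        if clash:
            # key=repr keeps sorting safe even if keys are of mixed types
            raise ValueError(f"duplicate keys: {sorted(clash, key=repr)}")
        merged.update(d)
    return merged

print(merge_strict({'de': 'Germany'}, {'sk': 'Slovakia'}))
# {'de': 'Germany', 'sk': 'Slovakia'}

try:
    merge_strict({'fr': 'France'}, {'sk': 'Slovakia'}, {'fr': 'Francia'})
except ValueError as exc:
    print(exc)   # duplicate keys: ['fr']
```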
A.D.Tejpal
#5
For educational purposes I would like to repost what I wrote in one of the earlier threads:

Quote:“The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.” --Donald Knuth, Computer Programming as an Art

The BDFL was heavily influenced by Knuth, and therefore Python's design philosophy includes, among other things:

Quote:- Don’t fret too much about performance--plan to optimize later when needed.
- Don’t try for perfection because “good enough” is often just that.
....
Rather than striving for perfection, early adopters found that Python worked "well enough" for their purposes. As the user-base grew, suggestions for improvement were gradually incorporated into the language. As we will see in later sections, many of these improvements have involved substantial changes and reworking of core parts of the language.

While it's a good idea to look for the most efficient way, one should not get too obsessed with it in the early stages of writing the code.
#6
For merging multiple dictionaries, a proposed function dics_merge() is given below. It alerts the user to any duplicate keys and offers the option of permitting overwriting or not:

# Merge Dictionaries:
da = {"a":5, "b":10, "c":15, "d":20, "e":25}
db = {"f":55, "g":60, "h":65, "i":70, "j":75}
dc = {"k":105, "b":110, "m":115, "d":120, "n":80}
dd = {"p":155, "a":160, "b":165, "q":170, "r":180}
de = {"k":100, "p":60, "j":70, "r":90}

def dics_merge(diclist):
    """Merge dicts; if keys collide, ask whether later values may overwrite."""
    dk = []  # Duplicate keys, in the order they are met (repeats are kept)
    dmg1 = dict(diclist[0])  # Merge Result - No Overwrite (first value wins)
    dmg2 = dict(diclist[0])  # Merge Result - With Overwrite (last value wins)

    for d in diclist[1:]:
        dmg2.update(d)
        for k, v in d.items():
            if k in dmg1:
                dk.append(k)
            else:
                dmg1[k] = v

    if dk:
        print("Duplicate Keys Detected: ", dk)
        ow = input("Shall Overwrite Existing Contents? Y/N  ")
        if ow.strip().lower() == "y":
            return dmg2

    return dmg1

d_merge = dics_merge([da, db, dc, dd, de])
print(d_merge)
Outputs are as follows:
(a) Duplicate Keys List:
Output:
['b', 'd', 'a', 'b', 'k', 'p', 'j', 'r']
(b) Merge Result With No OverWrite:
Output:
{'a': 5, 'b': 10, 'c': 15, 'd': 20, 'e': 25, 'f': 55, 'g': 60, 'h': 65, 'i': 70, 'j': 75, 'k': 105, 'm': 115, 'n': 80, 'p': 155, 'q': 170, 'r': 180}
(c) Merge Result With OverWrite:
Output:
{'a': 160, 'b': 165, 'c': 15, 'd': 120, 'e': 25, 'f': 55, 'g': 60, 'h': 65, 'i': 70, 'j': 70, 'k': 100, 'm': 115, 'n': 80, 'p': 60, 'q': 170, 'r': 90}
Suggestions for further fine tuning of the proposed function would be most welcome.
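One possible refinement, offered as a sketch rather than a definitive version: take the overwrite decision as a parameter instead of calling input(), so the function can be used in scripts and tests, and report each duplicate key only once. The name merge_with_policy and its return shape are assumptions made for illustration:

```python
def merge_with_policy(diclist, overwrite=False):
    """Hypothetical variant of dics_merge(): no prompt, policy passed in."""
    merged = {}
    duplicates = []   # each clashing key listed once, in first-seen order
    for d in diclist:
        for k, v in d.items():
            if k in merged:
                if k not in duplicates:
                    duplicates.append(k)
                if not overwrite:
                    continue   # keep the value that arrived first
            merged[k] = v
    return merged, duplicates

da = {"a": 5, "b": 10}
db = {"b": 110, "c": 15}
dc = {"a": 160, "b": 165}

print(merge_with_policy([da, db, dc]))
# ({'a': 5, 'b': 10, 'c': 15}, ['b', 'a'])
print(merge_with_policy([da, db, dc], overwrite=True))
# ({'a': 160, 'b': 165, 'c': 15}, ['b', 'a'])
```

Building the result in a single pass also means an empty list of dictionaries returns an empty result instead of failing on diclist[0].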
A.D.Tejpal