Python Forum

Full Version: remove duplicates from dicts with list values
Hi,

I need to remove duplicates from dicts with list values. This weird structure comes from defaultdict, btw...

example:

dict1 = {'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']], 'SAP': [], 'C11_RG': [], 'W11_RG': []}
dict2 = {'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IP', '172.17.10.112'], ['IP', '10.111.160.119'], ['IP', '10.111.160.68'], ['IP', '10.111.160.66'], ['IP', '10.95.0.112'], ['IP', '10.111.162.119']], 'SAP': [], 'C11_RG': [], 'W11_RG': []}

dict1 = {k: v for k, v in dict1.items() if v not in dict2.values()}
dict2 = {k: v for k, v in dict2.items() if v not in dict1.values()}

print(dict1)
print(dict2)
does not work unfortunately

Output:
root@ssap: /tmp # /opt/freeware/bin/python3 bla.py
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']]}
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IP', '172.17.10.112'], ['IP', '10.111.160.119'], ['IP', '10.111.160.68'], ['IP', '10.111.160.66'], ['IP', '10.95.0.112'], ['IP', '10.111.162.119']], 'SAP': [], 'C11_RG': [], 'W11_RG': []}
duplicates remain...any better ideas?

wbr

chris
Quote:this weird structure comes from defaultdict btw...
Should the first step be to change the way the dictionaries are structured? How would you like them to look? Do you have control of the code that generates the dictionary? If so, post the code with some example data and a description of what you want for output. defaultdict does what it is told to do, so odd dictionary structure is the fault of the code using defaultdict. Solve the real problem instead of putting a band-aid on it.

And can you define what you mean by "duplicate"? Since your dictionary has 1 key and 1 value, and the values in the two dictionaries are different, there are no duplicates.
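
To make that concrete, here is a minimal sketch with shortened stand-ins for the posted values (the names `a`, `b` and `other_values` are made up for this illustration). A test like `v not in dict2.values()` compares each whole list for equality, so two lists that merely share one inner pair do not count as equal, while two empty lists do:

```python
# Shortened stand-ins for the values in the original post.
a = [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]]
b = [["OS_TYPE", "AIX"], ["IP", "172.17.10.112"]]
other_values = [b, []]  # plays the role of dict2.values()

whole_list_match = a in other_values    # compares a == b, then a == []
empty_list_match = [] in other_values   # [] == [] is True
shared_pair = [x for x in a if x in b]  # element-wise comparison instead

print(whole_list_match)   # False
print(empty_list_match)   # True
print(shared_pair)        # [['OS_TYPE', 'AIX']]
```

So whether anything counts as a "duplicate" depends on whether whole value lists or the individual pairs inside them are being compared.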
As I need defaultdict functionality, the structure remains as it is... it still qualifies as weird.

About duplicates you are right... only the values differ... so how do I remove duplicate values then?

I guess I can do something like this, but it does not look very efficient...

result = {}

for key,value in input_raw.items():
    if value not in result.values():
        result[key] = value

print(result)
Could you please post the code that makes the dictionaries. There is no reason the result needs to look as it does. As I said, there is nothing in defaultdict that results in the dictionary you are getting, it is how your code is using the defaultdict that makes them that way. I don't see any way that you can blame this on a defaultdict.
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']]}
Should it be this?
{'SAG01112_SSAP_HA_LPM': {'OS_TYPE': 'AIX', 'IS_COBOL': '1'}}
(May-23-2024, 05:25 PM)deanhystad Wrote: Could you please post the code that makes the dictionaries. There is no reason the result needs to look as it does. As I said, there is nothing in defaultdict that results in the dictionary you are getting, it is how your code is using the defaultdict that makes them that way. I don't see any way that you can blame this on a defaultdict.
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']]}
Should it be this?
{'SAG01112_SSAP_HA_LPM': {'OS_TYPE': 'AIX', 'IS_COBOL': '1'}}

Nothing special... the defaultdict is initialized with the list factory, and lists with gathered values are appended during script runtime.

my_dict = defaultdict(list) 
list is required because keys are not guaranteed unique
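
For readers following along, a guess at how such a structure could arise. The actual script was not posted, so the `records` input and the loop below are purely hypothetical; only `defaultdict(list)` and the appending of lists come from the description above:

```python
from collections import defaultdict

# Hypothetical input: (group, attribute, value) records gathered at runtime.
records = [
    ("SAG01112_SSAP_HA_LPM", "OS_TYPE", "AIX"),
    ("SAG01112_SSAP_HA_LPM", "IP", "172.17.10.112"),
    ("SAG01112_SSAP_HA_LPM", "IP", "10.111.160.119"),
]

my_dict = defaultdict(list)
for group, attribute, value in records:
    # The attribute name "IP" repeats, so each pair is stored as its own list.
    my_dict[group].append([attribute, value])

# Merely looking up a missing key inserts an empty list for it.
my_dict["SAP"]

print(dict(my_dict))
```

Under these assumptions, the repeated `IP` entries and the empty lists for keys such as `SAP` both fall out of ordinary `defaultdict(list)` behavior.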
Quote:list is required because keys are not guaranteed unique
You are not showing how you use the dictionary. The usage is what is making a funky dictionary.

If keys are not guaranteed to be unique, why are you using a dictionary at all?
(May-23-2024, 07:27 PM)deanhystad Wrote:
Quote:list is required because keys are not guaranteed unique
You are not showing how you use the dictionary. The usage is what is making a funky dictionary.

If keys are not guaranteed to be unique, why are you using a dictionary at all?

What do you mean by "how you use the dictionary"? I am appending lists, what else do you need to know? I think I have elaborated enough on off-topic stuff, so is there some useful hint on how to remove the duplicate values or not?
I think the problem is, you are modifying dict1, but still want to use it to modify dict2 afterwards.

Make new dictionaries, dict3 and dict4 and leave dict1 and dict2 untouched.

I assume you only want to exclude duplicate values, not duplicate keys.

This works for me:

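import string
from random import choice

# Uppercase letters used as dictionary keys
alph = string.ascii_uppercase

def makeme(num):
    # Return one (key, value) pair: a fixed letter key and a random one-letter list
    value = [choice("ABCDEFGH")]
    key = alph[num]
    return (key, value)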


# keys are A,B,C,D,E
dict1 = {makeme(j)[0]:makeme(j)[1] for j in range(5)}
# keys are D,E,F,G,H
# keys D and E are also in dict1
dict2 = {makeme(j+3)[0]:makeme(j+3)[1] for j in range(5)}

# from this you can see dict3 must lose [F] 2 times and lose [E] 1 time
# dict1.values = dict_values([['F'], ['E'], ['G'], ['F'], ['G']])
# dict2.values = dict_values([['F'], ['H'], ['E'], ['A'], ['A']])

dict3 = {k: v for k, v in dict1.items() if v not in dict2.values()} # dict3 = {'C': ['G'], 'E': ['G']}

# from this you can see dict4 must lose [F] 1 time and lose [E] 1 time
# dict1.values = dict_values([['F'], ['E'], ['G'], ['F'], ['G']])
# dict2.values = dict_values([['F'], ['H'], ['E'], ['A'], ['A']])

dict4 = {k: v for k, v in dict2.items() if v not in dict1.values()} # dict4 = {'E': ['H'], 'G': ['A'], 'H': ['A']}
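
The same two comprehensions applied to a shortened version of the dictionaries from the first post (a sketch only; the IP list is abbreviated and two of the empty keys are left out):

```python
dict1 = {
    "SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]],
    "SAP": [],
    "C11_RG": [],
}
dict2 = {
    "SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IP", "172.17.10.112"]],
    "SAP": [],
    "C11_RG": [],
}

# Build new dictionaries so each filter still sees the untouched originals.
dict3 = {k: v for k, v in dict1.items() if v not in dict2.values()}
dict4 = {k: v for k, v in dict2.items() if v not in dict1.values()}

print(dict3)
print(dict4)
```

Here the empty lists disappear from both results, because both filters run against the unmodified inputs. The shared `['OS_TYPE', 'AIX']` pair stays in both, since this approach compares whole value lists rather than the pairs inside them.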
this looks good, will try it out, thank you!
I'm not sure if this is right since you never clarified what "duplicate" means in this particular case, but here's your band-aid.
dict1 = {"SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]], "SAP": [], "C11_RG": [], "W11_RG": []}
dict2 = {
    "SAG01112_SSAP_HA_LPM": [
        ["OS_TYPE", "AIX"],
        ["IP", "172.17.10.112"],
        ["IP", "10.111.160.119"],
        ["IP", "10.111.160.68"],
        ["IP", "10.111.160.66"],
        ["IP", "10.95.0.112"],
        ["IP", "10.111.162.119"],
    ],
    "SAP": [],
    "C11_RG": [],
    "W11_RG": [],
}


def remove_common_items(dict_a, dict_b):
    """Remove values that are common to a and b."""
    # For common keys
    for key in set(dict_a) & set(dict_b):
        a = dict_a[key]
        b = dict_b[key]
        # Remove items common to both value lists.
        for item in [x for x in a if x in b]:  # Make list of common before iterating
            a.remove(item)
            b.remove(item)


remove_common_items(dict1, dict2)
print(dict1)
print(dict2)
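
If modifying the inputs in place is not wanted, a non-mutating variant could look like the sketch below. The function name `without_common_items` is made up here, and it assumes that every occurrence of a shared item should be dropped from both sides, which is slightly different from the one-for-one removal above:

```python
def without_common_items(dict_a, dict_b):
    """Return copies of both dicts with items shared under the same key removed."""
    new_a = {k: [x for x in v if x not in dict_b.get(k, [])] for k, v in dict_a.items()}
    new_b = {k: [x for x in v if x not in dict_a.get(k, [])] for k, v in dict_b.items()}
    return new_a, new_b


# Shortened sample data in the same shape as the posted dictionaries.
sample_a = {"SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]], "SAP": []}
sample_b = {
    "SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IP", "172.17.10.112"], ["IP", "10.95.0.112"]],
    "SAP": [],
}

clean_a, clean_b = without_common_items(sample_a, sample_b)
print(clean_a)
print(clean_b)
```

This version also tolerates an item that appears twice in one list but only once in the other. With the in-place function, that case would raise ValueError, because the second `b.remove(item)` no longer finds anything to remove.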
I still think your dictionaries are messed up and should look like this: