Python Forum
remove duplicates from dicts with list values
#1
Hi,

I need to remove duplicates from dicts with list values. This weird structure comes from defaultdict, btw...

example:

dict1 = {'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']], 'SAP': [], 'C11_RG': [], 'W11_RG': []}
dict2 = {'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IP', '172.17.10.112'], ['IP', '10.111.160.119'], ['IP', '10.111.160.68'], ['IP', '10.111.160.66'], ['IP', '10.95.0.112'], ['IP', '10.111.162.119']], 'SAP': [], 'C11_RG': [], 'W11_RG': []}

dict1 = {k: v for k, v in dict1.items() if v not in dict2.values()}
dict2 = {k: v for k, v in dict2.items() if v not in dict1.values()}

print(dict1)
print(dict2)
Unfortunately, this does not work.

Output:
root@ssap: /tmp # /opt/freeware/bin/python3 bla.py
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']]}
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IP', '172.17.10.112'], ['IP', '10.111.160.119'], ['IP', '10.111.160.68'], ['IP', '10.111.160.66'], ['IP', '10.95.0.112'], ['IP', '10.111.162.119']], 'SAP': [], 'C11_RG': [], 'W11_RG': []}
duplicates remain...any better ideas?

wbr

chris
Reply
#2
Quote:this weird structure comes from defaultdict btw...
Should the first step be to change the way the dictionaries are structured? How would you like them to look? Do you have control of the code that generates the dictionary? If so, post the code with some example data and a description of what you want for output. defaultdict does what it is told to do, so odd dictionary structure is the fault of the code using defaultdict. Solve the real problem instead of putting a band-aid on it.

And can you define what you mean by "duplicate"? Since your dictionary has 1 key and 1 value, and the values in the two dictionaries are different, there are no duplicates.
Reply
#3
As I need the defaultdict functionality, the structure remains as it is... it still qualifies as weird.

About duplicates, you are right... only the values differ... so how do I remove duplicate values then?

I guess I can do something like this, but it does not look very efficient...

result = {}

for key,value in input_raw.items():
    if value not in result.values():
        result[key] = value

print(result)
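
The loop above scans `result.values()` for every key, so it is quadratic in the number of entries. A minimal sketch of a faster variant follows, assuming the values are nested lists of hashable items as in post #1. The helper names `freeze` and `drop_duplicate_values` are made up for this illustration; they are not from the original posts.

```python
def freeze(value):
    """Recursively turn lists into tuples so the value becomes hashable."""
    if isinstance(value, list):
        return tuple(freeze(item) for item in value)
    return value


def drop_duplicate_values(mapping):
    """Return a new dict that keeps only the first key seen for each distinct value."""
    seen = set()
    result = {}
    for key, value in mapping.items():
        marker = freeze(value)
        if marker not in seen:  # set lookup instead of scanning all values
            seen.add(marker)
            result[key] = value
    return result


input_raw = {
    "SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]],
    "SAP": [],
    "C11_RG": [],
    "W11_RG": [],
}
result = drop_duplicate_values(input_raw)
print(result)
```

With this data, only the first key holding an empty list survives; whether that is the desired meaning of "duplicate" is still an open question in this thread.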
Reply
#4
Could you please post the code that makes the dictionaries. There is no reason the result needs to look as it does. As I said, there is nothing in defaultdict that results in the dictionary you are getting, it is how your code is using the defaultdict that makes them that way. I don't see any way that you can blame this on a defaultdict.
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']]}
Should it be this?
{'SAG01112_SSAP_HA_LPM': {'OS_TYPE': 'AIX', 'IS_COBOL': '1'}}
Reply
#5
(May-23-2024, 05:25 PM)deanhystad Wrote: Could you please post the code that makes the dictionaries. There is no reason the result needs to look as it does. As I said, there is nothing in defaultdict that results in the dictionary you are getting, it is how your code is using the defaultdict that makes them that way. I don't see any way that you can blame this on a defaultdict.
{'SAG01112_SSAP_HA_LPM': [['OS_TYPE', 'AIX'], ['IS_COBOL', '1']]}
Should it be this?
{'SAG01112_SSAP_HA_LPM': {'OS_TYPE': 'AIX', 'IS_COBOL': '1'}}

Nothing special... the defaultdict is initialized with the list factory, and lists of gathered values are appended during script runtime.

my_dict = defaultdict(list)
A list is required because keys are not guaranteed to be unique.
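
For readers following along, here is a rough sketch of how a structure like the one in post #1 could arise from `defaultdict(list)`. The `records` tuples are invented for illustration, and the idea that the empty lists come from merely reading a missing key is an assumption; the actual generating code was never posted.

```python
from collections import defaultdict

# Hypothetical input records: (group, attribute, value). Not from the original script.
records = [
    ("SAG01112_SSAP_HA_LPM", "OS_TYPE", "AIX"),
    ("SAG01112_SSAP_HA_LPM", "IP", "172.17.10.112"),
    ("SAG01112_SSAP_HA_LPM", "IP", "10.111.160.119"),
]

my_dict = defaultdict(list)
for group, attribute, value in records:
    # Repeated attributes such as "IP" are kept because each pair is appended.
    my_dict[group].append([attribute, value])

# Looking up a missing key on a defaultdict creates an empty list entry.
my_dict["SAP"]

print(dict(my_dict))
```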
Reply
#6
Quote:A list is required because keys are not guaranteed to be unique.
You are not showing how you use the dictionary. The usage is what is making a funky dictionary.

If keys are not guaranteed to be unique, why are you using a dictionary at all?
Reply
#7
(May-23-2024, 07:27 PM)deanhystad Wrote:
Quote:A list is required because keys are not guaranteed to be unique.
You are not showing how you use the dictionary. The usage is what is making a funky dictionary.

If keys are not guaranteed to be unique, why are you using a dictionary at all?

What do you mean by "how you use the dict"? I am appending lists; what else do you need to know? I think I have elaborated enough on off-topic stuff, so is there some useful hint on how to remove the duplicate values or not?
Reply
#8
I think the problem is, you are modifying dict1, but still want to use it to modify dict2 afterwards.

Make new dictionaries, dict3 and dict4 and leave dict1 and dict2 untouched.

I assume you only want to exclude duplicate values, not duplicate keys.

This works for me:

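import string
from random import choice

# Uppercase letters, used as dictionary keys (was undefined in the first draft)
alph = string.ascii_uppercase

def makeme(num):
    # Random single-letter value wrapped in a list, keyed by the num-th letter
    value = [choice("ABCDEFGH")]
    key = alph[num]
    return (key, value)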


# keys are A,B,C,D,E
dict1 = {makeme(j)[0]:makeme(j)[1] for j in range(5)}
# keys are D,E,F,G,H
# keys D and E are also in dict1
dict2 = {makeme(j+3)[0]:makeme(j+3)[1] for j in range(5)}

# from this you can see dict3 must lose [F] 2 times and lose [E] 1 time
# dict1.values = dict_values([['F'], ['E'], ['G'], ['F'], ['G']])
# dict2.values = dict_values([['F'], ['H'], ['E'], ['A'], ['A']])

dict3 = {k: v for k, v in dict1.items() if v not in dict2.values()} # dict3 = {'C': ['G'], 'E': ['G']}

# from this you can see dict4 must lose [F] 1 time and lose [E] 1 time
# dict1.values = dict_values([['F'], ['E'], ['G'], ['F'], ['G']])
# dict2.values = dict_values([['F'], ['H'], ['E'], ['A'], ['A']])

dict4 = {k: v for k, v in dict2.items() if v not in dict1.values()} # dict4 = {'E': ['H'], 'G': ['A'], 'H': ['A']}
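
As a rough check, the same idea can be applied to the data from post #1. This is only a sketch; the names `values1` and `values2` are introduced here to snapshot the original values before any filtering happens.

```python
dict1 = {
    "SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]],
    "SAP": [], "C11_RG": [], "W11_RG": [],
}
dict2 = {
    "SAG01112_SSAP_HA_LPM": [
        ["OS_TYPE", "AIX"], ["IP", "172.17.10.112"], ["IP", "10.111.160.119"],
        ["IP", "10.111.160.68"], ["IP", "10.111.160.66"], ["IP", "10.95.0.112"],
        ["IP", "10.111.162.119"],
    ],
    "SAP": [], "C11_RG": [], "W11_RG": [],
}

# Snapshot both value collections first, so neither filter sees a modified dict.
values1 = list(dict1.values())
values2 = list(dict2.values())

dict3 = {k: v for k, v in dict1.items() if v not in values2}
dict4 = {k: v for k, v in dict2.items() if v not in values1}

print(dict3)
print(dict4)
```

Both results now lose the empty-list entries. Note that whole value lists are compared, so the shared ['OS_TYPE', 'AIX'] pair stays in both, because the two lists that contain it are not equal as a whole.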
Reply
#9
this looks good, will try it out, thank you!
Reply
#10
I'm not sure if this is right since you never clarified what "duplicate" means in this particular case, but here's your band-aid.
dict1 = {"SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]], "SAP": [], "C11_RG": [], "W11_RG": []}
dict2 = {
    "SAG01112_SSAP_HA_LPM": [
        ["OS_TYPE", "AIX"],
        ["IP", "172.17.10.112"],
        ["IP", "10.111.160.119"],
        ["IP", "10.111.160.68"],
        ["IP", "10.111.160.66"],
        ["IP", "10.95.0.112"],
        ["IP", "10.111.162.119"],
    ],
    "SAP": [],
    "C11_RG": [],
    "W11_RG": [],
}


def remove_common_items(dict_a, dict_b):
    """Remove values that are common to a and b."""
    # For common keys
    for key in set(dict_a) & set(dict_b):
        a = dict_a[key]
        b = dict_b[key]
        # Remove items common to both value lists.
        for item in [x for x in a if x in b]:  # Make list of common before iterating
            a.remove(item)
            b.remove(item)


remove_common_items(dict1, dict2)
print(dict1)
print(dict2)
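
If the original dictionaries should stay untouched, a non-mutating variant might look like the sketch below. The function name `without_common_items` is made up for this example, and it drops every occurrence of a shared item, which differs slightly from the one-at-a-time `remove()` calls above when a list contains repeats.

```python
def without_common_items(dict_a, dict_b):
    """Return copies of both dicts with items shared under the same key removed."""
    new_a = {key: list(items) for key, items in dict_a.items()}
    new_b = {key: list(items) for key, items in dict_b.items()}
    for key in new_a.keys() & new_b.keys():
        common = [item for item in new_a[key] if item in new_b[key]]
        new_a[key] = [item for item in new_a[key] if item not in common]
        new_b[key] = [item for item in new_b[key] if item not in common]
    return new_a, new_b


first = {"SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IS_COBOL", "1"]], "SAP": []}
second = {"SAG01112_SSAP_HA_LPM": [["OS_TYPE", "AIX"], ["IP", "10.95.0.112"]], "SAP": []}

clean_first, clean_second = without_common_items(first, second)
print(clean_first)
print(clean_second)
```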
I still think your dictionaries are messed up and should look like this:
Reply

