Python Forum

Full Version: Python find the minimum length of string to differentiate dictionary items
dTeamNames = {'A': 'Man City', 'B': 'Man United', 'C': 'West Brom', 'D': 'West Ham' }
Hello:
I have a dictionary that maps a key to a team name, as in the example above.
Now I want to create another dictionary that uses the same keys as dTeamNames, but whose values are the shortest prefixes of the original team names that still tell the teams apart.
For example, from the dTeamNames above I want to get the following new dictionary:
dShortNames = {'A': 'Man Ci', 'B': 'Man Un', 'C': 'West B', 'D': 'West H'}
The rule is: use the minimum prefix length that differentiates all of the team names.
For “Man City” and “Man United”, the minimum length is 5, so “Man C” and “Man U” are enough to differentiate those two names; however, for “West Brom” and “West Ham”, the minimum length is 6, so “West B” and “West H” are needed. The length that works for every name is therefore 6.
I want to write a function that finds the minimum length of string needed to differentiate all the team names in the dictionary, and creates the new dictionary.
Thanks for any advice.
John
If you sort the names, it suffices to differentiate each pair of consecutive names.
Hello:
I don't quite understand what you mean.
Please show me your code if you know how to do this.
As I am rather new to Python programming, I can't figure this out by myself yet.
Thanks,
Here is a solution using the itertools module:
import itertools as it

def index_two(a, b):
    '''Return the first index where the two words differ.
    If they don't differ, return the length of the shorter word.'''
    # Pair up the characters with their index and skip the matching prefix.
    return next(
        it.dropwhile(lambda t: t[0] == t[1], zip(a, b, it.count(0))),
        (None, None, min(len(a), len(b))))[2]

def index_many(words):
    '''Return the largest first-difference index over all pairs of words,
    so that prefixes of length n + 1 tell the words apart.
    Identical words are considered to differ on the next character.'''
    seq = sorted(words)
    if not seq:
        return 0
    # After sorting, only neighbouring words need to be compared.
    a, b = it.tee(seq)
    next(b, None)
    return max((index_two(*t) for t in zip(a, b)), default=0)


def shorten_names(dic):
    '''Truncate every value to the common length found by index_many.'''
    n = index_many(dic.values())
    return {k: v[:1+n] for k, v in dic.items()}
    
dTeamNames = {'A': 'Man City', 'B': 'Man United', 'C': 'West Brom', 'D': 'West Ham'}
dShortNames = shorten_names(dTeamNames)
print(dShortNames)
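
For readers newer to Python, the same idea (sort the names, then compare only neighbours) can also be written with a plain loop. This is only a sketch, not a replacement for the code above: the names min_prefix_length and shorten_names_loop are made up for illustration, and it uses os.path.commonprefix from the standard library, which compares strings character by character, to measure the shared prefix.

```python
import os


def min_prefix_length(names):
    """Length of the shortest prefix that keeps all distinct names apart."""
    ordered = sorted(names)
    longest_shared = 0
    # In sorted order, the name sharing the longest prefix with any given
    # name is one of its neighbours, so adjacent pairs are enough.
    for first, second in zip(ordered, ordered[1:]):
        shared = len(os.path.commonprefix([first, second]))
        longest_shared = max(longest_shared, shared)
    # One character past the longest shared prefix separates every pair.
    return longest_shared + 1


def shorten_names_loop(names_by_key):
    """Hypothetical helper: truncate every value to that common length."""
    n = min_prefix_length(names_by_key.values())
    return {key: name[:n] for key, name in names_by_key.items()}


dTeamNames = {'A': 'Man City', 'B': 'Man United', 'C': 'West Brom', 'D': 'West Ham'}
print(shorten_names_loop(dTeamNames))
# {'A': 'Man Ci', 'B': 'Man Un', 'C': 'West B', 'D': 'West H'}
```

As with the itertools version, two identical names can never be told apart, and a name that is a complete prefix of another (say 'Man' and 'Man City') is simply kept whole.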