May-07-2021, 06:59 PM
How do you guys feel about a single-pass version? As we look for the longest item, we can yield items that we already know aren't that long. The performance can be dramatically improved if the order of the items doesn't matter (small items can be yielded immediately).
def remove_long_words_gen(items):
    max_len = 0
    cache = []
    for item in items:
        size = len(item)
        if size > max_len:
            # this item is now the longest item
            max_len = size
            # anything previously seen can be yielded and purged from the cache
            yield from (cached for _, cached in cache)
            cache = [(size, item)]
        else:
            # this item might be smaller than the max item,
            # but cannot be yielded yet to preserve the collection's order
            cache.append((size, item))
    # and now cleanup
    for size, item in cache:
        if size != max_len:
            yield item

def remove_long_words(items):
    return list(remove_long_words_gen(items))

words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']
actual = remove_long_words(words_list)
expected = ['fish', 'barrel', 'like', 'sand', 'bank']
assert actual == expected

It is slower, though, at least for such a small sample size:
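If the order really doesn't matter, the cache only ever needs to hold the items tied for the current maximum length; everything strictly shorter can be yielded the moment it's seen. A rough sketch of that variant (the name remove_long_words_unordered_gen is mine, not from the thread):

```python
def remove_long_words_unordered_gen(items):
    # Sketch: when output order is irrelevant, only the current
    # record holders need caching; shorter items are emitted at once.
    max_len = 0
    longest = []  # items tied for the current maximum length
    for item in items:
        size = len(item)
        if size > max_len:
            # the previous record holders are no longer the longest
            yield from longest
            max_len = size
            longest = [item]
        elif size == max_len:
            longest.append(item)  # might still be yielded later
        else:
            yield item  # strictly shorter: safe to emit immediately
    # whatever is left in `longest` is the maximum, so it is dropped

words = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']
print(sorted(remove_long_words_unordered_gen(words)))
# → ['bank', 'barrel', 'fish', 'like', 'sand']
```

The worst-case memory is now bounded by the number of ties for the maximum, rather than by the distance to the next record-breaking item.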
import timeit

def remove_long_words_nilamo(items):
    return list(remove_long_words_gen(items))

def remove_long_words_perfringo(list_):
    max_lenght = max(len(word) for word in list_)
    return [word for word in list_ if len(word) < max_lenght]

words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']

print("single-pass:", timeit.timeit("remove_long_words_nilamo(words_list)", globals=globals()))
print("two-pass:", timeit.timeit("remove_long_words_perfringo(words_list)", globals=globals()))
Output:
single-pass: 3.5252245
two-pass: 2.3057698