Python Forum

Full Version: Removing all strings in a list that are of x length
Hey all,

One of my revision exercises involves writing a function that removes every string of the greatest length from a list.

Expected Output:

words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']
print(remove_long_words(words_list))
['fish', 'barrel', 'like', 'sand', 'bank']
Code so far:

def remove_long_words(words_list):
    length_long = get_longest_string_length(words_list)

    for ele in words_list:
        if len(ele) == length_long:
            #???
            words_list.pop(???)

    return words_list
I first wrote a function that returns the length of the longest string in the list. Then I used a for loop to iterate over every element, with an if statement that checks whether the element's length equals that longest length. I'm stuck from there: how do I use the .pop method to remove the right elements from the list?

Do I have to convert the list to a string and then use .find to get the index position of the element that meets the required length? And how would I make it find all occurrences, not just the first one?
See [Basic] Various Python Gotchas, specifically the part titled "Modifying a list (or other container) while iterating over it".
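To make the gotcha concrete, here is a minimal sketch (not taken from the linked thread). The function names remove_longest_buggy and remove_longest_in_place are made up for illustration. The first one mutates the list while looping over it and skips an element. The second one walks the indices backwards, so .pop(i) never shifts an element that has not been checked yet.

```python
def remove_longest_buggy(words):
    # Hypothetical example of the gotcha: removing while iterating
    longest = max(len(w) for w in words)
    for w in words:
        if len(w) == longest:
            words.remove(w)  # shifts later items left, so the next one is skipped
    return words


def remove_longest_in_place(words):
    # Hypothetical fix: iterate over indices from the end towards the start
    if not words:
        return words
    longest = max(len(w) for w in words)
    for i in range(len(words) - 1, -1, -1):
        if len(words[i]) == longest:
            words.pop(i)  # only items after index i move, and those are already checked
    return words


print(remove_longest_buggy(['aaa', 'bbb', 'c']))     # ['bbb', 'c'] ('bbb' was skipped)
print(remove_longest_in_place(['aaa', 'bbb', 'c']))  # ['c']
```

If you really want .pop, the reverse index loop is one way to do it, but building a new list is usually simpler.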
As I see it, there are two subtasks: find the length of the longest word, and return a list with the longest word(s) eliminated. As yoriz already pointed out, don't mutate a list while iterating over it. A list comprehension creates a new object and is well suited to the task at hand:

words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']

def remove_long_words(list_):
    max_length = max(len(word) for word in list_)
    return [word for word in list_ if len(word) < max_length]

assert remove_long_words(words_list) == ['fish', 'barrel', 'like', 'sand', 'bank']
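Since the comprehension builds a new list, the original words_list is left untouched. If the caller's list has to change (as in the original question), one option is slice assignment. This is only a sketch, and remove_long_words_in_place is a made-up name:

```python
def remove_long_words_in_place(list_):
    # Hypothetical variant: same filtering, but the result overwrites the caller's list
    max_length = max(len(word) for word in list_)
    list_[:] = [word for word in list_ if len(word) < max_length]


words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']
alias = words_list  # a second name for the same list object
remove_long_words_in_place(words_list)
print(alias)  # ['fish', 'barrel', 'like', 'sand', 'bank']
```

The comprehension finishes before anything is assigned, so nothing is mutated during iteration.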
How do you guys feel about a single-pass version? As we look for the longest item, we can yield items that we already know aren't that long. The performance can be dramatically improved if the order of the items doesn't matter (small items can be yielded immediately).

def remove_long_words_gen(items):
    max_len = 0
    cache = []
    for item in items:
        size = len(item)
        if size > max_len:
            # this item is now the longest item
            max_len = size
            # anything previously seen can be yielded and purged from the cache
            yield from (cached for _, cached in cache)
            cache = [(size, item)]
        else:
            # this item might be smaller than the max item,
            # but cannot be yielded yet to preserve the collection's order
            cache.append((size, item))

    # and now cleanup
    for size, item in cache:
        if size != max_len:
            yield item


def remove_long_words(items):
    return list(remove_long_words_gen(items))


words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']
actual = remove_long_words(words_list)
expected = ['fish', 'barrel', 'like', 'sand', 'bank']
assert actual == expected
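For the case where order doesn't matter, a rough sketch could look like the following. The name remove_long_words_unordered is made up here. Shorter items are yielded right away, and only the items tied for the current maximum are held back.

```python
def remove_long_words_unordered(items):
    # Hypothetical unordered variant of the single-pass idea
    max_len = 0
    held = []  # items tied for the longest length seen so far
    for item in items:
        size = len(item)
        if size > max_len:
            # everything held so far is now known to be shorter than the max
            yield from held
            held = [item]
            max_len = size
        elif size == max_len:
            held.append(item)
        else:
            # strictly shorter than the current max, so it is safe to yield now
            yield item
    # whatever is still held has the maximum length and is dropped


words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']
result = list(remove_long_words_unordered(words_list))
print(result)  # ['fish', 'like', 'barrel', 'sand', 'bank']
```

Note that the output order differs from the input order ('like' comes out before 'barrel'), which is the trade-off for not caching the short items.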
It is slower, though, at least for such a small sample size:
import timeit

# remove_long_words_gen is the generator defined above


def remove_long_words_nilamo(items):
    return list(remove_long_words_gen(items))


def remove_long_words_perfringo(list_):
    max_length = max(len(word) for word in list_)
    return [word for word in list_ if len(word) < max_length]


words_list = ['fish', 'barrel', 'like', 'shooting', 'sand', 'bank']
print("single-pass:",
      timeit.timeit("remove_long_words_nilamo(words_list)", globals=globals()))
print("two-pass:",
      timeit.timeit("remove_long_words_perfringo(words_list)", globals=globals()))
Output:
single-pass: 3.5252245
two-pass: 2.3057698
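A six-word list mostly measures call overhead, so it may be worth repeating the comparison on a bigger input. Below is a self-contained sketch. The 10,000-word list and number=20 are arbitrary choices, and single_pass and two_pass are stand-in names for the two functions above. No timings are shown because they depend on the machine.

```python
import timeit


def remove_long_words_gen(items):
    # same single-pass generator as earlier in the thread
    max_len = 0
    cache = []
    for item in items:
        size = len(item)
        if size > max_len:
            max_len = size
            yield from (cached for _, cached in cache)
            cache = [(size, item)]
        else:
            cache.append((size, item))
    for size, item in cache:
        if size != max_len:
            yield item


def single_pass(items):
    return list(remove_long_words_gen(items))


def two_pass(items):
    max_length = max(len(word) for word in items)
    return [word for word in items if len(word) < max_length]


# Assumed test data: 10,000 strings whose lengths cycle through 1..50
big_list = ["x" * (i % 50 + 1) for i in range(10_000)]

# sanity check: both approaches should agree before timing them
assert single_pass(big_list) == two_pass(big_list)

for name, func in (("single-pass", single_pass), ("two-pass", two_pass)):
    seconds = timeit.timeit(lambda: func(big_list), number=20)
    print(f"{name}: {seconds:.4f}")
```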
This version is fast and short:
my_list = ['apple','pear','banana','orange','shrimp','plum']
biggest = len(max(my_list, key=len))
my_list1 = [x for x in my_list if len(x) != biggest]
print(my_list1)
Output:
['apple', 'pear', 'plum']
biggest is the length of the longest item in my_list. I then use a list comprehension to eliminate items of that length. Banana, orange, and shrimp are all 6 letters long.
This is my code.
Example 1:
fruit = ['apple', 'pear', 'banana', 'orange', 'shrimp', 'plum']
fruit = [x for x in fruit if len(x) != sorted([len(i) for i in fruit])[-1]]
print(fruit)
Output:
['apple', 'pear', 'plum']
Example 2:
fruit = [] # empty list
fruit = [x for x in fruit if len(x) != sorted([len(i) for i in fruit])[-1]]
print(fruit)
Output:
[]
With an empty list (my_list = []), the following code raises an exception:
my_list = []
biggest = len(max(my_list, key=len)) # avoid a bare max() here: it raises ValueError on an empty sequence
my_list1 = [x for x in my_list if len(x) != biggest]
print(my_list1)
Error:
biggest = len(max(my_list, key=len))
ValueError: max() arg is an empty sequence
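One way to keep max() and still cope with an empty list is its default keyword argument (available since Python 3.4). This is a small sketch, with remove_longest as a made-up function name:

```python
def remove_longest(words):
    # default=0 is returned when the sequence is empty, so no ValueError is raised
    biggest = max((len(w) for w in words), default=0)
    return [w for w in words if len(w) != biggest]


print(remove_longest([]))
# []
print(remove_longest(['apple', 'pear', 'banana', 'orange', 'shrimp', 'plum']))
# ['apple', 'pear', 'plum']
```

This computes the maximum once, rather than sorting the lengths again for every element.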