Python Forum

Removing items from list if containing a substring
I'm trying to identify a file within a directory by removing list items that contain an unwanted substring. When the script finishes running, list2 contains only 'test.test1-test-sample.mp4' when it's expected to contain 'test.test-test.mp4'. Any help with where my mistake is would be greatly appreciated.

Files in directory: test.test-test.mp4 (intended target file), test.test1-test-sample.mp4, test.test2-test-Sample.mp4

This is not the entire script, just the relevant part. Please ignore the irrelevant imports.
import tmdbsimple as tmdb
import requests
import locale
import os
import subprocess
import shlex
import json
tmdb.API_KEY = ''
api_key = ''


user_input = input("Enter a video file location: ")
dest = input("Enter a destination directory: ")

list1 = []
list2 = []
    
if os.path.isfile(user_input):
    list1.append(os.path.basename(user_input))
    base1 = os.path.splitext(list1[0])
    basename = base1[0]
else:
    list1.append(os.path.basename(user_input))
    basename = list1[0]
    for file in os.scandir(user_input):
        if file.name.endswith(('.mp4', '.mkv', '.avi')):
            list2.append(file.name)
    for j in list2:
        if "-sample" or "-Sample" in j:
            list2.remove(j)
    #target_file = os.path.join(user_input, list2[0])

#print(list1)                
print(list2)
#print(target_file)
You have a couple of problems. The most serious is the condition in your removal loop (line 29 of the snippet), if "-sample" or "-Sample" in j. By operator precedence, it is parsed as the following:

        if ("-sample") or ("-Sample" in j):
Because "-sample" is a non-empty string, it is always truthy, so this branch is always taken. If it's always taken, why isn't everything removed?
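A minimal sketch of the difference, using one of the filenames from the question as a stand-in. The variable names here are only for illustration and are not part of the original script.

```python
# Illustrative filename taken from the question; it contains neither substring.
name = "test.test-test.mp4"

# Parsed as ("-sample") or ("-Sample" in name). The first operand is a
# non-empty string, so the expression is truthy for any name at all.
buggy = bool("-sample" or "-Sample" in name)

# Each substring needs its own membership test.
fixed = "-sample" in name or "-Sample" in name

# Equivalent check with any() over the unwanted substrings.
fixed_any = any(s in name for s in ("-sample", "-Sample"))

print(buggy, fixed, fixed_any)  # True False False
```

The any() form is just one way to write it; it scales a little better if more substrings need to be excluded later.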

Your second issue is removing elements from a list while you're iterating over it. Each removal shifts the later elements down one position, so the loop skips the element right after the one it just removed.

This loop should remove 2, 3, and 4 from the list, but 3 survives because it slides into a position the iterator has already passed.
>>> l = list(range(6))
>>> for i in l:
...   if 2 <= i <= 4:
...     l.remove(i)
...
>>> print(l)
[0, 1, 3, 5]
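The same skipping would explain the result in the question. Here is a rough sketch, assuming the directory listing came back in the order shown (os.scandir() makes no promise about order, so this ordering is an assumption). Because the buggy condition is always true, the loop below removes every item it visits.

```python
# Assumed listing order; os.scandir() yields entries in arbitrary order.
files = [
    "test.test-test.mp4",
    "test.test1-test-sample.mp4",
    "test.test2-test-Sample.mp4",
]

for position, name in enumerate(files):
    # Stand-in for the always-true condition: every visited item is removed.
    print(f"position {position}: removing {name}")
    files.remove(name)

# Position 0 removes the target file, position 1 then lands on the
# '-Sample' file, and the '-sample' file is never visited.
print(files)  # ['test.test1-test-sample.mp4']
```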
While there are lots of ways around this (make a copy, use a comprehension to copy the bits you do want), an easier method here is to roll your exclusion into the previous loop. You don't have to remove something that you never stored in the first place.

    for file in os.scandir(user_input):
        if "-sample" in file.name or "-Sample" in file.name:
            continue
        if file.name.endswith(('.mp4', '.mkv', '.avi')):
            list2.append(file.name)
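For reference, the other two workarounds mentioned above (iterating over a copy, or building a new list with a comprehension) might look roughly like this. The list contents and variable names are placeholders for illustration.

```python
# Placeholder data matching the filenames from the question.
names = [
    "test.test-test.mp4",
    "test.test1-test-sample.mp4",
    "test.test2-test-Sample.mp4",
]
unwanted = ("-sample", "-Sample")

# Option 1: loop over a shallow copy so removals don't disturb the iteration.
by_copy = list(names)
for name in by_copy[:]:
    if any(s in name for s in unwanted):
        by_copy.remove(name)

# Option 2: build a new list containing only the names to keep.
by_comprehension = [
    name for name in names
    if not any(s in name for s in unwanted)
]

print(by_copy)           # ['test.test-test.mp4']
print(by_comprehension)  # ['test.test-test.mp4']
```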
Thanks for the thorough explanation. The if logic mistake makes sense. I'll read more into remove() so I understand this better. I also forgot continue was a thing. Ugh. Pressing on!

Your solution is simple and straightforward. Thanks again!