Python Forum
Removing items from list if containing a substring
#1
I'm trying to identify a file within a directory by removing list items that contain an unwanted substring. When the script finishes running, list2 contains only 'test.test1-test-sample.mp4' when it is expected to contain 'test.test-test.mp4'. Any help with where my mistake is would be greatly appreciated.

Files in directory: test.test-test.mp4 (intended target file), test.test1-test-sample.mp4, test.test2-test-Sample.mp4

This is not the entire script, just the relevant part. Please ignore the irrelevant imports.
import tmdbsimple as tmdb
import requests
import locale
import os
import subprocess
import shlex
import json
tmdb.API_KEY = ''
api_key = ''


user_input = input("Enter a video file location: ")
dest = input("Enter a destination directory: ")

list1 = []
list2 = []
    
if os.path.isfile(user_input):
    list1.append(os.path.basename(user_input))
    base1 = os.path.splitext(list1[0])
    basename = base1[0]
else:
    list1.append(os.path.basename(user_input))
    basename = list1[0]
    for file in os.scandir(user_input):
        if file.name.endswith(('.mp4', '.mkv', '.avi')):
            list2.append(file.name)
    for j in list2:
        if "-sample" or "-Sample" in j:
            list2.remove(j)
    #target_file = os.path.join(user_input, list2[0])

#print(list1)                
print(list2)
#print(target_file)
#2
You have a couple of problems. The most serious is the condition on line 29 (if "-sample" or "-Sample" in j:). By operator precedence, it is parsed as follows:

        if ("-sample") or ("-Sample" in j):
As "-sample" is a non-empty string, it is always truthy, so this branch is always taken. If it's always taken, why is everything not removed?
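A quick way to check this is to evaluate both forms of the condition against a file name that should be kept. This is only a sketch; the variable names below are made up for illustration.

```python
# Hypothetical file name that contains neither unwanted substring.
name = "test.test-test.mp4"

# The original condition: `or` returns its first truthy operand, so the
# result is the string "-sample" no matter what `name` is.
original = "-sample" or "-Sample" in name
print(repr(original))

# The intended condition: test each substring against the name.
intended = "-sample" in name or "-Sample" in name
print(intended)
```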

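Your second issue is removing elements from a list while you're iterating over it. The loop tracks its position by index, so each removal shifts the later elements left by one and the element that slides into the current position is skipped.

This should remove 2, 3, and 4 from the list, but 3 survives because it moves into the slot the loop has already visited.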
>>> l = list(range(6))
>>> for i in l:
...   if 2 <= i <= 4:
...     l.remove(i)
...
>>> print(l)
[0, 1, 3, 5]
While there are lots of ways around this (make a copy, use a comprehension to copy the bits you do want), an easier method here is to roll your exclusion into the previous loop. You don't have to remove something that you never stored in the first place.

    for file in os.scandir(user_input):
        if "-sample" in file.name or "-Sample" in file.name:
            continue
        if file.name.endswith(('.mp4', '.mkv', '.avi')):
            list2.append(file.name)
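For completeness, the other workarounds mentioned above (iterating over a copy, or building a new list with a comprehension) might look like the sketch below. The file names come from the original post; the helper name is_sample is an assumption made for this example, not something from the script.

```python
def is_sample(name):
    """Return True if the file name contains an unwanted substring."""
    return "-sample" in name or "-Sample" in name

files = [
    "test.test-test.mp4",
    "test.test1-test-sample.mp4",
    "test.test2-test-Sample.mp4",
]

# Option 1: iterate over a shallow copy so that removing items from the
# original list does not disturb the loop.
list2 = list(files)
for name in list2[:]:
    if is_sample(name):
        list2.remove(name)

# Option 2: build a new list holding only the names to keep.
kept = [name for name in files if not is_sample(name)]

print(list2)
print(kept)
```

Either option leaves only the intended target file. If the match should ignore case entirely, testing "-sample" in name.lower() would cover both spellings with one check.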
#3
Thanks for the thorough explanation. The if logic mistake makes sense. I'll read more into remove() so I understand this better. I also forgot continue was a thing. Ugh. Pressing on!

Your solution is simple and straightforward. Thanks again!