There is no way that code works correctly. It adds a hanger any time a frame is different from start_frame, which is the opposite of a hanger. This can be seen below, where the function is called using the frame hash list ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']. Since the hash strings are all different, there should be no hangers. Instead, the function identifies every frame except the last as a hanger.
print(detect_hangers(range(10)))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
I think this is what you want to do:
from itertools import zip_longest

def difference_count(a: str, b: str) -> int:
    """Count differences between a and b"""
    # zip_longest pads the shorter string, so surplus characters count as differences
    return sum(1 for x, y in zip_longest(a, b) if x != y)

def detect_hangers(frame_hash_list, threshold: int = 10, min_count: int = 1):
    """Return list of "hangers" detected in frame_hash_list.

    A "hanger" is consecutive frames that are the same.

    frame_hash_list : list of frame hash strings.  Frames are considered
        same or different by counting the differences in their hash strings.
    threshold : Maximum number of differences allowed for two frames to be
        considered "same".
    min_count : Minimum length of a hanger.  Short hangers aren't noticeable
        and don't have to be removed.
    """
    if not frame_hash_list:
        return []
    hangers = []  # List of hanger (start, stop) frame indexes
    start_index = 0
    index = 0  # Keeps the final check valid when there is only one frame
    start_frame = frame_hash_list[0]
    for index, frame in enumerate(frame_hash_list[1:], start=1):
        # Are frame and start_frame dissimilar enough?
        if difference_count(start_frame, frame) > threshold:
            if index - start_index >= min_count:
                # Add hanger to list
                hangers.append((start_index, index - 1))
            start_frame = frame
            start_index = index
    # Check if we end with a hanger (the last frame is included, hence the + 1)
    if index - start_index + 1 >= min_count:
        hangers.append((start_index, index))
    return hangers
# Make some frame hashes by taking consecutive slices from this string
stuff = "aaaaabcccccccccccccccdefghijklmnopqrstufwxyzzzzzzzzzzzzzzzzzzz"
frames = [stuff[i:i+10] for i in range(len(stuff)-10)]
for hanger in detect_hangers(frames, 4, 5):
    start, end = hanger
    print(start, end, frames[start], frames[end])
4 13 abcccccccc ccccccccde
39 51 fwxyzzzzzz zzzzzzzzzz
My film had two hangers: 4:13 and 39:51. These are the only two places in my "film" where there are 5 or more consecutive frames that each differ from the starting frame of the run in at most 4 positions.
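To see why the first hanger stops at frame 13, it can help to run difference_count on its own. The snippet below is only a sketch: it repeats the helper so it runs standalone, and the three strings are frames[4], frames[13] and frames[14] from the demo above, copied by hand. The variable names hanger_start, hanger_end and next_frame are made up for this illustration.

```python
from itertools import zip_longest

def difference_count(a: str, b: str) -> int:
    """Count the positions at which a and b differ."""
    # zip_longest pads the shorter string with None,
    # so any surplus characters count as differences.
    return sum(1 for x, y in zip_longest(a, b) if x != y)

hanger_start = "abcccccccc"  # frames[4], first frame of the hanger
hanger_end = "ccccccccde"    # frames[13], last frame still within the threshold
next_frame = "cccccccdef"    # frames[14], first frame past the threshold

print(difference_count(hanger_start, hanger_end))  # 4
print(difference_count(hanger_start, next_frame))  # 5
print(difference_count("abc", "abcde"))            # 2
```

With a threshold of 4, frame 13 still counts as "same" as frame 4 (4 differences), while frame 14 does not (5 differences), so the hanger is reported as 4:13. The last line shows that hashes of unequal length are handled too, since each extra character counts as one difference.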