Python Forum
detect equal sequences in list
#11
Hi,

In line 36 I added "reversed" so that the removal works correctly:

"for i in reversed(start_end):"

Now hangers that are too short (fewer than 10 frames, line 37) are removed from the list.
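
A minimal sketch of why iterating in reverse matters when deleting by index: removing the last range first keeps the indexes of the earlier ranges valid. The name remove_ranges and the sample data are hypothetical, and the ranges are assumed to be sorted in ascending order and non-overlapping.

```python
def remove_ranges(items, ranges):
    """Return a copy of items with every inclusive (start, end) range removed."""
    result = list(items)
    # Delete the last range first so the earlier indexes stay valid
    for start, end in reversed(ranges):
        del result[start:end + 1]
    return result

frames = list(range(20))
print(remove_ranges(frames, [(2, 4), (10, 12)]))
```

Deleting in forward order would shift every later range by the number of items already removed.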

In lines 30 and 31 I calculate the Hamming distance and set how sensitive the detection is.

What am I doing wrong with the Hamming distance and with the removal of non-hangers from the list "hangers"?

The program doesn't correctly detect hangers in a recorded film that contains hangers...

Any help is much appreciated!!

Greetings, flash77
#12
I think the detect_hangers function is wrong. I think it should look like this:
def detect_hangers(frame_hash_list):
    hangers = []  # List of hanger (start, stop) frame indexes
    start_index = 0
    start_frame = list(frame_hash_list[0])
    for index, frame in enumerate(frame_hash_list):
        # Are frame and start_frame dissimilar enough?
        if distance.hamming(list(frame), start_frame) >= 10:
            # Are there enough similar frames for hanger removal?
            if index - start_index > 10:
                # Add hanger to list
                hangers.append((start_index, index - 1))
            # Start a new run at the current frame
            start_frame = list(frame)
            start_index = index

    # Check if we end with a hanger
    if index - start_index > 10:
        hangers.append((start_index, index))
    return hangers
Where can I find more information about the distance.hamming function?
#13
Dear deanhystad,

Thanks a lot for your very good answer!!

I will try your solution...

You asked where to find information about the distance.hamming function.

To see what is happening inside, I will try your solution with the function below.

Here is what I know/found about the Hamming distance:

It counts (in the variable "count") the positions at which one string differs from the other.
The two strings have to be of the same length.
The higher the count, the more the strings differ.
The Hamming distance is a measure of how much two strings differ.
It can be used for strings, binary values...
In this case the two strings to be compared are the hash strings of two images.

def hdist(str1, str2):
    # Count the positions at which str1 and str2 differ
    i = 0
    count = 0
    while i < len(str1):
        if str1[i] != str2[i]:
            count += 1
        i += 1
    return count

str1 = "testa"
str2 = "testb"

print(hdist(str1, str2))
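
The loop above relies on both strings having the same length but does not check it. A minimal sketch that enforces the precondition follows; hamming_strict is a hypothetical name, not a library function.

```python
def hamming_strict(s1: str, s2: str) -> int:
    """Count differing positions; refuse strings of unequal length."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    # zip pairs up the characters at the same position
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

print(hamming_strict("testa", "testb"))  # 1
```

Raising an error is one possible design choice; counting the missing positions as differences would be another.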
Because I'm currently tied up with work, it will take some time until I have implemented your solution.

Greetings,

flash77
#14
I don't think that list(frame) makes sense. What kind of objects are in frame_hash_list?
#15
Hello,

In lines 10 and 11, frame_hash_list contains strings (hash strings such as "abc1c2...").
They come from the frame comparison.

import os

import imagehash
from PIL import Image

def create_phash():
    frame_hash_list = []
    p = "D:/S8_hanger_finder/neuer_Ansatz/aktueller_Versuch/phash_test/"
    obj = os.scandir(p)
    for entry in obj:
        # load frames
        frame = Image.open(p + str(entry.name))
        # create pHash
        # Compare hashes to determine whether the frames are the same or not
        frame_phash = str(imagehash.phash(frame))
        frame_hash_list.append(frame_phash)
    obj.close()
    return frame_hash_list
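
Because the list holds hash strings, a character-by-character comparison counts differing characters rather than differing bits. A minimal sketch of a bit-level comparison, assuming the strings are hexadecimal and of equal length (bit_hamming is a hypothetical helper, not part of imagehash):

```python
def bit_hamming(hex_a: str, hex_b: str) -> int:
    """Count the differing bits between two equal-length hex strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hash strings must have the same length")
    # XOR leaves a 1 bit wherever the two values differ
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

print(bit_hamming("ff00", "ff01"))  # 1: one bit differs
print(bit_hamming("ff00", "ff0f"))  # 4: one hex digit differs, but in four bits
```

Both calls differ in exactly one hex digit, yet the bit counts are 1 and 4, so the two measures call for different thresholds.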
#16
Hi,

this worked for me:

def detect_hangers(frame_hash_list):
    hangers = []  # List of hanger start, stop frame indexes
    start_index = 0
    start_frame = frame_hash_list[0]
    for index, frame in enumerate(frame_hash_list):
        # Are frame and start_frame dissimilar enough?
        i = 0
        count = 0
        while (i < len(frame)):
            if (start_frame[i] != frame[i]):
                count += 1
            i += 1
        if count > 0:
            # Add hanger to list
            hangers.append((start_index, index - 1))
            start_frame = frame
            start_index = index
    # Check if we end with a hanger
    if index - start_index > 10:
        hangers.append([start_index, index])
    return hangers
#17
There is no way that code works correctly. It adds a hanger any time a frame differs from start_frame, which is the opposite of a hanger. This is seen below, where the function is called with the frame hash list ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']. Since the hash strings are all different, there should be no hangers. Instead, the function identifies every frame except the last as a hanger.
print(detect_hangers([str(i) for i in range(10)]))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]
I think this is what you want to do:
from itertools import zip_longest

def difference_count(a: str, b: str) -> int:
    """Count the positions at which a and b differ."""
    return sum(1 for x, y in zip_longest(a, b) if x != y)

def detect_hangers(frame_hash_list, threshold:int = 10, min_count:int = 1):
    """Return list of "hangers" detected in frame_hash_list.
    A "hanger" is consecutive frames that are the same.

    frame_hash_list : list of frame hash strings.  Frames are considered
    same or different by counting the differences in their hash strings.
    
    threshold : Maximum number of differences allowed for two frames to be
    considered "same".

    min_count : Minimum length of a hanger.  Short hangers aren't noticeable
    and don't have to be removed.
    """
    hangers = []  # List of hanger start, stop frame indexes
    start_index = 0
    start_frame = frame_hash_list[0]
    for index, frame in enumerate(frame_hash_list[1:], start=1):
        # Are frame and start_frame dissimilar enough?
        if difference_count(start_frame, frame) > threshold:
            if index - start_index >= min_count:
                # Add hanger to list
                hangers.append((start_index, index - 1))
            start_frame = frame
            start_index = index
    # Check if we end with a hanger (the last run includes frame `index`)
    if index - start_index + 1 >= min_count:
        hangers.append((start_index, index))
    return hangers

# Make some frame hashes by taking consecutive slices from this string
stuff = "aaaaabcccccccccccccccdefghijklmnopqrstufwxyzzzzzzzzzzzzzzzzzzz"
frames = [stuff[i:i+10] for i in range(len(stuff)-10)]

for hanger in detect_hangers(frames, 4, 5):
    start, end = hanger
    print(start, end, frames[start], frames[end])
4 13 abcccccccc ccccccccde
39 51 fwxyzzzzzz zzzzzzzzzz
My film had two hangers, 4:13 and 39:51. These are the only two places in my "film" with 5 or more consecutive frames that each differ from the first frame of the run in at most 4 positions.
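
For the special case where frames count as the same only when their hashes are exactly equal, runs can also be found with itertools.groupby. This is a sketch under that assumption, not a replacement for the threshold-based function; equal_runs is a hypothetical name.

```python
from itertools import groupby

def equal_runs(items, min_count=2):
    """Return inclusive (start, end) index pairs for runs of equal consecutive items."""
    runs = []
    start = 0
    for _, group in groupby(items):
        length = sum(1 for _ in group)  # number of items in this run
        if length >= min_count:
            runs.append((start, start + length - 1))
        start += length
    return runs

print(equal_runs(["a", "b", "b", "b", "c", "d", "d"]))  # [(1, 3), (5, 6)]
```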
#18
Dear deanhystad,

Many thanks for your detailed answer!!

I really appreciate your effort.

Unfortunately I'm tied up with work and can only test your solution from Monday on...

I owe you a debt of gratitude!!

Thanks again,

flash77