Thread: Longest sequence of repeating integers in a numpy array (Python Forum, General Coding Help, https://python-forum.io/forum-8.html)

Longest sequence of repeating integers in a numpy array - Cricri - Jun-06-2020

Hello,

Python (and coding in general) newbie here, so please forgive me if the question is naive. I did do a search before asking, but I have not found an answer that fits.

I am trying to show that in a random series of integers, it is perfectly possible to find a sequence of repeating integers that does not look random. To do so, I create a numpy array (in a Jupyter notebook), which I populate with np.random.randint to simulate dice throws:

[python]
import numpy as np
seq = np.random.randint(1, 7, size=100)
[/python]

The size is set to 100 arbitrarily. I may want to increase or decrease the size of the array.

Where I get stuck is writing the loop to check for the longest run of repeating integers. I am also at a loss as to how to deal with cases where several runs of repeating integers tie for the longest. I would like to identify them all in that case. Either way, I would like to determine the length of the longest sequence(s) and the integer(s) concerned.

For example, here is a run generated with randint():

-----Generated Random Array----
[2 4 3 3 1 3 1 4 4 2 1 6 5 5 5 1 4 6 3 6 1 5 2 6 3 1 1 5 4 1 3 5 1 4 2 2 6 2 3 1 5 3 1 6 4 5 4 6 5 6 5 6 6 5 5 1 4 2 3 3 5 2 5 1 3 4 3 4 6 6 5 6 1 2 3 2 2 3 2 3 1 5 6 3 3 3 5 3 1 5 6 3 2 2 1 1 4 1 4 1]

If I am not mistaken, the longest running sequences of repeating integers are the 555 and 333 I have highlighted. How do I pick them both out programmatically and show both the length of the sequence and the associated integer?

Thank you for your suggestions and your patience.

RE: Longest sequence of repeating integers in a numpy array - scidam - Jun-07-2020

You can use np.diff, i.e.
difference between the array and shifted array, and find all islands of zeros:

[python]
import numpy as np

def get_islands(arr, mask):
    # Pad the mask with False on both sides so every island has a start and an end edge.
    mask_ = np.concatenate(([False], mask, [False]))
    # Indices where the padded mask flips: even entries are island starts, odd entries are ends.
    idx = np.flatnonzero(mask_[1:] != mask_[:-1])
    # A run of k zeros in np.diff spans k + 1 equal elements, hence the + 1.
    return [arr[idx[i]:idx[i+1] + 1] for i in range(0, len(idx), 2)]

get_islands(seq, np.r_[np.diff(seq) == 0, False])
[/python]

RE: Longest sequence of repeating integers in a numpy array - Cricri - Jun-07-2020

Hello scidam,

First of all, thank you very much for taking the time to reply. I really appreciate it. I have no idea how your proposed algorithm works, so I'll just go away and try things out until I understand it. I will come back when I have done my homework.

Kind regards
c

(Jun-07-2020, 12:24 AM)scidam Wrote: You can use

scidam,

Thank you very much. I am struggling to understand the implementation, but I get that shifting the array and comparing it with itself gives a difference of zero wherever an integer is repeated. I will spend more time on it to get a better understanding and be able to reproduce it in different circumstances.

In a run, I got the following results:

[array([1, 1]), array([3, 3]), array([4, 4]), array([2, 2]), array([6, 6]), array([2, 2]), array([6, 6]), array([4, 4]), array([2, 2, 2]), array([2, 2, 2]), array([4, 4]), array([6, 6]), array([5, 5]), array([3, 3, 3, 3])]

If that is not asking too much, how would I alter the code so that it returns only the longest sequence (in this case, the last entry, array([3, 3, 3, 3]))?

Thank you.

RE: Longest sequence of repeating integers in a numpy array - Cricri - Jun-07-2020

I don't seem to be able to edit a post. Apologies for bundling the end [python] tag in my initial message. I also meant to say "only the longest sequences", plural, in my last message. My example only generated one longest repeating sequence of 4 digits, but had there been two or more of these longest repeating sequences, I would like to retain them, and only them, in the output.

Thank you.

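To keep only the longest runs, ties included, one option is to filter the list that get_islands returns by its maximum length. The following is a minimal sketch, not scidam's own code: longest_islands is a hypothetical helper name, the sample array is made up for illustration, and get_islands is repeated from the reply above so the block runs on its own.

```python
import numpy as np

def get_islands(arr, mask):
    # scidam's helper from the reply above, repeated so this block is self-contained
    mask_ = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_[1:] != mask_[:-1])
    return [arr[idx[i]:idx[i + 1] + 1] for i in range(0, len(idx), 2)]

def longest_islands(arr):
    # Hypothetical helper: keep only the runs whose length equals the maximum.
    arr = np.asarray(arr)
    islands = get_islands(arr, np.r_[np.diff(arr) == 0, False])
    if not islands:
        return []  # no value is repeated back to back
    longest = max(len(island) for island in islands)
    return [island for island in islands if len(island) == longest]

# Made-up sample with two runs tied at length 4.
sample = np.array([1, 1, 3, 3, 3, 3, 2, 2, 2, 5, 5, 5, 5])
for run in longest_islands(sample):
    print(run[0], len(run))  # prints "3 4", then "5 4"
```

Each returned array holds one run, so run[0] is the repeated integer and len(run) is its length. Since get_islands only reports runs of two or more, an array with no back-to-back repeats gives an empty list here.
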
RE: Longest sequence of repeating integers in a numpy array - bowlofred - Jun-07-2020

Here's an alternate way

[python]
import random

# Simulate 100 dice throws as a string of digits (faces 1 to 6, matching randint(1, 7)).
string = "".join(random.choice('123456') for x in range(100))

max_string_length = 1
max_string_members = []
current_string_member = ""
current_string_length = 0

for digit in string:
    # Track the length of the run that the current digit belongs to.
    if digit != current_string_member:
        current_string_member = digit
        current_string_length = 1
    else:
        current_string_length += 1
    # A run that ties the record is added to the list of winners.
    if current_string_length == max_string_length:
        max_string_members.append(digit)
    # A run that beats the record replaces the list of winners.
    if current_string_length > max_string_length:
        max_string_members = [digit]
        max_string_length = current_string_length

print(f"The longest sequence found was {max_string_length}")
print(f"The number of times this length was seen was {len(max_string_members)}")
print(max_string_members)
print(string)
[/python]

RE: Longest sequence of repeating integers in a numpy array - Cricri - Jun-08-2020

(Jun-07-2020, 05:51 PM)bowlofred Wrote: Here's an alternate way

Sorry about the late reply. I didn't get a notification. This does the job fine. As for scidam's, I will take it apart to learn how it is built so I can reproduce it elsewhere.

Thank you very much for taking the time to answer.
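
The same bookkeeping as in bowlofred's loop can also be written with itertools.groupby from the standard library, which groups consecutive equal items. This is only a sketch under stated assumptions: longest_runs is a hypothetical function name, the throws list is made up for illustration, and reporting the start index of each run is an extra that nobody in the thread asked for.

```python
from itertools import groupby

def longest_runs(values):
    """Return (length, [(value, start_index), ...]) for the longest runs, ties included."""
    best_length = 0
    best_runs = []
    position = 0  # index in values where the current run starts
    for value, group in groupby(values):
        length = sum(1 for _ in group)
        if length > best_length:
            # New record: forget the earlier, shorter runs.
            best_length = length
            best_runs = [(value, position)]
        elif length == best_length:
            # Tie with the current record: keep both.
            best_runs.append((value, position))
        position += length
    return best_length, best_runs

# Made-up throws with two runs tied at length 3.
throws = [2, 4, 3, 3, 1, 5, 5, 5, 1, 4, 3, 3, 3, 6]
length, runs = longest_runs(throws)
print(length, runs)  # prints: 3 [(5, 5), (3, 10)]
```

This accepts any iterable, so a numpy array such as seq from the first post can be passed as seq.tolist() to get plain Python integers back. An empty input returns (0, []).
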