Python Forum
Longest sequence of repeating integers in a numpy array
#1
Hello,

Python (coding in general) newbie, so please forgive me if the question is naive. I did do a search before asking, but I have not found an answer that fits.

I am trying to show that in a random series of integers, it is perfectly possible to find a sequence of repeating integers that does not look random. To do so, I create a numpy array (in a Jupyter notebook), which I populate with np.random.randint to simulate dice throws:

[python]seq = np.random.randint(1, 7, size=100)[/python]

The size is set to 100 arbitrarily. I may want to increase or decrease the size of the array.

Where I get stuck is writing the loop to check for the longest running sequence of repeating integers. I am also at a loss as to how to deal with cases where several sequences of repeating integers share the same longest length. I would like to identify them all in that case. Either way, I would like to determine the length of the longest sequence(s) and the integer(s) concerned.
For example, here is a run generated with randint():

-----Generated Random Array----
[2 4 3 3 1 3 1 4 4 2 1 6 5 5 5 1 4 6 3 6 1 5 2 6 3 1 1 5 4 1 3 5 1 4 2 2 6
2 3 1 5 3 1 6 4 5 4 6 5 6 5 6 6 5 5 1 4 2 3 3 5 2 5 1 3 4 3 4 6 6 5 6 1 2
3 2 2 3 2 3 1 5 6 3 3 3 5 3 1 5 6 3 2 2 1 1 4 1 4 1]

If I am not mistaken, the longest running sequences of repeating integers are the 555 and 333 I have highlighted. How do I pick them both out programmatically and show both the length of the sequence and the associated integer?

Thank you for your suggestions and your patience.
#2
You can use np.diff, i.e. difference between the array and shifted array, and find
all islands of zeros:


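import numpy as np

def get_islands(arr, mask):
    # Pad the mask with False so that runs touching either end are detected.
    mask_ = np.concatenate(([False], mask, [False]))
    # Indices where the mask flips: even entries start a run, odd entries end it.
    idx = np.flatnonzero(mask_[1:] != mask_[:-1])
    return [arr[idx[i]:idx[i+1] + 1] for i in range(0, len(idx), 2)]

# seq is the array from the first post.
get_islands(seq, np.r_[np.diff(seq) == 0, False])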
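To see what each step produces, here is a small trace on a hand-made array. The array `demo` and the intermediate names `same_as_next`, `padded` and `flips` are made up for illustration; the function is the one above, restated only so the snippet runs on its own.

```python
import numpy as np

def get_islands(arr, mask):
    mask_ = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_[1:] != mask_[:-1])
    return [arr[idx[i]:idx[i + 1] + 1] for i in range(0, len(idx), 2)]

# Hypothetical demo array, not taken from the thread.
demo = np.array([1, 1, 2, 3, 3, 3])

# True wherever an element equals its right-hand neighbour:
# True, False, False, True, True
same_as_next = np.diff(demo) == 0

# Padding with False on both sides guarantees that every run of True
# has both a "switch on" and a "switch off" position.
padded = np.concatenate(([False], same_as_next, [False]))

# Positions where the padded mask changes value: 0, 1, 3, 5.
# Pairs (0, 1) and (3, 5) mark the start and the last element of each run.
flips = np.flatnonzero(padded[1:] != padded[:-1])

islands = get_islands(demo, same_as_next)
print(islands)  # the runs [1, 1] and [3, 3, 3]
```

Note that only runs of length 2 or more show up, because a single value never produces a zero in np.diff.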
#3
Hello scidam,

First of all, thank you very much for taking the time to reply. I really appreciate it.

I have no idea how your proposed algorithm works, so I'll just go away and try things out until I understand it. I will come back when I have done my homework.

Kind regards

c

(Jun-07-2020, 12:24 AM)scidam Wrote: You can use np.diff, i.e. difference between the array and shifted array, and find
all islands of zeros:


def get_islands(arr, mask):
    mask_ = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_[1:] != mask_[:-1])
    return [arr[idx[i]:idx[i+1] + 1] for i in range(0, len(idx), 2)]

get_islands(seq, np.r_[np.diff(seq) == 0, False])

scidam,

Thank you very much. I am struggling to understand the implementation but I get the shifting and comparison of differences resulting in zeros when integers are repeated. I will spend more time on it to get a better understanding and be able to reproduce it in different circumstances.

In a run, I got the following results:

[array([1, 1]),
array([3, 3]),
array([4, 4]),
array([2, 2]),
array([6, 6]),
array([2, 2]),
array([6, 6]),
array([4, 4]),
array([2, 2, 2]),
array([2, 2, 2]),
array([4, 4]),
array([6, 6]),
array([5, 5]),
array([3, 3, 3, 3])]

If it is not asking too much, how would I alter the code so that it returns only the longest sequence (in this case, the last entry, array([3, 3, 3, 3]))?

Thank you.
#4
I don't seem to be able to edit a post. Apologies for bundling the end [python] tag in my initial message. Also meant to say in last message "only the longest sequences", plural. My example only generated one longest repeating sequence of 4 digits, but were there two or more of these longest repeating sequences, I would like to retain them and only them in the output.

Thank you.
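One possible way to keep only the longest runs is to filter the list that get_islands returns, rather than changing get_islands itself. This is just a sketch: the helper name `longest_islands` and the sample `islands` list are made up for illustration and are not from scidam's post.

```python
import numpy as np

def longest_islands(islands):
    """Return (length, runs) for the longest runs; (0, []) if there are none."""
    if not islands:
        return 0, []
    longest = max(len(run) for run in islands)
    return longest, [run for run in islands if len(run) == longest]

# Hypothetical input in the same shape that get_islands returns.
islands = [np.array([1, 1]), np.array([2, 2, 2]),
           np.array([4, 4]), np.array([5, 5, 5])]

length, runs = longest_islands(islands)
print(length)                         # 3
print([int(run[0]) for run in runs])  # [2, 5]
```

Ties are kept in the order they occur. Since get_islands only reports runs of two or more, an array with no repeats at all gives an empty list, which is why the empty case is handled separately.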
#5
Here's an alternate way

import random

# Simulate 100 throws of a six-sided die as a string of digits.
string = "".join(random.choice('123456') for _ in range(100))

max_string_length = 1        # longest run seen so far
max_string_members = []      # digit of every run that has reached that length
current_string_member = ""   # digit of the run currently being counted
current_string_length = 0

for digit in string:
    if digit != current_string_member:
        # A new run starts here.
        current_string_member = digit
        current_string_length = 1
    else:
        current_string_length += 1
    if current_string_length == max_string_length:
        # This run ties the current record, so remember it as well.
        max_string_members.append(digit)
    if current_string_length > max_string_length:
        # This run beats the record, so discard the earlier ties.
        max_string_members = [digit]
        max_string_length = current_string_length

print(f"The longest sequence found was {max_string_length}")
print(f"The number of times this length was seen was {len(max_string_members)}")
print(max_string_members)
print(string)
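For comparison, the same bookkeeping can probably be written more compactly with itertools.groupby from the standard library, which groups consecutive equal items. This is a different technique from the loop above, not a rewrite of it; the function name `longest_runs` and the sample list `throws` are made up for illustration.

```python
from itertools import groupby

def longest_runs(values):
    """Return (length, [value, ...]) for the longest runs of equal neighbours."""
    # groupby yields one (key, group) pair per run of consecutive equal items.
    runs = [(key, sum(1 for _ in group)) for key, group in groupby(values)]
    if not runs:
        return 0, []
    longest = max(length for _, length in runs)
    return longest, [key for key, length in runs if length == longest]

# Hypothetical sample throws, not taken from the thread.
throws = [2, 4, 3, 3, 1, 5, 5, 5, 1, 3, 3, 3, 6]
length, winners = longest_runs(throws)
print(length, winners)  # 3 [5, 3]
```

For the numpy array from the first post, passing `seq.tolist()` should give plain Python integers in the result.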
#6
(Jun-07-2020, 05:51 PM)bowlofred Wrote: Here's an alternate way

Sorry about the late reply. I didn't get a notification.

This does the job fine and, as with scidam's, I will take it apart to learn how it is built so I can reproduce it elsewhere. Thank you very much for taking the time to answer.

