Python Forum
splitting numeric list based on condition - Printable Version




splitting numeric list based on condition - python_newbie09 - May-25-2019

I am trying to split a list of numbers into sublists once a condition is met.

num_list = [0,1,2,3,4,5,2,3,4,5,0,1,2,3,4,5,0,1,2,3,4,5]

Whenever the list reaches 5, it should be split off as a sublist, giving the result below:

[[0,1,2,3,4,5],[2,3,4,5],[0,1,2,3,4,5],[0,1,2,3,4,5]]

I tried the code below, which seems to work, but it places 5 at the start of the following list, and I can't figure out how to place it at the end of the previous list instead.

num_list =[0,1,2,3,4,5,1,2,3,4,5,2,3,4,5]

arrays = [[num_list[0]]] # array of sub-arrays (starts with first value)

for i in range(1, len(num_list)): # go through each element after the first
    if num_list[i] != 5: # if the value is not 5
        arrays[len(arrays)-1].append(num_list[i]) # add it to the last sub-array
    else: # otherwise (the value is 5)
        arrays.append([num_list[i]]) # start a new sub-array with it
print(arrays)

Adapted from the solution given at this link: https://stackoverflow.com/questions/52551398/slicing-a-list-into-sublists-based-on-condition


RE: splitting numeric list based on condition - michalmonday - May-25-2019

num_list = [0,1,2,3,4,5,1,2,3,4,5,2,3,4,5]
 
arrays = [[]] # array of sub-arrays

for i, num in enumerate(num_list):          # go through every element, keeping its index
    arrays[-1].append(num)                  # Add it to the last sub-array
    if num == 5 and i != len(num_list)-1:   # if 5 encountered and not last element
        arrays.append([])
        
print(arrays)
Output:
[[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [2, 3, 4, 5]]

import numpy as np
num_arr = np.array([0,1,2,3,4,5,1,2,3,4,5,2,3,4,5])
arrays = np.split(num_arr, np.where(num_arr[:-1] == 5)[0]+1)
print(arrays)
Output:
[array([0, 1, 2, 3, 4, 5]), array([1, 2, 3, 4, 5]), array([2, 3, 4, 5])]
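
The loop-based approach above could also be wrapped in a small reusable function. The following is only a sketch and not part of the original reply: the name split_after and its parameters are made up for illustration. It closes a sub-list each time the marker value is seen, so a marker at the very end does not leave an empty sub-list behind.

```python
def split_after(values, marker):
    """Split values into sub-lists, ending a sub-list after each marker.

    Hypothetical helper for illustration; not from the original thread.
    """
    result = []
    current = []
    for value in values:
        current.append(value)
        if value == marker:
            # The marker closes the current sub-list.
            result.append(current)
            current = []
    if current:
        # Keep any trailing items that were not followed by a marker.
        result.append(current)
    return result


num_list = [0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 2, 3, 4, 5]
print(split_after(num_list, 5))
# [[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [2, 3, 4, 5]]
```

Because the empty-list check happens once at the end, there is no need to test whether the current index is the last one inside the loop.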



RE: splitting numeric list based on condition - python_newbie09 - May-25-2019

You are very helpful! Thank you very much.

By the way, could you explain what is actually happening in this line of code, especially the last part, [0]+1?

arrays = np.split(num_arr, np.where(num_arr[:-1] == 5)[0]+1)



RE: splitting numeric list based on condition - ratayr - May-26-2019

Thank you for the help. I am trying to get extra credit on an assignment and this was very helpful.


RE: splitting numeric list based on condition - michalmonday - May-26-2019

I hope this clarifies what's going on in there, but the best way to understand it is to check every part of the expression yourself by printing the parts (or modifying and playing with them) in the interpreter.

import numpy as np

# np.array allows "array-oriented" math operations,
# for example:
# num_arr == 5

# Comparing a list with == returns a single "True" or "False".
# Comparing an np.array with == returns another array containing "True" or "False"
# for each value in the array.

# [5,10] == 5 evaluates to "False"
# np.array([5,10]) == 5 evaluates to "array([True, False])"

# So if you want to easily apply some operation (possibly using some numpy
# functions) to an array then using np.array() makes it convenient.

num_arr = np.array([0,1,2,3,4,5,1,2,3,4,5,2,3,4,5])

indices_where_5_is_found = np.where(num_arr[:-1] == 5)[0] + 1

print('Array:', num_arr, '\n')

print('At which indices 5 was found:\n', indices_where_5_is_found, '\n')

# Notice that if we didn't use [:-1] then the index of the last 5
# would also be included (resulting in additional empty sub-list),
# [:-1] removes the last item from the array.

print("At which indices 5 would be found if [:-1] wasn't used:\n", np.where(num_arr == 5)[0] + 1, '\n')
print("Result if [:-1] wasn't used:\n", np.split(num_arr, np.where(num_arr == 5)[0] + 1), '\n')


# If +1 wasn't used then 5 would be included as the first item of each
# sub-list

# np.array([1,2]) + 1 increases each item of the array by 1, resulting in:
# array([2,3])

print("Indices if +1 wasn't used:\n", np.where(num_arr[:-1] == 5)[0], '\n')
print("Result if +1 wasn't used:\n", np.split(num_arr, np.where(num_arr[:-1] == 5)[0]), '\n')

arrays = np.split(num_arr, indices_where_5_is_found)
print('Final result:\n', arrays, '\n')
Output:
Array: [0 1 2 3 4 5 1 2 3 4 5 2 3 4 5]

At which indices 5 was found:
 [ 6 11]

At which indices 5 would be found if [:-1] wasn't used:
 [ 6 11 15]

Result if [:-1] wasn't used:
 [array([0, 1, 2, 3, 4, 5]), array([1, 2, 3, 4, 5]), array([2, 3, 4, 5]), array([], dtype=int32)]

Indices if +1 wasn't used:
 [ 5 10]

Result if +1 wasn't used:
 [array([0, 1, 2, 3, 4]), array([5, 1, 2, 3, 4]), array([5, 2, 3, 4, 5])]

Final result:
 [array([0, 1, 2, 3, 4, 5]), array([1, 2, 3, 4, 5]), array([2, 3, 4, 5])]
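
As for the [0] part of the expression: when np.where is called with only a condition, it returns a tuple containing one index array per dimension, so [0] picks out the index array for the single dimension of a 1-D array. The short sketch below breaks the expression into steps; the intermediate variable names are chosen here for illustration only.

```python
import numpy as np

num_arr = np.array([0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 2, 3, 4, 5])

mask = num_arr[:-1] == 5        # boolean array, one entry per element (last item excluded)
where_result = np.where(mask)   # a tuple holding one index array per dimension
positions = where_result[0]     # the index array for the only dimension
split_points = positions + 1    # split *after* each 5 rather than before it

print(type(where_result), len(where_result))
# <class 'tuple'> 1
print(positions)
# [ 5 10]
print(split_points)
# [ 6 11]
```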



RE: splitting numeric list based on condition - python_newbie09 - May-26-2019

Thanks a lot for the explanation and the tip to print out the results. I will try that as well.

I would like to extend this problem a step further. Once these sublists are created, I need to iterate through each sublist and extract the index wherever there is a break in the sequence. I tried to do that myself but could not come up with a good solution. I know it would be easier to create a separate function and pass the split arrays to it, but beyond that I am not sure how best to proceed. As an example, I have the array below:

array = [0,1,2,3,4,5,6,7,8,9,10,0,1,2,3,4,5,6,7,3,8,9,10]

and with your help, I achieved the following:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0, 1, 2, 3, 4, 5, 6, 7, 3, 8, 9, 10]]

Now I would like to loop over these sublists and, whenever there is a break in the sequence within a sublist, extract its index and value. In this case the break occurs in the second sublist, going from 7 to 3. I tried something like the code below, which resulted in TypeError: append() takes exactly one argument (0 given)

The expected output in this case is 3

arrays = [[]] # array of sub-arrays
        
for i, num in enumerate(array):          
    arrays[-1].append(num)                  
    if num == 10 and i != len(array)-1:  
        arrays.append([]) 
print(arrays)

seq_break = []
for i in arrays:
    if i[-1] != i[-1] + 1: #checks to see if the next value in the sublist equals to the previous value +1 which means a sequence, if not extract the break value
        seq_break.append()
print(seq_break)
    



RE: splitting numeric list based on condition - michalmonday - May-27-2019

You could do it like this:

array = [0,1,2,3,4,5,6,7,8,9,10,0,1,2,3,4,5,6,7,3,8,9,10]

arrays = [[]] # array of sub-arrays
         
for i, num in enumerate(array):          
    arrays[-1].append(num)                  
    if num == 10 and i != len(array)-1:  
        arrays.append([]) 
print(arrays)
 
seq_break_list = []
for sub_list in arrays:
    seq_break = []
    for i, value in enumerate(sub_list):
        if i == 0:
            continue  # the first item has no previous value to compare with

        # A value that is not the previous value + 1 breaks the sequence;
        # record its index and value.
        if value != sub_list[i-1] + 1:
            seq_break.append((i, value))

    seq_break_list.append(seq_break)
    
print(seq_break_list)
Output:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0, 1, 2, 3, 4, 5, 6, 7, 3, 8, 9, 10]]
[[], [(8, 3), (9, 8)]]
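
The same check can be written by pairing each value with the one before it using zip, which avoids the special case for the first index. This is only a sketch of an alternative; the function name sequence_breaks is hypothetical and not from the thread.

```python
def sequence_breaks(sub_list):
    """Return (index, value) pairs where value is not the previous value + 1.

    Hypothetical helper for illustration; not from the original thread.
    """
    breaks = []
    # zip pairs item 0 with item 1, item 1 with item 2, and so on;
    # start=1 makes i the index of the second item of each pair.
    for i, (previous, current) in enumerate(zip(sub_list, sub_list[1:]), start=1):
        if current != previous + 1:
            breaks.append((i, current))
    return breaks


arrays = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [0, 1, 2, 3, 4, 5, 6, 7, 3, 8, 9, 10],
]
print([sequence_breaks(sub_list) for sub_list in arrays])
# [[], [(8, 3), (9, 8)]]
```

The second pair, (9, 8), appears because the sequence resumes at 8 right after the stray 3, and 8 is not 3 + 1. If only the drop from 7 to 3 is wanted, one option would be to test current < previous instead.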



RE: splitting numeric list based on condition - python_newbie09 - May-27-2019

Thanks, I tried this approach, but I realise that my data is much more complicated than I initially observed.

I have situations where the sub-arrays look like this:

[[0,1,2,3,4,5,6,7,8,9,10],[0,1,2,3,4,5,6,6,7,7,8,8,9,10],[0,1,2,3,4,5,6,7,8,9],[0,1,2,3,4,5,6,7,1,8,9,10]]

So there are situations where a number repeats itself, or where the final number is either 9 or 10, which means the earlier splitting code does not work when the final number is 9. What I am trying to extract from each of these sequences is an obvious break, such as the jump from 7 to 1 in the last sub-array. I am not sure how best to deal with this given the complexity of these sub-arrays.
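
One possible direction, sketched under assumptions that the thread does not confirm: every sub-array starts at 0, repeated values are acceptable, and a "break" means a value lower than the one before it. With those assumptions, the data can be split at each 0 rather than at the final number, and only decreases are reported. The function names split_at_zero and find_drops are made up for this example.

```python
def split_at_zero(values):
    """Start a new sub-list each time a 0 is seen (assumed to begin a run)."""
    runs = []
    for value in values:
        if value == 0 or not runs:
            runs.append([])
        runs[-1].append(value)
    return runs


def find_drops(run):
    """Return (index, value) pairs where value is lower than the one before it."""
    return [
        (i, current)
        for i, (previous, current) in enumerate(zip(run, run[1:]), start=1)
        if current < previous
    ]


# The four example sub-arrays from the post, joined into one flat list.
data = (
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    + [0, 1, 2, 3, 4, 5, 6, 6, 7, 7, 8, 8, 9, 10]
    + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    + [0, 1, 2, 3, 4, 5, 6, 7, 1, 8, 9, 10]
)

runs = split_at_zero(data)
drops = [find_drops(run) for run in runs]
print(drops)
# [[], [], [], [(8, 1)]]
```

Note that a stray 0 in the middle of a run would be treated as the start of a new sub-list, so this sketch only fits data where 0 appears solely at the beginning of each sequence.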