Python Forum
splitting numeric list based on condition
#1
I am trying to split a list of numbers into sublists once a condition is met.

num_list = [0,1,2,3,4,5,2,3,4,5,0,1,2,3,4,5,0,1,2,3,4,5]

Whenever the list reaches 5, it needs to be split off as a sublist, giving the result below:

[[0,1,2,3,4,5],[2,3,4,5],[0,1,2,3,4,5],[0,1,2,3,4,5]]

I tried the code below, which mostly works, but it places each 5 at the start of the following sublist, and I can't figure out how to place it at the end of the previous sublist instead.

num_list =[0,1,2,3,4,5,1,2,3,4,5,2,3,4,5]

arrays = [[num_list[0]]] # list of sub-lists (starts with the first value)

for i in range(1, len(num_list)): # go through each element after the first
    if num_list[i] != 5: # if the value is not 5
        arrays[len(arrays)-1].append(num_list[i]) # add it to the last sub-list
    else: # otherwise (the value is 5)
        arrays.append([num_list[i]]) # start a new sub-list with it (the unwanted part)
print(arrays)
Adapted from the solution given at this link: https://stackoverflow.com/questions/5255...-condition
#2
num_list = [0,1,2,3,4,5,1,2,3,4,5,2,3,4,5]
 
arrays = [[]] # array of sub-arrays

for i, num in enumerate(num_list):          # go through each element along with its index
    arrays[-1].append(num)                  # Add it to the last sub-array
    if num == 5 and i != len(num_list)-1:   # if 5 encountered and not last element
        arrays.append([])
        
print(arrays)
Output:
[[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [2, 3, 4, 5]]
import numpy as np
num_arr = np.array([0,1,2,3,4,5,1,2,3,4,5,2,3,4,5])
arrays = np.split(num_arr, np.where(num_arr[:-1] == 5)[0]+1)
print(arrays)
Output:
[array([0, 1, 2, 3, 4, 5]), array([1, 2, 3, 4, 5]), array([2, 3, 4, 5])]
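The same idea as the loop above can also be written as a generator. This is only a minimal sketch: it closes a chunk right after each 5 and emits any leftover items at the end, so the last-element check is not needed. The name split_after is made up for this example.

```python
def split_after(values, marker):
    """Yield sub-lists of values, each one ending right after marker.

    Hypothetical helper for illustration; not part of any library.
    """
    chunk = []
    for value in values:
        chunk.append(value)
        if value == marker:
            yield chunk
            chunk = []
    if chunk:  # leftover items after the last marker, if any
        yield chunk


num_list = [0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 2, 3, 4, 5]
result = list(split_after(num_list, 5))
print(result)  # [[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [2, 3, 4, 5]]
```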
#3
you are very helpful! thank you very much.

(May-25-2019, 11:27 AM)michalmonday Wrote:
import numpy as np
num_arr = np.array([0,1,2,3,4,5,1,2,3,4,5,2,3,4,5])
arrays = np.split(num_arr, np.where(num_arr[:-1] == 5)[0]+1)
print(arrays)
Output:
[array([0, 1, 2, 3, 4, 5]), array([1, 2, 3, 4, 5]), array([2, 3, 4, 5])]

By the way, could you explain what is actually happening in this line of code, especially the last part, [0]+1?

arrays = np.split(num_arr, np.where(num_arr[:-1] == 5)[0]+1)
#4
Thank you for the help. I am trying to get extra credit on an assignment, and this was very helpful.
#5
I hope this clarifies what's going on in there, but the best way to understand it is to check every part of the expression yourself by printing it (or modifying and playing with it) in the interpreter.

import numpy as np

# np.array supports "array-oriented" (element-wise) operations,
# for example:
# num_arr == 5

# Comparing a list to 5 returns a single True or False.
# Comparing an np.array to 5 returns another array containing True or False
# for each value in the array.

# [5,10] == 5 evaluates to False
# np.array([5,10]) == 5 evaluates to array([ True, False])

# So if you want to easily apply some operation (possibly using some numpy
# functions) to every element of an array, np.array() makes it convenient.

num_arr = np.array([0,1,2,3,4,5,1,2,3,4,5,2,3,4,5])

indices_where_5_is_found = np.where(num_arr[:-1] == 5)[0] + 1

print('Array:', num_arr, '\n')

print('At which indices 5 was found:\n', indices_where_5_is_found, '\n')

# Notice that if we didn't use [:-1], the index of the last 5
# would also be included (resulting in an additional empty sub-list).
# num_arr[:-1] is the array without its last item.

print("At which indices 5 would be found if [:-1] wasn't used:\n", np.where(num_arr == 5)[0] + 1, '\n')
print("Result if [:-1] wasn't used:\n", np.split(num_arr, np.where(num_arr == 5)[0] + 1), '\n')


# If +1 wasn't used then 5 would be included as the first item of each
# sub-list

# np.array([1,2]) + 1 increases each item of the array by 1, resulting in:
# array([2,3])

print("Indices if +1 wasn't used:\n", np.where(num_arr[:-1] == 5)[0], '\n')
print("Result if +1 wasn't used:\n", np.split(num_arr, np.where(num_arr[:-1] == 5)[0]), '\n')

arrays = np.split(num_arr, indices_where_5_is_found)
print('Final result:\n', arrays, '\n')
Output:
Array: [0 1 2 3 4 5 1 2 3 4 5 2 3 4 5]

At which indices 5 was found:
 [ 6 11]

At which indices 5 would be found if [:-1] wasn't used:
 [ 6 11 15]

Result if [:-1] wasn't used:
 [array([0, 1, 2, 3, 4, 5]), array([1, 2, 3, 4, 5]), array([2, 3, 4, 5]), array([], dtype=int32)]

Indices if +1 wasn't used:
 [ 5 10]

Result if +1 wasn't used:
 [array([0, 1, 2, 3, 4]), array([5, 1, 2, 3, 4]), array([5, 2, 3, 4, 5])]

Final result:
 [array([0, 1, 2, 3, 4, 5]), array([1, 2, 3, 4, 5]), array([2, 3, 4, 5])]
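On the [0] part specifically: when np.where is called with only a condition, it returns a tuple holding one index array per dimension, so [0] picks out the index array for the only dimension of a 1-D array. A small sketch that separates the steps (the variable names are chosen just for this illustration):

```python
import numpy as np

num_arr = np.array([0, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 2, 3, 4, 5])

# np.where(condition) returns a tuple with one index array per dimension.
where_result = np.where(num_arr[:-1] == 5)
print(type(where_result), len(where_result))

# [0] takes the index array for the first (and only) dimension.
positions_of_5 = where_result[0]
print(positions_of_5)

# +1 moves each split point to just after the 5, so the 5 stays at the
# end of the preceding sub-array instead of starting the next one.
split_points = positions_of_5 + 1
print(split_points)
```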
#6
Thanks a lot for the explanation and the tip to print out the results. I will try that as well.

I would like to extend this problem one step further. Once the sublists are created, I need to iterate through each sublist and extract the index wherever there is a break in the sequence. I tried to do that myself but could not come up with a good solution. I know it will be easier to create a separate function and pass the separated arrays to it, but beyond that I am not sure how best to proceed. As an example, I have the array below:

array = [0,1,2,3,4,5,6,7,8,9,10,0,1,2,3,4,5,6,7,3,8,9,10]

and with your help, I achieved the following:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0, 1, 2, 3, 4, 5, 6, 7, 3, 8, 9, 10]]

Now I would like to loop over these sublists and, when there is a break in the sequence within a sublist, extract its index and value. In this case the break occurs in the second sublist, going from 7 to 3. I tried something like the code below, which resulted in TypeError: append() takes exactly one argument (0 given)

The expected output in this case is 3

arrays = [[]] # array of sub-arrays
        
for i, num in enumerate(array):          
    arrays[-1].append(num)                  
    if num == 10 and i != len(array)-1:  
        arrays.append([]) 
print(arrays)

seq_break = []
for i in arrays:
    if i[-1] != i[-1] + 1: #checks to see if the next value in the sublist equals to the previous value +1 which means a sequence, if not extract the break value
        seq_break.append()
print(seq_break)
#7
You could do it like this:

array = [0,1,2,3,4,5,6,7,8,9,10,0,1,2,3,4,5,6,7,3,8,9,10]

arrays = [[]] # array of sub-arrays
         
for i, num in enumerate(array):          
    arrays[-1].append(num)                  
    if num == 10 and i != len(array)-1:  
        arrays.append([]) 
print(arrays)
 
seq_break_list = []
for sub_array in arrays: # separate name so the original `array` is not overwritten
    seq_break = []
    for i, value in enumerate(sub_array):
        if i == 0: # the first value has no predecessor to compare with
            continue

        # A value that is not exactly previous + 1 breaks the sequence;
        # record its index and value.
        if value != sub_array[i-1] + 1:
            seq_break.append((i, value))

    seq_break_list.append(seq_break)

print(seq_break_list)
Output:
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [0, 1, 2, 3, 4, 5, 6, 7, 3, 8, 9, 10]]
[[], [(8, 3), (9, 8)]]
#8
Thanks, I tried this approach, but I realise that my data is much more complicated than I initially thought.

I have situations where the sub-arrays look like this:

[[0,1,2,3,4,5,6,7,8,9,10],[0,1,2,3,4,5,6,6,7,7,8,8,9,10],[0,1,2,3,4,5,6,7,8,9],[0,1,2,3,4,5,6,7,1,8,9,10]]

So a number can repeat itself, and the ending number can be either 9 or 10, which means the earlier code for splitting the array does not work when the ending number is 9. What I am trying to extract from each of these sequences is the point where there is an obvious break, like going from 7 to 1 in the last sub-array. I am not sure how best to deal with this now, given the complexity of these sub-arrays.
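One possible direction, offered only as a rough sketch under two assumptions that may not hold for the real data: (1) every sub-sequence starts at 0, so a 0 can be used as the split marker instead of the ending value, and (2) an "obvious break" means a value lower than the one before it, so repeated values are not counted. The function names split_on_restart and find_drops are made up for this example.

```python
def split_on_restart(values, start=0):
    """Start a new sub-list each time `start` appears (assumed restart marker)."""
    chunks = []
    for value in values:
        if value == start or not chunks:
            chunks.append([])
        chunks[-1].append(value)
    return chunks


def find_drops(chunk):
    """Return (index, value) pairs where a value is lower than the one before it."""
    return [
        (i, cur)
        for i, (prev, cur) in enumerate(zip(chunk, chunk[1:]), start=1)
        if cur < prev
    ]


# The four sub-arrays from the post above, joined back into one flat list.
data = (
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    + [0, 1, 2, 3, 4, 5, 6, 6, 7, 7, 8, 8, 9, 10]
    + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    + [0, 1, 2, 3, 4, 5, 6, 7, 1, 8, 9, 10]
)

chunks = split_on_restart(data)
drops = [find_drops(chunk) for chunk in chunks]
print(chunks)
print(drops)  # [[], [], [], [(8, 1)]]
```

A limitation of this sketch: a break that drops exactly to 0 (for example 7 to 0) would be treated as the start of a new sub-sequence rather than as a break.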