count occurrence of numbers in a sequence and return corresponding value - Printable Version
+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: count occurrence of numbers in a sequence and return corresponding value (/thread-18472.html)

count occurrence of numbers in a sequence and return corresponding value - python_newbie09 - May-19-2019

I would like to create a function that goes through each row of a particular dataframe column, say X, and, if the same number appears 5 or more times consecutively, returns the value in the corresponding column, say Y. I am working with time series data. For example, the data would look like this:

    X = [2,2,2,2,2,3,4,5,6,6,6,6,6,7,8]
    Y = [4,4,4,4,4,0,0,0,5,5,5,5,5,0,0]

So in this case it should return 4 and 6, and I would also need to sum up the occurrences of 4 and 6 if this kind of pattern repeats again throughout the time series. So the final output would be as below:

    Y  Count
    4  1
    6  1

Thank you.

RE: count occurrence of numbers in a sequence and return corresponding value - ichabod801 - May-19-2019

What have you tried?

RE: count occurrence of numbers in a sequence and return corresponding value - python_newbie09 - May-19-2019

(May-19-2019, 02:03 PM)ichabod801 Wrote: What have you tried?

I have tried something like the code below, but I am stuck on how to return the corresponding value:

    from itertools import groupby  # needed for groupby below

    X = [2,2,2,2,2,3,4,5,6,6,6,6,6,7,8]
    Y = [4,4,4,4,4,0,0,0,5,5,5,5,5,0,0]

    # length of each run of equal consecutive values in X
    count_sequence = [sum(1 for _ in group) for _, group in groupby(X)]
    print(count_sequence)

    for i in count_sequence:
        if i >= 5:
            print(Y[i])  # not sure if this is correct

RE: count occurrence of numbers in a sequence and return corresponding value - michalmonday - May-19-2019

(May-19-2019, 12:46 PM)python_newbie09 Wrote: X = [2,2,2,2,2,3,4,5,6,6,6,6,6,7,8]

Aren't the 4's in Y supposed to be 5's, because there are 5 occurrences of 2 in X?

(May-19-2019, 12:46 PM)python_newbie09 Wrote: so in this case it should return 4 and 6 and I would also need to sum up the occurrences of 4 and 6 if this kind of pattern repeats again throughout the time series.
so the final output would be as below

I fail to grasp where the 4 and 6 in this final output come from and how their Count is 1. If the 4 and 6 in the final output are based on Y, then I just don't get it. In the example you posted, Y consists of 5 occurrences of 4 and 5 occurrences of 5. No 6's to be seen. If the 4 and 6 in the final output are based on X, then it is fairly understandable where the 6 is taken from, but then why is 4 also there instead of 2?

______________________________

You mentioned "consecutively". Should the function check whether the repetitions are consecutive? Or will X never actually contain this kind of sequence:

    X = [1,1,1,1,1,1,2,1,1,3]  # where the 1's are separated by "2" or other numbers

If X can actually contain such a sequence, is the output below correct?

    Y = [6,6,6,6,6,6,0,0,0,0]

If the X values don't have to be checked for consecutiveness, then you could use something like this:

    import numpy as np

    X = np.array([2,2,2,2,2,3,4,5,6,6,6,6,6,7,8])

    # for every element of X, how many times its value occurs in the whole array
    counts = np.array([(X == i).sum() for i in X])

    # The 2nd and 3rd arguments of np.where can be single values or arrays.
    # Each returned element is taken from the 2nd argument where the condition
    # in the 1st argument is met, and from the 3rd argument otherwise.
    Y = np.where(counts < 5, 0, counts)

    print('Y =\n', Y)
    print('np.unique =\n', np.unique(X, return_counts=True))

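For the consecutive case asked about in the original post, the np.where snippet above is not enough, because it counts every occurrence of a value regardless of position. The earlier groupby attempt was close, but it indexed Y with the run length (Y[i]) rather than with the position of the run in X. The following is only a minimal sketch of one way to fix that, assuming the value wanted from Y is the one at the first position of each qualifying run. The names long_runs and min_len are made up for this illustration and do not come from the thread.

```python
from itertools import groupby


def long_runs(x, y, min_len=5):
    """Return (x_value, y_at_run_start, run_length) for every run of equal
    consecutive values in x that is at least min_len long.

    Assumption: the Y value of interest is the one at the start of the run.
    """
    results = []
    start = 0  # index in x where the current run begins
    for value, group in groupby(x):
        length = sum(1 for _ in group)
        if length >= min_len:
            results.append((value, y[start], length))
        start += length  # move to the beginning of the next run
    return results


X = [2, 2, 2, 2, 2, 3, 4, 5, 6, 6, 6, 6, 6, 7, 8]
Y = [4, 4, 4, 4, 4, 0, 0, 0, 5, 5, 5, 5, 5, 0, 0]

print(long_runs(X, Y))  # [(2, 4, 5), (6, 5, 5)]
```

With the first sample data this reports Y values 4 and 5 (for the runs of 2 and 6 in X), not 4 and 6, which is consistent with the observation above that Y contains no 6's.
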
RE: count occurrence of numbers in a sequence and return corresponding value - python_newbie09 - May-20-2019

(May-19-2019, 08:15 PM)michalmonday Wrote:
(May-19-2019, 12:46 PM)python_newbie09 Wrote: X = [2,2,2,2,2,3,4,5,6,6,6,6,6,7,8]

Sorry for the confusion. I want to access the value in Y when the numbers in X are repeated consecutively. So I may have a time series as below:

    X = [5,5,5,5,5,0,0,0,0,0,4,4,4,4,4,1,2,3,6,6,6,6,5,5,5,5,5,2,4,6,7]
    Y = [1,1,1,1,1,0,0,0,0,0,3,3,3,3,3,1,2,3,4,6,7,8,5,5,5,5,5,2,4,6,7]

So I have to check in X whether a number repeats itself 5 times or more, and then return the value showing in Y. For example, the number 5 repeats itself 5 times, so the output should contain the value from Y, which is 1, only once. Repetitions of 0 should be excluded. The final output would therefore be [1, 3, 5], as the numbers 5, 4, and then 5 again are each repeated 5 times in X.

RE: count occurrence of numbers in a sequence and return corresponding value - michalmonday - May-20-2019

Now I'm even more confused, to be honest. So

    X = [5,5,5,5,5] results in Y = [1,1,1,1,1]

but

    X = [4,4,4,4,4] results in Y = [3,3,3,3,3]

Another thing that confuses me is that

    X = [1,2,3] results in Y = [1,2,3]

but

    X = [6,6,6] results in Y = [4,6,7]

And in the first example it used to be:

    X = [3,4,5] resulting in Y = [0,0,0]

What's the logic behind it?

RE: count occurrence of numbers in a sequence and return corresponding value - python_newbie09 - May-20-2019

(May-20-2019, 05:28 PM)michalmonday Wrote: Now I'm even more confused to be honest.

I suggest sticking to the latest sample data that I showed. Basically, I am observing data from a machine that outputs numbers in X. When these numbers repeat themselves, it means something is wrong, and the machine then outputs the failure information in Y. That is why I need to know what Y is when X has repeated occurrences: the failure description is tied to the number shown in Y.

The reason I need the information only once per run is that the same failure in Y can occur again at a later time point. Even if I used the groupby method on Y, it would not separate these occurrences. For example, Y may contain

    [5,5,5,5,5,1,2,3,5,5,5,5,5,0,1,3,3,3,3,3]

so I need to know that failure 5 occurred twice in this time series, not the sum of its occurrences. I hope this is clear, and thanks for your patience.
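
The requirement as finally stated (report the Y value once per run of 5 or more equal values in X, skip runs of 0, and count how often each failure code occurs) could be sketched roughly as below. This is not an answer taken from the thread, only an illustration under stated assumptions: the Y value is read at the start of each run, and failure_counts, min_len, ignore, X_hypo and Y_hypo are hypothetical names. X_hypo in particular is invented so that the Y list from the last post has something to pair with.

```python
from collections import Counter
from itertools import groupby


def failure_counts(x, y, min_len=5, ignore=(0,)):
    """Return (codes, counts): the Y value at the start of every qualifying
    run in x, in order of appearance, and how often each such value occurs.

    A run qualifies when it has at least min_len equal consecutive values
    and its value is not listed in ignore (assumption: 0 means "no failure").
    """
    codes = []
    start = 0  # index in x where the current run begins
    for value, group in groupby(x):
        length = sum(1 for _ in group)
        if length >= min_len and value not in ignore:
            codes.append(y[start])  # one entry per run, not per row
        start += length
    return codes, Counter(codes)


# Sample data from the thread
X = [5,5,5,5,5,0,0,0,0,0,4,4,4,4,4,1,2,3,6,6,6,6,5,5,5,5,5,2,4,6,7]
Y = [1,1,1,1,1,0,0,0,0,0,3,3,3,3,3,1,2,3,4,6,7,8,5,5,5,5,5,2,4,6,7]
codes, counts = failure_counts(X, Y)
print(codes)         # [1, 3, 5]
print(dict(counts))  # {1: 1, 3: 1, 5: 1}

# Y from the last post, paired with a made-up X (hypothetical, for illustration)
X_hypo = [7,7,7,7,7,1,2,3,9,9,9,9,9,0,1,8,8,8,8,8]
Y_hypo = [5,5,5,5,5,1,2,3,5,5,5,5,5,0,1,3,3,3,3,3]
codes_hypo, counts_hypo = failure_counts(X_hypo, Y_hypo)
print(dict(counts_hypo))  # {5: 2, 3: 1}
```

Because each run contributes exactly one entry, failure 5 is counted twice in the second example instead of ten times, which is the distinction described in the last post.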