Python Forum
[pandas] Find the first element that is -1




[pandas] Find the first element that is -1 - dervast - Jun-12-2019

Hi all,
I have a dataset in which -1 marks the point where I need to stop reading.
For example, the dataset looks like this:
   
          0   1   2    3    4  5
0       58  68  58   59   -1 -1
1       59  69  59   -1   -1 -1
2       93  94  93   33   -1 -1
3       58  59  58   68   -1 -1
4       92  94  92   33   -1 -1
Here the -1 in column 4 of the first row means that only columns 0 to 3 should be read. What I actually want is the length of each row: the first row has length 4 (4 elements before the first -1), the second row has length 3, the third row has length 4, and so on.

To do that, I think I need a way in pandas to get, for each row, the index at which the first -1 occurs.

How can I do this cleanly in pandas, without writing a long for loop?

I would like to thank you in advance for your help.
Regards
Alex


RE: Find the first element that is -1 - stullis - Jun-13-2019

Do the values following the first -1 always equal -1 or can they vary? For example:

25 30 22 87 -1 -1 # Always -1
25 30 22 87 -1 6 # Not always -1
If it's the former, you can use filter to find all values that do not equal -1 and then use len() on the result. Otherwise, I'm not sure you'll be able to avoid a loop.
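As a rough sketch of the two cases in pandas (the DataFrame below copies three rows from the question and adds the hypothetical "not always -1" row from this post, so the last row is an assumption and not the original poster's data): counting the values that are not -1 is enough in the first case, while a cumulative sum over a boolean mask can find the first -1 in the second case without an explicit loop.

```python
import pandas as pd

df = pd.DataFrame([
    [58, 68, 58, 59, -1, -1],
    [59, 69, 59, -1, -1, -1],
    [93, 94, 93, 33, -1, -1],
    [25, 30, 22, 87, -1, 6],   # hypothetical row: a real value follows the first -1
])

# Case 1: everything after the first -1 is also -1,
# so counting the values that are not -1 gives the length.
count_valid = df.ne(-1).sum(axis=1)

# Case 2: values may vary after the first -1.
# The running count of -1 values stays at 0 until the first -1 appears,
# so counting the zeros gives the number of leading valid elements.
lengths = df.eq(-1).astype(int).cumsum(axis=1).eq(0).sum(axis=1)

print(count_valid.tolist())  # [4, 3, 4, 5]
print(lengths.tolist())      # [4, 3, 4, 4]
```

The two results differ only for the last row, where the simple count also includes the 6 that follows the first -1.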


RE: Find the first element that is -1 - noisefloor - Jun-13-2019

Hi,

you have to iterate over the rows anyway, either explicitly by looping over them yourself or implicitly through a Pandas method that does the iteration in the background for you.

I think Pandas is overkill for a simple problem like this. It can easily be solved with tools from the standard library. If the data set is really large, numpy should be able to do the job as well, assuming the data set has a uniform data type, as shown in the example.
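One possible standard-library version, as a minimal sketch: it assumes the data is already available as a list of rows, and the helper name `length_before_sentinel` is made up for illustration. `itertools.takewhile` stops at the first -1, so values after it do not matter.

```python
from itertools import takewhile

rows = [
    [58, 68, 58, 59, -1, -1],
    [59, 69, 59, -1, -1, -1],
    [93, 94, 93, 33, -1, -1],
    [58, 59, 58, 68, -1, -1],
    [92, 94, 92, 33, -1, -1],
]

def length_before_sentinel(row, sentinel=-1):
    """Count the elements of row that come before the first sentinel value."""
    return sum(1 for _ in takewhile(lambda value: value != sentinel, row))

lengths = [length_before_sentinel(row) for row in rows]
print(lengths)  # [4, 3, 4, 4, 4]
```

A row without any -1 simply yields its full length here.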

Regards, noisefloor


RE: Find the first element that is -1 - ThomasL - Jun-14-2019

You can do this very fast with numpy:
import numpy as np

a = np.array([[58,  68,  58,   59,   -1, -1], 
              [59,  69,  59,   -1,   -1, -1], 
              [93,  94,  93,   33,   -1, -1], 
              [58,  59,  58,   68,   -1, -1], 
              [92,  94,  92,   33,   -1, -1]])

# Count the entries in each row that are not -1
print(np.sum(a != -1, axis=1))
which outputs an array with the "length" of each row (every value that is not -1 is counted, so this assumes that only -1 values follow the first -1):
[4 3 4 4 4]
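If other values can follow the first -1, as asked earlier in the thread, a variant based on `argmax` may be closer to what is needed. This is only a sketch: the last two rows of the array are hypothetical test rows, not part of the original data. On a boolean array `argmax` returns the position of the first True, and it returns 0 when a row contains no True at all, so rows without any -1 are handled separately.

```python
import numpy as np

a = np.array([[58, 68, 58, 59, -1, -1],
              [59, 69, 59, -1, -1, -1],
              [25, 30, 22, 87, -1,  6],   # hypothetical: a value follows the first -1
              [ 1,  2,  3,  4,  5,  6]])  # hypothetical: no -1 in this row

is_stop = a == -1

# Column index of the first -1 in each row (0 if the row has none)
first_stop = is_stop.argmax(axis=1)

# Rows without any -1 get the full row length instead of the misleading 0
lengths = np.where(is_stop.any(axis=1), first_stop, a.shape[1])

print(lengths)  # [4 3 4 6]
```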