Posts: 124
Threads: 74
Joined: Apr 2017
Hi,
I have a DataFrame with 10K rows, and iterating over it with iterrows is slow, so I switched to itertuples and getattr. However, I also need to access the previous row. The code below fails when it tries to do that. Can anyone help me access the previous row using the index?
import pandas as pd
d = {'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)
for idx, row in enumerate(df.itertuples(), 1):
    print("Current index:", row)
    print("current col2 value:", getattr(row, 'col2'))
    print("Previous col2 value:", getattr(df[idx-1], 'col2'))
The error is:
raise KeyError(key) from err
KeyError: 0
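For context on the traceback: indexing a DataFrame with `df[key]` looks up a column label, so `df[0]` searches for a column named 0 and raises KeyError. Positional access goes through `.iloc` instead. A minimal sketch reusing the frame from the question, assuming the goal is only to read the previous row's col2 (the names `raised` and `previous_values` are made up for illustration):

```python
import pandas as pd

d = {'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)

# df[0] looks for a COLUMN labelled 0, which does not exist here.
try:
    df[0]
    raised = False
except KeyError:
    raised = True

# Positional access uses .iloc; start at position 1 so a previous row exists.
previous_values = []
for pos in range(1, len(df)):
    current = df['col2'].iloc[pos]
    previous = df['col2'].iloc[pos - 1]
    previous_values.append(previous)
    print("current col2 value:", current, "previous col2 value:", previous)
```

This only explains the error; it is not meant as a fast replacement for itertuples.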
Posts: 6,779
Threads: 20
Joined: Feb 2020
Feb-04-2022, 03:57 PM
(This post was last modified: Feb-04-2022, 03:57 PM by deanhystad.)
If you start at the first row there is no previous row.
You can hang on to the previous row and print the previous row after you get the second row.
import pandas as pd
d = {'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)
prev = None
for row in df.itertuples():
    print("Current index:", row)
    print("current col2 value:", getattr(row, 'col2'))
    if prev is not None:
        print("Previous col2 value:", getattr(prev, 'col2'))
    prev = row
Or you can start printing at the second row.
import pandas as pd
d = {'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)
rows = df.itertuples()
prev = next(rows)  # Gets first row
for row in rows:  # Will start at second row
    print("Current index:", row)
    print("current col2 value:", getattr(row, 'col2'))
    print("Previous col2 value:", getattr(prev, 'col2'))
    prev = row
Posts: 1,950
Threads: 8
Joined: Jun 2018
Just out of curiosity: why iterate over rows when you only need values from one column? You can access a column (a Series) directly, without iterating over rows.
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 6,779
Threads: 20
Joined: Feb 2020
I think that was for demo purposes. If not, grabbing a column series would be MUCH faster.
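To make the column-series idea concrete, here is a hedged sketch: `Series.shift(1)` lines each value up with the one from the row before, so no Python-level loop is needed. The column names `col2_prev` and `col2_delta` are assumptions made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]})

# shift(1) moves every value down one row; the first row has no
# predecessor, so it receives NaN (and the column becomes float).
df['col2_prev'] = df['col2'].shift(1)

# Example of a vectorized computation on current and previous values.
df['col2_delta'] = df['col2'] - df['col2_prev']

print(df)
```

The first row still needs a decision: keep the NaN, fill it, or drop the row.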
Posts: 124
Threads: 74
Joined: Apr 2017
(Feb-04-2022, 07:28 PM)deanhystad Wrote: I think that was for demo purposes. If not, grabbing a column series would be MUCH faster
Yes, I need every column; I showed only one for simplicity. Is there any way other than itertuples to access the previous row with good performance?
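One possible answer when every column is needed, sketched under the assumption that the per-row work can be written as column operations: shift the whole frame by one row and join it back, so each row carries both its own values and the previous row's. The `_prev` suffix and the names `prev` and `both` are illustrative choices, not anything from the thread.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]})

# Shift the whole frame down one row and tag the shifted columns.
prev = df.shift(1).add_suffix('_prev')

# Join on the index: each row now holds current and previous values.
both = df.join(prev)

# Drop the first row, which has no previous row (its *_prev values are missing).
both = both.iloc[1:]

print(both)
```

If the logic really has to run row by row, `both.itertuples()` would still hand over current and previous values in a single tuple.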
Posts: 124
Threads: 74
Joined: Apr 2017
(Feb-04-2022, 03:57 PM)deanhystad Wrote: If you start at the first row there is no previous row.
You can hang on to the previous row and print the previous row after you get the second row.
import pandas as pd
d = {'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)
prev = None
for row in df.itertuples():
    print("Current index:", row)
    print("current col2 value:", getattr(row, 'col2'))
    if prev is not None:
        print("Previous col2 value:", getattr(prev, 'col2'))
    prev = row
Or you can start printing at the second row.
import pandas as pd
d = {'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]}
df = pd.DataFrame(data=d)
rows = df.itertuples()
prev = next(rows)  # Gets first row
for row in rows:  # Will start at second row
    print("Current index:", row)
    print("current col2 value:", getattr(row, 'col2'))
    print("Previous col2 value:", getattr(prev, 'col2'))
    prev = row
I need to iterate over the rows and access the contents of both the previous and the current row on each iteration, not just for the first row.
Posts: 6,779
Threads: 20
Joined: Feb 2020
Feb-05-2022, 05:55 PM
(This post was last modified: Feb-05-2022, 05:55 PM by deanhystad.)
Both of my examples handle every iteration, not just the first. Run them and you will see. They differ only in how they handle the first iteration.
You can't iterate over all the rows and have a previous row on every iteration, because there is no "previous row" for the first one. In my first example I handle the missing previous row by not printing it. In the second example I skip the first row and iterate over the remaining rows, so the first row becomes the first "previous row".
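The two ways of handling the first iteration can be wrapped in one small helper. This is only a sketch: `with_previous` and its `skip_first` flag are hypothetical names, not part of pandas.

```python
import pandas as pd


def with_previous(rows, skip_first=False):
    """Yield (previous, current) pairs.

    previous is None for the first row, unless skip_first drops that pair.
    """
    prev = None
    for i, row in enumerate(rows):
        if i > 0 or not skip_first:
            yield prev, row
        prev = row


df = pd.DataFrame({'col1': ['A', 'B', 'C', 'D'], 'col2': [1, 2, 3, 4]})

# First approach: visit every row; the first one has no previous row (None).
all_pairs = [(p.col2 if p is not None else None, r.col2)
             for p, r in with_previous(df.itertuples())]

# Second approach: start at the second row, so a previous row always exists.
later_pairs = [(p.col2, r.col2)
               for p, r in with_previous(df.itertuples(), skip_first=True)]

print(all_pairs)
print(later_pairs)
```

Either way, every row after the first is visited together with its predecessor; the flag only decides what happens to the first row.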
Posts: 1,950
Threads: 8
Joined: Jun 2018
One way is to access rows by position. Of course, there is still the question of what to do with the first row: should it be ignored, or handled some other way?
import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(35).reshape(7, 5), columns=[*'abcde'])
for i in range(1, df.shape[0]):
    print("Current row", *df.iloc[i])
    print("Previous row", *df.iloc[i-1])
Output:
Current row 5 6 7 8 9
Previous row 0 1 2 3 4
Current row 10 11 12 13 14
Previous row 5 6 7 8 9
Current row 15 16 17 18 19
Previous row 10 11 12 13 14
Current row 20 21 22 23 24
Previous row 15 16 17 18 19
Current row 25 26 27 28 29
Previous row 20 21 22 23 24
Current row 30 31 32 33 34
Previous row 25 26 27 28 29
I am not motivated enough to find out whether it's faster than itertuples and getattr. I also believe that printing is not the real objective, as I can't see any value in printing out 10K rows.
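For anyone who does want to measure it, a rough timing sketch with `timeit` follows. The frame size, the column used, and the function names are assumptions chosen for illustration, and the numbers will vary with the machine and the pandas version, so no result is claimed here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test frame; 2,000 rows keeps the comparison quick.
df = pd.DataFrame(np.arange(10_000).reshape(2_000, 5), columns=[*'abcde'])


def sum_with_iloc(frame):
    """Add up a[i] - a[i-1] using positional row access."""
    total = 0
    for i in range(1, frame.shape[0]):
        total += frame.iloc[i]['a'] - frame.iloc[i - 1]['a']
    return total


def sum_with_itertuples(frame):
    """Same computation, carrying the previous named tuple along."""
    total = 0
    rows = frame.itertuples()
    prev = next(rows)
    for row in rows:
        total += row.a - prev.a
        prev = row
    return total


iloc_seconds = timeit.timeit(lambda: sum_with_iloc(df), number=3)
tuples_seconds = timeit.timeit(lambda: sum_with_itertuples(df), number=3)
print(f"iloc: {iloc_seconds:.3f}s  itertuples: {tuples_seconds:.3f}s")
```

Both functions compute the same total, so the comparison isolates the cost of the row access itself.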