Apr-13-2017, 10:10 AM
If there are only two lines for each CHQNO and only one of withdrawal/deposit is used on each row, then you can sort your df to be "like" your example. And if there could be more than two lines with same CHQNO or more withdrawal/deposits, then your 2nd method does something else than 1st one.
As I mentioned in previous post, with pandas you can do basically same as with your first sql method - join on chqno/withdrawal/deposit and filter.
As I mentioned in previous post, with pandas you can do basically same as with your first sql method - join on chqno/withdrawal/deposit and filter.
Output:In [4]: df['idx'] = df.index
In [5]: aux_df = pd.merge(df, df, left_on=['CHQNO', 'Deposit'], right_on=['CHQNO', 'Withdrawal'])
In [6]: df['rtn'] = ""
In [7]: df.rtn[aux_df.idx_x[~aux_df.CHQNO.isnull()]] = "RTN"
In [8]: df
Out[8]:
DATE CHQNO Deposit Withdrawal idx rtn
0 2016-04-19 533247.0 1975000.00 NaN 0 RTN
1 2016-04-19 533247.0 NaN 1975000.00 1 RTN
2 2016-05-05 NaN 1947293.42 NaN 2
3 2016-05-05 NaN NaN 1947293.42 3
4 2016-06-03 535199.0 1930000.00 NaN 4 RTN
5 2016-06-04 535199.0 NaN 1930000.00 5 RTN
6 2016-08-04 NaN 195000.00 NaN 6
7 2016-08-05 628490.0 5000.00 NaN 7 RTN
8 2016-08-05 628490.0 NaN 5000.00 8 RTN
9 2016-12-30 750200.0 15000.00 NaN 9
10 2016-12-31 326500.0 NaN 15000.00 10