Feb-11-2018, 06:45 AM
Hi, I recently had an assignment where we had to sum different groups of columns. The data originally had 64 columns, and we wanted to reduce it to 16 by summing various columns. Each row is essentially a 1x64 array, or you can think of it as an 8x8 matrix. To get the first 8 new sums, I simply took the values in groups of 8 and summed each group; those were my first eight new columns/features. The next 8 sums seemed more difficult. They can be thought of as the column sums of the 8x8 matrix. I ended up writing some pretty roundabout code to get what I wanted, but I was wondering if anyone had tips on how it could be made shorter, simpler, and quicker. Keep in mind that I did this in pandas. My original code is below. Any tips or hints are much appreciated.
Jon
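import numpy as np
import pandas as pd

df_2 = pd.DataFrame()  # accumulates one 16-column row per input row
for index, row in df.iloc[:, 0:64].iterrows():
    # view the 64 values of this row as an 8x8 matrix
    row = np.array(row).reshape(8, 8)
    # axis=0 sums down each column of the 8x8 matrix
    row_sum = row.sum(axis=0)
    row_sum = pd.DataFrame(row_sum.reshape(1, 8), columns=list(range(0, 8)))
    # axis=1 sums across each matrix row, i.e. each group of 8 values
    column_sum = row.sum(axis=1)
    column_sum = pd.DataFrame(column_sum.reshape(1, 8), columns=list(range(8, 16)))
    result = pd.concat([row_sum, column_sum], axis=1)
    # DataFrame.append is gone in pandas 2.0, so concat is used instead
    df_2 = pd.concat([df_2, result])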
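For comparison, here is a minimal vectorized sketch of the same idea, not a definitive solution. It assumes the first 64 columns are numeric and that each row should be read as an 8x8 matrix in row-major order. The helper name `block_sums`, its `side` parameter, and the `demo` frame are made up for illustration. It puts the group-of-8 sums first and the matrix-column sums second, as in the description above; swap the two arrays in `np.hstack` to match the ordering the loop produces.

```python
import numpy as np
import pandas as pd


def block_sums(df, side=8):
    """Hypothetical helper: reduce side*side columns to 2*side sums per row."""
    # Take the first side*side columns as a plain 2-D NumPy array.
    values = df.iloc[:, : side * side].to_numpy()
    # Reshape every row into its own side x side matrix: (n_rows, side, side).
    cubes = values.reshape(-1, side, side)
    # Summing over the last axis adds up each group of `side` values.
    group_sums = cubes.sum(axis=2)
    # Summing over the middle axis adds up each matrix column.
    column_sums = cubes.sum(axis=1)
    # Place the two blocks side by side: group sums, then column sums.
    out = np.hstack([group_sums, column_sums])
    return pd.DataFrame(out, index=df.index, columns=list(range(2 * side)))


# Made-up demo data: two rows holding 0..63 and 64..127.
demo = pd.DataFrame(np.arange(128).reshape(2, 64))
sums = block_sums(demo)
```

Because the reshape and both sums run over the whole array at once, there is no Python-level loop and no repeated DataFrame concatenation, which is usually where the time goes with `iterrows()`.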