If you can rewrite the loop

```python
for j in range(m):
    ...
```

in NumPy-vectorized form, it will run faster, e.g. something like this:

```python
delta_mat = fp1[:, 1] - fp2[:, 2]
```
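To illustrate the point, here is a small self-contained sketch (the arrays `fp1` and `fp2` are hypothetical stand-ins for your data) showing that the vectorized expression gives the same result as the explicit loop, in one array operation instead of `m` iterations:

```python
import numpy as np

# Hypothetical small arrays standing in for fp1 and fp2.
rng = np.random.default_rng(0)
fp1 = rng.random((1000, 4))
fp2 = rng.random((1000, 4))

# Loop version: one Python-level iteration per row (slow).
m = fp1.shape[0]
delta_loop = np.empty(m)
for j in range(m):
    delta_loop[j] = fp1[j, 1] - fp2[j, 2]

# Vectorized version: one array expression replaces the whole loop.
delta_mat = fp1[:, 1] - fp2[:, 2]

print(np.allclose(delta_loop, delta_mat))
```

The speedup comes from NumPy doing the subtraction in compiled code over contiguous memory instead of interpreting the loop body `m` times.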
Below is an example (not tested) where I compute the difference between two huge arrays that are given in txt/csv format, as in your case:

```python
import numpy as np
import pandas as pd

common_size = 10 ** 6
N = 10 ** 9
filename = ["file-1.txt", "file-2.txt"]

# Read both files lazily, common_size rows at a time.
chunks1 = pd.read_csv(filename[0], chunksize=common_size, names=['c1', 'c2', 'lt', 'rt'])
chunks2 = pd.read_csv(filename[1], chunksize=common_size, names=['ch', 'tmstp', 'lt', 'rt'])

# Example: column-wise difference, i.e. ch - c1, tmstp - c2, lt - lt, rt - rt.
# The result is stored on disk in newfile1.dat.
output = np.memmap('newfile1.dat', dtype='float64', mode='w+', shape=(N, 4))

for ind, (chunk1, chunk2) in enumerate(zip(chunks1, chunks2)):
    start = common_size * ind
    # len(chunk1) handles the last chunk, which may be shorter than common_size.
    output[start : start + len(chunk1), :] = chunk2.values - chunk1.values

output.flush()
```

It may raise an error if file-1 and file-2 have different numbers of rows.
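A scaled-down, runnable version of the same idea (tiny CSV files and a small chunk size are hypothetical stand-ins for your real inputs) shows the chunked read, the memmap write, and the short last chunk all working together:

```python
import numpy as np
import pandas as pd

# Write two tiny CSV files standing in for file-1.txt / file-2.txt.
a = np.arange(12, dtype='float64').reshape(3, 4)
b = a * 10.0
np.savetxt('tiny-1.txt', a, delimiter=',')
np.savetxt('tiny-2.txt', b, delimiter=',')

common_size = 2   # small chunk size for the demo
N = 3             # total number of rows in each file

chunks1 = pd.read_csv('tiny-1.txt', chunksize=common_size, names=['c1', 'c2', 'lt', 'rt'])
chunks2 = pd.read_csv('tiny-2.txt', chunksize=common_size, names=['ch', 'tmstp', 'lt', 'rt'])

# Disk-backed output array: rows are written chunk by chunk.
output = np.memmap('tiny-out.dat', dtype='float64', mode='w+', shape=(N, 4))

for ind, (chunk1, chunk2) in enumerate(zip(chunks1, chunks2)):
    start = common_size * ind
    output[start : start + len(chunk1), :] = chunk2.values - chunk1.values

output.flush()
print(np.allclose(output, b - a))
```

With 3 rows and `chunksize=2`, the loop processes one full chunk of 2 rows and a final chunk of 1 row, which is why the left-hand slice is sized by `len(chunk1)` rather than by `common_size`.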