If you can rewrite the loop

```python
for j in range(m):
    ...
```

in NumPy-vectorized form, it will run faster, e.g. something like this:

```python
delta_mat = fp1[:, 1] - fp2[:, 2]
```
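To illustrate the point, here is a small self-contained sketch (the arrays `fp1` and `fp2` are hypothetical stand-ins for your data) showing that the vectorized expression gives the same result as the explicit loop, in one array operation instead of `m` iterations:

```python
import numpy as np

# Hypothetical small arrays standing in for fp1 and fp2.
rng = np.random.default_rng(0)
fp1 = rng.random((1000, 4))
fp2 = rng.random((1000, 4))

# Loop version: one Python-level iteration per row (slow).
m = fp1.shape[0]
delta_loop = np.empty(m)
for j in range(m):
    delta_loop[j] = fp1[j, 1] - fp2[j, 2]

# Vectorized version: one array expression replaces the whole loop.
delta_mat = fp1[:, 1] - fp2[:, 2]

print(np.allclose(delta_loop, delta_mat))
```

The speedup comes from NumPy doing the subtraction in compiled code over contiguous memory instead of interpreting the loop body `m` times.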
Below is an example (not tested) where I compute the difference between two huge arrays that are given in txt/csv format, as in your case:

```python
import numpy as np
import pandas as pd

common_size = 10 ** 6
N = 10 ** 9
filename = ["file-1.txt", "file-2.txt"]

# Read both files lazily, common_size rows at a time.
chunks1 = pd.read_csv(filename[0], chunksize=common_size, names=['c1', 'c2', 'lt', 'rt'])
chunks2 = pd.read_csv(filename[1], chunksize=common_size, names=['ch', 'tmstp', 'lt', 'rt'])

# Example: column-wise difference, i.e. ch - c1, tmstp - c2, lt - lt, rt - rt.
# The result is stored on disk in newfile1.dat.
output = np.memmap('newfile1.dat', dtype='float64', mode='w+', shape=(N, 4))

for ind, (chunk1, chunk2) in enumerate(zip(chunks1, chunks2)):
    start = common_size * ind
    # len(chunk1) handles the last chunk, which may be shorter than common_size.
    output[start : start + len(chunk1), :] = chunk2.values - chunk1.values

output.flush()
```

It may raise an error if file-1 and file-2 have different numbers of rows.
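A scaled-down, runnable version of the same idea (tiny CSV files and a small chunk size are hypothetical stand-ins for your real inputs) shows the chunked read, the memmap write, and the short last chunk all working together:

```python
import numpy as np
import pandas as pd

# Write two tiny CSV files standing in for file-1.txt / file-2.txt.
a = np.arange(12, dtype='float64').reshape(3, 4)
b = a * 10.0
np.savetxt('tiny-1.txt', a, delimiter=',')
np.savetxt('tiny-2.txt', b, delimiter=',')

common_size = 2   # small chunk size for the demo
N = 3             # total number of rows in each file

chunks1 = pd.read_csv('tiny-1.txt', chunksize=common_size, names=['c1', 'c2', 'lt', 'rt'])
chunks2 = pd.read_csv('tiny-2.txt', chunksize=common_size, names=['ch', 'tmstp', 'lt', 'rt'])

# Disk-backed output array: rows are written chunk by chunk.
output = np.memmap('tiny-out.dat', dtype='float64', mode='w+', shape=(N, 4))

for ind, (chunk1, chunk2) in enumerate(zip(chunks1, chunks2)):
    start = common_size * ind
    output[start : start + len(chunk1), :] = chunk2.values - chunk1.values

output.flush()
print(np.allclose(output, b - a))
```

With 3 rows and `chunksize=2`, the loop processes one full chunk of 2 rows and a final chunk of 1 row, which is why the left-hand slice is sized by `len(chunk1)` rather than by `common_size`.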