Jun-02-2020, 03:01 PM
My program can generate huge numpy arrays. Personally, I think a sensible maximum size is an array of shape (819200, 460800, 4) - a 4K (4096x2304) image scaled up 200 times. Although it can create bigger, if you do it's just a bit stupid.
According to numpy, if you try and create an array that big:
datas = np.zeros((819200, 460800, 4), np.uint8)
Error: Unable to allocate 1.37 TiB for an array with shape (819200, 460800, 4) and data type uint8
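A quick sanity check of that figure, just from the element count (one byte per uint8, four channels per pixel):

```python
# Raw in-memory size of a (819200, 460800, 4) uint8 array: one byte per element.
size_bytes = 819200 * 460800 * 4
size_tib = size_bytes / 2**40
print(round(size_tib, 2))  # ~1.37 TiB, matching numpy's error message
```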
Well... 1.37 TiB... that's a lot. Of course, that's because it's trying to allocate the whole thing in RAM. After doing some research, I found H5PY, which means I can store (and modify) my massive numpy array on disk rather than in RAM (at the cost of a performance hit, however). A numpy array that size, full of zeros, only takes 4 KB of disk space with H5PY (using GZIP compression):
with h5py.File("mytestfile.hdf5", "w") as f:
    dset = f.create_dataset("mydataset", shape=(819200, 460800, 4), dtype='uint8', compression='gzip')  # creates a ~4 KB file
Let's say that array is now full of RGBA values.
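(For context, I fill the dataset in row blocks, so only one block is ever in RAM at a time. Here's a sketch with a much smaller shape than my real (819200, 460800, 4) array; the block size and the solid opaque-red fill are just placeholders for my actual pixel data.)

```python
import h5py
import numpy as np

rows, cols = 1024, 768          # shrunk for illustration
block = 256                     # rows written per iteration

with h5py.File("mytestfile.hdf5", "w") as f:
    dset = f.create_dataset("mydataset", shape=(rows, cols, 4),
                            dtype='uint8', compression='gzip')
    for r in range(0, rows, block):
        # Build one block in RAM, write it to disk, then let it be freed.
        chunk = np.zeros((min(block, rows - r), cols, 4), np.uint8)
        chunk[..., 0] = 255     # placeholder fill: opaque red (R and A channels)
        chunk[..., 3] = 255
        dset[r:r + chunk.shape[0]] = chunk
```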
To save it as an image I need to do:
cv2.imwrite("someimage.png", np.array(dset))
. The issue? Well I have to load the dataset into a numpy array to save it as an image which means loading the array into RAM meaning I get a memory error.Unfortunately, I'm not able to do this:
cv2.imwrite("someimage.png", dset)
because cv2 isn't able to read from an H5PY dataset. Has anyone got an idea of how I can save my numpy array to an image without loading it into RAM?