Python Forum
Corrupted numpy arrays when save to file. - Printable Version


Corrupted numpy arrays when save to file. - DreamingInsanity - Dec-13-2019

I take a NumPy array and write it to a gzip file like this:
f = gzip.GzipFile("frames.npy.gz", "a")
np.save(file=f, arr=arr)
f.close()
However, when I try to extract it (on macOS) with Archive Utility, I get:
Error:
Error 32 - Broken pipe
A similar error appears using Unarchiver. Using unzip in the terminal results in the error:
Error:
End-of-central-directory signature not found.
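A side note on the unzip message: unzip reads zip archives, while gzip.GzipFile writes the gzip format, so this particular error may only mean that the file is not a zip archive rather than that the data is damaged. A minimal sketch for checking which container a file uses by its leading bytes is below. The function name sniff_container and the demo file are hypothetical, not part of the original script.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"    # first two bytes of a gzip member
ZIP_MAGIC = b"PK\x03\x04"   # local file header at the start of a typical zip archive


def sniff_container(path):
    """Return 'gzip', 'zip', or 'unknown' based on the file's leading bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# Hypothetical demo file standing in for frames.npy.gz.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.npy.gz")
with gzip.open(demo_path, "wb") as f:
    f.write(b"payload")

print(sniff_container(demo_path))  # gzip
```

If the real file reports 'gzip', tools such as gunzip or Python's gzip module are the matching readers.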
I am writing to the gzip file from several different processes, which could well be causing the problem, yet I do the same thing elsewhere in the script with no problem at all extracting the file.
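One way to rule out interleaved writes from several processes is to give each process its own output file. The sketch below assumes that is acceptable for the script. The helper save_for_this_process and the per-PID naming scheme are hypothetical, chosen only for illustration.

```python
import gzip
import os
import tempfile

import numpy as np


def save_for_this_process(arr, directory, stem="frames"):
    """Append arr to a gzip file that only the calling process writes to."""
    # Hypothetical naming scheme: one file per process ID, so no two
    # processes ever append to the same gzip stream.
    path = os.path.join(directory, f"{stem}.{os.getpid()}.npy.gz")
    with gzip.open(path, "ab") as f:
        np.save(f, arr)
    return path


# Small stand-in array; the array in the post is far larger.
out_dir = tempfile.mkdtemp()
demo = np.arange(6, dtype=np.uint8).reshape(2, 3)
written = save_for_this_process(demo, out_dir)

with gzip.open(written, "rb") as f:
    restored = np.load(f)
```

If the per-process files extract cleanly while the shared file does not, concurrent appends become the more likely cause.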

Force extracting the file, this is the data contained within:
ìNUMPY�v�{'descr': '|u1', 'fortran_order': False, 'shape': (24000, 18000, 4), }
This is located at the top - this looks fine to me.
Most of the file is then filled with:
ˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇ
About 167 million lines down, there is some more data like this:
HXeˇP`mˇHXeˇ5ERˇ7GT
And then it continues with:
ˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇˇ
for the rest of the file. This is clearly what is causing the problem.
The annoying thing is, the arrays themselves are perfectly fine, so I can't figure out what is causing the corruption (if it is even Python).
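A note on reading the dump above: the header's '|u1' means unsigned 8-bit values, and a text viewer that decodes bytes as Mac OS Roman shows byte 0xFF as "ˇ" and byte 0x93 (the first byte of the .npy magic string) as "ì". The long runs of "ˇ" may therefore just be array elements equal to 255 rather than corruption. The sketch below reproduces that rendering, assuming the viewer used Mac OS Roman, which the post does not state.

```python
import io

import numpy as np

# Assumption: the raw bytes were displayed using the Mac OS Roman encoding.
white = np.full(8, 255, dtype=np.uint8)          # eight elements of value 255
as_text = white.tobytes().decode("mac_roman")    # how those bytes would be displayed

# Serialize to an in-memory .npy and look at the first six bytes (the magic string).
buf = io.BytesIO()
np.save(buf, white)
magic_text = buf.getvalue()[:6].decode("mac_roman")

print(as_text)
print(magic_text)
```

With shape (24000, 18000, 4) the last axis could be four colour channels, in which case 255 everywhere would be a plausible value for a mostly white or fully opaque image, but that is a guess about the data.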


RE: Corrupted numpy arrays when save to file. - ibreeden - Dec-14-2019

Have you already tried f = gzip.open("frames.npy.gz", "a") instead of f = gzip.GzipFile("frames.npy.gz", "a")?


RE: Corrupted numpy arrays when save to file. - DreamingInsanity - Dec-14-2019

(Dec-14-2019, 10:58 AM)ibreeden Wrote: Have you already tried f = gzip.open("frames.npy.gz", "a") instead of f = gzip.GzipFile("frames.npy.gz", "a")?
Yeah I have - unfortunately it doesn't make a difference.
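That the switch makes no difference is expected: in binary mode gzip.open returns a gzip.GzipFile object, so both calls write the same kind of stream. A more direct check than an archive tool is to read the file back with gzip and np.load. The sketch below is a minimal round trip under the assumption that each array was appended with its own np.save call. The file name and the tiny arrays are stand-ins, not the original data.

```python
import gzip
import os
import tempfile

import numpy as np

# Hypothetical demo file standing in for frames.npy.gz.
path = os.path.join(tempfile.mkdtemp(), "frames_demo.npy.gz")
frames = [
    np.full((2, 3, 4), 255, dtype=np.uint8),
    np.arange(24, dtype=np.uint8).reshape(2, 3, 4),
]

# Append each array as in the original post, one np.save per open/close.
for arr in frames:
    with gzip.open(path, "ab") as f:
        is_gzipfile = isinstance(f, gzip.GzipFile)
        np.save(f, arr)

# Read the arrays back in order; each np.load consumes one saved array.
loaded = []
with gzip.open(path, "rb") as f:
    for _ in range(len(frames)):
        loaded.append(np.load(f))
```

If the arrays load and compare equal to what was saved, the file itself is intact and the extraction errors come from the tool used to open it.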