Python Forum

Read binary file
Hi, I am a beginner and I need to read binary files.
How can I read every fourth (nth) chunk of 1024 bytes?
What have you tried? Please post your code and describe the issue.
Open the file with mode 'rb', for example:
chunksize = 256 # Or whatever size you want to read in at a time
with open('MyFilename', 'rb') as f:
    for chunk in iter(lambda: f.read(chunksize), b""):
        ...  # process chunk here
(Feb-15-2018, 10:13 AM) ammann wrote: Hi, I am a beginner and I need to read binary files.
How can I read every fourth (nth) chunk of 1024 bytes?

Larz60+'s solution reads the whole file in chunks.
Extending that example with enumerate lets you skip every chunk whose index is not a multiple of 4:

chunksize = 256 # Or whatever size you want to read in at a time
with open('MyFilename', 'rb') as f:
    for n, chunk in enumerate(iter(lambda: f.read(chunksize), b"")):
        if n % 4 != 0:
            continue
        # process every 4th chunk (n = 0, 4, 8, ...) here
Simplified:


from functools import partial
# to get rid of the lambda


chunksize = 1024
with open('MyFilename', 'rb') as f:
    reader_partial = partial(f.read, chunksize)
    reader_iterator = iter(reader_partial, b"")
    # iter() keeps calling reader_partial until it returns an empty bytes object
    for n, chunk in enumerate(reader_iterator):
        if n % 4 != 0:
            continue
        # do something with every 4th 1024-byte chunk
        # code...
Both this example and the previous one still read the entire file, including the chunks that are never processed.
You can avoid that by using the tell and seek methods of the file object.

With this approach you get every 4th block of 1024 bytes, starting at offset zero:

with open('/bin/sh', 'rb') as fd:
    while True:
        print('At position:', fd.tell())
        chunk = fd.read(1024)
        print('Reading 1024 bytes')
        if not chunk:
            break
        fd.seek(3 * 1024, 1)


I'm not sure if I made a mistake with the factor 3 in fd.seek.
I use fd.seek(n, 1) to move the file pointer n bytes relative to the current position.
fd.seek(n, 0) is an absolute move to position n.
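
The factor 3 looks right: read() already advances the position by one chunk, so skipping three more lands on the fourth. Here is a small sketch to check this, assuming an in-memory io.BytesIO buffer as a stand-in for the real file. The helper name every_fourth_chunk and the test data are made up for illustration.

```python
import io
import os

CHUNK = 1024  # chunk size from the question


def every_fourth_chunk(fd, chunk_size=CHUNK, step=4):
    """Yield (offset, chunk) for every `step`-th chunk, seeking past the rest."""
    while True:
        offset = fd.tell()
        chunk = fd.read(chunk_size)
        if not chunk:
            break
        yield offset, chunk
        # One chunk was just read, so skip the remaining step - 1 chunks.
        # os.SEEK_CUR is the named constant for whence=1 (relative seek).
        fd.seek((step - 1) * chunk_size, os.SEEK_CUR)


# Stand-in data: 10 chunks, each filled with its own index as the byte value.
data = b"".join(bytes([i]) * CHUNK for i in range(10))

offsets = [offset for offset, _ in every_fourth_chunk(io.BytesIO(data))]
indices = [chunk[0] for _, chunk in every_fourth_chunk(io.BytesIO(data))]
print(offsets)  # [0, 4096, 8192]
print(indices)  # [0, 4, 8]
```

The same generator should work on a file opened with mode 'rb', since relative seeks are allowed on binary files.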
Thanks for the answers, but I have another problem.
chunksize = 1024
new_image = []


with open('damaged.jpg', 'rb') as f:
    print(f.read(1024))
    for n, chunk in enumerate(iter(lambda: f.read(chunksize), b"")) :
        if (n-1)%3 != 0 :
            new_image.append(chunk)
        else:
            new_image.append(chunk[::-1])


print(len(new_image))

with open('new.jpg', 'wb') as new:
    i = 0
    while i+1 <= len(new_image):
        new.write(new_image[i])
        i += 1
I wrote all chunks to new_image, but the size of damaged.jpg is 57984 bytes and the size of new.jpg is 56960 bytes. So how did I lose one chunk?
Edit: Oh, I found my mistake in the 6th line.
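
For anyone reading later: the sixth line of the script above is print(f.read(1024)). That call consumes the first 1024 bytes, so the loop starts at the second chunk, which matches the difference 57984 - 56960 = 1024. Below is a minimal sketch of the effect, assuming io.BytesIO with dummy zero bytes as a stand-in for damaged.jpg. The helper name read_chunks is made up for illustration.

```python
import io

CHUNK = 1024


def read_chunks(f, chunk_size=CHUNK):
    """Return all remaining chunks, starting from the current file position."""
    return list(iter(lambda: f.read(chunk_size), b""))


# Same size as damaged.jpg in the post; the contents are dummy zeros.
data = bytes(57984)

f = io.BytesIO(data)
f.read(CHUNK)  # like print(f.read(1024)): moves the position forward
lost = sum(len(c) for c in read_chunks(f))

f.seek(0)  # rewind to the start before looping
full = sum(len(c) for c in read_chunks(f))

print(lost, full)  # 56960 57984
```

Either removing the extra read or calling f.seek(0) before the loop should restore the missing chunk.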
Are you trying to copy a file?

with open('image.jpg', 'rb') as in_file:
    with open('new_image.jpg', 'wb') as out_file:
        chunk = 4096

        while True:
            chnk = in_file.read(chunk)
            if chnk:
                out_file.write(chnk)
            else:
                break
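
If the goal is only an unchanged copy, the standard library's shutil.copyfileobj performs the same chunked read/write loop. This is a swapped-in alternative to the loop above, not the poster's method. The sketch assumes a throwaway temporary file in place of image.jpg, and copy_file is a made-up helper name.

```python
import os
import shutil
import tempfile


def copy_file(src_path, dst_path, chunk_size=4096):
    """Copy src_path to dst_path, reading chunk_size bytes at a time."""
    with open(src_path, 'rb') as in_file, open(dst_path, 'wb') as out_file:
        shutil.copyfileobj(in_file, out_file, chunk_size)


# Demo with random bytes in a temporary directory (placeholder file names).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'image.jpg')
dst = os.path.join(tmp_dir, 'new_image.jpg')
with open(src, 'wb') as f:
    f.write(os.urandom(10000))

copy_file(src, dst)
print(os.path.getsize(dst))  # 10000
```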