Python Forum
Custom file class
#11
I don't know if this is the best solution, but it does appear to work fairly well for the header. This is the base class for binary files with headers.
"""Binfile.  A binary file with a header."""

def valid_key(key, valid_keys=None):
    """Validate key against a list of valid keys.  Return True if key in list"""
    if valid_keys is None:
        return True
    return key in valid_keys

class BinFile():
    """Class to read and write binary file format"""
    def __init__(self, filename=None, mode=None):
        self.filename = filename
        self.mode = mode
        self.file = None
        self.header = {}

    def __enter__(self):
        """Context manager enter.  Do open(filename, mode)"""
        self.open(self.filename, self.mode)
        return self

    def __exit__(self, *args):
        """Context manager exit.  Closes file if open"""
        self.close()

    def open(self, filename, mode):
        """Open file for reading ("r") or writing ("w")"""
        if mode in ('r', 'rb'):
            self.file = open(filename, 'rb')
            self.read_header()
        elif mode in ('w', 'wb'):
            self.file = open(filename, 'wb')
            self.write_header()  # Can call write_header again if changes are required
        else:
            self.file = None
            raise ValueError(f'Invalid mode {mode} for open')

    def close(self):
        """Close file if it is open"""
        if self.file is not None:
            self.file.close()
            self.file = None

    def read_header(self):
        """Read header from a file.  Looks for first "key" that contains a
        non-ascii character.  Override in subclass
        """
        self.file.seek(0)
        self.header = {}
        while True:
            key, value = self.read_header_line()
            if key is None:
                self.file.seek(len(self.header)*128)
                break
            self.header[key] = value

    def write_header(self):
        """Write header to a file"""
        self.file.seek(0)
        for key, value in self.header.items():
            self.write_header_line(key, value)

    def read_header_line(self, valid_keys=None):
        """Read a line from the header.  Return key and value.
        Use optional valid_keys to validate header key.
        """
        try:
            line = self.file.read(128).decode('ascii')
        except UnicodeDecodeError:
            return None, None

        if len(line) < 128:
            return None, None

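        # write_header_line pads with NUL bytes, which strip() alone leaves in place
        key = line[:32].rstrip('\0').strip()
        value = line[32:].rstrip('\0').strip()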
        if valid_keys is not None:
            if not valid_key(key, valid_keys):
                raise ValueError(f'Unexpected key {key}')
        return key, value

    def write_header_line(self, key, value=None):
        """Write a header line to a file"""
        if value is None:
            value = self.header[key]
        # Encode as ASCII to match read_header_line and keep each record at 128 bytes
        self.file.write(bytes(f'{key:\0<32}{value:\0<96}', 'ascii'))
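To make the record layout concrete, here is a minimal standalone sketch of one header line as the class above treats it: 128 bytes, with a 32-byte key followed by a 96-byte value, both NUL-padded. The helper names `pack_record` and `unpack_record` are made up for illustration and are not part of BinFile.

```python
RECORD_SIZE = 128   # bytes per header line, as in read_header_line
KEY_SIZE = 32       # bytes reserved for the key


def pack_record(key, value):
    """Return one 128-byte header record, NUL-padded (hypothetical helper)."""
    text = str(key).ljust(KEY_SIZE, '\0') + str(value).ljust(RECORD_SIZE - KEY_SIZE, '\0')
    record = text.encode('ascii')
    if len(record) != RECORD_SIZE:
        raise ValueError(f'key or value too long for a {RECORD_SIZE}-byte record')
    return record


def unpack_record(record):
    """Split a 128-byte record into key and value, dropping the NUL padding."""
    text = record.decode('ascii')
    return text[:KEY_SIZE].rstrip('\0'), text[KEY_SIZE:].rstrip('\0')


record = pack_record('channels', 8)
print(len(record))
print(unpack_record(record))
```

Note that values come back as strings, so a subclass would still need to convert things like channel counts to int.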
Now I am starting to look at the time history part of the file. Let's say I have 2 channels, call them x and y, each with 5 data points. The data in the file will look like this:
Quote: x0, y0, x1, y1, x2, y2, x3, y3, x4, y4
When I use the data, I need it arranged by channel:
Quote: x0, x1, x2, x3, x4, y0, y1, y2, y3, y4
A real file will probably be 8 channels instead of 2, and 10,000 data points per channel instead of 5. There is also a hard time limit on reading from the file and rotating the data (tens of milliseconds), so I need this to be fast.

I am looking at numpy.rot90. Are there other tools that do this kind of thing? Another option is to call an external C function to do the rotation.
#12
(Jan-31-2021, 08:43 PM)deanhystad Wrote: I am looking at numpy.rot90. Are there other tools that do this kind of thing? Another option is to call an external C function to do the rotation.

If that works, that's probably the best way forward. NumPy is optimized for large datasets, so as long as it gives you the right layout, you'll likely get about the best performance available from Python.
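For what it's worth, the rearrangement described looks like a reshape followed by a transpose rather than a rotation. Below is a rough sketch with made-up sample values (x is 0..4, y is 100..104); the uint32 dtype is only an assumption about the file format.

```python
import numpy as np

channels = 2             # would come from the header
samples_per_channel = 5  # would come from the header

# Stand-in for the interleaved file data: x0, y0, x1, y1, ...
interleaved = np.array([0, 100, 1, 101, 2, 102, 3, 103, 4, 104], dtype=np.uint32)

# One row per sample, one column per channel; reshape makes no copy here.
frames = interleaved.reshape(samples_per_channel, channels)

# Transposing gives one row per channel. .T is only a view, so
# ascontiguousarray does the single copy that makes each channel contiguous.
by_channel = np.ascontiguousarray(frames.T)
print(by_channel)

# np.rot90 also turns the array, but it puts the last channel in the first row.
print(np.rot90(frames))
```

If the consumer can work with a strided view, `frames.T` alone avoids the copy entirely.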

Pandas can also do something similar: https://pandas.pydata.org/pandas-docs/st...spose.html

Since the header gives you the channel count and the number of samples per channel, you can pre-allocate the array with NumPy, which should help with speed (no resizing or reallocating). Something like:
>>> import numpy as np
>>> channels = 2 # from header data
>>> samples_per_channel = 5 # again, from header data
>>> data = np.empty((channels, samples_per_channel), dtype=np.uint32)
>>> data  # np.empty does not initialize memory, so the values below are arbitrary
array([[1886216563, 1601398124, 1601332592, 1851877475,  543974766],
       [ 221585469,         10,          0,    7798904,          0]],
      dtype=uint32)
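Putting the pieces together, here is a rough sketch of reading the samples that follow the header into a pre-allocated array. The header size, the little-endian uint32 sample format, and the temporary demo file are all assumptions made up for the example; the real values would come from your file format.

```python
import os
import tempfile

import numpy as np

HEADER_BYTES = 2 * 128   # assumed: two 128-byte header lines
channels = 2             # would come from the header
samples_per_channel = 5  # would come from the header
dtype = np.dtype('<u4')  # assumed sample format: little-endian uint32

# Build a small stand-in file: a dummy header followed by interleaved samples.
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
samples = np.arange(channels * samples_per_channel, dtype=dtype)
with open(path, 'wb') as f:
    f.write(b'\0' * HEADER_BYTES)
    f.write(samples.tobytes())

# Allocate the destination once; it can be reused for every read.
data = np.empty((channels, samples_per_channel), dtype=dtype)

with open(path, 'rb') as f:
    f.seek(HEADER_BYTES)  # skip the header
    raw = np.fromfile(f, dtype=dtype, count=channels * samples_per_channel)

# View the samples as (sample, channel) and copy them in transposed order.
np.copyto(data, raw.reshape(samples_per_channel, channels).T)
print(data)
```

Whether this fits in tens of milliseconds for 8 channels of 10,000 points is something you would have to time on your own hardware, for example with `timeit`.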