Python Forum
Custom file class
#1
I have a file in a special format with a header and some content. In other languages I wrote a file subclass for reading and writing this file type. The file class has a dictionary-like interface for the header and a streaming interface for reading and writing the rest of the content. I am thinking about how to do this in Python. Could anyone point me to examples of subclassing io for reading/writing a special file format, or suggest better ways to do this kind of thing in Python?
Reply
#2
Could you show a file sample, with perhaps one complete record?
A custom file class can be written quite easily.
Reply
#3
My file is a binary file with a text header. The header consists of multiple records all 128 bytes long. Each record has a 32 byte key followed by a 96 byte value. It looks something like this:
FILE_TYPE                       TIME_HISTORY_FILE
CREATION_DATE                   JANUARY 27, 2021
SAMPLE_PERIOD                   0.001
NUMBER_OF_SAMPLES               4
NUMBER_OF_CHANNELS              2
CHANNEL_1_NAME                  X
CHANNEL_1_UNITS                 m
CHANNEL_2_NAME                  Y
CHANNEL_2_UNITS                 mm
END_OF_HEADER
The length of the header changes to allow different numbers of channels.

Following the header is the time history data.
x0,y0,x1,y1,x2,y2,x3,y3

Each time history value is 4 bytes long. The time history data holds NUMBER_OF_SAMPLES * NUMBER_OF_CHANNELS values, so it is 4 times that many bytes long.
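
The header layout described above can be parsed by reading fixed 128-byte records until the END_OF_HEADER key shows up. The following is only a sketch under a few assumptions that the description does not confirm: the header text is ASCII, and END_OF_HEADER is itself a full 128-byte record. The names read_header, RECORD_SIZE and KEY_SIZE are made up for illustration.

```python
import io

RECORD_SIZE = 128  # bytes per header record, from the description above
KEY_SIZE = 32      # space-padded key; the remaining 96 bytes hold the value


def read_header(stream):
    """Read 128-byte records until END_OF_HEADER and return them as a dict."""
    header = {}
    while True:
        record = stream.read(RECORD_SIZE)
        if len(record) < RECORD_SIZE:
            raise ValueError("file ended before END_OF_HEADER")
        # Assumption: header text is ASCII; strip() removes the space padding.
        key = record[:KEY_SIZE].decode("ascii").strip()
        value = record[KEY_SIZE:].decode("ascii").strip()
        if key == "END_OF_HEADER":
            return header
        header[key] = value


# Build a tiny in-memory header to try the parser on.
sample = io.BytesIO(
    b"FILE_TYPE".ljust(32) + b"TIME_HISTORY_FILE".ljust(96)
    + b"NUMBER_OF_CHANNELS".ljust(32) + b"2".ljust(96)
    + b"END_OF_HEADER".ljust(128)
)
header = read_header(sample)
print(header["FILE_TYPE"])
```

After read_header returns, the stream is positioned at the first time history value, so the data can be read from the same file object without seeking.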

I cannot change the file organization. There are several legacy applications that work with this file format.

In other languages the API for this file type is:
open(filename) : Opens file and reads header
get(key) : Returns value associated with this header key
set(key, value) : Set value associated with this header key
read(count, buffer) : Read count time history values into buffer
save(filename) : Opens file for writing. Writes header to file.
write(count, buffer) : Write count time history values stored in buffer
close() : Close the file
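
The header-writing half of save(filename) in the list above could look roughly like this in Python. It is a sketch, not the legacy API: pack_record and write_header are hypothetical names, and it assumes ASCII text and a space-padded END_OF_HEADER record at the end.

```python
import io

KEY_SIZE = 32    # bytes reserved for the key
VALUE_SIZE = 96  # bytes reserved for the value


def pack_record(key, value=""):
    """Return one 128-byte record: space-padded key followed by value."""
    key_bytes = str(key).encode("ascii")
    value_bytes = str(value).encode("ascii")
    if len(key_bytes) > KEY_SIZE or len(value_bytes) > VALUE_SIZE:
        raise ValueError(f"header entry too long: {key!r}")
    # bytes.ljust pads with ASCII spaces by default.
    return key_bytes.ljust(KEY_SIZE) + value_bytes.ljust(VALUE_SIZE)


def write_header(stream, header):
    """Write every key/value pair, then the END_OF_HEADER marker record."""
    for key, value in header.items():
        stream.write(pack_record(key, value))
    stream.write(pack_record("END_OF_HEADER"))


buffer = io.BytesIO()
write_header(buffer, {"FILE_TYPE": "TIME_HISTORY_FILE", "SAMPLE_PERIOD": 0.001})
print(len(buffer.getvalue()))
```

Rejecting over-long keys and values instead of truncating them is a design choice here; silently cutting a value could produce a file the legacy applications misread.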

I would like to look at ways that others have solved similar problems. I am currently studying the gzip library.
Reply
#4
A couple of questions about the header:
  1. Is the header record number actually part of the record?
  2. Is the header tab-delimited?
  3. Is FILE_TYPE always TIME_HISTORY_FILE?
Reply
#5
1: ? I do not understand the question
2: No. Key and value information is padded with spaces so the key is always 32 bytes long and value 96 bytes long.
3: No. There is also an extended time history type which has a slightly different header.
Reply
#6
Quote:1: ? I do not understand the question
sample from post 3:
Output:
1 FILE_TYPE TIME_HISTORY_FILE
Is the '1' at the start actually part of the record?
Reply
#7
No. That is just an artifact of wrapping with Python tags. Guess I should have used something else.
Reply
#8
How about reading data and converting into something like this:
record = {
    'FILE_TYPE': 'TIME_HISTORY_FILE',
    'CREATION_DATE': 'JANUARY 27, 2021',
    'SAMPLE_PERIOD': 0.001,
    'ch1': {
        'x': 'value',
        'y': 'value',
    },
    'ch2': {
        'x': 'value',
        'y': 'value',
    },
    'ch3': {
        'x': 'value',
        'y': 'value',
    },
    'ch4': {
        'x': 'value',
        'y': 'value',
    },
    ...
}
Reply
#9
I really don't have a problem with how to represent the data. My questions have more to do with the mechanics of opening the file, reading from the file, and writing to the file.

For example, I would really like to do this:
def dump_file(filename):
    with timehistoryfile.open(filename) as file:
        for key, value in file.header.items():
            print(key.strip(), value.strip())
So I want to inherit or implement the things that support context management.

I want to read a bunch of the time history values all at once, so I need to implement read(count). How do I implement read(count)? Is it as simple as converting count to a size in bytes and calling read(size) from the base class? If so, what is a good base class to use?
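
For what it is worth, converting count to a byte size and calling read(size) on an ordinary binary file object is enough to get the raw bytes; the struct module can then turn them into numbers. Below is a minimal sketch that wraps a stream rather than subclassing anything. It assumes the values are little-endian 32-bit floats, which the thread does not state (only that each value is 4 bytes), and read_values is a made-up name.

```python
import io
import struct

VALUE_SIZE = 4  # bytes per time history value, per the format description


def read_values(stream, count):
    """Read up to `count` values from a binary stream and return a tuple.

    Assumption: values are little-endian 32-bit floats ("<f"). If they are
    integers or big-endian, the struct format string has to change.
    """
    data = stream.read(count * VALUE_SIZE)
    n = len(data) // VALUE_SIZE  # fewer than `count` near the end of the file
    return struct.unpack(f"<{n}f", data[: n * VALUE_SIZE])


# Four values that are exactly representable as 32-bit floats.
stream = io.BytesIO(struct.pack("<4f", 0.5, 1.5, 2.5, 3.5))
first = read_values(stream, 3)
rest = read_values(stream, 5)
print(first, rest)
```

Returning fewer values than requested at end of file mirrors how read(size) behaves on regular file objects.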
Reply
#10
You don't need a base class to create a context manager; you just need to implement the __enter__ and __exit__ methods.

>>> class TimeHistory:
...     def __init__(self, filename):
...         self.filename = filename
...         self.fobj = None
...     def __enter__(self):
...         # Open in binary mode, since the real file holds binary data
...         self.fobj = open(self.filename, "rb")
...         return self
...     def __exit__(self, *args):
...         # Runs when the with block ends, even if it raised
...         if self.fobj:
...             self.fobj.close()
...         self.fobj = None
...
>>> with TimeHistory('test.txt') as file:
...     print(file)
...
<__main__.TimeHistory object at 0x000001AA1BB40AF0>
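
Building on the same idea, the context manager can read the header as soon as the file is opened, which makes the dump_file loop from post #9 possible. This is a sketch only: TimeHistoryFile is a hypothetical class, the header is assumed to be ASCII, and END_OF_HEADER is assumed to be a full 128-byte record.

```python
import os
import tempfile

RECORD_SIZE, KEY_SIZE = 128, 32


class TimeHistoryFile:
    """Read-only sketch: opens the file and exposes the header as a dict."""

    def __init__(self, filename):
        self.filename = filename
        self.fobj = None
        self.header = {}

    def __enter__(self):
        self.fobj = open(self.filename, "rb")
        try:
            self._read_header()
        except Exception:
            self.close()  # do not leak the handle if the header is bad
            raise
        return self

    def __exit__(self, *exc_info):
        self.close()

    def close(self):
        if self.fobj:
            self.fobj.close()
        self.fobj = None

    def _read_header(self):
        while True:
            record = self.fobj.read(RECORD_SIZE)
            if len(record) < RECORD_SIZE:
                raise ValueError("file ended before END_OF_HEADER")
            key = record[:KEY_SIZE].decode("ascii").strip()
            if key == "END_OF_HEADER":
                return
            self.header[key] = record[KEY_SIZE:].decode("ascii").strip()


# Demo: write a one-entry header to a temporary file, then dump it.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.dat")
    with open(path, "wb") as out:
        out.write(b"FILE_TYPE".ljust(32) + b"TIME_HISTORY_FILE".ljust(96))
        out.write(b"END_OF_HEADER".ljust(128))
    with TimeHistoryFile(path) as file:
        for key, value in file.header.items():
            print(key, value)
```

A module named timehistoryfile could expose a small open(filename) function that returns TimeHistoryFile(filename), similar in spirit to gzip.open, so that the with statement from post #9 works as written.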
Reply

