Python Forum
supporting both str and bytes on write file
#1
i want to open a file object that supports both str and bytes output (print and write functions or methods). is there a way to do this?
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#2
You can easily create one by wrapping a file opened in mode 'wb' in your own class. Just overload the write() method. You could use singledispatchmethod() for this.
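A minimal sketch of that suggestion, assuming the wrapped file is opened in binary mode and that one fixed encoding (UTF-8 here) is acceptable for all str data. The class name DualWriter and its encoding parameter are hypothetical and not from this thread; an io.BytesIO stands in for a real file opened with 'wb'.

```python
import io
from functools import singledispatchmethod


class DualWriter:
    """Hypothetical wrapper: accepts str or bytes, always writes bytes."""

    def __init__(self, binary_file, encoding="utf-8"):
        self._file = binary_file      # assumed to be opened in a binary mode
        self._encoding = encoding     # assumption: one encoding for all str data

    @singledispatchmethod
    def write(self, data):
        # Fallback for any type that has no registered handler.
        raise TypeError(f"expected str or bytes, got {type(data).__name__}")

    @write.register(bytes)
    def _(self, data):
        return self._file.write(data)

    @write.register(str)
    def _(self, data):
        return self._file.write(data.encode(self._encoding))


buf = io.BytesIO()                    # stands in for open(path, "wb")
out = DualWriter(buf)
out.write(b"raw ")
out.write("text ")
print("printed", file=out)            # print() only needs a write() method
result = buf.getvalue()
```

Dispatch is on the exact registered types, so bytearray or memoryview arguments would hit the TypeError fallback unless they are registered as well.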
#3
There are many ways to do this.

from __future__ import annotations  # lets the "str | bytes" hint work before Python 3.10


class ReadWriteMixed:
    def __init__(self, fileobj):
        self.fileobj = fileobj

    def __getattr__(self, name):
        """
        Out of laziness: delegate everything else to the wrapped file object.
        """
        return getattr(self.fileobj, name)

    def write(self, data: str | bytes):
        # Convert the data to the type the underlying file expects.
        if isinstance(data, str) and "b" in self.fileobj.mode:
            data = data.encode()
        elif isinstance(data, bytes) and "b" not in self.fileobj.mode:
            data = data.decode()

        # Return the count like a regular file object's write() does.
        return self.fileobj.write(data)

    def read_bytes(self, size=None):
        data = self.fileobj.read(size)
        if "b" not in self.fileobj.mode:
            data = data.encode()
        return data

    def read_text(self, size=None):
        data = self.fileobj.read(size)
        if "b" in self.fileobj.mode:
            data = data.decode()
        return data


def open_mixed(file, mode="r"):
    return ReadWriteMixed(open(file, mode))


file_like = open_mixed("/etc/issue", "rb")  # open in bytes mode
print(file_like.read_text())  # read_text converts implicitly when the mode is bytes
But I would never use this code. The problem is that you have less control when something happens automagically.
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#4
(Jul-07-2022, 12:49 PM)DeaD_EyE Wrote: But I would never use this code. The problem is, that you have lesser control if something happens automagically.
what could happen? i'm wanting this to avoid "funny" (error-like) things happening.
#5
(Jul-07-2022, 07:24 AM)Gribouillis Wrote: You can easily create one by wrapping a file opened in mode 'wb' in your own class. Just overload the write() method.
so, basically, i just create my own file-like class to do what i need. if i included read methods, i would need to add a means to specify which type the caller wants returned (such as an arg given at read or open time, or distinct method names).
#6
(Jul-07-2022, 11:38 PM)Skaperen Wrote: what could happen? i'm wanting this to avoid "funny" (error-like) things happening.

For example, you open a file in binary mode and read 3 bytes, but decoding them may fail because the read stopped in the middle of a multi-byte character, so one or more bytes are missing. This raises a UnicodeDecodeError. Usually, you have a StreamReader, which keeps taking bytes until it has enough valid data to decode a single Unicode code point.
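A small sketch of that failure, and of the buffering behaviour described above, using the standard library's incremental decoder instead of a StreamReader (a substitution made here for brevity). The sample string and the chunk size of 3 are arbitrary choices for illustration.

```python
import codecs

data = "héllo €".encode("utf-8")      # 'é' takes 2 bytes, '€' takes 3

# Decoding a slice that ends in the middle of 'é' fails.
try:
    data[:2].decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# An incremental decoder holds back an incomplete tail until more bytes arrive.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(data[i:i + 3]) for i in range(0, len(data), 3)]
pieces.append(decoder.decode(b"", final=True))   # final=True raises if bytes are left over
text = "".join(pieces)
```

An automatic bytes-to-str conversion inside a wrapper's read method has no such buffer, which is why a fixed-size read can fail on otherwise valid input.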
#7
(Jul-07-2022, 11:45 PM)Skaperen Wrote: if i included read methods, i would need to add a means to specify which type the caller wants returned
This is normally specified by the mode in the call to open(). I'm afraid you are adding a lot of complexity for little benefit. Like @DeaD_EyE, I would never use such a class. I think bytes and Unicode strings serve different purposes. It seems you want this because you are sticking to the pre-internet way of programming, where all text was written in ASCII or Latin-1.
#8
there are still times when text needs to be handled like binary. part of the cause for that is the way Unicode was defined (surrogates to support UTF-16 are a major example). what i try to do is make many of the concepts i create actually work in different ways of programming. Unicode does not solve every text problem. i wish it did. if they would get rid of UTF-16 (just have UTF-8 and UTF-32) then it would be a lot closer. but we know that won't happen.

the majority of my programming involves the str type, only, for text. some needs to use bytes, such as handling certain cases of files.

i don't think the opening of the internet was a change in programming. sure, there are new ways, now. but the internet functions fine without such changes. perhaps your real case is that i still work systems through the command line interface (CLI). i have written many scripts intended as CLI commands. i expect to write many more.
#9
(Jul-08-2022, 07:06 AM)Gribouillis Wrote: This is normally specified by the mode in the call to open().
given an open file, how do i find out which mode was specified in the call to open()?
#10
(Jul-14-2022, 11:44 PM)Skaperen Wrote: given an open file, how do i find out which mode was specified in the call to open()?
File objects returned by open() have a .mode attribute (it is defined on io.FileIO and also exposed by the buffered and text wrappers):
>>> f = open('paillasse/tmp/foo', 'wb')
>>> dir(f)
['__class__', '__del__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_checkClosed', '_checkReadable', '_checkSeekable', '_checkWritable', '_dealloc_warn', '_finalizing', 'close', 'closed', 'detach', 'fileno', 'flush', 'isatty', 'mode', 'name', 'raw', 'read', 'read1', 'readable', 'readinto', 'readinto1', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write', 'writelines']
>>> f.mode
'wb'
On the other hand, it is not very satisfactory to write code that depends on the type of the file object that is passed to a function. For example, in-memory files such as StringIO and BytesIO don't have a .mode attribute.

It is better to write code that doesn't need to guess whether a file object accepts bytes or str.
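If the check cannot be avoided, one option that does not rely on .mode is to test against the io base classes. This is only a sketch: it assumes the stream derives from io.TextIOBase when it expects str, which holds for objects returned by open() and for StringIO and BytesIO, but is not guaranteed for arbitrary file-like objects. The helper names is_text_stream and write_any are hypothetical.

```python
import io


def is_text_stream(stream):
    """Hypothetical helper: True if the stream expects str rather than bytes."""
    return isinstance(stream, io.TextIOBase)


def write_any(stream, data, encoding="utf-8"):
    """Hypothetical helper: convert data to the type the stream expects."""
    if is_text_stream(stream):
        if isinstance(data, bytes):
            data = data.decode(encoding)
    elif isinstance(data, str):
        data = data.encode(encoding)
    return stream.write(data)


text_buf, bytes_buf = io.StringIO(), io.BytesIO()
for target in (text_buf, bytes_buf):
    write_any(target, "abc")
    write_any(target, b"def")
```

The simpler design, in line with the advice above, is still to have the caller pass a stream of a known kind and do any encoding explicitly.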