Jun-10-2021, 10:21 PM
(This post was last modified: Jun-10-2021, 10:21 PM by Gribouillis.)
I have a less ugly solution using two functions, ub() and bu(),
which stand respectively for 'unicode bytes' and 'bytes unicode'. They give code like this:

>>> foo = b'spam'
>>> s = bu(f'{ub(foo)}bar')
>>> s
b'spambar'

Calling ub() on a bytes or a str returns a unicode string that contains only characters whose ord is lower than 256. For a str, it raises an exception (a UnicodeEncodeError) if that is not possible; for example, ub('€') fails. Calling bu() on a bytes or a str converts it to bytes, but it fails for a str that contains any character whose ord is 256 or above. The first letter, u or b, is a mnemonic for the return type: ub() returns unicode and bu() returns bytes.

Here is the code defining these functions:
from functools import singledispatch

__version__ = '2021.06.11'


class _Ub(str):
    """A subtype of str that can contain only chars with ord < 256
    """
    __slots__ = ()

    def __new__(cls, s):
        instance = str.__new__(cls, s)
        instance.encode('latin-1')  # fail if there is a char beyond 256
        return instance

    def __bytes__(self):
        return self.encode('latin-1')


@singledispatch
def ub(s):
    """Convert argument to 'unicode bytes' a subclass of str

    Returns an instance of a subclass of str that contains
    only unicode characters with ord < 256.
    'ub' stands for 'unicode bytes'
    """
    return _Ub(s)


@ub.register(bytes)
@ub.register(bytearray)
def _(s):
    return _Ub(s.decode('latin-1'))


@ub.register(_Ub)
def _(s):
    return s


@singledispatch
def bu(s):
    """Convert to bytes an object which str() has only characters < 256.

    'bu' stands for 'bytes unicode'
    """
    return bytes(ub(s))


@bu.register(bytes)
def _(s):
    return s


@bu.register(bytearray)
@bu.register(_Ub)
def _(s):
    return bytes(s)


def main():
    x = 'hello'
    print(ub(x))
    y = b'world'
    print(ub(y))
    print(bytes(y))
    z = bytearray(b'nice')
    print(ub(z))
    print(str(z))
    print(bytes(z))
    foo = b'spam'
    s = bu(f'{ub(foo)}bar')
    print(s)


if __name__ == '__main__':
    main()
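Both functions lean on the latin-1 codec, which maps each byte value 0..255 to the code point with the same number. The following is only a minimal sketch of that property, independent of the module above; the helper name encodes_to_latin1 is hypothetical and exists just for this illustration.

```python
# Latin-1 maps every byte value 0..255 to the code point with the same
# number, so decoding arbitrary bytes never fails and encoding is the
# exact inverse of decoding.
all_bytes = bytes(range(256))
text = all_bytes.decode('latin-1')

same_ordinals = [ord(c) for c in text] == list(range(256))
round_trip_ok = text.encode('latin-1') == all_bytes


def encodes_to_latin1(s):
    """Return True if every character of s has ord < 256.

    Hypothetical helper, not part of the posted module.
    """
    try:
        s.encode('latin-1')
    except UnicodeEncodeError:
        # This is the same failure that ub('€') and bu('€') surface.
        return False
    return True


print(same_ordinals, round_trip_ok)
# 'caf\xe9' ends with U+00E9 (233), '\u20ac' is the euro sign (8364).
print(encodes_to_latin1('spam'),
      encodes_to_latin1('caf\xe9'),
      encodes_to_latin1('\u20ac'))
```

This should print True True followed by True True False: the round trip is lossless for every byte, and only strings with a character at ord 256 or above are rejected.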