Aug-12-2018, 03:04 AM
i don't require contiguous bytes. but since all the items are < 256, then bytes are usable and they can be contiguous. a contiguous lookup would be faster than, for example, a list of ints, which was in earlier code prototypes. the bytes type does look good in Python3. but in Python2, bytes == str. in some places i split the code based on Python2 vs. Python3, while i try to make most of the code work in both Python2 and Python3. my goal for my UTF-8 code and my Escape Sequence code is to make everything work in 2.7 and 3.x as much as i can.
one thing i am putting some thought into is whether someone might want to get UTF-8 results back in a byte type but had to give Unicode data in a type that supports the full range of Unicode code points (a list of ints in both versions, str in Python3, unicode in Python2). originally i was going to return the same type as given. going from UTF-8 to Unicode is easier to see. even if the UTF-8 is given as bytes, the Unicode result probably can't be, so i will need to return something bigger. going the other way has a different issue. if the Unicode is given as some large type (which the caller usually must do), is that the type they want UTF-8 in? or do they want it in a byte type. so i'm thinking of adding support for a returntype= option to let the caller specify.
one thing i am putting some thought into is whether someone might want to get UTF-8 results back in a byte type but had to give Unicode data in a type that supports the full range of Unicode code points (a list of ints in both versions, str in Python3, unicode in Python2). originally i was going to return the same type as given. going from UTF-8 to Unicode is easier to see. even if the UTF-8 is given as bytes, the Unicode result probably can't be, so i will need to return something bigger. going the other way has a different issue. if the Unicode is given as some large type (which the caller usually must do), is that the type they want UTF-8 in? or do they want it in a byte type. so i'm thinking of adding support for a returntype= option to let the caller specify.
Tradition is peer pressure from dead people
What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.