Jan-25-2018, 05:58 PM
I.e., if the first character in a string var is multibyte, will var[0] return the character, or a code stored in the first byte?
Does str type support multibyte characters?
Jan-25-2018, 06:36 PM
Did you run a test? The interactive console is great for that.
Which version do you use? Python 3 (which you should be using) has big changes in Unicode handling.
Bytes and Unicode are completely separate in Python 3 and cannot be mixed together. Strings in Python 3 are Unicode by default.

    >>> japanese = "桜の花びらたち"
    >>> japanese
    '桜の花びらたち'
    >>> japanese[0]
    '桜'
    >>> japanese[1]
    'の'

Whether a character is single-byte or multibyte should be no concern of yours; Python handles this internally. See PEP 393: Flexible String Representation.
Quote: Python 3.3 switched to a new internal representation, using the most compact form needed to represent all characters in a string.
When moving text in and out of Python 3, always specify an encoding, and in almost all cases use UTF-8. If you do that, you get back the same string and everything works as shown above.

    # Write to disk
    japanese = "桜の花びらたち"
    with open('jap.txt', 'w', encoding='utf-8') as f_out:
        f_out.write(japanese)

    # Read from disk
    with open('jap.txt', encoding='utf-8') as f:
        print(f.read())
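To make the str/bytes split concrete, here is a small sketch of what indexing gives you on each side. The variable names (first_char, encoded, first_byte) are only illustrative and not from anything above; it assumes Python 3 and UTF-8 as the encoding.

```python
# Illustrative sketch, assuming Python 3 and UTF-8; names are made up for the example.
japanese = "桜の花びらたち"

# Indexing a str returns a one-character str, however many bytes
# that character would take in a given encoding.
first_char = japanese[0]

# Encoding produces a bytes object; indexing bytes returns an int (one byte).
encoded = japanese.encode("utf-8")
first_byte = encoded[0]

print(first_char)                    # 桜
print(first_byte)                    # 230 (0xE6, the first UTF-8 byte of 桜)
print(len(japanese), len(encoded))   # characters vs. bytes

# str and bytes cannot be mixed; trying to concatenate them raises TypeError.
try:
    japanese + encoded
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)                      # False

# Decoding with the same encoding gives back the original string.
round_trip = encoded.decode("utf-8")
print(round_trip == japanese)        # True
```

So var[0] on a str always gives you the character. You only see individual byte values if you explicitly encode to bytes first.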