Python Forum (https://python-forum.io), Forum: General Coding Help, Thread: how to decode UTF-8 in python 3 (/thread-10756.html)
how to decode UTF-8 in python 3 - oco - Jun-05-2018
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> Str.decode(encoding = 'UTF-8',errors = 'strict')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'Str' is not defined
>>> .decode(encoding = 'UTF-8',errors = 'strict')
  File "<stdin>", line 1
    .decode(encoding = 'UTF-8',errors = 'strict')
    ^
SyntaxError: invalid syntax
>>> Str ="123"
>>> Str.decode(encoding = 'UTF-8',errors = 'strict')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'

RE: how to decode UTF-8 in python 3 - DeaD_EyE - Jun-05-2018
Then the standard complains:

RE: how to decode UTF-8 in python 3 - snippsat - Jun-05-2018
To add a little info to @DeaD_EyE's post: one of the biggest changes in Python 3 was Unicode. In Python 3, strings are Unicode by default. Bytes and strings (Unicode) are completely separate types in Python 3 and cannot be mixed together.

>>> s = b'hello '
>>> w = 'world'
>>> s + w
Traceback (most recent call last):
  File "<string>", line 428, in runcode
  File "<interactive input>", line 1, in <module>
TypeError: can't concat str to bytes
>>> # Decode from bytes to string
>>> s.decode() + w
'hello world'
>>> # The same as
>>> s.decode('utf-8') + w
'hello world'
>>> # For last example
>>> japanese = "桜の花びらたち"
>>> japanese
'桜の花びらたち'
>>> type(japanese)
<class 'str'>

Anything brought in from the outside world must have an encoding in order to become a string (Unicode) in Python 3. If you do not give an encoding when taking data in, you get bytes (b'something') or an error. UTF-8 is always the first choice to try, and ideally the one to "always" use. An in-and-out example:

# Write to disk
japanese = "桜の花びらたち"
with open('jap.txt', 'w', encoding='utf-8') as f_out:
    f_out.write(japanese)

# Read from disk
with open('jap.txt', encoding='utf-8') as f:
    print(f.read())
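
To tie this back to the original question: in Python 3, decode() is a method of bytes, not of str, so the call from the first post works once there is a bytes object to call it on. Below is a minimal sketch of the round trip, assuming the data really is UTF-8. The variable names (text, data, restored, bad, lenient) and the sample byte string are made up for illustration and do not come from the thread.

```python
# Sketch: str -> bytes with encode(), bytes -> str with decode().
text = "桜の花びらたち"

# encode() turns a str into bytes using the given encoding.
data = text.encode("utf-8")
assert isinstance(data, bytes)

# decode() turns bytes back into a str; these are the same keyword
# arguments the original post tried to use on a str.
restored = data.decode(encoding="utf-8", errors="strict")
assert restored == text

# Illustrative invalid input: 0xFF can never appear in valid UTF-8.
bad = b"hello \xff"

# errors='strict' (the default) raises UnicodeDecodeError on invalid bytes.
try:
    bad.decode("utf-8", errors="strict")
except UnicodeDecodeError as exc:
    print("strict decoding failed:", exc.reason)

# errors='replace' substitutes U+FFFD for each undecodable byte instead.
lenient = bad.decode("utf-8", errors="replace")
print(ascii(lenient))
```

Whether 'strict' or 'replace' is the better choice depends on whether silently losing a damaged byte is acceptable for the data at hand; 'strict' at least makes the problem visible.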

RE: how to decode UTF-8 in python 3 - wavic - Jun-05-2018
Perhaps there are some languages for which the encoding has to be specified explicitly, so one can treat this as a more or less general rule. I am happy that Python can speak my own language. There were some issues with Python 3 and Unicode on Windows, but as far as I know they have been fixed.