To add a little info to @DeaD_EyE's post.
One of the biggest changes in Python 3 was Unicode.
In Python 3, strings are Unicode by default.
Bytes and strings (Unicode) are completely separate in Python 3 and cannot be mixed together.
```python
>>> s = b'hello '
>>> w = 'world'
>>> s + w
Traceback (most recent call last):
  File "<string>", line 428, in runcode
  File "<interactive input>", line 1, in <module>
TypeError: can't concat str to bytes

# Decode from bytes to string
>>> s.decode() + w
'hello world'
>>> # The same as
>>> s.decode('utf-8') + w
'hello world'

# For last example
>>> japanese = "桜の花びらたち"
>>> japanese
'桜の花びらたち'
>>> type(japanese)
<class 'str'>
```
Anything brought in from the outside world must have an encoding to become a string (Unicode) in Python 3. If no encoding is given when taking data in, the result will be bytes (b'something') or an error. UTF-8 is always the first choice to try, and ideally the one to "always" use.
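To make the point about encodings concrete, here is a small sketch of my own (not from the posts above; the helper name `to_text` is hypothetical). It assumes the incoming data is UTF-8 and shows that bytes only turn into a string once a matching encoding is applied, while a wrong one raises an error.

```python
# Hypothetical helper for illustration: turn incoming bytes into str.
def to_text(raw: bytes, encoding: str = "utf-8") -> str:
    return raw.decode(encoding)


japanese = "桜の花びらたち"
raw = japanese.encode("utf-8")      # stands in for data from the outside world

print(type(raw))                    # <class 'bytes'>
print(to_text(raw) == japanese)     # True

try:
    to_text(raw, "ascii")           # wrong encoding for this data
except UnicodeDecodeError as error:
    print("Decode failed:", error.reason)
```

The default argument mirrors the advice to reach for UTF-8 first; passing another codec name is only needed when the source is known to use something else.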
In and out example.
```python
# Write to disk
japanese = "桜の花びらたち"
with open('jap.txt', 'w', encoding='utf-8') as f_out:
    f_out.write(japanese)

# Read from disk
with open('jap.txt', encoding='utf-8') as f:
    print(f.read())
```
Output: 桜の花びらたち
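As a follow-up sketch (my own addition, using a temporary file instead of jap.txt so nothing is left on disk), opening the file in binary mode shows what comes in when no decoding is applied: raw bytes, which decode back to the original string.

```python
import os
import tempfile

japanese = "桜の花びらたち"

# Use a temporary directory so the example does not leave files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "jap.txt")

    # Write the text to disk as UTF-8
    with open(path, "w", encoding="utf-8") as f_out:
        f_out.write(japanese)

    # Binary mode: no decoding happens, so read() returns bytes
    with open(path, "rb") as f:
        raw = f.read()

print(type(raw))                          # <class 'bytes'>
# Each of these characters takes 3 bytes in UTF-8
print(len(raw) == 3 * len(japanese))      # True
print(raw.decode("utf-8") == japanese)    # True
```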