how to decode UTF-8 in python 3

***snippsat*** · (This post was last modified: Jun-05-2018, 10:40 AM by snippsat.)

To add a little info to @DeaD_EyE's post.
One of the biggest changes in Python 3 was Unicode.
In Python 3, all strings are Unicode by default.
Bytes and strings (Unicode) are completely separate types in Python 3 and cannot be mixed together.

>>> s = b'hello '
>>> w = 'world'
>>> s + w
Traceback (most recent call last):
  File "<string>", line 428, in runcode
  File "<interactive input>", line 1, in <module>
TypeError: can't concat str to bytes

# Decode from bytes to string
>>> s.decode() + w
'hello world'

>>> # The same as
>>> s.decode('utf-8') + w
'hello world'
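
To show what happens when the bytes are not valid UTF-8, here is a small sketch. The byte strings below are made up for illustration and are not from the thread; errors='replace' is just one of the standard error handlers you can pick.

```python
# UTF-8 bytes for 'café' (illustrative data, not from the thread)
data = b"caf\xc3\xa9"
print(data.decode())            # decode() uses UTF-8 when no codec is given

# The same word encoded as Latin-1; this is not valid UTF-8
bad = b"caf\xe9"
try:
    bad.decode("utf-8")
except UnicodeDecodeError as exc:
    print("Cannot decode as UTF-8:", exc.reason)

# Either substitute the undecodable bytes ...
replaced = bad.decode("utf-8", errors="replace")
# ... or decode with the codec the bytes were really written in
latin = bad.decode("latin-1")
print(repr(replaced), repr(latin))
```

Guessing the codec only works if you know where the bytes came from, so errors='replace' is more of a last resort than a fix.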

# For the last example: a string literal is already str (Unicode)
>>> japanese = "桜の花びらたち"
>>> japanese
'桜の花びらたち'
>>> type(japanese)
<class 'str'>
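
Going the other way, a minimal sketch of the round trip between str and bytes with the same text. The variable names encoded and decoded are only for illustration.

```python
japanese = "桜の花びらたち"

encoded = japanese.encode("utf-8")   # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str

print(type(encoded), len(encoded))   # bytes; longer than the string, since each character here takes 3 bytes in UTF-8
print(type(decoded), len(decoded))   # str; length counted in characters
print(decoded == japanese)           # the round trip gives back the original text
```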

Anything brought in from the outside world must have an encoding applied to become a string (Unicode) in Python 3.
If no encoding is given when data is taken in, the result is either bytes (b'something') or an error.
UTF-8 is always the first choice to try, and ideally the one to "always" use.
An in and out example:

# Write to disk
japanese = "桜の花びらたち"
with open('jap.txt', 'w', encoding='utf-8') as f_out:
    f_out.write(japanese)
 
# Read from disk
with open('jap.txt', encoding='utf-8') as f:
    print(f.read())

Output:
桜の花びらたち
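
For comparison, a rough sketch of what you get when the same file is opened in binary mode: bytes come back, and you decode them yourself. The temporary directory is only there to keep the sketch self-contained, and the ascii attempt is just to show a wrong codec failing.

```python
import os
import tempfile

japanese = "桜の花びらたち"

# Temporary location used only for this sketch
path = os.path.join(tempfile.mkdtemp(), "jap.txt")
with open(path, "w", encoding="utf-8") as f_out:
    f_out.write(japanese)

# Binary mode: no encoding is applied, so read() returns bytes
with open(path, "rb") as f:
    raw = f.read()

# Decode by hand to get a str back
text = raw.decode("utf-8")

# A codec that cannot represent these bytes raises an error
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(type(raw), type(text), ascii_ok)
```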
