Python Forum

Full Version: Character Encodings
I'm trying to learn Python 3 using the book Learn Python 3 the Hard Way by Zed A. Shaw.

But the code from Exercise 23 is not working on my PC (Windows 7 64-bit).
Here are my code and the error.
Can anyone explain why my code is not working?
import sys

script, input_encoding, error = sys.argv

def main(language_file, encoding, errors):
    line = language_file.readline()

    if line:
        print_line(line, encoding, errors)
        return main(language_file, encoding, errors)


def print_line(line, encoding, errors):
    next_lang = line.strip()
    raw_bytes = next_lang.encode(encoding, errors = errors)
    cooked_string = raw_bytes.decode(encoding, errors = errors)

    print(raw_bytes, "<===>", cooked_string)


languages = open('lang.txt', encoding = 'utf-8')

main(languages, input_encoding, error)
Error:
λ python ex23.py utf-8 strict
Traceback (most recent call last):
  File "ex23.py", line 23, in <module>
    main(languages, input_encoding, error)
  File "ex23.py", line 6, in main
    line = language_file.readline()
  File "C:\Users\Evil Patrick\AppData\Local\Programs\Python\Python37\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 11-12: invalid continuation byte
languages.txt file: https://pastebin.com/JmfjW7E9
A quick test further down shows that the file works.
When you get the file, use the download button so you get the original file.
You can corrupt it yourself if you open it in an editor and save it without knowing what encoding the editor is set to.
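
As a rough illustration of how a re-save can break the file, here is a minimal sketch. It assumes the editor wrote the file as cp1252 (a common Windows default, but only a guess for this case) and uses one name from the list as sample text. The same kind of UnicodeDecodeError then shows up when the bytes are read back as UTF-8.

```python
# Assumption: the editor re-saved the file as cp1252 instead of UTF-8.
original = "Aragonés"

utf8_bytes = original.encode("utf-8")      # "é" becomes two bytes: \xc3\xa9
cp1252_bytes = original.encode("cp1252")   # "é" becomes one byte: \xe9

# The UTF-8 bytes round-trip without problems.
assert utf8_bytes.decode("utf-8") == original

# The cp1252 bytes are not valid UTF-8, so decoding them fails.
decode_error = None
try:
    cp1252_bytes.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print("decode failed:", exc.reason, "at byte", exc.start)
```

The byte \xe9 looks like the start of a multi-byte UTF-8 sequence, but the byte after it is plain ASCII, which is why Python reports an invalid continuation byte.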

Use chardet to see what encoding the file has.
E:\div_code\read
λ chardetect languages.txt
languages.txt: utf-8 with confidence 0.99
Code that opens the file as UTF-8:
with open('languages.txt', encoding='utf-8') as f:
    print(f.read())
Output:
E:\div_code\read
λ python languages.py
Afrikaans
አማርኛ
Аҧсшәа
العربية
Aragonés
Arpetan
Azərbaycanca
Bamanankan
বাংলা
Bân-lâm-gú
Беларуская
Български
Boarisch
Bosanski
Буряад
Català
Чӑвашла
Čeština
Cymraeg
Dansk
Deutsch
Eesti
Ελληνικά
Español
Esperanto
فارسی
Français
Frysk
Gaelg
Gàidhlig
Galego
한국어
Հայերեն
हिन्दी
Hrvatski
Ido
Interlingua
Italiano
עברית
ಕನ್ನಡ
Kapampangan
ქართული
Қазақша
Kreyòl ayisyen
Latgaļu
Latina
Latviešu
Lëtzebuergesch
Lietuvių
Magyar
Македонски
Malti
मराठी
მარგალური
مازِرونی
Bahasa Melayu
Монгол
Nederlands
नेपाल भाषा
日本語
Norsk bokmål
Nouormand
Occitan
Oʻzbekcha/ўзбекча
ਪੰਜਾਬੀ
پنجابی
پښتو
Plattdüütsch
Polski
Português
Română
Romani
Русский
Seeltersk
Shqip
Simple English
Slovenčina
کوردیی ناوەندی
Српски / srpski
Suomi
Svenska
Tagalog
தமிழ்
ภาษาไทย
Taqbaylit
Татарча/tatarça
తెలుగు
Тоҷикӣ
Türkçe
Українська
اردو
Tiếng Việt
Võro
文言
吴语
ייִדיש
中文
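
If your own copy of the file does not read cleanly like this, it can help to find out where the bad bytes are. Here is a small sketch that uses only the standard library. The function names find_utf8_error and check_file are made up for this example, and the idea is just to read the raw bytes and let the UTF-8 decoder report the first position it cannot handle.

```python
def find_utf8_error(data: bytes):
    """Return (start, end, bad_bytes) for the first UTF-8 error, or None if clean."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start and exc.end are byte offsets into the data being decoded.
        return exc.start, exc.end, data[exc.start:exc.end]
    return None


def check_file(path):
    # Open in binary mode so nothing is decoded before we inspect the bytes.
    with open(path, "rb") as f:
        return find_utf8_error(f.read())


# Small self-check with in-memory data instead of a real file.
print(find_utf8_error("Aragonés".encode("utf-8")))  # clean UTF-8
print(find_utf8_error(b"Aragon\xe9s"))              # a legacy-encoded byte
```

Calling check_file('lang.txt') on the file from the question should return None if the file is valid UTF-8, and otherwise the byte offsets of the first problem, which you can compare against the position in the traceback.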