Python Forum
Character Encodings
#1
I'm trying to learn Python 3 using the book Learn Python 3 the Hard Way by Zed A. Shaw.

But the code from Exercise 23 does not work on my PC (Windows 7, 64-bit).
Here are my code and the error.
Can anyone explain why my code is not working?
import sys

script, input_encoding, error = sys.argv

def main(language_file, encoding, errors):
    line = language_file.readline()

    if line:
        print_line(line, encoding, errors)
        return main(language_file, encoding, errors)


def print_line(line, encoding, errors):
    next_lang = line.strip()
    raw_bytes = next_lang.encode(encoding, errors = errors)
    cooked_string = raw_bytes.decode(encoding, errors = errors)

    print(raw_bytes, "<===>", cooked_string)


languages = open('lang.txt', encoding = 'utf-8')

main(languages, input_encoding, error)
Error:
λ python ex23.py utf-8 strict
Traceback (most recent call last):
  File "ex23.py", line 23, in <module>
    main(languages, input_encoding, error)
  File "ex23.py", line 6, in main
    line = language_file.readline()
  File "C:\Users\Evil Patrick\AppData\Local\Programs\Python\Python37\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 11-12: invalid continuation byte
Languages.txt file https://pastebin.com/JmfjW7E9
#2
A quick test below shows that the file itself reads fine.
When you get the file, use the download button so that you get the original file.
You can corrupt it yourself if you open it in an editor and save it without knowing which encoding the editor is set to.
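
To illustrate how a re-save can break the file, here is a minimal sketch. It assumes, purely for illustration, that the editor wrote the text as Latin-1; the actual encoding your editor used is not known from the traceback. The name `try_utf8` is a hypothetical helper, not part of the exercise.

```python
# One name from the language list, used as sample text.
text = "Aragonés"

utf8_bytes = text.encode("utf-8")      # what the original file contains
latin1_bytes = text.encode("latin-1")  # ASSUMPTION: what a re-save in Latin-1 would contain


def try_utf8(data):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc


good = try_utf8(utf8_bytes)
bad = try_utf8(latin1_bytes)

# Only ASCII-safe representations are printed, so this also works on a
# console that cannot display the accented character.
print(utf8_bytes)
print(latin1_bytes)
print(bad)
```

The Latin-1 bytes fail with the same kind of "invalid continuation byte" message as in the traceback, because a single byte for `é` looks like the start of a multi-byte UTF-8 sequence that is never completed.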

Use chardet to check which encoding the file has.
E:\div_code\read
λ chardetect languages.txt
languages.txt: utf-8 with confidence 0.99
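
If chardet is not installed, a rough check is possible with the standard library alone. The sketch below reads the raw bytes and reports where UTF-8 decoding first fails. `first_utf8_error` is a hypothetical helper name, and the two temporary files only stand in for a good and a damaged copy of languages.txt.

```python
import os
import tempfile


def first_utf8_error(path):
    """Return None if the file is valid UTF-8, else (start, end, offending bytes)."""
    with open(path, "rb") as f:  # binary mode: nothing is decoded while reading
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.end, data[exc.start:exc.end]
    return None


# Demo on two throwaway files (stand-ins, not the real languages.txt).
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.txt")
    bad_path = os.path.join(tmp, "bad.txt")
    with open(good_path, "wb") as f:
        f.write("Català\n".encode("utf-8"))
    with open(bad_path, "wb") as f:
        # ASSUMPTION: Latin-1 is used here only to produce invalid UTF-8.
        f.write("Català\n".encode("latin-1"))
    good_result = first_utf8_error(good_path)
    bad_result = first_utf8_error(bad_path)

print(good_result)
print(bad_result)
```

Calling `first_utf8_error('lang.txt')` on your own copy should return None for an undamaged download. Note that this only tells you whether the file is valid UTF-8; unlike chardet, it does not guess which other encoding the file might be in.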
Code that opens the file as UTF-8:
with open('languages.txt', encoding='utf-8') as f:
    print(f.read())
Output:
E:\div_code\read
λ python languages.py
Afrikaans
አማርኛ
Аҧсшәа
العربية
Aragonés
Arpetan
AzÉ™rbaycanca
Bamanankan
বাংলা
Bân-lâm-gú
Беларуская
Български
Boarisch
Bosanski
Буряад
Català
Чӑвашла
ÄŒeÅ¡tina
Cymraeg
Dansk
Deutsch
Eesti
Ελληνικά
Español
Esperanto
فارسی
Français
Frysk
Gaelg
Gà idhlig
Galego
한국어
Õ€Õ¡ÕµÕ¥Ö€Õ¥Õ¶
हिन्दी
Hrvatski
Ido
Interlingua
Italiano
עברית
ಕನ್ನಡ
Kapampangan
ქართული
Қазақша
Kreyòl ayisyen
Latgaļu
Latina
LatvieÅ¡u
Lëtzebuergesch
Lietuvių
Magyar
Македонски
Malti
मराठी
მარგალური
مازِرونی
Bahasa Melayu
Монгол
Nederlands
नेपाल भाषा
日本語
Norsk bokmÃ¥l
Nouormand
Occitan
OÊ»zbekcha/ўзбекча
ਪੰਜਾਬੀ
پنجابی
پښتو
Plattdüütsch
Polski
Português
Română
Romani
Русский
Seeltersk
Shqip
Simple English
Slovenčina
کوردیی ناوەندی
Српски / srpski
Suomi
Svenska
Tagalog
தமிழ்
ภาษาไทย
Taqbaylit
Татарча/tatarça
తెలుగు
Тоҷикӣ
Türkçe
Українська
اردو
Tiếng Việt
Võro
文言
吴语
ייִדיש
中文