Python Forum
encode/decode to show correct country letters in a CTk combobox
#1
Hi.
I have a text file encoded as UTF-8.
Opening it in e.g. Notepad shows "sør-trøndelag".
Note the Norwegian character "ø".

Reading the UTF-8 encoded text file in binary mode in Python shows:
"115 195 184 114 45 116 114 195 184 110 100 101 108 97 103"
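For reference, a minimal sketch of how a byte listing like this can be produced. The file name places.txt is hypothetical, and the snippet writes the sample text itself so that it runs anywhere.

```python
import tempfile
from pathlib import Path

# Hypothetical file name; created in a temporary directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "places.txt"
    path.write_text("sør-trøndelag", encoding="utf-8")

    # Binary mode returns the raw bytes without any decoding.
    raw = path.read_bytes()

byte_values = " ".join(str(b) for b in raw)
print(byte_values)
```

The letter "ø" takes two bytes in UTF-8 (195 184), which is why the 13-character string is 15 bytes long.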

A CTkComboBox receives the content of the text file above.
Its output (list content viewed) is:
"sÃ¸r-trÃ¸ndelag"
Decoding the binary data as UTF-8 shows the correct letter ø in the combobox:
sør-trøndelag

Decoding as ANSI and as Latin-1 both show:
sÃ¸r-trÃ¸ndelag

So UTF-8 should be used here(?)

Which side of the combobox needs coding to get the proper strings shown in its list?

An image is attached in case the text above does not display correctly.

Thank You in advance.

Edit: I tried saving sør-trøndelag into a new ANSI-encoded text file, and it shows the correct/expected result sør-trøndelag.
Should I encode the list sent to the combobox as e.g. ANSI? (I don't know what encoding Python uses.)
Edit 2: I tried entering sør-trøndelag directly into the combobox. It was shown in the combobox unchanged.

   
Reply
#2
text = bytes([115, 195, 184, 114, 45, 116, 114, 195, 184, 110, 100, 101, 108, 97, 103])
print(str(text.decode("latin1")))
print(str(text.decode("utf8")))
Output:
sÃ¸r-trÃ¸ndelag
sør-trøndelag
latin1 encoding is the wrong choice.
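A small sketch of why the Latin-1 result looks the way it does, and how text that was already decoded with the wrong codec can be recovered. It assumes the garbling came from reading UTF-8 bytes as Latin-1 and nothing else; the variable names are illustrative.

```python
raw = bytes([115, 195, 184, 114, 45, 116, 114, 195, 184, 110, 100, 101, 108, 97, 103])

# Latin-1 maps every byte to exactly one character, so the two-byte
# UTF-8 sequence for "ø" turns into two separate characters.
garbled = raw.decode("latin-1")

# Encoding back with the same wrong codec restores the original bytes,
# which can then be decoded with the right one.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(garbled), len(repaired))
print(repaired)
```

The round trip only works because Latin-1 is lossless for all 256 byte values; decoding the file as UTF-8 in the first place is the cleaner fix.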

When I cut the string from the website and pasted it into a text file, I got these bytes:
[115, 248, 114, 45, 116, 114, 248, 110, 100, 101, 108, 97, 103]
When I run this:
text = bytes([115, 248, 114, 45, 116, 114, 248, 110, 100, 101, 108, 97, 103])
print(str(text.decode("latin1")))
print(str(text.decode("utf8")))
Output:
sør-trøndelag
Traceback (most recent call last):
  File "c:\...test.py", line 3, in <module>
    print(str(text.decode("utf8")))
              ^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 1: invalid start byte
This time the string was encoded as Latin-1, not UTF-8.
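When the encoding of incoming bytes is not known in advance, one possible approach is to try UTF-8 first and fall back to a single-byte codec. This is only a sketch: the helper name decode_with_fallback and the choice of cp1252 as the fallback are assumptions, not something the thread prescribes.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Try each codec in order and return the first successful decode."""
    for name in encodings:
        try:
            return data.decode(name)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, marking undecodable bytes with U+FFFD.
    return data.decode("utf-8", errors="replace")


utf8_bytes = bytes([115, 195, 184, 114, 45, 116, 114, 195, 184, 110, 100, 101, 108, 97, 103])
latin_bytes = bytes([115, 248, 114, 45, 116, 114, 248, 110, 100, 101, 108, 97, 103])

print(decode_with_fallback(utf8_bytes))
print(decode_with_fallback(latin_bytes))
```

UTF-8 goes first because it is strict: text in a single-byte encoding that contains accented letters rarely happens to be valid UTF-8, so a successful UTF-8 decode is a strong hint.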

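It is frustrating, but don't blame Python or customtkinter. A lot of the blame has to go to Windows, which doesn't really know what to do with extended characters. I think there is some code that spins a wheel to pick a random encoding. In practice, open() without an encoding argument uses the Windows ANSI code page (often cp1252) rather than UTF-8, so passing encoding="utf8" explicitly nearly always works.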
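To tie this back to the combobox question: the decoding belongs on the file-reading side, and the widget just receives ordinary str values. Below is a minimal sketch under that assumption. The helper names load_values and build_combobox and the file name places.txt are hypothetical; CTkComboBox and its values argument come from customtkinter.

```python
import tempfile
from pathlib import Path


def load_values(path, encoding="utf-8"):
    """Return the non-empty lines of a text file as a list of str."""
    with open(path, encoding=encoding) as handle:
        return [line.strip() for line in handle if line.strip()]


def build_combobox(master, values):
    """Create a combobox from already decoded strings."""
    # Imported lazily so load_values can be used without customtkinter installed.
    import customtkinter as ctk
    return ctk.CTkComboBox(master, values=values)


# Demonstration of the reading step only (no window is opened here).
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "places.txt"  # hypothetical file name
    sample.write_text("sør-trøndelag\nnord-trøndelag\n", encoding="utf-8")
    values = load_values(sample)

print(values)

# In an application, the pieces would be combined roughly like this:
#   root = ctk.CTk()
#   build_combobox(root, load_values("places.txt")).pack()
#   root.mainloop()
```

Passing encoding explicitly makes the result independent of the Windows code page setting.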
#3
(Sep-02-2023, 04:35 AM)deanhystad Wrote: latin1 encoding is the wrong choice. [...] utf8 nearly always works.

Hi, the issue is gone. UTF-8 wasn't enabled in Windows 10. A Google search for "howto set utf-8 system in win10" turned up a description of the checkbox to set.
Thanks.
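To see what a given machine is actually using, a quick check along these lines may help. It is a sketch only; the function name encoding_report is made up for illustration. Besides the system-wide Windows setting, Python's own UTF-8 mode (started with -X utf8 or the PYTHONUTF8=1 environment variable) has a similar effect on the default encoding used by open().

```python
import locale
import sys


def encoding_report():
    """Collect the settings that decide how text files are decoded by default."""
    return {
        # True when Python runs in UTF-8 mode (-X utf8 or PYTHONUTF8=1).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # The encoding open() falls back to when none is given.
        "preferred_encoding": locale.getpreferredencoding(False),
        # The encoding used when printing to the console.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


report = encoding_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If preferred_encoding reports something like cp1252, files read without an explicit encoding argument will be decoded with that code page.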