Python Forum
Want a list utf8 formatted but bytestrings found
#2
The \xNN sequences you see (for example \xce) are hexadecimal escapes. They are only the representation (repr) of the string, not its content.
The same notation is used by str, bytes and bytearray.
Every character that cannot be displayed, or that is a control character, is shown in this form.
If you print the string, you see the characters themselves instead of the escaped representation.
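
A small sketch of the difference between repr() and print(). The variable names here are made up for illustration and are not from the original question:

```python
# \xeb is 'ë', a printable character; \x86 is a non-printable control character.
name = 'Zo\xeb'
ctrl = 'a\x86b'

# repr() keeps printable characters and escapes the non-printable ones.
print(repr(name))   # 'Zoë'
print(repr(ctrl))   # 'a\x86b'

# A list shows the repr() of each item, which is why the escapes
# show up when you look at the whole list.
print([name, ctrl])

# Printing a single string writes the characters themselves.
print(name)
```

So escapes inside a displayed list do not by themselves mean the strings are bytestrings.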

With your data:
items = ['', 'Alexander Lepsveridze', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82', '\xce\x8c\xce\xbc\xce\xb9\xce\xbb\xce\xbf\xcf\x82 \xce\xa4\xcf\x83\xce\xbf\xcf\x84\xcf\x85\xce\xbb\xce\xaf\xce\xbf\xcf\x85']

for item in items:
    print(item)
Output:
Alexander Lepsveridze
John Comeau
ÎÎºÎ·Ï Î¤ÏÎ¹Î¬Î¼Î·Ï
ÎÎ¼Î¹Î»Î¿Ï Î¤ÏÎ¿Ï Ï Î»Î¯Î¿Ï
Now with ftfy, a module that can fix broken encodings:

import ftfy
items = ['', 'Alexander Lepsveridze', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82', '\xce\x8c\xce\xbc\xce\xb9\xce\xbb\xce\xbf\xcf\x82 \xce\xa4\xcf\x83\xce\xbf\xcf\x84\xcf\x85\xce\xbb\xce\xaf\xce\xbf\xcf\x85']

for item in items:
    print(ftfy.fix_encoding(item))
Output:
Alexander Lepsveridze
John Comeau
Άκης Τσιάμης
Όμιλος Τσοτυλίου
The text was originally UTF-8, but its bytes were decoded as latin1.

print(items[-1].encode('latin1').decode('utf8'))
Output:
Όμιλος Τσοτυλίου
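
If you don't want to install a module, the same round trip can be applied to the whole list. This is only a sketch: fix_mojibake is a hypothetical helper name, and it assumes every broken item is UTF-8 that was decoded as latin1. Items that don't fit that pattern are returned unchanged.

```python
def fix_mojibake(text):
    """Undo 'UTF-8 bytes decoded as latin1'; return text unchanged otherwise."""
    try:
        return text.encode('latin1').decode('utf8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not latin1-encodable, or the bytes are not valid UTF-8:
        # assume the string was fine to begin with.
        return text


# How the damage arises: correct text, encoded as UTF-8, decoded as latin1.
original = 'Όμιλος Τσοτυλίου'
broken = original.encode('utf8').decode('latin1')

items = ['', 'John Comeau', broken]
fixed = [fix_mojibake(item) for item in items]

for item in fixed:
    print(item)
```

Plain ASCII names pass through untouched, because ASCII is encoded identically in latin1 and UTF-8. ftfy handles more kinds of breakage than this single case.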


Messages In This Thread
RE: Want a list utf8 formatted but bytestrings found - by DeaD_EyE - Feb-14-2019, 09:08 AM
