Python Forum
Want a list utf8 formatted but bytestrings found
The escapes of the form \x00 that you see are bytes in hexadecimal notation. This is the representation (repr) of the string, not its content.
The same notation is used in the representation of str, bytes and bytearray.
Every character that cannot be displayed, or that is a control character, is shown in this escaped form.
If you print the string, you don't see these escapes; you see the characters themselves.
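As a small illustration of the difference between the representation and the printed text, here is a minimal sketch. The sample string is made up for the example and is not taken from your data.

```python
# Hypothetical sample: a str that contains a NUL control character.
sample = 'A\x00B'

# repr() is what you see when a list or the interactive prompt shows the value.
shown = repr(sample)
print(shown)        # the escape \x00 is visible here
print(len(sample))  # still only 3 characters

# A bytes object shows every non-ASCII byte as a \x.. escape.
raw = 'Ά'.encode('utf8')
print(raw)          # b'\xce\x86'
```

The escape is only a way of writing the value down; the string itself holds a single character at that position.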

With your data:
items = ['', 'Alexander Lepsveridze', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82', '\xce\x8c\xce\xbc\xce\xb9\xce\xbb\xce\xbf\xcf\x82 \xce\xa4\xcf\x83\xce\xbf\xcf\x84\xcf\x85\xce\xbb\xce\xaf\xce\xbf\xcf\x85']

for item in items:
    print(item)
Output:
Alexander Lepsveridze
John Comeau
ÎÎºÎ·Ï Î¤ÏÎ¹Î¬Î¼Î·Ï
ÎÎ¼Î¹Î»Î¿Ï Î¤ÏÎ¿Ï Ï Î»Î¯Î¿Ï
Now with ftfy, a module that can fix broken encodings:

import ftfy
items = ['', 'Alexander Lepsveridze', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82', '\xce\x8c\xce\xbc\xce\xb9\xce\xbb\xce\xbf\xcf\x82 \xce\xa4\xcf\x83\xce\xbf\xcf\x84\xcf\x85\xce\xbb\xce\xaf\xce\xbf\xcf\x85']

for item in items:
    print(ftfy.fix_encoding(item))
Output:
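Alexander Lepsveridze
John Comeau
Άκης Τσιάμης
Όμιλος Τσοτυλίου
The text was originally encoded as UTF-8, but the bytes were then decoded as Latin-1. Encoding back to Latin-1 recovers the original bytes, which can then be decoded correctly as UTF-8: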

print(items[-1].encode('latin1').decode('utf8'))
Output:
Όμιλος Τσοτυλίου
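If you want to do this for the whole list without an extra module, a minimal sketch could look like this. The helper name fix_mojibake is my own, and it assumes the only damage is UTF-8 bytes that were decoded as Latin-1. Strings that cannot be round-tripped are returned unchanged.

```python
def fix_mojibake(text):
    """Undo UTF-8 bytes that were wrongly decoded as Latin-1 (hypothetical helper)."""
    try:
        return text.encode('latin1').decode('utf8')
    except UnicodeError:
        # Not Latin-1 encodable, or the bytes are not valid UTF-8:
        # assume the string was fine and leave it alone.
        return text


# Shortened version of the list from the question.
items = ['', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82']
fixed = [fix_mojibake(item) for item in items]

for item in fixed:
    print(item)
```

Pure ASCII strings pass through untouched, because ASCII is identical in both encodings. Be aware that a genuine Latin-1 string which happens to form valid UTF-8 would also be converted, so ftfy is the safer choice for mixed data.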


Messages In This Thread
RE: Want a list utf8 formatted but bytestrings found - by DeaD_EyE - Feb-14-2019, 09:08 AM
