Want a list utf8 formatted but bytestrings found


DeaD_EyE

Feb-14-2019, 09:08 AM (This post was last modified: Feb-14-2019, 09:08 AM by DeaD_EyE.)

The \x00-style sequences you see are bytes shown in hexadecimal notation. This is the representation (repr) of the string, not the string itself.
The same notation is used in the repr of str, bytes and bytearray.
Every character that cannot be displayed, or that is a control character, is shown in this format.
If you print the string, you don't see this escaped representation of string literals.
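To make the difference between the repr and the actual content concrete, here is a minimal sketch. The sample values are made up for illustration and are not taken from the original data:

```python
# A str containing a NUL character and a C1 control character (both non-printable).
text = 'a\x00b\x86'

# repr() escapes the non-printable characters as \xNN.
text_repr = repr(text)
print(text_repr)

# Each \xNN escape stands for a single character, so the length is 4.
text_len = len(text)
print(text_len)

# In the repr of a bytes object, every non-ASCII byte is shown as \xNN.
raw = 'Ά'.encode('utf8')
raw_repr = repr(raw)
print(raw_repr)
```

The escapes only appear in the repr (for example when you display a list or inspect a value in the interactive shell); the object itself holds the single characters or bytes.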

With your data:

items = ['', 'Alexander Lepsveridze', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82', '\xce\x8c\xce\xbc\xce\xb9\xce\xbb\xce\xbf\xcf\x82 \xce\xa4\xcf\x83\xce\xbf\xcf\x84\xcf\x85\xce\xbb\xce\xaf\xce\xbf\xcf\x85']

for item in items:
    print(item)

Output:
Alexander Lepsveridze
John Comeau
ÎÎºÎ·Ï Î¤ÏÎ¹Î¬Î¼Î·Ï
ÎÎ¼Î¹Î»Î¿Ï Î¤ÏÎ¿Ï
                 Ï
Î»Î¯Î¿Ï

Now with ftfy, a third-party module that can fix broken encodings:

import ftfy
items = ['', 'Alexander Lepsveridze', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82 \xce\xa4\xcf\x83\xce\xb9\xce\xac\xce\xbc\xce\xb7\xcf\x82', '\xce\x8c\xce\xbc\xce\xb9\xce\xbb\xce\xbf\xcf\x82 \xce\xa4\xcf\x83\xce\xbf\xcf\x84\xcf\x85\xce\xbb\xce\xaf\xce\xbf\xcf\x85']

for item in items:
    print(ftfy.fix_encoding(item))

Output:
Alexander Lepsveridze
John Comeau
Άκης Τσιάμης
Όμιλος Τσοτυλίου

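The text was originally encoded as utf8, but the bytes were then decoded as latin1. Encoding the string as latin1 gives back the original bytes, which can then be decoded correctly as utf8: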

print(items[-1].encode('latin1').decode('utf8'))

Output:
Όμιλος Τσοτυλίου
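If you want to apply the same round trip to the whole list without ftfy, something like the following might work. This is only a sketch: `fix_mojibake` is a hypothetical helper name, and it assumes every broken item is utf8 that was decoded as latin1. Items that cannot make the round trip are returned unchanged.

```python
def fix_mojibake(text):
    """Hypothetical helper: undo utf8 bytes that were decoded as latin1."""
    try:
        return text.encode('latin1').decode('utf8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside latin1 (probably already
        # correct), or the resulting bytes are not valid utf8: leave it alone.
        return text


# Shortened sample: one broken item and one item that is already correct.
items = ['', 'John Comeau', '\xce\x86\xce\xba\xce\xb7\xcf\x82', 'Άκης']
fixed = [fix_mojibake(item) for item in items]
print(fixed)
```

Plain ASCII items pass through untouched, because ASCII is encoded identically in latin1 and utf8. ftfy uses more careful heuristics, so it is likely the safer choice when the data is a mix of different kinds of breakage.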

