Special Characters read-write

Prisonfeed · Sep-17-2023, 08:07 PM

I have a directory filled with .gz text archives. To scan these archives, I use the following Python code:

    with gzip.open(logDir+"\\"+fileName, mode="rb") as archive:
        for filename in archive:
            print(filename.decode().strip())

This used to work; however, the new system adds lines similar to this:

:§f Press [§bJ§f]

Python gives me this error:

  File "C:\Users\Me\Documents\Python\ConvertLog.py", line 16, in readZIP
    print(filename.decode().strip())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 49: invalid start byte

Anyone know a way of dealing with strange characters that pop up? I can't just ignore the line. This happens to be one of the few lines I need to strip out and write to a condensed report.

I tried other modes besides "rb", but I really have no idea what else to try.

**Gribouillis** · (This post was last modified: Sep-17-2023, 08:29 PM by Gribouillis.)

Try using the chardet module to detect the encoding of the bytes in filename (which holds a line of the archive, not a file name):

>>> import chardet
>>> b = 'bépoç%$'.encode('latin1')
>>> b
b'b\xe9po\xe7%$'
>>> b.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1: invalid continuation byte
>>> chardet.detect(b)
{'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}
>>> enc = chardet.detect(b)['encoding']
>>> b.decode(enc)
'bépoç%$'
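
Applied to the loop in the original post, the same idea could look roughly like the sketch below. It is only a sketch under stated assumptions: byte 0xa7 is the § character in both Latin-1 and Windows-1252, so it assumes the new lines are written in cp1252, and it tries UTF-8 first so older lines still decode as before. The helper names decode_line and read_gz_lines are made up for this example and are not part of gzip or chardet.

```python
import gzip


def decode_line(raw, encodings=("utf-8", "cp1252")):
    """Decode raw bytes by trying each candidate encoding in order.

    The candidate list is an assumption; replace "cp1252" with whatever
    chardet reports for your files if it differs.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep the line, marking undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


def read_gz_lines(path, encodings=("utf-8", "cp1252")):
    """Yield each line of a .gz text archive as a stripped str."""
    with gzip.open(path, mode="rb") as archive:
        for raw in archive:
            yield decode_line(raw, encodings).strip()


if __name__ == "__main__":
    # The problem line from the log, as it would look in cp1252 bytes.
    print(decode_line(b":\xa7f Press [\xa7bJ\xa7f]"))
```

If every archive is known to use a single encoding, passing it directly, for example gzip.open(path, mode="rt", encoding="cp1252"), avoids the per-line fallback entirely.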
