Python Forum
Special Characters read-write
#1
Question 
I have a directory filled with .gz text archives. To scan these archives, I use the following Python code:

    with gzip.open(logDir+"\\"+fileName, mode="rb") as archive:
        for filename in archive:
            print(filename.decode().strip())
This all used to work, but the new system adds lines similar to this:

:§f Press [§bJ§f]

Python gives me this error:

File "C:\Users\Me\Documents\Python\ConvertLog.py", line 16, in readZIP
    print(filename.decode().strip())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa7 in position 49: invalid start byte
Does anyone know a way of dealing with strange characters that pop up? I can't just ignore the line; it happens to be one of the few lines I need to strip out and write to a condensed report.

I tried other modes besides "rb", but I really have no idea what else to try.
#2
Try using the chardet module to detect the encoding of the bytes you are decoding:
>>> import chardet
>>> b = 'bépoç%$'.encode('latin1')
>>> b
b'b\xe9po\xe7%$'
>>> b.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1: invalid continuation byte
>>> chardet.detect(b)
{'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}
>>> enc = chardet.detect(b)['encoding']
>>> b.decode(enc)
'bépoç%$'
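
If you would rather not depend on detection, here is a minimal sketch that tries UTF-8 first and falls back to a single-byte encoding for lines that fail. It assumes the odd lines are Latin-1 (byte 0xa7 is § in both Latin-1 and cp1252); the function name read_log_lines and the fallback default are illustrative choices of mine, not something from your script.

```python
import gzip


def read_log_lines(path, fallback="latin-1"):
    """Yield stripped text lines from a gzip-compressed log file.

    Each line is decoded as UTF-8 first. If that raises
    UnicodeDecodeError, the line is decoded with `fallback` instead.
    The fallback encoding is an assumption; adjust it to your data.
    """
    with gzip.open(path, mode="rb") as archive:
        for raw in archive:
            try:
                text = raw.decode("utf-8")
            except UnicodeDecodeError:
                # Latin-1 maps every byte to a character, so this
                # second attempt cannot fail with the default fallback.
                text = raw.decode(fallback)
            yield text.strip()
```

You would then loop with something like `for line in read_log_lines(path): print(line)`. If every file from the new system uses the same encoding, passing `mode="rt"` and `encoding=...` to gzip.open is simpler still.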