Python Forum
Failing reading a file and cannot exit it...
I’m trying to scan files in a directory, and I found some files are corrupted somehow.
The files are standard text .log files.
I tried opening them in Notepad: no errors, but I do not see the content of the file.
I tried to open the files with Notepad++. The files appear to have one line:
I thought I could break out of the file by using try/except, but it does not work.
The script just runs and runs... without stopping.
Here is a snippet I tried:
from pathlib import Path

for ef in Path('D:\\somedir\\').iterdir():
    print(f" File ->{ef}")
    try:
        with open(ef, 'r') as mfiler:
            for echl in mfiler:
                print(f" line ->{echl}")
    except OSError as oss:
        # Report the file path; echl may not be defined if open() itself fails
        print(f"  bad file -> {ef}: {oss}")
Thank you!
Have a look at this link on using iterdir()

for ef in Path('some path').iterdir() doesn't look right. I've not used it before, but it seems like you're looping a loop.

When I remove the file in question (the one I cannot open),
the snippet seems to work fine. I use the following line all the time, and I do not think it is the problem.
for ef in Path('D:\\somedir\\').iterdir() :
The file is the problem, but I cannot abort or exit it.

Thank you
Maybe it is working fine. How large are these "corrupted" files? It would take a while to print a million empty strings.
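One way to tell a slow loop from a stuck one is to summarize each file instead of printing it. Below is a minimal sketch; the helper name `summarize_file` is hypothetical and not part of the original script.

```python
from pathlib import Path


def summarize_file(path):
    """Return (size_in_bytes, line_count) without printing any line."""
    path = Path(path)
    line_count = 0
    # Binary mode skips decoding, so unusual bytes cannot raise decode errors.
    with open(path, "rb") as fh:
        for _ in fh:
            line_count += 1
    return path.stat().st_size, line_count
```

If this returns quickly for the "corrupted" files, the original loop was most likely just busy printing a large number of lines.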
I think I found out how to fix the problem I have. See the snippet below.

The files that fail are not big, just 14 MB.
I understand the file has data, but the characters in the file are corrupted or unprintable.
The first line of each file starts with "YYMMDD HHMMSS", so I thought I could check for that.
from pathlib import Path
import re

for ef in Path('D:\\Somedir\\').iterdir() :
    with open(ef, encoding='utf-8', errors='ignore') as mfiler:  
        frt_ln = mfiler.readline()
        if not re.search(r"\d", frt_ln):
            print(f"  Found bad file -> {ef}")
I need to get the first and the last lines from each file, which I'll use later in the script, in case you are wondering why I'm going this way.
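Getting the first and last lines does not require reading the whole file. Here is a rough sketch that reads the first line normally and then reads blocks backwards from the end; the function name `first_and_last_line` and its parameters are assumptions made for illustration, and trailing blank lines are ignored when picking the last line.

```python
import os


def first_and_last_line(path, encoding="utf-8", block_size=4096):
    """Return (first_line, last_line) of a text file, without line endings."""
    with open(path, "rb") as fh:
        first = fh.readline()
        fh.seek(0, os.SEEK_END)
        pos = fh.tell()
        tail = b""
        # Read blocks backwards until the tail holds one complete last line.
        while pos > 0:
            step = min(block_size, pos)
            pos -= step
            fh.seek(pos)
            tail = fh.read(step) + tail
            if b"\n" in tail.rstrip(b"\r\n"):
                break
    # Drop trailing newlines, then keep whatever follows the last line break.
    last = tail.rstrip(b"\r\n").split(b"\n")[-1]

    def to_text(raw):
        # errors="replace" keeps undecodable bytes from raising an exception.
        return raw.decode(encoding, errors="replace").rstrip("\r\n")

    return to_text(first), to_text(last)
```

For a 14 MB log this touches only the first line and the last few kilobytes of the file.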

Thank you.
You are printing out 14 MB files? Or is that just example code? If it is just example code, what kind of processing are you doing?
No, I'm not printing each line of the file.
I'm searching for specific lines. It just happens that I cannot open some of the files.
I started looking for a way to abort the search of a file if it cannot be opened or read, or if I can't print each line...
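Abandoning a file partway through is easiest when the search lives in its own function, because a `return` inside the `with` block closes the file and ends the loop at once. This is a sketch under the assumption that a "bad" line is one containing a NUL character, which is what the thread reports further down; `find_lines` and `needle` are hypothetical names.

```python
def find_lines(path, needle, encoding="utf-8"):
    """Return the lines containing needle, or None if the file looks corrupted."""
    matches = []
    with open(path, encoding=encoding, errors="replace") as fh:
        for line in fh:
            if "\0" in line:
                # Give up on this file; the with block still closes it.
                return None
            if needle in line:
                matches.append(line.rstrip("\n"))
    return matches
```

The caller can then skip any file for which the result is `None` and move on to the next one.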

Thank you.
What makes you think you cannot open some of the files?
When I found the first file that the script failed on, I tried to open it with Notepad and Notepad++.
Notepad showed blank lines; Notepad++ showed a line "NULNULNULNUL..."
I created a script to collect the files whose first line the script fails to read, and I found that the NUL character is actually '\0'.
It appears "\0" is a valid Unicode character (NUL, U+0000).
I added an if statement to my script:

if re.search("\0\0\0", mysring) :
Also, I'm writing the file that fails and the line that fails to a file so I can debug the script...
I also found that the "NULNULNUL" line can happen anywhere in the file.
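Since the NUL run can appear anywhere, one option is to scan each file in binary mode before processing it. The sketch below assumes the logs are single-byte or UTF-8 text (UTF-16 files contain NUL bytes legitimately); `first_nul_offset` is a hypothetical helper name.

```python
def first_nul_offset(path, chunk_size=1 << 20):
    """Return the byte offset of the first NUL byte, or -1 if there is none."""
    offset = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return -1  # reached end of file without finding a NUL
            idx = chunk.find(b"\x00")
            if idx != -1:
                return offset + idx
            offset += len(chunk)
```

Reading in fixed-size chunks keeps memory use flat, and the returned offset shows where in the file the damage starts.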

I appreciate your help!
