Thread: Failing reading a file and cannot exit it... (/thread-37999.html)
Python Forum (https://python-forum.io), General Coding Help

Failing reading a file and cannot exit it... - tester_V - Aug-19-2022

Greetings!
I'm trying to scan files in a directory, and I found that some files are corrupted somehow. The files are standard text.log. I tried opening them in Notepad: no errors, but I do not see the content of the file. I tried opening the files with Notepad++, and they appear to have one line: NULNULNUL....
I thought I could break out of the file by using try/except, but it does not work. The script just runs and runs without stopping.
Here is a snippet I tried:

    from pathlib import Path

    for ef in Path('D:\\somedir\\').iterdir():
        print(f" File ->{ef}")
        try:
            with open(ef, 'r') as mfiler:
                for echl in mfiler:
                    print(f" line ->{echl}")
        except OSError as oss:
            # Report the file name; echl is unbound if open() itself fails
            print(f" bad file -> {ef}")

Thank you!

RE: Failing reading a file and cannot exit it... - menator01 - Aug-19-2022

Have a look at this link on using iterdir().
for ef in Path('some path').iterdir() doesn't look right. I've not used it before, but it seems like you're looping a loop.
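
A minimal sketch of why the try/except in the first post may never fire: a file full of NUL bytes is still a readable file, so open() and the line loop raise no OSError. The helper name contains_nul, the chunk size, and the temporary stand-in file are assumptions made for illustration; they are not from the thread.

```python
import os
import tempfile


def contains_nul(path, chunk_size=64 * 1024):
    """Return True if the file at `path` contains at least one NUL byte."""
    with open(path, "rb") as fh:
        # Read in chunks so large logs are not loaded into memory at once.
        while chunk := fh.read(chunk_size):
            if b"\x00" in chunk:
                return True
    return False


# Build a small stand-in for the "NULNULNUL" log (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    bad = os.path.join(tmp, "bad.log")
    with open(bad, "wb") as fh:
        fh.write(b"\x00" * 1000)

    raised = False
    try:
        # NUL is a valid character, so text-mode reading succeeds.
        with open(bad, "r", encoding="utf-8") as fh:
            lines = list(fh)
    except OSError:
        raised = True

    nul_found = contains_nul(bad)

print(raised, len(lines), nul_found)
```

Checking the raw bytes before processing is one way to skip such files up front instead of waiting for an exception that does not come.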

RE: Failing reading a file and cannot exit it... - tester_V - Aug-19-2022

When I remove the file in question (the one I cannot open), the snippet seems to work fine. I use the following all the time, and I don't think it is the problem:

    for ef in Path('D:\\somedir\\').iterdir():

It is the file that is the problem, but I cannot abort it or exit it.
Thank you

RE: Failing reading a file and cannot exit it... - deanhystad - Aug-19-2022

Maybe it is working fine. How large are these "corrupted" files? It would take a while to print a million empty strings.

RE: Failing reading a file and cannot exit it... - tester_V - Aug-19-2022

I think I found out how to fix the problem I have. See the snippet below.
The files that fail are not big, just 14 MB. I understand the file has data, but the characters in the file are corrupted or unprintable.
The first line of each file starts with "YYMMDD HHMMSS", so I thought I could check that.

    from pathlib import Path
    import re

    for ef in Path('D:\\Somedir\\').iterdir():
        with open(ef, encoding='utf-8', errors='ignore') as mfiler:
            frt_ln = mfiler.readline()
            # Raw string avoids the invalid "\d" escape sequence
            if not re.search(r"\d", frt_ln):
                print(f" Found bad file -> {ef}")
                continue

I need to get the first and the last lines from each file, which I'll use later in the script, in case you are wondering why I'm going this way.
Thank you.

RE: Failing reading a file and cannot exit it... - deanhystad - Aug-19-2022

You are printing out 14 MB files? Or is that just example code? If it is just example code, what kind of processing are you doing?

RE: Failing reading a file and cannot exit it... - tester_V - Aug-19-2022

No, I'm not printing each line of the file.
I'm searching for specific lines. It just happened that some of the files I cannot open. I started looking for a way to abort the search of a file if it can't be opened/read or I can't print each line...
Thank you.

RE: Failing reading a file and cannot exit it... - deanhystad - Aug-19-2022

What makes you think you cannot open some of the files?
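
Since the goal stated above is the first and last line of each file, here is a rough sketch of one way to get both without looping over every line. It assumes the timestamp is exactly six digits, a space, and six digits, and that the last line fits in the final 4096 bytes; the names first_and_last_line, looks_like_log, and tail_bytes are hypothetical, not from the thread.

```python
import os
import re
import tempfile

# Assumed timestamp shape: six digits, a space, six digits ("YYMMDD HHMMSS").
TIMESTAMP = re.compile(r"\d{6} \d{6}")


def first_and_last_line(path, tail_bytes=4096):
    """Return (first_line, last_line) as text, without line endings.

    Only the start of the file and its last `tail_bytes` bytes are read.
    A last line longer than `tail_bytes` would come back truncated.
    """
    with open(path, "rb") as fh:
        first = fh.readline()
        size = fh.seek(0, os.SEEK_END)      # seek() returns the new position
        fh.seek(max(0, size - tail_bytes))
        tail_lines = fh.read().splitlines()
    last = tail_lines[-1] if tail_lines else b""

    def decode(raw):
        return raw.decode("utf-8", errors="replace").rstrip("\r\n")

    return decode(first), decode(last)


def looks_like_log(first_line):
    """True if the line starts with the assumed timestamp pattern."""
    return TIMESTAMP.match(first_line) is not None


# Hypothetical sample files standing in for a good log and a NUL-filled one.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.log")
    with open(good, "wb") as fh:
        fh.write(b"220819 101500 start\nmiddle line\n220819 235959 end\n")
    first, last = first_and_last_line(good)
    good_ok = looks_like_log(first)

    bad = os.path.join(tmp, "bad.log")
    with open(bad, "wb") as fh:
        fh.write(b"\x00" * 1000)
    bad_ok = looks_like_log(first_and_last_line(bad)[0])

print(first, "|", last, "|", good_ok, bad_ok)
```

Matching the full timestamp shape is stricter than searching for any digit, so a corrupted first line that happens to contain a stray digit would still be flagged.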

RE: Failing reading a file and cannot exit it... - tester_V - Aug-19-2022

When I found the first file that the script failed on, I tried to open it with Notepad and Notepad++. Notepad showed blank lines; Notepad++ showed a line "NULNULNULNUL...".
I created a script to collect the files whose first line the script fails to read, and I found that the NUL character is actually '\0'. It appears "\0" is a Unicode character.
I added an if statement to my script:

    if re.search("\0\0\0", mysring):
        break

Also, I'm printing the file that fails and the line that fails to a file so I can debug the script...
I also found that the "NULNULNUL" line can happen anywhere in the file.
I appreciate your help!
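
Because the NUL run can show up anywhere in a file, a possible variation on the check above is to record the offending line numbers and keep the readable lines, rather than stopping at the first hit. This is only a sketch: the function name split_clean_and_bad, the three-NUL marker as a default, and the sample lines are assumptions for illustration.

```python
def split_clean_and_bad(lines, marker="\0\0\0"):
    """Separate readable lines from lines containing a run of NUL characters.

    Returns (clean_lines, bad_line_numbers); line numbers start at 1.
    """
    clean, bad = [], []
    for lineno, line in enumerate(lines, start=1):
        # A plain substring test is enough here; no regex is needed.
        if marker in line:
            bad.append(lineno)
        else:
            clean.append(line)
    return clean, bad


# Hypothetical sample: a NUL-filled line sitting between two normal lines.
sample = [
    "220819 101500 ok\n",
    "\0" * 10 + "\n",
    "220819 101501 ok\n",
]
clean, bad = split_clean_and_bad(sample)

# "\0" is the character U+0000, the same as chr(0) or "\x00".
same_char = "\0" == chr(0) == "\x00"

print(len(clean), bad, same_char)
```

The same function accepts an open text file object in place of the list, since it only iterates over lines.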