Python Forum
Using .hdf5 files only once they are finished writing
#1
I am trying to use .hdf5 files only once they have finished being written (in my case, emitting them through a Qt signal). The problem is that I don't have a way to 1) test whether a file has finished writing and 2) then send it. The code I have been working with is as follows:
        while True:
            event = self._q.get()
            while True:
                try:
                    file = h5py.File(event.src_path, "r")
                    file.close()
                    self.new_file.emit(event.src_path, os.path.basename(event.src_path))
                    break
                except OSError:
                    if retry_count < max_retry_count:
                        retry_count += 1
                        print(f"h5 file <{event.src_path}> is locked, retrying {retry_count}/{max_retry_count}")
                        time.sleep(retry_interval_seconds)
                    else:
                        print(f"h5 file <{event.src_path}> reached max retry count, skipping")

                except Exception as err:
                    print(f"Got unexpected Error <{type(err).__name__}> while opening <{event.src_path}> ")
                    traceback.print_exc()
Obviously the break is problematic here. But without the break, the inner loop never exits and emits the same file over and over again. The code detects perfectly well when a file has finished writing, but sending it and then continuing to take in new files does not work. Any insight is greatly appreciated.



For completeness, here is the full code:
    import time
    import traceback
    import os
    
    import h5py
    import queue
    from typing import Union
    
    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler, DirCreatedEvent, FileCreatedEvent
    
    from .tools.qt import QtCore
    
    from PyQt5.QtWidgets import QApplication
    from PyQt5.QtCore import (
        QObject,
        QThread,
        pyqtSignal,
        pyqtSlot,
    )
    
    class NewFileHandler(FileSystemEventHandler):
    
        def __init__(self, q, *a, **k):
            super().__init__(*a, **k)
            self._q = q
    
        def on_created(self, event):
            self._q.put(event)
    
    class Worker(QObject):
    
        new_file = pyqtSignal(str,str)
    
        def __init__(self, path):
            super().__init__()
            self._q = queue.Queue()
            observer = Observer()
            handler = NewFileHandler(self._q)
            observer.schedule(handler, path=path, recursive=True)
            # starts a background thread! Thus we need to wait for the
            # queue to receive the events in work.
            observer.start()
    
        def work(self):
            max_retry_count = 3500  # for test purposes now but want to set an upper bound on verifying a file is finished.
            retry_interval_seconds = .01  # every hundredth of a second it will try the file to see if it finished writing
            retry_count = 0
            while True:
                event = self._q.get()
                while True:
                    try:
                        file = h5py.File(event.src_path, "r")
                        file.close()
                        self.new_file.emit(event.src_path, os.path.basename(event.src_path))
                    except OSError:
                        if retry_count < max_retry_count:
                            retry_count += 1
                            print(f"h5 file <{event.src_path}> is locked, retrying {retry_count}/{max_retry_count}")
                            time.sleep(retry_interval_seconds)
                        else:
                            print(f"h5 file <{event.src_path}> reached max retry count, skipping")
    
                    except Exception as err:
                        print(f"Got unexpected Error <{type(err).__name__}> while opening <{event.src_path}> ")
                        traceback.print_exc()
This code is run from main.py as follows:

        thread = QThread(parent=self)
        print('try to connect to event service ...')
        worker = watchdog_search.Worker("/home/test_image_analyzer_files/Test_Data/")
        worker.moveToThread(thread)
        thread.started.connect(worker.work)
        thread.start()
        worker.new_file.connect(self.on_finished_run)
#2
Use the else part of the try statement:
while True:
    event = self._q.get()
    while True:
        try:
            file = h5py.File(event.src_path, "r")
            file.close()
        except OSError:
            if retry_count < max_retry_count:
                retry_count += 1
                print(f"h5 file <{event.src_path}> is locked, retrying {retry_count}/{max_retry_count}")
                time.sleep(retry_interval_seconds)
            else:
                print(f"h5 file <{event.src_path}> reached max retry count, skipping")
                break # <--- looks useful here
        except Exception as err:
            print(f"Got unexpected Error <{type(err).__name__}> while opening <{event.src_path}> ")
            traceback.print_exc()
        else:
            self.new_file.emit(event.src_path, os.path.basename(event.src_path))
            break
You could perhaps add a time.sleep() in the except Exception branch as well.
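The same try/except/else idea can also be pulled out into a small helper, which keeps the retry counter local to each file. This is only a sketch: wait_until_openable, open_fn and probe_hdf5 are hypothetical names that do not appear in the thread, and it assumes (as the code above does) that h5py raises OSError while the file is still locked or incomplete.

```python
import time


def wait_until_openable(path, open_fn, max_retries=5, interval=0.01):
    """Return True once open_fn(path) succeeds, False after max_retries failures.

    open_fn is a hypothetical callable supplied by the caller; it should
    raise OSError while the file is still locked or incomplete.
    """
    for attempt in range(1, max_retries + 1):
        try:
            open_fn(path)
        except OSError:
            # Still being written (or locked): wait and try again.
            print(f"<{path}> not ready, retry {attempt}/{max_retries}")
            time.sleep(interval)
        else:
            # Runs only when open_fn raised nothing.
            return True
    return False


def probe_hdf5(path):
    # Assumes h5py is installed; imported lazily so the helper above
    # only needs the standard library.
    import h5py

    with h5py.File(path, "r"):
        pass
```

Inside work() this could be used as `if wait_until_openable(event.src_path, probe_hdf5, 3500, .01): self.new_file.emit(...)`, so that a file which never becomes readable is skipped instead of blocking the queue.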
#3
Thank you for the tip. That helped, and now it somewhat works. The problem now is that after one file is emitted (i.e. the else branch runs), the code gets stuck in the OSError branch and cycles through the if/else until it reaches the break in the else. I tried a few things but nothing worked. It seems that after the signal is emitted there is no event at self._q.get(), so it falls into OSError until the loop is broken. If a new file arrives at self._q.get() during this loop, it can only be emitted once the retry loop has finished and the process starts again. In short: the problem now is the wait time between files. Do you have any idea how I can fix this?


        
        while True:
            event = self._q.get()
            max_retry_count = 3500  # for test purposes now but want to set an upper bound on verifying a file is finished.
            retry_interval_seconds = .01  # every hundredth of a second it will try the file to see if it finished writing
            retry_count = 0
            while True:
                try:
                    file = h5py.File(event.src_path, "r")
                    file.close()
                except OSError:
                    if retry_count < max_retry_count:
                        retry_count += 1
                        print(f"h5 file <{event.src_path}> is locked, retrying {retry_count}/{max_retry_count}")
                        time.sleep(retry_interval_seconds)
                    else:
                        print(f"h5 file <{event.src_path}> reached max retry count, skipping")
                        break  # <--- looks useful here
                except Exception as err:
                    print(f"Got unexpected Error <{type(err).__name__}> while opening <{event.src_path}> ")
                    traceback.print_exc()
                else:
                    self.new_file.emit(event.src_path, os.path.basename(event.src_path))
                    break
#4
I solved the problem by adding the following if statement:

        while True:
            event = self._q.get()
            max_retry_count = 350  # for test purposes now but want to set an upper bound on verifying a file is finished.
            retry_interval_seconds = .01  # every hundredth of a second it will try the file to see if it finished writing
            retry_count = 0
            if event.event_type == "created" and event.src_path.lower().endswith(".hdf5"):
                while True:
                    try:
                        file = h5py.File(event.src_path, "r")
                        file.close()
                    except OSError:
                        if retry_count < max_retry_count:
                            retry_count += 1
                            print(f"h5 file <{event.src_path}> is locked, retrying {retry_count}/{max_retry_count}")
                            time.sleep(retry_interval_seconds)
                        else:
                            print(f"h5 file <{event.src_path}> reached max retry count, skipping")
                            break  # <--- looks useful here
                    except Exception as err:
                        print(f"Got unexpected Error <{type(err).__name__}> while opening <{event.src_path}> ")
                        traceback.print_exc()
                    else:
                        self.new_file.emit(event.src_path, os.path.basename(event.src_path))
                        break
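A probable reason the filter helps: a recursive observer also reports new directories and files of other types, and opening those with h5py fails with OSError, so they burn through the whole retry budget. The same check can be written as a small predicate and applied before the event is queued. This is a sketch only; is_new_hdf5 is a hypothetical name, and the SimpleNamespace objects merely stand in for watchdog events (which expose event_type, src_path and is_directory).

```python
import os
from types import SimpleNamespace


def is_new_hdf5(event, extensions=(".hdf5",)):
    """True for 'created' events on files (not directories) with an HDF5 extension."""
    if getattr(event, "is_directory", False):
        return False
    if event.event_type != "created":
        return False
    # Compare the extension case-insensitively, like .lower().endswith() above.
    return os.path.splitext(event.src_path)[1].lower() in extensions


# Stand-ins for watchdog events, for illustration only.
file_event = SimpleNamespace(
    event_type="created", src_path="/data/run1.HDF5", is_directory=False
)
dir_event = SimpleNamespace(
    event_type="created", src_path="/data/new_dir", is_directory=True
)
print(is_new_hdf5(file_event), is_new_hdf5(dir_event))  # True False
```

Calling this in NewFileHandler.on_created (`if is_new_hdf5(event): self._q.put(event)`) would keep irrelevant events out of the queue altogether.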
#5
(Nov-03-2021, 07:40 AM)pyhill00 Wrote: I solved the problem with the following if statement: [same code as in #4]
#6
Is it possible that I need to close the thread when closing the program? I noticed that after running the program about five times, the code in the watchdog file doesn't even get called anymore and I am left at:

try to connect to event service ...
#7
I found where the problem is:

thread.started.connect(worker.work)
stops working after I run the program several times in a row (i.e. run the program, close it, run it again, and so on). Does anyone know why this line stops working?
#8
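I fixed the issue by using a daemon thread:

    # requires `import threading` at the top of main.py
    worker = watchdog_search.Worker("/home/test_image_analyzer_files/Test_Data/")
    worker.new_file.connect(self.on_finished_run)
    # daemon=True replaces thread.setDaemon(True), which is deprecated since Python 3.10
    thread = threading.Thread(target=worker.work, daemon=True)
    thread.start()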
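A daemon thread is simply killed when the interpreter exits. If a clean shutdown is preferred, one alternative is to give the loop a stop flag and poll the queue with a timeout, so the thread can be joined before the program closes. The sketch below uses only the standard library; StoppableConsumer and its methods are hypothetical names standing in for Worker.work(), not part of the code in this thread.

```python
import queue
import threading


class StoppableConsumer:
    """Hypothetical stand-in for Worker: drains a queue until stop() is called."""

    def __init__(self, handle):
        self._q = queue.Queue()
        self._stop = threading.Event()
        self._handle = handle  # callback invoked for each queued item

    def put(self, item):
        self._q.put(item)

    def drain(self):
        # Block until every queued item has been handled.
        self._q.join()

    def stop(self):
        self._stop.set()

    def work(self):
        while not self._stop.is_set():
            try:
                # The timeout keeps the loop from blocking forever on an
                # empty queue, so the stop flag is checked regularly.
                item = self._q.get(timeout=0.1)
            except queue.Empty:
                continue
            try:
                self._handle(item)
            finally:
                self._q.task_done()


seen = []
consumer = StoppableConsumer(seen.append)
thread = threading.Thread(target=consumer.work)
thread.start()
consumer.put("a.hdf5")
consumer.put("b.hdf5")
consumer.drain()         # wait for both items to be processed
consumer.stop()          # ask the loop to exit
thread.join(timeout=2)   # returns once work() has finished
print(seen)              # ['a.hdf5', 'b.hdf5']
```

In the real Worker, the watchdog observer would presumably also need to be kept on self so that observer.stop() and observer.join() can be called at shutdown.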