Python Forum

Full Version: q re glob.iglob iterator and close
I'm struggling to find any documentation that tells me how I should properly "short circuit" a glob.iglob iterator.

I have some code that needs to check that a directory has at least one file matching a given pattern. It seemed reasonable to handle this via a call to glob.iglob that then tested for a value:

import glob
import os

itr = glob.iglob(os.path.join(publickeys, "*.key"))
if next(itr, None) is None:
    raise ValueError(f"publickeys {publickeys} has no *.key files")
I know that, under the hood, glob.iglob yields from a call to a 'private' function that ultimately calls os.scandir:

  1. https://github.com/python/cpython/blob/m...lob.py#L69
  2. https://github.com/python/cpython/blob/m...lob.py#L94
  3. https://github.com/python/cpython/blob/m...ob.py#L163
  4. https://github.com/python/cpython/blob/m...ob.py#L128
  5. https://github.com/python/cpython/blob/m...ob.py#L146

The os.scandir documentation (https://docs.python.org/3/library/os.html#os.scandir) does say it supports close() to free acquired resources, and you can see in step 5 above that os.scandir is called using a with statement, which will close the os.scandir file descriptor once the entries have been looped over. But, of course, in my code I'm not looping over everything. I'm just checking the first entry...
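This is the mechanism that makes closing the iterator work: calling close() on a generator raises GeneratorExit at the paused yield, which unwinds any enclosing with block and releases its resource. A minimal sketch (not glob's actual code, just an illustration of the same pattern) assuming a hypothetical resource() context manager:

```python
import contextlib

closed = []

@contextlib.contextmanager
def resource():
    # stand-in for os.scandir: the finally clause is the cleanup
    try:
        yield "handle"
    finally:
        closed.append(True)

def gen():
    # stand-in for glob's internal generator: yields from inside a with
    with resource() as h:
        for i in range(3):
            yield (h, i)

g = gen()
next(g)            # resource is now open; generator is paused at yield
assert not closed  # cleanup has not run yet
g.close()          # raises GeneratorExit at the yield, exiting the with
assert closed      # the resource was released immediately, not at gc time
```

The same unwinding happens inside glob's generator when you call itr.close(), which is why the scandir handle is freed promptly.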

I thought the following would be reasonable:

try:
    if next(itr, None) is None:
        raise ValueError(f"publickeys {publickeys} has no *.key files")
finally:
    itr.close()
and some testing shows that it does what I expect: once I've closed the iterator, calling next() on it properly reflects its closed state.
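A tidier way to get the same guarantee as the try/finally is contextlib.closing, which calls close() on exit from the with block. A sketch (the tempfile setup is purely for illustration; in the original code publickeys already exists):

```python
import contextlib
import glob
import os
import tempfile

# hypothetical directory standing in for the real publickeys variable
publickeys = tempfile.mkdtemp()
open(os.path.join(publickeys, "demo.key"), "w").close()

with contextlib.closing(glob.iglob(os.path.join(publickeys, "*.key"))) as itr:
    if next(itr, None) is None:
        raise ValueError(f"publickeys {publickeys} has no *.key files")
# itr.close() has been called here, whether or not the check raised
```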

But a fellow developer is calling this try/finally into question, asking whether relying on the iterator's close() method is a mistake, since it relies on "knowing that os.scandir is being called" under the hood.

TLDR: Do I need to close a glob.iglob iterator that I'm not draining, or should I just let Python's garbage collector handle it whenever it decides it's safe to close the iterator?
As long as an opaque object such as itr is in scope, there is a risk that it keeps some resources alive. The simplest way to release those resources is to let the object go out of scope. It seems to me that the best approach is to encapsulate this in a function:
def no_glob(pathname, recursive=False):
    # recursive is keyword-only in glob.iglob, so it must be passed by name
    return next(glob.iglob(pathname, recursive=recursive), None) is None

if no_glob(os.path.join(publickeys, "*.key")):
    raise ValueError(f"publickeys {publickeys} has no *.key files")
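A quick sanity check of no_glob against an empty and a non-empty directory (the function is repeated here, and tempfile used, only so the snippet runs standalone):

```python
import glob
import os
import tempfile

def no_glob(pathname, recursive=False):
    # recursive is keyword-only in glob.iglob, so pass it by name
    return next(glob.iglob(pathname, recursive=recursive), None) is None

with tempfile.TemporaryDirectory() as d:
    assert no_glob(os.path.join(d, "*.key"))        # empty directory: no match
    open(os.path.join(d, "demo.key"), "w").close()
    assert not no_glob(os.path.join(d, "*.key"))    # one matching file
```

Because the iglob generator is never bound to a name outside the function, it becomes unreachable as soon as no_glob returns, so its resources are eligible for cleanup immediately.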
The itertools module documentation has a recipes section, which includes a consume function for consuming an iterator:

import collections
from itertools import islice

def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)
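A short usage example (the recipe is repeated so the snippet runs standalone). Note that fully draining an iglob iterator this way also runs glob's internal generator to completion, so its with os.scandir(...) block exits on its own, without an explicit close():

```python
import collections
from itertools import islice

def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    if n is None:
        collections.deque(iterator, maxlen=0)
    else:
        next(islice(iterator, n, n), None)

it = iter(range(5))
consume(it, 2)               # skip the first two items
assert next(it) == 2
consume(it)                  # drain whatever remains
assert next(it, None) is None
```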