Maybe this helps a little bit to understand.
```python
import pathlib
from collections import defaultdict
from itertools import groupby


def get_group(file):
    """
    Returns the group as a str
    """
    return '_'.join(file.name.split('_')[2:4])


def file_order(file):
    a, b = map(int, file.name.split('_')[:2])
    # I am using numbers for a and b in test code
    # a, b = file.name.split('_')[:2]
    return a, b


def grouped_files(doc_root):
    doc_root = pathlib.Path(doc_root)
    grouped = defaultdict(list)
    iterator = doc_root.glob('*_*_*_*.hdf')
    for group, items in groupby(iterator, key=get_group):
        for item in items:
            grouped[group].append(item)
    for files in grouped.values():
        # in-place sort, which mutates the list
        files.sort(key=file_order)
    return grouped
```

If the order of the hdf_files in each group doesn't matter, you can remove the sort code.
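To make the two key functions easier to follow, here is a small sketch of what they return. The file names are made up for illustration (I don't know your real naming scheme), and `PurePath` is used so nothing has to exist on disk. It also shows why `file_order` converts to `int`: a string comparison would put `10` before `2`.

```python
from pathlib import PurePath


def get_group(file):
    # third and fourth underscore-separated parts form the group name
    return '_'.join(file.name.split('_')[2:4])


def file_order(file):
    # first two parts, compared as numbers
    a, b = map(int, file.name.split('_')[:2])
    return a, b


# Hypothetical file names, assumed only for this example
names = ['10_1_temp_day_x.hdf', '2_1_temp_day_x.hdf', '2_3_wind_night_x.hdf']
files = [PurePath(name) for name in names]

groups = [get_group(f) for f in files]
numeric_order = [f.name for f in sorted(files, key=file_order)]
# same key without int(): parts are compared as strings
string_order = [f.name for f in sorted(files, key=lambda f: f.name.split('_')[:2])]

print(groups)
print(numeric_order)
print(string_order)
```

With these names the numeric key sorts the `2_...` files before `10_...`, while the string key puts `10_...` first.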
On the other hand, sorting the files by group first can simplify the code.
```python
def grouped_files(doc_root):
    doc_root = pathlib.Path(doc_root)
    grouped = {}
    iterator = doc_root.glob('*_*_*_*.hdf')
    sorted_files = sorted(iterator, key=get_group)
    for group, items in groupby(sorted_files, key=get_group):
        # because the list is sorted by group,
        # each group occurs only once
        # items is an iterator, you have to consume it
        grouped[group] = list(items)
        # if sorting of hdf_files is required
        # grouped[group] = sorted(items, key=file_order)
    return grouped
```

https://docs.python.org/3/library/stdtypes.html#list.sort
https://docs.python.org/3/library/functions.html#sorted
https://docs.python.org/3/library/iterto...ls.groupby
https://docs.python.org/3/library/pathli...le-pathlib
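As a side note on why the input is sorted before `groupby`: `groupby` only merges neighbouring items with the same key, so on unsorted input the same group can show up more than once. A minimal sketch with made-up group names (assumed only for this example):

```python
from itertools import groupby

# Hypothetical group keys in the arbitrary order a directory listing might give
keys = ['temp_day', 'wind_night', 'temp_day']

# Without sorting, 'temp_day' appears as two separate runs
unsorted_runs = [(key, len(list(items))) for key, items in groupby(keys)]

# After sorting, each key forms exactly one run
sorted_runs = [(key, len(list(items))) for key, items in groupby(sorted(keys))]

# Assigning into a plain dict on unsorted input silently overwrites earlier runs
overwritten = {key: list(items) for key, items in groupby(keys)}

print(unsorted_runs)
print(sorted_runs)
print(overwritten)
```

This is why the first version collects into a `defaultdict(list)` with `append`, while the version that sorts first can assign `list(items)` directly.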