How to concatenate files while looping through lists?
Python Forum (https://python-forum.io), Python Coding, General Coding Help, thread /thread-16934.html

How to concatenate files while looping through lists? - python_newbie09 - Mar-20-2019

I have a few hundred files in one folder, and every file has two corresponding files that it needs to be combined with. I have written the code that finds the corresponding files and adds them to a list, but I am stuck on how to combine the files while looping through each list.

    import re

    f = ['a_b_c_111.hdf', 'b_b_c_111.hdf', 'b_c_e_112.hdf', 'c_c_e_112.hdf']
    file_to_combine = {}
    for file in f:
        a, b, c, d = re.split(r'_', file)
        s = c + '_' + d  # group key, e.g. 'c_111.hdf'
        if s in file_to_combine:
            file_to_combine[s].append(file)
        else:
            file_to_combine[s] = [file]

    for k, v in file_to_combine.items():
        # v is already the list of files that belong together
        print(v)

Running this script prints two lists for the sample above; there will be many more when I run it over hundreds of files:

    ['a_b_c_111.hdf', 'b_b_c_111.hdf']
    ['b_c_e_112.hdf', 'c_c_e_112.hdf']

How can I loop through each of these lists and concatenate the HDF files within it? I would appreciate some help on how best to do this. Thanks.

RE: How to concatenate files while looping through lists? - Yoriz - Mar-20-2019

This might be useful: HDF5 for Python

RE: How to concatenate files while looping through lists? - DeaD_EyE - Mar-20-2019

Maybe this helps a little bit to understand.

    import pathlib
    from collections import defaultdict
    from itertools import groupby


    def get_group(file):
        """
        Returns the group as a str
        """
        return '_'.join(file.name.split('_')[2:4])


    def file_order(file):
        # The test files used numbers for a and b. For non-numeric
        # prefixes, like the ones in the question, use the
        # commented-out line instead.
        a, b = map(int, file.name.split('_')[:2])
        # a, b = file.name.split('_')[:2]
        return a, b


    def grouped_files(doc_root):
        doc_root = pathlib.Path(doc_root)
        grouped = defaultdict(list)
        iterator = doc_root.glob('*_*_*_*.hdf')
        # glob() returns files in arbitrary order, so groupby may yield
        # the same group more than once; the defaultdict collects them.
        for group, items in groupby(iterator, key=get_group):
            for item in items:
                grouped[group].append(item)
        for files in grouped.values():
            # in-place sort, which mutates the list
            files.sort(key=file_order)
        return grouped

If the order of the HDF files in each group doesn't matter, you can remove the sort code. On the other hand, sorting by group first can simplify the code.

    def grouped_files(doc_root):
        doc_root = pathlib.Path(doc_root)
        grouped = {}
        iterator = doc_root.glob('*_*_*_*.hdf')
        sorted_files = sorted(iterator, key=get_group)
        for group, items in groupby(sorted_files, key=get_group):
            # because the list is sorted by group,
            # each group occurs only once
            # items is an iterator, so you have to consume it
            grouped[group] = list(items)
            # if sorting of the HDF files is required:
            # grouped[group] = sorted(items, key=file_order)
        return grouped

https://docs.python.org/3/library/stdtypes.html#list.sort
https://docs.python.org/3/library/functions.html#sorted
https://docs.python.org/3/library/itertools.html#itertools.groupby
https://docs.python.org/3/library/pathlib.html#module-pathlib

RE: How to concatenate files while looping through lists? - python_newbie09 - Mar-24-2019

Thank you. I will try this out.
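
The thread stops at grouping the files, so here is a minimal sketch of the remaining loop: walk over each group and hand its members to a merge step. It is an illustration under assumptions, not code from any of the posters. The names `group_key`, `group_files`, `combine_groups`, the `combine` callback and the `combined_<key>.hdf` output naming are all hypothetical, and the group key here drops the `.hdf` extension, unlike the key used in the posts above. The actual HDF merge is format-specific (it could be written with a library such as HDF5 for Python) and is deliberately left to the callback.

```python
from collections import defaultdict
from pathlib import Path


def group_key(name):
    """Return the 3rd and 4th underscore-separated fields of the stem, e.g. 'c_111'."""
    return '_'.join(Path(name).stem.split('_')[2:4])


def group_files(names):
    """Map each group key to the sorted list of file names sharing it."""
    grouped = defaultdict(list)
    for name in names:
        grouped[group_key(name)].append(name)
    return {key: sorted(members) for key, members in grouped.items()}


def combine_groups(names, combine):
    """Call combine(members, out_name) once per group and return the output names.

    `combine` is a hypothetical callback supplied by the caller; the real
    HDF merge would live there.
    """
    outputs = []
    for key, members in sorted(group_files(names).items()):
        out_name = f'combined_{key}.hdf'  # hypothetical output naming scheme
        combine(members, out_name)
        outputs.append(out_name)
    return outputs


if __name__ == '__main__':
    files = ['a_b_c_111.hdf', 'b_b_c_111.hdf', 'b_c_e_112.hdf', 'c_c_e_112.hdf']
    # Stand-in for the real merge: only report what would be combined.
    combine_groups(files, lambda members, out: print(out, '<-', members))
```

Keeping the merge behind a callback means the grouping loop can be checked on plain file names before any HDF file is opened.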