Python Forum
How to concatenate files while looping through lists?
#1
I have a few hundred files in one folder, and every file in the folder has 2 more corresponding files that it needs to be combined with. I have written the code that finds the corresponding files and adds them to a list, but I am now stuck on how to combine these files while looping through each list.

import os
import re

f = ['a_b_c_111.hdf', 'b_b_c_111.hdf', 'b_c_e_112.hdf', 'c_c_e_112.hdf']
file_to_combine = {}

for file in f:
    # split e.g. "a_b_c_111.hdf" into its four underscore-separated fields
    a, b, c, d = re.split(r'[_]', file)
    s = c + '_' + d  # grouping key, e.g. "c_111.hdf"
    if s in file_to_combine:
        file_to_combine[s].append(os.path.join(file))
    else:
        file_to_combine[s] = [os.path.join(file)]

for k, v in file_to_combine.items():
    # v is already the list of files that belong to group k
    print(v)
Running this script results in 2 lists here, but there will be many more when I run it on hundreds of files:

['a_b_c_111.hdf', 'b_b_c_111.hdf']
['b_c_e_112.hdf', 'c_c_e_112.hdf']

I am now stuck on finding a way to loop through these lists and concatenate the hdf files within each list.
I would appreciate some help on how best to do this. Thanks.
#2
This might be useful: HDF5 for Python
#3
Maybe this helps a little bit to understand.

import pathlib
from collections import defaultdict
from itertools import groupby

def get_group(file):
    """
    Returns the group as a str
    """
    return '_'.join(file.name.split('_')[2:4])

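def file_order(file):
    """
    Returns the first two name fields as the sort key
    """
    # My test code used numeric fields; if yours are numbers too, use
    # a, b = map(int, file.name.split('_')[:2])
    # For names like a_b_c_111.hdf the fields are letters, so keep them as str
    a, b = file.name.split('_')[:2]
    return a, b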
    
def grouped_files(doc_root):
    doc_root = pathlib.Path(doc_root)
    grouped = defaultdict(list)
    iterator = doc_root.glob('*_*_*_*.hdf')
    for group, items in groupby(iterator, key=get_group):
        for item in items:
            grouped[group].append(item)
    for files in grouped.values():
        # doing an inline sort, which mutates the
        # list
        files.sort(key=file_order)       
    return grouped
If the order of the hdf files in each group doesn't matter, you can remove the sort code.
On the other hand, sorting the files by group first simplifies the code:

def grouped_files(doc_root):
    doc_root = pathlib.Path(doc_root)
    grouped = {}
    iterator = doc_root.glob('*_*_*_*.hdf')
    sorted_files = sorted(iterator, key=get_group)
    for group, items in groupby(sorted_files, key=get_group):
        # because the list is sorted by group,
        # each group occurs only once
        # items is an iterator, so you have to consume it
        grouped[group] = list(items)
        # if sorting of the hdf files within a group is required
        # grouped[group] = sorted(items, key=file_order)
    return grouped
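A small sketch of why the second version sorts before calling groupby: groupby only merges neighbouring items with the same key, so an unsorted list yields the same group more than once. The key function and the file names here are made up for the demonstration and work on plain strings rather than Path objects.

```python
from itertools import groupby


def key(name):
    # same grouping rule as get_group above, applied to a plain string
    return '_'.join(name.split('_')[2:4])


# deliberately interleaved so the two groups are not adjacent
names = ['a_b_c_111.hdf', 'b_c_e_112.hdf', 'b_b_c_111.hdf', 'c_c_e_112.hdf']

# without sorting, every change of key starts a new group
unsorted_keys = [k for k, _ in groupby(names, key=key)]

# after sorting by the same key, each group appears exactly once
sorted_keys = [k for k, _ in groupby(sorted(names, key=key), key=key)]

print(unsorted_keys)
print(sorted_keys)
```

The first version of grouped_files is not affected by this, because it appends to a defaultdict and so merges repeated groups again.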
https://docs.python.org/3/library/stdtypes.html#list.sort
https://docs.python.org/3/library/functions.html#sorted
https://docs.python.org/3/library/iterto...ls.groupby
https://docs.python.org/3/library/pathli...le-pathlib
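To get from the grouped files to the actual combining step, the loop could look roughly like the sketch below. It only shows the looping part: group_names, combine_groups and describe are hypothetical helpers, and describe is a placeholder that builds a string instead of touching any hdf file. The real merge depends on your file format and on the library you use to read it.

```python
from collections import defaultdict
from pathlib import Path


def group_key(name):
    """Return the last two underscore-separated fields, e.g. 'c_111.hdf'."""
    return '_'.join(Path(name).name.split('_')[2:4])


def group_names(names):
    """Map each group key to the sorted list of file names sharing it."""
    grouped = defaultdict(list)
    for name in names:
        grouped[group_key(name)].append(name)
    return {key: sorted(members) for key, members in grouped.items()}


def combine_groups(grouped, combine):
    """Call combine(key, members) once per group and collect the results."""
    return {key: combine(key, members) for key, members in grouped.items()}


def describe(key, members):
    # Hypothetical placeholder: a real version would open each file in
    # `members` and write the merged data to one output file.
    return 'combined_' + key + ' <- ' + ' + '.join(members)


names = ['a_b_c_111.hdf', 'b_b_c_111.hdf', 'b_c_e_112.hdf', 'c_c_e_112.hdf']
grouped = group_names(names)
results = combine_groups(grouped, describe)
for line in results.values():
    print(line)
```

Passing the combine step in as a function keeps the grouping code unchanged when you swap the placeholder for a function that really reads and writes the files.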
#4
Thank you. I will try this out.