 How to concatenate files while looping through lists?
#1
I have a few hundred files in one folder, and every file has 2 more corresponding files that it needs to be combined with. I have written the code that finds the corresponding files and adds them to a list, but I am now stuck on how to combine these files while looping through each list.

import re

f = ['a_b_c_111.hdf', 'b_b_c_111.hdf', 'b_c_e_112.hdf', 'c_c_e_112.hdf']
file_to_combine = {}

for file in f:
    # split e.g. 'a_b_c_111.hdf' into its four '_'-separated parts
    a, b, c, d = re.split(r'[_]', file)
    s = c + '_' + d  # group key, e.g. 'c_111.hdf'
    if s in file_to_combine:
        file_to_combine[s].append(file)
    else:
        file_to_combine[s] = [file]

for k, v in file_to_combine.items():
    # v is already the list of files that belong together
    print(v)
Running this script prints 2 lists here, but there will be many more when I run it on hundreds of files:

['a_b_c_111.hdf', 'b_b_c_111.hdf']
['b_c_e_112.hdf', 'c_c_e_112.hdf']

I am now stuck on how to loop through these lists and concatenate the HDF files within each list.
Would appreciate some help on how best to do this. Thanks.
#2
This might be useful: HDF5 for Python (h5py).
#3
Maybe this helps a little bit to understand.

import pathlib
from collections import defaultdict
from itertools import groupby

def get_group(file):
    """
    Returns the group as a str
    """
    return '_'.join(file.name.split('_')[2:4])

def file_order(file):
    a, b = map(int, file.name.split('_')[:2])
    # My test files use numbers for a and b.
    # For non-numeric prefixes (e.g. 'a_b_c_111.hdf'), use this line instead:
    #a, b = file.name.split('_')[:2]
    return a, b
    
def grouped_files(doc_root):
    doc_root = pathlib.Path(doc_root)
    grouped = defaultdict(list)
    iterator = doc_root.glob('*_*_*_*.hdf')
    for group, items in groupby(iterator, key=get_group):
        for item in items:
            grouped[group].append(item)
    for files in grouped.values():
        # sorting in place, which mutates the list
        files.sort(key=file_order)
    return grouped
If the order of the hdf files within each group doesn't matter, you can remove the sort code.
On the other hand, sorting the files by group first can simplify the code.

def grouped_files(doc_root):
    doc_root = pathlib.Path(doc_root)
    grouped = {}
    iterator = doc_root.glob('*_*_*_*.hdf')
    sorted_files = sorted(iterator, key=get_group)
    for group, items in groupby(sorted_files, key=get_group):
        # because the list is sorted by group,
        # each group occurs only once
        # items is an iterator, so you have to consume it
        grouped[group] = list(items)
        # if sorting of hdf_files is required
        # grouped[group] = sorted(items, key=file_order)
    return grouped
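
To make the point about sorting concrete, here is a minimal sketch using plain strings instead of pathlib paths. The helper name group_key and the sample file names are only for illustration (the names are the ones from the question, reordered). groupby only merges neighbouring items with the same key, so on unsorted input the same group can show up more than once:

```python
from itertools import groupby


def group_key(name):
    # Same idea as get_group above, but for plain strings:
    # join the 3rd and 4th '_'-separated fields, e.g. 'c_111.hdf'.
    return "_".join(name.split("_")[2:4])


# Sample names from the question, deliberately interleaved.
names = ["a_b_c_111.hdf", "b_c_e_112.hdf", "b_b_c_111.hdf", "c_c_e_112.hdf"]

# Unsorted input: files of one group are not adjacent,
# so groupby yields the same key several times.
unsorted_keys = [key for key, _ in groupby(names, key=group_key)]

# Sorted by the same key: every key is yielded exactly once,
# so a plain dict is enough to collect the groups.
sorted_names = sorted(names, key=group_key)
grouped = {key: list(items) for key, items in groupby(sorted_names, key=group_key)}

print(unsorted_keys)
print(grouped)
```

This is why the first version needs the defaultdict with append, while the sorted version can assign each group directly.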
https://docs.python.org/3/library/stdtypes.html#list.sort
https://docs.python.org/3/library/functions.html#sorted
https://docs.python.org/3/library/iterto...ls.groupby
https://docs.python.org/3/library/pathli...le-pathlib
My code examples are always for Python >=3.6.0
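
For the looping part of the question, a rough sketch could look like the following. It assumes the groups are already in a dict like the one returned by grouped_files. The names combine_groups, print_plan and the "combined_" output prefix are made up for this example. The actual merge step is left as a function you pass in, because how to concatenate depends on whether the files are HDF4 or HDF5 and on how the datasets inside should be joined:

```python
from pathlib import Path


def combine_groups(groups, combine, out_dir):
    """Call combine(paths, out_path) once per group; return the output paths."""
    out_dir = Path(out_dir)
    outputs = []
    for group, paths in groups.items():
        # group looks like 'c_111.hdf', so it already carries the extension
        out_path = out_dir / f"combined_{group}"
        combine(list(paths), out_path)
        outputs.append(out_path)
    return outputs


def print_plan(paths, out_path):
    # Placeholder for the real merge: it only reports what would be combined.
    # Replace it with a function that reads the inputs and writes out_path.
    print(f"{out_path.name} <- {[Path(p).name for p in paths]}")


# Groups as they appear in the question's output.
groups = {
    "c_111.hdf": ["a_b_c_111.hdf", "b_b_c_111.hdf"],
    "e_112.hdf": ["b_c_e_112.hdf", "c_c_e_112.hdf"],
}
outputs = combine_groups(groups, print_plan, "combined")
```

Keeping the merge as a separate function means the loop stays the same whichever HDF library ends up doing the work.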
#4
thank you. i will try this out.