Jul-09-2019, 07:53 PM
I have a dictionary of files in this format:
{'filea': ['test/folder2/filea', 'test/folder3/filea', 'test/folder1/filea'],
'fileb': ['test/folder2/fileb', 'test/folder3/fileb', 'test/folder1/fileb'],
'filec': ['test/folder2/filec', 'test/folder3/filec', 'test/folder1/filec']}
I have written a for loop that goes through each filename and creates a dataframe combining the files listed under each key in the dictionary above. When I run the loop, though, the data for the next key (fileb in this case) gets appended to the dataframe created for filea. I have spent a few hours on this without success, probably in part because there is a lot of code in between, which makes it hard to see where my indentation mistake is. My code is below:
Let's say the dictionary above is called file_list.
for key, files in file_list.items():
    # dataset = pd.DataFrame()
    for i in files:  # loop over the files in each key
        # do something....
    df = pd.DataFrame({'A': B, 'C': D, 'E': F})
    print('This dataframe has the shape:', df.shape)
    # save dataframe
    df.to_hdf('xxx.hdf'.format(key[0:-4]), mode='w', key='df')

I still can't see where my mistake is: when the loop works on the files in fileb, their data gets appended to the dataframe that already holds the data from filea, instead of a whole new dataframe being created for fileb.
Any help on this is much appreciated!
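
For reference, here is a minimal sketch of the per-key pattern described above, in which whatever collects the data is recreated at the start of each pass over a key so that nothing from filea can carry over into fileb. It is only an illustration under stated assumptions: `load_rows` and `combine_per_key` are hypothetical names, and `load_rows` is a stand-in for the real file-reading code, which the post does not show.

```python
import pandas as pd

# Same shape as the dictionary in the post (shortened to two keys).
file_list = {
    'filea': ['test/folder2/filea', 'test/folder3/filea', 'test/folder1/filea'],
    'fileb': ['test/folder2/fileb', 'test/folder3/fileb', 'test/folder1/fileb'],
}


def load_rows(path):
    """Hypothetical stand-in for the real reading code: returns one row per file."""
    return pd.DataFrame({'path': [path], 'n_chars': [len(path)]})


def combine_per_key(file_list, loader=load_rows):
    """Build one independent dataframe per dictionary key."""
    combined = {}
    for key, files in file_list.items():
        # Recreated on every pass of the outer loop, so data from the
        # previous key cannot leak into this one.
        parts = []
        for path in files:
            parts.append(loader(path))
        # Built once per key, after the inner loop has finished.
        combined[key] = pd.concat(parts, ignore_index=True)
    return combined


frames = combine_per_key(file_list)
for key, df in frames.items():
    print(key, 'has the shape:', df.shape)
```

If the real code collects values in lists or in a dataframe that is created once before the outer loop, moving that creation inside `for key, files in ...` would give the same per-key isolation. Saving each result (for example with `df.to_hdf`) would then go at the outer-loop level, after the dataframe for that key has been built.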