Python Forum

Full Version: Pandas, How to trigger parallel loop
Hi,
I have multiple CSV files in a folder. I need to read each file, do some calculation (like getting the sum of the first column), and concatenate the result to result_df. Is there any method in Python to achieve this? Currently, reading the files and doing the calculation takes me about 2 minutes, and I have to wait even longer when there are many files.
Please show your code so far.
import os
import pandas as pd

# Raw string avoids invalid escape sequences such as "\P" and "\i".
directory = r"D:\PythonCodes\inputmultifiles"

frames = []
for root, dirs, files in os.walk(directory):
    for file in files:
        # Join with root (not directory) so files in subfolders resolve correctly.
        f = os.path.join(root, file)

        if f.endswith(".csv"):
            ff = pd.read_csv(f)
            tmp = ff['Name']
            print(tmp)
            frames.append(tmp)

# Concatenate once at the end; concatenating inside the loop copies the data on every pass.
df_result = pd.concat(frames).reset_index(drop=True).to_frame('New_col')
If a file is large, reading it takes time, and each iteration has to wait for the previous one to finish. Now I want to use something like multithreading to start all iterations at once and combine the results from each iteration.
This is really a pandas question, not csv, as pandas reads in the data.
(Oct-28-2020, 02:21 PM) Larz60+ wrote: This is really a pandas question, not csv, as pandas reads in the data.

Yes, but more generally I would put it this way: how do I trigger multithreading?
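One possible approach is the standard library's concurrent.futures.ThreadPoolExecutor, which runs a function over many inputs in a pool of threads and hands back the results for combining. Below is a minimal sketch, assuming every CSV has a 'Name' column as in the code above. The helper names load_name_column and combine_name_columns and the max_workers default are made up for illustration, not something from the original post. Whether threads actually speed things up depends on where the time goes, so it is worth timing it on your own files.

```python
import glob
import os
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def load_name_column(path):
    """Read one CSV file and return its 'Name' column (assumed to exist)."""
    return pd.read_csv(path, usecols=["Name"])["Name"]


def combine_name_columns(directory, max_workers=4):
    """Read every CSV under `directory` in a thread pool and stack the 'Name' columns."""
    # "**" with recursive=True also matches files directly inside `directory`.
    pattern = os.path.join(directory, "**", "*.csv")
    paths = sorted(glob.glob(pattern, recursive=True))
    if not paths:
        # Nothing to read: return an empty frame with the expected column.
        return pd.DataFrame(columns=["New_col"])

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as `paths`,
        # even if the files finish loading in a different order.
        columns = list(pool.map(load_name_column, paths))

    # Combine once, after all files have been read.
    return pd.concat(columns, ignore_index=True).to_frame("New_col")
```

You would call it as df_result = combine_name_columns(r"D:\PythonCodes\inputmultifiles"). If the per-file calculation turns out to be CPU-heavy rather than dominated by reading, concurrent.futures.ProcessPoolExecutor has the same map() interface and may be worth trying instead; on Windows that variant needs the call placed under an if __name__ == "__main__": guard.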