Python Forum
python Read each xlsx file and write it into csv with pipe delimiter
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
python Read each xlsx file and write it into csv with pipe delimiter
#5
Hi Team,

This is code I am using for cosolidation of xlsx files into Main file, Do ou see any error or suggest please




from pathlib import Path
import pandas as pd

foldername = "your_folder_name"
output_folder = "output_folder"  # Define the folder where you want to save the output CSV files
extra_header = "IFRS"
columns_to_fill = ['year1', 'year2', 'year3', 'year4', 'year5', 'Total']

files_combine = Path(f"{foldername}").rglob("*.xlsx")

header_written = False

for input_xls in files_combine:
    # Attempt to read from the 'Data' sheet; if it doesn't exist, read the first active sheet
    try:
        df = pd.read_excel(input_xls, sheet_name='Data')
        
        # Check if the header contains "reports"
        if 'reports' in df.columns.str.lower():
            df_dict = {input_xls.name: df}
        else:
            raise ValueError("Header does not contain 'reports'")
    except Exception:
        df_dict = pd.read_excel(input_xls, sheet_name=None, engine='openpyxl')  # Read all sheets into a dictionary
        
        # Select the first non-empty sheet as the active sheet
        for sheet_name, sheet_data in df_dict.items():
            if not sheet_data.empty:
                # Check if the header contains "reports"
                if 'reports' in sheet_data.columns.str.lower():
                    df = sheet_data
                    df_dict = {input_xls.name: df}
                    break
                else:
                    raise ValueError("Header does not contain 'reports'")
    
    # Check if "IFRS" column already exists, and if not, add it
    if 'IFRS' not in df.columns:
        df['IFRS'] = extra_header
    
    # Replace blank cells in the specified columns with 0
    df[columns_to_fill] = df[columns_to_fill].fillna(0)
    
    # Convert 'reporting date' column to yyyy-mm-dd format
    df['reporting date'] = pd.to_datetime(df['reporting date']).dt.strftime('%Y-%m-%d')
    
    # Extract the existing headers from the first row
    existing_headers = list(df.columns)
    
    # Add the extra header "IFRS" to the list of existing headers
    existing_headers.append(extra_header)
    
    # Update the DataFrame columns with the new list of headers
    df.columns = existing_headers
    
    # Insert an additional column for 'FileName' in the DataFrame
    df.insert(0, 'FileName', input_xls.name)
    
    # Define the output CSV file name based on the input file name
    output_csv_file = f"{output_folder}/{input_xls.stem}_combined.csv"
    
    # Check if the header has been written, and if not, write it
    if not header_written:
        header_written = True
        df.to_csv(output_csv_file, index=False, sep='\t', encoding='utf-8', float_format='%.6f', header=True, mode='a')
    else:
        # Append the workbook data to the CSV file without writing the header again
        df.to_csv(output_csv_file, index=False, sep='\t', encoding='utf-8', float_format='%.6f', header=False, mode='a')
Thanks
mg
deanhystad write Nov-09-2023, 12:59 PM:
Please post all code, output and errors (it it's entirety) between their respective tags. Refer to BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
Reply


Messages In This Thread
RE: python Read each xlsx file and write it into csv with pipe delimiter - by mg24 - Nov-09-2023, 10:56 AM

Possibly Related Threads…
Thread Author Replies Views Last Post
  Delete file with read-only permission, but write permission to parent folder cubei 6 22,374 Jun-01-2024, 07:22 AM
Last Post: Eleanorreo
  Media Pipe Python Interfacing with MATLAB cmcreecc 1 203 May-30-2024, 07:23 AM
Last Post: TrentErnser
  Connecting to Remote Server to read contents of a file ChaitanyaSharma 1 405 May-03-2024, 07:23 AM
Last Post: Pedroski55
  What does .flush do? How can I change this to write to the file? Pedroski55 3 418 Apr-22-2024, 01:15 PM
Last Post: snippsat
  Last record in file doesn't write to newline gonksoup 3 592 Jan-22-2024, 12:56 PM
Last Post: deanhystad
  Recommended way to read/create PDF file? Winfried 3 3,104 Nov-26-2023, 07:51 AM
Last Post: Pedroski55
  write to csv file problem jacksfrustration 11 1,863 Nov-09-2023, 01:56 PM
Last Post: deanhystad
Question Special Characters read-write Prisonfeed 1 721 Sep-17-2023, 08:26 PM
Last Post: Gribouillis
  read file txt on my pc to telegram bot api Tupa 0 1,269 Jul-06-2023, 01:52 AM
Last Post: Tupa
  parse/read from file seperated by dots giovanne 5 1,248 Jun-26-2023, 12:26 PM
Last Post: DeaD_EyE

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020