python Read each xlsx file and write it into csv with pipe delimiter

mg24 · (This post was last modified: Nov-09-2023, 12:59 PM by deanhystad.)

Hi Team,

This is code I am using for cosolidation of xlsx files into Main file, Do ou see any error or suggest please

from pathlib import Path
import pandas as pd

foldername = "your_folder_name"
output_folder = "output_folder"  # Define the folder where you want to save the output CSV files
extra_header = "IFRS"
columns_to_fill = ['year1', 'year2', 'year3', 'year4', 'year5', 'Total']

files_combine = Path(f"{foldername}").rglob("*.xlsx")

header_written = False

for input_xls in files_combine:
    # Attempt to read from the 'Data' sheet; if it doesn't exist, read the first active sheet
    try:
        df = pd.read_excel(input_xls, sheet_name='Data')
        
        # Check if the header contains "reports"
        if 'reports' in df.columns.str.lower():
            df_dict = {input_xls.name: df}
        else:
            raise ValueError("Header does not contain 'reports'")
    except Exception:
        df_dict = pd.read_excel(input_xls, sheet_name=None, engine='openpyxl')  # Read all sheets into a dictionary
        
        # Select the first non-empty sheet as the active sheet
        for sheet_name, sheet_data in df_dict.items():
            if not sheet_data.empty:
                # Check if the header contains "reports"
                if 'reports' in sheet_data.columns.str.lower():
                    df = sheet_data
                    df_dict = {input_xls.name: df}
                    break
                else:
                    raise ValueError("Header does not contain 'reports'")
    
    # Check if "IFRS" column already exists, and if not, add it
    if 'IFRS' not in df.columns:
        df['IFRS'] = extra_header
    
    # Replace blank cells in the specified columns with 0
    df[columns_to_fill] = df[columns_to_fill].fillna(0)
    
    # Convert 'reporting date' column to yyyy-mm-dd format
    df['reporting date'] = pd.to_datetime(df['reporting date']).dt.strftime('%Y-%m-%d')
    
    # Extract the existing headers from the first row
    existing_headers = list(df.columns)
    
    # Add the extra header "IFRS" to the list of existing headers
    existing_headers.append(extra_header)
    
    # Update the DataFrame columns with the new list of headers
    df.columns = existing_headers
    
    # Insert an additional column for 'FileName' in the DataFrame
    df.insert(0, 'FileName', input_xls.name)
    
    # Define the output CSV file name based on the input file name
    output_csv_file = f"{output_folder}/{input_xls.stem}_combined.csv"
    
    # Check if the header has been written, and if not, write it
    if not header_written:
        header_written = True
        df.to_csv(output_csv_file, index=False, sep='\t', encoding='utf-8', float_format='%.6f', header=True, mode='a')
    else:
        # Append the workbook data to the CSV file without writing the header again
        df.to_csv(output_csv_file, index=False, sep='\t', encoding='utf-8', float_format='%.6f', header=False, mode='a')

Thanks
mg

deanhystad write Nov-09-2023, 12:59 PM:
Please post all code, output and errors (it it's entirety) between their respective tags. Refer to BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Delete file with read-only permission, but write permission to parent folder	cubei	6	22,374	Jun-01-2024, 07:22 AM Last Post: Eleanorreo
	Media Pipe Python Interfacing with MATLAB	cmcreecc	1	203	May-30-2024, 07:23 AM Last Post: TrentErnser
	Connecting to Remote Server to read contents of a file	ChaitanyaSharma	1	405	May-03-2024, 07:23 AM Last Post: Pedroski55
	What does .flush do? How can I change this to write to the file?	Pedroski55	3	418	Apr-22-2024, 01:15 PM Last Post: snippsat
	Last record in file doesn't write to newline	gonksoup	3	592	Jan-22-2024, 12:56 PM Last Post: deanhystad
	Recommended way to read/create PDF file?	Winfried	3	3,104	Nov-26-2023, 07:51 AM Last Post: Pedroski55
	write to csv file problem	jacksfrustration	11	1,863	Nov-09-2023, 01:56 PM Last Post: deanhystad
	Special Characters read-write	Prisonfeed	1	721	Sep-17-2023, 08:26 PM Last Post: Gribouillis
	read file txt on my pc to telegram bot api	Tupa	0	1,269	Jul-06-2023, 01:52 AM Last Post: Tupa
	parse/read from file seperated by dots	giovanne	5	1,248	Jun-26-2023, 12:26 PM Last Post: DeaD_EyE

python Read each xlsx file and write it into csv with pipe delimiter

User Panel Messages

Announcements