Python Forum

Hello all, I'm pretty unfamiliar with Python and am encountering an error with the attached code in writing the output data to a CSV. The idea is to read the needed data from a UV-vis UTF-8 style CSV and rewrite it with only the data I need. It grabs wavelength, and absorbance (Abs) from a series of trials, and gets time for each sample_name from the metadata. I can see that the data_row.extend operation does give the data i want, but it never appears in the CSV file. All of the other data does.

CSV input file is too big to include but i included xlsx file. I'm guessing/hoping this is something obvious with CSV writing that I've missed due to lack of experience. Note there are no errors reported with this code and it outputs the appropriate values in the terminal.

Thanks in advance and please let me know if I have formatted wrong. New to the code forum world. I'm just a codeworld-naive chemistry grad student.

import csv

def process_csv(input_file):
    # Initialize lists and dictionaries to store data
    sample_names = []
    wavelength_data = []  # Store all rows of data
    abs_data = {}
    time_data = {}  # Use a dictionary to store time data

    current_sample = None  # Initialize current_sample outside of the loop
    seen_sample_names = set()  # Keep track of sample names already processed

    with open(input_file, 'r', newline='') as csvfile:
        csvreader = csv.reader(csvfile)

        # Extract sample names and initialize data dictionaries
        headers = next(csvreader)
        sample_names = [headers[i] for i in range(0, len(headers), 2)]
        for name in sample_names:
            abs_data[name] = []

        for i, row in enumerate(csvreader):
            if i < 2:
                continue  # Skip the first two rows

            # Store all rows of data for later reference
            wavelength_data.append(row)

            if row[0] in sample_names and row[0] not in seen_sample_names:
                current_sample = row[0]
                seen_sample_names.add(current_sample)  # Mark sample name as seen

                time_row = i + 35  # Move to the row exactly 36 rows higher
                time_col = 1  # Adjust this to the correct column index where time is located
                time_data[current_sample] = (time_row, time_col)

            # Split data into Abs values and append to the corresponding list in the data dictionary
            for name in sample_names:
                abs_data[name].append(row[headers.index(name) + 1])

    # Create a list to store the output data
    output_data = []

    for i in range(len(wavelength_data)):
        data_row = [wavelength_data[i][0], None]  # Initialize time data with None
        for sample_name in sample_names:
            time_row, time_col = time_data.get(sample_name, (None, None))
            if i == time_row:
                data_row[1] = wavelength_data[i][time_col]
                break
        data_row.extend(abs_data[name][i] for name in sample_names)
        output_data.append(data_row)

    # Create the output CSV file
    output_file = input_file.replace(".csv", "_output.csv")

    with open(output_file, 'w', newline='') as csvfile:
        csvwriter = csv.writer(csvfile)

        # Write the header row
        header_row = ["Wavelength (nm)", "Time (s)"] + sample_names
        csvwriter.writerow(header_row)

        # Write the data rows
        for data_row in output_data:
            csvwriter.writerow(data_row)

    print(f"Data extracted and saved to {output_file}")

if __name__ == "__main__":
    input_file = "Fast scanning excess substrate Fe(II)TPP + O2 + 2eq ScHMPA + 1000eq SK 9t1 -60C.csv"  # Replace with the path to your input CSV file
    process_csv(input_file)

pandas can greatly simplify data conversion.

It would be better in this instance to:

explain the goal of your code.
provide sample input data. (you forgot xlsx data)
Quote:but i included xlsx file
show expected output.

Then we can provide sample pandas code as an example.

gbtur8up

Larz60+