Python Forum

Full Version: Merging multiple csv files with same X,Y,Z in each
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
As the name suggests, looking for help building a script to merge multiple csv files. Each file has; X, Y, Z and Value(W). I am wanting to match the x,y,z in each csv file and paste the Value(W) in each subsequent column, so it is not a simple concatenation operation.
What have you tried?
Could you provide an example of combining two files. I don't' understand what you mean by "match the X, Y, Z" and "paste value(W) in each subsequent column.
Data I am looking at is readings in 3D down a borehole. Various readings are taken at the same x,y,z location, but reside in different csv files. I need 1 csv file that lists the different reading types (in subsequent columns) at the same x,y,z location.

My workflow that I am up to is as follows

First I have to create headers in each file

import pandas as pd
import os

#define the directory where files are 
input_dir = '/path/'

#build list of files to amend
for filename in os.listdir(input_dir):
    if filename.endswith('.csv'):
        # Read the csv file
        filepath = os.path.join(input_dir,filename)
        data = pd.read_csv(filepath)
        
        #insert the column names "Hole", "X", "Y", "Z" & "Value" as headers
        data = pd.DataFrame(data, columns=['Hole','X','Y','Z','Value'])
        
        #Write the modified data to the same CSV file
        data.to_csv(filepath, index=False)
Then make a unique index for each record so I can match them together.

# define the output directory - wanting to create new versions rather than overwriting
output_dir = '/path/'

#Combine the values in columns a, b, c & d to make index
        data['index'] = data['a'].astype(str) + '_' + data['b'].astype(str)+ '_' + data['c'].astype(str)+ '_' + data['d'].astype(str)
        
        # write the modified data to a new csv file in teh output directory
        output_filepath = os.path.join(output_dir,filename)
        data.to_csv(output_filepath, index=False)
"What I am up to now"; Then I need to match index and merge records

The only problem is the column headers will not be unique. maybe put the filename into the header rather than "Value"?

What is the best way to match and merge?