Data I am looking at is readings in 3D down a borehole. Various readings are taken at the same x,y,z location, but reside in different csv files. I need 1 csv file that lists the different reading types (in subsequent columns) at the same x,y,z location.
My workflow that I am up to is as follows
First I have to create headers in each file
import pandas as pd
import os
#define the directory where files are
input_dir = '/path/'
#build list of files to amend
for filename in os.listdir(input_dir):
if filename.endswith('.csv'):
# Read the csv file
filepath = os.path.join(input_dir,filename)
data = pd.read_csv(filepath)
#insert the column names "Hole", "X", "Y", "Z" & "Value" as headers
data = pd.DataFrame(data, columns=['Hole','X','Y','Z','Value'])
#Write the modified data to the same CSV file
data.to_csv(filepath, index=False)
Then make a unique index for each record so I can match them together.
# define the output directory - wanting to create new versions rather than overwriting
output_dir = '/path/'
#Combine the values in columns a, b, c & d to make index
data['index'] = data['a'].astype(str) + '_' + data['b'].astype(str)+ '_' + data['c'].astype(str)+ '_' + data['d'].astype(str)
# write the modified data to a new csv file in teh output directory
output_filepath = os.path.join(output_dir,filename)
data.to_csv(output_filepath, index=False)
"What I am up to now"; Then I need to match index and merge records
The only problem is the column headers will not be unique. maybe put the filename into the header rather than "Value"?
What is the best way to match and merge?