Help needed with merging two CSV files - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: Help needed with merging two CSV files (/thread-35783.html) |
Help needed with merging two CSV files - eyadfr - Dec-13-2021 Hello, I need your help with the below please 🙏 I have TWO ".CSV" files with the same number of columns and rows (257 columns and X rows). I want to merge them together as follows; Write the content of the first (256 columns) from the first file, then write the content of the second file. I have written below code, but I'm ending up with a loop that creates a huge size huge that crashes my computer every time! Although, when testing with a smaller size CSV files, I'm getting a good result. # This is a sample Python script. import csv # read data from file1.csv and file.csv file1 = open('./File1.csv') file2 = open('./File2.csv') # make csv readers object for each file reader1 = csv.reader(file1) reader2 = csv.reader(file2) # write a result to result.csv file3 = open('./Result.csv', 'w') writer = csv.writer(file3) dic = {} words = [] # read data from file1 for row in reader1: count = len(row) words.append(row[count - 1]) dic[row[count - 1]] = row[0:count - 1] # read data from file2 and merge data with file1 for row in reader2: count = len(row) dic[row[count - 1]] += row[0:count - 1] #write data to result.csv for word in words: dic[word].append(word) writer.writerow(dic[word]) print('merge finished') RE: Help needed with merging two CSV files - bowlofred - Dec-13-2021 What do you want to do with column 257, just ignore it? If so, I'd probably do something like this: import csv with open("File1.csv") as in1, open("Filie2.csv") as in2, open("Result.csv", "w") as out: writer = csv.writer(out) for row in csv.reader(i1): writer.writerow(row[:256]) for row in csv.reader(i2): writer.writerow(row[:256]) RE: Help needed with merging two CSV files - Pedroski55 - Dec-13-2021 Well, csv is just text and the files are identical in format. The only problem I see is if you have headers, then the headers will appear twice with the below, but that can be dealt with easily. I have a lot of csv files I use as answer keys for OMR. import glob path2files = '/home/pedro/winter2021/OMR/20EAP/question_data_csv/' files = glob.glob(path2files + '*.csv') for file in files: print('The files are:', file) file1 = input('Copy and paste 1 of the above files here ... ') file2 = input('Copy and paste 1 of the above files here ... ') with open(file1) as data1, open(file2) as data2: string1 = data1.read() string2 = data2.read() amalgamate = string1 + string2 output = '/home/pedro/temp/output.csv' with open(output, 'w') as o: o.write(amalgamate) print('Only problem is the header row appears twice!') RE: Help needed with merging two CSV files - eyadfr - Dec-14-2021 Good day! Thank you so much for responding and sharing above solutions. I would like to have both file content look like this: Where N is the number of columns and its identical for both files. Please note that both files have no header/title. Very much appreciated 🙏 RE: Help needed with merging two CSV files - bowlofred - Dec-14-2021 import csv with open("File1.csv") as i1, open("File2.csv") as i2, open("Result.csv", "w") as out: writer = csv.writer(out) for set1, set2 in zip(csv.reader(i1), csv.reader(i2)): writer.writerow(set1[:-1] + set2) RE: Help needed with merging two CSV files - paul18fr - Dec-14-2021 You may also use Pandas and Numpy to deal with huge files; just an example here bellow import pandas as pd import numpy as np import os Path=str(os.getcwd()) CsvFile1='data1.csv' CsvFile2='data2.csv' # if you've a header (if not put it to False) Header=True # First csv file reading File1=pd.read_csv(Path + '/' + CsvFile1, header=None, delimiter=';') if Header : File1=File1.drop([0], axis=0) Array1=File1.to_numpy(dtype=float) # second csv file reading File2=pd.read_csv(Path + '/' + CsvFile2, header=None, delimiter=';') if Header : File2=File2.drop([0], axis=0) Array2=File2.to_numpy(dtype=float) # concatenation: you must have the same number of rows, but may have different number of columns Concat=np.hstack((Array1, Array2)) # write a "like" csv file; remove fmt for the highest accuracy %.18e # np.savetxt(Path + '/concat.csv', Concat, delimiter=';', fmt='%f') np.savetxt(Path + '/concat.csv', Concat, delimiter=';') |