Python Forum
Help needed with merging two CSV files
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Help needed with merging two CSV files
#1
Hello,

I need your help with the below please 🙏

I have TWO ".CSV" files with the same number of columns and rows (257 columns and X rows). I want to merge them together as follows;

Write the content of the first (256 columns) from the first file, then write the content of the second file.

I have written below code, but I'm ending up with a loop that creates a huge size huge that crashes my computer every time! Although, when testing with a smaller size CSV files, I'm getting a good result.

# This is a sample Python script.

import csv

# read data from file1.csv and file.csv
file1 = open('./File1.csv')
file2 = open('./File2.csv')

# make csv readers object for each file
reader1 = csv.reader(file1)
reader2 = csv.reader(file2)

# write a result to result.csv
file3 = open('./Result.csv', 'w')
writer = csv.writer(file3)

dic = {}
words = []

# read data from file1
for row in reader1:
    count = len(row)
    words.append(row[count - 1])
    dic[row[count - 1]] = row[0:count - 1]

# read data from file2 and merge data with file1
for row in reader2:
    count = len(row)
    dic[row[count - 1]] += row[0:count - 1]

#write data to result.csv
for word in words:
    dic[word].append(word)
    writer.writerow(dic[word])

print('merge finished')
Reply
#2
What do you want to do with column 257, just ignore it?

If so, I'd probably do something like this:


import csv

with open("File1.csv") as in1, open("Filie2.csv") as in2, open("Result.csv", "w") as out:
    writer = csv.writer(out)

    for row in csv.reader(i1):
        writer.writerow(row[:256])
    for row in csv.reader(i2):
        writer.writerow(row[:256])
eyadfr likes this post
Reply
#3
Well, csv is just text and the files are identical in format.

The only problem I see is if you have headers, then the headers will appear twice with the below, but that can be dealt with easily.

I have a lot of csv files I use as answer keys for OMR.

import glob

path2files = '/home/pedro/winter2021/OMR/20EAP/question_data_csv/'

files = glob.glob(path2files + '*.csv')

for file in files:
    print('The files are:', file)

file1 = input('Copy and paste 1 of the above files here ... ')
file2 = input('Copy and paste 1 of the above files here ... ')

with open(file1) as data1, open(file2) as data2:
    string1 = data1.read()
    string2 = data2.read()

amalgamate = string1 + string2
output = '/home/pedro/temp/output.csv'
with open(output, 'w') as o:
    o.write(amalgamate)

print('Only problem is the header row appears twice!')
eyadfr likes this post
Reply
#4
Good day!

Thank you so much for responding and sharing above solutions.

I would like to have both file content look like this:

[Image: 26396346ff660680ae25089130fa60b3dcd68f55...3e0941.jpg]

Where N is the number of columns and its identical for both files.

Please note that both files have no header/title.

Very much appreciated 🙏
Reply
#5
import csv

with open("File1.csv") as i1, open("File2.csv") as i2, open("Result.csv", "w") as out:
    writer = csv.writer(out)

    for set1, set2 in zip(csv.reader(i1), csv.reader(i2)):
        writer.writerow(set1[:-1] + set2)
eyadfr and Larz60+ like this post
Reply
#6
You may also use Pandas and Numpy to deal with huge files; just an example here bellow

import pandas as pd
import numpy as np
import os

Path=str(os.getcwd())
CsvFile1='data1.csv' 
CsvFile2='data2.csv' 

# if you've a header (if not put it to False)
Header=True

# First csv file reading
File1=pd.read_csv(Path + '/' + CsvFile1, header=None, delimiter=';')
if Header : File1=File1.drop([0], axis=0)
Array1=File1.to_numpy(dtype=float)

# second csv file reading
File2=pd.read_csv(Path + '/' + CsvFile2, header=None, delimiter=';')
if Header : File2=File2.drop([0], axis=0)
Array2=File2.to_numpy(dtype=float)

# concatenation: you must have the same number of rows, but may have different number of columns
Concat=np.hstack((Array1, Array2))

# write a "like" csv file; remove fmt for the highest accuracy %.18e
# np.savetxt(Path + '/concat.csv', Concat, delimiter=';', fmt='%f')
np.savetxt(Path + '/concat.csv', Concat, delimiter=';')
eyadfr likes this post
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Sorting and Merging text-files [SOLVED] AlphaInc 10 2,550 Aug-20-2021, 05:42 PM
Last Post: snippsat
  Merging all file_name.log's files from directory to one and search “PerformanceINFO" sutra 0 1,139 Dec-09-2020, 05:14 PM
Last Post: sutra
  Merging Excel Files JezMim 1 1,256 Sep-06-2020, 08:56 PM
Last Post: bowlofred
  Merging CSV Files in Jupyter RJ117 0 4,035 Jan-07-2018, 06:24 AM
Last Post: RJ117
  merging two csv files jlcolam 3 10,244 Jul-13-2017, 06:59 PM
Last Post: ichabod801

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020