Python Forum
Add a row to a dataframe or append whole dataframe.




Add a row to a dataframe or append whole dataframe. - tsurubaso - Jan-06-2021

Hello to all,
Happy new year and good health!!


I am sure this is super simple, but I can't find the right way to do it.
I have a folder with some CSV files.
All the CSV files have the same structure: the same columns and similar data, but a different number of rows.
I want to combine all that data into one dataframe and write it out as a single CSV.

I have the following code.

import pandas as pd
import glob
path = r"path\to\my\folder\*.csv"
df2 = pd.DataFrame(columns=['Name','Reading', 'Office','Phone','address' ])

csvfiles = []
a = 0
for file in glob.glob(path):
    csvfiles.append(file)
    # print(file)

for csvnub in csvfiles:
    df = pd.read_csv(csvnub)
    count_row = df.shape[0]

    print(count_row)
    df2.iloc[a] = df.iloc[count_row]
    a = a + 1

print(df2)
df2.to_csv("InfosTotal.csv", index=True, encoding="utf_8_sig")
I tried append()...

for csvnub in csvfiles:
   df=pd.read_csv(csvnub)
   count_row = df.shape[0]
   print(count_row)
   for row in count_row:
       df2.append(row) 
or

df2.append(df, ignore_index = False) 
I tried concat() also...

Any help would be greatly appreciated.


RE: Add a row to a dataframe or append whole dataframe. - tsurubaso - Jan-07-2021

OK,
I found a solution myself. I don't know whether it is clean or proper, but it works.
Never touch a working script...
hahaha
I hope this will help others.

import pandas as pd
import glob
path = r"path\to\my\folder\*.csv"
df2 = pd.DataFrame(columns=['Name','Reading', 'Office','Phone','address' ])

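# Collect the paths of all CSV files in the folder
csvfiles = glob.glob(path)

for csvnub in csvfiles:
    df = pd.read_csv(csvnub)
    print(df.shape[0])  # number of rows read from this file
    # concat returns a new DataFrame, so assign the result back to df2
    df2 = pd.concat([df, df2])

# sort_values also returns a new DataFrame; without the assignment the sort is lost
df2 = df2.sort_values(by=['Name'])
df2.drop_duplicates(keep='first', inplace=True)
df2.to_csv("InfosTotal.csv", index=True, encoding="utf_8_sig")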
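
For anyone landing here later, the same idea can be sketched a bit more compactly: read every file into a list and call concat once at the end, instead of growing df2 inside the loop. This is only a sketch under assumptions. The function name combine_csv_files and its sort_column parameter are made up for illustration, and it assumes every file has the same columns, including a Name column.

```python
import glob
import os

import pandas as pd


def combine_csv_files(pattern, sort_column="Name"):
    """Read every CSV matching `pattern` and return one combined DataFrame.

    Hypothetical helper for illustration; assumes all files share the
    same columns and that `sort_column` exists in them.
    """
    # glob gives no ordering guarantee, so sort the paths for repeatable results
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")

    frames = [pd.read_csv(path) for path in paths]

    # A single concat at the end; ignore_index=True renumbers the rows 0..n-1
    combined = pd.concat(frames, ignore_index=True)

    # Both calls return new DataFrames, so the results are assigned back
    combined = combined.drop_duplicates(keep="first")
    combined = combined.sort_values(by=sort_column).reset_index(drop=True)
    return combined
```

It could then be used as combine_csv_files(r"path\to\my\folder\*.csv").to_csv("InfosTotal.csv", index=False, encoding="utf_8_sig"). Note that DataFrame.append, which I tried first, was removed in pandas 2.0, so concat is the way to go on current versions.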