Oct-07-2022, 08:22 PM
How can I insert a header in a new MS Excel file and then append data, into the same MS Excel file after the last populated row, from another MS Excel file? The number of columns will be the same.
Appending data
|
Oct-07-2022, 08:22 PM
How can I insert a header in a new MS Excel file and then append data, into the same MS Excel file after the last populated row, from another MS Excel file? The number of columns will be the same.
Managed to get this code to work (although I get the error D:\Python\Test.py:8: FutureWarning: save is not part of the public API, usage can give in unexpected results and will be removed in a future version file_name.save()) which inserts the header to a new Excel file. However, I don't know how to append data from a file called Book1.xlsx and append it to Company Financials.xlsx after the header.
import pandas as pd from pandas import ExcelWriter header = ['Country', 'Product', 'Units Sold', 'Revenue', 'Cost', 'Revenue', 'Cost', 'Profit', 'Date', 'Month Number'] df = pd.DataFrame(columns = header) file_name = ExcelWriter('D:\Python/Company Financials.xlsx') df.to_excel(file_name,'Sheet1',index=0) file_name.save()
Oct-07-2022, 11:37 PM
Just using openpyxl, you can do that like this:
targetFile is a new Excel. Each time you want to write a new row, get targetFile[tsheet].max_row, that is the last row with data. Add 1 and that is the row you want to write to: tmaxRow = targetFile[tsheet].max_row + 1 This little function gets a whole row from XLfile and writes that row to targetFile # get all columns of 1 row which meets the condition def getRow(XLfile, targetFile, sheet, tsheet, rowNum, maxCol): tmaxRow = targetFile[tsheet].max_row + 1 print('target file max row is', tmaxRow) for colNum in range(1, maxCol + 1): sv = XLfile[sheet].cell(row=rowNum, column=colNum).value targetFile[tsheet].cell(row=tmaxRow, column=colNum).value=sv
Tried the following code: -
import openpyxl from openpyxl import Workbook # get all columns of 1 row which meets the condition def getRow(XLfile, targetFile, sheet, tsheet, rowNum, maxCol): tmaxRow = targetFile[tsheet].max_row + 1 print('target file max row is', tmaxRow) for colNum in range(1, maxCol + 1): sv = XLfile[sheet].cell(row=rowNum, column=colNum).value targetFile[tsheet].cell(row=tmaxRow, column=colNum).value=sv XLfile="delete_empty_rows.xlsx" targetFile="test.xlsx" sheet="daily sales" tsheet="person" rowNum=1 maxCol=9 getRow(XLfile, targetFile, sheet, tsheet, rowNum, maxCol)and got this error: - "D:\Power BI & Python\venv\Scripts\python.exe" "D:/Power BI & Python/Test.py" Traceback (most recent call last): File "D:\Power BI & Python\Test.py", line 16, in <module> getRow(XLfile, targetFile, sheet, tsheet, rowNum, maxCol) File "D:\Power BI & Python\Test.py", line 3, in getRow tmaxRow = targetFile[tsheet].max_row + 1 TypeError: string indices must be integers Process finished with exit code 1
Oct-08-2022, 08:28 AM
(This post was last modified: Oct-08-2022, 08:28 AM by Pedroski55.)
Can't see the problem.
Copy and paste each step one at a time, then you'll find it. This works for me in bash. It copies the header row from my source file to row 2 of the target file. Hope it works this time! #! /usr/bin/python3 import openpyxl from pathlib import Path # show the XL files def showFiles(path2XL): mydir = Path(path2XL) filelist = [filename for filename in mydir.iterdir() if filename.is_file()] for filename in filelist: print(f"\nfilename: {filename.name}") # open the source file you want def openXL(path2XL): print('copy and paste the name of the file you want to work on here ... ') file = input('copy and paste the path to the Excel you want to work on here ... ') # data_only = True flag will get the values of formulas, not the formula # ws['B2'].data_type tells you if the cell contains a formula. 'f' indicates a formula sourceFile = openpyxl.load_workbook(path2XL + file, data_only = True) sheets = sourceFile.sheetnames for sheet in sheets: print('\nThe sheet names are:', sheet) return (sourceFile, file) # get the sheet you want to extract data from def getSheet(): print('Which sheets do you want to get data from?') mysheet = input('Copy and paste the sheet name you want from above ... ') return mysheet # make a target file def makeXL(targetname, tsheet, path2XL): targetFile = openpyxl.Workbook() tsheet = targetFile.active tsheet.title = 'person' tsheets = targetFile.sheetnames tsheet = tsheets[0] targetFile.save(path2XL + savename) print(f'target file made and saved to', path2XL + targetname) return targetFile # get all columns of 1 row which meets the condition def getRow(XLfile, targetFile, sheet, tsheet, rowNum, maxCol, path2XL, savename): tmaxRow = targetFile[tsheet].max_row + 1 print('target file max row + 1 is', tmaxRow) for colNum in range(1, maxCol + 1): sv = XLfile[sheet].cell(row=rowNum, column=colNum).value targetFile[tsheet].cell(row=tmaxRow, column=colNum).value=sv targetFile.save(path2XL + savename) print(f'target file saved to', path2XL + savename) if __name__ == "__main__": print('Where are the XL files? For example: /home/pedro/myPython/openpyxl/xlsx_files/ ?') # path2XL = '/home/pedro/myPython/openpyxl/xlsx_files/' path2files = input('Where are the XL files? Enter the path to the XL files ... ') showFiles(path2files) data = openXL(path2files) sF = data[0] myfile = data[1] sheet = getSheet() savename = "test_me1.xlsx" tsheet="person" tF = makeXL(savename, tsheet, path2files) rowNum=1 # the headers maxCol=9 getRow(sF, tF, sheet, tsheet, rowNum, maxCol, path2files, savename) print('All done!') |
|
Possibly Related Threads… | |||||
Thread | Author | Replies | Views | Last Post | |
Appending a row of data in an MS Excel file | azizrasul | 3 | 2,125 |
Nov-06-2022, 05:17 PM Last Post: azizrasul |
|
Appending data into a file in tab delimited format | metro17 | 1 | 5,318 |
Aug-06-2019, 07:34 AM Last Post: fishhook |