Feb-13-2024, 06:45 PM
(This post was last modified: Feb-13-2024, 06:46 PM by MasterOfDestr.)
I need to create excel report (update data in existing file) based on variety source data (also excel files). All data will be .xlsx. In general steps that I'll do are:
1.Download settings information;
2.Create "starting" reports based on settings information;
3.Filter & Copy & paste data to created reports (in some of the cases data starts from as example B3:T10 - so not always A1);
4.Copy & paste (autofill) formulas.
5.Launch macro (if necessary I can try rewrite macro code in Python)
I've tried to use openpyxl and pandas but I find it as not stable solution. As example if I run code for 1st time it's working but in 2nd approach I receive errors like
-"BadZipFile: File is not a zip file" -> (if I upload starting excel that will be updated then issue disappears);
-"template = load_workbook("_MAT_TEMPLATE.xlsx") KeyError: "There is no item named '[Content_Types].xml' in the archive" -> again uploading file resolves the issue;
-ValueError: I/O operation on closed file.-> issue when I want to save excel after pasting pandas dataframe - not solved yet.
As I haven't work on such complex topic yet, could you advice what library I could use for working on such task? I've tired openpyxl and pandas but I feel that they are somehow unstable. I've learned about Spire.XLS but seems that this is paid solution. If you also know already existing project on github that I could use as "inspiration" I would be also grateful.
Best regards, Jakub
What I did so far is
1.Download settings information;
2.Create "starting" reports based on settings information;
3.Filter & Copy & paste data to created reports (in some of the cases data starts from as example B3:T10 - so not always A1);
4.Copy & paste (autofill) formulas.
5.Launch macro (if necessary I can try rewrite macro code in Python)
I've tried to use openpyxl and pandas but I find it as not stable solution. As example if I run code for 1st time it's working but in 2nd approach I receive errors like
-"BadZipFile: File is not a zip file" -> (if I upload starting excel that will be updated then issue disappears);
-"template = load_workbook("_MAT_TEMPLATE.xlsx") KeyError: "There is no item named '[Content_Types].xml' in the archive" -> again uploading file resolves the issue;
-ValueError: I/O operation on closed file.-> issue when I want to save excel after pasting pandas dataframe - not solved yet.
As I haven't work on such complex topic yet, could you advice what library I could use for working on such task? I've tired openpyxl and pandas but I feel that they are somehow unstable. I've learned about Spire.XLS but seems that this is paid solution. If you also know already existing project on github that I could use as "inspiration" I would be also grateful.
Best regards, Jakub
What I did so far is
from openpyxl.workbook import workbook from openpyxl import load_workbook import pandas as pd import xlsxwriter xlPY = load_workbook("FX_rates_PY.xlsx") wsPY = xlPY["englisch"] FXPY = wsPY["B7:u32"] template = load_workbook("_MAT_TEMPLATE.xlsx") templateFXPY = template["FX rates_PY"] materials = template["Materials"] dest_rangePY = templateFXPY["A1:T26"] templateFXCY = template["FX rates_CY"] dest_rangeCY = templateFXCY["A1:T26"] for i, row in enumerate(FXPY): for j, cell in enumerate(row): dest_rangePY[i][j].value=cell.value xlCY = load_workbook("FX_rates_CY.xlsx") wsCY = xlCY["englisch"] FXCY = wsCY["B7:U32"] for i, row in enumerate(FXCY): for j, cell in enumerate(row): dest_rangeCY[i][j].value=cell.value cover_sheet = template["Cover"] cover_sheet["C66"] = "Y" cover_sheet["C55"] = "Y" Month = cover_sheet['c7'].value CurrentYear = cover_sheet['c5'].value Bot_input = template["BOT Input"] data_values = [] for row in Bot_input: data_values.append([cell.value for cell in row]) PlantDataClean = pd.DataFrame(data_values[1:], columns=data_values[0]) PlantDataClean = PlantDataClean[PlantDataClean["GRIP Plant"].notna()] PlantDataClean['GRIP Plant'] = PlantDataClean['GRIP Plant'].astype(int) print(PlantDataClean) erp_value = PlantDataClean['ERP'].values[0] SAP_Plant_value = PlantDataClean['SAP Plant'].values[0] print(SAP_Plant_value) from openpyxl.utils.dataframe import dataframe_to_rows if erp_value == "BKP": BKP = load_workbook("BKP_Material List.xlsx") BKPsheet = BKP["Sheet1"] data_values = [] for row in BKPsheet: data_values.append([cell.value for cell in row]) BKPsheetPD = pd.DataFrame(data_values[1:], columns=data_values[0]) BKPsheetPDfilter = BKPsheetPD[BKPsheetPD["Plant"] == SAP_Plant_value] BKPsheetPDfilter =BKPsheetPDfilter["Material"] print(BKPsheetPDfilter) BKPsheetPDfilter.to_excel('_MAT_TEMPLATE.xlsx',sheet_name="sheet1") #,startcol=2,startrow=0,header=None) else: print("S4 not avaiable yet") template.save("_MAT_TEMPLATE.xlsx")