Feb-13-2024, 06:45 PM
(This post was last modified: Feb-13-2024, 06:46 PM by MasterOfDestr.)
I need to create excel report (update data in existing file) based on variety source data (also excel files). All data will be .xlsx. In general steps that I'll do are:
1.Download settings information;
2.Create "starting" reports based on settings information;
3.Filter & Copy & paste data to created reports (in some of the cases data starts from as example B3:T10 - so not always A1);
4.Copy & paste (autofill) formulas.
5.Launch macro (if necessary I can try rewrite macro code in Python)
I've tried to use openpyxl and pandas but I find it as not stable solution. As example if I run code for 1st time it's working but in 2nd approach I receive errors like
-"BadZipFile: File is not a zip file" -> (if I upload starting excel that will be updated then issue disappears);
-"template = load_workbook("_MAT_TEMPLATE.xlsx") KeyError: "There is no item named '[Content_Types].xml' in the archive" -> again uploading file resolves the issue;
-ValueError: I/O operation on closed file.-> issue when I want to save excel after pasting pandas dataframe - not solved yet.
As I haven't work on such complex topic yet, could you advice what library I could use for working on such task? I've tired openpyxl and pandas but I feel that they are somehow unstable. I've learned about Spire.XLS but seems that this is paid solution. If you also know already existing project on github that I could use as "inspiration" I would be also grateful.
Best regards, Jakub
What I did so far is
1.Download settings information;
2.Create "starting" reports based on settings information;
3.Filter & Copy & paste data to created reports (in some of the cases data starts from as example B3:T10 - so not always A1);
4.Copy & paste (autofill) formulas.
5.Launch macro (if necessary I can try rewrite macro code in Python)
I've tried to use openpyxl and pandas but I find it as not stable solution. As example if I run code for 1st time it's working but in 2nd approach I receive errors like
-"BadZipFile: File is not a zip file" -> (if I upload starting excel that will be updated then issue disappears);
-"template = load_workbook("_MAT_TEMPLATE.xlsx") KeyError: "There is no item named '[Content_Types].xml' in the archive" -> again uploading file resolves the issue;
-ValueError: I/O operation on closed file.-> issue when I want to save excel after pasting pandas dataframe - not solved yet.
As I haven't work on such complex topic yet, could you advice what library I could use for working on such task? I've tired openpyxl and pandas but I feel that they are somehow unstable. I've learned about Spire.XLS but seems that this is paid solution. If you also know already existing project on github that I could use as "inspiration" I would be also grateful.
Best regards, Jakub
What I did so far is
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 |
from openpyxl.workbook import workbook from openpyxl import load_workbook import pandas as pd import xlsxwriter xlPY = load_workbook( "FX_rates_PY.xlsx" ) wsPY = xlPY[ "englisch" ] FXPY = wsPY[ "B7:u32" ] template = load_workbook( "_MAT_TEMPLATE.xlsx" ) templateFXPY = template[ "FX rates_PY" ] materials = template[ "Materials" ] dest_rangePY = templateFXPY[ "A1:T26" ] templateFXCY = template[ "FX rates_CY" ] dest_rangeCY = templateFXCY[ "A1:T26" ] for i, row in enumerate (FXPY): for j, cell in enumerate (row): dest_rangePY[i][j].value = cell.value xlCY = load_workbook( "FX_rates_CY.xlsx" ) wsCY = xlCY[ "englisch" ] FXCY = wsCY[ "B7:U32" ] for i, row in enumerate (FXCY): for j, cell in enumerate (row): dest_rangeCY[i][j].value = cell.value cover_sheet = template[ "Cover" ] cover_sheet[ "C66" ] = "Y" cover_sheet[ "C55" ] = "Y" Month = cover_sheet[ 'c7' ].value CurrentYear = cover_sheet[ 'c5' ].value Bot_input = template[ "BOT Input" ] data_values = [] for row in Bot_input: data_values.append([cell.value for cell in row]) PlantDataClean = pd.DataFrame(data_values[ 1 :], columns = data_values[ 0 ]) PlantDataClean = PlantDataClean[PlantDataClean[ "GRIP Plant" ].notna()] PlantDataClean[ 'GRIP Plant' ] = PlantDataClean[ 'GRIP Plant' ].astype( int ) print (PlantDataClean) erp_value = PlantDataClean[ 'ERP' ].values[ 0 ] SAP_Plant_value = PlantDataClean[ 'SAP Plant' ].values[ 0 ] print (SAP_Plant_value) from openpyxl.utils.dataframe import dataframe_to_rows if erp_value = = "BKP" : BKP = load_workbook( "BKP_Material List.xlsx" ) BKPsheet = BKP[ "Sheet1" ] data_values = [] for row in BKPsheet: data_values.append([cell.value for cell in row]) BKPsheetPD = pd.DataFrame(data_values[ 1 :], columns = data_values[ 0 ]) BKPsheetPDfilter = BKPsheetPD[BKPsheetPD[ "Plant" ] = = SAP_Plant_value] BKPsheetPDfilter = BKPsheetPDfilter[ "Material" ] print (BKPsheetPDfilter) BKPsheetPDfilter.to_excel( '_MAT_TEMPLATE.xlsx' ,sheet_name = "sheet1" ) #,startcol=2,startrow=0,header=None) else : print ( "S4 not avaiable yet" ) template.save( "_MAT_TEMPLATE.xlsx" ) |