Posts: 12,050
Threads: 487
Joined: Sep 2016
Oct-29-2022, 08:55 AM
(This post was last modified: Oct-29-2022, 08:55 AM by Larz60+.)
Note: Using a function allows you to call the code for other files.
Better yet, keep the code I sent in a separate script. This can now be imported and caled as needed.
You can also make modifications to the class, and they are automatically available in the scripts that import that class.
Using a modification your final code. I will show how this is done below.
It's what you should strive for to keep code modular.
first save my code as ExcelToCSV.py
import pandas as pd
from pathlib import Path
import os
class ExcelToCsv:
def __init__(self):
os.chdir(os.path.abspath(os.path.dirname(__file__)))
self.homepath = Path('.')
def excel_to_csv(self, workbook, sheetname, startrow, endrow, csvfile, index=False):
# Added to make sure filename is Posix compliant.
if not isinstance(csvfile, 'pathlib.PosixPath'):
csvfile = Path(csvfile)
skiprows = startrow
if startrow > 1:
skiprows = startrow-1
df = pd.read_excel(workbook, sheet_name=sheetname, skiprows, nrows=endrow-startrow)
df.to_csv(csvfile, index=index) your modified code:
from ExcelToCsv import ExcelToCsv
from openpyxl import load_workbook
ExcToCSV = ExcelToCsv().excel_to_csv
def main():
workbook = r"ExcelData3.xlsx"
wb = load_workbook(workbook)
for ws in wb.worksheets:
sheetname=ws.title
csv_file=f'{ws.title}.csv'
ExcToCSV(workbook, sheetname, startrow=1, nrows = ws.max_row) Not tested.
Posts: 75
Threads: 14
Joined: Jul 2022
Oct-29-2022, 04:27 PM
(This post was last modified: Oct-29-2022, 04:27 PM by azizrasul.)
Done that, but I get the following error: -
"D:\Power BI & Python\venv\Scripts\python.exe" "D:/Power BI & Python/Test.py"
Traceback (most recent call last):
File "D:\Power BI & Python\Test.py", line 1, in <module>
from ExcelToCsv import ExcelToCsv
ModuleNotFoundError: No module named 'ExcelToCsv'
Process finished with exit code 1
Posts: 6,827
Threads: 20
Joined: Feb 2020
Oct-29-2022, 06:51 PM
(This post was last modified: Oct-29-2022, 06:51 PM by deanhystad.)
What did you name the module (file)? Remember that case is important, and convention is to use all lower case for modules.
I would package it like this:
# shebang?
import openpyxl
import pandas as pd
from pathlib import Path
def read_excel_sheet(workbook, sheet=0, start=None, end=None):
"""Read sheet from excel workbook. Return dataframe
workbook : Name of excel file to read.
sheet : Name of sheet in workbook to convert. Defaults to first sheet
start : Start reading at this row. Default is start reading at row 0.
end : Stop reading at this row. Default is read to end.
"""
skiprows = max(0, start - 1) if start is not None else None
nrows = end - start if start and end else None
return pd.read_excel(workbook, sheet_name=sheet, skiprows=skiprows, nrows=nrows)
def excel_sheet_to_csv(workbook, sheet=0, csvfile=None, start=None, end=None, index=False):
"""Convert excel spreadsheet (.xlsx) to csv file
workbook : Name of excel file to read.
sheet : Name of sheet in workbook to convert. Defaults to first sheet
csvfile : Name of csv file to write. Default to workbook with .csv extension
start : Start reading at this row. Default is start reading at row 0.
end : Stop reading at this row. Default is read to end.
index : Write index column if True. Default is False
"""
if csvfile is None:
csvfile = Path(workbook).with_suffix(".csv")
df = read_excel_sheet(workbook, sheet, start, end)
df.to_csv(csvfile, index=index)
def excel_to_csv(workbook, csvfile=None):
"""Convert excel workbook (.xlsx) to csv file(s)
workbook : Name of excel file to read.
csvfile : Name of csv file to write. Default to workbook with .csv extension
"""
if csvfile is None:
csvfile = Path(workbook).with_suffix(".csv")
sheets = openpyxl.load_workbook(workbook).sheetnames
if len(sheets) == 1:
excel_sheet_to_csv(workbook, 0, csvfile)
else:
path = Path(csvfile)
name, ext = path.name.split(".")
for sheet in sheets:
excel_sheet_to_csv(
workbook,
sheet,
path.parent/f"{name}_{sheet}.{ext}")
if __name__ == "__main__":
import sys
excel_to_csv(*sys.argv[1:]) This gives you the same excel_to_csv function that you already have (now called excel_sheet_to_csv). It gives you a new function (named excel_to_csv) that converts all the sheets in a workbook to csv files and another that reads a sheet from a workbook and returns it as a dataframe.
And for good measure, you can add a shebang, pass in command line arguments, and run it like a program.
Posts: 75
Threads: 14
Joined: Jul 2022
I have named it 'exceltocsv.py' but I get no csv files appearring. Here is the code I have in the two .py files.
import pandas as pd
from pathlib import Path
import os
class ExcelToCsv:
def __init__(self):
os.chdir(os.path.abspath(os.path.dirname(__file__)))
self.homepath = Path('.')
def excel_to_csv(self, workbook, sheetname, startrow, endrow, csvfile, index=False):
# Added to make sure filename is Posix compliant.
if not isinstance(csvfile, 'pathlib.PosixPath'):
csvfile = Path(csvfile)
skiprows = startrow
if startrow > 1:
skiprows = startrow - 1
df = pd.read_excel(workbook, sheetname, skiprows, endrow - startrow)
df.to_csv(csvfile, index=index) from exceltocsv import ExcelToCsv
from openpyxl import load_workbook
ExcToCSV = ExcelToCsv().excel_to_csv
def main():
workbook = r"ExcelData3.xlsx"
wb = load_workbook(workbook)
for ws in wb.worksheets:
sheetname = ws.title
csv_file = f'{ws.title}.csv'
ExcToCSV(workbook, sheetname, startrow=1, nrows=ws.max_row, csvfile=csv_file)
print(csv_file)
Posts: 75
Threads: 14
Joined: Jul 2022
From post #13 I get
File "D:\Power BI & Python\Test.py", line 60, in <module>
excel_to_csv(workbook,*sys.argv[1:])
NameError: name 'workbook' is not defined
Posts: 12,050
Threads: 487
Joined: Sep 2016
Oct-29-2022, 07:56 PM
(This post was last modified: Oct-29-2022, 07:57 PM by Larz60+.)
The class module should be saved as ExcelToCsv.py
Capital letters (which are expected (but not required) for class scripts)
Posts: 75
Threads: 14
Joined: Jul 2022
Changed the .py file as suggested and I get exit 0 but no output files!
Posts: 12,050
Threads: 487
Joined: Sep 2016
Not sure what 'exit 0' is.
Since I can't see what your input file looks like, I cannot test.
I would use debugger and single step through if able.
do you know how to use debugger?
Posts: 6,827
Threads: 20
Joined: Feb 2020
Oct-30-2022, 12:45 AM
(This post was last modified: Oct-30-2022, 12:45 AM by deanhystad.)
(Oct-29-2022, 07:28 PM)azizrasul Wrote: From post #13 I get
File "D:\Power BI & Python\Test.py", line 60, in <module>
excel_to_csv(workbook,*sys.argv[1:])
NameError: name 'workbook' is not defined Operator error? Did you provide a filename for the workbook argument?
test workbook_filename optional_csv_filename
or
python test.py workbook_filename optional_csv_filename
Posts: 75
Threads: 14
Joined: Jul 2022
'exit 0' in PyCharm means there was no error.
|