Python Forum

Full Version: Reading Excel file and use a wildcard in file name and sheet name
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
I have a simple script that reads an excel file and captures certain columns and saves to a new Excel spreadsheet. I run this script each day and before I run it I have to update the date in the file name and sheet name to match the current report. I am trying to determine if it is possible to use a wildcard in the file name and sheet name. I have tried using the * for a wildcard like this....data = pd.read_excel(r"H:\Daily Reports\Employee_*_Report.xlsx" , sheet_name='Metrics * Employee'), but it does not seem to work. I am relatively new to using Python and I am sure it is probably something simple that I am missing.

Here is the python script I am using.

import pandas as pd

data = pd.read_excel(r"H:\Daily Reports\Employee_12202021_Report.xlsx" , sheet_name='Metrics Dec 20 Employee')

#Select only certain columns
df = pd.DataFrame(data, columns= ['Lookup','Report_Date', 'Employee_ID', 'Type', 'Location', 'Sales', 'Hours'])

#load to excel and save
df.to_excel (r'H:\Monthly Reports\Completed\Employee Metrics 12202021.xlsx', sheet_name='Summary', index=False, header=True)
Is there more than one file in the directory?

to get the sheet names you could do domething like:

import pandas as pd
import os

df = pd.read_excel(f'{os.getcwd()}/weather/test.xlsx', None)

for sheet in df.keys():
    print(sheet)
Output:
Sheet1
Will list all sheets in the file.

You could use listdir() to get file names
The directory would only contain one file, and the file would have a date in the file name. I had thought about using a similar approach to list the sheet names, but wasn't sure how to account for the sheet name on each daily file to be variable when reading the excel file/sheet.
(Jan-13-2022, 07:09 PM)randolphoralph Wrote: [ -> ]The directory would only contain one file, and the file would have a date in the file name.
Can use glob or also pathlib has glob build in.
Example
import pandas as pd
from pathlib import Path

dest = r'.'
for path in Path(dest).rglob('output_*_result.xlsx'):
    if path.is_file():
        print(path)

df = pd.read_excel(path)
print(df)
Now will this match any number.
So in my test one file with these two names read fine.
output_123_result.xlsx
output_99999_result.xlsx
As it's one file should get away with not specify sheet name.
(Jan-13-2022, 07:57 PM)snippsat Wrote: [ -> ]
(Jan-13-2022, 07:09 PM)randolphoralph Wrote: [ -> ]The directory would only contain one file, and the file would have a date in the file name.
Can use glob or also pathlib has glob build in.
Example
import pandas as pd
from pathlib import Path

dest = r'.'
for path in Path(dest).rglob('output_*_result.xlsx'):
    if path.is_file():
        print(path)

df = pd.read_excel(path)
print(df)
Now will this match any number.
So in my test one file with these two names read fine.
output_123_result.xlsx
output_99999_result.xlsx
As it's one file should get away with not specify sheet name.

I added from pathlib import Path, but get an error stating ImportError: cannot import name 'path' from 'pathlib'. Does it matter that I am working within Windows?
pathlib was new in Python 3.4,what version of Python do you use?
Remember that import is case sensitive,so if i test can make your error.
>>> from pathlib import path
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
ImportError: cannot import name 'path' from 'pathlib' (C:\Python310\lib\pathlib.py

# It's like this 
>>> from pathlib import Path
>>> 
(Jan-13-2022, 09:45 PM)snippsat Wrote: [ -> ]pathlib was new in Python 3.4,what version of Python do you use?
Remember that import is case sensitive,so if i test can make your error.
>>> from pathlib import path
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
ImportError: cannot import name 'path' from 'pathlib' (C:\Python310\lib\pathlib.py

# It's like this 
>>> from pathlib import Path
>>> 

That is exactly what the issue was. Case sensitive. Good catch and thank you for your help.