Reading Excel file and use a wildcard in file name and sheet name

randolphoralph · (This post was last modified: Jan-13-2022, 05:23 PM by randolphoralph.)

I have a simple script that reads an excel file and captures certain columns and saves to a new Excel spreadsheet. I run this script each day and before I run it I have to update the date in the file name and sheet name to match the current report. I am trying to determine if it is possible to use a wildcard in the file name and sheet name. I have tried using the * for a wildcard like this....data = pd.read_excel(r"H:\Daily Reports\Employee_*_Report.xlsx" , sheet_name='Metrics * Employee'), but it does not seem to work. I am relatively new to using Python and I am sure it is probably something simple that I am missing.

Here is the python script I am using.

import pandas as pd

data = pd.read_excel(r"H:\Daily Reports\Employee_12202021_Report.xlsx" , sheet_name='Metrics Dec 20 Employee')

#Select only certain columns
df = pd.DataFrame(data, columns= ['Lookup','Report_Date', 'Employee_ID', 'Type', 'Location', 'Sales', 'Hours'])

#load to excel and save
df.to_excel (r'H:\Monthly Reports\Completed\Employee Metrics 12202021.xlsx', sheet_name='Summary', index=False, header=True)

menator01 · Jan-13-2022, 06:27 PM

Is there more than one file in the directory?

to get the sheet names you could do domething like:

import pandas as pd
import os

df = pd.read_excel(f'{os.getcwd()}/weather/test.xlsx', None)

for sheet in df.keys():
    print(sheet)

Output:
Sheet1

Will list all sheets in the file.

You could use listdir() to get file names

randolphoralph · Jan-13-2022, 07:09 PM

The directory would only contain one file, and the file would have a date in the file name. I had thought about using a similar approach to list the sheet names, but wasn't sure how to account for the sheet name on each daily file to be variable when reading the excel file/sheet.

***snippsat*** · (This post was last modified: Jan-13-2022, 07:57 PM by snippsat.)

(Jan-13-2022, 07:09 PM)randolphoralph Wrote: The directory would only contain one file, and the file would have a date in the file name.

Can use glob or also pathlib has glob build in.
Example

import pandas as pd
from pathlib import Path

dest = r'.'
for path in Path(dest).rglob('output_*_result.xlsx'):
    if path.is_file():
        print(path)

df = pd.read_excel(path)
print(df)

Now will this match any number.
So in my test one file with these two names read fine.

output_123_result.xlsx
output_99999_result.xlsx

As it's one file should get away with not specify sheet name.

randolphoralph · (This post was last modified: Jan-13-2022, 08:59 PM by randolphoralph.)

(Jan-13-2022, 07:57 PM)snippsat Wrote:
(Jan-13-2022, 07:09 PM)randolphoralph Wrote: The directory would only contain one file, and the file would have a date in the file name.
Can use glob or also pathlib has glob build in.
Example
import pandas as pd
from pathlib import Path

dest = r'.'
for path in Path(dest).rglob('output_*_result.xlsx'):
    if path.is_file():
        print(path)

df = pd.read_excel(path)
print(df)
Now will this match any number.
So in my test one file with these two names read fine.
output_123_result.xlsx
output_99999_result.xlsx
As it's one file should get away with not specify sheet name.

I added from pathlib import Path, but get an error stating ImportError: cannot import name 'path' from 'pathlib'. Does it matter that I am working within Windows?

***snippsat*** · (This post was last modified: Jan-13-2022, 09:45 PM by snippsat.)

pathlib was new in Python 3.4,what version of Python do you use?
Remember that import is case sensitive,so if i test can make your error.

>>> from pathlib import path
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
ImportError: cannot import name 'path' from 'pathlib' (C:\Python310\lib\pathlib.py

# It's like this 
>>> from pathlib import Path
>>>

randolphoralph · Jan-13-2022, 10:20 PM

(Jan-13-2022, 09:45 PM)snippsat Wrote: pathlib was new in Python 3.4,what version of Python do you use?
Remember that import is case sensitive,so if i test can make your error.
>>> from pathlib import path
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
ImportError: cannot import name 'path' from 'pathlib' (C:\Python310\lib\pathlib.py

# It's like this 
>>> from pathlib import Path
>>> 

That is exactly what the issue was. Case sensitive. Good catch and thank you for your help.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Python openyxl not updating Excel file	MrBean12	1	315	Mar-03-2024, 12:16 AM Last Post: MrBean12
	Copy Paste excel files based on the first letters of the file name	Viento	2	423	Feb-07-2024, 12:24 PM Last Post: Viento
	file open "file not found error"	shanoger	8	1,087	Dec-14-2023, 08:03 AM Last Post: shanoger
	problems with reading csv file.	MassiJames	3	618	Nov-16-2023, 03:41 PM Last Post: snippsat
	Search Excel File with a list of values	huzzug	4	1,216	Nov-03-2023, 05:35 PM Last Post: huzzug
	Updating sharepoint excel file odd results	cubangt	1	821	Nov-03-2023, 05:13 PM Last Post: noisefloor
	Python and pandas: Aggregate lines form Excel sheet	Glyxbringer	12	1,837	Oct-31-2023, 10:21 AM Last Post: Pedroski55
	trouble reading string/module from excel as a list	popular_dog	0	416	Oct-04-2023, 01:07 PM Last Post: popular_dog
	Need to replace a string with a file (HTML file)	tester_V	1	761	Aug-30-2023, 03:42 AM Last Post: Larz60+
	Reading a file name fron a folder on my desktop	Fiona	4	898	Aug-23-2023, 11:11 AM Last Post: Axel_Erfurt

Reading Excel file and use a wildcard in file name and sheet name

User Panel Messages

Announcements