Python Forum

Version of glob that Supports Windows Wildcards?
Is there a Python module that works like glob except that it works properly on Windows?

Here's the problem. Windows recognizes "?" and "*" as wildcards. Windows does NOT recognize "[" or "]" as special characters, whereas glob uses them to denote character classes. Many of my scripts are set to operate on either a file or a file pattern. Technically all file names are patterns. They are just patterns that match at most one file. So a loop like

for file in glob.glob(pattern):

works just fine under Windows as long as an actual pattern with a wildcard is given. It fails when

1. a non-wildcard filename is given and
2. that filename contains "[" or "]"

If my folder contains the files

Awards Day [1994].mp4
Betula Lake [1983].mp4
Piano Recital [1994].mp4

and my pattern is "Awards Day [1994].mp4" then the loop does not find the file, because glob reads "[1994]" as a character class that matches a single "1", "9" or "4". While glob provides the glob.escape function, it also escapes "?" and "*", thus making the Windows wildcards ineffective for pattern matching.
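The failure can be reproduced without touching the filesystem. The sketch below assumes fnmatch.fnmatchcase is a fair stand-in for the matching that glob does on file names, and it also shows what glob.escape does to a pattern containing "*". The pattern "Awards*.mp4" is only an illustration.

```python
import fnmatch
import glob

name = "Awards Day [1994].mp4"

# "[1994]" is read as a character class (one of "1", "9", "4"),
# so the literal file name does not match itself.
literal_matches = fnmatch.fnmatchcase(name, name)

# A wildcard pattern without brackets still works.
wildcard_matches = fnmatch.fnmatchcase(name, "Awards*.mp4")

# glob.escape protects "[" but also neutralizes "*" and "?".
escaped_literal = glob.escape(name)
escaped_wildcard = glob.escape("Awards*.mp4")

print(literal_matches, wildcard_matches)
print(escaped_literal)
print(escaped_wildcard)
```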

One "solution" which I expect someone to propose is to rename all my files to avoid the use of "[" or "]". This is not a solution. I should be allowed to use any valid character that Windows allows.
I don't have a solution for you that I think is elegant, but you can do something silly like
pattern.replace("[", "[[").replace("]", "]]")
to do the escaping you need.
Write your own regex. glob works for many general cases (but not for many different corner cases) with a set of predefined patterns, e.g. *.txt corresponds to the regex .*\.txt\Z(?ms)

You can use e.g. os.scandir to loop over the files, with your own regex that matches the wanted files.
Example:
import os
import re

match = re.compile(r'.*\[\d+\].*').match
for fn in os.scandir('.'):
    if match(fn.name):
        print(fn.name)
Output:
Awards Day [1994].mp4
Betula Lake [1983].mp4
Piano Recital [1994].mp4
If I just comment out the check on line 6 (if match(fn.name):), then it reads all files in the folder.
Output:
Awards Day [1994].mp4
Betula Lake [1983].mp4
filename.csv
file_reg.py
foo.txt
Piano Recital [1994].mp4
read_test.py
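The same idea can be generalized so that only "*" and "?" act as wildcards and every other character, including "[" and "]", is taken literally. This is a minimal sketch, not a full emulation of Windows matching rules: the function names windows_wildcard_to_regex and windows_glob are made up for illustration, and case-insensitive matching is an assumption about what is wanted on Windows.

```python
import os
import re


def windows_wildcard_to_regex(pattern):
    """Compile a pattern where only "*" and "?" are special (hypothetical helper)."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    # Assumption: ignore case, as Windows file systems usually do.
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def windows_glob(pattern, folder="."):
    """Return the names in folder that fully match pattern (hypothetical helper)."""
    regex = windows_wildcard_to_regex(pattern)
    return sorted(
        entry.name for entry in os.scandir(folder) if regex.fullmatch(entry.name)
    )
```

With the files from the question, windows_glob("Awards Day [1994].mp4") and windows_glob("* [19??].mp4") should both find the bracketed names, since the brackets are never interpreted.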
As it turns out
pattern.replace("[", "[[").replace("]", "]]")
doesn't work, however
pattern.replace("[", "[[]")
does the trick. Thanks for the suggestion. Silly? Yes. Inelegant? Also yes, but it's short and I can easily tag it with a "#kludge" end-of-line comment so that if something more elegant comes along I can easily do a find/replace.
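For reuse, the kludge can be wrapped in a small helper. This is only a sketch of the replace call above. The names escape_brackets and windows_style_glob are hypothetical, and the reasoning in the comments assumes the usual fnmatch rules, where "[[]" is a character class containing only "[" and a "]" with no opening bracket is treated literally.

```python
import fnmatch
import glob


def escape_brackets(pattern):
    """Make "[" literal while leaving "*" and "?" as wildcards."""
    # "[[]" is a character class containing only "[". A "]" that has
    # no opening "[" before it is already matched literally.
    return pattern.replace("[", "[[]")  # kludge


def windows_style_glob(pattern):
    """Hypothetical wrapper: glob.glob with brackets treated literally."""
    return glob.glob(escape_brackets(pattern))


# Quick check against a file name from the question, without the filesystem.
name = "Awards Day [1994].mp4"
print(fnmatch.fnmatchcase(name, escape_brackets(name)))
print(fnmatch.fnmatchcase(name, escape_brackets("* [19??].mp4")))
```

The loop from the question then becomes for file in windows_style_glob(pattern): and works the same way for plain file names and for wildcard patterns.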
Just use glob.escape, then you don't have to think about regex or escaping in source code.

In [1]: import glob

In [2]: '*' + glob.escape('[') + '???' + glob.escape(']') + '*.mp3'
Out[2]: '*[[]???]*.mp3'
Yeah. Tried that. Unfortunately, glob.escape also escapes "*" which means I can no longer use Windows/DOS wildcards in my file names.