Python Forum
Version of glob for that Supports Windows Wildcards?




Version of glob for that Supports Windows Wildcards? - Reverend_Jim - Jun-14-2019

Is there a Python module that works like glob except that it works properly on Windows?

Here's the problem. Windows recognizes "?" and "*" as wildcards. Windows does NOT recognize "[" or "]" as special characters, whereas glob uses them to denote character classes. Many of my scripts are set to operate on either a file or a file pattern. Technically all file names are patterns. They are just patterns that match at most one file. So a loop like

for file in glob.glob(pattern):

works just fine under Windows as long as an actual pattern with a wildcard is given. It fails when

1. a non-wildcard filename is given and
2. that filename contains "[" or "]"

If my folder contains the files

Awards Day [1994].mp4
Betula Lake [1983].mp4
Piano Recital [1994].mp4

and my pattern is "Awards Day [1994].mp4" then the loop does not find the file. While glob provides the glob.escape method, it also escapes "?" and "*" thus making the Windows wildcards ineffective for pattern matching.
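
The failure can be reproduced without touching the disk. This is only a minimal sketch: it uses fnmatch.fnmatchcase, which applies the same pattern rules glob uses for file names, together with the file name from the listing above. The variable names are just for illustration.

```python
import fnmatch
import glob

name = "Awards Day [1994].mp4"

# "[1994]" is read as a character class matching ONE of the characters 1, 9, 4,
# so the literal file name does not match itself as a pattern.
literal_matches = fnmatch.fnmatchcase(name, name)
class_matches = fnmatch.fnmatchcase("Awards Day 9.mp4", name)

# glob.escape() makes the brackets literal, but it also neutralizes "*" and "?".
escaped_name = glob.escape(name)      # 'Awards Day [[]1994].mp4'
escaped_wild = glob.escape("*.mp4")   # '[*].mp4'

print(literal_matches, class_matches)  # False True
print(escaped_name, escaped_wild)
```

So the escaped name would match, but a pattern such as *.mp4 stops behaving like a wildcard once it has been through glob.escape.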

One "solution" which I expect someone to propose is to rename all my files to avoid the use of "[" or "]". This is not a solution. I should be allowed to use any valid character that Windows allows.


RE: Version of glob for that Supports Windows Wildcards? - micseydel - Jun-14-2019

I don't have a solution for you that I think is elegant, but you can do something silly like
pattern.replace("[", "[[").replace("]", "]]")
to do the escaping you need.


RE: Version of glob for that Supports Windows Wildcards? - snippsat - Jun-15-2019

Write your own regex. glob handles many general cases (though not every corner case) with a set of predefined patterns that are translated to regex; e.g. *.txt is the regex .*\\.txt\\Z(?ms)

You can use e.g. os.scandir to loop over the files, with your own regex that matches the wanted files.
Example:
import os
import re

match = re.compile(r'.*\[\d+\].*').match
for fn in os.scandir('.'):
    if match(fn.name):
        print(fn.name)
Output:
Awards Day [1994].mp4
Betula Lake [1983].mp4
Piano Recital [1994].mp4
If I comment out line 6 (the if test) and dedent the print call, every file in the folder is listed.
Output:
Awards Day [1994].mp4
Betula Lake [1983].mp4
filename.csv
file_reg.py
foo.txt
Piano Recital [1994].mp4
read_test.py
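
For the original question, the same idea could be turned into a small pattern translator. This is a sketch only, assuming that * and ? are the only wildcards and that every other character, including [ and ], is literal. The names windows_pattern_to_regex and scan are made up for illustration, and the case-insensitive matching is an assumption meant to mimic Windows. Real Windows wildcard matching has further quirks that this does not try to reproduce.

```python
import os
import re

def windows_pattern_to_regex(pattern):
    """Hypothetical helper: treat only '*' and '?' as wildcards."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any run of characters, possibly empty
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal, "[" included
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def scan(folder, pattern):
    """Return the sorted names in folder that match the whole pattern."""
    regex = windows_pattern_to_regex(pattern)
    return sorted(e.name for e in os.scandir(folder) if regex.fullmatch(e.name))
```

With this, scan('.', 'Awards Day [1994].mp4') and scan('.', '*[19??].mp4') go through the same code path, so a plain file name is just a pattern that matches at most one file.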



RE: Version of glob for that Supports Windows Wildcards? - Reverend_Jim - Jun-15-2019

As it turns out
pattern.replace("[", "[[").replace("]", "]]")
doesn't work, however
pattern.replace("[", "[[]")
does the trick. Thanks for the suggestion. Silly? Yes. Inelegant? Also yes, but it's short and I can easily tag it with a "#kludge" end-of-line comment so that if something more elegant comes along I can easily do a find/replace.
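
For anyone finding this later, here is the kludge wrapped in a function, as a sketch. The names escape_brackets and iter_files are invented for this example. It works because "[[]" is a character class containing only "[", and a "]" with no unescaped "[" before it is taken literally, so "]" needs no treatment. The check at the bottom uses fnmatch.filter on the file names from the first post instead of a real folder.

```python
import fnmatch
import glob

def escape_brackets(pattern):
    """Hypothetical helper: make "[" literal, leave "*" and "?" as wildcards."""
    return pattern.replace("[", "[[]")  # kludge

def iter_files(pattern):
    """Same loop source as before, with the brackets escaped first."""
    return glob.glob(escape_brackets(pattern))

# Check against the names from the first post without touching the disk.
names = [
    "Awards Day [1994].mp4",
    "Betula Lake [1983].mp4",
    "Piano Recital [1994].mp4",
]
hits = fnmatch.filter(names, escape_brackets("*[1994].mp4"))
print(hits)
```

The trade-off is that real character classes can no longer be used in a pattern, which is the point when the patterns are meant to behave like Windows wildcards.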


RE: Version of glob for that Supports Windows Wildcards? - DeaD_EyE - Jun-16-2019

Just use glob.escape, then you don't have to think about regex or escaping in source code.

In [1]: import glob

In [2]: '*' + glob.escape('[') + '???' + glob.escape(']') + '*.mp3'
Out[2]: '*[[]???]*.mp3'



RE: Version of glob for that Supports Windows Wildcards? - Reverend_Jim - Jun-18-2019

Yeah. Tried that. Unfortunately, glob.escape also escapes "*" which means I can no longer use Windows/DOS wildcards in my file names.