 glob for dir listing
#1
Hi 
I am attempting to list all directories using a glob pattern.
In each case below I cannot obtain a definitive list.

for example:
>>> import glob
>>> dirPattern = '/data/part[0-9]'
>>> for d in glob.glob(dirPattern):
...   print(d)
... 
/data/part8
/data/part2
/data/part4
/data/part7
/data/part3
/data/part1
/data/part5
/data/part9
/data/part6
However, one of the directories is missing: "/data/part10".

I have tried this as well:
>>> dirPattern='/data/part[0-9]{2}'
>>> for d in glob.glob(dirPattern):
...   print(d)
... 
>>> 
>>> dirPattern='/data/part[0-9][0-9]'
>>> for d in glob.glob(dirPattern):
...   print(d)
... 
/data/part10

But as you can see, either nothing appears or only part10 is listed.
Can anybody suggest a pattern that will match part1 to part10?

Thanks
#2
import glob
dirPattern = './part*'
for d in glob.glob(dirPattern):
    print(d)
This works for me.
#3
That would also include other directories, for example "part_xyz".
Only directories with a numeric ending (one or two digits) should be listed.

Glob might not be the answer.
#4
(Mar-07-2017, 03:08 PM)bluefrog Wrote: Glob might not be the answer.
Yes, glob only supports simple shell-style wildcards (*, ?, [...]), not full regular expressions.
You can use os.listdir(), or os.scandir() on newer Python versions (3.5+).
Then filter the names with your own regex.
E.g.:
import os
import re

for f_name in os.listdir():
    if re.match(r'^[A-Za-z]+\d{1,2}$', f_name):
        print(f_name)
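Since os.scandir() is mentioned above but not shown, here is a minimal sketch of that variant. It also skips plain files, which os.listdir() alone cannot tell apart from directories. The function name numbered_dirs and the NAME_RE constant are made up for illustration.

```python
import os
import re

# One or more letters followed by one or two digits, e.g. "part1" or "part10".
NAME_RE = re.compile(r'[A-Za-z]+\d{1,2}')

def numbered_dirs(root='.'):
    """Return sorted names of directories in root whose names match NAME_RE."""
    with os.scandir(root) as entries:
        return sorted(
            entry.name
            for entry in entries
            # is_dir() filters out regular files; fullmatch() anchors both ends.
            if entry.is_dir() and NAME_RE.fullmatch(entry.name)
        )
```

Calling numbered_dirs('/data') would be the equivalent of the loop above for the directory in the original question.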
#5
You can still use glob combined with re matching:

import glob
import re

for d in glob.glob("/data/part[0-9]*"):
    if re.match(r"/data/part\d{1,2}$", d):
        print(d)
Compared to os.listdir(), the regex only runs over the list that glob has already prefiltered; in practice the result should be the same.

And if you don't have mixed names like part2x, then even glob("/data/part[0-9]*") on its own would work ...
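If you would rather stay with glob alone, one possible sketch is to run one pattern per digit count, since glob has no quantifiers such as {1,2}. The function name part_dirs is hypothetical. The numeric sort is an addition of mine, because glob returns entries in arbitrary order (as the first listing in this thread shows); it does use re once, just to pull out the trailing number.

```python
import glob
import os
import re

def part_dirs(base='/data'):
    """Return part0..part99-style directories under base, in numeric order."""
    matches = []
    # glob has no quantifiers, so use one pattern per digit count.
    for pattern in ('part[0-9]', 'part[0-9][0-9]'):
        matches.extend(glob.glob(os.path.join(base, pattern)))
    # Keep directories only, then sort by the trailing number.
    dirs = [p for p in matches if os.path.isdir(p)]
    return sorted(dirs, key=lambda p: int(re.search(r'\d+$', p).group()))
```

Names such as part2x, part_xyz or part100 match neither pattern, so no extra filtering is needed.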
#6
Great, thanks!
I should've used a regex from the start.

The only thing for us to consider, however, is that we often have to query Hadoop filesystems. So although your suggestion will work for data on shared Linux file systems, I don't think it will do for Hadoop.

I'll have to experiment with prefixes.
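One way to keep the regex idea independent of the filesystem is to filter plain path strings, however the listing was obtained. This is only a sketch: filter_numbered and its parameters are hypothetical names, and it assumes nothing about any Hadoop client API, only that you end up with an iterable of "/"-separated paths.

```python
import posixpath
import re

def filter_numbered(paths, prefix='part', max_digits=2):
    """Keep paths whose final component is prefix plus 1..max_digits digits."""
    name_re = re.compile(re.escape(prefix) + r'\d{1,%d}' % max_digits)
    return [
        p for p in paths
        # Strip a trailing slash so directory-style paths still match.
        if name_re.fullmatch(posixpath.basename(p.rstrip('/')))
    ]
```

For example, filter_numbered(['/data/part1', '/data/part_xyz', '/data/part10']) keeps only the first and last entries.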