Python Forum
why i don't like os.walk()
#11
did you try pathlib?
FYI the following was corrected.
Quote: From: Guido van Rossum <gui...@python.org>
Mon, 11 Jan 2016 12:41:48 -0800

On Mon, Jan 11, 2016 at 10:57 AM, Gregory P. Smith <[email protected]> wrote:

> On Wed, Jan 6, 2016 at 3:05 PM Brendan Moloney <[email protected]> wrote:
>
>> Its important to keep in mind the main benefit of scandir is you don't
>> have to do ANY stat call in many cases, because the directory listing
>> provides some subset of this info. On Linux you can at least tell if a path
>> is a file or directory. On windows there is much more info provided by the
>> directory listing. Avoiding subsequent stat calls is also nice, but not
>> nearly as important due to OS level caching.
>>
>
> +1 - this was one of the two primary motivations behind scandir. Anything
> trying to reimplement a filesystem tree walker without using scandir is
> going to have sub-standard performance.
>
> If we ever offer anything with "find like functionality" related to
> pathlib, it *needs* to be based on scandir. Anything else would just be
> repeating the convenient but untrue limiting assumptions of os.listdir:
> That the contents of a directory can be loaded into memory and that we
> don't mind re-querying the OS for stat information that it already gave us
> but we threw away as part of reading the directory.
>

And we already have this in the form of pathlib's [r]glob() methods.
There's a patch to the glob module in http://bugs.python.org/issue25596 and
as soon as that's committed I hope that its author(s) will work on doing a
similar patch for pathlib's [r]glob (tracking this in
http://bugs.python.org/issue26032).

--
--Guido van Rossum (python.org/~guido)
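
The point made in the quoted mail, that a directory listing already tells you whether an entry is a file or a directory, can be illustrated with a minimal sketch. The helper name classify and the temporary test tree are made up for this illustration; os.scandir() and DirEntry.is_dir() are the real standard-library calls (Python 3.5+, with the context-manager form needing 3.6+).

```python
import os
import tempfile

def classify(directory):
    """Return sorted (name, kind) pairs for the entries of `directory`."""
    result = []
    with os.scandir(directory) as entries:  # context manager: Python 3.6+
        for entry in entries:
            # is_dir() can usually be answered from the directory listing
            # itself, so in most cases no separate stat() call is made.
            kind = "dir" if entry.is_dir(follow_symlinks=False) else "file"
            result.append((entry.name, kind))
    return sorted(result)

if __name__ == "__main__":
    # Hypothetical demo tree: one subdirectory and one empty file.
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "sub"))
        open(os.path.join(tmp, "a.txt"), "w").close()
        print(classify(tmp))  # [('a.txt', 'file'), ('sub', 'dir')]
```

With os.listdir() the same classification would need an os.path.isdir() call, and therefore a stat, for every name.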
#12
(Jan-10-2018, 04:44 AM)Skaperen Wrote: if it has a subdirectory, i want to yield, one file object at a time, the entire subtree of it before yielding anything after it at that level.
I don't understand what you want. Can you give an example hierarchy and the result you're expecting?
#13
(Jan-10-2018, 05:12 AM)Gribouillis Wrote:
(Jan-10-2018, 04:44 AM)Skaperen Wrote: if it has a subdirectory, i want to yield, one file object at a time, the entire subtree of it before yielding anything after it at that level.
I don't understand what you want. Can you give an example hierarchy and the result you're expecting?
here are 2 commands on one line to make a tree and a 2 command pipeline to show the order i want my generator to yield results:
Output:
lt1/forums /home/forums 1> mkdir tryme{,/{foo,bar}}{,/{1,2}};touch tryme{,/{foo,bar}}{,/{1,2}}/{a,b,c}
lt1/forums /home/forums 2> find tryme -print|sort
tryme
tryme/1
tryme/1/a
tryme/1/b
tryme/1/c
tryme/2
tryme/2/a
tryme/2/b
tryme/2/c
tryme/a
tryme/b
tryme/bar
tryme/bar/1
tryme/bar/1/a
tryme/bar/1/b
tryme/bar/1/c
tryme/bar/2
tryme/bar/2/a
tryme/bar/2/b
tryme/bar/2/c
tryme/bar/a
tryme/bar/b
tryme/bar/c
tryme/c
tryme/foo
tryme/foo/1
tryme/foo/1/a
tryme/foo/1/b
tryme/foo/1/c
tryme/foo/2
tryme/foo/2/a
tryme/foo/2/b
tryme/foo/2/c
tryme/foo/a
tryme/foo/b
tryme/foo/c
lt1/forums /home/forums 3>
Tradition is peer pressure from dead people

What do you call someone who speaks three languages? Trilingual. Two languages? Bilingual. One language? American.
#14
here is the code i have made so far. you probably want to scroll past the debugging tools that are in there.
#15
(Jan-10-2018, 05:54 AM)Skaperen Wrote: here are 2 commands on one line to make a tree and a 2 command pipeline to show the order i want my generator to yield results
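os.walk() cannot do that directly: it yields one (dirpath, dirnames, filenames) tuple per directory, so the files of a directory are reported together instead of being interleaved with its subdirectories in sorted order.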
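
A small sketch of how os.walk() groups its results may help here. The helper walk_order and the demo tree (top containing the files a and z and the directory sub with a file x) are invented for this illustration; only os.walk() itself is the real API.

```python
import os
import tempfile

def walk_order(top):
    """Flatten os.walk() output into the order in which paths are reported."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()  # sorting in place controls the order of descent
        paths.append(dirpath)
        # All files of this directory are reported before any subdirectory
        # of it is entered.
        for name in sorted(filenames):
            paths.append(os.path.join(dirpath, name))
    return paths

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "top")
        os.makedirs(os.path.join(root, "sub"))
        for rel in ("a", "z", os.path.join("sub", "x")):
            open(os.path.join(root, rel), "w").close()
        for p in walk_order(root):
            print(os.path.relpath(p, tmp))
```

For this tree the reported order is top, top/a, top/z, top/sub, top/sub/x, while the order asked for in this thread would be top, top/a, top/sub, top/sub/x, top/z.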

I once wrote a nice generic non-recursive tree/graph traversal module named walktree. Using this module, it is a piece of cake to code your traversal:

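from collections import namedtuple
from walktree import walk  # the poster's own module, not in the standard library
import os

# A tree node: a filesystem path plus a flag telling whether it is a real directory.
Node = namedtuple('Node', 'path isdir')

def subn(node):
    """Return the children of node sorted by path (nothing for non-directories)."""
    if not node.isdir:
        return ()
    pairs = sorted((e.path, e.is_dir(follow_symlinks=False))
                   for e in os.scandir(node.path))
    return (Node(*p) for p in pairs)


def flatwalk(path):
    """Yield one path at a time, in the order requested above."""
    # A symlink to a directory is marked as a non-directory, so it is not followed.
    root = Node(path, os.path.isdir(path) and not os.path.islink(path))
    for seq in walk(root, subn, walk.enter | walk.leaf):
        yield seq[-1].path

if __name__ == '__main__':
    for x in flatwalk('tryme'):
        print(x)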
Remark: this code uses scandir() as recommended by G van Rossum.
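
For readers who do not have the walktree module, here is a minimal standard-library sketch of the same idea. It is not the code above: it swaps walktree for an explicit stack, and the demo tree is made up. It needs Python 3.6+ because os.scandir() is used as a context manager.

```python
import os
import tempfile

def flatwalk(path):
    """Yield path, then everything below it, depth-first in sorted order."""
    # Each stack item is (path, isdir); symlinked directories are not followed.
    stack = [(path, os.path.isdir(path) and not os.path.islink(path))]
    while stack:
        current, isdir = stack.pop()
        yield current
        if isdir:
            with os.scandir(current) as it:
                children = sorted((e.path, e.is_dir(follow_symlinks=False))
                                  for e in it)
            # Push in reverse so the smallest child is popped first and its
            # whole subtree is yielded before the next sibling.
            stack.extend(reversed(children))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "top")
        os.makedirs(os.path.join(root, "sub"))
        for rel in ("a", "z", os.path.join("sub", "x")):
            open(os.path.join(root, rel), "w").close()
        for p in flatwalk(root):
            print(os.path.relpath(p, tmp))
```

Only one directory listing is held in memory at a time. For the tryme tree this should give the same order as find tryme -print|sort, but the two can differ in general: sort compares whole path strings (and depends on the locale), so a name containing a character that sorts before "/" may land in a different place.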
#16
Nice solution, Grib.
(Jan-10-2018, 10:45 PM)Gribouillis Wrote: Remark: this code uses scandir() as recommended by G van Rossum.
One could also mention Ben Hoyt, the creator of scandir(), which is now also used in os.walk() and glob.
Here is a nice blog post, "Contributing os.scandir() to Python", about the whole process from the start to becoming part of the standard library.
#17
i need to do a flatwalk generator that works in every Python version released in the past 10 years. so that means 2.6 and on.
#18
(Jan-11-2018, 01:37 AM)Skaperen Wrote: i need to do a flatwalk generator that works in every Python version released in the past 10 years.
why?
Isn't that sort of like saying you need to build an oil filter that fits every car going back to the Model-T?
#19
because that is a generic standard time frame for making software tools compatible. i would want to ask the BDFL what to use in 2.7. wasn't 2.7 going to get lots of backports from 3.X? os.scandir() seems like it would not break things if it were added to 2.7.X (i believe 2.7.14 is next).

if i needed to be compatible with Python 1.X then i would consider that to be like a model-T. there are places i still run across 2.6. in about 9 months, 2.6 will be off my radar but 2.7 will still be there for a while. so that means os.listdir() for me.
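
A minimal sketch of what an os.listdir()-based flatwalk could look like. It is an illustration, not the code from post #14, and it sticks to syntax and calls that should be available from Python 2.6 on (no yield from, no os.scandir()); it has only been reasoned through for current Python 3. The demo tree is made up.

```python
import os
import shutil
import tempfile

def flatwalk(path):
    """Yield path, then its whole subtree, depth-first in sorted order."""
    stack = [path]
    while stack:
        current = stack.pop()
        yield current
        # isdir()/islink() cost extra stat calls per entry; that is the
        # price of not having scandir(). Symlinked dirs are not followed.
        if os.path.isdir(current) and not os.path.islink(current):
            # Reverse order, so the smallest name is popped first.
            names = sorted(os.listdir(current), reverse=True)
            stack.extend(os.path.join(current, name) for name in names)

if __name__ == "__main__":
    tmp = tempfile.mkdtemp()  # TemporaryDirectory does not exist in 2.x
    try:
        root = os.path.join(tmp, "top")
        os.makedirs(os.path.join(root, "sub"))
        for rel in ("a", "z", os.path.join("sub", "x")):
            open(os.path.join(root, rel), "w").close()
        for p in flatwalk(root):
            print(os.path.relpath(p, tmp))
    finally:
        shutil.rmtree(tmp)
```

Because each directory's subtree is pushed on top of its remaining siblings, the entire subtree is yielded before anything after it at that level.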
#20
(Jan-11-2018, 01:37 AM)Skaperen Wrote: i need to do a flatwalk generator that works in every Python version released in the past 10 years. so that means 2.6 and on.
The scandir module on PyPI works for Python 2.6+ and 3.2+!
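
A common way to use the backport is a fallback import, sketched below. The helper subdirs and the demo tree are invented for this illustration; the sketch assumes the PyPI package exposes scandir.scandir() with the same DirEntry interface as os.scandir(), and it has only been reasoned through for current Python 3, where the first import succeeds.

```python
import os
import tempfile

try:
    from os import scandir       # standard library, Python 3.5+
except ImportError:
    from scandir import scandir  # PyPI backport (pip install scandir)

def subdirs(path):
    """Return the sorted names of the real subdirectories of path."""
    # No "with" statement here: the context-manager form of scandir()
    # only exists from Python 3.6 on. The iterator is fully consumed.
    return sorted(e.name for e in scandir(path)
                  if e.is_dir(follow_symlinks=False))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("b", "a"):
            os.mkdir(os.path.join(tmp, name))
        open(os.path.join(tmp, "f"), "w").close()
        print(subdirs(tmp))  # ['a', 'b']
```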