Nov-11-2017, 05:54 PM
(This post was last modified: Nov-11-2017, 05:54 PM by QbLearningPython.)
While testing a module, I found some odd behaviour in the pathlib package. I have a list of pathlib.Path objects and I sorted it with sorted(). I assumed that sorting a list of Paths would give the same order as sorting a list of their (string) filenames, but that is not the case.
I am running Python 3.6.3 on macOS 10.12.6.
Let me explain.
I have a list of filenames such as:

```python
filenames_for_testing = (
    '/spam/spams.txt',
    '/spam/spam.txt',
    '/spam/another.txt',
    '/spam/binary.bin',
    '/spam/spams/spam.ttt',
    '/spam/spams/spam01.txt',
    '/spam/spams/spam02.txt',
    '/spam/spams/spam03.ppp',
    '/spam/spams/spam04.doc',
)
```
If I run the following:

```python
sorted_filenames = sorted(filenames_for_testing)
print()
for element in sorted_filenames:
    print(element)
print()
```

the alphabetical (string) order of this list will be:
- /spam/another.txt
- /spam/binary.bin
- /spam/spam.txt
- /spam/spams.txt
- /spam/spams/spam.ttt
- /spam/spams/spam01.txt
- /spam/spams/spam02.txt
- /spam/spams/spam03.ppp
- /spam/spams/spam04.doc
But when I try to order the same list as pathlib.Paths using:
```python
from pathlib import Path

paths_for_testing = [
    Path(filename)
    for filename in filenames_for_testing
]
sorted_paths = sorted(paths_for_testing)
```

The list returned is (just showing the filenames of the pathlib.Paths):
- /spam/another.txt
- /spam/binary.bin
- /spam/spam.txt
- /spam/spams/spam.ttt
- /spam/spams/spam01.txt
- /spam/spams/spam02.txt
- /spam/spams/spam03.ppp
- /spam/spams/spam04.doc
- /spam/spams.txt
which is different from the previous list because '/spam/spams.txt' does not go after '/spam/spam.txt' and before all the '/spam/spams/*' files (instead, it goes at the end of the list).
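A likely explanation, offered as a sketch rather than a statement about pathlib's documented contract: plain strings are compared character by character, while path objects appear to be compared component by component (roughly as their `.parts` tuples). The example below uses `PurePosixPath` so that it behaves the same on any platform; the names `a` and `b` are only for illustration.

```python
from pathlib import PurePosixPath

a = PurePosixPath('/spam/spams.txt')
b = PurePosixPath('/spam/spams/spam.ttt')

# As plain strings, the first differing character decides:
# '.' (code point 46) sorts before '/' (code point 47).
print(ord('.'), ord('/'))
print(str(a) < str(b))      # True

# As paths, the components are compared instead.
# The third component is 'spams.txt' for a and 'spams' for b,
# and 'spams' sorts before 'spams.txt' because it is a prefix.
print(a.parts)              # ('/', 'spam', 'spams.txt')
print(b.parts)              # ('/', 'spam', 'spams', 'spam.ttt')
print(a.parts < b.parts)    # False
print(a < b)                # False
```

Under that assumption, everything inside the '/spam/spams/' directory sorts before the file '/spam/spams.txt', which matches the order shown above.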
You can check it using:

```python
sorted_filenames == [str(path) for path in sorted_paths]
```

which returns False.
I am not sure whether this is a bug; maybe it is the intended behaviour. However, I find it odd. Unless I am missing something, I can hardly understand why a list of pathlib.Paths and a list of the same filenames as strings are not ordered in the same fashion.
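If the string order is what is actually wanted, one possible workaround (a minimal sketch, not necessarily the recommended approach) is to sort the paths with `key=str`, so that the comparison is done on the string form of each path. The shortened `filenames` tuple and the variable names below are only for illustration, and `PurePosixPath` is used to keep the result platform independent.

```python
from pathlib import PurePosixPath

filenames = ('/spam/spams.txt', '/spam/spam.txt', '/spam/spams/spam.ttt')
paths = [PurePosixPath(name) for name in filenames]

# Sort on the string form of each path instead of on the path object.
by_string = sorted(paths, key=str)

# Default ordering of the path objects, for comparison.
by_path = sorted(paths)

print([str(p) for p in by_string])
print([str(p) for p in by_path])

# The key=str result agrees with sorting the plain strings.
print([str(p) for p in by_string] == sorted(filenames))  # True
```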
A crafted script to test this:
```python
from pathlib import Path


# order string filenames

filenames_for_testing = (
    '/spam/spams.txt',
    '/spam/spam.txt',
    '/spam/another.txt',
    '/spam/binary.bin',
    '/spam/spams/spam.ttt',
    '/spam/spams/spam01.txt',
    '/spam/spams/spam02.txt',
    '/spam/spams/spam03.ppp',
    '/spam/spams/spam04.doc',
)

sorted_filenames = sorted(filenames_for_testing)


# output ordered list of string filenames

print()
print("Ordered list of string filenames:")
print()
for element in sorted_filenames:
    print(f'\t{element}')
print()


# order paths (built from the same string filenames)

paths_for_testing = [
    Path(filename)
    for filename in filenames_for_testing
]

sorted_paths = sorted(paths_for_testing)


# output ordered list of pathlib.Paths

print()
print("Ordered list of pathlib.Paths:")
print()
for element in sorted_paths:
    print(f'\t{element}')
print()


# compare

print()

if sorted_filenames == [str(path) for path in sorted_paths]:
    print('Ordered lists of string filenames and pathlib.Paths are EQUAL.')

else:
    print('Ordered lists of string filenames and pathlib.Paths are DIFFERENT.')

    # report the first position at which the two orderings disagree
    for index, (name, path) in enumerate(zip(sorted_filenames, sorted_paths)):

        if name != str(path):
            print()
            print('First different element:')
            print(f'\tElement #{index}')
            print(f'\t{name} != {path}')
            break

print()
```
Thanks.