Posts: 2
Threads: 1
Joined: Apr 2024
I'm writing an image viewer and need a way to get the next/previous image in a folder. I've cobbled together an approach based on various searches of the internet and documentation, but I don't really like it.
My existing solution depends on using os.listdir to load all the filenames in the directory into memory, then sorting that list. This is slow for large directories, especially when initially creating the list. It seems like every approach to doing this in Python requires loading and sorting all the filenames in one form or another.
Is there not a way to do this kind of query on a lower level, without having to store and sort the paths in Python itself? Maybe via some extended filesystem library? I do need this to be cross-platform.
Here is what I have now, for reference (this is part of a larger program, so some things are referenced that aren't present):
def applyFileListSort(self, sort):
    self.sort_type = sort
    if not self.file_list:
        return
    if sort is SortType.ALPHABETICAL:
        self.file_list.sort(key=cmp_to_key(locale.strcoll))
    elif sort is SortType.MODIFIED:
        self.file_list.sort(key=os.path.getmtime)
    else:
        raise ValueError("Unhandled sorting")

def getFileList(self):
    if not self.file_list:
        self.file_list = []
        dir = os.path.dirname(self.getFilename())
        for file in os.listdir(path=dir):
            if os.path.splitext(file)[1].lower() in SUPPORTED_EXTENSIONS:
                self.file_list.append(os.path.join(dir, file))
        self.applyFileListSort(self.sort_type)
    return self.file_list

def changeImage(self, offset):
    file_list = self.getFileList()
    cur_i = file_list.index(self.getFilename())
    new_filename = file_list[(cur_i + offset) % len(file_list)]
    self.openFile(new_filename)
Posts: 4,779
Threads: 76
Joined: Jan 2018
You can use os.scandir() instead of os.listdir(). It returns an iterator over directory entries, so it doesn't load the whole list up front.
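A minimal sketch of what that could look like for the modification-time sort. The function names and the contents of SUPPORTED_EXTENSIONS are assumptions made for illustration, since the original set is not shown in the thread. Each entry's stat() result is cached on the entry, and on Windows it usually needs no extra system call, which may help with the slower "sort by modified" case.

```python
import os

# Hypothetical extension set; the real SUPPORTED_EXTENSIONS is not shown in the thread.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


def iter_images(directory):
    """Yield (path, mtime) pairs for supported image files, one entry at a time."""
    with os.scandir(directory) as entries:
        for entry in entries:
            ext = os.path.splitext(entry.name)[1].lower()
            if ext in SUPPORTED_EXTENSIONS and entry.is_file():
                # DirEntry.stat() caches its result on the entry object.
                yield entry.path, entry.stat().st_mtime


def sorted_by_mtime(directory):
    """Return image paths ordered by modification time, oldest first."""
    pairs = sorted(iter_images(directory), key=lambda pair: pair[1])
    return [path for path, _ in pairs]
```

Note that sorting still needs every name in memory; scandir only avoids building the intermediate list and the separate os.path.getmtime call per file.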
« We can solve any problem by introducing an extra level of indirection »
Posts: 1,088
Threads: 143
Joined: Jul 2017
Not too sure what you want exactly.
I think you don't want a long list in memory, but that is only text, won't take up too much space.
The pictures won't remain open, just 1 pic at a time, close it before opening the next pic.
from pathlib import Path

mydir = Path('/home/pedro/Pictures/')
# only get files, not directories
filelist = [filename for filename in mydir.iterdir() if filename.is_file()]
for filename in filelist:
    print(f"\nfilename: {filename.name}")
    print(f"file suffix: {filename.suffix}")

# get the ending like gif or jpg
def getSuffix(filename):
    ending = filename.suffix[1:]
    return ending

# sort the files according to the endings
# could also save files in folders according to the endings
files = sorted(filelist, key=getSuffix)
Using yield and next you can get the pics 1 at a time, but you can't go back, unless you save the name of the pic, which would entail making a list!
def showDetails():
    # yield one (path, suffix) pair per file instead of building a list
    for filename in mydir.iterdir():
        if filename.is_file():
            yield (filename, filename.suffix)

f = showDetails()
# get information about the first file
details = next(f)
# get information from all remaining files
for i in f:
    print(i)
Posts: 2,120
Threads: 10
Joined: May 2017
Here it is as a class with type hints. mypy does not complain :-)
This loads the whole directory content once when the object is created, and again only when sort_func is changed.
The __iter__ method makes the object iterable.
Index access can be used on this object; __getitem__ allows it.
Without type hints it would be less code. I add them so I can check my code.
from pathlib import Path
from typing import Any, Callable, Iterable


class Images:
    def __init__(
        self,
        root: str | Path,
        filetypes: Iterable[str],
        sort_func: Callable[[Path], Any] | None = None,
    ):
        self.root = Path(root)
        self.filetypes = filetypes
        self.images: list[Path] = []
        self._sort_func: Callable[[Path], Any] | None = sort_func
        self.pointer: int | None = None
        self.scan_dir()

    @property
    def sort_func(self) -> Callable[[Path], Any] | None:
        return self._sort_func

    @sort_func.setter
    def sort_func(self, func: Callable[[Path], Any] | None) -> None:
        self._sort_func = func
        self.scan_dir()

    def _iter_dir(self):
        return (
            file
            for file in self.root.iterdir()
            if file.is_file() and file.suffix.lower() in self.filetypes
        )

    def scan_dir(self):
        self.images = sorted(self._iter_dir(), key=self.sort_func)

    def _xt(self, direction: int) -> Path | None:
        if not len(self):
            return None
        if self.pointer is None:
            self.pointer = 0
        else:
            self.pointer += direction
            self.pointer %= len(self.images)
        return self.images[self.pointer]

    def next(self) -> Path | None:
        return self._xt(1)

    def prev(self) -> Path | None:
        return self._xt(-1)

    def __getitem__(self, index: int) -> Path:
        return self.images[index]

    def __iter__(self):
        yield from self.images

    def __len__(self) -> int:
        return len(self.images)


def sort_by_size(path: Path) -> int:
    return path.stat().st_size


img = Images(r"C:\Users\XXX\Pictures", (".png", ".jpg"), sort_by_size)
print(list(img))
Posts: 6,775
Threads: 20
Joined: Feb 2020
I don't think the problem has anything to do with the posted code. I created a folder with 10,000 files and ran your code. It took 0.0415 seconds to load all the filenames into a list and 0.004 seconds to sort them in alphabetical order. Sorting by last modified takes 100 times longer, 0.4 seconds.
Do you have more than 10,000 image files in your folder?
I also tried using pathlib's Path.glob() to get all the image files. This was faster than looping through os.listdir, taking only 0.003 seconds to load the files.
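For anyone who wants to repeat this kind of measurement, here is a rough sketch using time.perf_counter and a temporary directory of empty files. The function names are made up for illustration, and the numbers will vary a lot with the disk, the OS, and caching, so treat the output as an order-of-magnitude check only.

```python
import os
import tempfile
import time


def time_listing(directory):
    """Return (list_seconds, sort_seconds, count) for one directory."""
    start = time.perf_counter()
    names = os.listdir(directory)
    listed = time.perf_counter()
    names.sort()
    done = time.perf_counter()
    return listed - start, done - listed, len(names)


def run_benchmark(count=10_000):
    """Create `count` empty .jpg files in a temp directory and time listing them."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(count):
            open(os.path.join(tmp, f"img_{i:05d}.jpg"), "w").close()
        return time_listing(tmp)
```

Calling run_benchmark() returns the listing time, the sorting time, and the number of files found.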
Posts: 2
Threads: 1
Joined: Apr 2024
So, what I was hoping for was a lower-level operation that wouldn't actually require storing all the filenames, sorting them, searching them to move to the next one, and so on. However, it seems that doesn't exist, and even where it does, in something like C/C++, it is highly operating-system dependent.
With that said, I went further into this and I think my original problem was more user error on my part in how I was testing this. I was unknowingly running this on quite a slow device, and a lot of what I was attributing to the file search was due to a large image that kept making my test look slower than it really was. It turns out it is pretty fast, definitely fast enough that I'm not worried anymore.
I appreciate the responses and apologize this was somewhat a non-issue in the end. Also going to take a closer look at DeaD_EyE's suggestion to help clean up this process.
Thanks a lot!
Posts: 4,779
Threads: 76
Joined: Jan 2018
(Apr-11-2024, 03:32 PM)WilliamKappler Wrote: So, what I was hoping was that there was a lower level operation
os.scandir is a lower level operation. It doesn't store all the filenames.
Posts: 6,775
Threads: 20
Joined: Feb 2020
Apr-11-2024, 05:45 PM
(This post was last modified: Apr-11-2024, 05:45 PM by deanhystad.)
Quote:os.scandir is a lower level operation. It doesn't store all the filenames.
It also doesn't sort the filenames alphabetically or by modification date.
You could always write your own external scanning/sorting function and call it from Python, or run a shell script, but I really don't think long delays are related to getting and sorting the files. A program that displays one image at a time using prev/next is not appropriate for sifting through thousands of images. Even if there are tens of thousands of images, it takes less than a second to get all the files and sort them. If there is a lengthy delay in the OP's program, I would look elsewhere for the source of that delay.
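For completeness, if the goal really is to avoid storing and sorting the names, the neighbours of the current file can be found in a single pass over os.scandir, keeping only a few candidate names. This is just a sketch under some assumptions: the function name and parameters are hypothetical, it compares plain strings rather than using locale.strcoll, and it rescans the directory on every step, so it trades memory for repeated directory reads.

```python
import os


def neighbours(directory, current_name, extensions):
    """Return (previous, next) file names around current_name in string order.

    Makes one pass over os.scandir and keeps only four candidate names.
    Wraps around at both ends, like the modulo in the original changeImage.
    """
    prev_name = next_name = None  # closest names below / above current_name
    first = last = None           # overall smallest / largest, for wrap-around
    with os.scandir(directory) as entries:
        for entry in entries:
            name = entry.name
            if os.path.splitext(name)[1].lower() not in extensions:
                continue
            if first is None or name < first:
                first = name
            if last is None or name > last:
                last = name
            if name < current_name and (prev_name is None or name > prev_name):
                prev_name = name
            elif name > current_name and (next_name is None or name < next_name):
                next_name = name
    return (
        prev_name if prev_name is not None else last,
        next_name if next_name is not None else first,
    )
```

Given the timings reported above, the sorted list is probably still the simpler choice; this only shows that the sort is not strictly required.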
Posts: 1,088
Threads: 143
Joined: Jul 2017
Just as a comparison, you can make a generator and a list, then compare the sizes.
The generator is tiny!
from pathlib import Path
import sys

mydir = Path('/home/pedro/Pictures/')
# only get files, not directories
# make a list
file_list = [filename for filename in mydir.iterdir() if filename.is_file()]
# make a generator, very small in memory
filelist = (filename for filename in mydir.iterdir() if filename.is_file())
for filename in filelist:
    print(f"\nfilename: {filename.name}")
    print(f"file suffix: {filename.suffix}")

sys.getsizeof(filelist)   # returns 104
sys.getsizeof(file_list)  # returns 1656
type(filelist)            # returns <class 'generator'>
type(file_list)           # returns <class 'list'>
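One caveat worth sketching: sys.getsizeof reports only the container itself, not the objects it refers to, and a generator can be walked only once and only forwards. The example below uses made-up file names instead of a real directory so it runs anywhere; the exact byte counts depend on the Python version and platform, so none are shown.

```python
import sys

# Made-up names standing in for a directory listing.
names = [f"image_{i:05d}.jpg" for i in range(1000)]

# A generator object has a small, fixed size because it has produced nothing yet.
lazy = (name for name in names)

# getsizeof on a list counts only the list object (its array of references) ...
shallow = sys.getsizeof(names)
# ... so the strings it points to have to be added for a fuller picture.
deep = shallow + sum(sys.getsizeof(name) for name in names)

print(sys.getsizeof(lazy), shallow, deep)

# A generator can be consumed only once, and only forwards.
first = next(lazy)
remaining = sum(1 for _ in lazy)
print(first, remaining, next(lazy, None))  # image_00000.jpg 999 None
```

So the generator saves memory only as long as going back to an earlier image is not needed, which is the same limitation mentioned earlier in the thread.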