Here is a simpler version, not tested much:
```python
from pathlib import Path
from typing import Generator

FileList = Generator[Path, None, None]


def get_list_of_files(base_folder: Path) -> FileList:
    for entry in base_folder.rglob('*'):
        # Skip hidden files and everything below a hidden folder
        # (checks every path part relative to base_folder, not just the name)
        if any(part.startswith('.') for part in entry.relative_to(base_folder).parts):
            continue
        if entry.is_file():
            yield entry


base_folder = Path.home() / 'Development'
files = list(get_list_of_files(base_folder))
for file in files:
    print(file)
```
Btw., the type hints are not needed, but they can help an IDE check for errors.
I changed `FileList` because the function is now a generator that yields `Path` objects.
You can build the home path with `Path.home()` and join paths with `/`, which makes the code more readable. The recursion is also not needed, and it can fail if your directory tree is deeper than about 1000 levels (Python's default recursion limit).
The method `rglob('*')` does the job and finds all files and directories recursively.
Now to your other problem: you can extend the function to take an argument for the extensions you want to process.
```python
from pathlib import Path

my_extensions = ['.txt', '.odt', '.csv']
my_path = Path('/A/directory/somewhere/that/does/not.exist/file.txt')
print('The suffix is:', my_path.suffix)
if my_path.suffix not in my_extensions:
    print(my_path, 'does not have an allowed file extension')
```
Instead of printing something, use `continue` in the for-loop to skip this element.
By the way, you can iterate over the generator directly. Before, you passed a list as an argument, and it was modified inside the function. Now the function is a generator (it contains the `yield` keyword), and on each iteration it returns one path.
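The suffix check can also be folded into the generator itself. This is only a minimal sketch: the optional `extensions` parameter and the case-insensitive comparison are my assumptions, not part of the original code, and the hidden-entry check here also skips everything below a hidden folder.

```python
from pathlib import Path
from typing import Generator, Iterable, Optional

FileList = Generator[Path, None, None]


def get_list_of_files(
    base_folder: Path,
    extensions: Optional[Iterable[str]] = None,  # hypothetical parameter
) -> FileList:
    # Lower-case once so '.TXT' and '.txt' compare equal (assumption)
    allowed = None
    if extensions is not None:
        allowed = {ext.lower() for ext in extensions}
    for entry in base_folder.rglob('*'):
        # Skip hidden files and everything below a hidden folder
        if any(part.startswith('.') for part in entry.relative_to(base_folder).parts):
            continue
        if not entry.is_file():
            continue
        # Skip files whose suffix is not in the allowed set
        if allowed is not None and entry.suffix.lower() not in allowed:
            continue
        yield entry
```

Calling `get_list_of_files(base_folder, ['.txt', '.csv'])` then yields only matching files, while leaving out the second argument keeps the old behaviour of yielding every non-hidden file.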
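If the file list is not needed, you could remove `files = []` and loop over the generator directly:
```python
import zipfile

for file in get_list_of_files(base_folder):
    print(file)
    # The with-block closes the archive again after extracting
    with zipfile.ZipFile(file) as my_zipfile:
        my_zipfile.extractall(r'C:\Users\username\My_Dataset\new')
```
I haven't tested whether the zipfile module understands the `Path` object (Python 3.6.2 and later should accept it), but a path can always be converted into a string with `str(file)`.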
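To check the `str(file)` conversion without touching real data, here is a small self-contained sketch that builds a throwaway archive in a temporary directory and extracts it again. The helper name `extract_zip` and the file names are made up for the demo.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_zip(archive: Path, target_folder: Path) -> list:
    """Extract one archive into target_folder and return the member names."""
    target_folder.mkdir(parents=True, exist_ok=True)
    # str() is the safe fallback; newer Python versions also take the Path itself
    with zipfile.ZipFile(str(archive)) as my_zipfile:
        my_zipfile.extractall(str(target_folder))
        return my_zipfile.namelist()


# Demo: create a tiny archive, extract it, and read the result back
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    archive = tmp_path / 'example.zip'
    with zipfile.ZipFile(str(archive), 'w') as zf:
        zf.writestr('hello.txt', 'Hello from the archive')
    names = extract_zip(archive, tmp_path / 'new')
    extracted_text = (tmp_path / 'new' / 'hello.txt').read_text()
    print(names, extracted_text)
```

In the real loop you would call the helper once per file yielded by the generator, with your own target folder.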
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!