Python Forum
how to check for file type in a folder
#1
Hello, I want the script to go through the entire folder and list only the files that are neither Zip nor Rar files.

But when I use this code, it just goes through the entire folder and lists all the files. What am I doing wrong?

import zipfile, os, rarfile, unicodedata

from rarfile import RarFile
rootFolder = u"C:/Users/user/Desktop/archives/"

from zipfile import ZipFile
rootFolder = u"C:/Users/user/Desktop/archives/"

zipfiles = [os.path.join(rootFolder, f) for f in os.listdir(rootFolder)]
[print(i) for i in zipfiles if not isinstance(i, ZipFile) and not isinstance(i, RarFile)]
#2
First, you assign rootFolder twice, but that is not the problem.
The last line prints every element, because none of them is an instance of ZipFile or RarFile.
They are all strings.
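A minimal sketch of why that filter never excludes anything: isinstance() looks at the type of the Python object, not at the file on disk. The file names below are made up for illustration and stand in for what os.listdir() returns.

```python
from zipfile import ZipFile

# Hypothetical names standing in for the result of os.listdir(): plain strings.
names = ["backup.zip", "photos.rar", "notes.txt"]

# isinstance() tests the type of the object itself, so a str is never
# a ZipFile instance, whatever the file behind that name contains.
is_zip_object = [isinstance(name, ZipFile) for name in names]
all_are_str = all(isinstance(name, str) for name in names)

print(is_zip_object)
print(all_are_str)
```

Because every check is False, "not isinstance(...)" is True for every entry, and everything gets printed.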

zipfiles = [os.path.join(rootFolder, f) for f in os.listdir(rootFolder) if f.endswith('.rar') or f.endswith('.zip')]
This should give you a list of strings that contains only the paths ending with .rar or .zip.
It makes the comprehension rather long, though. You can split it across several lines or move the test into a function.

Multiline example:
zipfiles = [
    os.path.join(rootFolder, f) for f in os.listdir(rootFolder)
    if f.endswith('.rar') or f.endswith('.zip')
    ]
Or with a decider function:
def is_archive(file):
    register = ('.rar', '.zip')
    return any(file.endswith(ftype) for ftype in register)


zipfiles = [os.path.join(rootFolder, f) for f in os.listdir(rootFolder) if is_archive(f)]
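Since the original question asks for the files that are not archives, the same test can simply be negated. This is only a sketch, assuming the extensions can be trusted. The names ARCHIVE_SUFFIXES and not_archives and the sample listing are hypothetical. It also uses the fact that str.endswith accepts a tuple of suffixes, which replaces the any(...) loop.

```python
ARCHIVE_SUFFIXES = ('.rar', '.zip')


def is_archive(filename):
    # str.endswith accepts a tuple and returns True if any suffix matches.
    # lower() makes the check case-insensitive, e.g. for 'DATA.ZIP'.
    return filename.lower().endswith(ARCHIVE_SUFFIXES)


# Hypothetical directory listing used here instead of os.listdir(rootFolder).
names = ['a.zip', 'b.rar', 'c.txt', 'DATA.ZIP', 'readme']
not_archives = [name for name in names if not is_archive(name)]
print(not_archives)
```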
Another approach is to use pathlib in combination with glob.

from pathlib import Path


archive_folder = Path('your path')
rar_archives = list(archive_folder.glob('**/*.rar'))
zip_archives = list(archive_folder.glob('**/*.zip'))
The *_archives lists contain Path objects. Some functions and modules can't handle Path objects.
In that case, you can convert a Path object to a str with str(your_path_element).
The benefit of globbing is that you only get the matching files.

The '**/*.rar' pattern means that subdirectories are searched as well.
https://docs.python.org/3/library/pathli....Path.glob
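To see the difference between the recursive and the non-recursive pattern, here is a small sketch that builds a throwaway folder first. The folder layout and file names are invented for the demonstration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive_folder = Path(tmp)
    (archive_folder / 'sub').mkdir()
    # Empty placeholder files; only their names matter for globbing.
    (archive_folder / 'top.zip').touch()
    (archive_folder / 'sub' / 'nested.zip').touch()
    (archive_folder / 'sub' / 'notes.txt').touch()

    # '**/' matches the folder itself and every subdirectory below it.
    recursive = sorted(p.name for p in archive_folder.glob('**/*.zip'))
    # Without '**/', only the top-level folder is searched.
    top_only = sorted(p.name for p in archive_folder.glob('*.zip'))
    # str() turns a Path into a plain string for older APIs.
    as_strings = [str(p) for p in archive_folder.glob('*.zip')]

print(recursive)
print(top_only)
```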
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
ZipFile is a class for reading or writing zip files. Your zipfiles list is a list of strings that are file paths. They're instances of str, not ZipFile. I would normally do this by checking the extension of the paths in zipfiles (if i[-3:] not in ('zip', 'rar')). If you are worried that some have the wrong extension, you would have to open them with the ZipFile and RarFile classes and see whether there are errors when reading them.
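A lighter-weight variant of the content check, as a rough sketch: the standard library's zipfile.is_zipfile() looks at the file content instead of the name. For RAR, the sketch compares the first bytes of the file against the RAR signature. That signature check is my assumption of a simple stand-in (it would miss self-extracting archives, for example); the third-party rarfile package offers its own detection. The helper name looks_like_archive and the sample files are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path

# Assumption: RAR archives start with these six bytes ("Rar!" + 0x1A 0x07).
RAR_MAGIC = b'Rar!\x1a\x07'


def looks_like_archive(path):
    # is_zipfile() inspects the file content, so the extension is irrelevant.
    if zipfile.is_zipfile(path):
        return True
    # Fall back to comparing the leading bytes with the RAR signature.
    with open(path, 'rb') as handle:
        return handle.read(len(RAR_MAGIC)) == RAR_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # A real zip archive with a misleading extension.
    with zipfile.ZipFile(folder / 'one.bin', 'w') as zf:
        zf.writestr('hello.txt', 'hello')
    # Not a usable RAR file, just the signature bytes for the demo.
    (folder / 'two.bin').write_bytes(RAR_MAGIC + b'\x00')
    (folder / 'three.bin').write_text('plain text')

    results = {p.name: looks_like_archive(p) for p in folder.iterdir()}

print(results)
```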

Edit: Dead_Eye beat me with a much better explanation.
Craig "Ichabod" O'Brien - xenomind.com
I wish you happiness.
Recommended Tutorials: BBCode, functions, classes, text adventures
#4
(Sep-15-2018, 01:26 PM)DeaD_EyE Wrote: First you create rootFolder twice, which is not the problem.
That's not possible. The file extensions in the folder are .0, .1, .2, .3, ... up to .1999
#5
Then you will have to do what I said. Try to open each one in turn (with ZipFile and RarFile), and see if you get an error. If you don't get any errors, add it to your list.
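Roughly, that could look like the sketch below. It assumes the third-party rarfile package for the RAR part and simply skips that check if the package is not installed. The helper name opens_as_archive and the sample files are made up. Since the original question wants the files that are not archives, the sketch collects the ones that fail to open.

```python
import tempfile
import zipfile
from pathlib import Path

try:
    import rarfile  # third-party package; assumed to be installed for RAR support
except ImportError:
    rarfile = None  # RAR files cannot be detected without it


def opens_as_archive(path):
    # Try zip first: ZipFile raises BadZipFile if the content is not a zip archive.
    try:
        with zipfile.ZipFile(path):
            return True
    except zipfile.BadZipFile:
        pass
    # Then try RAR, if the rarfile package is available.
    if rarfile is not None:
        try:
            with rarfile.RarFile(path):
                return True
        except rarfile.Error:
            pass
    return False


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # A real zip archive with a numeric extension, as in the question.
    with zipfile.ZipFile(folder / 'part.0', 'w') as zf:
        zf.writestr('hello.txt', 'hello')
    (folder / 'part.1').write_text('just text')

    names = sorted(p.name for p in folder.iterdir())
    not_archives = [n for n in names if not opens_as_archive(str(folder / n))]

print(not_archives)
```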