Python Forum

detecting type of compression of a file
I would like a library function that reads at most the first 4096 bytes of a named file (fewer if it can), detects whether the file is compressed, and returns an indicator of the compression type, if any:

0 for not compressed
1 ... for compressed

Alternatively, the names may be given as strings. I do not want to call an external program, since I would like this to be portable. It should support gzip, bzip2, lzma, pkzip, and xz, and anything else Python can support. An acceptable alternative is a function that determines this from a bytes string holding the start of the file (the caller reads in some of the file).
Here is an experiment in a Linux console:
$ cp spam.zip foo
$ truncate -s 4096 foo
$ file foo
foo: Zip archive data, at least v2.0 to extract
Calling the Linux file command could give you this functionality.
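As a rough sketch of what calling it from Python could look like (the helper name describe_with_file is made up for illustration, and this assumes a file executable is on the PATH, so it is not portable in the way the question asks for):

```python
import shutil
import subprocess


def describe_with_file(path):
    """Return the description printed by `file -b PATH`.

    Hypothetical helper: returns None when no `file` executable is found,
    which is the case on a stock Windows install.
    """
    exe = shutil.which("file")
    if exe is None:
        return None
    # -b (brief) suppresses the leading "filename: " in the output.
    result = subprocess.run(
        [exe, "-b", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

For the truncated archive above, describe_with_file("foo") should return the same description text that the console session shows after "foo: ".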

Python libraries for this may also exist on PyPI, such as python-magic or filetype.
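If no external dependency is wanted, the bytes-string variant from the question can be approximated with the standard library alone by comparing the start of the data against well-known magic numbers. This is a minimal sketch, not a complete detector: the function names and the string return values are my own choices, and the legacy .lzma format has no real magic number, so its entry is only a heuristic that matches the header bytes commonly produced by default encoder settings.

```python
from typing import Optional

# (name, magic prefix) pairs, checked in order.
_SIGNATURES = [
    ("gzip", b"\x1f\x8b"),
    ("bzip2", b"BZh"),
    ("xz", b"\xfd\x37\x7a\x58\x5a\x00"),
    ("zip", b"PK\x03\x04"),   # first local file header
    ("zip", b"PK\x05\x06"),   # empty archive (end-of-central-directory only)
    # Heuristic only: legacy .lzma files have no true signature, but the
    # usual properties byte 0x5d followed by the low bytes of a
    # 64 KiB-aligned dictionary size gives this prefix.
    ("lzma", b"\x5d\x00\x00"),
]


def detect_compression(head: bytes) -> Optional[str]:
    """Return a format name for the leading bytes, or None if unrecognised."""
    for name, magic in _SIGNATURES:
        if head.startswith(magic):
            return name
    return None


def detect_compression_of_file(path, limit: int = 4096) -> Optional[str]:
    """Read at most `limit` bytes from the file at `path` and classify them."""
    with open(path, "rb") as f:
        return detect_compression(f.read(limit))
```

Only the first six bytes are ever inspected, so reading 4096 bytes is more than enough. None means "no known signature", which is not proof that the data is uncompressed.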