How to detect the type of compression used on a file? (if no file extension is specified)

Binary, Compression

Binary Problem Overview


How can one detect the type of compression used on a file (assuming that no extension such as .zip, .gz or .xz is specified)?

Is this information stored somewhere in the header of that file?

Binary Solutions


Solution 1 - Binary

You can determine that it is likely to be one of those formats by looking at the first few bytes. You should then test to see if it really is one of those, using an integrity check from the associated utility for that format, or by actually proceeding to decompress.
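The "actually proceed to decompress" check can be sketched with Python's standard library. This is a minimal sketch, not a definitive implementation: `really_is` and the `DECOMPRESSORS` table are hypothetical names, only formats with a stdlib decoder are covered, and the whole input is decompressed in memory, so it suits small files only.

```python
import bz2
import gzip
import lzma
import zlib

# Hypothetical mapping from a format label to a stdlib one-shot decompressor.
DECOMPRESSORS = {
    "gzip": gzip.decompress,
    "bzip2": bz2.decompress,
    "xz": lzma.decompress,
    "zlib": zlib.decompress,
}


def really_is(fmt: str, data: bytes) -> bool:
    """Return True if `data` decompresses cleanly as format `fmt`."""
    try:
        DECOMPRESSORS[fmt](data)
        return True
    except (OSError, EOFError, ValueError, zlib.error, lzma.LZMAError):
        # Wrong format, corrupt data, or truncated stream.
        return False
```

For large files, a streaming decompressor that discards its output would be a better fit than reading everything at once.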

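You can find the header formats in the descriptions:

  • Zip (.zip) format description, starts with 0x50, 0x4b, 0x03, 0x04 (unless empty, in which case the last two bytes are 0x05, 0x06, or 0x07, 0x08 for a spanned archive)
  • Gzip (.gz) format description, starts with 0x1f, 0x8b, 0x08
  • xz (.xz) format description, starts with 0xfd, 0x37, 0x7a, 0x58, 0x5a, 0x00

Others: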

  • zlib (.zz) format description, starts with two bytes (in bits) 0aaa1000 bbbccccc, where ccccc is chosen so that the first byte times 256 plus the second byte, both read as unsigned integers, is a multiple of 31. For example, 01111000 (bits) = 120 and 10011100 (bits) = 156, and 120 * 256 + 156 = 30876, which is a multiple of 31 (31 * 996).
  • compress (.Z) starts with 0x1f, 0x9d
  • bzip2 (.bz2) starts with 0x42, 0x5a, 0x68
  • Zstandard (.zst) format description, a frame starts with the 4-byte magic number 0xFD2FB528 stored in little-endian order (so the bytes in the file are 0x28, 0xb5, 0x2f, 0xfd), a skippable frame starts with 0x184D2A5? (the question mark is any value from 0 to F), and a dictionary starts with 0xEC30A437.
  • A few more formats in the magic database from the file command
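The header checks in this list can be sketched in Python. This is a minimal sketch under stated assumptions: `sniff_compression` and the labels it returns are hypothetical, the zip, gzip and xz signatures are the commonly documented ones, and a match only means the format is likely, so it should still be confirmed by an integrity check or by decompressing.

```python
from typing import Optional

# (prefix, label) pairs. Zstandard values are the little-endian byte order
# of the magic numbers 0xFD2FB528 and 0xEC30A437.
MAGIC_PREFIXES = [
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x50\x4b\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"\x1f\x9d", "compress"),
    (b"\x42\x5a\x68", "bzip2"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"\x37\xa4\x30\xec", "zstd dictionary"),
]


def sniff_compression(head: bytes) -> Optional[str]:
    """Guess a compression format from the first bytes of a file."""
    for prefix, label in MAGIC_PREFIXES:
        if head.startswith(prefix):
            return label
    # Skippable frame: 0x184D2A5? little-endian, so the first byte is 0x50..0x5F.
    if len(head) >= 4 and (head[0] & 0xF0) == 0x50 and head[1:4] == b"\x2a\x4d\x18":
        return "zstd skippable frame"
    # zlib: first byte is 0aaa1000 and the two-byte value is a multiple of 31.
    # This is a weak test, so it runs last.
    if len(head) >= 2 and (head[0] & 0x8F) == 0x08 and (head[0] * 256 + head[1]) % 31 == 0:
        return "zlib"
    return None
```

A typical call reads only the first few bytes, for example `sniff_compression(open(path, "rb").read(6))`, where `path` is whatever file is being inspected.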

Solution 2 - Binary

If you're on a Linux box just use the 'file' command.

http://en.wikipedia.org/wiki/File_(command)

$ mv foo.zip dink
$ file dink
dink: gzip compressed data, from Unix, last modified: Sat Aug  6 08:08:57 2011, max compression
$
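
The same lookup can be scripted. The following is a minimal Python sketch that shells out to `file`; `file_mime_type` is a hypothetical helper name, and it assumes a `file` binary that accepts the `--brief` and `--mime-type` options. The exact MIME string returned depends on the installed magic database.

```python
import shutil
import subprocess
from typing import Optional


def file_mime_type(path: str) -> Optional[str]:
    """Ask the `file` utility for the MIME type of `path`.

    Returns None if `file` is not installed or the call fails.
    """
    exe = shutil.which("file")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--brief", "--mime-type", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```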

Solution 3 - Binary

As an alternative to inspecting the file header by hand, you could use some utility like TrID. The link points to the cross-platform command line version; for Windows there's a GUI, too.

Solution 4 - Binary

If you want to determine the algorithm used to compress a Linux kernel, there is a script for that; see this question and answer: https://unix.stackexchange.com/a/553192/264065

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author   Original Content on Stackoverflow
Question              22332112          View Question on Stackoverflow
Solution 1 - Binary   Mark Adler        View Answer on Stackoverflow
Solution 2 - Binary   ct_               View Answer on Stackoverflow
Solution 3 - Binary   M.Ali             View Answer on Stackoverflow
Solution 4 - Binary   Ashark            View Answer on Stackoverflow