I need to determine whether a file was compressed as .bz2 or .tar.bz2 (without relying on the file's extension)
and decompress it accordingly. I used the file command, but it gives the same result for both .bz2 and .tar.bz2. Please suggest a way to tell .bz2 and .tar.bz2 files apart.
3 Answers
Run the file through bzcat and pipe the result to file:
$ bzcat somefile.bz2 | file -
/dev/stdin: data # or whatever; this is not a .tar.bz2
$ bzcat otherfile.bz2 | file -
/dev/stdin: POSIX tar archive # this *is* a .tar.bz2
file wouldn't care: it looks at the format, not the filename.
The tar-formatted content can only be seen by uncompressing the file.
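Since the question also asks to decompress the file accordingly, here is a minimal sketch that wraps the bzcat | file - check in a function. The names classify_bz2 and extract_bz2, and the ".out" suffix for the plain-bzip2 case, are my own assumptions and not anything standard; adjust to taste.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any standard tool).

# Print "tar.bz2" if the decompressed stream looks like a tar archive,
# "bz2" for any other bzip2-compressed data, "unknown" otherwise.
classify_bz2() {
    # bzip2 -t checks integrity without writing output.
    # Note: this reads the whole file, which can be slow for large inputs.
    if ! bzip2 -t -- "$1" 2>/dev/null; then
        echo unknown
        return 1
    fi
    # file -b omits the "/dev/stdin:" prefix; "-" means read standard input.
    case "$(bzcat -- "$1" 2>/dev/null | file -b -)" in
        *"tar archive"*) echo tar.bz2 ;;
        *)               echo bz2 ;;
    esac
}

# Decompress according to the detected type.
extract_bz2() {
    case "$(classify_bz2 "$1")" in
        tar.bz2) bzcat -- "$1" | tar -xf - ;;       # unpack into the current directory
        bz2)     bzcat -- "$1" > "$1.out" ;;        # assumed output name: <file>.out
        *)       echo "not bzip2 data: $1" >&2
                 return 1 ;;
    esac
}
```

Call it as extract_bz2 somefile. The extension of somefile never enters into the decision, only the decompressed content does.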
If you have a version of file that supports the -z or -Z options, you can use either of those to try to look inside compressed files to find out what they are.
Neither option is part of the POSIX spec for file.
However, the -z option has been in the BSD file command for a very long time, since the early 2000s at least (the changelog on my system doesn't show anything before 2003). -Z was added in June 2015. BSD file is used on many current systems, including most (all?) Linux distributions, *BSD, and Mac OSX.
From man file:
-z, --uncompress
Try to look inside compressed files.
-Z, --uncompress-noreport
Try to look inside compressed files, but report information about the contents only not the compression.
For example:
# make a .tar.bz2 file
tar cfj test.tar.bz2 *
# make a .bz2 file
echo junk | bzip2 -c > junk.bz2
# try to fool `file`
cp test.tar.bz2 test.bz2
cp junk.bz2 junk.tar.bz2
echo "file -z:"
file -z *.bz2
echo
echo
echo "file -Z:"
file -Z *.bz2
Output:
file -z:
junk.bz2: ASCII text (bzip2 compressed data, block size = 900k)
junk.tar.bz2: ASCII text (bzip2 compressed data, block size = 900k)
test.bz2: POSIX tar archive (GNU) (bzip2 compressed data, block size = 900k)
test.tar.bz2: POSIX tar archive (GNU) (bzip2 compressed data, block size = 900k)
file -Z:
junk.bz2: ASCII text
junk.tar.bz2: ASCII text
test.bz2: POSIX tar archive (GNU)
test.tar.bz2: POSIX tar archive (GNU)
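If you want to act on that output in a script, something like the following might do. It is only a sketch: classify_with_file_z is a made-up name, and it assumes your file supports -z and words its descriptions like the output above ("tar archive", "bzip2 compressed data").

```shell
#!/bin/sh
# Hypothetical helper; assumes a file(1) that supports -z (not required by POSIX)
# and that its descriptions contain "tar archive" / "bzip2 compressed".
classify_with_file_z() {
    # -b drops the "name:" prefix; -z looks inside compressed data.
    desc=$(file -bz -- "$1") || return 1
    case "$desc" in
        *"tar archive"*"bzip2 compressed"*) echo tar.bz2 ;;
        *"bzip2 compressed"*)               echo bz2 ;;
        *)                                  echo unknown ;;
    esac
}
```

With the example files above, classify_with_file_z test.bz2 would be expected to print tar.bz2 and classify_with_file_z junk.tar.bz2 to print bz2, whatever the file names claim.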
tar archive", because both are (probably) files compressed by bzip2 and would be uncompressed by bunzip2. The fact that the uncompressed file is a tar archive or something else is not something that would matter to the bzip2 compressor. I'm confused by what's being asked.