I need to determine whether a file was created as .bz2 or .tar.bz2 without relying on the file's extension, and then decompress it accordingly. I tried the file command, but it gives the same result for both .bz2 and .tar.bz2. Please suggest a way to tell .bz2 and .tar.bz2 files apart.

  • By "decompress", do you mean "extract fully if it's a tar archive"? Both are (probably) files compressed by bzip2 and would be uncompressed by bunzip2. Whether the uncompressed file is a tar archive or something else does not matter to the bzip2 compressor. I'm confused by what's being asked. Commented Sep 15, 2019 at 22:20

3 Answers

3

Run the file through bzcat and pipe the result to file:

$ bzcat somefile.bz2 | file -
/dev/stdin: data               # or whatever; this is not a .tar.bz2

$ bzcat otherfile.bz2 | file -
/dev/stdin: POSIX tar archive  # this *is* a .tar.bz2
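
To go from identifying to "decompress accordingly", the same check can drive a small shell function. This is only a sketch: the name unpack_bz2, the optional destination directory, and the .out suffix used for the plain bzip2 case are assumptions made for illustration, not something from the commands above.

```shell
# unpack_bz2 FILE [DESTDIR]   (hypothetical helper, not a standard command)
# Decides by content, not by file name.
unpack_bz2() {
    src=$1
    dest=${2:-.}
    # Ask `file` about the decompressed stream, as in the commands above.
    if bzcat -- "$src" 2>/dev/null | file - | grep -q 'tar archive'; then
        # A tar archive is inside: let tar decompress and extract in one step.
        tar -xjf "$src" -C "$dest"
    else
        # Plain bzip2 data: write the decompressed bytes to DESTDIR/<name>.out
        # (the .out suffix is an arbitrary choice for this sketch).
        bzcat -- "$src" > "$dest/$(basename -- "$src").out"
    fi
}
```

Called as `unpack_bz2 somefile /tmp/out`, it either extracts the archive members into /tmp/out or writes /tmp/out/somefile.out, depending on what the decompressed data turns out to be.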
1

file wouldn't care: it looks at the format, not the filename.

The tar-formatted content can only be seen by uncompressing the file.
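
If you would rather not depend on file at all, one possible approach is to uncompress just the start of the stream and look for the "ustar" magic string that POSIX and GNU tar headers store at byte offset 257. The sketch below assumes that header layout; very old (pre-POSIX, v7) tar archives carry no magic and would not be detected. The function name is_tar_inside is hypothetical.

```shell
# is_tar_inside FILE   (hypothetical helper)
# Succeeds if the decompressed stream carries the "ustar" magic that
# POSIX and GNU tar headers store at byte offset 257.
is_tar_inside() {
    magic=$(bzcat -- "$1" 2>/dev/null |
            dd bs=1 skip=257 count=5 2>/dev/null |
            tr -d '\0')        # drop NUL bytes so the shell can hold the value
    [ "$magic" = "ustar" ]
}
```

Usage would look like `if is_tar_inside somefile; then tar -xjf somefile; fi`. Only the first few hundred bytes are read from the decompressed stream, so this stays cheap even for large files.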

1

If you have a version of file that supports the -z or -Z options, you can use either of them to look inside compressed files and find out what they contain.

Neither option is part of the POSIX spec for file.

However, the -z option has been in the BSD file command for a very long time, since at least the early 2000s (the changelog on my system doesn't show anything before 2003). -Z was added in June 2015. BSD file is used on many current systems, including most (all?) Linux distributions, the *BSDs, and macOS.

From man file:

-z, --uncompress

Try to look inside compressed files.

-Z, --uncompress-noreport

Try to look inside compressed files, but report information about the contents only not the compression.

For example:

# make a .tar.bz2 file
tar cfj test.tar.bz2 *

# make a .bz2 file
echo junk | bzip2 -c > junk.bz2

# try to fool `file`
cp test.tar.bz2 test.bz2
cp junk.bz2 junk.tar.bz2

echo "file -z:"
file -z *.bz2

echo
echo

echo "file -Z:"
file -Z *.bz2

Output:

file -z:
junk.bz2:     ASCII text (bzip2 compressed data, block size = 900k)
junk.tar.bz2: ASCII text (bzip2 compressed data, block size = 900k)
test.bz2:     POSIX tar archive (GNU) (bzip2 compressed data, block size = 900k)
test.tar.bz2: POSIX tar archive (GNU) (bzip2 compressed data, block size = 900k)


file -Z:
junk.bz2:     ASCII text
junk.tar.bz2: ASCII text
test.bz2:     POSIX tar archive (GNU)
test.tar.bz2: POSIX tar archive (GNU)
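
In a script, the same output can be matched with a case statement. The following is a rough sketch that assumes file prints descriptions in the form shown above ("... tar archive ... (bzip2 compressed data ...)"); the function name classify_bz2 and the labels it prints are made up for this example.

```shell
# classify_bz2 FILE   (hypothetical helper)
# Prints "tar.bz2", "bz2", or "other" based on what `file -z` reports.
classify_bz2() {
    desc=$(file -bz -- "$1")   # -b: omit the file name, -z: look inside
    case $desc in
        *"tar archive"*"bzip2 compressed data"*) echo tar.bz2 ;;
        *"bzip2 compressed data"*)               echo bz2 ;;
        *)                                       echo other ;;
    esac
}
```

With the test files created above, `classify_bz2 test.bz2` should report tar.bz2 and `classify_bz2 junk.tar.bz2` should report bz2, regardless of the misleading names.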
