0

How can I extract files from a .tar.gz archive while keeping a trace of the extracted files?

For example, let's say I have the following file structure...

ROOT
 ╠═▶ children
 ║    ╠═▶ joe.txt
 ║    ╚═▶ george.txt
 ╠═▶ bar.txt
 ╠═▶ foo.txt
 ╚═▶ A̲R̲C̲H̲I̲V̲E̲.t̲a̲r̲.g̲z̲
      ├─▷ children
      │    ├─▷ joe.txt
      │    └─▷ bob.txt
      ├─▷ hello.txt
      ├─▷ world.txt
      └─▷ foo.txt

Now, if I extract the files from the archive while keeping the newer files in place, I'd like to know which ones were actually extracted, so I can do something like this:
tar xf ./ARCHIVE.tar.gz --keep-newer-files | xargs -I EXTRACTED_FILE echo EXTRACTED_FILE

2 Answers

2

I would use the --to-command option to print the name of every extracted file.

Its argument is a command that tar runs for each extracted file: the command gets the file name in the TAR_FILENAME environment variable and receives the file's contents on STDIN.

So the command must read STDIN and create the extracted file itself; it can then print the name of the file it created on STDOUT.

tar -xzf ../ARCH.tgz --keep-newer-files \
   --to-command="sh -c $(printf '%q' \
  'mkdir -p "$(dirname "$TAR_FILENAME")";dd of="$TAR_FILENAME" >/dev/null 2>&1;echo "#EXTRACTED#$TAR_FILENAME" ')" > output_tar.txt

output_tar.txt will then contain one line for each extracted file.

See "Writing to an External Program" in the GNU tar manual for more details.
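
The nested quoting in the one-liner above is easy to get wrong, so here is a minimal sketch of the same idea with the command moved into a small helper script. It assumes GNU tar and GNU stat; the script name extract_and_log.sh and the demo files are made up for illustration. I am not certain that tar applies --keep-newer-files to files piped through --to-command, so the helper does the age check itself, using the TAR_MTIME variable that tar also exports.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU tar (--to-command, TAR_FILENAME, TAR_MTIME)
# and GNU stat (`stat -c %Y`). All file names below are hypothetical.
set -eu

work=$(mktemp -d)
cd "$work"

# Helper that tar runs once per regular file; the file's data arrives on stdin.
cat > extract_and_log.sh <<'EOF'
#!/bin/sh
# Skip the member if the file on disk is newer or the same age.
if [ -e "$TAR_FILENAME" ] &&
   [ "$(stat -c %Y "$TAR_FILENAME")" -ge "${TAR_MTIME%%.*}" ]; then
  cat > /dev/null   # drain stdin so tar is not left writing to a closed pipe
  exit 0
fi
mkdir -p "$(dirname "$TAR_FILENAME")"
cat > "$TAR_FILENAME"
echo "#EXTRACTED#$TAR_FILENAME"
EOF
chmod +x extract_and_log.sh

# Build a small demo archive whose members all have an old timestamp.
mkdir -p src/children dest
echo old > src/foo.txt
echo hi > src/hello.txt
echo bob > src/children/bob.txt
touch -t 200001010000 src/foo.txt src/hello.txt src/children/bob.txt
tar -czf ARCH.tgz -C src foo.txt hello.txt children

# A file on disk that is newer than its archived copy: it should be kept.
echo new > dest/foo.txt

# Extract into dest/ and collect the helper's output as the log.
(cd dest && tar -xzf ../ARCH.tgz --to-command="$work/extract_and_log.sh") > output_tar.txt
cat output_tar.txt
```

With your own archive, only the last tar line matters: run it from the directory you want to extract into. Note that files written this way get fresh timestamps and default permissions, since the helper, not tar, creates them.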

4
  • I tried it and thought it worked for a second, but then I realized that output_tar.txt contains the list of files that would be extracted, as expected, but nothing is actually extracted from the archive. Commented Oct 9, 2019 at 14:31
  • This is why the command in my example has two parts: a dd command that writes the file, and an echo that prints the file name. Commented Oct 9, 2019 at 16:27
  • Sorry, maybe I'm dumb but I don't understand. I tried to redirect stderr of dd into a file and it says dd: failed to open 'file/in/archive.txt': No such file or directory for every file that should be extracted Commented Oct 9, 2019 at 16:34
  • I managed to fix this issue by making sure the file pre-existed before calling dd. Here is the command I used for the --to-command argument of tar: sh -c $(printf '%q' 'mkdir -p $(dirname $TAR_FILENAME) ; touch $TAR_FILENAME ; dd of="$TAR_FILENAME" >/dev/null 2>&1 ; echo "#EXTRACTED#$TAR_FILENAME" ') Commented Oct 9, 2019 at 16:55
0

Depending on my requirements, I sometimes need to list every file on disk that matches an entry in the archive, whether or not the existing copy was newer.

In that case, I list the files that are actually present once the tar extraction has run, whether or not it extracted them.

I do it like this:

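tar -xzf ./ARCH.tgz --keep-newer-files
# read one name per line so file names containing spaces are not split
tar -tf ./ARCH.tgz 2>/dev/null | while IFS= read -r file
do
  # keep entries that exist on disk and are not directories
  if [[ -e $file && ! -d $file ]]
  then
    echo "$file"
  fi
done > output_tar.txt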

A less explicit but more concise solution using xargs is:

tar -tf ./ARCH.tgz 2>/dev/null | xargs -L1 bash -c 'if [[ -e $0 && ! -d $0 ]]; then echo "$0"; fi'
