20

I have numerous zip archives, each of which contains a number of zip archives. What is the best way to recursively extract all files contained within these zip archives and their child zip archives that aren't zip archives themselves?

5
  • What do you mean by extracting things that are not zip files? Do you want to copy them to another place? Commented Nov 27, 2010 at 12:30
  • I don't find your requirements clear. I find Shawn J. Goff and my interpretation about equally likely. Could you clarify? Commented Nov 27, 2010 at 13:30
  • @Gilles: Sorry, yeah, it was a bit unclear. I changed it a bit; hopefully it's clearer now. Commented Nov 28, 2010 at 0:56
  • I was going to post an answer, but I believe it should go as a comment: nested archives increase the space you need! You probably mean the Zip file format, not just gzip. Every zip file is already compressed; compressing it again just creates more overhead, effectively increasing the space needed. Commented Nov 29, 2010 at 3:44
  • 1
    Yeah, I didn't do it :P. Unfortunately I'm subjected to this bizarre way of distributing files. Commented Nov 29, 2010 at 5:56

11 Answers

16

This will extract all the zip files into the current directory, excluding any zip files contained within them.

find . -type f -name '*.zip' -exec unzip -- '{}' -x '*.zip' \;

Although this extracts the contents to the current directory, not all files will end up strictly in this directory since the contents may include subdirectories.

If you actually wanted all the files strictly in the current directory, you can run

find . -mindepth 2 -type f -exec mv -- '{}' . \;

Note: this will clobber files if there are two with the same name in different directories.
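One way to avoid that clobbering is to pick a fresh name whenever the destination already exists. A minimal sketch, assuming a find that supports -mindepth (GNU or BSD); the function name flatten_no_clobber and the numeric-suffix scheme are illustrative choices, not part of the command above:

```shell
# Move every file found in a subdirectory into the current directory.
# On a name collision, append .1, .2, ... instead of overwriting
# (the suffix scheme is an assumption made for this sketch).
flatten_no_clobber() {
    find . -mindepth 2 -type f -exec sh -c '
        for src do
            base=${src##*/}
            dest=./$base
            n=1
            while [ -e "$dest" ]; do
                dest=./$base.$n
                n=$((n + 1))
            done
            mv -- "$src" "$dest"
        done
    ' sh {} +
}
```

Run it from the directory that should receive the files; empty subdirectories are left behind.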

If you want to recursively extract all the zip files and the zips contained within, the following extracts all the zip files in the current directory and all the zips contained within them to the current directory.

while [ "`find . -type f -name '*.zip' | wc -l`" -gt 0 ]
do
    find . -type f -name "*.zip" -exec unzip -- '{}' \; -exec rm -- '{}' \;
done
2
  • 1
    this while loop helped me a lot in an ethical hacking competition where they had prepared a nested zip file going 31337 levels deep, thanks! Commented Dec 10, 2015 at 9:17
  • 4
    you might like this variant, which I use to recursively extract contents from nested ear, war, and jar files: gist.github.com/tyrcho/479c18795d997c201e53 The major difference is that it creates a nested folder for each archive. while [ "`find . -type f -name '*.?ar' | wc -l`" -gt 0 ]; do find -type f -name "*.?ar" -exec mkdir -p '{}.dir' \; -exec unzip -d '{}.dir' -- '../{}' \; -exec rm -- '{}' \;; done Commented Jan 21, 2016 at 17:17
4

As far as I understand, you have zip archives that themselves contain zip archives, and you would like to unzip nested zips whenever one is extracted.

Here's a bash 4 script that unzips all zips in the current directory and its subdirectories recursively, removes each zip file after it has been unzipped, and keeps going as long as there are zip files. A zip file in a subdirectory is extracted relative to that subdirectory. Warning: untested, make a backup of the original files before trying it out or replace rm by moving the zip file outside the directory tree.

shopt -s globstar nullglob
while set -- **/*.zip; [ $# -ge 1 ]; do
  for z; do
    ( cd -- "$(dirname "$z")" &&
      z=${z##*/} &&
      unzip -- "$z" &&
      rm -- "$z"
    )
  done
done

The script will also work in zsh if you replace the shopt line with setopt nullglob.
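If deleting the archives feels too risky, the rm step can be swapped for a move into a backup tree outside the directory being scanned, as the warning suggests. A rough sketch; stash_zip and the backup location are hypothetical names, not something the script above defines:

```shell
# Move an already-extracted archive into a backup tree, keeping its
# relative path so that same-named archives do not collide.
# Example (hypothetical paths): stash_zip ./sub/a.zip ../zip-backup
stash_zip() {
    src=$1
    backup_root=$2
    rel=${src#./}                       # drop a leading "./" if present
    mkdir -p -- "$backup_root/$(dirname -- "$rel")" &&
        mv -- "$src" "$backup_root/$rel"
}
```

Wiring it into the loop would mean calling it from the top-level directory rather than inside the subshell that has already changed directory.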

Here's a portable equivalent. The termination condition is a little complicated because find does not spontaneously return a status to indicate whether it has found any files. Warning: as above.

while [ -n "$(find . -type f -name '*.zip' -exec sh -c '
    z=$1 &&
    cd -- "${z%/*}" &&
    z=${z##*/} &&
    unzip -- "$z" 1>&2 &&
    rm -- "$z" &&
    echo 1
' sh {} \;)" ]; do :; done
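The point about find's exit status can be checked directly: find exits 0 even when nothing matches, so a loop condition has to look at its output instead. A small sketch (has_zip is a hypothetical helper name):

```shell
# Succeeds only if at least one regular *.zip file exists below the
# given directory (default: the current directory). find prints one
# path per match, so any output at all means "found something".
has_zip() {
    find "${1:-.}" -type f -name '*.zip' | grep -q .
}

# Possible use as a loop condition:
#   while has_zip .; do ...extract and remove...; done
```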
2

This Perl script will extract each .zip file into its own subdirectory. Run the script more than once to handle nested zip files. It does not delete .zip files after extraction, but you could make that change by adding an unlink() call.

#!/usr/bin/perl -w

# This script unzips all .zip files it finds in the current directory
# and all subdirectories.  Contents are extracted into a subdirectory
# named after the zip file (eg. a.zip is extracted into a/).
# Run the script multiple times until all nested zip files are
# extracted.  This is public domain software.

use strict;
use Cwd;

sub process_zip {
    my $file = shift || die;
    (my $dir = $file) =~ s,/[^/]+$,,;
    (my $bare_file = $file);
    $bare_file =~ s,.*/,,;
    my $file_nopath = $bare_file;
    $bare_file =~ s,\.zip$,,;
    my $old_dir = getcwd();
    chdir($dir) or die "Could not chdir from '$old_dir' to '$dir': $!";
    if (-d $bare_file) {
        chdir($old_dir);
        # assume zip already extracted
        return;
    }
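    mkdir($bare_file) or die "Could not mkdir '$bare_file': $!";
    chdir($bare_file) or die "Could not chdir to '$bare_file': $!";
    # The list form of system() bypasses the shell, so quotes or spaces
    # in the file name cannot break the command.
    system('unzip', "../$file_nopath");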
    chdir($old_dir);
}

my $cmd = "find . -name '*.zip'";
open(my $fh, "$cmd |") or die "Error running '$cmd': $!";
while(<$fh>) {
    chomp;
    process_zip($_);
}
1

unzip doesn't do this, because the UNIX way is to do one thing and do that well, not handle all crazy special cases in every tool. Thus, you need to use the shell (which does the job of "tying things together" well). This makes it a programming question, and since ALL possible programming questions have been answered on StackOverflow, here: How do you recursively unzip archives in a directory and its subdirectories from the Unix command-line?

2
  • 1
    I would definitely not call "using the shell" a programming question, and "shell scripting" is listed in the FAQ as on-topic Commented Nov 27, 2010 at 18:02
  • I was not meaning to imply that it was off-topic here at all, I just wanted to justify why it's on-topic at StackOverflow. Commented Nov 29, 2010 at 7:43
1

You'll want to be careful automatically unzipping zip files inside of zip files:

http://research.swtch.com/2010/03/zip-files-all-way-down.html

It's possible to concoct a zip file that produces a zip file as output, which produces a zip file as output, and so on. That is, you can make a zip file that's a fixed point of "unzip" the program.

Also, I seem to recall people making zip files that would "explode"; that is, a very small zip file would unzip to multiple gigabytes of output. This is a facet of the method of compression.
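One modest safeguard against archives that keep producing more archives is to cap the number of extraction passes. A sketch under that assumption; extract_bounded and the default of 10 passes are illustrative choices, and this does nothing about archives that simply expand to a huge size:

```shell
# Extract nested zips into the current directory, but stop after a
# fixed number of passes so a self-reproducing archive cannot keep
# the loop running forever. Returns 1 if zip files still remain.
extract_bounded() {
    max_passes=${1:-10}
    pass=0
    while [ -n "$(find . -type f -name '*.zip' | head -n 1)" ]; do
        if [ "$pass" -ge "$max_passes" ]; then
            echo "giving up after $pass passes; zip files remain" >&2
            return 1
        fi
        pass=$((pass + 1))
        # rm only runs for archives that unzip extracted successfully
        find . -type f -name '*.zip' -exec unzip -- '{}' \; -exec rm -- '{}' \;
    done
}
```

Calling extract_bounded 5 would allow at most five passes before reporting that archives are left over.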

1

Maybe this will help (worked for me):

function unzipAll(){
    # find and count archives
    archLst=`find . -type f -name "*.*ar"`
    archLstSize=`echo $archLst | awk 'END{print NF}'`

    # while archives exist, run the extract loop
    while [ "$archLstSize" -gt 0 ]; do
        # extract and remove all archives (found on a single iteration)
        for x in $archLst; do
            mv "${x}" "${x}_"
            unzip "${x}_" -d "${x}" && rm "${x}_"
        done

        # find and count archives again
        archLst=`find . -type f -name "*.*ar"`
        archLstSize=`echo $archLst | awk 'END{print NF}'`
    done
}
1

The easiest way is to use atool: http://www.nongnu.org/atool/ It is a very good script that uses programs such as zip, unzip, tar, and rar to extract any archive.

Use atool -x package_name.zip to unzip an archive, or, if you want to use it in a directory with many zip files, use a simple for loop:

for f in *; do atool -x "$f"; done (you will have to cd into the desired directory with the zip files before you use this).

2
  • 1
    atool's behaviour here doesn't differ significantly from unzip I'd say, it doesn't recursively extract ZIP files either. Commented Nov 29, 2010 at 7:50
  • @Thomas Themel: Are you sure that it doesn't recursively extract ZIP files? It can extract tar.gz from deb files recursively, but I have no time at the moment to test it with nested zip archives :\ Commented Dec 1, 2010 at 22:58
1

I needed a solution like Gilles' from 2010, except I needed to preserve the folder structure, not unzip everything into the top-level directory. Here's my take on his, with three lines added/changed:

#!/bin/bash
shopt -s globstar nullglob
while set -- **/*.zip; [ $# -ge 1 ]
do
    for z
    do
        ( cd -- "$(dirname "$z")" &&
            z=${z##*/} &&
            cp -- "$z" "$z".bak &&
            mkdir -- "$z"dir &&
            unzip -- "$z" -d "$z"dir &&
            rm -- "$z"
        )
    done
done
0
1

To extract recursively while keeping a proper tree structure, we can use

while [ "`find . -type f -name '*.zip' | wc -l`" -gt 0 ]
do
   find . -name '*.zip' -exec sh -c 'unzip -o -d "${0%.*}" "$0"' '{}' ';' -exec rm -- '{}' \;
done
  • find . -name '*.zip': finds the files with a .zip extension in the current directory and its subdirectories
  • unzip -o: unzips the file and overwrites any existing file with the same name
  • -d "${0%.*}": extracts into a directory named after the zip file, minus its .zip extension
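The "${0%.*}" part can be tried on its own. A small illustration, using made-up paths: the %.* expansion strips the shortest trailing match of ".*", and sh -c puts the first argument after the script text into $0.

```shell
zip_path=./downloads/archive.v2.zip   # hypothetical example path
target_dir=${zip_path%.*}             # removes only the final ".zip"
printf '%s\n' "$target_dir"           # prints ./downloads/archive.v2

# With sh -c, the argument after the script text becomes $0, which is
# how the find command above hands each path to the inline script.
sh -c 'printf "%s\n" "${0%.*}"' ./nested/inner.zip   # prints ./nested/inner
```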
0

Check out nzip, a Java-based utility for nested zip files. Nested zips can be listed, extracted, and compressed with the following commands:

java -jar nzip.jar -c list -s readme.zip

java -jar nzip.jar -c extract -s "C:\project\readme.zip" -t readme

java -jar nzip.jar -c compress -s readme -t "C:\project\readme.zip"

PS. I am the author and will be happy to fix any bugs quickly.

0

To avoid having to extract the intermediary zip files on disk, you could combine bsdtar (which can extract zip archives and convert to tar format on the fly) and GNU tar (which can pipe the contents of archive members to other commands instead of extracting them on disk) like so:

bsdtar cf - @file.zip | # convert to tar format on the fly for GNU tar
  tar -xf - --to-command='
    case $TAR_FILENAME in
      (*.zip) bsdtar xvvf - # pipe zip archive members to bsdtar for extraction
    esac'

To do it with more than one zip file, with GNU utilities:

find . -name '*.zip' -printf '@%p\0' |
  CMD='
    case $TAR_FILENAME in
      (*.zip) bsdtar xvvf - # pipe zip archive members to bsdtar for extraction
    esac
  ' xargs -r0 sh -c '
    bsdtar cf - "$@" |
      tar -xf - --to-command="$CMD"' sh
