Suppose I have a large zip file (>50 GB) and I want to extract some files from it on the command line.

To list the candidate files I run:

unzip -l myfile.zip | grep "foo"

which gives me a list of zip entries. How do I extract only the files that pass the grep filter? I tried piping the result to xargs unzip -j, but I'd like a cleaner solution, since each listing line carries extra information that has to be stripped first.

  • You can just do unzip -j myfile.zip '*foo*' Commented Jul 28, 2017 at 15:20
  • Nice and clean. Commented Jul 28, 2017 at 15:32
  • Be careful if you're grepping for anything that unzip -l also reports in the header or trailer, e.g. "archive", "zip", "length", "date", "time", "name", "file", or numbers that match the total bytes or number of files. Commented Jul 28, 2017 at 16:32

1 Answer

Stéphane has the right idea: pass unzip a wildcard that matches the filenames you'd like to extract. Parsing the output of unzip -l means you have to watch out for the header and trailer lines that come along with it.

Use something like:

unzip -j myfile.zip '*foo*'

being careful to quote the wildcards from the shell.
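To see why the quoting matters, here is a small sketch that needs no archive at all. It creates a scratch directory containing a file whose name happens to match the pattern (the name myfoo.txt is made up for the demo) and compares what a command receives with and without quotes:

```shell
# Scratch directory with one file whose name happens to match the pattern.
# (myfoo.txt is a hypothetical name used only for this demonstration.)
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch myfoo.txt

# Unquoted: the shell expands the glob before the command ever sees it.
unquoted=$(printf '%s\n' *foo*)

# Quoted: the pattern is passed through literally, so a command such as
# unzip can match it against the entry names inside the archive instead.
quoted=$(printf '%s\n' '*foo*')

echo "unquoted -> $unquoted"   # prints: unquoted -> myfoo.txt
echo "quoted   -> $quoted"     # prints: quoted   -> *foo*
```

With the unquoted form, unzip would be asked for an entry literally named myfoo.txt rather than for everything matching *foo* in the archive.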

If you do continue down the path of grepping unzip's output, strip the header and trailer lines and reduce each remaining line to the filename column:

unzip -l myfile.zip | sed '1,3d; /---------                     -------/d; $d'|cut -c31-
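The same pipeline can be wrapped in a small function so the grep runs on bare entry names only. This is a sketch, not a robust parser: the function name zip_entry_names is made up here, the trailer rule is loosened to match any line starting with nine dashes, and it assumes the Info-ZIP listing layout in which names begin at column 31 (which in turn assumes every size fits in the nine-character Length column).

```shell
# Hypothetical helper: read `unzip -l` output on stdin, print entry names.
# Assumes Info-ZIP's column layout, with the Name column starting at
# character 31 and a three-line header / two-line trailer.
zip_entry_names() {
    # 1,3d         drop "Archive:", the column titles and the dashed rule
    # /^-{9}.../d  drop the dashed rule that precedes the totals line
    # $d           drop the totals line itself
    sed '1,3d; /^---------/d; $d' | cut -c31-
}
```

It would be used as unzip -l myfile.zip | zip_entry_names | grep "foo". If the resulting names are handed back to unzip, keep in mind that unzip treats its file arguments as wildcard patterns, so names containing characters such as * or [ may need escaping.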
