139

I'd like to download and extract an archive under a given directory. Here is how I've been doing it so far:

wget http://downloads.mysql.com/source/dbt2-0.37.50.3.tar.gz
tar zxf dbt2-0.37.50.3.tar.gz
mv dbt2-0.37.50.3 dbt2

I'd like instead to download and extract the archive on the fly, without having the tar.gz written to the disk. I think this is possible by piping the output of wget to tar, and giving tar a target, but in practice I don't know how to put the pieces together.

6 Answers

184

You can do it by telling wget to output its payload to stdout (with the flag -O-) and suppress its own output (with the flag -q). The output is then made the input (via stdin) to the tar command by a pipe (|):

wget -qO- your_link_here | gunzip | tar xvf -

f - tells tar to read the archive from stdin. In some tar implementations stdin is the default; in others the default is often a tape device.

Some tar implementations can detect compression and decompress by themselves, in which case you can remove the | gunzip. Others support a z option to decompress gzip-compressed archives on the fly (often by invoking gunzip themselves).

To specify a target directory, if your tar supports -C:

wget -qO- your_link_here | gunzip | tar xvf - -C /target/directory

If not:

(cd /target/directory && wget -qO- your_link_here | gunzip | tar xvf -)
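Both variants can be tried without network access. The following is a minimal sketch, not part of the original answer: it builds a small archive locally and uses cat as a stand-in for wget -qO- your_link_here. The temporary directory and file names are made up for illustration, and it assumes a tar that supports z when creating the test archive.

```shell
# Build a tiny gzip-compressed archive to stand in for the remote file.
work=$(mktemp -d)
mkdir -p "$work/src/dbt2-0.37.50.3" "$work/target" "$work/target2"
echo 'hello' > "$work/src/dbt2-0.37.50.3/README"
tar -czf "$work/archive.tar.gz" -C "$work/src" dbt2-0.37.50.3

# Variant 1: tar changes directory itself (needs -C support).
# cat plays the role of `wget -qO- your_link_here`.
cat "$work/archive.tar.gz" | gunzip | tar xf - -C "$work/target"

# Variant 2: change directory in a subshell, for tars without -C.
(cd "$work/target2" && cat "$work/archive.tar.gz" | gunzip | tar xf -)

ls "$work/target" "$work/target2"
```

In both cases the compressed archive is only ever streamed through the pipe; swapping cat for wget means no tar.gz is written to disk.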

If you happen to have GNU tar, you can also rename the output dir:

wget -qO- your_link_here | tar --transform 's/^dbt2-0.37.50.3/dbt2/' -xvzf -

In libarchive's tar (bsdtar) or star, the equivalent is the -s/pattern/replacement/ option, as in the standard pax command.
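Another way to get the same layout, assuming your tar supports --strip-components (GNU tar and bsdtar do), is to create the dbt2 directory yourself and drop the archive's top-level directory during extraction. This is only a sketch of that alternative: cat stands in for wget -qO- your_link_here, and the temporary paths are made up so the example runs offline.

```shell
# Build a tiny archive whose single top-level directory is dbt2-0.37.50.3/.
work=$(mktemp -d)
mkdir -p "$work/src/dbt2-0.37.50.3" "$work/dbt2"
echo 'hello' > "$work/src/dbt2-0.37.50.3/README"
tar -czf "$work/archive.tar.gz" -C "$work/src" dbt2-0.37.50.3

# Strip the leading path component and extract into dbt2/ instead.
# cat plays the role of `wget -qO- your_link_here`.
cat "$work/archive.tar.gz" | tar -xzf - -C "$work/dbt2" --strip-components=1

ls "$work/dbt2"
```

Unlike --transform, this does not need to know the name of the top-level directory, only that there is exactly one.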

5
  • 3
    To specified path should be: wget -qO- your_link_here | tar xvz - -C /target/directory Commented Sep 12, 2018 at 12:10
  • maybe just tell people to use tar instead wget then? Commented May 19, 2019 at 4:29
  • 4
    wget -qO- <url> | tar -xvz -C <target folder> worked on gnu tar. Commented Jun 22, 2019 at 14:22
  • will this require less than double the space of the archive on my disk? right now I'm having to unpack a huge tar and I'm looking for a way to optimize the operation space-wise Commented Jun 17, 2021 at 17:28
  • 2
    You don’t have to specify the stdin -. Commented Jun 22, 2021 at 10:08
49

Another option is to use curl which writes to stdout by default:

curl -s -L https://example.com/archive.tar.gz | tar xvzf - -C /tmp
8
  • 3
    I like your option more than others but curl -s some_url | tar xvz - -C /tmp Commented Mar 18, 2019 at 17:44
  • 4
    as FiftiN suggested -> e.g. to view a filtered list of files inside repository one could use: $ curl -L https://api.github.com/repos/repo_owner/repo_name/tarball | tar tvfz - -C /tmp --wildcards *.py Commented Apr 24, 2019 at 9:45
  • 7
    Better curl with "-L" to follow redirects Commented Mar 27, 2020 at 15:37
  • 1
    What does the standalone - after tar xvz do? Does that mean STDIN instead of file? Commented Jul 4 at 18:48
  • 1
    Yes, that's it exactly @isapir Commented Jul 15 at 10:04
14

This one-liner does the trick:

tar xvzf - -C /tmp/ < <(wget -q -O - http://foo.com/myfile.tar.gz)

Short explanation: the command inside the parentheses is wget, where -q tells it to run quietly and -O - makes it write the download to stdout.

Bash's process substitution operator <( ... ) runs that command with its output connected to a pipe and replaces itself with a file name for that pipe (a /dev/fd entry or a named pipe, depending on the system). The < redirection operator then feeds the contents of that file to tar's stdin.
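To see process substitution at work without network access, here is a small sketch under a few assumptions: bash is installed (the snippet calls it explicitly, since plain sh may not support <( ... )), cat stands in for wget -q -O -, and the temporary paths are invented for the example.

```shell
# Build a tiny gzip-compressed archive to stand in for the download.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/out1" "$work/out2"
echo 'hello' > "$work/src/file.txt"
tar -czf "$work/archive.tar.gz" -C "$work/src" file.txt

# Show what <(...) expands to: a path to the pipe (the exact name varies).
bash -c 'echo <(true)'

# Form 1: redirect the substituted file to tar's stdin and read it with -f -.
bash -c 'tar -xzf - -C "$2" < <(cat "$1")' _ "$work/archive.tar.gz" "$work/out1"

# Form 2: hand the substituted path directly to -f.
bash -c 'tar -xzf <(cat "$1") -C "$2"' _ "$work/archive.tar.gz" "$work/out2"
```

Either way, -f must be followed immediately by its argument (- or the substituted path); otherwise tar treats the next word, such as -C, as the archive name.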

2
  • 1
    This would need -f - (for stdin) or -f <(wget... to work. Commented Nov 20, 2019 at 17:33
  • 1
    This should be something like tar zxvf - < <(wget -q -O - https://github.com/peak/s5cmd/releases/download/v2.1.0/s5cmd_2.1.0_Linux-64bit.tar.gz) Commented Jul 3, 2023 at 6:34
2

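A process substitution solution that passes the pipe to tar as its archive file. Mind the order of tar's flags: -f must be immediately followed by the archive name, so it goes last, after -xvz and -C: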

tar -xvz -C /tmp/ -f <(wget -q -O - https://github.com/user/repo/release/download/v/v.tar.gz)
2

A one-liner that handles redirects and can extract tar.bz2 files. Use tar xz instead of tar xj to extract gzip-compressed files.

curl -L https://downloads.getmonero.org/cli/linux64 | tar xj
0

The extraction part should take its input from stdin, which means passing - as the argument to -f. We may need tar -xzvf - -C <output_dir>

Example:


# This may not work: tar takes -C as the archive name.
# It might complain:
# tar (child): -C: Cannot open: No such file or directory
wget -qO - https://dlcdn.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3-scala2.13.tgz | tar -xzvf -C /opt/spark --strip-components=1


# This should work: -f is followed by - (stdin).
wget -qO - https://dlcdn.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3-scala2.13.tgz | tar -xzvf - -C /opt/spark --strip-components=1


1
  • How would one go about this using the wget -N flag? So only do this if the downloaded file has changed? I would imagine I would need to save the existing file for that? Commented Feb 13, 2023 at 21:43
