Archiving stdout to multiple tapes

I have large files which are generated on the fly to stdout, one every 24 hours. I would like to archive these files progressively on tapes, ideally in a single archive which potentially spans multiple tapes.

Tar is very good at managing the tapes, as it has built-in functionality to append to an archive and to load the next tape. But it is very poor at accepting data from stdin. No matter what I do, it ends up writing a special file (a link or named pipe) to the archive instead of its content.

Here are the example commands that I have been trying. On the first day, generate a new archive:

ln -s /dev/stdin day1 # or use the --transform option of tar
data_generator | tar -c -h -M -f /dev/nst0 -H posix -F 'mtx -f /dev/sch0 next' day1

The next day, I would like to just change -c to -A and save the new stream into a new file appended to the tar archive, loading a new tape when it becomes necessary.

data_generator | tar -A -h -M -f /dev/nst0 -H posix -F 'mtx -f /dev/sch0 next' day2

As I said, all I find in the archive is a named pipe (with -h) or a symlink (without -h).
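For reference, a way to check what actually ended up on the tape is to rewind and list the first volume (the listed line below is only indicative):

mt -f /dev/nst0 rewind
tar -t -v -M -f /dev/nst0 -F 'mtx -f /dev/sch0 next'
# lists something like:  lrwxrwxrwx root/root 0 ... day1 -> /dev/stdin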

Some ideas that I have tried and are not good:

  1. Using split instead of tar is not viable, because it is too basic. It can only split into chunks of a pre-defined size (not good if I do not start from the beginning of the tape), and it cannot concatenate the different days into an unpackable archive (see the sketch after this list). Tar does not need to know the size of the data nor of the tape; it just switches to a new tape when it gets a write error.
  2. I've read the manuals of cpio, star and dar. I do not get the impression that they cope with pipes any better than tar.
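For clarity, this is roughly what the split-based approach from item 1 would look like (the 100G chunk size is arbitrary, purely for illustration):

data_generator | split -b 100G - day2.part.
# produces day2.part.aa, day2.part.ab, ... each of a fixed, pre-chosen size;
# every part then has to be written and tracked separately, there is no single
# archive spanning the parts, and the chunk size cannot adapt to the space
# left on the current tape.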

Thank you for any hints.

Edit: I'm starting to think that it is impossible with tar, because it needs to know the size of the file before starting to write. In fact, an archive that can be grown by appending is very tricky to produce if you have to write the size before the content.
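To make the size problem concrete: in the posix/ustar format each member starts with a 512-byte header, and the member's size is stored in that header (as an octal string at byte offset 124), so tar has to know it before the first data block is written. A quick check on a throwaway archive (some_file stands for any regular file):

tar -cf /tmp/t.tar some_file
dd if=/tmp/t.tar bs=1 skip=124 count=12 2>/dev/null   # prints the octal size field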
