
I am transferring large files (600GB+) from one machine to another, and I tar them up using:

tar -cpvzf file.tar.gz -C PATH_TO_DIR DIR

Once finished with the tarring process, the following is done:

split -d -b 2G file.tar.gz file_part_

This creates file_part_00, file_part_01, ... until the whole file is split into 2GB chunks. Before transferring the file, I loop through each part in the directory where the tar was split and collect its md5 hash using an equivalent to:

md5sum PART_NAME >> list_md5.start

Once each part has been hashed, I do the following:

sort -u list_md5.start

(This sorts them and removes duplicates, just to be safe ya know)
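The hashing loop described above could look roughly like the sketch below. It assumes GNU coreutils; `hash_parts` and its two arguments are illustrative names, not part of my actual script. One detail worth flagging: a bare `sort -u FILE` only prints the sorted result to stdout, so the sketch uses `-o` to write it back into the list.

```shell
# Sketch only: hash_parts and its arguments are illustrative names.
hash_parts() {
    dir=$1    # directory containing file_part_00, file_part_01, ...
    list=$2   # checksum list to create
    : > "$list" || return 1
    for part in "$dir"/file_part_*; do
        # Record only the base name so the list is valid on either machine
        ( cd "$dir" && md5sum "$(basename "$part")" ) >> "$list" || return 1
    done
    # Plain "sort -u FILE" only prints to stdout; -o writes the result back
    sort -u -o "$list" "$list"
}
```

Called as `hash_parts /path/to/parts list_md5.start`, this leaves one `<hash>  <name>` line per part in the list.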

The parts are then transferred one by one, in the order they appear in list_md5.start. Once they arrive on the other computer, their md5 hashes are collected using the same method but in a different list; let's call it list_md5_2.start. After the transfer, before putting the parts back together, I run the following:

diff list_md5.start list_md5_2.start

If no difference is found, I continue to the next part. Otherwise, I give up and delete all the parts. When it comes to putting them back together I do the following:

cat file_part_* > file.tar.gz.incomplete

(The .incomplete suffix is there because I have a watchdog waiting to untar any .tar.gz it comes across.) Once the cat is done, the file is renamed using:

mv file.tar.gz.incomplete file.tar.gz
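The assemble-then-rename step can be sketched with an integrity check in between, so a truncated archive never gets the .tar.gz name. This is a sketch under assumptions, not what my script currently does: `assemble` is a hypothetical helper, and it relies on `gzip -t`, which reads the whole compressed stream and fails on an unexpected end of file.

```shell
# Sketch only: assemble is a hypothetical helper name.
assemble() {
    dir=$1   # directory holding the verified parts
    out=$2   # final archive name, e.g. file.tar.gz
    # The glob expands in sorted order, so the numeric suffixes written
    # by "split -d" are concatenated in the order split produced them.
    cat "$dir"/file_part_* > "$out.incomplete" || return 1
    # gzip -t decompresses to nowhere and fails on a truncated stream,
    # so a bad archive is caught before the watchdog ever sees it.
    gzip -t < "$out.incomplete" || return 1
    mv "$out.incomplete" "$out"
}
```

If `gzip -t` fails here, the .incomplete file is left in place for inspection and the rename never happens.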

At this point, the watchdog detects it and untars it using:

tar -C DEST -xzf file.tar.gz --totals --unlink-first --recursive-unlink

At this point, I get an error I can't debug:

Tar Failed 2
gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
 /PATH/TO/DEST

After untarring, the tar is removed regardless of whether it failed or not (no point in keeping large files that failed to untar).
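For debugging, a variant of the watchdog step that keeps a failed archive around might look like the sketch below. It assumes GNU tar; `extract_or_keep` and the `.failed` suffix are hypothetical names chosen so that the watchdog would not pick the file up again.

```shell
# Sketch only: extract_or_keep and the .failed suffix are hypothetical.
extract_or_keep() {
    archive=$1   # e.g. file.tar.gz
    dest=$2      # existing destination directory
    if tar -C "$dest" -xzf "$archive" --totals --unlink-first --recursive-unlink; then
        # Extraction succeeded, so the archive is no longer needed
        rm -f "$archive"
    else
        status=$?
        echo "Tar Failed $status" >&2
        # Keep the broken archive under a name the watchdog ignores
        mv "$archive" "$archive.failed"
        return "$status"
    fi
}
```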

It is worth noting that sometimes the md5sums don't match up, which also stops the process (this is checked before the cat assembling step).
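To see which part is affected when the hashes don't match, the check can be done per file instead of by diffing two lists. A minimal sketch, assuming GNU md5sum, and assuming the sender's list was copied to the receiving machine and records bare file names; `verify_parts` is a hypothetical helper:

```shell
# Sketch only: verify_parts is a hypothetical helper name.
verify_parts() {
    dir=$1    # directory the parts were transferred into
    list=$2   # absolute path to the sender's checksum list
    # md5sum -c re-hashes each file named in the list and compares;
    # it exits non-zero if any part is missing or differs, and with
    # --quiet it only reports the parts that failed.
    ( cd "$dir" && md5sum -c --quiet "$list" )
}
```

A non-zero exit from `verify_parts` corresponds to the "give up" branch described earlier, with the failing part names printed.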

I have tried ensuring the names were not invalid. I've tried changing the part size to smaller sizes. I've tried manually going through the process and still got either an md5sum mismatch or the EOF error.

This is all done on Ubuntu machines, both of which are fully updated (no updates pending).

Does anyone have an idea as to how to solve this issue?
