Kusalananda

The sed, perl and awk commands that you mention may be correct, but they all read the compressed data and count the newline characters in that. Those newline characters have nothing to do with the newline characters in the uncompressed data.

To count the number of lines in the uncompressed data, there is no way around uncompressing it. Your approach with zcat is the correct one, and since the data is so large, uncompressing it will take time.

Most utilities that deal with gzip compression and decompression will most likely use the same shared library routines to do so. The only way to speed up the decompression would be to find an implementation of the zlib routines that is somehow faster than the default one, and rebuild e.g. zcat to use it.
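To illustrate the difference between the two counts, here is a minimal sketch in Python rather than the shell tools from the question. The function names and the sample file are made up for illustration. The first function streams the data through the standard gzip module and counts newlines in the uncompressed output, which is the same work that zcat piped to wc -l does. The second counts newline bytes in the compressed file as it sits on disk, which is roughly what the sed, perl and awk commands end up doing.

```python
import gzip
import os
import tempfile


def count_lines_gz(path, chunk_size=1 << 20):
    """Count newline characters in the uncompressed contents of a gzip file.

    The data is decompressed in chunks, so memory use stays bounded,
    but every byte still has to be decompressed.
    """
    total = 0
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(b"\n")
    return total


def count_newline_bytes_raw(path):
    """Count newline bytes in the compressed file itself (not meaningful)."""
    with open(path, "rb") as raw:
        return raw.read().count(b"\n")


if __name__ == "__main__":
    # Hypothetical sample file, created only for this demonstration.
    tmpdir = tempfile.mkdtemp()
    sample = os.path.join(tmpdir, "sample.gz")
    with gzip.open(sample, "wb") as out:
        out.write(b"some line of text\n" * 1000)

    print("lines in uncompressed data:", count_lines_gz(sample))
    print("newline bytes in compressed file:", count_newline_bytes_raw(sample))
```

The first number is the line count of the original data. The second depends only on which byte values happen to occur in the compressed stream, so it will generally be unrelated to the first. Note that this sketch is not expected to be faster than zcat, since Python's gzip module relies on zlib as well.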
