
> makes grep think the file is binary because it is binary. The thing is, you emptied the file, but didn't stop the program that was filling it.

>output.txt creates output.txt if it doesn't exist, and truncates it to zero length if it does.

At the point you run >output.txt, there is a tee process which has the file open. Truncating the file doesn't affect the position at which tee is writing. Let's say it had written N bytes before the truncation. The next time tee writes after the truncation, it will start writing at position N. Writing at a position beyond the current end of a file is allowed, and the gap (here, positions 0 through N-1 at the beginning of the file) reads back as null bytes.¹ That's what happened here.
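The effect can be reproduced without nc or tee. The following is a minimal sketch, not the original setup: a brace group redirected with > stands in for the long-running tee, and the temporary directory and file name are placeholders chosen for the demonstration.

```shell
#!/bin/sh
# Sketch: a writer that holds a file open keeps its offset across a truncation.
# The temporary directory and file name are placeholders for this demo.
dir=$(mktemp -d) || exit 1
f="$dir/output.txt"

{
    printf 'first line\n'    # 11 bytes; the writer's offset is now 11
    : > "$f"                 # truncate via a second, independent open
    printf 'second line\n'   # 12 bytes, written at offset 11, not at 0
} > "$f"

size=$(wc -c < "$f")
echo "size: $size bytes"     # 11 null bytes followed by 12 bytes of text
od -c "$f"                   # shows the run of \0 before "second line"
```

The file ends up 23 bytes long even though only 12 bytes were written after the truncation; the first 11 bytes are the hole.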

Grep sees a file that begins with some null bytes. It correctly reports the file as binary.

You can tell GNU grep to treat the file as text by calling grep -a. It will search the whole file, including the null bytes (which don't match, so they don't affect the result unless there's a match on the first line, but they may cause a slowdown if there are a lot of them).
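To see the difference -a makes, here is a small sketch that assumes GNU grep. The file is built by hand to imitate the damaged output.txt; its contents and the search patterns are made up for illustration.

```shell
#!/bin/sh
# Sketch, assuming GNU grep. The file below imitates the damaged output.txt.
dir=$(mktemp -d) || exit 1
f="$dir/output.txt"
printf '\000\000\000\000second line\nthird line\n' > "$f"

grep 'line' "$f"             # GNU grep only reports that a binary file matches
grep -a 'third' "$f"         # -a: treat the file as text and print the matching line
matches=$(grep -a -c 'line' "$f")
echo "lines matching with -a: $matches"
```

With -a, the line that starts with null bytes is still matched and printed, null bytes included, so the output may look odd in a terminal.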

A better solution is to tell tee to always write at the current end of the file. Fortunately (as Stephane Chazelas remarked), there's an option for that: tee -a (present on all POSIX-compliant systems). Since tee -a doesn't empty the file when it starts, you'll need to truncate the file yourself first if you want to begin with an empty file.

>output.txt
nc -l -k -p 9100 | tee -a output.txt
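Why append mode helps can be checked with a small experiment. This is a sketch under the assumption that tee -a opens its output in append mode, the same way the shell's >> does; the brace group stands in for tee, and the paths are placeholders.

```shell
#!/bin/sh
# Sketch: an appending writer always writes at the current end of the file,
# so truncating the file under it leaves no hole.
dir=$(mktemp -d) || exit 1
f="$dir/output.txt"

{
    printf 'first line\n'
    : > "$f"                 # truncate while the appending writer is still open
    printf 'second line\n'   # append mode: lands at the new end of file, offset 0
} >> "$f"

size=$(wc -c < "$f")
echo "size: $size bytes"     # 12 bytes: only "second line\n", no null bytes
cat "$f"
```

Once the writer is in append mode, running >output.txt at any later time simply empties the file and the next write starts at the beginning.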

¹ Most filesystems allow blocks that would consist entirely of null bytes to remain unallocated. A file stored this way is called a sparse file; it is in effect a specialized form of compression.
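Whether a hole is really left unallocated can be checked by comparing a file's length with the space it occupies. A minimal sketch follows; the file name and the 1 MiB offset are arbitrary, and the allocated figure depends on the filesystem, so no particular value is claimed for it.

```shell
#!/bin/sh
# Sketch: compare the logical size of a file with the space allocated for it.
dir=$(mktemp -d) || exit 1
f="$dir/sparse.bin"

# Write a single byte at offset 1 MiB; everything before it is a hole.
dd if=/dev/zero of="$f" bs=1 count=1 seek=1048576 2>/dev/null

apparent=$(wc -c < "$f")            # logical size: 1048577 bytes
allocated=$(du -k "$f" | cut -f1)   # KiB actually allocated; filesystem-dependent
echo "apparent size: $apparent bytes, allocated: $allocated KiB"
```

On a filesystem that supports sparse files, the allocated figure is typically far smaller than the apparent size.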

Gilles 'SO- stop being evil'