Iterating over the contents of a file

Question

I understand that the output of cat file.txt is treated as input for the next command in the pipe. Yet I don't understand how while, in conjunction with read, traverses the file from one line to the next.

cat file.txt | while read line 
do
   # do something with $line here
done

It relies on the exit status of the read command - see the somewhat related What does while read -r line || [[ -n $line ]] mean?
– steeldriver, Commented Jul 11, 2024 at 16:51
In case the answers don't make this clear, neither while nor read traverses the file or even has any knowledge of it. Only cat file.txt reads the file; the other commands read the output of cat. So while and read only read from their standard input; they don't open, read, or interact with file.txt in any way.
– terdon ♦, Commented Jul 11, 2024 at 17:48
One factor you might be missing is that the pipe mechanism remembers how much data has already been consumed from the pipe, so it can deal with a succession of partial reads.
– Paul_Pedant, Commented Jul 12, 2024 at 7:55
@Paul_Pedant, except, well, it only needs to keep track of what the OS currently has buffered, or what's provided by an ongoing write() call. After that, it doesn't need to remember anything about how much data has passed through the pipe, at least as far as the normal APIs are concerned.
– ilkkachu, Commented Jul 12, 2024 at 17:45
@ilkkachu Agreed: I tried to sidestep the concept of stdin being of indeterminate length, at least until it ends.
– Paul_Pedant, Commented Jul 12, 2024 at 22:10

kos · Accepted Answer · 2024-07-11 17:35:15Z

From help while:

Execute commands as long as a test succeeds.

From help read:

Read a line from the standard input and split it into fields.

and:

Exit Status:

The return code is zero, unless end-of-file is encountered [...]

I think your loop would best be explained step-by-step:

1. read is executed, consuming as much of STDIN as possible before either a newline character is found or the end of the file is reached. The consumed chunk is further processed as described in help read. Finally, read will return 0 if a newline character was found (or 1 if the end of the file was reached);
2. read's return code is evaluated. If it's 0, the body of the while loop will be executed (otherwise, the loop will be broken and its body will be skipped);
3. If the loop hasn't been broken, everything will start from step 1 again.
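To make those steps concrete, here is a rough sketch that spells the loop out with an explicit status check instead of letting while test read directly. The function name number_lines and the sample input are made up for illustration; the behaviour should be the same as a plain while read loop.

```shell
# Hypothetical helper (not from the question): number each line on stdin.
number_lines() {
    n=0
    while true; do
        # Step 1: read consumes stdin up to and including the next newline.
        if IFS= read -r line; then
            # Step 2: status 0, so the "body" runs.
            n=$((n + 1))
            printf '%d: %s\n' "$n" "$line"
        else
            # Step 2, other branch: non-zero status (end of input), stop.
            break
        fi
        # Step 3: nothing broke the loop, so go back to step 1.
    done
}

printf 'one\ntwo\nthree\n' | number_lines
```

With that sample input the function prints "1: one", "2: two" and "3: three", one per line, and stops when the fourth read hits the end of the input.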

In other words, the input file is consumed in "chunks" identified either by a terminating newline character (AKA "lines") or by the end of the file, until the end of the file is reached.

Note that, as per the above, the body of the while loop runs only when read has consumed a newline character before reaching the end of the file. This means that if the file doesn't end with a newline character, the last line will not be processed by the body of the while loop, even though read has still stored that partial line in the variable.
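A quick way to see the missing-last-line effect is to count how often the body runs for input that lacks a final newline. This is only a sketch: count_plain and count_with_guard are hypothetical names, and the second function uses a POSIX-style variant of the || [[ -n $line ]] idiom mentioned in the comments.

```shell
# Count how many times the loop body runs for the data on stdin.
count_plain() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
    done
    echo "$n"
}

# Same loop, but the body also runs when read fails yet left text in $line.
count_with_guard() {
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
    done
    echo "$n"
}

# 'b' is not followed by a newline here.
printf 'a\nb' | count_plain        # prints 1
printf 'a\nb' | count_with_guard   # prints 2
```

When the input does end with a newline, both functions report the same count, so the guard only changes the result for an unterminated last line.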

jesse_b · Accepted Answer · 2024-07-11 17:12:03Z

You haven't exactly asked any question here but I'll take a stab at answering you.

cat file.txt | .. sends the contents of file.txt down the pipe, where they become the standard input of the loop. read by default reads one line at a time from that input, so as long as there are lines to be read, while will continue to loop. Once the end of the input is encountered, read returns a non-zero (failure) status, which in turn tells while to stop iterating.
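If it helps, the "one line at a time" part can be seen without any loop at all. In this sketch (first_two is a made-up name, and printf stands in for cat file.txt), two separate read calls share the same standard input, and the second one simply carries on where the first one stopped.

```shell
# Hypothetical helper: call read twice on the same standard input.
first_two() {
    IFS= read -r first     # consumes "alpha" and its newline
    IFS= read -r second    # continues from there and consumes "beta"
    printf 'first=%s second=%s\n' "$first" "$second"
}

printf 'alpha\nbeta\ngamma\n' | first_two   # prints: first=alpha second=beta
```

A while loop just keeps repeating that read call until it reports failure.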

edited Jul 11, 2024 at 17:12

answered Jul 11, 2024 at 16:52
