I want to write a C program, and I need to parse stdin. If I type cat file.txt | grep -v match, how does stdout from cat resolve with -v match? Are they concatenated? Are they two different strings? I ran cat /dev/pts/0 and file /dev/pts/0, but I didn't find anything (seemingly) useful there.
2 Answers
A pipe is a buffer allocated in the kernel with file descriptors associated with the read and write ends. When you run cat file.txt | grep -v match:
- The shell creates a pipe (using the pipe() system call).
- The shell fork()s. The child process uses the dup2() system call to close its standard output stream and to duplicate the write end of the pipe to standard output. (After this, writes to standard output will go to the kernel buffer.) Then the child exec()s cat with the updated standard output.
- The shell fork()s again. The child uses the dup2() system call to close its standard input stream and to duplicate the read end of the pipe to standard input. (After this, reads from standard input will come from the kernel buffer.) Then the child exec()s grep with the updated standard input.
At this point, both cat and grep are running. If grep tries to read from standard input (the pipe) and the pipe is empty, the read will block. If cat tries to write to standard output (again, the pipe) and the pipe is full, the write will block. Otherwise, as cat writes to the buffer, grep can read from the buffer.
- The child process uses the dup2() system call to close its standard output stream and to *duplicate the write end of the pipe to standard output.* How? Could you please explain that in more detail? The manual of dup2 says that dup2() makes newfd be the copy of oldfd, closing newfd first if necessary. AFAIS, it seems that dup2 the said goal. – John, Jun 15, 2022 at 2:07
The standard definition for the main function of a C program is
int main(int argc, char *argv[])
Here, argc and argv carry the command line arguments, -v and match for grep in this case. Note that they're not a single string: the shell has already split the arguments into distinct strings (NUL/\0 terminated, as usual in C). argc contains the number of arguments (counting the program name in argv[0]), and argv the arguments themselves.
Standard input, on the other hand, is just a FILE * (stdin), so you can use it directly with any of the stdio functions, e.g. fgets(buf, sizeof(buf), stdin).
I'm not sure where you got cat /dev/pts/0. It would read from that particular pseudo-terminal, possibly conflicting with reads by your shell on that same terminal. (Try to open two terminals, xterm, SSH sessions, screen, whatever. Then run tty on the first one, it shows the name of the terminal there, e.g. /dev/pts/123. Run cat /dev/pts/123 (with the given name) in the second terminal, then try to type something in the first.)
- I assumed the tty was a text file for some reason, so I tried to read it. When that didn't work, I checked what kind of file it was. – Not me, Jul 7, 2020 at 0:44
- So stdin has the output of the piped command, and argc, argv have the options and stuff used with the command? Can stdin have the latter? – Not me, Jul 7, 2020 at 0:58
- @MichaelChristensen, I would use the phrase "stdin is connected to the pipe", but yes. You could run something like cat $(echo foobar), where the command substitution would take the output of echo and put it in the arguments of cat, so cat would see that foobar in argv. But that's a completely different construct from echo foobar | cat. – ilkkachu, Jul 7, 2020 at 8:50