Why do open() and close() exist in the Unix filesystem design?
Couldn't the OS just detect the first time read() or write() was called and do whatever open() would normally do?
Dennis Ritchie mentions in «The Evolution of the Unix Time-sharing System» that open and close along with read, write and creat were present in the system right from the start.
I guess a system without open and close wouldn't be inconceivable; however, I believe it would complicate the design.
You generally want to make multiple read and write calls, not just one, and that was probably especially true on those old computers with very limited RAM that UNIX originated on. Having a handle that maintains your current file position simplifies this. If read or write were to return the handle, they'd have to return a pair -- a handle and their own return status -- and the handle half of that pair would be useless for all other calls, which would make the arrangement awkward.

Leaving the state of the cursor to the kernel also allows it to improve efficiency, for example by buffering. There's also some cost associated with path lookup -- having a handle allows you to pay it only once. Furthermore, some files in the UNIX worldview don't even have a filesystem path (or didn't -- now they do, with things like /proc/self/fd).
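As a concrete illustration, here is a small sketch using Python's raw `os` wrappers around the Unix calls (the file path is made up for the example). The kernel keeps the current offset with the open file, so each read continues where the previous one stopped -- the caller never re-passes a path or an offset:

```python
import os

# Build a small file, then read it back in chunks through one handle.
with open("/tmp/handle_demo.txt", "w") as f:
    f.write("hello world")

fd = os.open("/tmp/handle_demo.txt", os.O_RDONLY)  # path lookup paid once
chunks = []
while True:
    chunk = os.read(fd, 4)   # just the handle and a size -- no path, no offset
    if not chunk:            # empty read means end of file
        break
    chunks.append(chunk)
os.close(fd)

print(chunks)   # [b'hell', b'o wo', b'rld']
```

Note that read's return value is free to carry only the data (or an error), precisely because the handle was established separately by open.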
Without open/close, you'd have to implement stuff like /dev/stdout to allow piping.
Then every read and write call would have to pass all of this information on each operation.
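To make that concrete, a handle-less API might look like the following sketch (the function name and shape are invented for illustration; the real calls don't work this way). Every call must carry the path and the offset itself, and the kernel would have to repeat path resolution and permission checks each time:

```python
import os

def stateless_read(path, offset, size):
    """Hypothetical handle-less read: repeats open()'s work on every call."""
    fd = os.open(path, os.O_RDONLY)        # lookup + permission check, again
    try:
        os.lseek(fd, offset, os.SEEK_SET)  # caller must supply the offset
        return os.read(fd, size)
    finally:
        os.close(fd)

with open("/tmp/stateless_demo.txt", "w") as f:
    f.write("abcdefgh")

# The caller, not the kernel, now has to carry the cursor around:
offset, parts = 0, []
while True:
    data = stateless_read("/tmp/stateless_demo.txt", offset, 3)
    if not data:
        break
    parts.append(data)
    offset += len(data)

print(parts)   # [b'abc', b'def', b'gh']
```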
Whether you consider the independent calls open, read, write, and close to be simpler than a single general-purpose I/O message depends on your design philosophy. The Unix developers chose simple operations and programs which can be combined in many ways, rather than a single operation (or program) which does everything.
read and write aren't restricted to files that live on a file system, and that is a fundamental design decision in Unix, as pjc50 explains.
(Neither is lseek.)
The concept of the file handle is important because of UNIX's design choice that "everything is a file", including things that aren't part of the filesystem: tape drives, the keyboard and screen (or teletype!), punched card/tape readers, serial connections, network connections, and (the key UNIX invention) direct connections to other programs called "pipes".
If you look at many of the simple standard UNIX utilities like grep, especially in their original versions, you'll notice that they don't include calls to open() and close() but just read and write. The file handles are set up outside the program by the shell and passed in when it is started. So the program doesn't have to care whether it's writing to a file or to another program.
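That pattern can be sketched as follows. Here a minimal "filter" moves bytes between two already-open descriptors without knowing what they refer to; a pair of pipes stands in for the stdin/stdout the shell would normally set up before starting the program:

```python
import os

def filter_copy(fd_in, fd_out):
    """A minimal filter: copies bytes between handles it did not open."""
    while True:
        data = os.read(fd_in, 4096)
        if not data:          # EOF: the write end has been closed
            return
        os.write(fd_out, data)

# Stand in for the shell: create the handles, then hand them to the filter.
r_in, w_in = os.pipe()     # plays the role of the program's stdin
r_out, w_out = os.pipe()   # plays the role of the program's stdout
os.write(w_in, b"some input\n")
os.close(w_in)

filter_copy(r_in, w_out)   # the filter never calls open() at all
os.close(w_out)

result = os.read(r_out, 4096)
print(result)   # b'some input\n'
```

The filter would behave identically if the descriptors referred to regular files, a terminal, or a network socket -- that indifference is the point.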
As well as open, the other ways of getting file descriptors are socket, listen, pipe, dup, and a very Heath Robinson mechanism for sending file descriptors over pipes: https://stackoverflow.com/questions/28003921/sending-file-descriptor-by-linux-socket
Edit: some lecture notes describing the layers of indirection and how this lets O_APPEND work sensibly. Note that keeping the inode data in memory means the system won't have to go and fetch it again for the next write operation.
There's also creat. And listen doesn't create an fd itself; rather, when (and if) a request comes in while listening, accept creates and returns an fd for the new (connected) socket.
pipe was introduced a few years after development on Unix started.
The answer is no, because open() and close() create and destroy a handle, respectively. There are times (well, all of the time, really) when you may want to guarantee that you are the only caller with a particular access level: another caller (for instance) unexpectedly writing to a file that you are parsing through could leave an application in an unknown state, or lead to a livelock or deadlock, as in the Dining Philosophers problem.
Even without that consideration, there are performance implications to be considered; close() allows the filesystem to (if it is appropriate, or if you called for it) flush the buffer that you were occupying, an expensive operation. Several consecutive edits to an in-memory stream are much more efficient than several essentially unrelated read-modify-write cycles against a filesystem that, for all you know, exists half a world away scattered over a datacenter's worth of high-latency bulk storage. Even with local storage, memory is typically many orders of magnitude faster than bulk storage.
open() offers a way to lock files while they are in use. If files were automatically opened, read/written, and then closed again by the OS, there would be nothing to stop other applications from changing those files between operations.

While this can be manageable (many systems support non-exclusive file access), for simplicity most applications assume that files they have open don't change.
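One concrete form of this is the advisory flock(2) lock, sketched below via Python's `fcntl` module (the path is made up; behavior shown is Linux's). The lock is attached to the open file description that open() created -- which is exactly why a persistent handle is needed to hold it:

```python
import fcntl
import os

path = "/tmp/lock_demo"
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
fcntl.flock(fd, fcntl.LOCK_EX)      # exclusive lock, held via this handle

# A second, independent handle on the same file cannot take the lock:
fd2 = os.open(path, os.O_RDWR)
try:
    fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)  # non-blocking attempt
    second_lock_acquired = True
except OSError:                     # EWOULDBLOCK: someone else holds it
    second_lock_acquired = False

fcntl.flock(fd, fcntl.LOCK_UN)      # releasing frees it for other handles
os.close(fd)
os.close(fd2)
print(second_lock_acquired)   # False
```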
Reading and writing to a filesystem may involve a large variety of buffering schemes, OS housekeeping, low-level disk management, and a host of other potential actions. So the actions of open() and close() serve as the set-up for these types of under the hood activities. Different implementations of a filesystem could be highly customized as needed and still remain transparent to the calling program.
If the OS didn't have open/close, then each read or write would still have to perform all of those initializations, buffer flushing/management, etc. every single time. That's a lot of overhead to impose on repetitive reads and writes.
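The difference in work can be counted directly. With a handle, reading a file in N chunks costs one open, N reads, and one close; a hypothetical handle-less scheme repeats the whole setup per chunk (this sketch simulates that scheme with explicit open/seek/read/close cycles; the path is invented):

```python
import os

path = "/tmp/overhead_demo"
with open(path, "wb") as f:
    f.write(b"x" * 40960)            # ten 4 KiB chunks

# With a handle: one open, N reads, one close.
fd = os.open(path, os.O_RDONLY)
reads = 0
while os.read(fd, 4096):
    reads += 1
os.close(fd)

# Hypothetical handle-less scheme: every chunk repeats the full setup.
opens = 0
offset = 0
while True:
    fd = os.open(path, os.O_RDONLY)  # path lookup, checks, bookkeeping -- again
    opens += 1
    os.lseek(fd, offset, os.SEEK_SET)
    data = os.read(fd, 4096)
    os.close(fd)
    if not data:
        break
    offset += len(data)

print(reads, opens)   # 10 reads vs. 11 full open/seek/read/close cycles
```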
The Unix mantra is "offer one way of doing things", which means "factoring" functionality into (reusable) pieces to be combined at will -- in this case, separating the creation and destruction of file handles from their use.

Important benefits came later, with pipes and network connections: they are also manipulated through file handles, but are created in other ways. Being able to ship file handles around (e.g. passing them to child processes as "open files" which survive an exec(2), and even to unrelated processes through a pipe) is only possible this way -- particularly if you want to offer controlled access to a protected file. So you can e.g. open /etc/passwd for writing and pass that handle to a child process that isn't allowed to open that file for writing itself (yes, I know this is a ridiculous example; feel free to edit in something more realistic).
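The handle-inheritance part can be sketched in a few lines (POSIX-only, since it uses fork; the file path is made up). The parent opens a file and forks; the child writes through the inherited descriptor without ever calling open() itself, which is the same mechanism that would let it use a handle to a file it couldn't open on its own:

```python
import os

path = "/tmp/inherit_demo"
fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_WRONLY, 0o600)

pid = os.fork()
if pid == 0:                          # child: inherits fd, never calls open()
    os.write(fd, b"written by child\n")
    os._exit(0)

os.waitpid(pid, 0)                    # parent: wait for the child, then check
os.close(fd)
with open(path) as f:
    contents = f.read()
print(contents)   # written by child
```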
That is exactly why open() exists.

"Couldn't the OS just detect the first time read() or write() was called and do whatever open() would normally do?" Is there a corresponding suggestion for when closing would happen?

How would you tell read() or write() which file to access? Presumably by passing the path. What if the file's path changes while you're accessing it (between two read() or write() calls)?

Also: permissions are not checked on every read() and write(), just on open().