What is the difference between write() and os.write()?
It's analogous to the difference between the C functions fwrite(3) and write(2).
The latter is a thin wrapper around an OS-level system call, whereas the former is part of the standard C library, which does some additional buffering, and ultimately calls the latter when it actually needs to write its buffered data to a file descriptor.
Python 3.x adds some additional logic to a file object's write() method which does automatic character-encoding conversion for Python str objects, whereas Python 2.x does not.
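To see the buffering difference in practice, here is a minimal sketch. It assumes CPython 3 with the default buffer size; the function name buffered_vs_direct and the file names are made up for illustration. It compares the on-disk size right after a buffered f.write() with the size right after an os.write().

```python
import os
import tempfile


def buffered_vs_direct(payload=b"hello"):
    """Return the file sizes seen right after each kind of write."""
    with tempfile.TemporaryDirectory() as tmp:
        buffered_path = os.path.join(tmp, "buffered.bin")
        direct_path = os.path.join(tmp, "direct.bin")

        # File object: write() only fills an in-process buffer at first.
        f = open(buffered_path, "wb")
        f.write(payload)
        size_before_flush = os.path.getsize(buffered_path)
        f.flush()  # hands the buffered data to the OS
        size_after_flush = os.path.getsize(buffered_path)
        f.close()

        # os.write(): goes straight to the file descriptor, no Python buffer.
        fd = os.open(direct_path, os.O_WRONLY | os.O_CREAT)
        os.write(fd, payload)
        size_direct = os.path.getsize(direct_path)
        os.close(fd)

    return size_before_flush, size_after_flush, size_direct
```

With a small payload like this one, the first size should still be 0 because nothing has left the buffer yet, while the os.write() data is visible immediately.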
Or is the difference in the return value simply due to my manual conversion of the string to UTF-8?
In Python 3.x, the difference is more related to the way in which you opened the file.
If you opened the file in binary mode, e.g. f = open(filename, 'wb'), then f.write() expects a bytes object and returns the number of bytes written.
If, instead, you opened the file in text mode, e.g. f = open(filename, 'w'), then f.write() expects a str object and returns the number of characters written, which for multi-byte encodings such as UTF-8 may not match the number of bytes written.
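A short sketch of the two return values, assuming Python 3 and an explicit UTF-8 encoding. The function name write_return_values and the sample string are only illustrative.

```python
import os
import tempfile


def write_return_values(text="h\u00e9llo"):
    """Write the same text in text mode and binary mode; return both counts."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")

        # Text mode: write() takes str and counts characters.
        with open(path, "w", encoding="utf-8") as f:
            chars = f.write(text)

        # Binary mode: write() takes bytes and counts bytes.
        with open(path, "wb") as f:
            nbytes = f.write(text.encode("utf-8"))

    return chars, nbytes
```

The sample string has five characters, but the accented letter takes two bytes in UTF-8, so the two counts differ by one.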
Note that the os.write() function always expects a bytes object, regardless of whether or not the O_BINARY flag (which exists only on Windows) was used when calling os.open().
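As a rough illustration, assuming Python 3: passing a str to os.write() raises TypeError, while the encoded bytes are accepted and the byte count is returned. The helper name os_write_demo is hypothetical, and the getattr() call is just a portable way to add O_BINARY where it exists.

```python
import os
import tempfile


def os_write_demo(text="h\u00e9llo"):
    """Try os.write() with str, then with bytes; return (rejected, written)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.bin")
        # O_BINARY is only defined on Windows; fall back to 0 elsewhere.
        flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_BINARY", 0)
        fd = os.open(path, flags)
        try:
            try:
                os.write(fd, text)  # str is not accepted
                rejected = False
            except TypeError:
                rejected = True
            # Encoding by hand is required; the return value counts bytes.
            written = os.write(fd, text.encode("utf-8"))
        finally:
            os.close(fd)
    return rejected, written
```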
file.write() doesn't count the bytes properly in this case. I would consider that a bug, but in any case the problem can be rectified by making sure the file is opened in "wb" mode.
For a file opened in text mode, tell() is neither the byte index in the file nor the character index. It's just a number that seek() can use to return to that position, and you aren't supposed to do much else with it.