As grawity’s answer explains, Linux does this automatically (and so does essentially every other OS). The sync command (and related system calls and library functions, like syncfs() or fsync()) only serves to force writeback explicitly instead of waiting for the kernel, possibly for a specific subset of the filesystem.
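As a rough illustration of forcing writeback explicitly from a program, here is a minimal Python sketch. The helper name write_durably is made up for this example; os.fsync() is Python's wrapper around the fsync() call mentioned above.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data, then block until the kernel reports it has been written out.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's userspace buffer into the kernel page cache
        os.fsync(f.fileno())  # ask the kernel to write this file's dirty pages out now
```

Without the os.fsync() call the data would simply sit in the page cache as dirty memory until the kernel's own writeback got to it. Python also offers os.sync(), which corresponds to the sync command and requests writeback of everything rather than a single file.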
However, the way the kernel handles this automatic writeback is somewhat complicated, and involves several sysctl values:
- vm.dirty_ratio / vm.dirty_bytes: These control how much dirty memory can accumulate before the kernel forces the processes doing the writing to perform writeback themselves, which effectively stalls them until the backlog shrinks. dirty_ratio configures it as a percentage of available system RAM, while dirty_bytes configures it as an absolute number of bytes of memory. Only the last one of these two sysctls that was written to has an effect. The default is for vm.dirty_ratio to be set to 20, which means 20% of available system RAM.
- vm.dirty_background_ratio / vm.dirty_background_bytes: These are configured exactly like vm.dirty_ratio and vm.dirty_bytes, but control the lower threshold at which the kernel starts writing dirty memory out in the background, without stalling the processes doing the writing. The default is for vm.dirty_background_ratio to be set to 10, which means 10% of available system RAM.
- vm.dirty_expire_centisecs: This configures a time limit for how long any piece of data written can remain in memory before the kernel will start trying to write it out, in hundredths of a second. In other words, even if the limits set by the above four sysctls have not been reached, the kernel will start writing out dirty memory once this amount of time has elapsed since that dirty memory was written to. The default is 3000, which corresponds to 30 seconds.
- vm.dirty_writeback_centisecs: This controls how frequently the kernel will attempt to write dirty memory out to disk, specified as hundredths of a second between attempts. The default is 500, corresponding to once every five seconds.
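To see what a particular machine is actually using, the values can be read back from /proc/sys/vm, where the kernel exposes each vm.* sysctl as a file. The following is a minimal Python sketch; the function name and the vm_dir parameter are assumptions made for this example, not part of any standard tool.

```python
from pathlib import Path

# The six writeback-related sysctls discussed above, without the "vm." prefix.
WRITEBACK_KNOBS = (
    "dirty_ratio",
    "dirty_bytes",
    "dirty_background_ratio",
    "dirty_background_bytes",
    "dirty_expire_centisecs",
    "dirty_writeback_centisecs",
)


def read_writeback_settings(vm_dir: str = "/proc/sys/vm") -> dict[str, int]:
    """Return the current value of every writeback sysctl found in vm_dir."""
    settings = {}
    for name in WRITEBACK_KNOBS:
        knob = Path(vm_dir) / name
        if knob.is_file():  # skip anything this system does not expose
            settings[name] = int(knob.read_text().strip())
    return settings


if __name__ == "__main__":
    for name, value in read_writeback_settings().items():
        print(f"vm.{name} = {value}")
```

In each ratio/bytes pair, whichever sysctl is not in effect reads back as 0, so a 0 there does not mean the limit is disabled.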
Note that ‘available system RAM’ is not the same as total installed RAM. It’s what the kernel sees after all the reservations from the firmware and devices are handled (and is usually just a bit more than what the free command reports as total memory).
If you’re thinking logically, you probably notice some issues with those default values, especially on modern systems. For a system with 32 GiB of RAM for example, the defaults mean that up to about 6 GiB of memory might be dirty before the kernel forces writeback, which is then likely to take quite a while. The usual recommendation, which some distros actually follow out of the box, is to set vm.dirty_ratio to 2 and vm.dirty_background_ratio to 1, corresponding to 2% and 1% of available system RAM, which will usually significantly reduce the amount of data that may be lost.
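To put numbers on that, here is a small Python sketch of the arithmetic. It treats the full 32 GiB as available, which is a simplification (the kernel works from a somewhat smaller figure), so the results are rough upper bounds. The function name is made up for this example.

```python
GIB = 1024 ** 3


def dirty_limit_bytes(available_ram_bytes: int, ratio_percent: int) -> int:
    """Approximate dirty-memory limit in bytes for a percentage-style sysctl."""
    return available_ram_bytes * ratio_percent // 100


if __name__ == "__main__":
    ram = 32 * GIB  # assumed figure, matching the example in the text
    for pct in (20, 10, 2, 1):
        limit_gib = dirty_limit_bytes(ram, pct) / GIB
        print(f"{pct:>2}% of 32 GiB is about {limit_gib:.2f} GiB")
```

At 20% the limit works out to roughly 6.4 GiB, while at 2% it is roughly 0.64 GiB, which is a far smaller amount of data to have sitting unwritten in memory.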
If your storage server sees lots of small, infrequent writes but not much else, you may want to also reduce vm.dirty_expire_centisecs (though I would not reduce it below about 1000, which is 10 seconds).
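Because the unit is hundredths of a second, it is easy to be off by a factor of ten or a hundred when picking a value. A minimal Python sketch of the conversion follows, with the floor of 1000 suggested above applied as a guard. Both helper names are hypothetical.

```python
def seconds_to_centisecs(seconds: float) -> int:
    """Convert seconds to the hundredths-of-a-second unit the sysctl expects."""
    return round(seconds * 100)


def clamp_expire_centisecs(requested: int, floor: int = 1000) -> int:
    """Refuse to go below the floor (1000 centisecs = 10 seconds by default)."""
    return max(requested, floor)


if __name__ == "__main__":
    value = clamp_expire_centisecs(seconds_to_centisecs(15))
    # Prints a line in the "name = value" form used by sysctl configuration files.
    print(f"vm.dirty_expire_centisecs = {value}")
```

The floor is a judgment call rather than a kernel limit, so treat it as a sanity check, not a hard rule.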