Timeline for Are files saved on disk sequentially?
Current License: CC BY-SA 3.0
12 events

| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Feb 15, 2017 at 8:18 | comment | added | Stephen Kitt | @MSalters and that's the whole point of fallocate: to tell the OS how big the file is, before it allocates it. Of course there are many situations where even the writing program doesn't know how big a file is going to be (e.g. any document you update, log files...), but there are many cases where it does (VM images, downloaded files, chunked video...) and where reducing fragmentation is useful. | |
| Feb 15, 2017 at 8:02 | comment | added | MSalters | @jpaugh: "Worst fit" and "best fit" are typical CompSci algorithms assuming you know how well the item fits. But an OS often can't predict how big a file will become, when a program starts writing. Still, in that case "worst fit" = "largest hole" has the best chance of fitting the entire file. | |
| Feb 15, 2017 at 1:08 | comment | added | jpaugh | @zwol I bet that is actually to increase performance when reading from different areas of the large files, rather than to mitigate fragmentation. | |
| Feb 15, 2017 at 0:33 | comment | added | jpaugh | @MSalters Makes sense. IIRC from my CompSci days, "worst fit" (that is, taking the biggest chunk of free space available at the time) produces the least fragmentation, on average. "Best fit" (taking the smallest area that is large enough) of course performs much worse when you have files growing. | |
| Feb 14, 2017 at 12:44 | comment | added | MSalters | @zwol: It generally doesn't require recreating the whole file system when you tweak file placement parameters such as the largest allowed extent. And no, intentionally fragmenting files doesn't help overall fragmentation at all. After you've written a 64 MB fragment, the best place for the next 64 MB fragment is directly behind the first fragment. The problem is the unintentional fragmentation when you write to two files; where should you place the second file? Either you fragment free space, or you end up fragmenting the two files, but something will fragment. | |
| Feb 13, 2017 at 18:45 | comment | added | jamesqf | Also note that there are situations, like RAID systems, where having contiguous files is less efficient, if it's even possible. I think that's really the purpose of a disk/storage subsystem controller: to offload all the work of storing files as optimally as can reasonably be expected. | |
| Feb 13, 2017 at 15:02 | comment | added | Muzer | @hudac It's impossible to guarantee sequentiality in all cases (see the case with a drive that is close to being full), and to be honest with the rise of SSDs it matters less than it used to (for those who can afford them at least). | |
| Feb 13, 2017 at 14:26 | comment | added | zwol | Essentially all file systems more sophisticated than FAT -- this goes all the way back to the original Berkeley UFS -- will intentionally break up large files and spread them over multiple "allocation groups"; this helps them minimize the overall fragmentation of the disk. There may be a way to adjust how this works, but there are good odds you have to recreate the filesystem from scratch in order to do it, and there probably isn't a way to turn it completely off. | |
| Feb 13, 2017 at 13:31 | comment | added | Stephen Kitt | It can't ensure sequential allocation, it's just a hint. But you should definitely use it if you're writing 10GiB files! | |
| Feb 13, 2017 at 13:15 | comment | added | hudac | Will using fallocate(3) ensure file sequentiality, or will it just hint the filesystem? I can't fully understand it from the man pages. | |
| Feb 13, 2017 at 13:14 | vote | accept | hudac | ||
| Feb 13, 2017 at 12:46 | history | answered | Stephen Kitt | CC BY-SA 3.0 |
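
For anyone following the fallocate discussion in the comments, here is a minimal sketch of preallocating a file when its final size is known in advance. It uses Python's `os.posix_fallocate` (a wrapper around the POSIX call, available on Unix) rather than the Linux-specific `fallocate`; the helper name `preallocate`, the file name, and the 1 MiB size are made up for illustration. As noted above, this only tells the filesystem the size up front; it does not guarantee a contiguous layout.

```python
import os
import tempfile


def preallocate(path: str, size: int) -> int:
    """Create `path` and ask the filesystem to reserve `size` bytes up front.

    Hypothetical helper for illustration. Returns the resulting file size.
    """
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        # Reserve space for bytes [0, size). The filesystem now knows the
        # final size before any data is written, which gives it a chance to
        # pick a large enough free region. Layout is still its decision.
        os.posix_fallocate(fd, 0, size)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "image.bin")  # assumed example file name
        print(preallocate(target, 1024 * 1024))
```

This fits the cases mentioned in the comments where the writer knows the size beforehand (VM images, downloads); for logs or documents that grow over time there is no size to pass.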