Where does ext4 store directory sizes? Are they stored in the directory inode?
For example, when I run du -h, it returns directories' size instantly, so I don't believe it calculates it at that time.
I'm using ext4 on Linux.
ext4 does not store the total size of a directory's contents anywhere. Using strace shows that du computes it at run time by querying each file within the directory.
Say I fill a directory with three 1 MiB files.
$ mkdir adir
$ fallocate -l 1M adir/afile1.txt
$ fallocate -l 1M adir/afile2.txt
$ fallocate -l 1M adir/afile3.txt
Now when we trace the du -h command:
$ strace -s 2000 -o du.log du -h adir/
3.1M adir/
Looking at the resulting strace log file du.log:
...
newfstatat(AT_FDCWD, "adir/", {st_mode=S_IFDIR|0775, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
fcntl(3, F_DUPFD, 3) = 4
fcntl(4, F_GETFD) = 0
fcntl(4, F_SETFD, FD_CLOEXEC) = 0
getdents(3, /* 5 entries */, 32768) = 144
getdents(3, /* 0 entries */, 32768) = 0
close(3) = 0
newfstatat(4, "afile2.txt", {st_mode=S_IFREG|0644, st_size=1048576, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(4, "afile3.txt", {st_mode=S_IFREG|0644, st_size=1048576, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(4, "afile1.txt", {st_mode=S_IFREG|0644, st_size=1048576, ...}, AT_SYMLINK_NOFOLLOW) = 0
brk(0) = 0x231a000
...
Notice the newfstatat system calls? These are getting the size of each file in turn.
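To make the trace above concrete, here is a minimal Python sketch of the same pattern: one lstat per object, plus a directory listing for each directory. It is only an illustration, not du's actual implementation. The name disk_usage and its apparent parameter are made up for this example, and unlike GNU du it makes no attempt to count hard-linked files only once.

```python
import os
import stat


def disk_usage(path, apparent=False):
    """Sum the size of path and everything below it (hypothetical helper).

    apparent=False sums allocated space (st_blocks * 512), which is roughly
    what du reports by default; apparent=True sums st_size instead, similar
    in spirit to du --apparent-size.
    """
    # lstat does not follow symlinks, like AT_SYMLINK_NOFOLLOW in the trace.
    st = os.lstat(path)
    # On Linux, st_blocks is counted in 512-byte units.
    total = st.st_size if apparent else st.st_blocks * 512
    if stat.S_ISDIR(st.st_mode):
        # Listing the directory corresponds to the getdents calls.
        with os.scandir(path) as entries:
            for entry in entries:
                total += disk_usage(entry.path, apparent)
    return total


if __name__ == "__main__":
    print(disk_usage("."))
```

Nothing here reads a precomputed total from the directory inode; the number only exists after every entry has been visited.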
If you're interested here's a bit more on the subject.
The stat command provides no facility for querying anything other than the size of the filesystem object itself (directory or file).
$ stat adir/
File: ‘adir/’
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fd02h/64770d Inode: 11539929 Links: 2
Access: (0775/drwxrwxr-x) Uid: ( 1000/ saml) Gid: ( 1000/ saml)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2014-04-15 22:29:25.289639888 -0400
Modify: 2014-04-15 22:29:44.977638542 -0400
Change: 2014-04-15 22:29:44.977638542 -0400
Birth: -
Notice it's 4096 bytes. That's the actual size of the directory itself, not what it contains.
stat simply doesn't have a way to return multiple sizes, so it can only return the size of the directory itself, not of its contents. Also, "directory size including contents" becomes ambiguous when files are hard-linked, because the same data can appear under several names.
If du feels instantaneous, it's because the file metadata is cached by the kernel. You can see this if you run it multiple times on a recently mounted filesystem: it'll be much faster after the first run.
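The hard-link ambiguity can be shown with a small Python sketch. The function name total_file_bytes and its count_hardlinks_once flag are hypothetical, chosen only for this illustration. It identifies a file by its (st_dev, st_ino) pair, so two names for the same inode can be counted either once or twice.

```python
import os


def total_file_bytes(root, count_hardlinks_once=True):
    """Sum st_size of all non-directory entries below root (hypothetical helper)."""
    seen = set()  # (device, inode) pairs already counted
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if count_hardlinks_once and key in seen:
                continue  # another name for an inode we already counted
            seen.add(key)
            total += st.st_size
    return total
```

For a directory holding one 1000-byte file plus a hard link to it, this returns 1000 with the default and 2000 with count_hardlinks_once=False. Both are defensible answers, which is part of why a single stored "size including contents" would be hard to define. GNU du counts such files once by default and has a -l (--count-links) option to count them every time.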