With the GNU implementations of `du`, `awk` and `xargs`, to work with arbitrary file names, you can do:

    (
    cd ~/foo &&
      du --block-size=1 -l0d1 |
      awk -v RS='\0' -v ORS='\0' '
        $1 < 50*1024 && !/^[0-9]+\t\.$/ && sub("^[^\t]+\t", "")' |
      xargs -r0 echo rm -rf --
    )
That is:

- Specify a block size, as otherwise which one GNU `du` uses depends on the environment. `--block-size=1` guarantees maximum precision (you get the disk usage in number of bytes).
- Use `-0` to work with NUL-delimited records (NUL being the only character that may not occur in a file path).
- Use `-d1` to report the cumulative disk usage of directories down to depth 1 only (depth 0, i.e. `.`, is excluded with `!/^[0-9]+\t\.$/` in `awk`).
- Use `-l` to make sure a file's disk usage is accounted against every directory it appears in as an entry, not just the first one encountered.

Remove the `echo` (dry-run) to actually do it.
Or with `perl` instead of `gawk` (here with `-l` added so the output is NUL-delimited as well, and with the depth-0 `.` entry excluded explicitly):

    perl -0lne 'print $2 if m{(\d+)\t(.*)}s && $1 < 50<<10 && $2 ne "."'
POSIXly, you'd need something like:

    (
    unset -v BLOCK_SIZE BLOCKSIZE DU_BLOCK_SIZE
    cd ~/foo &&
      LC_ALL=C POSIXLY_CORRECT=1 find . ! -name . -prune -type d -exec sh -c '
        for dir do
          du -s "$dir" | awk "{exit (\$1 < 50*1024/512 ? 41 : 0)}"
          [ "$?" -eq 41 ] && echo rm -rf "$dir"
        done' sh {} +
    )
(The `unset -v BLOCK_SIZE BLOCKSIZE DU_BLOCK_SIZE` and `POSIXLY_CORRECT=1` are there for GNU `du`, to make sure it uses 512-byte units as POSIX requires.)
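The `awk` exit-status trick can be checked in isolation: in 512-byte units, the 50 KiB threshold is `50*1024/512 = 100` blocks, and `awk`'s exit status (41 here, an arbitrary value distinct from 0) carries the comparison result back to the shell:

```shell
# Pretend du reported 42 blocks for some directory:
printf '42\t./dir\n' | awk '{exit ($1 < 50*1024/512 ? 41 : 0)}'
echo "$?"   # prints 41: 42 blocks is under the 100-block threshold
```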