Most backup solutions save files that are not needed for individual file restore, e.g. caches. To avoid this, subvolumes can be used, but they are restricted to subtrees, and a full backup then becomes more arduous.

With rsync, exclusions can be declared in a list that is easy to update, e.g. excluding all directories named trash, or single files:

.cache/
cache/
[Tt]rash/
/swapfile
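
As a rough sketch, the patterns above could be kept in a file and handed to rsync with --exclude-from. The function name backup_with_excludes and the paths in the usage line are assumptions made for illustration only.

```shell
#!/bin/sh
# Sketch: copy a tree with rsync while skipping everything listed in an
# exclude file. Function name and paths are hypothetical.
backup_with_excludes() {
    src=$1        # directory to back up
    dest=$2       # backup target directory
    excludes=$3   # file with one rsync pattern per line

    # -a preserves permissions, times and symlinks. The trailing slash on
    # the source copies its contents rather than the directory itself.
    # A pattern starting with "/" (like /swapfile) is anchored at $src.
    rsync -a --exclude-from="$excludes" "$src"/ "$dest"/
}
```

Usage would be something like: backup_with_excludes /home /mnt/backup/home /etc/backup-excludes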

The first variant is to create the snapshot on the sender's side as writable, remove all unwanted files, and set it to read-only before the following btrfs send .... This is done before the send, and the snapshot is never changed again. Fairly simple.
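
To make this variant concrete, here is a minimal sketch of how I imagine it. The function name pruned_snapshot_send and all paths are hypothetical, and it would have to run as root on real btrfs subvolumes.

```shell
#!/bin/sh
# Sketch of the first variant: snapshot, prune, freeze, send.
# Function name and paths are hypothetical.
pruned_snapshot_send() {
    subvol=$1   # source subvolume, e.g. /home
    snap=$2     # path where the snapshot is created
    target=$3   # directory on the receiving btrfs filesystem

    # 1. Writable snapshot (no -r), so files can be removed from it.
    btrfs subvolume snapshot "$subvol" "$snap" || return 1

    # 2. Remove unwanted directories. -prune keeps find from descending
    #    into directories that are about to be deleted.
    find "$snap" -xdev -type d \
        \( -name .cache -o -name cache -o -name '[Tt]rash' \) \
        -prune -exec rm -rf {} + || return 1
    rm -f "$snap/swapfile"

    # 3. Freeze the snapshot; btrfs send requires it to be read-only.
    btrfs property set -ts "$snap" ro true || return 1

    # 4. Full send to the backup filesystem.
    btrfs send "$snap" | btrfs receive "$target"
}
```

Usage would be something like: pruned_snapshot_send /home /snapshots/home-pruned /mnt/backup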

The second variant is to delete the useless files when old versions are thinned out. The advantage is that the latest version(s) are complete and can be used for a full restore, while those far in the past are used only for file restore and can thus be much smaller. This is more complex, and snapshots are changed long after they were made.

The question "Anyway to apply filter while btrfs subvolume snapshotting?" says that there is no filter in btrfs-utils, so a similar procedure was proposed there.

Which one is better, deleting before the send, or after the receive?

    Hi! I'm not exactly clear what your question is? "Any experience or thoughts?" is not really a technical question; I think it would help us (you yourself and the potential answerers) if you could actually write down a specific question. Commented Oct 24 at 18:38

2 Answers

2

btrfs send is for subvolume snapshots, and cannot exclude specific files.

Further, the snapshots should be kept unchanged and read-only if they are to be used as parents of incrementals. See the requirements in the btrfs receive documentation. If you are going to send snapshots, leave them untouched and let fast incrementals work.
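
For illustration, an incremental send that relies on an untouched read-only parent might look roughly like this. The function name incremental_send and the paths are hypothetical, and the sketch assumes the parent snapshot has already been received on the target.

```shell
#!/bin/sh
# Sketch: incremental transfer between two read-only snapshots.
# Function name and paths are hypothetical.
incremental_send() {
    parent=$1   # earlier read-only snapshot, already present on the target
    subvol=$2   # subvolume to snapshot now
    snap=$3     # path of the new snapshot
    target=$4   # directory on the receiving btrfs filesystem

    # -r creates the snapshot read-only from the start.
    btrfs subvolume snapshot -r "$subvol" "$snap" || return 1

    # -p sends only the differences relative to the parent snapshot.
    btrfs send -p "$parent" "$snap" | btrfs receive "$target"
}
```

Usage would be something like: incremental_send /snapshots/home-old /home /snapshots/home-new /mnt/backup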

If you wish, for example, to back up /home but exclude $HOME/.cache for space reasons, consider a backup method that is file-based and can handle exclusion patterns. borg or tar can save to archives. rsync can send individual files across.

btrfs could still be useful to complement a file-based backup.

  • snapshot a point in time and run the file backup from that (a local snapshot is not yet a backup, as it is not on different media)
  • make a read-write snapshot of a previous snapshot to restore to that point very quickly
  • send a snapshot to a remote system, which might have more storage or be easier to back up from
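
The first bullet could be sketched as follows: take a temporary read-only snapshot, archive it with tar while excluding the cache and trash directories, then drop the snapshot. The function name snapshot_tar_backup and the paths are hypothetical, and the --exclude options assume GNU tar (bsdtar accepts them too).

```shell
#!/bin/sh
# Sketch: file-based backup taken from a temporary read-only snapshot.
# Function name and paths are hypothetical.
snapshot_tar_backup() {
    subvol=$1    # subvolume to back up
    snap=$2      # temporary snapshot path on the same filesystem
    archive=$3   # archive file, ideally on different media

    btrfs subvolume snapshot -r "$subvol" "$snap" || return 1

    # Archive the frozen state; the exclusions mirror the rsync list.
    tar -C "$snap" \
        --exclude='.cache' --exclude='cache' --exclude='[Tt]rash' \
        -czf "$archive" .
    status=$?

    # The snapshot was only needed as a consistent source.
    btrfs subvolume delete "$snap"
    return "$status"
}
```

Usage would be something like: snapshot_tar_backup /home /home/.backup-snap /mnt/external/home.tar.gz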
1

One solution is to create a sub-volume just for the data you DON'T want backed up and then move-and-symlink the directories.

For example, assuming all the files & directories you want to exclude (except for /swapfile) are in your home directory, which is either a sub-volume itself or part of a sub-volume for all of /home:

NOTE: I would not recommend trying to execute the following as-is. It's intended as shell-like pseudo-code that you will need to adapt to your circumstances.

cd /
sudo btrfs subvolume create caches
sudo mkdir /caches/rainer /caches/root

# swapfile, presumably owned by root
sudo swapoff /swapfile
sudo mv /swapfile /caches/root/
sudo ln -s /caches/root/swapfile /
sudo swapon /swapfile
# BTW, instead of symlinking, a better alternative for this
# file would be to edit /etc/fstab to use the new location
# of the swapfile.

sudo chown youruser:yourgroup /caches/rainer
cd ~/

# Before proceeding you should stop all processes
# that have open files in the directories to be moved,
# e.g. web browsers, file viewers (such as epub or pdf), and 
# anything else that uses the cache director{y,ies} and is
# currently running.

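for d in .cache cache [Tt]rash; do
  # skip unmatched globs, missing directories and existing symlinks
  [ -d "$d" ] && [ ! -L "$d" ] || continue
  mv "$d" /caches/rainer/
  ln -s "/caches/rainer/$d" "$d"
done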

# you can restart your stopped processes now

NOTE: you may run into a problem with [Tt]rash/, depending on the file browser you use and how it deals with the trash dir being a symlink rather than a directory. If that happens see if you can configure your file browser to use /caches/rainer/trash (but that might not be possible because most will use one hard-coded trash dir per filesystem).

In fact, it wouldn't hurt to do something similar for all programs that use ~/.cache and ~/cache - move-and-symlink is a time-tested method that works and has worked for decades, but that's no reason to assume that software devs won't do stupid and lazy things when they get the chance.
