I have BTRFS on my Ubuntu server, and I’m using PostgreSQL as a database. I’d like to make a .tar.gz archive from the current state of the database folder.

However, the database is of course running while I do this, so its files can change during creation of the .tar. At around 15 GiB it takes a while to archive, making it likely that changes during that time will create inconsistencies (my database updates at least once per minute, as it continuously fetches data from an API).

Now I’m wondering: is there any way to tar, or at least duplicate, that folder atomically (so I can tar it later)?

My current idea is to make a BTRFS snapshot, tar the folder out of the snapshot, and then delete the snapshot.
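
A minimal sketch of what I have in mind, assuming the data directory lives on its own subvolume at /var/lib/postgresql (all paths here are placeholders):

# take an atomic, read-only snapshot of the subvolume holding the data directory
sudo btrfs subvolume snapshot -r /var/lib/postgresql /var/lib/postgresql-snap
# archive from the frozen snapshot instead of the live folder
sudo tar -czf pgdata.tar.gz -C /var/lib/postgresql-snap .
# drop the snapshot once the archive is done
sudo btrfs subvolume delete /var/lib/postgresql-snap

From what I understand, this only captures a single point in time if the entire data directory, including the WAL, sits on that one subvolume; PostgreSQL would then treat a restore of it like recovery after a crash.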

How would I make a tar that accurately represents a single point in time, rather than one that mixes in changes made while the database updates?

Is there an easier, alternative method that I’m not considering here?

1 Answer


PostgreSQL already has a way to back up databases: pg_dump. It makes restoring the backups later much easier. 😉

I'd use something like:

# dump each database and compress the output on the fly
for dbname in foo bar; do
    pg_dump "${dbname}" | gzip > "backups/${dbname}.gz"
done

You could also use pg_dumpall to dump everything into one file.
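
For example (assuming the invoking user is allowed to connect as a superuser to every database):

pg_dumpall | gzip > backups/all.sql.gz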

Alternatively, you could omit the gzip pipe and use BTRFS's built-in compression, like so: btrfs property set ./backups/ compression zstd:9.
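
A sketch of that variant (the backups directory is an assumption, and the property only applies to files written after it is set):

mkdir -p backups
# have BTRFS transparently compress anything written here from now on
btrfs property set ./backups/ compression zstd:9
pg_dump foo > backups/foo.sql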

  • I’m aware of pg_dump, but it’s really user-unfriendly and a logistical nightmare compared to just copying the folder. My database runs inside Docker, so I’d need to run pg_dump inside the container, copy the dump out, then tar it and transfer it to my dev machine (see the docker exec sketch after these comments). Importing is even worse: I’d need to delete the local db, start a new db without data, wait for it to be up, pg_import or whatever it’s called, and then start the app. My current workflow is just copying the prod db to my dev machine, placing it in a Docker-mounted folder, and starting the app. I don’t like pg_dumpall and have had many issues with it.
  • I use borgmatic for backups; it’s just that I want to verify that what I’m doing on dev will work well in prod. For that, all I need is a somewhat accurate snapshot. The only reason I even care that it writes while compressing is that this sometimes causes foreign key issues I have to fix manually, which is annoying and becomes more common the longer prod runs, since it collects a lot of data every day.
  • Also, I’m not quite sure how compatible pg_dumpall is with Postgres extensions. I’m technically using TimescaleDB, as I have a ton of time-series data. I wouldn’t be surprised if pg_dumpall has no clue what to do with the TimescaleDB data.
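
Regarding the Docker workflow in the first comment, a sketch under assumed names (container postgres, role postgres, database mydb): docker exec can stream the dump straight to the host, skipping the copy-out step, and feed it back in for a restore:

# dump straight from the container to a compressed file on the host
docker exec postgres pg_dump -U postgres mydb | gzip > mydb.sql.gz
# restore into a fresh, empty database by feeding stdin through docker exec -i
gunzip -c mydb.sql.gz | docker exec -i postgres psql -U postgres mydb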
