security
prevent hacker modify any file
If you feel your server setup leaks like a sieve and has likely vulnerabilities,
that's worth pursuing on its own, perhaps with an OWASP checklist.
Asking the current script to shut the barn door after the cows got out
is attending to matters on the late side.
Being able to rebuild your production webserver from scratch
is a very good goal, and necessary when recovering from a known breach.
We store the source code somewhere that is less exposed to attack,
and trust those source files when creating a new webserver.
Consider using a Dockerfile to do that.
A great many cloud providers would be happy to run your docker image for you.
Of course, if there's some vuln in that image which the hacker knows about,
merely rewriting a backdoored server with a fresh image won't do much good,
since the hacker can quickly re-exploit it in the same way.
So detection and fixing vulnerabilities are still important.
remote server
BRANCH_NAME=main
AUTOBACKUP_BRANCH_NAME=main-autobackup
By hypothesis, you view the webserver as p0wned.
So an attacker could alter both branches,
and indeed could replace the repo or rewrite history to cover his tracks.
Putting the autobackup "source of truth" on an assumed compromised
server seems unwise.
shell
#! /bin/sh
Consider running under bash instead.
My concern is that your interactive shell is different from the target shell,
but it's pretty close, so you will test things interactively and then
be surprised by some subtle behavior difference in production.
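One quick way to see how far apart the two shells are on a given host is to check what /bin/sh actually resolves to. This is a minimal sketch that assumes a Linux box where readlink supports -f; target_shell is a name made up for the illustration. On Debian and Ubuntu the answer is typically dash rather than bash.

```shell
#!/bin/sh
# Hypothetical helper: print the binary that /bin/sh really points at
# (dash, bash, busybox, ...). Assumes readlink supports -f.
target_shell() {
    readlink -f /bin/sh
}

echo "scripts with '#! /bin/sh' run under: $(target_shell)"
echo "interactive shell is:               ${SHELL:-unknown}"
```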
stderr
This is nice:
if ! git rev-parse --is-inside-work-tree > /dev/null 2>&1; then
Possibly it was copy-n-pasted from somewhere.
Consider allowing stderr to be displayed.
We expect zero lines of output on stderr,
but if any do show up they will be interesting.
I view the intent of the dev-null redirect
as simply discarding the expected "true".
In the error case we're going to have some [E] chatter anyway.
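A minimal sketch of that suggestion: only stdout is discarded, so anything git writes to stderr still reaches the log. The msg_e body and the require_work_tree wrapper are assumptions for illustration, since the script's own helper is not shown here.

```shell
#!/bin/sh
# Stand-in for the script's error helper; the real one may differ.
msg_e() { printf '[E] %s\n' "$*" >&2; }

# Discard only stdout (the expected "true"); git's own diagnostics stay visible.
require_work_tree() {
    if ! git rev-parse --is-inside-work-tree > /dev/null; then
        msg_e "Not inside a git work tree."
        return 1
    fi
}
```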
fatal error
msg_e "Failed to create branch ${AUTOBACKUP_BRANCH_NAME}."
That looks fatal to me.
You should probably bail at that point,
similar to the exit 1 up above in the case that there's no repo.
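One way to make that bail-out hard to forget is a tiny helper that reports and exits in one step. This is a sketch under assumptions: die and create_branch are hypothetical names, msg_e is a stand-in for the script's helper, and git branch is only a guess at how the branch gets created.

```shell
#!/bin/sh
AUTOBACKUP_BRANCH_NAME=main-autobackup

# Stand-in for the script's error helper; the real one may differ.
msg_e() { printf '[E] %s\n' "$*" >&2; }

# Report the failure and stop, so later steps never run without the branch.
die() {
    msg_e "$*"
    exit 1
}

# Assumed creation step; the point is the "|| die" that follows it.
create_branch() {
    git branch "$AUTOBACKUP_BRANCH_NAME" ||
        die "Failed to create branch ${AUTOBACKUP_BRANCH_NAME}."
}
```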
add vs remove
git add .
This grabs any new or edited files, great!
I note in passing that on the main branch we might have done git rm foo,
and the autobackup branch won't track such changes.
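Whether git add . picks up removals depends on the Git version and on the directory it runs from. A hedged alternative is git add -A, which is documented to add, modify, and remove index entries to match the working tree. The function name stage_everything is made up for this sketch.

```shell
#!/bin/sh
# Stage additions, edits, and deletions across the whole work tree,
# then show what is about to be committed (one status letter per path).
stage_everything() {
    git add -A
    git diff --cached --name-status
}
```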
single column
The assignment to DIFF_INFORMATION turns a single-letter text column
into a multi-word column which cannot be reparsed.
In particular the awk in the following line makes
references to $1 and $2 which are clearly incorrect:
once the status is several words long, $2 is no longer the filename.
Filenames can contain SPACE characters, and the
"pairing broken" description contains a SPACE character.
Prefer e.g. "pairing_broken".
The very long one-liner awk script is inconvenient.
Consider breaking it up, making it a separate script file,
or using /usr/bin/join in its place.
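A sketch of the single-token idea: git's --name-status output separates the status from the path with a TAB, so splitting on TAB keeps filenames with spaces intact, and one-word labels keep the first column reparsable. The function name describe_status and every label other than pairing_broken are assumptions for illustration.

```shell
#!/bin/sh
# Read "STATUS<TAB>path..." lines (as printed by git diff --name-status)
# and replace the status code with a single-word label.
describe_status() {
    awk -F '\t' '
    BEGIN {
        OFS = "\t"
        name["A"] = "added";    name["M"] = "modified"
        name["D"] = "deleted";  name["R"] = "renamed"
        name["C"] = "copied";   name["T"] = "type_changed"
        name["U"] = "unmerged"; name["B"] = "pairing_broken"
    }
    {
        code = substr($1, 1, 1)            # R100 and C75 carry a score suffix
        $1 = (code in name) ? name[code] : "other"
        print                              # remaining fields stay TAB-separated
    }'
}
```

Typical use would be git diff --cached --name-status | describe_status, with the result still safe to split on TAB in a later step.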
undo
This check seems to appear a bit late:
if [ -z "$DIFF_OUTPUT" ]; then
msg_i "Last commit in $AUTOBACKUP_BRANCH_NAME is empty, deleting last commit."
git update-ref ...
Instead of an unconditional git diff-tree,
why not do that just in the case where we have diffs?
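One reading of that suggestion, as a sketch: check for staged changes before committing, so there is never an empty commit to delete afterwards. The name commit_if_changed, the msg_i body, and the commit message are assumptions; git diff --cached --quiet exits non-zero when the index differs from HEAD.

```shell
#!/bin/sh
# Stand-in for the script's info helper; the real one may differ.
msg_i() { printf '[I] %s\n' "$*"; }

# Commit only when something is staged; otherwise say so and move on.
commit_if_changed() {
    git add .
    if git diff --cached --quiet; then
        msg_i "No changes to back up."
        return 0
    fi
    git commit -q -m "autobackup $(date -u +%Y-%m-%dT%H:%M:%SZ)"
}
```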
history
Aha! It looks like this is what tries to thwart an attacker who rewrites history:
git push origin $AUTOBACKUP_BRANCH_NAME
The OP script is somewhat long.
If I were an attacker who compromised the server, I think I would
just alter the OP script so it seems to be sending "happy happy!"
autobackups to the origin server, while ignoring the trojan binaries.
Consider adopting this alternate approach:
- A secure internal auditing server uses rsync -a --checksum webserver:/opt/website . to efficiently grab a current copy of production files
- The audit server then performs diffing, change tracking, auditing and alerting on the local copy
This forces the attacker to rewrite the production rsync, which is certainly feasible.
A little better would be to make the webserver an NFS client which mounts /opt/website, and have the audit server sync from the NFS server. Presumably an attacker may have altered that server's data files, but not its code.
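The audit-side diffing step could be sketched as follows. It runs on the audit server against the copy that rsync fetched, and compares a checksum manifest with a trusted baseline. The function names, the use of sha256sum, and the manifest layout are all assumptions for illustration, not part of the OP script.

```shell
#!/bin/sh
# Runs on the audit server, never on the webserver. Fetch step, done beforehand:
#   rsync -a --checksum webserver:/opt/website .

# Print a "checksum  path" manifest, sorted by path, for every file under $1.
make_manifest() {
    (cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2)
}

# Compare the current copy ($1) against a trusted baseline manifest ($2).
# Exits 0 when nothing changed; otherwise prints the differing lines
# and exits non-zero, which is the hook for alerting.
audit_copy() {
    make_manifest "$1" | diff "$2" -
}
```

A first run would save the baseline with make_manifest website > baseline.txt, kept somewhere the webserver cannot write to; later runs call audit_copy website baseline.txt and alert on a non-zero exit.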