Here's the code that gives the desired output mentioned in the question, for anyone who's interested. It's just a tiny adaptation of @Ed's really smart code.
# Requires GNU awk (gawk): uses the three-argument match() and gensub().
BEGIN { print "#!/bin/bash" }
/^#/ { prt(); print; next }
{ files[$0] }
END { prt() }

function prt(    file, isDate, isKeep, isDelete, backup, latest, pats) {
    # file exists in a current backup directory (yes|no)
    backup = "no"
    # latest historical backup date
    latest = "000000"
    for (file in files) {
        if ( file ~ /\/Library\// ) {
            # files to check manually
            isKeep[file]
        }
        else if ( file ~ /\/(labs data|backup-current)\// ) {
            # backup files to keep
            isKeep[file]
            backup = "yes"
        }
        else if ( match(file, /\/(backup-disk-name\/|backup-)([0-2][0-9][0-1][0-9][0-3][0-9])\//, pats) != 0 ) {
            # files in historical backup directories
            if ( pats[2] > latest ) {
                latest = pats[2]
            }
            isDate[file] = pats[2]
        }
        else {
            # unclassified files to check manually
            isKeep[file]
        }
    }
    for (file in isDate) {
        # keep the latest historical copy only when no current backup exists
        if ( isDate[file] == latest && backup == "no" ) {
            isKeep[file]
        }
        else {
            isDelete[file]
        }
    }
    for (file in isKeep) {
        print "#", file
    }
    for (file in isDelete) {
        # use single quotes to escape special characters in file
        # use gensub() to escape single quotes in file
        print "rm", "'" gensub(/'/,"'\\\\''", "g", file) "'"
    }
    delete files
}
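For anyone puzzled by the quoting on the `rm` line: each path is wrapped in single quotes, and every single quote inside the path is replaced by `'\''` (close the quote, emit an escaped quote, reopen the quote). Here is a minimal shell sketch of the same idea. It uses `sed` rather than gawk's `gensub()`, and `quote_path` is a hypothetical helper name that is not part of the script above. Like the awk script, it assumes paths do not contain newlines.

```shell
# Hypothetical helper (not part of the awk script): quote one path so that
# the shell reads it back as a single literal word.
quote_path() {
    # sed receives the script s/'/'\\''/g, so each ' in the path becomes '\''
    # The result is then wrapped in a pair of single quotes.
    printf "'%s'\n" "$(printf '%s\n' "$1" | sed "s/'/'\\\\''/g")"
}

# Example: build an rm line the way the awk script does.
printf 'rm %s\n' "$(quote_path "it's here.txt")"
# prints: rm 'it'\''s here.txt'
```

Since the generated script only contains `rm` lines and comments, it is probably worth reading it through before running it.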
Finally, I would like to share some thoughts. I hope I'm not digressing too much.
A few weeks ago I resolved to finally clean up that monstrous backup data (some files have more than 10 duplicates). But I couldn't find a tool to automate the task. I didn't want to write a C program for that, and I didn't want to go the Perl way either. So I knew I had to (and I wanted to) go the shell way. But I didn't know where to start and got stuck on the first lines.
After reading a lot, I was still very confused. So I decided to post my question on SE.
When I first read @Ed's code I thought "What the hell!". Then, when I got it, I realized it's a brilliant piece of code, highly efficient and clear.
So here we are. About one week ago, I didn't know anything about awk and very little about RegExp. Now, thanks to @Ed's contribution, I've been able to write "my" first awk script, better understand the RegExp world, and complete the task at hand. More importantly, I'm now confident enough to dive deeper by myself into RegExp, awk and other text processing shell tools. It also motivates me to contribute to StackExchange.
I just wanted to share my personal experience and give hope to others who, like me, may be stuck on a problem that feels like facing a mountain.