This weekend (as some of you may now) I accidently nuke this Pod’s entire data volume 🤦♂️ What a disastrous incident 🤣 I decided instead of trying to restore from a 4-month old backup (we’ll get into why I hadn’t been taking backups consistently later), that we’d start a fresh! 😅 Spring clean! 🧼 – Anyway… One of the things I realised was I was missing a very critical Safety Controls in my own ways of working… I’ve now rectified this…
So I re-write this shell alias that I used all the time alias dkv="docker rm" to be a much safer shell function:
dkv() {
if [[ "$1" == "rm" && -n "$2" ]]; then
read -r -p "Are you sure you want to delete volume '$2'? [Y/n] " confirm
confirm=${confirm:-Y}
if [[ "$confirm" =~ ^[Yy]$ ]]; then
# Disable history
set +o history
# Delete the volume
docker volume rm "$2"
# Re-enable history
set -o history
else
echo "Aborted."
fi
else
docker volume "$@"
fi
}
Then I cleaned up my shell history of all of the invocations I ever made of dkv rm ... to make sure I never ever have this so easily accessible in my shell history (^R):
$ awk '
/^#/ { ts = $0; next }
/^dkv rm/ { next }
{ if (ts) print ts; ts=""; print }
' ~/.bash_history > ~/.bash_history.tmp && mv ~/.bash_history.tmp ~/.bash_history && history -r
This is an example of what I believe every SRE should master and whatever Post Incident Review (PIR) should focus on. Where did the system fail. What are the missing or incomplete Safety Controls.
@lyse@lyse.isobeef.org I’m open to other suggestions 🤣 But hopefully both adding the additional prompt, not allowing it to enter shell history and removing from my shell history prevents me from doing such silly things in haste by pressing ^R and using fuzzy search which if you type fast you sometimes get wrong 😑
@kate@yarn.girlonthemoon.xyz Fair enough! 😂 Also a good approach, change the environment 🤣