JEB’s rules of IT

  1. Always have a backup.
  2. Always monitor your backups.
  3. Always back up your monitoring.
  4. Always investigate failures; work out how to monitor for each one so you catch it quicker next time (hopefully before anything fails).

Backups without monitoring are no backups at all; if your backups fail, you won’t know until you try to use them.
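
As a minimal sketch of what "monitoring your backups" could mean in practice, the check below flags a backup file that is missing, stale, or suspiciously small. The function name, the 26-hour age limit, and the size threshold are all assumptions chosen for illustration; pick values that fit your own backup schedule.

```python
import os
import time


def check_backup(path, max_age_seconds=26 * 3600, min_size_bytes=1, now=None):
    """Return a list of problems with the backup at `path` (empty list = looks healthy).

    The thresholds are illustrative assumptions, not recommendations.
    """
    now = time.time() if now is None else now
    try:
        info = os.stat(path)
    except OSError as exc:
        # A missing or unreadable backup is the most important failure to report.
        return [f"cannot stat backup: {exc}"]

    problems = []
    age = now - info.st_mtime
    if age > max_age_seconds:
        problems.append(f"backup is stale: {age / 3600:.1f} hours old")
    if info.st_size < min_size_bytes:
        problems.append(f"backup is too small: {info.st_size} bytes")
    return problems
```

Run something like this from your monitoring system rather than from the backup job itself, so that a backup job that never starts still raises an alert.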

A note about RAID

Unmonitored RAID 1/5/6/10 is no better than a single disk; disks pop one at a time, and you won’t notice until the failure that takes the whole array down.

A note about UPSs

UPSs fail, OK.

Resilience and cost

Resilient, fast, and cheap; choose any two.

Lies in status messages

Just because something says OK doesn’t mean it is OK.
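
One way to avoid trusting an "OK" is to check the result independently. The sketch below assumes you have restored a file from backup and still have the original to hand; it compares SHA-256 checksums instead of relying on the restore tool's exit status. Both function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original_path, restored_path):
    """True only if the restored file is byte-for-byte identical to the original."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

A test restore that passes this check tells you more than any status line from the backup software.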