We were ready for our Disaster Recovery exercise. The plan was simple: fail over to our DR site, and then fail back.
But during our last-minute safety checks, my colleague and I spotted a problem. We realised that, if we proceeded with the planned fail-over, there was a significant risk of data loss.
Some of the team wanted to continue anyway. We were, after all, trying to simulate a real disaster situation. If a real disaster were to occur, they argued, we would have to deal with similar consequences.
Debating the merits of this position, I suggested that:
There is a difference between falling off a bridge and throwing yourself off. The first is an unfortunate accident; the second is downright recklessness.
In agreement, my colleague replied that:
Continuing with the exercise would be like throwing yourself off a bridge now just in case you fall off one later!
A deeper analysis of the risk suggested that, if we were careful, we could throw ourselves off this particular bridge safely. Although there was likely to be data loss, there was a manual process available that would recover the missing records.
So, off we jumped…