CUCC Expedition Handbook
Troggle - Kill it with Fire
This is a book: Kill It with Fire: Manage Aging Computer Systems (and Future Proof Modern Ones), by Marianne Bellotti.
Read the brief reviews:
About a third of the book is about management and doing repair/rewrite inside large organisations, but
nearly half of it is directly relevant to us:
- Consider iteration in place as the default approach.
- Beware of artificial consistency proposed as a way to "improve" technical value.
- Don't assume that what you can't see is simple to replace: there may be years of effort hiding considerable complexity which appears simple on the surface.
- Previous modernization efforts may have left behind serious scar tissue which you need to consider.
- Many people try to fix things which are not, in fact, broken. Don't optimise beyond diminishing returns.
- Modernization needs momentum to finish the job: don't take on risks without also producing compelling value.
- Decisions made to avoid rewriting the new code later are usually bad decisions.
- The only thing worse than attempting to fix the wrong thing is leaving the fix attempt unfinished.
- Site reliability expressed as uptime percentages is rarely the real issue.
- Speed of recovery after failure is always more important than you think.
- A perfect record of no failures can always be broken, but resilience is an accomplishment that lasts.
- Know when to run a systemic Code Yellow hackathon and when not to.
- Before anything else, always find out why previous modernisations failed.
- Spend a lot of time problem setting before you start to think about problem solving.
- Working through a major modernisation is all about managing scope.
- Code is much easier to write than it is to read. Modernisation means a lot of code reading.
- Human beings are absolutely terrible at estimating probabilities and risk. We always under-estimate the amount of work in a rewrite and over-estimate the likelihood of success.
- Success does not come all at once. What are the progressive success criteria during the reengineering?
- Use Diagnosis, Policy, Actions where there is little consensus about what success looks like.
- Use bullet journalling to maintain your own morale during a re-engineering project.
- When a failure is caused by user input, fail gracefully with a helpful message.
- When a failure is due to a programming or specification bug, fail hard and fast.
- Build simple, performant systems before you start on the bells and whistles.
- Prioritise simple code running fast early, so that you can iterate fast.
- An existing system is not a reliable specification for the proposed new system. Important behaviour will be implicit, not documented.
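
The two points in the list about failure handling (fail gracefully on bad user input, fail hard and fast on programming bugs) can be sketched in Python as follows. This is only an illustration under stated assumptions, not troggle code: UserInputError, parse_depth_metres, handle_form and mean_depth are hypothetical names invented for this example.

```python
import math


class UserInputError(ValueError):
    """Hypothetical exception: the problem is in what the user typed."""


def parse_depth_metres(text: str) -> float:
    """Turn a user-typed depth into a float, or explain what was wrong."""
    try:
        depth = float(text.strip())
    except ValueError:
        # User mistake: raise an error carrying a message fit to show the user.
        raise UserInputError(
            f"'{text}' is not a number; enter a depth in metres, such as 12.5"
        ) from None
    if not math.isfinite(depth) or depth < 0:
        raise UserInputError("depth must be a finite number, zero or greater")
    return depth


def handle_form(text: str) -> str:
    """Fail gracefully: bad input becomes a helpful message, not a crash."""
    try:
        depth = parse_depth_metres(text)
    except UserInputError as err:
        return f"Sorry, that did not work: {err}"
    return f"Recorded depth: {depth:.1f} m"


def mean_depth(depths: list[float]) -> float:
    """Fail hard and fast: an empty list here means the calling code is wrong."""
    if not depths:
        # Programming bug, not user error: stop at once so it gets noticed.
        raise AssertionError("mean_depth called with no depths: caller bug")
    return sum(depths) / len(depths)


if __name__ == "__main__":
    print(handle_form("12.5"))
    print(handle_form("quite deep"))
    print(mean_depth([10.0, 20.0]))
```

The design choice is that only UserInputError is caught and turned into a message. Any other exception is left to propagate, so a genuine bug stops the program where it happens instead of being hidden behind a polite apology.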
The book was published in March 2021.