Big emergency meeting!
I have a pet peeve with big emergency meetings in tech projects, read: war rooms.
From a management and industry perspective, they’re acceptable! After all, they’re one of many techniques used for problems that need a shared, heightened state of awareness. I concede that they can be very useful on a case-by-case basis.
My problem is not with the mechanism itself though. The wound is in the surrounding context often attached to big emergency meetings:
- They’re often a late symptom of bad systems design. Whoever was working on the problem didn’t have enough time, incentive, or tools to solve it well and early.
- They’re a magnet for stress and negativity for all those involved. Picture a group of people, one doing “the thing” and five others surrounding the first, observing. In emergency meetings we get more people. And instead of observing: screeching and shouting matches due to the “heat of the moment”.
- They make use of military terms in non-military situations. …
Adrenaline and blood pressure intensify.
There are also edge-case situations. How many of us have endured one of these meetings turning into the final scenes of the movie Downfall?
Out-of-touch leadership strategizes over a problem space by assigning teams to it. Most of those teams are, in reality, understaffed, overworked, and sometimes nonexistent. The problem space has also changed. Leadership ignores the outcries of other team members who clamor for a return to reality.
But! Hope is not lost… there’s a way out of these.
It involves designing our development project or target system in an alternative way. The guiding focus being: end surprises and make everything “boring”.
There’s a lot to say on making things “boring”. I’ll spare you the big text and will share a small set of personal guiding questions:
- Rethink observability: Can we observe a system and extract useful human-readable information? Is the answer the same in the context of an emergency? Can someone with little implicit knowledge and context make meaningful observations?
- Self-repair systems: What repair actions do we often do that could be a shell script, a cron job, …? Can we make the system take those actions by itself?
- Deep testing feedback loops: what prevents our system from being testable? What makes the system’s testability vary through time? What keeps us from doing actual testing? What keeps us from testing often?
- Optimize communication: how do we currently communicate to solve any problem? How do we get agreement? How do we make decisions? How would we achieve the same with increasing limitations in synchronicity?
- Hands-off incident decision trees: Given an incident, how do we study it? How do we revert any changes? Can we handle any incident with little preparation? Do we have meaningful documentation and tooling that allow us to do so in a self-sufficient way?
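To make the self-repair question a bit more concrete, here is a minimal sketch of what "let the system take the action by itself" could look like. Everything in it is an assumption for illustration: `self_repair`, `RepairResult`, and the `check` and `repair` callables are hypothetical names, not part of any real tool. The idea is simply to run a health check, apply a known repair action a bounded number of times, keep a human-readable trail, and only then escalate to a person.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RepairResult:
    healthy: bool      # did the check pass in the end?
    attempts: int      # how many repair actions were run
    log: List[str]     # human-readable trail for whoever looks later


def self_repair(
    check: Callable[[], bool],    # hypothetical health check, returns True when healthy
    repair: Callable[[], None],   # hypothetical repair action (restart, cleanup, ...)
    max_attempts: int = 3,
) -> RepairResult:
    """Run `repair` until `check` passes or `max_attempts` is reached."""
    log: List[str] = []
    attempts = 0
    while True:
        if check():
            log.append("check passed")
            return RepairResult(True, attempts, log)
        if attempts >= max_attempts:
            # Bounded on purpose: an endless repair loop is its own surprise.
            log.append("giving up, escalate to a human")
            return RepairResult(False, attempts, log)
        attempts += 1
        log.append(f"check failed, running repair attempt {attempts}")
        repair()
```

A function like this could be called from a cron job or a scheduler. The log it returns doubles as a small observability aid: someone with little context can read what was tried before they were paged.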
From these questions we could draw more questions and more answers. The exercise itself is the key. It’s a first step toward getting rid of big emergency meetings in tech. And a first step toward bringing the way we work to a saner and happier future.
Dare to ask, and keep pushing on for answers.
If you read this far, thank you. Feel free to reach out to me with comments, ideas, grammar errors, and suggestions via any of my social media. Until next time, stay safe, take care! If you are up for it, you can also buy me a coffee ☕