I recently took part in a facilitated discussion (The Codemania Conversations unconference) where I stood with my peers @joshrob, @petegoo and @kiwipom from Pushpay. In discussing blameless post-mortems and just engineering culture, I talked about the difficulties I had experienced in understanding how to be an effective participant in our blameless post-mortems.
The entire concept of running a post-mortem after an incident was new to me; I only learned about post-mortems when I joined Pushpay in 2014. I had some exposure to the idea and had been part of many discussions around this blog post by John Allspaw. Initially it seemed like a ritual, a good one, but I certainly didn't understand how beneficial it was, or that it is such a key part of being able to move fast, experiment as an engineer and not fear change.
I had come from a background where the attitude to incidents and their remedies was entrenched in the idea that we must hunt for the single root cause of an issue, weed out the careless bad apples and deal corrective measures down to those 'responsible'.
When something went wrong, a lot of the behaviour surrounding the remediation was driven by fear of punishment and resulted in highly political and often combative meetings. The output of those meetings would be mitigations that didn't catch all the problems and, usually, more process layered on so 'this can never happen again'. That process usually came from a management view, while those involved awaited another meeting to find out what was to be done.
I now understand that it was both a toxic and an expensive use of our energy, and it didn't even give us the kind of mitigations we as leaders would hope for. If anything, in some cases it was making us go SLOWER. But that is just how things are, right?
So a blameless culture of trust and accountability seemed great, but was it actually real? One of my peers insisted: 'You want people to know something went wrong, that way we can fix it and probably a whole bunch of other things all together!'
So what was it that really got me to understand blameless post-mortems and a restorative just engineering culture?
Well, it was watching these short video talks by Sidney Dekker on his site here. There are just four sessions and each runs about ten to fifteen minutes.
Module 1 — Introduction
Module 2 — Retributive Just Culture
Module 3 — Restorative Just Culture
Module 4 — Second Victims
I really feel these videos are a critical first step for anyone seeking to understand why we should strive, in our teams and companies, to build a culture that embraces restorative justice. They make the case by contrasting it with cultures built on retributive justice.
A retributive justice culture needs people and particular systems to be at fault and to be the root cause. We had an incident and something is to blame!
It then follows that there must be some process or decision tree to go through to decide what corrective actions are going to be taken, and against whom. During this process, as we determine who is to blame, it's very likely that the finer detail around the incident and its timeline becomes obscured, be it through fear or political behaviour.
This loss of the finer detail is one of the most critical reasons why blame is so harmful and toxic. It's likely that some or all of what a community or team could learn from an incident will be obscured or silenced through fear of punishment, especially when it comes from the people who were involved, who are the ones best able to tell the story of what happened. After all, they were there!
Restorative justice culture, on the other hand, is one based on trust and accountability. An honest timeline of events is produced by those involved, with the assumption that they acted in everyone's best interest, to the best of their abilities, with the information they had at hand. All sorts of things that contributed to an incident will become apparent if people are not afraid to share their story with their team.
When the story is told with all the finer detail out for all to see, we start to see other failures in our systems which also contributed to an incident. It's almost never just a single thing or root cause that triggered the incident.
Here is where we get some of the richest mitigations and learnings as a team and community of engineers. We often find a bunch of mitigations we can do as a team to reduce the chance of our peers being put in a risky position in the future. And that is a really great place to be, let me tell you.
Restorative justice culture enables us to come together as a community of engineers, learn from failure and really improve.
So ask yourself honestly ... which world would you rather live in?
I plan to talk a lot more about blameless post-mortems in future posts, but I hope the Sidney Dekker videos help contribute to your thinking about the kind of engineering culture you wish to build in your team or company.