BLAME GAME

Why Etsy engineers send company-wide emails confessing mistakes they made

Fear of failure is often baked into workplace culture in a way that makes people either unwilling to take risks, inclined towards hiding mistakes, or too ready to blame others. Etsy, the recently public craft-focused e-commerce site, has made a concerted effort to change that.

In a conversation yesterday (Sept. 17) with Quartz editor-in-chief Kevin Delaney, Etsy CEO Chad Dickerson revealed that people at the company are encouraged to document their mistakes and how they happened, in public emails.

“It’s called a PSA and people will send out an email to the company or a list of people saying I made this mistake, here’s how I made that mistake, don’t you make this mistake,” Dickerson said. “So that’s proactive and I think really demonstrates that the culture is self perpetuating.”

He was referring to the company’s efforts at practicing a “just culture,” based on the idea that blamelessness makes people more accountable, and more willing to admit mistakes.

As described by Etsy CTO John Allspaw in an Etsy blog post, engineers (and now others at the company) who mess up are given the opportunity to give a detailed account of what they did, the effects they had, their expectations and assumptions, and what they think happened. And, crucially, they can give that account without any fear of punishment or retribution, in what’s called “a blameless post-mortem.”

According to Allspaw, the first PSA emails began in late 2010 or early 2011, when engineers at Etsy encountered a particularly obscure, common-sense-defying bug that they thought others might run into as well. They shared it around in the hope that it might save engineers a headache in the future. The practice spread from there.

The confessions are self-initiated, though employees encourage each other to send them. “At this point, it’s almost a tacit understanding that someone would do this, it’s a social contract that benefits both the author of the PSA (because they have to describe the conditions of the issue) as well as the readers. It’s similar to placing a ‘wet paint’ or ‘caution—slippery floor’ sign,” Allspaw wrote in an email to Quartz.

Here’s an example Allspaw shared with us, with a few identifying specifics removed, which gives a good sense of the tone and content of the emails:

Howdy!

While <doing some specific development> I introduced some bugs into the code. <Engineer #1> alerted me to what could have been a serious problem when they reviewed the code. I share this with you all to remind you of a few things:

  1. Tests tell you what you tell them to. I wrote tests, the tests passed. That made me confident that everything was okay when it really wasn’t. One of these tests in particular was literally proving that I was calling a method incorrectly. Lesson: you can write tests that pass when things are wrong – a passing test is great but doesn’t mean you’re done.
  2. I got the code reviewed but no one caught the problem (the first time). Lesson: get more eyes on your code. The more risk involved, the more eyes you’ll need. If I hadn’t gone to <specific team> to get this code reviewed by more folks, there could have been problems with <specific feature>. No one wants problems with <feature>! Additional Lesson: Be like <Engineer #1> – read reviews with care. Bonus Lesson: One of the bugs they caught had no direct relation to their team’s code. Domain knowledge is not a direct requirement for thorough code review.
  3. Manually test! In this case, the manual test would have failed. I hadn’t gotten to that yet, and wasn’t planning on skipping out on manual testing, but I’m mentioning it to reinforce the trifecta of confidence. Don’t skip manual tests!

….

Thanks for listening!

It’s not just major mistakes, such as those that lead to a site outage, that get shared and analyzed this way. Rigorously documenting “near misses” like the one above is like a vaccine for Etsy, Allspaw believes. It helps the company better defend against more serious errors in the future, without harming anyone or anything in the process.

The company also gives out an annual award (a real three-armed sweater) to an employee who’s made an error. This demonstrates that accidents are seen as a source of data, not something embarrassing to shy away from, according to Allspaw. The sweater goes to whoever made the most surprising error, not the worst one, as a reminder to examine the gap between how things are expected to happen and how they actually do.

And if you run into an error on the site, “which hopefully you don’t very often,” Dickerson notes, you’ll find an image of a woman knitting a three-armed sweater.
