“How screwed am I?” asked a recent user on Reddit, before sharing a mortifying story. On his or her first day as a junior software developer, in a first salaried job out of college, the user made a copy-and-paste error that inadvertently erased all data from the company’s production database.
Posting under the heartbreaking handle cscareerthrowaway567, the user wrote:
The CTO told me to leave and never come back. He also informed me that apparently legal would need to get involved due to severity of the data loss. I basically offered and pleaded to let me help in someway to redeem my self and i was told that i “completely fucked everything up”…
I haven’t heard from HR, or anything and i am panicking to high heavens. I just moved across the country for this job, is there anything i can even remotely do to redeem my self in this situation? Can i possibly be sued for this? Should i contact HR directly? I am really confused, and terrified.
If the story is real, cscareerthrowaway567 is hardly alone in the agony of watching a painfully human goof metastasize into catastrophe.
In December, a coding error in Snap’s latest iOS update accidentally jammed the network that keeps more than 15 million computer systems synchronized to the clock. A typo from a busy Clinton campaign aide inadvertently opened the door to the Russian hack of John Podesta’s emails. The British Airways power outage that disrupted tens of thousands of flights last week was reportedly caused by a tech support worker accidentally flipping the power off.
The point is, any system in which humans are involved will at some point be disrupted by human error. Organizations distinguish themselves not by stamping out the possibility of error, but by handling the inevitable mistake well.
As Redditors saw it, cscareerthrowaway567 made one mistake. The company made several. It didn’t back up the database. It had poor security procedures and a sloppily organized system that encouraged the very error cscareerthrowaway567 made. Then, rather than taking accountability for those problems, the CTO fired the rookie who revealed them. Of all the errors this company made, that last might be the most destructive to its future success.
An extensive review of employee teams at Google found that the most successful were those with a high level of psychological safety. In other words, when employees felt safe enough to take risks (and make mistakes) without being shamed or criticized, they did better work.
“The wisdom of learning from failure is incontrovertible. Yet organizations that do it well are extraordinarily rare,” wrote Amy Edmondson, the Harvard Business School professor who coined the term “psychological safety.”
For a rare example of a better company response, let’s look back at the engineer error that caused an Amazon server outage on Christmas Eve 2012, which disrupted Netflix and other services that relied on the company for cloud computing. Amazon wrote a detailed account of the event, explaining how the outage occurred, how it was resolved, and what had been changed to prevent the problem from happening again. The focus was on fixing the problem, not blaming the individual.
“For all that’s wrong with Amazon, the best part was when someone fucked up, the team and the company focused only on how we make it never happen again,” a former employee wrote on the forum. “A human mistake was a collective failure, not an individual one.”
Another respondent related all too well.
“Hi, guy here who accidentally nuked GitLab.com’s database earlier this year,” wrote Yorick Peterse, the software developer who accidentally wiped out live production data during a late-night work session and nearly melted down the site. GitLab chronicled its recovery efforts live on YouTube and in a Google Doc, and treated the incident as a company problem instead of an individual one.
“GitLab handled this very well,” wrote Peterse, who is still with the company. “Nobody got fired or yelled at, everybody realised this was a problem with the organisation as a whole.”
Peterse now has enough distance from his own experience to also see humor in the sheer scale of such screw-ups, he told Quartz over email. Still, he recognized the pain of the young software developer.
“I also felt quite annoyed by how the company of the story supposedly handled the situation, potentially scarring the junior’s career and in particular confidence,” he wrote from his home in the Netherlands. “Somebody new should be guided through a setup procedure (especially when it involves production access), not thrown into the depths, only to be told to ‘leave and never come back’ when they make a mistake.”
Indeed, the unfortunate young developer’s experience seems to have struck a chord with many, and whether there’s legal action or not, the court of public opinion is on the new guy’s side: In a poll on the tech site The Register, fewer than 1% of 5,400 respondents thought the new developer should be fired. Forty-five percent thought the CTO should go.