In the age of automation, the “human error” that dooms a plane isn’t confined to the cockpit

Remembering JT610.
Image: Reuters/Willy Kurniawan

Over a century of air travel, Boeing once noted, human error on the part of pilots or mechanics became the major factor in air accidents, as the machines themselves grew ever safer.

Just days ahead of the one-year anniversary (Oct. 29) of the first deadly crash of the Boeing 737 Max, Indonesia’s final investigation report put the blame for the Lion Air crash as much on Boeing’s design of the aircraft as on a range of other factors, including lapses in the US regulatory process, Lion Air maintenance, and flight crew response. The model was expected to revolutionize budget travel with its fuel savings, but after a second deadly crash, of an Ethiopian Airlines flight in March, the plane was grounded globally. Together, the two crashes of the same brand-new model killed 346 people in under six months.

“We found nine items we consider contributes to this accident,” Indonesia’s National Transportation Safety Committee chief investigator Nurcahyo Utomo told reporters on Oct. 25. “If one of them is not occurring on the day, the accident maybe would not happen.”

Nevertheless, of the 25 safety recommendations the Indonesian investigators issued, more than half were directed to Boeing and the US Federal Aviation Administration (FAA). The company’s CEO, Dennis Muilenburg, will face questions about its missteps when he appears on Tuesday before a US Senate committee investigating the crashes.

In an age where planes are ever more complex, the human errors that can doom a plane are no longer confined to the cockpit. Many of the most crucial ones involving the Max occurred years ago, in Boeing offices and in simulators, well before the pilots of Lion Air JT610 climbed into its cockpit. Those earlier errors included, among others, the incomprehensible design decision to allow the new flight-control system that was linked to both crashes—the Maneuvering Characteristics Augmentation System (MCAS)—to activate based on data from a single sensor, and stunningly sunny assumptions about pilot reaction time under stress.

Pilots say that since the Lion Air crash they have come to a more urgent realization of how heavily the weight of flawed assumptions can fall on them, and that this is shaping how they’re dealing with Boeing and the FAA in the effort to re-certify a redesigned Max, which Boeing hopes to return to service by the end of this year.

“It’s brought back an intellectual aggression to know more about how these aircraft are designed,” said Dennis Tajer, spokesman for the Allied Pilots Association, which represents American Airlines pilots. “As dark as it’s been, we’re going to be in a better place after this.”

A woman holds a photograph of a relative who died in the Lion Air crash at a rally in Jakarta in December calling on the airline to improve its safety.
Image: Reuters/Darren Whiteside

Lion Air JT610 and “the effect of surprise”

Last year, when Indonesian investigators released a preliminary crash report, they described a grim battle between the pilots of Lion Air flight JT610 and MCAS, the automated anti-stall system that was put on the 737 Max so it would perform similarly to earlier models in the 737 line. That would allow it to be certified as a variant of an earlier model and adopted by airlines with little additional training, at a time when Boeing was trying to play catch-up (Quartz member exclusive) with Airbus’s more fuel-efficient A320neo.

On Lion Air JT610, MCAS kicked in just minutes after the flight took off from Jakarta, and began forcing the plane’s nose down in reaction to incorrect data from a single sensor—a design decision that made it far less safe than the better-designed version Boeing put on a military plane. Each time the 31-year-old Indian captain, Bhavya Suneja, tried to raise the nose back up, the system would reset and counter him—requiring ever greater force on his part to stabilize the plane. At one point Suneja handed control to his co-pilot, who didn’t realize how strongly he had to fight the plane. “Wah, it’s very…” he exclaimed, before trailing off.

The back-and-forth between pilot and machine happened more than 20 times, until eventually the plane crashed into the sea near the capital Jakarta less than 15 minutes after takeoff. Neither pilot ever thought to carry out the steps that, according to Boeing, would have prevented such a catastrophe.
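
The dynamic the investigators described can be sketched as a toy model. This is illustrative Python, not Boeing’s flight-control code: the angle-of-attack trigger, the trim step, and all of the names below are invented for the example.

```python
# Toy model of the activation-and-reset cycle investigators described.
# All thresholds and increments here are invented for illustration.

AOA_TRIGGER_DEG = 15.0   # hypothetical angle-of-attack trigger
TRIM_STEP_DEG = 2.5      # hypothetical nose-down trim per activation

def mcas_commands_nose_down(aoa_single_sensor_deg: float) -> bool:
    # The flaw flagged by investigators: the decision rests on ONE
    # angle-of-attack vane, so a single faulty sensor drives the trim.
    return aoa_single_sensor_deg > AOA_TRIGGER_DEG

stabilizer_trim_deg = 0.0
faulty_reading_deg = 21.0    # erroneous data from the single vane

for cycle in range(1, 22):   # JT610's recorder logged 20+ such cycles
    if mcas_commands_nose_down(faulty_reading_deg):
        stabilizer_trim_deg -= TRIM_STEP_DEG  # system pushes the nose down
    stabilizer_trim_deg += TRIM_STEP_DEG      # pilot counter-trims...
    # ...which also resets MCAS, so it fires again on the next cycle
    print(f"cycle {cycle:2d}: trim back to {stabilizer_trim_deg:.1f} deg")
```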

Chart showing trim occurrences from Indonesia's preliminary report on Lion Air JT610 crash issued in Jakarta on November 28, 2018.
The blue line shows the pilots’ efforts to counter the nose-down trim; the orange lines show MCAS actions.
Image: Indonesia National Transportation Safety Committee

This is not what Boeing expected at all. In designing and testing the 737 Max, Boeing assumed—in keeping with FAA guidance for test pilots—that in the case of an undesired activation of MCAS, it would take pilots one second to figure out what the problem was and three seconds to react appropriately, by flipping switches that would override the system. That assumption turned out not to be realistic.

On Oct. 29, 2018, Suneja, the captain, had the flu, coughing as many as 15 times in a single hour ahead of takeoff, according to the cockpit voice recorder as detailed in the latest Indonesian report. His co-pilot, Harvino, who went by only one name, wasn’t even scheduled to fly that day, and had been woken up at 4am and told to report for the flight. Comments in Suneja’s past assessments suggested he needed to do better on teamwork, such as communication with his co-pilot. Harvino, meanwhile, had received multiple evaluations over the years that flagged issues. One trainer remarked that he needed to manage his stress, while another registered a concern over “situational awareness”—the ability to synthesize information and predict what’s about to happen. In the years before he flew the Max, though, his evaluations improved.

Their skills were almost certainly not on par with the Boeing pilots who evaluated the hazards of the 737 Max design during the plane’s development. And the same could likely be said of many pilots at the airlines around the world to whom Boeing was selling the aircraft.

On the day of the crash, the erroneous data that activated MCAS also set off a number of alerts, including a device called a “stick shaker,” which vibrates the control column noisily, as well as other audible warnings. Meanwhile, instructions from Indonesian air traffic control, which didn’t realize how serious the situation was, added to the pilots’ workload.

“There was a tsunami of distractions going off in an airplane. Unless you precisely come in and interrupt it you are heading for a plane that is heading for the ground,” said Tajer, the spokesman for the Allied Pilots Association, adding, “You shouldn’t have to count on superhuman behavior when you’re designing an aircraft.”

But Boeing never tested for this toxic brew of factors. And in certifying the plane, the FAA, which didn’t fully understand the new system—or how powerfully it had been souped up along the way to meet other FAA requirements—didn’t either. In the single case of pilots recovering from an incorrect MCAS activation in the real world, on a Lion Air flight the day before the crash, it took them 3 minutes and 40 seconds to regain control, with the help of a third pilot who happened to be present.

The assumptions made by Boeing and the FAA have been sharply criticized in three separate reports: from the US National Transportation Safety Board, from the Joint Authorities Technical Review (JATR), a group composed of representatives of global civil aviation agencies, and from Indonesia’s safety investigators.

The JATR report said that it could find no studies to support the FAA’s 4-second rule guidance and that the agency’s own studies found that it could take pilots “many seconds, and in some cases many minutes” to recognize and respond to a malfunction. It urged the agency to conduct “substantive scientific studies” of pilot reaction time, including to account for “the effect of surprise” (pdf, p. 14).

One thing that might have mitigated the effect of surprise was pilot knowledge about MCAS—for example, in the form of training. That would have given them a mental map of what was going on, according to Mica Endsley, a former chief scientist for the US Air Force whose work was cited in the Indonesia report. She’s now the president of SA Technologies, which works with the aviation industry and the military, among others, to improve situational awareness around complex systems.

In 2016, however, the pilot interacting with the FAA about training recommended that references to MCAS be removed from flight manuals and computer-based training, according to an email chain reviewed by Quartz. The pilot boasted in the emails of “jedi mind tricking” regulators into accepting his recommendations. When the plane entered service, MCAS was not mentioned in flight manuals or in the iPad lesson—just an hour long in some cases—that pilots familiar with earlier versions of the 737 completed before flying the Max.

Preventing “automation confusion”

As automation spreads, experts in human-machine interaction who can imaginatively work through worst-case scenarios are becoming ever more vital. According to Endsley, more than two decades of developments in automation mean there are now plenty of people who are good at this kind of work. But the 737 Max approval process shows that the importance of these experts to design and certification isn’t being given due weight at Boeing or at the FAA.

“The problem is not one of personnel availability, it is one of understanding the importance of human-factors science and prioritizing it in aircraft certification,” said Endsley in an email. Of Boeing, Endsley added, it “at one time had a very highly regarded human factors program, and they’ve lost a lot of those people over the last 10-15 years.” Endsley attributed this in part to experienced people retiring, but also to a deterioration in safety culture at the organization.

The FAA referred Quartz to its statement issued after the release of the Indonesian report. The agency said it welcomed the recommendations and was “committed to ensuring that the lessons learned from the losses of Lion Air Flight 610 and Ethiopian Airlines Flight 302 will result in an even greater level of safety globally.”

In September, after a five-month review, Boeing announced steps to beef up its approach to safety, including setting up an organization to unify safety-related responsibilities across company units. MCAS has now been redesigned so that it activates only after comparing data from two sensors, and it will fire only once rather than repeatedly. Training decisions are still being finalized.
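
A minimal sketch of what that cross-check implies, assuming a simple two-sensor agreement test. The class name, disagreement tolerance, and trigger threshold below are assumptions for illustration, not Boeing’s published values.

```python
# Sketch of the redesigned activation rule as publicly described:
# compare two angle-of-attack sensors and fire at most once. The
# tolerance and threshold values are assumptions for illustration.

DISAGREEMENT_TOLERANCE_DEG = 5.5   # hypothetical cross-check tolerance
STALL_TRIGGER_DEG = 15.0           # hypothetical activation threshold

class RedesignedMcas:
    def __init__(self) -> None:
        self.has_fired = False     # one activation per high-AoA event

    def should_activate(self, aoa_left_deg: float,
                        aoa_right_deg: float) -> bool:
        # If the two vanes disagree, treat the data as suspect and
        # stay out of the loop rather than trimming the nose down.
        if abs(aoa_left_deg - aoa_right_deg) > DISAGREEMENT_TOLERANCE_DEG:
            return False
        if self.has_fired:         # no repeated reset-and-fire cycles
            return False
        if min(aoa_left_deg, aoa_right_deg) > STALL_TRIGGER_DEG:
            self.has_fired = True
            return True
        return False

mcas = RedesignedMcas()
print(mcas.should_activate(21.0, 2.5))   # sensors disagree -> False
print(mcas.should_activate(21.0, 20.5))  # agree and high -> True, once
print(mcas.should_activate(21.0, 20.5))  # already fired -> False
```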

Boeing didn’t immediately respond to questions for this story.

One key finding from the kind of research Endsley and others have done into how humans behave around automation is the problem of “automation confusion”: the more automation there is, the worse human beings tend to perform when they unexpectedly have to intervene and diagnose a problem.

On a Boeing or Airbus commercial plane, pilots report typically flying the plane manually for about three to seven minutes of a flight—during takeoff and the initial climb, and then on landing.

“People have a much lower understanding of events when they are not actively doing something but rather just passively watching someone else, in this case the automation, do those tasks,” said Endsley. “It is as if you were the passenger in a car who went to a party, and then at the end of the night found out you were expected to drive home. Your knowledge and understanding of how you got there would likely not be as good as if you had been the one driving. In many ways automation turns us all into passengers.”

That makes “automation transparency” to the end user all the more important—signaling clearly that there’s new automation and what it’s likely to do. In the case of MCAS, that might have meant both informing pilots about the automated anti-stall system in advance and indicating on the flight display when it was activated—which would, of course, have meant different, and more expensive, training.

“A lot of automation has been ‘silent but deadly,’ acting in the background but not communicating well with the people who ultimately have the responsibility for the safety of the operation,” said Endsley. “The biggest reason for this is that the engineers designing the automation assume that their system will behave properly… This of course is a very bad assumption.”