Elected officials in the US Congress are worried that artificial intelligence might be used to generate videos and audio of them saying things that they never actually said. It’s fake news in its highest form.
At a Senate Intelligence Committee hearing yesterday (Sept. 5), senators asked Facebook’s chief operating officer Sheryl Sandberg and Twitter CEO Jack Dorsey how they were preparing to deal with artificially generated video on their respective platforms.
“Americans typically can trust what they see, and suddenly in video they can no longer trust what they see, because the opportunity to be able to create video that’s entirely different than anything in reality has now actually come,” James Lankford, a senator from Oklahoma, said during the hearing.
Politicians face a unique challenge with AI-generated video, sometimes called “deepfakes,” though that term more specifically refers to using AI to stitch one person’s face onto another person’s body, often for pornography. Given that much political discourse, from legislative meetings to hearings like the ones yesterday, is televised in the US, there’s an abundance of video where politicians are sitting down in good lighting, talking clearly into a microphone and facing a camera. It’s a goldmine for anyone trying to replicate them.
The AI technology used to generate these videos is called deep learning, a technique that has roared into use since 2012 and relies on massive amounts of data to learn increasingly complex tasks. In 2012, it was impressive for a researcher’s AI to reach about 85% accuracy at recognizing what was in an image. Now those same computer scientists are figuring out how to generate video of events that never happened.
Because this type of AI gets better at its task the more data it is given, it is more likely to produce realistic video of politicians, who are constantly on camera. These systems typically work, as researchers have demonstrated before, by taking an original video of someone speaking and morphing the speaker’s face into the politician’s.
Since the AI has seen a politician’s face in so many combinations of expressions and orientations, it’s able to predict what their face would look like if they were making the same expression as the person in the original video. By making thousands of these predictions and stitching them into a video, a new video with the face of a politician is generated. Companies like Lyrebird are also working to clone a person’s voice, and the audio needed to train that algorithm can also be taken from videos of politicians.
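The frame-by-frame face prediction described above can be illustrated with a deliberately simplified sketch. Real systems train a neural network to predict the pixels of a face; the toy below swaps in a plain nearest-match lookup, and every name in it (`politician_footage`, `closest_face`, `swap_faces`, the two-number "expression" coordinates) is a hypothetical stand-in rather than anything from an actual tool.

```python
from math import dist

# Hypothetical training data: each sample pairs an "expression" (two made-up
# numbers standing in for things like smile and mouth opening) with the
# politician's face as it looked in that moment of real footage.
politician_footage = [
    ((0.0, 0.0), "politician: neutral"),
    ((1.0, 0.0), "politician: smiling"),
    ((0.0, 1.0), "politician: mouth open"),
]

def closest_face(expression, footage):
    """Pick the stored face whose expression is nearest to the one requested.

    This lookup is an assumption-laden stand-in for the learned prediction
    a deep learning model would make.
    """
    _, face = min(footage, key=lambda sample: dist(sample[0], expression))
    return face

def swap_faces(driver_expressions, footage):
    """Make one prediction per frame of the original video, then stitch the
    predicted faces together in order to form the new video."""
    return [closest_face(expression, footage) for expression in driver_expressions]

# Expressions read off three frames of the original (non-politician) speaker.
driver_expressions = [(0.1, 0.0), (0.9, 0.1), (0.2, 0.8)]

fake_video = swap_faces(driver_expressions, politician_footage)
print(fake_video)
```

The more footage there is to draw on, the closer the best available match gets to the expression in each original frame, which is why an abundance of well-lit video of politicians matters so much.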
As for policing this technology, Facebook and Twitter seem unprepared. Dorsey had nothing to say on the subject, and Sandberg said that Facebook would explore the technology.
“Deepfakes is a new area, and we know people are going to continue to find new [areas of deceptive technology],” Sandberg said. “It’s a combination of investing in technology and investing in people.”
Given Facebook’s struggle to even educate its own moderators to police content on the site, expecting people with little expertise in image authenticity analysis to catch fake video might be a stretch.