For years we’ve been recorded in public on security cameras, police bodycams, livestreams, other people’s social media posts, and on and on. But even if there’s a camera in our face, there’s always been a slight assurance that strangers wouldn’t really be able to do anything that affects us with the footage. It would simply take too much time and effort for someone to trawl through months of security footage to find a specific person, or to search the internet on the off chance of finding you. But not for robots.
Long possible in Hollywood thrillers, the tools for identifying who someone is and what they’re doing across video and images are taking shape. Companies like Facebook and Baidu have been working on such artificial intelligence-powered technology for years. But the narrowing rate of error and widening availability of these systems foretell a near future when every video is analyzed to identify the people, objects, and actions inside.
Artificial intelligence researchers struggled for years to build algorithms that could look at an image and tell what it depicts. Images, each containing millions of pixels that form unique patterns, were just too complex for hand-coded algorithms to handle reliably.
Then in 2012, researchers demonstrated that a technique called deep learning, which takes the general idea of our brains’ interconnected neurons and translates it into mathematical functions, performed far better when given large numbers of images. If a deep neural network, as the system was called, was given enough examples, it could suss out patterns shared between images, like the shapes and textures common among cats.
Since then the systems have grown in complexity and scale: Researchers began making larger networks of “neurons,” while hardware manufacturers like Nvidia began building specialty processors to make the networks far faster. The result has been an explosion in what the systems can accomplish. Given a large dataset of images or video, these systems can be trained to learn what a person’s face looks like, and reliably identify it again and again.
The largest public example of this is MegaFace, a project out of the University of Washington. The dataset contains nearly 5 million images of 672,000 people, sourced from Creative Commons-licensed photos on Flickr. In July, the MegaFace team presented the latest scores for algorithms trained on the dataset. When tested on matching two images of the same person in a separate dataset of 1 million face images, top-ranking teams touched 75% accuracy when given one chance to guess, and more than 90% accuracy when allowed to give 10 options.
“We need to test facial recognition on a planetary scale to enable practical applications—testing on a larger scale lets you discover the flaws and successes of recognition algorithms,” Ira Kemelmacher-Shlizerman, a UW professor who oversees MegaFace, told the UW press shop.
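The gap between the one-guess and 10-guess scores comes down to how the results are counted. Here is a minimal sketch of that kind of scoring, assuming each algorithm returns a ranked list of candidate identities for every test photo. The function name `rank_k_accuracy` and the toy names are hypothetical illustrations, not MegaFace's actual evaluation code.

```python
def rank_k_accuracy(ranked_candidates, true_ids, k):
    """Fraction of test photos whose true identity appears in the top k guesses."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_candidates, true_ids)
        if truth in ranked[:k]
    )
    return hits / len(true_ids)


# Hypothetical toy data: one ranked guess list per test photo, best guess first.
ranked = [
    ["alice", "bob", "carol"],
    ["bob", "alice", "carol"],
    ["carol", "alice", "bob"],
    ["bob", "carol", "alice"],
]
truth = ["alice", "alice", "carol", "alice"]

top1 = rank_k_accuracy(ranked, truth, k=1)  # only the first guess counts
top2 = rank_k_accuracy(ranked, truth, k=2)  # a hit anywhere in the top two counts
top3 = rank_k_accuracy(ranked, truth, k=3)  # a hit anywhere in the top three counts
print(top1, top2, top3)
```

Allowing more guesses can only keep the score the same or raise it, which is why a 10-guess figure is always at least as high as a one-guess figure for the same algorithm.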
Analyzing video relies on techniques similar to those used for still images but requires more processing power, and it also allows AI to understand what’s happening over time. Baidu, the Chinese search giant, announced in late August 2017 that it had won the ActivityNet challenge, correctly labeling the actions of humans in 300,000 videos with 87.6% accuracy. These are actions like chopping wood, cleaning windows, and walking a dog.
Facebook has also demonstrated interest in this technology to understand who’s in livestreams on the site and what they’re doing. In an interview last year, director of applied machine learning Joaquin Quiñonero Candela said that, ideally, Facebook would understand what’s happening in every live video, in order to be able to curate a personalized video channel for users.
Facial recognition in still images and video is already seeping into the real world. Baidu is starting a program where facial recognition is used instead of tickets for events. The venue knows who you are, maybe from a picture you upload or your social media profile, sees your face when you show up and knows if you’re allowed in. Paris tested a similar feature at its Charles de Gaulle airport for a three-month stint this year, following Japan’s pilot program in 2016, though neither has released results of the programs.
Governments in the US are already beginning to use the technology in a limited capacity. Last week the New York Department of Motor Vehicles announced that it had made more than 4,000 arrests using facial recognition technology. Instead of scanning police footage, the software is used to compare new driver’s license application photos to images already in the database, making it tougher for fraudsters to steal someone’s identity. If state or federal governments expand into deploying facial recognition in public, they will already have a database of more than 50% of American adults from repositories like DMVs. And again, the bigger the dataset, the better the AI.
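To make the photo-comparison idea concrete, here is a minimal sketch of one common approach: each face photo is reduced to a list of numbers (an "embedding") by a neural network, and two photos are treated as a possible match when their embeddings point in nearly the same direction. Everything here is an assumption for illustration, including the function names, the record IDs, the three-number embeddings, and the 0.9 threshold; it is not a description of the software the New York DMV uses.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two embedding vectors: close to 1.0 means nearly identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_possible_duplicates(new_embedding, database, threshold=0.9):
    """Return (record_id, score) pairs at or above the threshold, best match first."""
    scored = [
        (record_id, cosine_similarity(new_embedding, embedding))
        for record_id, embedding in database.items()
    ]
    matches = [(record_id, score) for record_id, score in scored if score >= threshold]
    return sorted(matches, key=lambda match: match[1], reverse=True)


# Hypothetical embeddings; real systems use vectors with hundreds of numbers.
database = {
    "license-001": [0.9, 0.1, 0.3],
    "license-002": [0.1, 0.8, 0.5],
    "license-003": [0.2, 0.2, 0.9],
}
new_applicant = [0.88, 0.12, 0.31]

flagged = find_possible_duplicates(new_applicant, database)
print(flagged)
```

In a setup like this, a flagged record would only be a lead for a human to review, and the choice of threshold trades missed matches against false alarms.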
And that might not be far off. Axon, a company once known as Taser and the largest distributor of police body cameras in the US, has recently ramped up ambitions to infuse artificial intelligence into its products, acquiring two AI companies earlier this year. Axon CEO Rick Smith told Quartz previously that the ideal use case for AI would be the objective generation of incident reports, giving police more time out from behind desks. Facial recognition, he noted, isn’t active now but could be in the future. Motorola, another major bodycam supplier, pitches its software on its ability to quickly learn faces, highlighting a scenario where an officer is looking for a lost child.
Security cameras are also getting a boost of AI. Intel announced in April that it had built hardware for security cameras capable of “crowd density monitoring, stereoscopic vision, facial recognition, people counting,” and “behavior analysis.” Another camera, called the DNNCam, is a deep learning camera billed as waterproof, self-sufficient, and virtually indestructible, meaning it can be set to work in remote environments away from internet connections or behind a cash register for “regular customer recognition,” according to its website.
So what’s a privacy-minded, law-abiding citizen to do when surveillance becomes the norm? Not much. Early research has identified ways to trick facial recognition software, either with specially made glasses that fool the algorithms or with face paint that throws off the AI. But these often require knowledge of how the facial recognition algorithm works. This is just a heads up. Maybe wear a big hat?