The US government research unit serving intelligence agencies is looking to expand its ability to monitor thousands of people at once, with a new request for companies, cities, and the academic sector to help compile a massive video dataset.
The Intelligence Advanced Research Projects Activity (IARPA) team, which falls under the purview of the Office of the Director of National Intelligence (ODNI), posted the solicitation to a federal contracting database.
Artificial-intelligence algorithms like the ones the government wants to train require large amounts of data to be accurate.
“Further research in the area of computer vision within multi-camera networks may support post-event crime scene reconstruction, protection of critical infrastructure and transportation facilities, military force protection, and in the operations of National Special Security Events,” the IARPA posting explains.
IARPA seeks partners who can provide at least 960 hours of video footage collected over the course of four days. Camera networks should cover an area of at least 10,000 square meters—about the size of a city block—use a minimum of 20 cameras, and capture at least 5,000 pedestrians who are unaware they are being filmed. IARPA is also asking for at least 200 volunteers to mingle with pedestrians and perform specific movements or tasks in the coverage area. This footage would be labeled in specific ways to make it easy for AI algorithms to analyze.
“All of these IARPA projects add up to a dystopian society that I worry quite a bit about,” Dave Maass, senior investigative researcher at the Electronic Frontier Foundation, tells Quartz. “The government is involved in collecting images and using them to train systems that will be used to surveil us.”
IARPA did not immediately respond to a request for comment.
A facial-recognition dataset needs faces to work. Amassing enough pictures of those faces, however, can be a logistical and ethical minefield. “This is the dirty little secret of AI training sets,” law professor Jason Schultz told NBC News after it was revealed that IBM had scraped nearly a million pictures posted by users on Flickr. “Researchers often just grab whatever images are available in the wild.”
Maass says the FBI has long populated its facial-recognition databank with booking photos of dead people, who no longer have any legal expectation of privacy. Still, about 80% of the FBI’s photos come from driver’s licenses and passports. (The system is reportedly inaccurate in roughly 15% of cases and more often misidentifies black people than it does whites.) The National Institute of Standards and Technology, part of the US Department of Commerce, tests algorithms using images of exploited children (with the stated purpose of assisting in investigations), US visa applicants, and deceased arrestees, according to a recent Slate investigation. Additional images are generated by Department of Homeland Security staff posing as civilian travelers, similar to what IARPA is proposing for its street project.
IARPA is asking potential partners to address the use of “non-interacted pedestrians,” people on sidewalks who will have no idea they are participating in the project: specifically, how to “plan and prepare for human subject research collection” and its necessary protocols.
“Cities are putting up these camera networks all over the place and they tell the public that it’s being used for one thing—crime control, traffic congestion—but they don’t tell them that the data is going to be used to train surveillance systems,” says Maass, adding that IARPA is “asking the question, ‘Can this be done,’ rather than asking, ‘Should this be done?’”
Correction: This story has been updated to correct that NIST does not train algorithms, but helps test them.