AI, like all software, isn't immune to hacking. In recent months, security researchers have shown that machine learning models can be reverse-engineered to expose the data they were trained on, such as personal photos or health records.
So how can we protect that information? New research from OpenAI and Google shows a way to build AI that never sees personal data directly, but can function as if it had.
Ian Goodfellow, a researcher at OpenAI, compares the system to medical school.
“The doctors who teach in medical school have learned everything they know from decades of experience working with specific individual people, and as a side effect they know a lot of private medical histories,” Goodfellow says. “But they can show the student how to practice medicine without divulging all of that history.”
The “student” algorithm, which will be exposed to the world, learns to mimic the decisions of its teachers through millions of simulated decisions, but never holds any of the underlying information that trained the teacher algorithms. The teachers exist only to teach the student—they never see the light of day. The student algorithm also learns from nonsensitive, public data to fine-tune its results and further obscure the teachers’ data points.

OpenAI and Google’s work builds on previous research that taught a student algorithm using multiple teacher algorithms, but makes the process faster and less dependent on the teachers, which makes it more secure. The researchers have also released their code for others to adapt.
What makes reverse-engineering the algorithm even harder is that there isn’t just one teacher. In tests, the researchers trained up to 250 teachers for a single student algorithm, so the student doesn’t rely on any one sensitive data point but on an aggregate of information. Even if the student AI were reverse-engineered, the researchers claim, an attacker couldn’t recover any individual’s information from it.
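The aggregation step can be sketched in a few lines. This is a simplified illustration, not the researchers’ actual implementation: each teacher votes on a label for an unlabeled example, the votes are tallied, random noise is added to the tally, and the student trains on the noisy winner. The function name and noise scale here are assumptions for the sake of the example.

```python
import numpy as np

def noisy_aggregate(teacher_votes, num_classes, noise_scale=1.0, rng=None):
    """Return the noisy plurality label from an ensemble of teachers.

    teacher_votes: one predicted class label per teacher.
    Laplace noise on the vote counts hides any single teacher's
    (and thus any single data point's) influence on the result.
    """
    rng = rng or np.random.default_rng(0)
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += rng.laplace(scale=noise_scale, size=num_classes)  # privacy noise
    return int(np.argmax(counts))

# 250 teachers vote on one example; a large majority agrees on class 3,
# so the noise cannot flip the outcome.
votes = [3] * 200 + [1] * 30 + [0] * 20
label = noisy_aggregate(votes, num_classes=10)
```

Because the student only ever sees these noisy consensus labels, no single teacher’s training data can be pinned down from its behavior.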
Goodfellow’s medical school analogy works well because this system could be especially beneficial in developing medical AI by crowdsourcing information across hospitals, while keeping patient records private. An algorithm could learn from each individual hospital’s data, like radiology scans or patient data, and a student AI could automatically learn from the network of hospitals.
Companies like Google could also use this technology to learn from their users’ photos without ever having to see them. Today, an AI model is trained on each user’s photos, but that model is confined to the user’s account. A technique like the one in this research could make each user’s AI a teacher, training a much more accurate student for Google’s Photos app, which automatically recognizes faces and objects.
This approach is a form of differential privacy, which seeks to keep individual users’ information safe inside a large database. Anonymizing patient data in hospitals, by contrast, is a much weaker privacy protection.
“Differential privacy addresses the paradox of learning nothing about an individual while learning useful information about a population,” write Cynthia Dwork of Microsoft Research and Aaron Roth of the University of Pennsylvania in their book on the subject.
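The classic example of that paradox is a noisy count: you can report roughly how many people in a database satisfy some condition while hiding whether any particular person is in it. The sketch below uses the standard Laplace mechanism; the function name and the choice of epsilon are assumptions for illustration, not from the research described here.

```python
import numpy as np

def private_count(records, predicate, epsilon=0.5, rng=None):
    """Release a count under the Laplace mechanism.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so adding Laplace(1/epsilon) noise
    makes the released value epsilon-differentially private.
    """
    rng = rng or np.random.default_rng(42)
    true_count = sum(1 for r in records if predicate(r))
    return true_count + rng.laplace(scale=1.0 / epsilon)

ages = [34, 29, 61, 45, 52, 38, 70, 27]
noisy = private_count(ages, lambda a: a >= 50, epsilon=0.5)
# The population-level answer (here, 3 people aged 50+) comes through
# approximately, but no single person's presence can be confidently
# inferred from the output.
```

Smaller values of epsilon mean more noise and stronger privacy, which is exactly the tradeoff the researchers describe.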
This is the tightrope that differential privacy walks: accuracy against privacy. The better an algorithm mimics its teachers, the more likely it is to betray their data.
“All the research in this space explores a tension between privacy and utility, as more privacy means utility goes down,” says Thomas Ristenpart, a machine learning security researcher, in an email to Quartz.
Despite this tradeoff, Goodfellow says the student AI now performs within 2% of the teachers’ accuracy, an improvement on the 5% gap of the previous state of the art.