REUTERS/Yuriko Nakao
When it comes to anonymous data, putting the pieces together is easier than you think.

How easy is it to find one patient among 1.3 million anonymized medical records?

Member exclusive by Youyou Zhou & David Yanofsky for The data boom

If your colleague was in the hospital but didn’t want to tell you why, could you still figure it out? Maybe.

Publicly available data often contains enough personal information to allow casual acquaintances to locate specific people in medical records, even though the data is considered to be “de-identified.” Patient-level information, including hospital name, patient age, race, ethnicity, length of stay, and detailed diagnoses, can all be used to glean information most people think is private.

Let’s say you’re trying to find “Jordan.” You know a few basic things about Jordan. You know Jordan’s gender, race, ethnicity, and, roughly, Jordan’s age. You also know which New York county Jordan lives in.
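To see how a search like this can narrow things down, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the records are invented, and the field names (`county`, `gender`, `race`, `ethnicity`, `age_group`, `length_of_stay`, `diagnosis`) and the helper `find_candidates` are hypothetical, not the schema or method of any real dataset. The idea is simply that each fact you know filters out more records.

```python
# Entirely synthetic records. Field names are hypothetical and do not
# reflect the schema of any real hospital discharge dataset.
RECORDS = [
    {"county": "Albany", "gender": "F", "race": "White", "ethnicity": "Not Hispanic",
     "age_group": "30 to 49", "length_of_stay": 3, "diagnosis": "Appendicitis"},
    {"county": "Albany", "gender": "M", "race": "White", "ethnicity": "Not Hispanic",
     "age_group": "30 to 49", "length_of_stay": 1, "diagnosis": "Fracture"},
    {"county": "Albany", "gender": "F", "race": "Black", "ethnicity": "Not Hispanic",
     "age_group": "50 to 69", "length_of_stay": 5, "diagnosis": "Pneumonia"},
    {"county": "Kings", "gender": "F", "race": "White", "ethnicity": "Not Hispanic",
     "age_group": "30 to 49", "length_of_stay": 2, "diagnosis": "Asthma"},
]


def find_candidates(records, **known):
    """Return the records that match every known attribute exactly."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in known.items())
    ]


# Start with one known fact, then add more and watch the pool shrink.
by_county = find_candidates(RECORDS, county="Albany")
by_county_and_gender = find_candidates(RECORDS, county="Albany", gender="F")
by_all = find_candidates(
    RECORDS,
    county="Albany",
    gender="F",
    race="White",
    ethnicity="Not Hispanic",
    age_group="30 to 49",
)

print(len(by_county), len(by_county_and_gender), len(by_all))
```

In this toy data, the county alone leaves three candidates, adding gender leaves two, and the full set of known facts leaves a single record, along with its diagnosis and length of stay.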
