Researchers use machine learning to re-identify ‘anonymous data’

pharmafile | July 24, 2019 | News story | Research and Development | Big Data, anonymous data, data, patient data, privacy, research

Anonymised data can be traced back to individuals, according to research published in the journal Nature Communications.

Using machine learning, anonymised data can be reverse engineered to re-identify individuals, even when anonymisation techniques have been applied.

Researchers from UCLouvain estimated that 99.98% of Americans could be correctly re-identified in any available anonymised dataset using just 15 characteristics, including age, gender and marital status.

Dr Luc Rocher, who authored the paper, explained: “While there might be a lot of people who are in their thirties, male, and living in New York City, far fewer of them were also born on 5 January, are driving a red sports car, and live with two kids (both girls) and one dog.”

Senior author Dr Yves-Alexandre de Montjoye added: “This is pretty standard information for companies to ask for. Although they are bound by GDPR guidelines, they’re free to sell the data to anyone once it’s anonymised. Our research shows just how easily – and how accurately – individuals can be traced once this happens.

“Companies and governments have downplayed the risk of re-identification by arguing that the datasets they sell are always incomplete. Our findings contradict this and demonstrate that an attacker could easily and accurately estimate the likelihood that the record they found belongs to the person they are looking for.”
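The narrowing effect Dr Rocher describes can be illustrated with a simple count. The sketch below is not the researchers' model: the records, the attribute names and the `unique_fraction` helper are all invented for illustration. It only shows how the share of records that match nobody else grows as more attributes are combined.

```python
from collections import Counter

# Hypothetical toy records; every value here is invented for illustration.
RECORDS = [
    {"age_band": "30s", "gender": "M", "city": "New York", "birthday": "01-05", "pets": 1},
    {"age_band": "30s", "gender": "M", "city": "New York", "birthday": "03-12", "pets": 0},
    {"age_band": "30s", "gender": "M", "city": "New York", "birthday": "01-05", "pets": 0},
    {"age_band": "30s", "gender": "F", "city": "New York", "birthday": "07-22", "pets": 1},
    {"age_band": "40s", "gender": "M", "city": "Boston", "birthday": "01-05", "pets": 1},
    {"age_band": "30s", "gender": "M", "city": "Boston", "birthday": "11-30", "pets": 2},
]


def unique_fraction(records, attributes):
    """Return the share of records whose values on `attributes` match no other record."""
    keys = [tuple(record[name] for name in attributes) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(records)


if __name__ == "__main__":
    # Add one attribute at a time and watch the unique share rise.
    chosen = []
    for name in ["age_band", "gender", "city", "birthday", "pets"]:
        chosen.append(name)
        print(len(chosen), "attributes:", round(unique_fraction(RECORDS, chosen), 2))
```

In this toy set, age band, gender and city together single out only half of the records, while all five attributes single out every one of them. Real datasets are far larger, but the same logic applies: each extra characteristic shrinks the group of people a record could belong to.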

The researchers say the findings should act as a wake-up call to politicians and lawmakers on the need to make data protection laws watertight.

Dr Julien Hendrickx said: “We’re often assured that anonymisation will keep our personal information safe. Our paper shows that de-identification is nowhere near enough to protect the privacy of people’s data. It is essential for anonymisation standards to be robust and account for new threats like the one demonstrated in this paper.”

The paper raises the issue of how patient data is handled and used. As Dr de Montjoye put it: “The goal of anonymisation is so we can use data to benefit society. This is extremely important but should not and does not have to happen at the expense of people’s privacy.”

Louis Goss
