2014-09-16

New article in ACM Queue discusses data privacy concerns and the release of high-quality open data

A new article in ACM (Association for Computing Machinery) Queue, "Privacy, Anonymity, and Big Data in the Social Sciences," by MITx and HarvardX MOOC (massive open online course) scholars Jon Daries, Justin Reich, and six of their colleagues, clearly illustrates the considerable tension that can exist between data privacy concerns and the release of high-quality open data.

As Justin Reich, one of the authors of the article, summarizes in a blog post:

"Many people have called for making science more open and transparent by sharing data and posting data openly. This allows researchers to check each other's work and to aggregate smaller datasets into larger ones. One saying that I'm fond of is: 'the best use of your dataset is something that someone else will come up with.' The problem is that increasingly, all of this data is about us. In education, it's about our demographics, our learning behavior, and our performance. Across the social sciences, it's about our health, our beliefs, and our social connections. Sharing and merging data adds to the risk of disclosing those data."

He and his co-authors conclude that "you can have anonymous data or you can have open science, but you can't have both."

The article, "Privacy, Anonymity, and Big Data in the Social Sciences," is published in ACM Queue.