Subject: Re: [Air-l] AOL and research ethics
I'm wondering what kind of sanitization method(s) would have worked in this case?
Data sanitization is very difficult to do, and just about equally difficult to audit or evaluate. The problem can go as far as a state in which even *single inferences* drawn from the data are damaging - for example, a user who searches for several uncommon proper names *plus* a string of terms regarding some pretty unsavory pornography. How hard would it be, in this case, to trace out their social network and figure out who the person really is?

I would guess that you might be able to sanitize this sort of data by removing all of the nouns from it. *smirk*

--elijah
On 8/29/06, Barry Wellman <wellman@chass.utoronto.ca> wrote:
Besides the legal stuff, it's clear that AOL didn't follow what are standard operating procedures for preserving respondent privacy in the social sciences. They thought they were, by not directly releasing users' names and accounts, but they were so eager to be helpful that they didn't sanitize the data well. Always a concern, but one routinely dealt with. OK, I gotta stop myself or else I'll go on another rant about computer scientists not knowing any social science -- in this case, methods.