Fwd: Facebook data destruction
bertil wrote:
> Couldn't a for-statistical-purpose-only access have been a possible option?

this is a complicated and interesting question. technically, there are two approaches to publishing or accessing large data sets for statistical analysis while avoiding re-identification of individuals: the first is called privacy preserving data publishing (ppdp), the second differential privacy (the new kid on the block).

research on ppdp has shown that it is not possible to publish anonymized datasets and still provide sufficient guarantees with respect to re-identification. differential privacy, on the other hand, is query based: no dataset is published. instead, researchers/analysts can pose queries up to a certain point, and the system guarantees that, given those queries, no analysis can lead to re-identification of individuals. microsoft has hired most of the prominent researchers working on differential privacy, while facebook, i think, won over lars backstrom, who is also from the differential privacy gang (gang membership being based on co-authorship).

now, as is typically the case in most privacy research, the concern is with personal re-identification. categorization and potential social sorting are not the concern of these algorithmic approaches. the assumption is that profiling is ok as long as the analyst cannot identify individuals. that is, as many surveillance studies authors discuss, a myth, at least in cases where social sorting is not based on individual identification but on matching behavior or attributes, e.g., sorting customers based on knowledge of past shopping behavior, or based on their connectedness in a social network.

one of the ways of dealing with categorization and social sorting is through transparency and engagement. but that brings up a lot of accountability issues for which we do not have models of practice.
but i think the discussions on this list are fruitful for thinking about the problem, conceiving new practices, and learning from existing ones, e.g., the aol query dataset and the report on ethnicity in facebook.

cheers,
s.

Message: 2
Date: Thu, 25 Mar 2010 21:34:57 +0100
From: Bertil Hatt <bertil.hatt@ensae.org>
To: jkd <jkd@email.unc.edu>, Christophe Prieur <christophe.prieur@liafa.jussieu.fr>
Cc: AoIR-L <air-l@listserv.aoir.org>
Subject: Re: [Air-L] Fwd: Facebook data destruction
Message-ID: <111a48f01003251334h4ad6d42fsb27b385ea26f4f89@mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Couldn't a for-statistical-purpose-only access have been a possible option?

I'm not familiar with handling such massive data, and I assume that could only have made sense through a paying service, but allowing scripts to pull some aggregated data (say, preventing any results that didn't involve 10,000+ accounts) would have respected most privacy concerns, no?

Having those data anywhere, hackable, is a legal risk, and that is enough to justify Facebook's threats, but I'm still hoping for academic access to this amazing database.

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
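To make the two ideas in this thread a little more concrete, here is a minimal Python sketch of an aggregate-only query interface. It is an illustration under stated assumptions, not a description of any system Facebook, Microsoft, or the researchers mentioned actually run. The class name `AggregateQueryServer`, its methods, and the record format are all hypothetical. The `thresholded_count` method follows the suggestion of refusing any result that involves fewer than 10,000 accounts; `noisy_count` is a sketch of the query-based approach, using the Laplace mechanism for counting queries with a simple privacy budget that limits how many queries can be asked.

```python
import random
from typing import Callable, Iterable, Optional

# Threshold taken from the quoted message; everything else here is hypothetical.
MIN_GROUP_SIZE = 10_000


class BudgetExhausted(Exception):
    """Raised when a query would exceed the remaining privacy budget."""


class AggregateQueryServer:
    """Hypothetical server that answers counting queries only, never raw rows."""

    def __init__(self, records: Iterable[dict], total_epsilon: float,
                 rng: Optional[random.Random] = None) -> None:
        self._records = list(records)
        self._remaining = total_epsilon      # privacy budget left to spend
        self._rng = rng or random.Random()

    def _true_count(self, predicate: Callable[[dict], bool]) -> int:
        return sum(1 for record in self._records if predicate(record))

    def thresholded_count(self, predicate: Callable[[dict], bool]) -> Optional[int]:
        """Return the exact count, or None if fewer than MIN_GROUP_SIZE records match."""
        count = self._true_count(predicate)
        return count if count >= MIN_GROUP_SIZE else None

    def noisy_count(self, predicate: Callable[[dict], bool], epsilon: float) -> float:
        """Return the count plus Laplace noise of scale 1/epsilon, spending epsilon."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self._remaining:
            raise BudgetExhausted("privacy budget exhausted")
        self._remaining -= epsilon
        # A counting query changes by at most 1 when one record is added or
        # removed, so the noise scale is 1/epsilon. The difference of two
        # independent exponentials with rate epsilon is Laplace(0, 1/epsilon).
        noise = self._rng.expovariate(epsilon) - self._rng.expovariate(epsilon)
        return self._true_count(predicate) + noise
```

An analyst would call, for example, `server.noisy_count(lambda r: r["age"] < 25, epsilon=0.5)` and receive an approximate count; once the total budget is spent, further queries are refused. Note that a size threshold on its own does not rule out differencing attacks (subtracting two large, overlapping aggregates to isolate a small group), which is one reason the query-based approach adds noise and caps the number of queries.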
participants (1): Seda Guerses