Hi Stu,

Deleted data and research data retention are among the most nuanced issues out there, in part because Member State laws take different positions on the issue of data retention. That’s one of the main reasons the Code provides a compendium of select Member State laws. That said, in general, there are a few things to keep in mind:

First, while GDPR embraces the right to erasure (the “right to be forgotten” is outdated terminology), it is applicable when a data subject affirmatively invokes the right. Deleting one’s data from social media does not mean researchers are automatically required to delete said data from their own datasets.

The draft Code addresses the right to erasure in two ways.

1) It discusses the issue of transparency and data subject transparency notifications. (See Part I.) Researchers (through their institutions) must provide a mechanism by which European data subjects can assess whether their data is likely to be part of a research project and potentially request removal. But data subjects must have appropriate grounds for this request, and those grounds are limited. (See Articles 17 & 19 of the GDPR.) Institutional DPOs should lead the way in determining what is a valid request.

2) Data retention/destruction plans are very much related to this question. As part of the Data Needs and Management Plan described in Part II of the Code, researchers will need to provide solid justification for choosing how long they will retain data and the steps by which they will destroy it. That justification should be tied to considerations of users’ rights, as well as considerations balancing scientific research needs (e.g., study replication). GDPR recognizes both.

Second, in requiring API users to delete suspended and deleted data, Twitter’s TOS go well beyond GDPR requirements. The Code cannot speak to Twitter’s TOS, only to the law. Twitter is free to impose additional restrictions on researchers.
Third and finally, all of the platforms are skittish about their own long-term data retention policies. They generally delete data from their core pipelines after 90 days. We hope that a delegated act under the DSA (Article 31 of which will compel data access for researchers in certain circumstances) will help us tackle this. The Code itself hints at, but doesn’t yet go all in on, the need for platforms to retain data longer than this 90-day period when it is needed for independent research. But how to identify such data prospectively is tricky. We’ll need the DSA’s help (and, honestly, maybe some case law) to spell that out more.

Hope this helps clarify, and offers greater insight into, an issue that we researchers typically understand quite poorly. Chairing this Working Group was revelatory for me in so many ways.

Rebekah

On Tue, May 31, 2022 at 19:41 Shulman, Stu <stu@texifter.com> wrote:
Fascinating report. The detail is remarkable, insightful and helpful. Thank you for sharing and for all the work. These reports take incredible patience. It is also well written.
I notice deletions are mentioned in the context of holding data but not (unless I missed it) in the context of user-generated deletions. When a user generates a social data action (Tweet, RT, reply, etc.) then later deletes it, any researcher who may be holding the datapoint must also delete it or render it inaccessible. Very few do, in my experience. The "right to be forgotten" is still operational under GDPR, in my understanding of it, though I'd be happy to be updated. However, if some of you are holding my deleted Tweets in Europe in a spreadsheet, are you compliant? The term forgotten is not in the report. The term suspended does not appear in the report either but is fundamental to compliance with Twitter research terms. You cannot look at data from suspended accounts. Many academics are doing Twitter or Reddit data work because of generous data access options. I would say account suspensions and user deletions are fairly significant issues that should be kept in focus as systemic and ethically problematic failures in the current spreadsheet-centric paradigm for examining social data artifacts.
On Tue, May 31, 2022 at 1:14 PM Charles M. Ess via Air-L <air-l@listserv.aoir.org> wrote:
Dear colleagues,
as a quick follow up - first of all, a tremendous shout out to Rebekah for her work as chair of this project. Bringing together representatives from major platforms, experts in GDPR and related law, NGOs, practicing researchers (and even an ethicist) into sharp and focused dialogue over the year+ leading up to this publication, coupled with the agreements over various aspects and elements of the Code of Conduct, was an all but superhuman task. As someone privileged to participate under Chatham House Rules, I am allowed to say that there was universal and enthusiastic consensus affirming Rebekah's extraordinary work in getting us to this place - a place that one at the outset could reasonably doubt we would ever see.
Secondly: the draft Code endorses the AoIR ethics guidelines 3.0 as follows:
The research should follow the Ethical Guidelines for Internet Research of the Association of Internet Researchers (as well as any other specialized or sector-based guidelines relevant to the research) and be reviewed and approved before data is requested from a DSO by an institutional, or appropriate third-party, ethical review board, as described in Part II of the Code. (p. 27).

In addition, there will be reference to an affiliated document titled "Best practices and reflection questions for the Code of Conduct." The document cross-references the 3.0 guidelines with several key issues raised in the draft Code, and is designed to serve as a springboard for further ethical reflection on the part of those developing the sorts of research envisioned and circumscribed therein. This latter document will soon appear on the EDMO website as well, along with other documents affiliated with the draft Code.

Here I would like to thank especially: aline shakti franzke (University of Duisburg-Essen); Stine Lomborg (Copenhagen University); Elizabeth Buchanan (Marshfield Clinic Research Institute); Rich Ling (Norwegian Academy of Science and Letters); and Michael Zimmer (Marquette University) for their very great help in putting this document together.
A thousand thanks to Rebekah and the Working Group, and I very much look forward to seeing how all of this unfolds.
All best, - charles
On 31/05/2022 16:30, Tromble, Rebekah via Air-L wrote:
Dear colleagues,
Earlier today the European Digital Media Observatory's Working Group on Platform-to-Researcher Data Access published its official report < https://edmo.eu/wp-content/uploads/2022/02/Report-of-the-European-Digital-Me...
.
The multi-stakeholder group has been hard at work for the last year. Our main charge was to draft a Code of Conduct under Article 40 of the GDPR that would facilitate better access to data for independent researchers. This report contains that draft Code.
Among other things, the draft Code lays out a framework for assessing the level of risk involved in accessing and conducting research with different types of platform data. It then lays out a number of safeguards that can be put in place to mitigate different levels of risks--helping to promote research that is ethical and responsible. (I tweeted more about it here <https://twitter.com/RebekahKTromble/status/1531611984944316419>.)
Getting to this point has entailed tremendously hard work by everyone involved, and, as the report itself notes, the work is far from over. But publishing the report and draft Code represents a major step forward. Though certain requirements are necessarily tied to specifications under the GDPR, the general principles and proposed solutions the report offers are instructive well beyond the European context.
Please feel free to circulate widely. And let me know if you have any questions, thoughts, etc.
Rebekah

Dr. Rebekah Tromble
Director, Institute for Data, Democracy & Politics, George Washington University | Associate Professor, School of Media & Public Affairs, George Washington University | Visiting Researcher, The Alan Turing Institute (London) | www.rebekahtromble.net | iddp.gwu.edu
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
-- Professor Emeritus University of Oslo <http://www.hf.uio.no/imk/english/people/aca/charlees/index.html>
3rd edition of Digital Media Ethics now available: <http://politybooks.com/bookdetail/?isbn=9781509533428>
-- Dr. Stuart W. Shulman Founder and CEO, Texifter Editor Emeritus, *Journal of Information Technology & Politics*