I was super curious about the dump of those emails. We did the same thing in 2011 with Bin Laden emails in the wake of his killing. We posted 2,000,000 in English and 2,000,000 in Arabic in a downloadable format. The next day we heard from legal at Twitter. We have made every effort to be compliant since. This means if you want to study the Bin Laden corpus, I can share it with you on our platform, if you are a registered user, but I cannot create a duplicate copy of all the data and send you a link to download it. (Yes: let me know if you want access.)
This is a small but important part of a larger, meta discussion about whether we should preserve distributed innovation by having everyone create their own collections, play by their own rules, build their own tools, obey the ToS when they see the wisdom, and resist when the cause is just. Or should we start settling on a set of shared practices/systems, perhaps one stream for the open source DIY crowd and another stream for those willing to accept that R & Python cannot work for every student, teacher, or researcher, and that there are options (open source & commercial) rooted in NSF-funded academic research to standardize the way academic Twitter data is stored, searched, shared, displayed (non-trivial), redacted (when necessary) and reported on?
Compliance means we as a small business can continue to help academics work within the rules set by Twitter. As many of you know, the technical requirements of compliance are trickier than the good intention to be compliant. We struggle with a few of the technologically daunting aspects of storing and updating sets that comprise only about 1 billion curated Tweets.
In answer to your question, I felt excitement and dread when I saw the news. It immediately reminded me of the Bin Laden experience. I am one of those crazy political science PhDs open to the idea that American democracy may never be safe for voters after 2016. It was never perfect, but it moved toward a more perfect union. Now we are on a different path and I can see a public interest case for ignoring the ToS. There is that slippery slope issue... for now I worry less that we cannot do the research because we are constrained and more that the thing we are studying will never be the same.
Puzzling times and there is definitely no one right way through. Best to keep talking (and meeting?) about this.
Stu
On Thu, Aug 9, 2018 at 11:45 AM, Deen Freelon <dfreelon@gmail.com> wrote:
Jean,
I'm glad you brought this issue up. In a forthcoming journal article, book chapter, and a series of recent talks, I have articulated the argument that ToS are not, and should not be considered, ironclad rules binding the activities of academic researchers. Think about it: does it really make sense to have social media companies dictating what we can and can't study? Remember, ToS can include almost anything, so a platform could try to forbid all research even remotely involving content posted to it. I don't think researchers should reasonably be expected to adhere to such conditions, especially at a time when officially sanctioned options for collecting social media data are disappearing left and right.
Another related issue deals with the difference between human subjects protections and the purpose of ToS. Like human resources, ToS exist to protect the parent company, not users. Any user protections are completely optional and subject to change without notice. Thus, just as compliance with ToS does not guarantee user protection, ToS violations do not necessarily imply user harm. These issues are entirely distinct and should be handled accordingly by researchers and IRBs.
Finally, I have advocated for what I call a "public interest rationale" in violating ToS for certain research purposes. For example, consider the recent dump by FiveThirtyEight of nearly 3M tweets posted by the Internet Research Agency: https://fivethirtyeight.com/features/why-were-sharing-3-million-russian-troll-tweets/ This dataset violates Twitter's terms of service in two ways: first, by posting full Twitter metadata rather than only Twitter IDs; and second, by encouraging the study of deleted content, which third parties are supposed to dispose of. But given the importance of understanding foreign attempts to undermine democracy--acknowledged at the highest levels of the US government--I believe these infractions are more than justified. As the 538 authors write, "Reassembling this corpus of tweets is an exercise in a certain kind of national security." The public deserves to know when and how their democratic process is being messed with, and I don't think it's a good idea to let corporate red tape prevent them from acquiring that knowledge.
Curious to know what others think about this issue.
Best,
/DEEN
On 8/8/2018 7:27 PM, Jean Burgess wrote:
And of course, the subject line should not be APIs, but Terms of Service/Use!
On 9/8/18, 9:24 am, "Air-L on behalf of Jean Burgess" < air-l-bounces@listserv.aoir.org on behalf of je.burgess@qut.edu.au> wrote:
Dear colleagues,
I’m keen to hear of your experiences with your own research ethics boards/committees, especially in the PCA (Post-Cambridge Analytica) era:
1. Have any of you noticed a recent increase in IRBs/ethics committees requiring proof of compliance with social media platforms or apps’ Terms of Service/Terms of Use as part of ethical clearance requirements? Please note I’m not only interested in so-called data-driven methods or API access here, but all kinds of ethnographic, qualitative social research, and critical-interpretative approaches as well.
2. If so, are questions of ToS compliance restricted to reasonable questions of harm to participants (for example, your methods may inadvertently induce a participant to share the content of another user, hence violating their own contract with the platform provider)?
3. Where questions of ToS appear to exceed the bounds of human research ethics considerations (perhaps appearing more concerned with institutional risk avoidance), how have you responded?
Not actually asking for a friend
Jean
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
-- Deen Freelon, Ph.D. Associate Professor School of Media and Journalism, UNC-Chapel Hill http://dfreelon.org | @dfreelon <https://twitter.com/dfreelon> | https://github.com/dfreelon
-- Dr. Stuart W. Shulman Founder and CEO, Texifter Cell: 413-992-8513 LinkedIn: http://www.linkedin.com/in/stuartwshulman