Hey Amanda,
thanks a lot for putting this list together. I missed my chance to contribute, but my needs are well covered by what other colleagues have added to the list (esp. Jean's comments).
My point of view is that the best way to facilitate research is for Twitter to give us as much direct access to the data as possible. Providing visualizations or specific stats (trending topics etc) alone is not sufficient, because our questions are too varied (which isn't to say that providing these things isn't great).
The discussion reminds me a bit of the debate around virtual research environments in the Humanities and Soc. Sciences. Often builders of e-infrastructure assume that research processes are standardized and that accordingly what we need are tools that hide the complexity of the data from us and simplify interaction with the data. The API approach is superior to that, even if Twitter didn't intend for us to use the API for gathering data.
It would be great if Twitter engaged in a continued dialogue with the research community about what we need (and, in return, what we can do with their data that might be interesting to them).
- Cornelius
On Fri, Mar 11, 2011 at 11:44 PM, Amanda Lenhart <alenhart@pewinternet.org> wrote:
Hi Andrea & AoIR List
Sure, in the name of transparency, here's the text of the email I sent to the person I know at Twitter, with all the personal pleasantries edited out. I also included an attachment with more detailed needs from the people who emailed me, but as those were off-list messages, I don't feel like I have permission to repost them here. However, if anyone who emailed me wants to repost their message to the list to get a broader discussion going, that's fine with me.
And of course, let me know if I missed anything or misrepresented anyone's needs. * * *
"I'm going to try to summarize the main feedback. Basically, the type and breadth of research even among the 12 or so projects that I heard about from my outreach are so varied that at least one researcher out there is using every element of a tweet, profiles, and the interface to measure or look at something. For example there's a guy looking at regional variations in linguistic forms on twitter - so he's using geolocation, tagging and the words in individual tweets for analysis. There is a big project looking at crisis response by looking at twitter hashtags, and other projects where people are using twitter data to do network mapping - so they need @reply data, and individual user ids -- names or something else, at a minimum, that give them access to the body of tweets from individuals. One Australian researcher, Jean Burgess, put it well: "There are two main research needs: one is the need for public, (semi)open and reusable archives of tweets, particularly for keywords and hashtags, [but also location data (Amanda add)]; and the other is the more specific needs of particular research groups to track *users* and their social networks, for whatever reason."
For academic researchers and serious scholarly work, researchers can't just use a cool website that pulls up pie charts of stats about users - they need to be able to access the data as directly as possible.
Here's a list of the types of data people are using for their research work:
-geotagging
-location
-institutional tweets
-@replies
-hashtags and post keywords
-collocating a single user's tweets
-RT counts and tracking
-words within tweets
-follower/following data
-number of tweets
-links
-images/avatars
-lists
Sometimes researchers need subsamples of the twitter stream based around:
1. A date and time (e.g. all tweets two days before, during and two days after the Superbowl)
2. Hashtags or keywords
3. Individuals (e.g. top ten most active people in a particular community of practice)
4. Random subsample of the public twitter stream
I'm sure I've missed some items that creative researchers will want to use. I think the big takeaway is that researchers are doing a ton of creative things with Twitter data and would be hugely grateful for the most robust access you can offer them. Also, that as you continue to update Twitter's capabilities, researchers will want to take advantage of the new functionality.
Scholars (myself included) find Twitter fascinating and truly want to do research that examines the ways in which Twitter is being used, which I think helps scholars, helps Twitter, and helps your users.
I hope this is useful. Let me know if there's anything else I can do that would help make the case that Twitter can and should release its data to academic and non-profit researchers."
-Amanda
Amanda Lenhart Pew Research Center alenhart@pewinternet.org
-----Original Message-----
From: Andrea Kavanaugh [mailto:kavan@cs.vt.edu]
Sent: Friday, March 11, 2011 5:09 PM
To: Amanda Lenhart
Subject: Re: [Air-L] Twitter no longer allowing use for scholarship - Update
hi Amanda, can you send us what you sent to Twitter as our collective research needs? Andrea
On Mar 11, 2011, at 12:26 PM, Amanda Lenhart wrote:
Thanks to everyone who wrote with feedback and details about their twitter-oriented research needs. I've pulled together everyone's requests and I've forwarded them to my friend at Twitter. I'll update the list when I hear anything.
Thanks,
Amanda
Amanda Lenhart Pew Research Center alenhart@pewinternet.org
_ _ _ _ _ _ _
On Friday, I had a conversation at a conference with someone I know who works at Twitter. We talked about this exact issue. And while Twitter can't change back the API because of other problems the change was fixing, she would very much like to give academics and non-profit researchers access to Twitter data. However, she has to push through a proposal internally to make this happen. She said it would help her make the case if I could tell her what parts of the data set researchers wanted to access.
I offered to ping the AoIR list to get a sense of what people want and need from Twitter to be able to do/continue their research.
Also, one thing my friend did mention -- because Twitter data can never be fully anonymized, there might be some limitations on what kind of analysis you could do - mostly along the lines of limits on analysis that would reveal information about individuals that they had not made explicit and which might be harmful (e.g. using network analysis to speculate on users' sexual orientation).
So, please email me off-list and I'll compile the types of data requests and send them along to my Twitter friend.
Thanks,
Amanda
Amanda Lenhart Pew Research Center alenhart@pewinternet.org
_______________________________________________ The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
--
Dr. Cornelius Puschmann, M.A.
Department for English Language and Linguistics
Heinrich-Heine-Universität Düsseldorf
Building 23.11, Level 1, Room 21
Universitätsstrasse 1
40225 Düsseldorf
Germany
+49 211 81 15927 (office)
Nachwuchsforschergruppe "Wissenschaft und Internet" / Junior Researchers Group "Science and the Internet"
http://nfgwin.uni-duesseldorf.de