Re: [Air-L] Academic replacements for TwapperKeeper.com?

23 Feb 2011

Hey All,

I run 140kit.com, and I can tell you right now that the Twitter TOS is somewhat ambiguous when it comes to handing data out like this when it's done in a research setting. From my own (optimistic) reading, I think there is still space for a purely research-based environment. I think they want to avoid having their data analysis business usurped by other marketers, so we don't really need to worry too much. I pinged Twitter the other day about Twapper Keeper folding, and am waiting for a reply. At any rate, I'm certainly willing to go to the mat to keep some form of data collection around for researchers.

We are a much smaller operation than Twapper Keeper. We have only cataloged about 100 million tweets over about 70 million users, and have not yet put any money into upgrading our tiny server set. The seed money was put in by my college last year, and that dries up in May, at which point I will pay out of pocket until a better solution presents itself.

So, to answer Cornelius' points:

- We do hashtag/search processes
- We export data in CSV and SQL, as in my own experience SQL is much easier to deal with when repurposing data like this
- We have a whole range of analytical processes built on top of the system as well, with a way for other programmers to plug in new analytics, which is great
- We are certainly not stable financially or uptime-wise. We are reliable once a term is in our system - the collection servers are run at Berkman and rarely falter.
- Long-term, I would like to work on this more, but we need to figure out how to build an environment where this is acceptable conduct for Twitter and someone can afford to work on it full-time. I would love to do that, but I don't know how, short of getting back into an academic setting and getting free rein, which just isn't hugely likely.

Feel free to ask me any questions about this service or about the current situation - I'm hitting up Twitter folk all day to get answers and advice.
Devin

On Feb 23, 2011, at 11:18 AM, Matt Munley wrote:
...
Cornelius,
How well would something like 140kit (http://140kit.com/) meet your needs?
Here's their description from their site:
"We use our cluster of machines to collect your data using our access to the
Twitter API. If you search for tweets with a term, we employ the streaming
API to collect data in a distributed fashion. When your data collection is
finished, depending on your access level, we conduct an array of analytics
on the data set, ranging from the ordinary dump of data in MySQL/CSV to
Network graph visualizations, gender breakdowns, and more."
It seems to hit most of your bullet points; though I can't speak to their
stability or long-term viability.
Matt
On Wed, Feb 23, 2011 at 12:04 PM, Cornelius Puschmann <cornelius.puschmann@uni-duesseldorf.de> wrote:
...
*Note:* I've also blogged this (in case links in the post don't work) and
will list all alternatives suggested to me in that blog post:
http://blog.ynada.com/616
Dear all,
A few days ago, the people behind Twitter archival site
TwapperKeeper.com<http://twapperkeeper.com/> announced
that they will be discontinuing the export feature of the service on March
20, 2011<http://twapperkeeper.wordpress.com/2011/02/22/removal-of-export-and-download...>.
Apparently the feature is in violation of Twitter’s terms of service, at
least in the form it’s currently implemented in TwapperKeeper.
Unfortunately this cuts off a number of academics who are investigating
communication on Twitter for scientific purposes from a convenient data
source. While it’s fairly easy to get data directly via the Twitter
API<http://apiwiki.twitter.com/> (which
is what TwapperKeeper was doing), I know many people who want to concentrate
on the data itself, rather than running their own servers to scrape Twitter
on a regular basis. What’s more, Twitter’s attitude is worrisome:
many of us have tried to get an exemption from API rate limits in the past,
to no avail. Twitter doesn’t give researchers privileged access to their
data, and now they’re crippling TwapperKeeper on top of that.
Bottom line: what will we use after March 20? Ideally, a replacement would
provide the following:
 - the hashtag/search query functionality of TwapperKeeper,
 - the export functionality of TwapperKeeper,
 - exclusive use for academic purposes (on the grounds that this might
   keep Twitter from shutting it down),
 - stability and reliability,
 - long-term viability.
The last point is important, because I don’t think it will be difficult to
set up a server somewhere to suit the needs of a few people, but a
larger-scale solution seems more sensible in the long run. Maybe
JISC<http://www.jisc.ac.uk/> can do something like that, based on
yourTwapperKeeper<http://code.google.com/p/yourtwapperkeeper/> (which they
supported<http://twapperkeeper.wordpress.com/2010/04/16/jisc-funded-developments-to-tw...>)?
Or one of the big institutes (OII, Berkman)? Either way it would be nice to
find an alternative that doesn’t give those of us with devs and major IT
support behind them a huge edge over the rest…
Thanks in advance for your comments,
Cornelius
---
Dr. Cornelius Puschmann
Department of English Language and Linguistics
Heinrich-Heine-University Düsseldorf, Germany
Junior Researchers Group "Science and the Internet"
http://nfgwin.uni-duesseldorf.de
_______________________________________________
The Air-L@listserv.aoir.org mailing list
is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at:
http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers:
http://www.aoir.org/

Devin Gaffney