It is proving harder to turn off the Twitter APIs than imagined. Data is still flowing, and there is still no clear messaging about the launch of the new regime. It seems that operating Twitter itself depends on the API.

Many Twitter researchers already have important but understudied collections. For example, these are my "Top 100" largest datasets: https://tinyurl.com/100LargeDatasets. These 100 curated "medium" datasets total 185 million records. I also have about 1,000 smaller datasets collected over the last decade. I am just one person in a massive ecosystem of disparate collections. My guess is that, across all researchers at all institutions, there are more highly relevant, valuable, understudied Twitter datasets than academia can fully parse.

This is not to say the fight to keep the data flowing and low-cost or free is unimportant. However, take a quick look at my Top 100, then imagine all the data already stored by the academics (especially in computer science departments) who work at 10X-100X the scale I do. The key is to create more accessible repositories so that we can support teaching and research with (please forgive me) the "bird in the hand" and not lose hope over unlimited birds in the bush.

To that end, if you see a dataset on the list and you want to study it or teach it (maybe both), that is legal, free, and possible on demand via DiscoverText.

~Stu

--
Dr. Stuart W. Shulman
Founder and CEO, Texifter
Editor Emeritus, *Journal of Information Technology & Politics*