Twitter Collection and Analysis Toolkit seeks caring new owner
Dear Colleagues and Friends,

I hope this message finds everyone doing well and managing through this difficult time.

I am posting to this list because, after 7+ years, I am decommissioning a Digital Methods Initiative Twitter Collection and Analysis Toolkit (DMI-TCAT) installation that I oversee. More details on this system, developed by Erik Borra and Bernhard Rieder, are available on GitHub here: https://github.com/digitalmethodsinitiative/dmi-tcat

While this is an open source platform that was generously made freely available by the developers (sincere and deep thanks to Erik and Bernhard), there can be costs involved with hosting and storage of the system and data. So while there are a variety of reasons for my decision, including a finite amount of research funds that I can dedicate each year, I am truly saddened to be pulling the plug, especially because I am fully aware of the value this system carries and its importance as a resource for academic research.

To cut to the chase, it costs roughly $370 per month to host and store this TCAT on Amazon Web Services (AWS), and right now this TCAT install is at about 70% elastic capacity with more than *560 million tweets* collected over the last few years on a wide variety of topics, many of which relate to political and health communication.

If anyone is interested in taking over this system, or if there are questions out there, please reach out to me off list and I'm happy to discuss. If there are no takers, this data will no longer be maintained by me after June 30, and repositories of literally hundreds of millions of tweets on COVID-19 and other topics will be lost to the ether.

Sorry for the long-ish message and thanks for your consideration.

Best regards,
Jacob

--
Dr. Jacob Groshek
Ross Beach Chair in Emerging Media Research and Associate Professor
Kansas State University
jacobgroshek.com | @jacobgroshek <https://twitter.com/jacobgroshek> | google scholar <https://scholar.google.nl/citations?user=G1XXhccAAAAJ&hl=en>
Honorary Associate Professor, Roskilde University <https://ruc.dk/en/department-communication-and-arts>
Associate Director, CMCS @ Boston U <https://sites.bu.edu/cmcs/> | Founding Editor, *JoCTEC <http://www.joctec.org/>*
Previously: Erasmus U <https://www.eur.nl/en/eshcc/research/ermecc/people/research-fellows> | NeSCoR <http://nescor.socsci.uva.nl/> | Boston Civic Media <http://bostoncivic.media/> | IAST <http://www.iast.fr/>
+1-857-615-4709
Hi Jacob,

I am sorry to hear about DMI-TCAT needing to be turned off. But I certainly appreciate the costs of keeping a service like this online.

I wonder if it might be feasible to generate tweet ID datasets for the various collections and add them to the Documenting the Now Catalog [1]? The tweet IDs could then be "hydrated" by people who want to use the data, using tools such as the Hydrator [2] or twarc [3].

If this sounds like it might be appropriate and you need some help, I would be willing to lend a hand, since I work on the Documenting the Now project.

Sincerely,
Ed Summers

[1] https://catalog.docnow.io
[2] https://github.com/docnow/hydrator
[3] https://github.com/docnow/twarc

On May 30, 2020 2:28 PM, Jacob Groshek <jgroshek@gmail.com> wrote:
> [...]

_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
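The tweet ID dataset idea raised in Ed Summers's reply can be sketched roughly as follows. This is only a minimal illustration, not a description of how DMI-TCAT or the Documenting the Now tools work internally. It assumes a collection has already been exported to CSV and that the tweet ID sits in a column named `id`; the column name, the function names, and the sample rows are all assumptions made for the example.

```python
import csv
import io
from typing import Iterator, TextIO


def iter_tweet_ids(export: TextIO, id_column: str = "id") -> Iterator[str]:
    """Yield unique, numeric tweet IDs from a CSV export, in first-seen order."""
    seen = set()
    for row in csv.DictReader(export):
        # IDs stay as strings so very large values are never rounded.
        tweet_id = (row.get(id_column) or "").strip()
        # Skip blanks, malformed values, and repeats.
        if tweet_id.isdigit() and tweet_id not in seen:
            seen.add(tweet_id)
            yield tweet_id


def write_id_dataset(export: TextIO, out: TextIO, id_column: str = "id") -> int:
    """Write one tweet ID per line and return how many were written."""
    count = 0
    for tweet_id in iter_tweet_ids(export, id_column):
        out.write(tweet_id + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory stand-in for an exported collection (made-up rows).
    sample = io.StringIO(
        "id,created_at,text\n"
        "1266700000000000001,2020-05-30 14:28:00,first\n"
        "1266700000000000002,2020-05-30 14:29:00,second\n"
        "1266700000000000001,2020-05-30 14:28:00,first\n"
    )
    ids_file = io.StringIO()
    written = write_id_dataset(sample, ids_file)
    print(written)
    print(ids_file.getvalue(), end="")
```

A file with one ID per line is the shape that hydration tools generally expect; with twarc installed, something along the lines of `twarc hydrate ids.txt > tweets.jsonl` should then retrieve whatever tweets are still publicly available.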
Hi Jacob,

Is there an option for those of us on the AoIR list to collectively support this valuable resource? This tool has been invaluable to my work many times over the years, as I'm sure it has for others. Thank you for facilitating the tool for this extended period.

I'd hate to see it disappear, so AoIR folk, is there any way we could collectively resource this valuable tool?

Regards,
Jonathon

DR JONATHON HUTCHINSON | Senior Lecturer Online Communication and Media
Department of Media and Communication | Faculty of Arts and Social Sciences
HDR Coordinator
Treasurer of Australian and New Zealand Communication Association (ANZCA)
Secretary of International Association of Public Media Research (IAMPR)
THE UNIVERSITY OF SYDNEY
Rm N233, John Woolley Building (A20) | The University of Sydney | NSW | 2006
T +61 2 9351 2821 | F +61 2 9351 2434 | M +61 421 178 971
E jonathon.hutchinson@sydney.edu.au | W sydney.edu.au <http://sydney.edu.au> | W jonathonhutchinson.com.au <http://www.jonathonhutchinson.com.au/>

On 1/6/20, 10:32 am, "Air-L on behalf of ehs@pobox.com" <air-l-bounces@listserv.aoir.org on behalf of ehs@pobox.com> wrote:
> [...]
Participants (3):
- ehs@pobox.com
- Jacob Groshek
- Jonathon Hutchinson