I believe many of these authors (see the first link below) trust and acknowledge our contribution to their work on these sorts of data management problems over the last 14 years:

https://tinyurl.com/DT4Twitter

When academics sign up for a free TrustDefender account, which is DiscoverText with some cool new features that people should try out for teaching or research, they get "the welcome email" copied below. As it points out, we have continued to provide access to tools and data despite massive changes in the ecosystem. It is not a perfect, friction-free data world, but if you take the briefing we offer, you will have answers to your list of questions.

https://calendly.com/discovertext

We can get you from novice to intermediate in under 90 minutes. We can get you data. We have unique theories, methods, and tools. We use the Twitter display, so you get all the visual elements. We segment metadata and make it highly amenable to filtering. We provide unique deduplication and sampling tools for creating purposive samples. We enable multi-coder, crowdsourced annotation and measurement of inter-rater reliability. We provide a one-of-a-kind adjudication system for refining annotation, creating gold-standard training sets, and ranking human annotators. We provide automated loading of data and keystroke coding. We provide onboard machine learning and deliver all of these elements in a graphical user interface. The five pillars of text analytics are all in one place.

--- The Welcome Email ---

Yearlong TrustDefender Group License

I have just issued a yearlong TrustDefender group license to you. Did you receive the email? It sometimes is directed to a spam folder.

TrustDefender is an improved version of DiscoverText specifically designed to make teaching and collaborative research easier. You will be the administrator of your own group account and will have the ability to send out licenses to other people you would like to collaborate with.
Any recipient of a license from you who registers an account will automatically be available to collaborate via your peer network. We have streamlined this specific peer collaboration process in TrustDefender. This service is 100% free for academics.

Each member of your peer group can get a license either as part of your account or you can send me their emails and I will send them each a group account they can control. Either way, and this part is important, you need to remain "visible" in the TrustDefender Peer Network to form connections and collaborate.

If you are a professor and you want to use TrustDefender in class, feel free to let me know if you have questions or want a meeting about your research design or implementation pedagogy.

Please review some of the introductory videos: https://vimeo.com/showcase/5553857

If you do want to upload spreadsheets, that is easy to do: https://vimeo.com/622539257

Book a meeting with the inventor: https://calendly.com/discovertext

We are working on a keyword list for users: https://tinyurl.com/DTManual

You can find a robust DiscoverText literature and creative ideas about methods in these academic papers: https://discovertext.com/mentions/

There is a lot of uncertainty in the Twitter data ecosystem. You can no longer collect Twitter data in real time using an API. Contact me directly about legacy datasets, which can be shared via the DiscoverText peer network: https://vimeo.com/503173700

There is an option to access Twitter data produced over the last 12 months via Meltwater for a fee. Please contact me for a demo. https://www.meltwater.com/en

Many scholars have massive stored Twitter datasets in the raw JSON format.
You can upload any historical Twitter data in JSON format to TrustDefender for analysis: https://vimeo.com/679097662

No academic should study Twitter data in a spreadsheet until they have spent 7 minutes watching this "Case Against...": https://vimeo.com/526218014

Going forward, the opportunities to collect unique, tailored, specific real time or historical data will be very sharply curtailed by Twitter. As time passes, archival questions of what remains and what is lost to history may track closely to the history of newspapers. I completed a dissertation in 1999 about crumbly newspapers from the Progressive Era. Some were available, others were not. This applies to Twitter data now and it always did. Some data was preserved, much is lost or will be lost, even with serious archival efforts.

I invite you to book a web meeting or send me a note if you have questions about what remains possible. There are important questions about what comes next in the history of information and how we work together to preserve research opportunities.

Many people ask about Facebook, Instagram, and other social data. DiscoverText has not been connected to Facebook's API since 2014. We do not store or access any social data except Twitter. There is some Reddit data in the Meltwater pipeline. A caveat is that some academics do have legal access to non-Twitter social data and that data, when stored in spreadsheets, can be uploaded by researchers into their DiscoverText account like any other spreadsheet: https://vimeo.com/622539257

A lot of people want to code transcripts of interviews and other semi-structured data. This is not the ideal use case, but if your data fits in a spreadsheet, it may be possible to make use of these tools.

I look forward to supporting your work,
~Stu

On Thu, Nov 21, 2024 at 6:36 AM Ollier Malaterre, Ariane via Air-L <air-l@listserv.aoir.org> wrote:
Hello amazing community,
I'm working with Emilie Szwajnoch on a project to document the birth of the Chinese "social credit system" chimera in Western anglophone countries and we have naïve questions on how to extract tweets as this is our first time doing this and the situation with X is evolving rapidly. I hope experts in the community can help!
a. How can we extract tweets in their entirety (text, pictures, and links)?
b. If we extract tweets with shortened links, will the links work after extraction?
c. Is it possible to extract all tweets posted in specific countries based on an X user's location? Or will we not have a full sample if an X user is not sharing their location (may this depend on countries' privacy regulations)?
d. Is there a provider that this community trusts to do the extraction in case we don't do it ourselves?
Thank you very much!

Ariane Ollier-Malaterre, PhD <https://sites.google.com/site/olliermalaterre/home>
Canada Research Chair on Digital Regulation at Work and in Life <https://digitalregulation.uqam.ca/en/home-english/>
New Book: Living with Digital Surveillance in China <https://www.routledge.com/9781032517704>
New Book: Le management à l'ère numérique <https://www.puq.ca/catalogue/livres/management-ere-numerique-4352.html>
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
--
Dr. Stuart W. Shulman
Founder and CEO, Texifter
Editor Emeritus, *Journal of Information Technology & Politics*