I can see why it is hard to find good data mining tools in social science. One reason is sampling. Sample representativeness is essential to most social science research (for example, probability-based phone surveys or multi-stage stratified sampling). Because sampling schemes vary across studies, customized code is often preferred.

We did random sampling on Weibo and Twitter. The methodology is described in the following two PLOS ONE papers. Hope this helps.

Fu, KW, & Chau, M. (2013). Reality Check for the Chinese Microblog Space: A Random Sampling Approach. PLoS ONE, 8(3), e58356. doi:10.1371/journal.pone.0058356

Liang, H, & Fu, KW. (2015). Testing Propositions Derived from Twitter Studies: Generalization and Replication in Computational Social Science. PLoS ONE, 10(8), e0134270. doi:10.1371/journal.pone.0134270

King-wa Fu
Associate Professor, Journalism and Media Studies Centre, The University of Hong Kong
Visiting Associate Professor 2016-2017, MIT Media Lab (Fulbright Scholar)
website: https://sites.google.com/site/fukingwa/

-----Original Message-----
From: Air-L [mailto:air-l-bounces@listserv.aoir.org] On Behalf Of Gillian Bolsover
Sent: Wednesday, May 10, 2017 9:33 PM
To: Stefania Vicari <s.vicari@sheffield.ac.uk>; Helen Kennedy <h.kennedy@sheffield.ac.uk>
Cc: air-l@listserv.aoir.org
Subject: Re: [Air-L] Chinese language social media data mining tools

As part of my PhD, I did a lot of research based on data collected from both Weibo and Twitter. Finding few existing, functional tools, I wrote custom Python code to download and process various sorts of data from both Twitter and Weibo, including code to tokenize Weibo posts.

Seeing this thread brings up an issue I have been thinking about: how the community of Internet researchers works with code. Other academics I know who work in the sciences share all their code online (GitHub etc.), have a practice of working together to debug this code, and receive academic credit when their code is used by others. I've seen very little of this in social science research.

Are there any Internet researchers who share code they have created who could advise as to what their practices are in this regard? Is there any sort of standard among Internet researchers (and should there be) in terms of sharing code created for academic purposes with other academics?

Gillian Bolsover
Researcher
Oxford Internet Institute
University of Oxford
PGP Key: 17EC60B3

________________________________________
From: Air-L [air-l-bounces@listserv.aoir.org] on behalf of Stefania Vicari [s.vicari@sheffield.ac.uk]
Sent: Tuesday, 9 May 2017 19:51
To: Helen Kennedy
Cc: air-l@listserv.aoir.org
Subject: Re: [Air-L] Chinese language social media data mining tools

It may be worth looking at: https://api.anacode.de/landing/

Best,
S

On 9 May 2017 at 16:58, Helen Kennedy <h.kennedy@sheffield.ac.uk> wrote:
Hello clever AOIR folks
Asking for postgrad students: any recommendations of social media data mining tools that work on Chinese social media platforms / with Chinese languages?
Thanks!
Helen
--
Professor Helen Kennedy, Chair in Digital Society
Department of Sociological Studies / Faculty of Social Sciences
Elmfield, Northumberland Road
Sheffield S10 2TU
T: 0114 2226488
E: h.kennedy@sheffield.ac.uk
LATEST ARTICLE: 'The Feeling of Numbers: emotions in everyday engagements with data and their visualisation <http://journals.sagepub.com/doi/abs/10.1177/0038038516674675?journalCode=soca>', *Sociology*, 2017.
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
--
Stefania Vicari
Senior Lecturer in Digital Sociology
Programme Manager for the MA Digital Media and Society
Department of Sociological Studies
The University of Sheffield
Elmfield, Northumberland Road
Sheffield S10 2TU
Webpage: http://www.sheffield.ac.uk/socstudies/staff/staff-profiles/stefania-vicari
Email: s.vicari@sheffield.ac.uk
Twitter: @stefaniavicari <https://twitter.com/stefaniavicari>
Recent paper: Vicari, S. & Cappai, F. (2016) Health Activism and the Logic of Connective Action <http://www.tandfonline.com/doi/full/10.1080/1369118X.2016.1154587>. *Information, Communication & Society* 19(11): 1653-1671.