Dear Sina Özdemir,

If I understand correctly, you could try searching the link below for a training dataset.

https://www.kaggle.com/datasets

Good luck,
Giz

R.A. Ayse Giz Gulnerman
Istanbul Technical University
Faculty of Civil Engr. - Geomatics Engr. Dep.
34469 Maslak Istanbul / Turkey
gulnerman@itu.edu.tr

On Wed, Apr 29, 2020 at 12:08 PM Sina Furkan Özdemir <sina.ozdemir@ntnu.no> wrote:
Dear all,
I have been following some 800 Twitter accounts for my Ph.D. dissertation over the last four months. I have ended up with 400,000 tweets that I need to sort into four mutually exclusive categories.
I looked at previous work on similar tasks, and it seems that the best approach is to combine word embeddings with a recurrent neural network using an LSTM architecture.
The problem I am having right now is that I have not been able to find training data for the classification. Can anyone recommend some literature on sampling strategies for short-text classification tasks?
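To make the sampling question concrete, here is a minimal sketch of one possible strategy: draw a fixed number of tweets per account at random and hand-label that subset as training data. Stratifying by account is only an assumption for illustration, not something taken from the literature, and the names `stratified_sample`, `per_account`, `account` and `text` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(tweets, per_account, seed=0):
    """Draw up to `per_account` tweets from each account, without replacement.

    `tweets` is assumed to be a list of dicts with hypothetical keys
    "account" and "text"; adapt the keys to the real data layout.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    # Group tweets by the account that posted them.
    by_account = defaultdict(list)
    for tweet in tweets:
        by_account[tweet["account"]].append(tweet)

    sample = []
    for account in sorted(by_account):  # sorted for a deterministic order
        group = by_account[account]
        k = min(per_account, len(group))  # small accounts contribute all they have
        sample.extend(rng.sample(group, k))
    return sample


# Toy data: 40 tweets spread evenly over 4 made-up accounts.
tweets = [{"account": f"user{i % 4}", "text": f"tweet {i}"} for i in range(40)]
to_label = stratified_sample(tweets, per_account=3)
print(len(to_label))  # 12 tweets selected for hand-labelling
```

Sampling per account keeps very active accounts from dominating the labelled set. Whether that is desirable depends on the research question, and a simple random sample over all tweets may be preferable if the categories should reflect the overall volume.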
Best,
Sina Özdemir
Ph.D. Candidate, NTNU, Trondheim
M.A. Comparative and International Studies, ETH Zurich & University of Zurich, Switzerland
B.A. Political Science and International Relations, Middle East Technical University, Turkey
_______________________________________________ The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/