Some insights for the Russian language:

It seems that the state of the art for Russian is an LSTM with word2vec (http://www.dialog-21.ru/media/3380/arkhipenkoetal.pdf). However, in our experience, a dictionary approach with SentiStrength (http://sentistrength.wlv.ac.uk) yields comparable results. Our lab develops a dictionary for texts on social and political topics; it is available at http://www.linis-crowd.org (in Russian). You can learn more about the dictionary from this paper: http://www.dialog-21.ru/media/3400/koltsovaoyuetal.pdf.

Best regards,
Sergei Pashakhin
Laboratory for Internet Studies, National Research University Higher School of Economics
https://linis.hse.ru/en/
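
For readers unfamiliar with the dictionary approach mentioned in the message above, here is a minimal sketch in Python. It is not SentiStrength's algorithm and does not use the linis-crowd.org dictionary: the four-word lexicon, the weights, and the single-word negation rule are illustrative assumptions only. Russian is heavily inflected, so a real pipeline would also need lemmatization before lookup, which this sketch omits.

```python
import re

# Hypothetical toy lexicon (word -> sentiment weight), for illustration only.
# A real run would load a full sentiment dictionary instead.
TOY_LEXICON = {
    "хороший": 1,
    "отличный": 2,
    "плохой": -1,
    "ужасный": -2,
}

# Assumed negation rule: "не" flips the sign of the next token's weight.
NEGATORS = {"не"}


def tokenize(text):
    """Lowercase the text and return runs of Cyrillic letters."""
    return re.findall(r"[а-яё]+", text.lower())


def score(text, lexicon=TOY_LEXICON):
    """Sum lexicon weights over tokens, flipping the sign after a negator."""
    total = 0
    negate = False
    for token in tokenize(text):
        if token in NEGATORS:
            negate = True
            continue
        weight = lexicon.get(token, 0)
        total += -weight if negate else weight
        negate = False  # negation applies only to the immediately following token
    return total
```

A positive total suggests positive sentiment and a negative total suggests negative sentiment; words missing from the lexicon contribute nothing.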
On 29 Oct 2018, at 09:22, Xanat Meza via Air-L <air-l@listserv.aoir.org> wrote:
There is a benchmark that automatically translates text to English and then does sentiment analysis: https://www.researchgate.net/publication/261959618_iFeel_a_system_that_compa...
Xanat V. Meza
Ph.D. candidate - Kansei, Behavioral and Brain Sciences, University of Tsukuba
M.A. Media and Communication, Yeungnam University
B.D. Graphic Communication Design, Universidad Autonoma Metropolitana
On Saturday, 27 October 2018, 8:11:58 p.m. GMT+9, Stuart Shulman <stuart.shulman@gmail.com> wrote:
DiscoverText.com works with everything we have tested it on, including Hebrew, Arabic, Mandarin, and others. I would like to hear from you if we can add Russian to the list. I will send you a sponsored license to test it out. Please let us know if it works. We are happy to sponsor anyone working on research to protect democratic societies from authoritarian assaults on the ballot box.
From our product description:
"Most text analytics software packages work well with English text and a handful of other languages; however, many of these tools fail when analyzing non-Latin, multilingual texts, such as Arabic, which appears correctly only in a right-to-left format. Further, many software solutions have additional problems tokenizing text when it is an ideograph-based language (e.g. Chinese or Korean). Texifter’s software, DiscoverText, is unique in that it is capable of effective operations on multilingual texts and the coding platform builds effective custom machine classifiers on the fly and at scale for these corpora."
Stu Shulman <https://twitter.com/StuartWShulman>
NEFC-West <https://www.nefc.us/west> 2008 Boys Head Coach
On Fri, Oct 26, 2018 at 4:37 PM John P. Bell <John.P.Bell@dartmouth.edu> wrote:
Hi all,
I’m looking for tools to do sentiment analysis and general mining on Russian language tweets. I see there are some options out there, but if anyone has experience trying to do this I’d appreciate it if you could share some insight on the software you used. While I’d be more interested in something I can set up and run myself than subscribing to a service, I’m not absolutely committed to that idea.
Any suggestions?
Thanks,
- John
--
John P. Bell, PhD
Lead Application Developer (Digital Humanities), Dartmouth Research Computing
Asst. Prof. of Digital Curation, University of Maine
http://johnpbell.info
_______________________________________________ The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/