Sally,

Machine-generated sentiment analysis scores are sometimes abused as a shortcut to avoid certain forms of manual/mental labor in a variety of commercial and academic contexts. In this scenario, language tools are treated as magic buttons to be deployed against corpora in the name of charts untouched by serious validation.

I prefer it when humans are in the loop, which is itself recursive (meaning you repeat until there is no room to improve), using tools as filters to generate purposive samples that humans annotate and collectively validate using a systematic process.

Sentiment problems range from hard to harder to hardest, where hardest means you cannot do it in a manner that can be validated by any means. Nothing on this scale of tasks is easy if false positives or false negatives could cost a life or have some other serious consequence. Making it easier requires a process, grossly boiled down below:

1. Collect a relevant and representative corpus of data,
2. Build a spam detection classifier to remove non-relevant data (e.g., wrong language OR no discernible sentiment),
3. Build a topic classifier and focus on one key topic first (not all topics at once),
4. Solve the Rubik's cube of how many codes and what they really mean (e.g., happy/sad OR angry/frustrated/both/neither...),
5. Test the topic-specific annotation scheme with a group of no fewer than five independent annotators (not just two),
6. Crowdsource the task to larger groups when possible, using memo writing to identify boundary cases that kill/modify models,
7. Use iteration to identify elite annotators through recursive validation, memo reviews, and scoring against a gold standard.

The goal is to build task- and language-specific machine classifiers using the best possible human experts in the process. The main idea, however, is to keep a critical role for humans.

~Stu

On Thu, Sep 5, 2019 at 4:11 PM Dr. S.A. Applin <sally@sally.com> wrote:
Dear Charles (and List),
I see this as an ethics issue.
How reliable are “emotion analysis” tools? How would outcomes from them be used?
As you say, there is a lack of clarity in some tools in terms of “explaining emotional categories.” To me, this signals (along with obvious knowledge about the limitations and problems with algorithms) that there is opportunity here to be very, very, very wrong about people’s opinions, and any algorithmically interpreted “emotional” state.
For example, how would one interpret or finesse “frustration” vs. “anger”? The written word is contained within a language. Not all commenters will be native speakers of that language, and not all native speakers have the language tools required (even within their own language) to adequately express themselves, even in the best of times. What makes anyone think an algorithm would do better at this than a human trained in qualitative methods and with cultural and media and language knowledge?
There is way too much margin of potential error here for this to be automated, or “useful.” It is much more likely that things will be assumed incorrectly by limited algorithms in the first place.
Furthermore, does your student see any problem with this exercise? That their tool analysis might get it very wrong? That the wrong might lead to assumptions or outcomes that are harmful to entities, people, governments?
What safeguards are in place for wrong assumptions and outcomes?
Kind regards,
Sally
Sally Applin, Ph.D. .......... Research Fellow HRAF Advanced Research Centres (EU), Canterbury Centre for Social Anthropology and Computing (CSAC) .......... Research Associate Human Relations Area Files (HRAF) Yale University .......... Associate Editor, IEEE Consumer Electronics Magazine Member, IoT Council Executive Board Member: The Edward H. and Rosamond B. Spicer Foundation .......... http://www.posr.org http://www.sally.com I am based in Silicon Valley .......... sally@sally.com | 650.339.5236
On Sep 5, 2019, at 3:52 AM, Charles M. Ess <c.m.ess@media.uio.no> wrote:
Dear colleagues,
One of our students wants to analyze emotional content in the comment fields of a major newspaper vis-a-vis specific hot-button issues.
She has a good tool (I think) for scraping the data - but she is stymied over the choice of an emotion analysis tool. She has looked at Senpy (http://senpy.gsi.upm.es/#test) and Twinword <https://www.twinword.com/api/emotion-analysis.php> - the latter seems the most accurate, but it is also expensive. She has recently discovered DepecheMood emotion lexicons (Staiano, J., & Guerini, M. (2014). Depechemood: a lexicon for emotion analysis from crowd-annotated news. arXiv preprint arXiv:1405.1605.) - but this suffers from a lack of clarity in terms of explaining its emotional categories: awe, indifference, sad, amusement, annoyance, joy, fear and anger.
For my part, I am entirely clueless. Any suggestions that she might pursue would be greatly appreciated.
best, - charles ess -- Professor in Media Studies Department of Media and Communication University of Oslo <http://www.hf.uio.no/imk/english/people/aca/charlees/index.html>
Postboks 1093 Blindern 0317 Oslo, Norway c.m.ess@media.uio.no _______________________________________________ The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
-- Dr. Stuart W. Shulman Founder and CEO, Texifter Cell: 413-992-8513 LinkedIn: http://www.linkedin.com/in/stuartwshulman
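
Editorial sketch, not part of the original messages: step 7 of the process at the top of this thread (scoring annotators against a gold standard) and the boundary-case review in step 6 could be roughed out as below. Everything here is an assumption made for illustration: the comment IDs, the annotator names, the labels, and the choice of plain accuracy as the score. Only three annotators are shown for brevity, although the process asks for at least five.

```python
# Hypothetical gold-standard labels for four newspaper comments.
GOLD = {"c1": "angry", "c2": "frustrated", "c3": "neither", "c4": "angry"}

# Hypothetical labels from three independent annotators.
ANNOTATIONS = {
    "ann_a": {"c1": "angry", "c2": "frustrated", "c3": "neither", "c4": "angry"},
    "ann_b": {"c1": "angry", "c2": "angry", "c3": "neither", "c4": "frustrated"},
    "ann_c": {"c1": "frustrated", "c2": "angry", "c3": "neither", "c4": "frustrated"},
}


def accuracy_against_gold(labels, gold):
    """Share of gold-standard items that this annotator labeled correctly."""
    scored = [item for item in gold if item in labels]
    if not scored:
        return 0.0
    hits = sum(labels[item] == gold[item] for item in scored)
    return hits / len(scored)


def rank_annotators(annotations, gold):
    """Annotators sorted from highest to lowest accuracy against the gold set."""
    scores = {
        name: accuracy_against_gold(labels, gold)
        for name, labels in annotations.items()
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


def contested_items(annotations):
    """Items on which annotators disagree: candidates for memo review."""
    seen_labels = {}
    for labels in annotations.values():
        for item, label in labels.items():
            seen_labels.setdefault(item, set()).add(label)
    return sorted(item for item, seen in seen_labels.items() if len(seen) > 1)


if __name__ == "__main__":
    for name, score in rank_annotators(ANNOTATIONS, GOLD):
        print(f"{name}: {score:.2f}")
    print("Review these items:", contested_items(ANNOTATIONS))
```

Plain accuracy is the simplest possible score; a chance-corrected agreement measure may be more appropriate when the label distribution is skewed. The contested items are where humans stay in the loop: they are reviewed and discussed rather than resolved automatically.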