Text Data in Marketing: Data Sources, Linguistic Features, and Software Programs
Dear Internet Researchers:

I am delivering a demo to my department faculty, titled "Text Data in Marketing: Data Sources, Linguistic Features, and Software Programs." I would appreciate your critique of my coverage and any suggested changes or additions.

As the title indicates, I will cover the following three aspects:

1. Show, via an example, text data in marketing:
   a. firm-generated text data (e.g., earnings calls)
   b. consumer/user-generated text data (e.g., Twitter)
   c. other-generated, marketing-relevant text data. "Other" could include market stakeholders (competitors, suppliers, organizational customers) and nonmarket stakeholders (news media, consumer organizations, regulators, legislators).

   I will take the example of the Volkswagen emissions scandal and show how this event led to text data generated by Volkswagen and its varied marketing stakeholders. I will mention various secondary data sources that my colleagues can use to obtain or buy text data.

   If you know of any other source, or an example more insightful than Volkswagen, please help me.

2. I will then discuss linguistic features of the text. These features include sentiment, emotion, cognition, named entities, readability, subjectivity, structural complexity, lexical complexity, and topic modeling/mining. I chose these features because I have used them in my research and can talk about them.

   If you know of any other useful feature that I am missing, please respond.

3. Software programs, both paid (e.g., LIWC) and free R/Python libraries, that take text as input and output the above linguistic features. I will start with LIWC and explain its variables from a psychological standpoint. I will then demo syuzhet and sentimentr, showing how their output variables offer new and different insights relative to LIWC. I will mention other R and Python packages (such as TensorFlow, MXNet, and TextBlob). I will demo the MALLET GUI for topic modeling. I will then mention how researchers can use MTurk to annotate their text data and then write a classifier (or hire a Python programmer to write it for them).

   If you know of any paid software program (that is as easy to use as LIWC) or any other R/Python package, please suggest it.

Thank you!

Vivek Astvansh
Assistant Professor of Marketing, Kelley School of Business, Indiana University
http://kelleyschool.iu.edu/astvansh | +1 (812) 855-8953
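
As a rough illustration of two of the linguistic features named in item 2 of the message above (readability and lexical complexity), the sketch below computes a Flesch Reading Ease score and a type-token ratio using only the Python standard library. It is not taken from the message and is not how LIWC, syuzhet, or sentimentr compute anything. The function names, the naive sentence splitter, the vowel-group syllable heuristic, and the sample sentence are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break on ., !, or ? followed by whitespace or end of text.
    return [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]


def tokenize(text):
    # Lowercase alphabetic tokens; apostrophes are kept inside words.
    return re.findall(r"[A-Za-z']+", text.lower())


def count_syllables(word):
    # Heuristic (assumption): count vowel groups and drop a silent trailing "e".
    # Dictionary-based syllable counts would be more accurate.
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    # Flesch Reading Ease:
    # 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    sentences = split_sentences(text)
    words = tokenize(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


def type_token_ratio(text):
    # Lexical diversity: unique word forms divided by total words.
    words = tokenize(text)
    if not words:
        raise ValueError("text must contain at least one word")
    return len(set(words)) / len(words)


if __name__ == "__main__":
    # Invented sample sentence, not a quotation from any real statement.
    sample = "We are sorry. We will work hard to win back the trust of our customers."
    print("Flesch Reading Ease:", round(flesch_reading_ease(sample), 2))
    print("Type-token ratio:", round(type_token_ratio(sample), 2))
```

Higher Flesch scores indicate easier text, and the type-token ratio falls as a text gets longer, so texts of very different lengths should not be compared on it directly.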
In reply to Astvansh, Vivek <astvansh@iu.edu> (Wed, Oct 9, 2019 at 9:26 AM):

See this article:

Netzer, Oded, Ronen Feldman, Jacob Goldenberg, and Moshe Fresko (2012), *"Mine Your Own Business: Market Structure Surveillance through Text Mining,"* *Marketing Science*, 31 (3), 521-543.
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
Participants (2): Astvansh, Vivek; Thomas Ball