You may also want to look at the tools Anatoliy Gruzd has developed for analysis of threaded texts. See www.textanalytics.net or https://www3.isrl.illinois.edu/~agruzd2/icta_web/

/Caroline

---- Original message ----
Date: Thu, 18 Jun 2009 19:16:45 -0400
From: Dhanaraj Thakur <dthakur@gatech.edu>
Subject: [Air-L] question on how to identify email threads on listserv
To: <air-l@listserv.aoir.org>
hey all,
part of the research I am doing requires that I identify threads on a listserv for analysis. Threads consist of emails that are a series of responses to an initial email.
Of course, the easiest way to do this is to sort emails by subject line. However, as you might know, this is incomplete because some participants change the subject for a variety of reasons while remaining in the same thread. One could instead analyze the information in the email headers to identify threads, but in my case this data is not always available. Alternatively, one could manually scan through the text of the emails, which is very time-consuming with a large email corpus.
Therefore, what I need is a method (preferably automated) that can identify email threads by looking at the texts of the emails. I can imagine some software that does this and can create clusters of emails based on semantic similarities that I could equate to threads - but I haven't been able to identify any just yet...
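To make the clustering idea concrete, here is a minimal sketch of text-based thread assignment. It is only an illustration under stated assumptions, not an existing tool: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `assign_threads`), the smoothed TF-IDF weighting, the greedy "attach to the most similar earlier email" rule, and the 0.2 threshold are all hypothetical choices. It also assumes the emails are already in chronological order and that quoted text and signatures have been handled separately.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens only; no stemming or stop-word removal.
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict) per document.
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    n = len(docs)
    vectors = []
    for counts in counts_per_doc:
        # Smoothed IDF so every weight stays positive.
        vectors.append({
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        })
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_threads(emails, threshold=0.2):
    # Greedy pass: each email joins the thread of its most similar
    # earlier email, or starts a new thread if nothing is similar enough.
    vectors = tfidf_vectors(emails)
    thread_ids = []
    next_id = 0
    for i, vec in enumerate(vectors):
        best_sim, best_j = 0.0, None
        for j in range(i):
            sim = cosine(vec, vectors[j])
            if sim > best_sim:
                best_sim, best_j = sim, j
        if best_j is not None and best_sim >= threshold:
            thread_ids.append(thread_ids[best_j])
        else:
            thread_ids.append(next_id)
            next_id += 1
    return thread_ids
```

Calling `assign_threads(list_of_email_bodies)` returns one thread id per email, in input order. The threshold would need tuning on a hand-labelled sample, since short replies often share few words with the message they answer.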
the units of analysis that I have described are fairly common and, I imagine, so is my problem. Thus perhaps people on this list can point me to existing methods/software/papers that have already addressed this issue?
thanks
Dhanaraj

Dhanaraj Thakur
Ph.D. Candidate
School of Public Policy
Georgia Institute of Technology
Caroline Haythornthwaite
Professor, Graduate School of Library and Information Science,
University of Illinois at Urbana-Champaign,
501 East Daniel St., Champaign IL 61820
haythorn@illinois.edu OR haythorn@uiuc.edu