Hi All,

I use Web Content Extractor from Newprosoft (http://www.newprosoft.com/web-content-extractor.htm). The good thing is that I can extract specific fields (such as date, name, user, or the whole post), apply some manipulations through JavaScript code, and export the whole dataset to Excel.

Regards,
Yohanan Ouaknine
MA student, Information studies (Knowledge management), Bar Ilan University (Israel)

On Fri, Jan 20, 2012 at 6:05 AM, Wendy Christensen <wchriste@bowdoin.edu> wrote:
Hi All,
I used SiteSucker to download entire blogs (http://www.sitesucker.us/home.html). It worked very well for both Blogspot and WordPress blogs, but excess files had to be cleaned up and deleted before analysis.
I ran into issues trying to find content analysis software that would allow me to code HTML files. If anyone has suggestions for software for qualitative analysis of websites and/or downloaded HTML files, I'd love to hear about them!
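One possible workaround for the HTML-coding problem, offered only as a sketch: strip the markup from the downloaded pages and load the resulting plain text into whatever qualitative analysis package you already use. The example below relies only on Python's standard library; the class and function names (`VisibleTextExtractor`, `html_to_text`) are made up for illustration, and it makes no attempt to separate the post body from navigation menus or sidebars.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes and ignores the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # greater than 0 while inside a skipped tag
        self._chunks = []      # non-empty pieces of visible text, in order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML document, one text node per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

To convert a whole downloaded blog, one could loop over the `.html` files in the download folder, call `html_to_text` on each file's contents, and write the result to a matching `.txt` file.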
Best, Wendy
Wendy M. Christensen, Ph.D.
Visiting Assistant Professor
Department of Sociology and Anthropology
Bowdoin College
wchriste@bowdoin.edu
On Jan 20, 2012, at 3:40 AM, Jarkko Moilanen wrote:
hi,
Quoting Stuart Shulman <stuart.shulman@gmail.com>:
WordPress has a feature for this:
http://en.blog.wordpress.com/2006/06/12/xml-import-export/
If it is a WordPress blog, you can ask the owner to create a bulk export in XML.
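Once the owner has sent such an export file, the posts can be pulled out of it with a few lines of Python. This is a minimal sketch, not a full importer: it assumes the export is an RSS-style document in which each post is an `item` element with `title`, `pubDate`, `dc:creator`, and `content:encoded` children, so check those names against the actual file before relying on it. The function name `read_posts` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes assumed to be used by the export file.
NAMESPACES = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_posts(xml_text):
    """Return one dict per <item> with title, date, author, and body."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "date": item.findtext("pubDate", default=""),
            "author": item.findtext(
                "dc:creator", default="", namespaces=NAMESPACES),
            "body": item.findtext(
                "content:encoded", default="", namespaces=NAMESPACES),
        })
    return posts
```

The resulting list of dicts can then be written to a spreadsheet or text files for coding. An export may also contain pages and attachments as `item` elements, so some filtering may be needed.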
If you are archiving a blog whose export functions you can't access, I would use 'wget'. It has options to fetch everything, no matter how deep the structure is.
http://en.wikipedia.org/wiki/Wget
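As a rough illustration of the wget approach, the Python sketch below assembles a recursive-download command and prints it. The helper name `build_wget_mirror_command`, the output folder, the one-second delay, and the example.com address are placeholders chosen for this example, not anything from the thread. The flags themselves are standard wget options, but which combination suits a particular blog is a judgment call.

```python
import shlex


def build_wget_mirror_command(blog_url, output_dir="blog-archive", wait_seconds=1):
    """Build an argument list for mirroring a blog with wget."""
    return [
        "wget",
        "--mirror",            # recursive download with unlimited depth
        "--convert-links",     # rewrite links so pages work offline
        "--adjust-extension",  # save HTML pages with an .html suffix
        "--page-requisites",   # also fetch images and stylesheets
        "--no-parent",         # stay below the starting directory
        f"--wait={wait_seconds}",            # pause between requests
        f"--directory-prefix={output_dir}",  # where to save the files
        blog_url,
    ]


if __name__ == "__main__":
    # Placeholder address; replace with the blog to be archived.
    command = build_wget_mirror_command("http://example.com/blog/")
    print(shlex.join(command))
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run` if wget is installed. The wait option keeps the download from hammering the blog's server.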
/Jarkko
Better still is the new offering from GNIP:
http://blog.gnip.com/gnip-and-automattic-make-whole-new-universe-of-data-ava...
The future is bright for getting big collections.
~Stu
On Thu, Jan 19, 2012 at 9:31 PM, C Sosnowy <c_sosnowy@yahoo.com> wrote:
I would like to be able to archive an entire blog (and ideally be able to download it) for analysis. I've looked at WebCite and Zotero, but neither seems to have this capability. Does anyone know of another way?
Collette Sosnowy, M.A., Ph.D. Candidate
Environmental Psychology Program
The Graduate Center of the City University of New York
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
--
Dr. Stuart W. Shulman people.umass.edu/stu
Editor Emeritus, JITP http://www.jitp.net
Director, QDAP-UMass http://www.umass.edu/qdap
Founder and CEO, Texifter http://www.texifter.com
LinkedIn: http://www.linkedin.com/pub/stuart-shulman/10/351/899
Twitter: http://twitter.com/#!/StuartWShulman
****************************
Jarkko Moilanen (+358 45 8877 150)
M.Soc.Sc. (Political Science)
PhD Student, Information studies, University of Tampere
Blog: http://blog.ossoil.com/
-------------------------
Founder of Hackerspace 5w, Finland, Tampere - http://5w.fi
Founder of MeeGo Network Finland, http://meegonetwork.fi
Founder of Open Coral - http://open-coral.org
Founder of Finnish Biohacker community, http://biohakkeri.fi
****************************
--
יוחנן ועקנין
Yohanan Ouaknine
050-6279777
yohanan.ouaknine@ois.co.il
http://il.linkedin.com/in/yohananouaknine