Internet Explorer has an option called a waff file -- it allows you to archive a website as many links deep as you like, excluding links to other sites if you so desire, including pictures, movies, etc. In Internet Explorer, go to your File... Save as... and choose the Web Archive function. :-D.

Deanya Lattimore, ABD
Syracuse Writing Program
http://www.deanya.com/

On Sunday, June 5, 2005, at 03:06 PM, air-l-aoir.org-request@listserv.aoir.org wrote:

Date: Sun, 05 Jun 2005 13:59:39 +0100
From: <s.vicari@reading.ac.uk>
Subject: [Air-l] SW to store webpages
To: air-l@listserv.aoir.org
Message-ID: <E1Deuj5-0007Br-00@vimb3>
Content-Type: text/plain; charset=utf-8
Hi,
I am a PhD student at the University of Reading, UK. I am running a study on 200 protest group websites. Could you suggest any good software for storing whole websites offline?
Thanks a lot, at the moment I am a bit lost in links and buttons... ste
Stefania Vicari PhD student in Sociology University of Reading PO Box 218, Reading, RG6 6AA, United Kingdom.
Deanya Lattimore wrote: I take the liberty to quote your header:
X-Mailer: Apple Mail (2.552)
Internet Explorer has an option called a waff file -- it allows you to archive a website as many links deep as you like, excluding links to other sites if you so desire, including pictures, movies, etc.
This feature seems to only be available on Macs! WinHTTrack, though, integrates into IE (although there are many reasons why Firefox and Opera are more suitable browsers for research purposes; Opera, e.g., automatically time-stamps the saved page and inserts the URL into the body of the HTML file).

Thomas
--
thomas koenig, ph.d.
http://www.lboro.ac.uk/research/mmethods/staff/thomas/
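For readers wondering what "as many links deep as you like, excluding links to other sites" amounts to, here is a minimal Python sketch of that idea. It is not how Internet Explorer or WinHTTrack work internally; the names `LinkCollector`, `same_site_links` and `crawl` are made up for illustration, and `fetch` is an assumed callable that you would have to supply yourself (for example, one built on `urllib.request`).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def same_site_links(page_url, html):
    """Return absolute links found in html that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    found = []
    for href in collector.links:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).netloc == host and absolute not in found:
            found.append(absolute)
    return found


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first walk from start_url, at most max_depth links deep.

    fetch is a hypothetical callable mapping a URL to its HTML text.
    Returns a dict {url: html} of every page visited.
    """
    pages = {}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in pages:
            continue  # already stored
        pages[url] = fetch(url)
        if depth < max_depth:
            for link in same_site_links(url, pages[url]):
                if link not in pages:
                    queue.append((link, depth + 1))
    return pages
```

A real mirroring tool also has to fetch images and stylesheets, rewrite links for offline use, and respect robots.txt, none of which this sketch attempts.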
elijah wright wrote:
This feature seems to only be available on Macs!
On Windows the extension is .mht.
The only thing it seems to do is transform a single web page into some (fortunately at least human-readable) gibberish, embedding CSS files and the like in a single file; it doesn't mirror entire web sites. BTW, a funny thing: I am using (not normally, but for present purposes) IE 6.0.2800, yet the mht file claims "<Saved by Microsoft Internet Explorer 5>". I guess IE will replace QWERTY in future textbook examples of path dependency theory.

--
thomas koenig, ph.d.
http://www.lboro.ac.uk/research/mmethods/staff/thomas/
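The "human-readable gibberish" is a MIME multipart message, the same container format used for e-mail with attachments, so an .mht file can be taken apart with an ordinary mail parser. The following Python sketch uses only the standard `email` package; the function names `list_mht_parts` and `extract_html` are invented for this illustration, and files saved by a particular browser may contain quirks it does not handle.

```python
from email import message_from_bytes, policy


def list_mht_parts(mht_bytes):
    """Return (content_type, content_location) for each part of an .mht archive."""
    message = message_from_bytes(mht_bytes, policy=policy.default)
    parts = []
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        location = part.get("Content-Location")
        parts.append((part.get_content_type(),
                      str(location) if location is not None else None))
    return parts


def extract_html(mht_bytes):
    """Return the decoded text of the first text/html part, or None."""
    message = message_from_bytes(mht_bytes, policy=policy.default)
    for part in message.walk():
        if part.get_content_type() == "text/html":
            # get_content() undoes quoted-printable/base64 and the charset.
            return part.get_content()
    return None
```

Typical use would be `extract_html(open("page.mht", "rb").read())`, where `page.mht` stands for whatever file the browser saved.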
Internet Explorer has an option called a waff file -- it allows you to archive a website as many links deep as you like, excluding links to other sites if you so desire, including pictures, movies, etc.
This feature seems to only be available on Macs!
most features are only available on macs or unix-based systems.... kindly leave your exclamation marks at home.

Jeremy Hunsinger
Center for Digital Discourse and Culture
()  ascii ribbon campaign - against html mail
/\                        - against microsoft attachments
Jeremy Hunsinger wrote:
This feature seems to only be available on Macs!
most features are only available on macs or unix-based systems...
I doubt it. Especially, since MS tends to redefine bugs as "features." ;-) But apart from bugs, there is, of course, much more software available for Win than for any other OS.
kindly leave your exclamation marks at home.
IE is a product of MS, and MS's OS is Windows. So I would expect IE to have the same or even more features in its native environment than in its Mac version. That seems to be a straightforward expectation. Since I am therefore astonished that IE for Mac carries a possibly useful feature that IE for Win doesn't offer, I translate my astonishment into an exclamation mark. Why not? ASCII has limited expressive possibilities anyway, as millions of unnecessary flame wars testify. So why restrict the available code even further? I'm aware that you might have interpreted the exclamation mark differently (how?), and, indeed, I also used it as a caution for Windows users, who, I suspect, are the overwhelming majority on this list. Again: what's wrong with that use?

--
thomas koenig, ph.d.
http://www.lboro.ac.uk/research/mmethods/staff/thomas/
The ability to scour the web automatically brings with it the challenge of dealing with all the data. So, another relevant question to this thread is, what are the automation tools (if any) that people are using for, say, content analysis? Don Holeman
Cox wrote:
The ability to scour the web automatically brings with it the challenge of dealing with all the data. So, another relevant question to this thread is, what are the automation tools (if any) that people are using for, say, content analysis?
I personally use a lot of software to aid me in analysis. I put an overview of this software here: http://www.lboro.ac.uk/research/mmethods/research/software/index.html Another good place to start is: http://www.textanalysis.info/ But since you ask about "automatic tools", I linked the "automatic concept mappers" I am aware of here: http://www.lboro.ac.uk/research/mmethods/research/software/stats.html#map

I tried out several of those programs. The good thing about them: they are very fast. If they produce useless results, you haven't lost much time; if they produce useful ones: great! The bad thing: as far as I know, they have not really been assessed using traditional content-analytical criteria. I know of only one exception, a (in my view very useful) paper about Leximancer forthcoming in "Behavior Research Methods". Maybe there are others, but I don't know about them. So, for now, I use them as a heuristic device rather than as "analysis". You would probably need to give more details about what you mean by "content analysis" if you would like more pointed suggestions.

Thomas
--
thomas koenig, ph.d.
http://www.lboro.ac.uk/research/mmethods/staff/thomas/
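To give a rough feel for the kind of heuristic such automatic tools start from, here is a toy Python sketch that counts how often two words occur in the same sentence. It is an assumption-laden illustration, not the algorithm of Leximancer or of any program linked above; the function name `cooccurrences` and the tiny `STOPWORDS` list are invented for the example.

```python
import re
from collections import Counter
from itertools import combinations

# Illustrative stop list only; a real study would need a fuller one.
STOPWORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"})


def cooccurrences(text, stopwords=STOPWORDS):
    """Count how often two distinct words share a sentence."""
    pairs = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z']+", sentence)) - stopwords
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(words), 2):
            pairs[pair] += 1
    return pairs
```

Calling `cooccurrences(text).most_common(20)` lists the most frequent pairs; like the tools discussed, this is at best a heuristic device and says nothing about validity or reliability.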
most features are only available on macs or unix-based systems...
I doubt it. Especially, since MS tends to redefine bugs as "features." ;-) But apart from bugs, there is, of course, much more software available for Win than for any other OS.
you might doubt it, but it is a fact of the interface and its limitations. you could disagree there, but then you would be sorely pressed when faced with the full powers of the unix command-line. which is also one of the reasons a unix user can do more with wget than httrack can do, but there is plenty of evidence in that arena already. think of it like this: each program can be scripted, and all data can be piped or saved and reread, so anything you can think of that httrack can do can be done, and beyond that we can pipe the output through any number of further systems, from simply stripping the html, to counting every word, to doing a full statistical abstract of the text alone, or of the text and code. if you know how to use the unix command line, each command is a multiplier.
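The "strip the html, then count every word" chain can be sketched in a few lines. The version below is in Python rather than a shell pipeline, purely so that it is self-contained; the names `TextExtractor`, `strip_html` and `word_counts` are made up for the illustration, and the tokenizer is a crude assumption (lower-case ASCII letters and apostrophes only).

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keep visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def strip_html(html):
    """Stage 1: reduce a saved page to its text, with whitespace normalized."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(" ".join(extractor.chunks).split())


def word_counts(text):
    """Stage 2: count every word in the stripped text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))
```

Each function takes the previous one's output, as in a pipe: `word_counts(strip_html(page)).most_common(10)` gives the ten most frequent words of a saved page.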
as for the exclamation marks, it was turning very clearly into a troll war. little escalations such as the exclamation mark will be interpreted in many ways, and i don't know your intent, but I do know that at least one person saw it as a slight. let's not go there. We have provided plenty of answers to the question, they all work, and they all can work to at least one user's satisfaction, so unless new information is available....

jeremy hunsinger
jhuns@vt.edu
www.cddc.vt.edu
jeremy.tmttlt.com
www.tmttlt.com
()  ascii ribbon campaign - against html mail
/\                        - against microsoft attachments
Jeremy Hunsinger wrote:
you might doubt it, but it is a fact of the interface and its limitations. you could disagree there, but then you would be sorely pressed when faced with the full powers of the unix command-line.
1) Are you aware that HTTrack also offers a command line interface for Unix? 2) I am a social scientist, as I (maybe wrongly) thought were most of the people on this list. I am familiar with UNIX, which already seems to be kind of a rarity among social scientists. Most of us aren't and, I might add, shouldn't. For most purposes WebCopier seems an entirely sufficient tool. Why shouldn't it? Give me examples, where its limitations pose a problem.
which is also one of the reasons a unix user can do more with wget than httrack can do, but there is plenty of evidence in that arena already.
If there is, why don't you refer to that evidence? I gave already several sources, including the article by Kellogg, which concludes: "With little coding, HTTrack can be extended to meet immediate mirroring needs."
as for the exclamation marks, it was turning very clearly into a troll war.
Now, who's the other troll besides myself? I really don't appreciate these "dissent equals trolling" allegations. And I do think that the metaphor of the troll has become overused in CMC.
little escalations such as the exclamation mark will be interpreted in many ways, and i don't know your intent, but I do know that at least one person saw it as a slight.
Well: If an exclamation mark is considered to be "an escalation", what's next? Banning of question marks? An exclamation mark might be ambiguous, although, I think, I clarified now at length what I wanted to say. In contrast: "kindly leave your exclamation marks at home." is much less ambiguous in my view. It is clearly an imperative, which I happen to disagree with.
let's not go there.
Why not? Obviously, there is disagreement on this list about it, so the only way to resolve this disagreement, it seems to me, is to voice one's opinions. I, for one, think that the approach of trying not to offend anyone, no matter what, is a bad one, because it stifles discussion. That does not mean that one should not try to be courteous; one clearly should. But academic deliberations live on dissent, which is bound to offend those whose primary goal is harmony. I personally am not willing to phrase my postings so as to make sure that nobody gets offended. I try to follow the rules offered by Joshua Cohen:

1. Deliberations take place in an argumentative fashion, through the ordered exchange of reasons and information among counterparts that make proposals and submit them to criticism.
2. Deliberations are inclusive and public, and all those potentially affected by their decisions must have equal opportunities to participate and to decide in them.
3. Deliberations are free of inner coercions able to undermine the equal position of participants, and everyone must have the opportunity to be heard, to introduce new issues, to make proposals and to criticize them. The coercion without coercions of the best argument is the only rule for accepting or refusing an argument.
4. Deliberations are generally oriented to reach a rationally grounded agreement, and can in principle be continued or resumed at any given moment, yet the need for decision demands them to have an agreed final point.
5. Political deliberations reach all those issues that can be regulated on behalf of the public interest, but that is not to say that issues traditionally judged as ‘private’ must forcefully remain out of discussion.
6. Political deliberations are also extended to the interpretation of needs and to changes in prepolitical attitudes and preferences.

(Cohen, Joshua (1996), “Procedure and Substance in Deliberative Democracy”, in S. Benhabib (ed.), Democracy and Difference. Contesting the Boundaries of the Political, Princeton: Princeton University Press, pp. 95-119.)

Habermas, in "Between Facts and Norms" (Faktizitaet und Geltung), BTW quotes the same rules. These rules were initially intended for political decision making, but they also seem appropriate for academic deliberations. I don't see that I violated any of these rules. Frankly, I even guess you and I would agree on most of these points and on many other points. So it's even more interesting to see where we disagree.

Thomas, preferring Unix over Windows, but acknowledging that the latter is the OS of this decade. (BTW: I also switched from HighCom to Dolby for the same reasons)
--
thomas koenig, ph.d.
http://www.lboro.ac.uk/research/mmethods/staff/thomas/
for my part, i only agree with 5 and 6 of cohen and find the rest indicative of the academic and politically privileged act of definition, which some would term hegemonic. to claim deliberation is argumentative, aiming toward reason, etc. is to make a normative claim that the world should be 'like Cohen imagines his world' instead of trying to describe the world. as for all prior discussion, i'm going to bed.

jeremy hunsinger
jhuns@vt.edu
www.cddc.vt.edu
jeremy.tmttlt.com
www.tmttlt.com
()  ascii ribbon campaign - against html mail
/\                        - against microsoft attachments
This comment raises, I think, an interesting issue: to what extent ought we to be familiar with computer science as Internet researchers? In other words, should we know the basics of programming, of UNIX, of html, etc.? My own knowledge is pretty patchy - I did two semesters of algorithms (in Pascal as I recall) about 15 years ago in college; I can hand-code a website with basic HTML and CSS but no scripting; I used to be able to write AppleScripts; and I can navigate up and down a unix system (I basically know the 'ls' command and the 'cd' command). I know what a webserver does and can read log files. And owing to my research I now know something about how search engines function :-) That's it, though -- perl and python are strangers to me, I can't gzip or untar things, and as for the grep commands in AtlasTi... well, let's just say I'm probably not using the program to its full extent. Still, I know more than most of the other people I know who are studying new media. But is that right? Should I know more, should they know more? Do you think there is a minimum level of technical competence that you need?

Elizabeth
2) I am a social scientist, as I (maybe wrongly) thought were most of the people on this list. I am familiar with UNIX, which already seems to be kind of a rarity among social scientists. Most of us aren't and, I might add, shouldn't.
Hi,

That is indeed an interesting question. The answer would be pretty much determined by what the individual wishes to research, and the particular focus they wish to take. I see absolutely no reason, in theory, why a social scientist should have to know perl, grep or other scripting tools. If they do, it may enable them to take their research in a different direction. I use an Apple PowerBook with Mac OS X installed, but I never have to use UNIX for my day-to-day work and my research. Having said that, I have a good knowledge of UNIX and can use grep and javascript if I have to. I could also use perl but would rather jump off a bridge into a swiftly flowing river first. My technical background has enabled me to understand the reasons why certain things function the way they do on the Internet and to use some of this knowledge in explicating my findings from a sociotechnical point of view.

Obviously, there will be some tools/software etc. that people need to use because of the direction they wish to take their research, and these may have been developed for command-line use only because that is all the original developer needed. I guess that would be one of the factors others have to consider when adopting the tool. Can they learn how to use it? Or is there a better tool available that still meets budgetary constraints? In other words, if you only know how to use a spade, is there any point in hiring a mechanical digger that you don't know how to operate to dig that drain you promised to dig 6 months ago? Or do you pay a commercial operator to come in and do it for you? Or take a course in mechanical digger operation?

Andrew
--
email: andrewwenn@mac.com
internet: http://homepage.mac.com/andrewwenn/

On 06/06/2005, at 3:22 PM, Elizabeth Van Couvering wrote:
This comment raises I think an interesting issue, to what extent ought we to be familiar with computer science as Internet researchers? In other words, should we know the basics of programming, of UNIX, of html, etc.?
My own knowledge is pretty patchy - I did two semesters of algorithms (in Pascal as I recall) about 15 years ago in college; I can hand-code a website with basic HTML and CSS but no scripting; I used to be able to write AppleScripts; and I can navigate up and down a unix system (I basically know the 'ls' command and the 'cd' command). I know what a webserver does and can read log files. And owing to my research I now know something about how search engines function :-) That's it, though -- perl and python are strangers to me, I can't gzip or untar things, and as for the grep commands in AtlasTi... well, let's just say I'm probably not using the program to its full extent.
Still, I know more than most of the other people I know who are studying new media. But is that right? Should I know more, should they know more? Do you think there is a minimum level of technical competence that you need?
Elizabeth
2) I am a social scientist, as I (maybe wrongly) thought were most of the people on this list. I am familiar with UNIX, which already seems to be kind of a rarity among social scientists. Most of us aren't and, I might add, shouldn't.
_______________________________________________ The Air-l-aoir.org@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers:http://www.aoir.org/
I am a social science student with a previous degree in statistics. I often claim that knowing statistics allows me to read social science studies better, because the statistics in a study are something I can understand. I am uniquely me.. wildly interdisciplinary... and sure, many social sciences require a student to understand statistics in order to become a professional. That said, some very important professors of mine, who write reports for national commissions, IMHO use statistics at a very shallow level. Sometimes a critique will attack someone's statistics, and then on the next page of the same paper a misuse of statistics will occur.

Peter Timusk B.Math
Just trying to stay linear
www.crystalcomputing.net >blog> http://logbook.crystalcomputing.net
www.webpagex.org >blog> http://notebook.webpagex.org

Again a comment from a student. Once upon a time one could submit a handwritten essay. As recently as 2000, all written work in a GIS course had to be printed from a computer or a typewriter. One course, called Law in the Information Society, in 2003 required students to use blogs for their work. In another course, a sociology of science and technology course, students were graded on the quality of their posts to a course newsgroup. OK, these are simply software for editing English. This is the level now required of students. Thus we can assume the professors of these courses also had these skills.

As a student I can do a classroom presentation with an OpenOffice (aka PowerPoint) presentation and know I will impress, even though, up against a professional private-sector PowerPoint user, my presentation would look very bad. I can play real audio files in a presentation and score high. The point is that, at this moment in history, technology will help a student score higher. Perhaps technical skills should be looked at with some suspicion, as they may hide weak research.

Personally, I am coming from the sciences and have just completed a coding course in data mining within my legal studies BA, and I have found a scientist studying information security who may need my skills in data mining to support a study into computer crime. I should also mention that searching the library is aided by knowing code and the idea of unforgiving coding.

Peter Timusk B.Math
Just trying to stay linear
www.crystalcomputing.net >blog> http://logbook.crystalcomputing.net
www.webpagex.org >blog> http://notebook.webpagex.org
participants (8)
- Andrew Wenn
- Cox
- Deanya Lattimore
- elijah wright
- Elizabeth Van Couvering
- Jeremy Hunsinger
- Peter T.
- Thomas Koenig