Re: [Air-L] on the Wayback Machine (was public/private [part 1 of 2])
But to complicate your argument below (and Hughie's in a separate thread), a couple of thought experiments:
Say I place content on a "publicly accessible" webpage without creating any incoming links or notifying anyone. Web crawlers won't find it. A search engine won't index it. While on the open and public Internet, unless a random URL-generator happens to guess the precise address of the page, no one will ever read it. Is this content "fair game for researchers"?
I don't profess to be a "hardware" or a "search engine" expert, but I can tell you that the above thought experiment only works if you have complete control of the hardware and software involved in the webpage. Many hosts' webpages link to the pages of folks hosted by their service. It's a term and condition of service...so it's not nearly as easy to be completely off the search engines as one may think.

Beyond that point, for those of us that do "in the wild" research...how would we find your thought experiment if it's not indexed? I guess it would be possible to have a crawler that would randomly generate URLs and check their validity. That might be useful for some research question, but in that case I would assume the unit of analysis would be the URL, at most, and more likely it would be a proof of concept test of some kind - so no human subject would be involved.

Now beyond that, I would argue that from a research perspective the blog was still open to study, if it can be found. I say that because the best terrestrial analogy I can think of to your thought experiment is someone who posted a broadsheet on a pole but with a cover over the broadsheet...lots of things could pull that cover off...the wind, rain, or a passing person who didn't like the color of the cover. In short, a person who would go to that much trouble to hide in plain sight...well it's still plain sight.

I think that's part of why this "what if" game gets bogus: a person who would make an effort to do the above isn't all that likely to be "real" and I study real people. "What if's" are valuable in these discussions but again we have to remember that we can dream up far more elaborate scenarios than we are ever likely to see in the wild.
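To put a rough number on the random-URL crawler mentioned above, here is a back-of-the-envelope sketch in Python. Every figure in it is an assumption made for illustration, not something stated in the thread: a 10-character path drawn from letters and digits, a crawler that sweeps the space without repeats, and a rate of 1,000 requests per second against a single host.

```python
import string

# All three values below are illustrative assumptions.
ALPHABET = string.ascii_letters + string.digits  # 62 possible characters
PATH_LENGTH = 10                                 # length of the unguessable path
REQUESTS_PER_SECOND = 1000                       # crawler speed against one host

SECONDS_PER_YEAR = 365 * 24 * 3600


def search_space(alphabet_size: int, length: int) -> int:
    """Number of distinct paths of the given length."""
    return alphabet_size ** length


def expected_years(space: int, rate: float) -> float:
    """Average time for a sweep without repeats to hit one specific path.

    On average half of the space is tried before the target is found.
    """
    seconds = (space / 2) / rate
    return seconds / SECONDS_PER_YEAR


space = search_space(len(ALPHABET), PATH_LENGTH)
years = expected_years(space, REQUESTS_PER_SECOND)
print(f"{space:,} candidate paths, about {years:,.0f} years on average")
```

Under these assumptions the average wait comes out at roughly 13 million years, which is why blind guessing is treated in this thread as a proof-of-concept exercise rather than a realistic way to find a page. A shorter or more predictable path changes the picture completely.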
Consider a different scenario: Walking down the street with a friend, I comment on the fact my rent check bounced, and she offers condolences. We're in public. We make the utterances loud enough that someone next to us could easily hear them, but not loud enough that someone across the street can. Now, this is a public utterance - we didn't have the conversation behind locked (password-protected) doors. Can this be archived for research? Better yet, can we set up microphones to automatically record every conversation uttered in public? Does the fact these conversations *can* be recorded and archived mean they, by default, *should be*? I feel we're saying the same about utterances on the Web - so what's the fundamental difference from a user perspective between the street and the Web? Both are a place where we engage in conversations and maintain relationships....
I fear we're confusing what the Internet *is* and *is capable of* with *how it is used*. Yes, comments left on a personal blog are open for anyone to see. Yes, discussion board conversations can be archived. But that's not, I suspect, going through the mind of many casual users of this technology. It is a medium for communication, for connecting with people. I leave messages on my neighborhood parents discussion board because that's how I connect with that community. I use my real name because I'm among friends. Does that mean my comments are de facto fair game for any researcher who wants to scrape the database?
Actually I think the "confusion" is at a different level. Your second thought experiment gets at it fairly well. When I am designing research I can only know the "potential" harms, not the actual ones...potentials are before the fact, actuals are retrospective.

Let's change your experiment a bit. You are home and in the shower getting ready to meet your friend. You think about the stuff you are going to tell them and you decide to mention your bounced check. And maybe it crosses your mind that you need to be careful where you say that so fewer people overhear. When you decide to tell your friend you don't know who is really around...or what someone who overhears might do with the utterance. You only know that you are going to do your best to make sure that you are careful. Same with research design and human subjects work. We have to learn from actual happenings, but if we sit and do statistically (oh my, a qualitative researcher just used the s-word) improbable "what-if's" then no research will happen at all.

Personally I believe the following -

- Research is never risk-free. Neither is life. If you are working toward absolute zero risk in your research you will either give up the pursuit, or build a protocol for your research that lies to you. Yes, there are ways to protect YOUR subjects but most of them can "endanger" others in the process (more on this below). Merry Christmas, that is not zero risk, it's just minimal risk for the research and probably a heck of a lot of your work that didn't buy anyone more "safety." All of the aforementioned "harm" and "protective" terms were used only as part of the discussion, because what is the real chance of harm from most of our research, one-in-a-million, one-in-a-billion, one-in-a-trillion?

Medical experiments don't get to absolute zero risk...they just shift the potential risk - which may be minimal as well - from the researcher to the patient...unless something unexpected happens and then the risk is back on the researcher. And how do they "know" the potential risks? From computer simulations - based on facts known from old (and often questionable) human experiments, and/or from animal trials. So their "potential" risk knowledge comes from analyzing actual risk after a trial, not from "what if's."

- I design research...I predict the potential harm from my "intervention" and I work to minimize the risk ("minimal risk" is the operative term). I do not mind read or assume my position as researcher places me in a superior position and that my subjects need special protections from me. Sorry folks but I truly find that last concept to be pretty funny. I am not injecting people with foreign substances, I look at artifacts they produced and placed online.

Personally, I have never had an artifact's author/artist come back to me after I published and complain. I have had a single author question my use of his work without permission - I work with teens remember - and so we talked and by the end of the discussion he understood and told me he was pleased his work was in the paper; he just thought he HAD to give permission first. Plus he found out by tracking his visitor logs and finding I had visited repeatedly, so it was not because of an unpleasant experience on his part.

Now before I sound like a totally heartless researcher who only sees her subjects as things...when nothing could be further from the truth...if even one of the kids I work with were hurt by my research I would have a crisis of conscience. But again I am not God; I know that if I have done the best job I could do working through my research design and trying to predict all the potential harms likely to happen with my intervention then I have done everything I can do.
Some years ago, I wrote a classroom paper that discussed why I don't pseudonymize my subjects in my chatroom research. The simple reason is that there is no way I could come up with nicknames that would cover the participants and not potentially deflect on to another chatroom user. I'm good but I'm not THAT good. So if I did pseudonymize I might protect my participants but in that one-in-a-billion case where someone was in harm's way from something I did...I might well have created the situation rather than hidden it. I'll let you "what if'ers" out there run that one through the filters.

Lois
I have not read the full thread so please forgive me if I am repeating the same information. A web crawler will find you, that's the point. There are a finite number of IP addresses, 4,294,967,296 (2^32); these are what get resolved from a URL. If you don't want to be crawled, create a robots.txt file on your web server and search engines will skip you. http://www.robotstxt.org/wc/norobots.html

Martin.
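As an illustration of the robots.txt suggestion above, the following minimal Python sketch uses the standard library's `urllib.robotparser` to show how a cooperating crawler would interpret such a file. The file contents, the crawler name `ExampleBot` and the `example.org` URLs are made up for the example. Note that robots.txt is only a request: it relies on the crawler choosing to honour it, and it does nothing to stop a person or program that ignores it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of the whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only fences off one directory.
BLOCK_PRIVATE_ONLY = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a well-behaved crawler may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The address count quoted in the message is the size of the IPv4 space.
ipv4_address_count = 2 ** 32

print(allowed(BLOCK_EVERYTHING, "ExampleBot", "http://example.org/diary.html"))
print(allowed(BLOCK_PRIVATE_ONLY, "ExampleBot", "http://example.org/private/diary.html"))
print(allowed(BLOCK_PRIVATE_ONLY, "ExampleBot", "http://example.org/index.html"))
```

The first two checks report that fetching is not allowed and the third that it is, which is the behaviour a polite search-engine crawler would follow.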
Web crawlers don't typically have much luck crawling by IP address. Name-based virtual hosting at the level of the web server tends to make it less than adequate. Best practice for virtual hosting is to make a hit directly to an IP address (rather than a name) return... nothing.

--e
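The point about name-based virtual hosting can be sketched in a few lines of Python. This is a toy model, not any particular web server's configuration: the host names, page bodies and the choice of an empty 404 response are assumptions for illustration. The idea is that many sites share one IP address and the server picks a site from the HTTP `Host` header, so a crawler that connects to the bare IP address sends no recognised name and gets nothing useful back.

```python
from typing import Optional, Tuple

# Hypothetical sites that all live on the same IP address.
VIRTUAL_HOSTS = {
    "blog.example.org": "<h1>Blog</h1>",
    "shop.example.org": "<h1>Shop</h1>",
}


def respond(host_header: Optional[str]) -> Tuple[int, str]:
    """Choose a site by Host header; unknown or missing names get an empty 404."""
    if host_header is None:
        return 404, ""
    # Drop an optional ":port" suffix and compare names case-insensitively.
    name = host_header.split(":")[0].lower()
    body = VIRTUAL_HOSTS.get(name)
    if body is None:
        # A request addressed to the raw IP address lands here.
        return 404, ""
    return 200, body


print(respond("blog.example.org"))  # request by name
print(respond("192.0.2.10"))        # request by bare IP address
```

A crawler that enumerates IP addresses therefore learns very little about which named sites exist on a shared server.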
Interesting point about what is accessible on the Internet. I'd not judge the number of possibilities by the use of IP addresses. It is common practice to have many websites attached to one IP address, and many IP addresses are used to connect to the internet but do not provide web content. Even when web content is available at an address, a complete path is necessary to get to the content. I've often placed content that I'd prefer the world not see using a web address that has no referring links and would not easily be guessed.

Search engines follow links that they find on pages. The big engines don't follow random possible content locations. Yes, there are programs that would allow a researcher (cracker) to explore all link possibilities on a site. Such an attempt without permission would be unethical. On the other hand, if you've announced your content to the world, the world has a right to explore your content.

I believe that we would all agree that information that a poster has made some effort to make private through the use of a password or even simple obscurity requires informed consent before a researcher should be allowed to use it. On the other hand, publicly presented information should be fair game. This does bring up an interesting question though. At what point can a researcher use hidden information? Historians routinely use the content of diaries and letters that the authors would probably prefer never become public.

The net is providing a fifth estate. Current USA laws are moving towards giving bloggers the same protections and responsibilities that are enjoyed by commercial reporters. Publicly posted content that is clearly intended to be read is fair game and should not require review any more than using a reference from a journal or popular magazine.

Charlie Balch

-----Original Message-----
From: elw@stderr.org
Sent: Tuesday, August 14, 2007 8:10 AM

A web crawler will find you, that's the point. There are a finite number of IP addresses, 4,294,967,296 (2^32); these are what get resolved from a URL.

Web crawlers don't typically have much luck crawling by IP address. Name-based virtual hosting at the level of the web server tends to make it less than adequate. Best practice for virtual hosting is to make a hit directly to an IP address (rather than a name) return... nothing.

--e
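Charlie's observation that search engines follow links, rather than guessing locations, can be illustrated with a toy crawler in Python. The "site" below is an in-memory dictionary invented for the example, with one page (`/secret.html`) that nothing links to. A breadth-first crawl that starts at the home page and only follows `href` values never reaches it.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical site: path -> HTML. No page links to /secret.html.
PAGES = {
    "/": '<a href="/about.html">About</a> <a href="/posts.html">Posts</a>',
    "/about.html": '<a href="/">Home</a>',
    "/posts.html": '<a href="/about.html">About</a>',
    "/secret.html": "<p>Unlinked page</p>",
}


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, pages):
    """Breadth-first crawl that discovers pages only through links."""
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        extractor = LinkExtractor()
        extractor.feed(pages.get(path, ""))
        for link in extractor.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


found = crawl("/", PAGES)
print(sorted(found))
print("/secret.html" in found)
```

The unlinked page stays undiscovered here only because nothing points to it. As noted earlier in the thread, a single link from a hosting provider's directory page, or from anywhere else the crawler visits, would be enough to expose it.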
In the context of my message below, these two articles in today's news jumped out. No need to read the articles, the headlines give you the idea. FYI: I use iGoogle to aggregate a number of news feeds -- these headlines are what some popular news agencies thought was important enough to make their top three.

Pedophile Blogger Arrested Near UCLA Day Care Facility http://www.foxnews.com/story/0,2933,293173,00.html

Dutch bloggers due in court over filming under skirts http://news.zdnet.com/2100-9588_22-6202451.html

The articles bring up the interesting question about the exposure of third parties in blogs. I still believe that researchers of blogs do not require informed consent from the bloggers but what about the persons discussed in the blogs? I suspect that this tertiary exposure is a problem in any research.

Charlie Balch

-----Original Message-----
From: air-l-bounces@listserv.aoir.org [mailto:air-l-bounces@listserv.aoir.org] On Behalf Of Charlie Balch
Sent: Tuesday, August 14, 2007 9:33 AM
To: air-l@listserv.aoir.org
Subject: Re: [Air-L] The Spiders will find you (was wayback machine waspublic/private)

_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
Thanks, Charlie. I don't mean to thread hijack, but you reminded me of this interesting tidbit I came across the other day:

Google News Blog: Perspectives about the news from people in the news http://googlenewsblog.blogspot.com/2007/08/perspectives-about-news-from-peop...

Not exactly a watershed moment, perhaps, but intriguing nonetheless.

Conor
Two questions ...

On literary blogs, blogs that contain creative writing, blogs that clearly display a copyright symbol or a request on the blog page that copyright should be respected, does a researcher have the right to "research" the contents of that blog (which to me implies copying the content and transposing it as "data" for the sake of qualitative or quantitative research purposes)?

What is wrong or problematic about a researcher simply asking the blog owner if it is acceptable to them if their blog contents become the subject of a research project? Beyond the rules and regs of an ethics research board, what is wrong with simply asking upfront, rather than working surreptitiously, lurking or working from an alias?

Simply wondering about this ...

jcu

----- Original Message -----
From: "Charlie Balch" <charlie@balch.org>
To: <air-l@listserv.aoir.org>
Sent: Tuesday, August 14, 2007 12:33 PM
Subject: Re: [Air-L] The Spiders will find you (was wayback machine waspublic/private)
Your copyright doesn't mean that you can selectively tell people not to use things that you have published in a manner that is consistent with the doctrine of fair use. You don't get to pick. Sorry about that! --e
Copyright does not let you pick, but what if I include a restrictive license? Someone earlier suggested a "Researchers May Not Research Me" license, for example. How far may "Terms of Service" extend? Even if I do not have password protection, couldn't readers be exposed to a clickwrap license (ToS) on reading my blog?

- Suzy

=============================
NOTE: ANYONE WHO HAS READ ANY PART OF THE ABOVE EMAIL AGREES TO REFER TO THE AUTHOR AS "QUEEN SUZY" IN ALL SUBSEQUENT REFERENCES. FAILURE TO COMPLY MAY RESULT IN LEGAL ACTION.
General consensus among attorneys I know has long been that clickwrap licenses on *software* are questionable. By extension, I believe that such a thing on a blog post would be even more so. ToS/"don't research me" leads you quickly to the slippery slope down which such themes as "thoughtcrime" lie.... --e
Let's keep in mind that it is easy enough to make a blog with differing levels of access and thus private messages can stay private and public can be public. There is no reason to license anything really, you just have to properly configure your blog if you want private sections.

Jeremy Hunsinger
Information Ethics Fellow, Center for Information Policy Research, School of Information Studies, University of Wisconsin-Milwaukee (www.cipr.uwm.edu)

Words are things; and a small drop of ink, falling like dew upon a thought, produces that which makes thousands, perhaps millions, think. --Byron

On Aug 14, 2007, at 6:35 PM, elw@stderr.org wrote:
yes, but again, we're assuming the uber-blogger. Let's say my Mom starts a blog, must we expect her to master password settings and the like? Do only the technically-proficient benefit from protections, rather than the average (or below) publishers of web content?

-mz

On Aug 14, 2007, at 7:54 PM, Jeremy Hunsinger wrote:
yes, i expect your mom, like my mom, to be able to become comfortable with blogging over a period of time, to be able to master what skills they desire to have, and from there to make decisions. uber bloggers don't happen overnight, but then... all you really have to do is give your mom a copy of "Rule the Web" and that pretty much puts it as simply and easily as you can get... my assumption here is that if you want to write on the web... you can likely read. In the case that people using the web cannot read, we have a significant ethical problem centering around capacities, but that isn't the case for bloggers I'd posit.

jeremy hunsinger
Information Ethics Fellow, Center for Information Policy Research, School of Information Studies, University of Wisconsin-Milwaukee (www.cipr.uwm.edu)
wiki.tmttlt.com
www.tmttlt.com
() ascii ribbon campaign - against html mail
/\ - against microsoft attachments
http://www.stswiki.org/ sts wiki
http://cfp.learning-inquiry.info/ Learning Inquiry-the journal
http://transdisciplinarystudies.tmttlt.com/ Transdisciplinary Studies:the book series

On Aug 14, 2007, at 7:09 PM, Michael Zimmer wrote:
I don't know how technically competent your mom is, but I would assume that anyone who's capable of starting a blog would be capable of clicking a box to instigate a password setting. It's offered as part of the setup in pretty much all blogging packages, and it's not really a high-tech activity. Adding side bars, now, that's difficult!

M-H

On 15/8/07 10:09 AM, "Michael Zimmer" <michael.zimmer@nyu.edu> wrote:
I believe the research already shows that most non-uber-bloggers use one of the online services such as LiveJournal or Blogger as their training ground. Both sites, and every other similar one I can think of, makes private and public very clear in their set-up. I think it may be "what if'ing" to think a newbie blogger could set-up their own site, and install and format the software without having enough knowledge to at least ask the password protection question...even if they didn't immediately know how to set a password up. Lois Ann Scheidt Doctoral Student - School of Library and Information Science, Indiana University, Bloomington IN USA Adjunct Instructor - School of Informatics, IUPUI, Indianapolis IN USA and IUPUC, Columbus IN USA Webpage: http://www.loisscheidt.com Blog: http://www.professional-lurker.com Quoting Michael Zimmer <michael.zimmer@nyu.edu>:
yes, but again, we're assuming the uber-blogger. Let's say my Mom starts a blog, must we expect her to master password settings and the like? Do only the technically-proficient benefit from protections, rather than the average (or below) publishers of web content? -mz
On Aug 14, 2007, at 7:54 PM, Jeremy Hunsinger wrote:
Let's keep in mind that it is easy enough to make a blog with differing levels of access and thus private messages can stay private and public can be public. There is no reason to license anything really, you just have to properly configure your blog if you want private sections. On Aug 14, 2007, at 6:35 PM, elw@stderr.org wrote:
Copyright does not let you pick, but what if I include a restrictive license? Someone earlier suggested a "Researchers May Not Research Me" license, for example. How far may "Terms of Service" extend? Even if I do not have password protection, couldn't readers be exposed to a clickwrap license (ToS) on reading my blog?
General consensus among attorneys I know has long been that clickwrap licenses on *software* are questionable. By extension, I believe that such a thing on a blog post would be even more so.
ToS/"don't research me" leads you quickly to the slippery slope down which such themes as "thoughtcrime" lie....
--e _______________________________________________ The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
Jeremy Hunsinger Information Ethics Fellow, Center for Information Policy Research, School of Information Studies, University of Wisconsin-Milwaukee (www.cipr.uwm.edu)
Words are things; and a small drop of ink, falling like dew upon a thought, produces that which makes thousands, perhaps millions, think. --Byron
I love real-world analogies. I think analogies are vital in trying to understand any issue, and I stand by that here. So, toward that end, I propose that not knowing blogs are public is like not understanding the acoustics of the room in which you are speaking. If someone on the other end of a concourse in a heavily trafficked mall hears what you were saying to your friend because the sound bounced in an unexpected way (no sound-augmentation equipment was used), do you really have any right to ask the hearer to keep quiet about what you said? I think that knowledge determines behavior. Ignorance might lead to behavior which the agent might later regret, but it does not mean that the obligations on those around the agent are different. Or? Conor Michael Zimmer wrote:
yes, but again, we're assuming the uber-blogger. Let's say my Mom starts a blog, must we expect her to master password settings and the like? Do only the technically-proficient benefit from protections, rather than the average (or below) publishers of web content? -mz
Unfortunately, courts in the US and EU now routinely enforce shrinkwrap and clickwrap licenses on software, and are increasingly inclined to enforce "browsewrap" licenses on web sites. So a ToS clause like "by accessing this web page you agree not to publish any data obtained" is not at all far-fetched. Charles Ess, Gove Allen and I have a paper in the works addressing aspects of this. DLB On Aug 14 2007, elw@stderr.org wrote:
Copyright does not let you pick, but what if I include a restrictive license? Someone earlier suggested a "Researchers May Not Research Me" license, for example. How far may "Terms of Service" extend? Even if I do not have password protection, couldn't readers be exposed to a clickwrap license (ToS) on reading my blog?
General consensus among attorneys I know has long been that clickwrap licenses on *software* are questionable. By extension, I believe that such a thing on a blog post would be even more so.
ToS/"don't research me" leads you quickly to the slippery slope down which such themes as "thoughtcrime" lie....
--e
-- Dan L. Burk Oppenheimer, Wolff & Donnelly Professor University of Minnesota Law School 229 19th Avenue South Minneapolis, MN 55455 ********************************** voice: 612-626-8726 fax: 612-625-2011 bits: burkx006@umn.edu
Two questions ...
On literary blogs, blogs that contain creative writing, blogs that clearly display a copyright symbol or a request on the blog page that copyright should be respected, does a researcher have the right to "research" the contents of that blog (which to me implies copying the content, transposing it as "data" for the sake of qualitative or quantitative research purposes)?
Since all published work is covered under the same US legal statutes, I would say that you could still research the contents and publish under fair use exemptions. However, I bow to our legal colleagues' greater knowledge. What I wanted to comment on is something I feel has been repeated in a variety of forms in this discussion...the idea of copyright for creative work (i.e. fiction) being somehow more secure than for non-fiction works. I believe that from a creative standpoint it's all the same...published work.
What is wrong or problematic about a researcher simply asking the blog owner if it is acceptable to them if their blog contents become the subject of a research project? Beyond the rules and regs of an ethics research board, what is wrong with simply asking upfront, rather than working surreptitiously, lurking or working from an alias?
There is absolutely nothing wrong with asking if that's what you want to do as a researcher. I have no human subjects related problem with asking...my issue is more philosophical in that I don't want "in the wild" research to become a study of outliers, or only those that want to be studied. Also don't forget that "asking" can, though doesn't always, change the phenomenon...which is why I don't want to do experimental research. I guess my biggest question here is what would be the problem with working surreptitiously, lurking, or working from an alias? Is the issue copyright or the amount of risk? They really are very different issues that shouldn't be confounded. Lois
Lois, Your comments are brilliant. Good research causes change. Charlie Balch -----Original Message----- From: Lois Ann Scheidt Subject: Re: [Air-L] The Spiders will find you (was waybackmachine waspublic/private) Also don't forget that "asking" can, though doesn't always, change the phenomena...which is why I don't want to do experimental research. I guess my biggest question here is what would be the problem with working surreptitiously, lurking, or working from an alias, is the issue copyright or the amount of risk...because they really are very different issues that shouldn't be confounded. Lois
Charlie, as usual you are superb, and I like what Lois has said too. I know as a producer that sometimes what works best is the unrehearsed, and pure experiments sometimes miss this. The real world is quite similar to the simulated world, but it is the nuances that make all of the major differences in the world. -----Original Message----- From: air-l-bounces@listserv.aoir.org [mailto:air-l-bounces@listserv.aoir.org] On Behalf Of Charlie Balch Sent: Tuesday, August 14, 2007 5:42 PM To: air-l@listserv.aoir.org Subject: Re: [Air-L] The Spiders will find you (waswaybackmachine waspublic/private) Lois, Your comments are brilliant. Good research causes change. Charlie Balch -----Original Message----- From: Lois Ann Scheidt Subject: Re: [Air-L] The Spiders will find you (was waybackmachine waspublic/private) Also don't forget that "asking" can, though doesn't always, change the phenomena...which is why I don't want to do experimental research. I guess my biggest question here is what would be the problem with working surreptitiously, lurking, or working from an alias, is the issue copyright or the amount of risk...because they really are very different issues that shouldn't be confounded. Lois
JCU, Great questions, and I don't claim to be an expert on ethics or the law, so I respond with my opinions. I admit a bias towards the belief that researchers should be allowed to use whatever is publicly available. Notification to the author (blogger) that their posting is part of a study would be a courtesy but should not be a requirement. I appreciate that exposure can cause harm. I suppose that I'm a good case study. I've posted blog-like links to various personal thoughts and experiences on my homepage (http://charlie.balch.org). Just the other day my wife was not at all happy with me as she revisited content where I discussed experiences before we ever met. We are ten years happily married and I got flak for stuff that happened before we ever met. I have not and will not remove the content. I do need to update it though -- we have not lived in the islands for years. For those who visit my home page, my wife likes the mushy story but not the boat logs. My students often comment on the mushy story part. Charlie Balch -----Original Message----- From: air-l-bounces@listserv.aoir.org [mailto:air-l-bounces@listserv.aoir.org] On Behalf Of jcu Sent: Tuesday, August 14, 2007 1:50 PM To: air-l@listserv.aoir.org Subject: Re: [Air-L] The Spiders will find you (was wayback machinewaspublic/private) Two questions ... On literary blogs, blogs that contain creative writing, blogs that clearly display a copyright symbol or a request on the blog page that copyright should be respected, does a researcher have the right to "research" the contents of that blog (which to me implies copying the content, transposing it as "data" for the sake of qualitative or quantitative research purposes)? What is wrong or problematic about a researcher simply asking the blog owner if it is acceptable to them if their blog contents become the subject of a research project? Beyond the rules and regs of an ethics research board, what is wrong with simply asking upfront, rather than working surreptitiously, lurking or working from an alias? Simply wondering about this ... jcu
----- Original Message ----- From: "Charlie Balch" <charlie@balch.org> To: <air-l@listserv.aoir.org> Sent: Tuesday, August 14, 2007 12:33 PM Subject: Re: [Air-L] The Spiders will find you (was wayback machine waspublic/private)
Interesting point about what is accessible on the Internet. I'd not judge the number of possibilities by the use of IP addresses. It is common practice to have many websites attached to one IP address and many IP addresses are used to connect to the internet but do not provide web content. Even when web content is available at an address, a complete path is necessary to get to the content. I've often placed content that I'd prefer the world not see using a web address that has no referring links and would not easily be guessed.
Search engines follow links that they find on pages. The big engines don't follow random possible content locations. Yes, there are programs that would allow a researcher (cracker) to explore all link possibilities on a site. Such an attempt without permission would be unethical. On the other hand, if you've announced your content to the world, the world has a right to explore your content.
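[Editorial illustration] The point above, that crawlers discover pages by following links rather than by guessing addresses, can be sketched with a toy example. This is only a minimal sketch under stated assumptions: the page paths and the `TOY_WEB` link map are hypothetical, and real search-engine crawlers are far more elaborate (sitemaps, server logs, and referrer leaks can also expose a page).

```python
from collections import deque

# Hypothetical toy web: each page maps to the pages it links to.
TOY_WEB = {
    "/index.html": ["/about.html", "/blog/"],
    "/about.html": ["/index.html"],
    "/blog/": ["/blog/post-1.html"],
    "/blog/post-1.html": [],
    "/private/notes-x7q2.html": [],  # exists, but nothing links to it
}


def crawl(seed: str, web: dict[str, list[str]]) -> set[str]:
    """Breadth-first crawl: visit only pages reachable by links from the seed."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


found = crawl("/index.html", TOY_WEB)
unlisted = set(TOY_WEB) - found  # pages the link-following crawler never reaches
```

In this toy model the page with no referring links stays out of `found`; it becomes discoverable as soon as any crawled page links to it.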
I believe that we would all agree that information that a poster has made some effort to make private, through the use of a password or even simple obscurity, requires informed consent before a researcher should be allowed to use it. On the other hand, publicly presented information should be fair game. This does bring up an interesting question though. At what point can a researcher use hidden information? Historians routinely use the content of diaries and letters that the authors would probably prefer never become public.
The net is providing a fifth estate. Current USA laws are moving towards giving bloggers the same protections and responsibilities that are enjoyed by commercial reporters. Publicly posted content that is clearly intended to be read is fair game and should not require review any more than using a reference from a journal or popular magazine.
Charlie Balch
-----Original Message----- From: elw@stderr.org Sent: Tuesday, August 14, 2007 8:10 AM
A web crawler will find you, that's the point. There are a finite number of IP addresses, 4,294,967,296 (2^32); these are what get resolved from a URL.
Web crawlers don't typically have much luck crawling by IP address.
Name-based virtual hosting @ the level of the web server tends to make it less than adequate.
Best practice for virtualhosting is to make a hit directly to an IP address (rather than a name) return... nothing.
--e
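[Editorial illustration] The exchange above about the 2^32 IPv4 addresses and name-based virtual hosting can be made concrete with a small sketch. It is a simplified model under stated assumptions, not any real web server's configuration: the hostnames, the `VIRTUAL_HOSTS` table, and the `serve` function are hypothetical, and returning an empty body for unknown hosts simply mirrors the "return... nothing" practice described in the message.

```python
import ipaddress

# Size of the IPv4 address space: 32-bit addresses give 2**32 possibilities.
IPV4_ADDRESS_COUNT = ipaddress.ip_network("0.0.0.0/0").num_addresses

# Hypothetical server: one IP address, several sites selected by the Host header.
VIRTUAL_HOSTS = {
    "blog.example.org": "<h1>A blog</h1>",
    "shop.example.org": "<h1>A shop</h1>",
}


def serve(host_header: str) -> str:
    """Return the page for a named site, or an empty body for anything else."""
    return VIRTUAL_HOSTS.get(host_header.lower(), "")


by_name = serve("blog.example.org")  # request made with the site's name
by_ip = serve("192.0.2.10")          # request made straight to the IP address
```

Under these assumptions a crawler that walks the address space and requests each bare IP gets an empty response, even though two sites live at that address; it needs the names, which it normally learns from links.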
I think that a better analogy for an unlisted website "in the wild" would be a broadsheet posted to a tree deep in the woods, with no signs pointing to it, and it doesn't exist on any maps. Of course, once someone *does* find it by walking around at random (or by systematically trying to cover all ground in the forest), its location can easily be placed on any map (i.e. linked to). I'm going to further say that I believe the woods to be on national park land, so that anyone can freely traverse them. I understand this will complicate the analogy for some people on this list, but I had to state my opinion entirely. Lois Ann Scheidt wrote:
Now beyond that, I would argue that from a research perspective the blog was still open to study, if it can be found. I say that because the best terrestrial analogy I can think of to your thought experiment is someone who posted a broadsheet on a pole but with a cover over the broadsheet...lots of things could pull that cover off...the wind, rain, or a passing person who didn't like the color of the cover. In short, a person who would go to that much trouble to hide in plain sight...well it's still plain sight.
In response to what's written below, this is my favorite part of your lengthy post. This is a very compelling angle; it's showing that the scale is changing so quickly in this field from what social scientists have traditionally been exposed to! In an online ethnography I did, I kept the true names (true aliases?) of members I came across, because so often the names themselves were the subject of conversation and were absolutely crucial to the handling of identity among the group. I could not obfuscate the names and retain any semblance of validity in the conclusions I drew from observing this behavior. Conor
Some years ago, I wrote a classroom paper that discussed why I don't pseudonymize my subjects in my chatroom research. The simple reason is that there is no way I could come up with nicknames that would cover the participants and not potentially deflect on to another chatroom user. I'm good but I'm not THAT good.
So if I did pseudonymize I might protect my participants but in that one-in-a-billion case where someone was in harm's way from something I did...I might well have created the situation rather than hidden it. I'll let you "what if'ers" out there run that one through the filters.
Lois
Lois, thank you again for your thoughtful summary of why most of the research conducted in and with the Internet is far from being ethically questionable. In particular, the debate about Google's Street View imho highlighted the fact that there is really no need for questioning cases where individuals place information about *themselves* on the Internet and that information is then used in research. Google places information about *others* on the Internet and does so in an all-encompassing way. In light of this comparison it seems unacceptable that time is wasted by criticizing trained researchers for responsible work. Furthermore, by using the Internet for research we may produce knowledge that here and there is to the benefit of someone (think of Berners-Lee's research, for example ;-). By conducting research on the Internet instead of in offline places we often reduce harm. To use a blunt example by Michael Birnbaum: no case has been heard of, or seems logically plausible, of research participants being injured or killed in Internet research. Compare that to the millions of cases where research participants are invited to physical research laboratories and take their car to drive there... Internet scientists and Internet researchers change the way research is conducted, often for the better. Don't bug them with absurd "what if" scenarios, fight Google's Orwellian practices instead. Best --u At 10:03 Uhr -0400 14.8.2007, Lois Ann Scheidt wrote:
But to complicate your argument below (and Hughie's in a separate thread), a couple of thought experiments:
Say I place content on a "publicly accessible" webpage without creating any incoming links or notifying anyone. Web crawlers won't find it. A search engine won't index it. While on the open and public Internet, unless a random URL-generator happens to guess the precise address of the page, no one will ever read it. Is this content "fair game for researchers"?
I don't profess to be a "hardware" or a "search engine" expert, but I can tell you that the above thought experiment only works if you have complete control of the hardware and software involved in the webpage. Many hosts' webpages link to pages of folks hosted by their service. It's a term and condition of service...so it's not nearly as easy to be completely off the search engines as one may think.
Beyond that point, for those of us that do "in the wild" research...how would we find your thought experiment if it's not indexed? I guess it would be possible to have a crawler that would randomly generate URLs and check their validity. That might be useful for some research question, but in that case I would assume the unit of analysis would be the URL, at most, and more likely it would be a proof of concept test of some kind - so no human subject would be involved.
Now beyond that, I would argue that from a research perspective the blog was still open to study, if it can be found. I say that because the best terrestrial analogy I can think of to your thought experiment is someone who posted a broadsheet on a pole but with a cover over the broadsheet...lots of things could pull that cover off...the wind, rain, or a passing person who didn't like the color of the cover. In short, a person who would go to that much trouble to hide in plain sight...well it's still plain sight.
I think that's part of why this "what if" game gets bogus: a person who would make an effort to do the above isn't all that likely to be "real" and I study real people. "What if's" are valuable in these discussions but again we have to remember that we can dream up far more elaborate scenarios than we are ever likely to see in the wild.
Consider a different scenario: Walking down the street with a friend, I comment on the fact my rent check bounced, and she offers condolences. We're in public. We make the utterances loud enough that someone next to us could easily hear them, but not loud enough that someone across the street can. Now, this is a public utterance - we didn't have the conversation behind locked (password-protected) doors. Can this be archived for research? Better yet, can we set up microphones to automatically record every conversation uttered in public? Does the fact these conversations *can* be recorded and archived mean they, by default, *should be*? I feel we're saying the same about utterances on the Web - so what's the fundamental difference from a user perspective between the street and the Web? Both are a place where we engage in conversations and maintain relationships....
I fear we're confusing what the Internet *is* and *is capable of* with *how it is used*. Yes, comments left on a personal blog are open for anyone to see. Yes, discussion board conversations can be archived. But that's not, I suspect, going through the mind of many casual users of this technology. It is a medium for communication, for connecting with people. I leave messages on my neighborhood parents discussion board because that's how I connect with that community. I use my real name because I'm among friends. Does that mean my comments are de facto fair game for any researcher who wants to scrape the database?
Actually I think the "confusion" is at a different level. Your second thought experiment gets at it fairly well. When I am designing research I can only know the "potential" harms, not the actual ones...potentials are before the fact, actuals are retrospective.
Let's change your experiment a bit. You are home and in the shower getting ready to meet your friend. You think about the stuff you are going to tell them and you decide to mention your bounced check. And maybe it crosses your mind that you need to be careful where you say that so fewer people overhear.
When you decide to tell your friend you don't know who is really around...or what someone who overhears might do with the utterance. You only know that you are going to do your best to make sure that you are careful.
Same with research design and human subjects work.
We have to learn from actual happenings, but if we sit and do statistically improbable "what-if's" (oh my, a qualitative researcher just used the s-word) then no research will happen at all.
Personally I believe the following -
- Research is never risk-free. Neither is life. If you are working toward absolute zero risk in your research you will either give up the pursuit, or build a protocol for your research that lies to you. Yes there are ways to protect YOUR subjects but most of them can "endanger" others in the process (more on this below). Merry Christmas, that is not zero risk, it's just minimal risk for the research and probably a heck of a lot of work on your part that didn't buy anyone more "safety."
All of the aforementioned "harm" and "protective" terms were used only as part of the discussion, because what is the real chance of harm from most of our research, one-in-a-million, one-in-a-billion, one-in-a-trillion? Medical experiments don't get to absolute zero risk...they just shift the potential risk - which may be minimal as well - from the researcher to the patient...unless something unexpected happens and then the risk is back on the researcher. And how do they "know" the potential risks? From computer simulations - based on facts known from old (and often questionable) human experiments, and/or from animal trials. So their "potential" risk knowledge comes from analyzing actual risk after a trial, not from "what if's."
- I design research...I predict the potential harm from my "intervention" and I work to minimize the risk ("minimal risk" is the operative term). I do not mind read or assume my position as researcher places me in a superior position and that my subjects need special protections from me.
Sorry folks but I truly find that last concept to be pretty funny. I am not injecting people with foreign substances, I look at artifacts they produced and placed online. Personally, I have never had an artifact's author/artist come back to me after I published and complain. I have had a single author question my use of his work without permission - I work with teens remember - and so we talked and by the end of the discussion he understood and told me he was pleased his work was in the paper; he just thought he HAD to give permission first. Plus he found out by tracking his visitor logs and finding I had visited repeatedly, so it was not because of an unpleasant experience on his part.
Now before I sound like a totally heartless researcher who only sees her subjects as things...when nothing could be further from the truth...if even one of the kids I work with were hurt by my research I would have a crisis of conscience. But again I am not God; I know that if I have done the best job I could do working through my research design and trying to predict all the potential harms likely to happen with my intervention then I have done everything I can do.
Some years ago, I wrote a classroom paper that discussed why I don't pseudonymize my subjects in my chatroom research. The simple reason is that there is no way I could come up with nicknames that would cover the participants and not potentially deflect on to another chatroom user. I'm good but I'm not THAT good.
So if I did pseudonymize I might protect my participants but in that one-in-a-billion case where someone was in harm's way from something I did...I might well have created the situation rather than hidden it. I'll let you "what if'ers" out there run that one through the filters.
Lois
participants (13)
- burkx006@umn.edu
- Charlie Balch
- Conor Schaefer
- elw@stderr.org
- Heidelberg, Chris
- jcu
- Jeremy Hunsinger
- Lois Ann Scheidt
- Martin Garthwaite
- mhward
- Michael Zimmer
- Susan Chang
- Ulf-Dietrich Reips