New subject: on the Wayback Machine (was public/private [part 1 of 2])

14 Aug 2007

      ...
> But to complicate your argument below (and Hughie's in a separate
> thread), a couple of thought experiments:
> Say I place content on a "publicly accessible" webpage without
> creating any incoming links or notifying anyone. Web crawlers won't
> find it. A search engine won't index it. While on the open and public
> Internet, unless a random URL-generator happens to guess the precise
> address of the page, no one will ever read it. Is this content "fair
> game for researchers"?

I don't profess to be a "hardware" or a "search engine" expert, but I 
can tell you that the above thought experiment only works if you have 
complete control of the hardware and software involved in the webpage.  
Many hosts' webpages link to pages of folks hosted by their service.  
It's a term and condition of service...so it's not nearly as easy to be 
completely off the search engines as one may think.

Beyond that point, for those of us that do "in the wild" research...how 
would we find your thought experiment if it's not indexed?  I guess it 
would be possible to have a crawler that would randomly generate URLs 
and check their validity.  That might be useful for some research 
question, but in that case I would assume the unit of analysis would be 
the URL, at most, and more likely it would be a proof of concept test 
of some kind - so no human subject would be involved.
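
A crawler of that kind could be sketched roughly as follows. This is a
minimal illustration in Python, not anything taken from the thread: the
names random_urls, probe, and search_space, the host example.org, and the
36-character alphabet are all assumptions, and the existence check is
passed in as a function so that no particular network call is implied.

```python
import random
import string
from typing import Callable, Iterable, Iterator, List

# Hypothetical alphabet for path segments: lowercase letters plus digits.
ALPHABET = string.ascii_lowercase + string.digits


def random_urls(host: str, length: int, count: int,
                rng: random.Random) -> Iterator[str]:
    """Yield `count` candidate URLs on `host` with random paths."""
    for _ in range(count):
        path = "".join(rng.choice(ALPHABET) for _ in range(length))
        yield f"http://{host}/{path}"


def probe(urls: Iterable[str], exists: Callable[[str], bool]) -> List[str]:
    """Return the candidates for which the supplied check reports a page."""
    return [url for url in urls if exists(url)]


def search_space(length: int) -> int:
    """Number of distinct paths of the given length over ALPHABET."""
    return len(ALPHABET) ** length
```

Even under these toy assumptions, search_space(8) is 36 to the 8th power,
so random guessing would rarely land on any one unlinked page. The unit
of analysis in such a run is the URL, as noted above.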

Now beyond that, I would argue that from a research perspective the 
blog was still open to study, if it can be found.  I say that because 
the best terrestrial analogy I can think of to your thought experiment 
is someone who posted a broadsheet on a pole but with a cover over the 
broadsheet...lots of things could pull that cover off...the wind, rain, 
or a passing person who didn't like the color of the cover.  In short, 
a person who would go to that much trouble to hide in plain sight...well 
it's still plain sight.

I think that's part of why this "what if" game gets bogus: a person who 
would make an effort to do the above isn't all that likely to be 
"real," and I study real people.  "What ifs" are valuable in these 
discussions, but again we have to remember that we can dream up far more 
elaborate scenarios than we are ever likely to see in the wild.
...
> Consider a different scenario: Walking down the street with a friend,
> I comment on the fact my rent check bounced, and she offers
> condolences. We're in public. We make the utterances loud enough that
> someone next to us could easily hear it, but not loud enough that
> someone across the street can. Now, this is a public utterance - we
> didn't have the conversation behind locked (password-protected)
> doors. Can this be archived for research? Better yet, can we set up
> microphones to automatically record every conversation uttered in
> public? Does the fact these conversations *can* be recorded and
> archived mean they, by default, *should be*? I feel we're saying the
> same about utterances on the Web - so what's the fundamental
> difference from a user perspective between the street and the Web?
> Both are a place where we engage in conversations and maintain
> relationships....
>
> I fear we're confusing what the Internet *is* and *is capable of*
> with *how it is used*. Yes, comments left on a personal blog are open
> for anyone to see. Yes, discussion board conversations can be
> archived. But that's not, I suspect, going through the mind of many
> casual users of this technology. It is a medium for communication,
> for connecting with people. I leave messages on my neighborhood
> parents discussion board because that's how I connect with that
> community. I use my real name because I'm among friends. Does that
> mean my comments are de facto fair game for any researcher who wants
> to scrape the database?

Actually I think the "confusion" is at a different level.  Your second 
thought experiment gets at it fairly well.  When I am designing 
research I can only know the "potential" harms, not the actual 
ones...potentials are before the fact, actuals are retrospective.

Let's change your experiment a bit.  You are home and in the shower 
getting ready to meet your friend.  You think about the stuff you are 
going to tell them and you decide to mention your bounced check.  And 
maybe it crosses your mind that you need to be careful where you say 
that so fewer people overhear.

When you decide to tell your friend you don't know who is really 
around...or what someone who overhears might do with the utterance.  
You only know that you are going to do your best to make sure that you 
are careful.

Same with research design and human subjects work.

We have to learn from actual happenings, but if we sit and do 
statistically improbable (oh my, a qualitative researcher just used the 
s-word) "what-ifs," then no research will happen at all.

Personally I believe the following -

- Research is never risk-free.  Neither is life.  If you are working 
toward absolute zero risk in your research you will either give up the 
pursuit, or build a protocol for your research that lies to you.  Yes, 
there are ways to protect YOUR subjects, but most of them can "endanger" 
others in the process (more on this below).  Merry Christmas, that is 
not zero risk; it's just minimal risk for the research, and probably a 
heck of a lot of work on your part that didn't buy anyone more "safety."

All of the aforementioned "harm" and "protective" terms were used only 
as part of the discussion, because what is the real chance of harm from 
most of our research, one-in-a-million, one-in-a-billion, 
one-in-a-trillion?  Medical experiments don't get to absolute zero 
risk...they just shift the potential risk - which may be minimal as 
well - from the researcher to the patient...unless something unexpected 
happens and then the risk is back on the researcher.  And how do they 
"know" the potential risks?  From computer simulations - based on facts 
known from old (and often questionable) human experiments, and/or from 
animal trials.  So their "potential" risk knowledge comes from 
analyzing actual risk after a trial, not from "what if's."

- I design research...I predict the potential harm from my 
"intervention" and I work to minimize the risk ("minimal risk" is the 
operative term).  I do not mind-read, nor do I assume that my position 
as researcher places me in a superior position such that my subjects 
need special protections from me.

Sorry folks but I truly find that last concept to be pretty funny.  I 
am not injecting people with foreign substances, I look at artifacts 
they produced and placed online.  Personally, I have never had an 
artifact's author/artist come back to me after I published and 
complain.  I have had a single author question my use of his work 
without permission - I work with teens, remember - and so we talked, and 
by the end of the discussion he understood and told me he was pleased 
his work was in the paper; he just thought he HAD to give permission 
first.  Plus he found out by tracking his visitor logs and finding I 
had visited repeatedly, so it was not because of an unpleasant 
experience on his part.

Now before I sound like a totally heartless researcher who only sees 
her subjects as things...when nothing could be further from the 
truth...if even one of the kids I work with were hurt by my research I 
would have a crisis of conscience.  But again I am not God; I know that 
if I have done the best job I can do working through my research 
design and trying to predict all the potential harms likely to happen 
with my intervention, then I have done everything I can do.

Some years ago, I wrote a classroom paper that discussed why I don't 
pseudonymize my subjects in my chatroom research.  The simple reason is 
that there is no way I could come up with nicknames that would cover 
the participants and not potentially deflect onto another chatroom 
user.  I'm good but I'm not THAT good.

So if I did pseudonymize I might protect my participants but in that 
one-in-a-billion case where someone was in harm's way from something I 
did...I might well have created the situation rather than hidden it.  
I'll let you "what if'ers" out there run that one through the filters.

Lois
