a question about privacy protection and copyright in Internet research
Hi all,

Just a short comment about robots.txt. Whether web archives should respect robots.txt is not as unequivocal as Jeremy puts it below; there are national differences. The national Danish web archive Netarkivet.dk is based on the "Act no. 1439 of December 22, 2004 on Legal Deposit of Published Material", which states that the archive is not obliged to respect robots.txt. The rationale is that once it's out there, it's out there, and as long as it's accessible to everyone it's 'public'; thus it's part of the cultural heritage, and thus it should be archived. This also applies to password-protected material IF anyone could get a password, no matter whether s/he had to pay for it. And if you don't want it to be public, then don't put it out there: robots.txt does not prevent people from seeing it, so it's public and will be archived. In the Danish legislation, ignoring robots.txt is considered necessary to collect all relevant material.

However, access to the web archive is very restricted: only scholars have access, and you have to apply for it.

If interested, the Danish web archive has made a short fact sheet: http://netarkivet.dk/publikationer/Fact%20sheet%20Webarchiving%20in%20Denmar...
And a FAQ: http://netarkivet.dk/faq/index-en.php - and about robots.txt: http://netarkivet.dk/faq/index-en.php#faq_robots

Best,
Niels Brügger

-----------
Message: 9
Date: Sat, 7 May 2011 11:18:04 -0400
From: jeremy hunsinger <jhuns@vt.edu>
To: aoir list <air-l@aoir.org>
Subject: Re: [Air-L] a question about privacy protection and copyright in Internet research
Message-ID: <B7115BE4-D7AA-4B72-8C70-CF029ADF6AEE@vt.edu>
Content-Type: text/plain; charset=us-ascii

I think the tendency is to muddy the waters immensely here, but I also don't think we need to muddy the waters in regards to the Document vs Research Subject distinction.
If you were allowed to research facebook, then you could do it either way or both ways: having subjects and documents, having just documents, or having just subjects. But once we are dealing with documents, then the only question we have is whether those are published documents or not. I think that muddying the waters and continuing to say it is not simple, as we are inclined to do as academics, is going to continue to cause us grief and possibly prevent perfectly reasonable research, and thus i think we should embark on the other strategy that says: 'We can make this simple'.

If it is published, it is public and open to research; you determine if it is published using these guidelines. If it is not published, then what is it? Is it a private diary? Is it a private letter? Who has rights to the material, and how can it be released for research?

If you are dealing with research subjects, in what way are you doing that? If you are just reading their postings, you are not interacting with them and not creating research subjects. If you are doing an ethnography or participatory or action research, then yes, you are interacting with them and you are creating human subjects. In short, a matrix of methods in relation to their objects would greatly clarify the document vs subject distinction.

In terms of public/private on the web, my position is more or less that if you put it on the web and you do not protect it in some manner via legal device, technical system, or otherwise, then you are producing a public document and that's the end of it. I would argue that it does not matter if that was not your intent or that you wanted it to be private; what matters is that you committed something to the public record, and while you can withdraw it, once it is distributed you might find that very difficult, and withdrawing likely doesn't change its public status, it just changes the ease of access to that data.

Robots.txt should not be ignored.
It is one of the technical means that people use to secure their property on the web. If there is a robots.txt and it prevents you from making a copy of something, then i'd guess that the owners of the material do not want you to use it for research, and you'd need to get permission.

------------------------------------------------------------
LATEST PUBLICATIONS AND PAPERS

February 2011
"Web Archiving — Between Past, Present, and Future", The Handbook of Internet Studies (eds. M. Consalvo, & C. Ess), Wiley-Blackwell 2011, pp. 24-42
Read more on the publisher's website: http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1405185880.html

April 2010
Web History (ed.), Peter Lang Publishing, New York 2010
Read more on the publisher's website: http://www.peterlang.com/index.cfm?vID=310468&vLang=E&vHR=1&vUR=2&vUUR=1

January 2010
Website Analysis, Papers from The Centre for Internet Research, 12, The Centre for Internet Research, Aarhus 2010
[ get an electronic copy at: http://cfi.au.dk/en/publications/paper/#a12 ]

NIELS BRÜGGER, Associate Professor, PhD
Director, the Centre for Internet Research
Department of Information and Media Studies
Aarhus University
Helsingforsgade 14
8200 Aarhus N
Denmark
Phone (switchboard) +45 8942 1111
Phone (direct) +45 8942 9226
Telefax +45 8942 5950
E-mail nb@imv.au.dk
Webpage http://imv.au.dk/~nb
Profile at LinkedIn: http://www.linkedin.com/pub/1/50a/555
Profile at Kommunikationsforum [in Danish]: www.kommunikationsforum.dk/Niels-Brugger
The Centre for Internet Research http://cfi.au.dk
The history of dr.dk, 1996-2006 http://drdk.dk
LARM (Radio Culture and Auditory Resources Research Infrastructure) http://www.larm-archive.org
i'd tend to think that nation-states have different standards and practices than researchers:) there are also pages that say do not archive and do not research.

for instance, the other day someone showed me a link to a dating site where several posts have this text:

***WARNING: Any institutions or individuals using this site or any of its associated sites for studies or projects - You DO NOT have permission to use any of my profile or pictures in any form or forum both current and future. If you have or do, it will be considered a violation of my privacy and will be subject to legal ramifications. It is recommended that other members post a similar notice to this or you may copy and paste this one.***

.... now that's in their profiles in open text boxes. however, it is also available without issue on the open web, nor is it clear in any form that they own the data they think is private, though they clearly think that they do. this looks like it spread meme-like across the site, but i have no idea how many users have this statement. but it is clear to me that some of the users, like we had in responses in various online fora after ir 1.0 which i presented on at ir 2.0, do not think they should be researched.
Hi Jeremy,

Your comment below indicates to me that it may be relevant to distinguish between two discussions: one related to copyright, and one related to data protection. Again, to use the Danish web archive as an example: access to the archive is based on two different Acts, the copyright law and the data protection act. Each of these laws regulates its own area: who owns the data in the archive, and who can get access to the data in the archive? And these two issues are not necessarily handled in the same way: you may not own your data, but still want to protect them. In fact, this is the main reason why access to the Danish web archive is very restricted: the data protection authorities consider the archive a data handler, and since the archive cannot guarantee that protected data cannot be found in the archive, only researchers have access, not the general public.

But the point is that access to data can be a question of ownership as well as of privacy, and the two do not necessarily coincide.

Best,
Niels Brügger

On 08/05/2011 at 13:32, jeremy hunsinger wrote:
i'd tend to think that nation-states have different standards and practices than researchers:) there are also pages that say do not archive and do not research.
for instance, the other day someone showed me a link to a dating site where several posts have this text:
***WARNING: Any institutions or individuals using this site or any of its associated sites for studies or projects - You DO NOT have permission to use any of my profile or pictures in any form or forum both current and future. If you have or do, it will be considered a violation of my privacy and will be subject to legal ramifications. It is recommended that other members post a similar notice to this or you may copy and paste this one.***
.... now that's in their profiles in open text boxes. however, it is also available without issue on the open web, nor is it clear in any form that they own the data they think is private, though they clearly think that they do. this looks like it spread meme-like across the site, but i have no idea how many users have this link statement. but it is clear to me that some of the users, like we had in responses in various online fora after ir 1.0 which i presented on at ir 2.0, do not think they should be researched.
I tend to think that if you do not own the data, then your rights in regards to the data, including rights to privacy of that data, are curtailed and likely non-existent. The question of course then becomes who owns the data and who has copyright to the data, because those are different things.
I tend to think that if you do not own the data, then your rights in regards to the data, including rights to privacy of that data, are curtailed and likely non-existent.
Er -- this is clearly not true in the EU member states and other countries with strong data protection regimes. For that matter, it's not even true under certain U.S. laws (common law privacy, Fair Credit Reporting, PPA, COPPA, HIPAA, etc.). So you might need to revise that thinking a little bit.

DLB
The question of course then becomes who owns the data and who has copyright to the data, because those are different things.
To the extent that the data constitutes factual indicia, there will be no copyright, because facts are not protectable under copyright at all. Some types of social science "data" (e.g., ethnographies) might be the subject of copyright. But I agree they are two different things.

--
Dan L. Burk
Chancellor's Professor of Law
University of California, Irvine
4500 Berkeley Place
Irvine, CA 92697-8000
Voice: (949) 824-9325
Fax: (949) 824-7336
participants (4)
- Dan L. Burk
- jeremy hunsinger
- Jeremy hunsinger
- Niels Brügger