Library of Congress Acquires Entire (Public) Twitter Archive
For those who haven't already seen the news, the Library of Congress announced today that they are acquiring the entire archive of public Twitter activity since March 2006:

Library of Congress Announcement: How Tweet It Is!: Library Acquires Entire Twitter Archive
http://www.loc.gov/tweet/how-tweet-it-is.html

Twitter Announcement: Tweet Preservation
http://blog.twitter.com/2010/04/tweet-preservation.html

And my initial probe of various open concerns: Open Questions about Library of Congress Archiving Twitter Streams
http://michaelzimmer.org/2010/04/14/open-questions-about-library-of-congress...

--
Michael Zimmer, PhD
Assistant Professor, School of Information Studies
Associate, Center for Information Policy Research
University of Wisconsin-Milwaukee
e: zimmerm@uwm.edu
w: www.michaelzimmer.org
The key point from Twitter's blog post is: "Only after a six-month delay can the Tweets will [sic] be used for internal library use, *for non-commercial research*, public display by the library itself, and preservation."

---
Alexander Leavitt
Research Specialist, Convergence Culture Consortium
Comparative Media Studies, MIT
http://doalchemy.org
Twitter: @alexleavitt
_______________________________________________ The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
Donning my privacy hat for a minute ....

I'm not exactly enthused by this idea, especially w/o Twitter/LOC giving folks the option to purge their tweets or clean up their streams that will appear under (in many cases) their real names for all posterity. Granted, that would undermine the purpose of collecting the entire "Twitterspace" for the LOC's stated purposes, but how many non-geekish Twitterati will be upset/incensed/enraged that their stuff is now a matter of permanent public record at the world's largest library? (Obviously, the whole "you-were-warned-and-should-have-known-what-you-were-getting-into-and/or-doing" rationale can apply in many cases.)

Yes, there's "social networking" as a general concept, and I'll endorse it completely. But I'll also suggest there are derivatives of that concept called "public" social networking (i.e., anyone can read/post, the owner doesn't care) and "private" social networking (i.e., the owner uses and restricts their activities on a given SNS within their private social circles). Certainly, Twitter can facilitate online interaction, but bravo to those savvy enough to make their Twitter streams private/friends-only at the time they set up their accounts.

Would the LOC do the same thing with IRC chat logs? IRC is pretty much Twitter-esque. Or is that too daunting a task because *anyone* can throw up and run an IRC server, but Twitter is a centralised repository and thus an easy thing to try and archive?

Doffing privacy hat now.

-rf
I am reminded of Google's acquisition of USENET archives and reposting of them as "google groups," indexed and searchable. It was a little shocking to find that posts you had assumed were going to a group at one time in a non-archived forum were now searchable, often by your email address.

On the other hand, my gut reaction to the Library of Congress archiving public Twitter posts is positive. The public twitter stream is of historical cultural significance and is an amazing repository of mundane moments in the daily lives of many people and records of what they thought important. It was initially posted in an architecture that was searchable and that displayed all public tweets in an ongoing stream. I think it's great that the Twitter stream will be preserved and curated by an institution which is not going to go out of business, or get bought and reinvented, or just reinvent itself and make it all go away.

I recognize the privacy issues at stake, and think it's important to discuss them, but I'm fine with this.

Nancy
It's a massive amount of information to store. I was really surprised because I had a discussion with one of the first users of Twitter, a friend of the founders, who made a request to see some of his personal archive of Tweets which go back to March 2006. The staff tried to oblige him but eventually told him that these Tweets were not accessible. You can locate a specific Tweet if you have the exact URL but you can't browse years back through an archive, like looking at messages in email folders. It makes me wonder if the Twitter staff can't organize all of the Tweets in storage (after all, the Search feature only goes back 6 days) in a way to make them accessible to users, how is the LOC going to organize and manage all of this data? I imagine that they are either going to concentrate on the accounts of notable people, news services, organizations, etc. or just have an undifferentiated mass of messages. Perhaps the arrangement will be that you can look at all Tweets from certain dates, say, you can pull messages from September 22, 2007. Liz Pullen nwjerseyliz@yahoo.com
Consider this scenario:

1- You decide to protect your privacy/visibility and keep your tweet stream private.
2- I send a request to follow you. You accept. I now receive your private tweet stream.
3- I retweet one of your private tweets. My stream is public. Your tweet is now public.
4- The Library of Congress archives my public stream, including your private tweet that I had retweeted.

So, (a) it is ethically questionable whether users with public streams should be retweeting private tweets without express consent, and (b) people's attempts to restrict access to their tweets can be easily thwarted, and ultimately those private tweets can end up in public archives with the rest of the masses.

-mz
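The four-step scenario in Michael's message can be traced with a toy model. This is only a sketch: the `Account` class, `manual_retweet`, and `archive_public` are hypothetical names invented for illustration, not Twitter's or the Library's actual data model or API, and the "archive" here simply collects every stream whose owner is not protected.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical stand-in for an account; not Twitter's data model."""
    name: str
    protected: bool = False
    followers: set = field(default_factory=set)  # names of approved followers
    stream: list = field(default_factory=list)   # texts posted by this account

    def post(self, text):
        self.stream.append(text)

    def visible_to(self, reader):
        """A protected stream is readable only by approved followers."""
        if not self.protected or reader.name in self.followers:
            return list(self.stream)
        return []


def manual_retweet(retweeter, author, text):
    """Copy-and-paste style retweet: the text is re-posted under the retweeter's account."""
    if text not in author.visible_to(retweeter):
        raise PermissionError("retweeter cannot see this tweet")
    retweeter.post(f"RT @{author.name}: {text}")


def archive_public(accounts):
    """Assumed archiving rule: keep only streams of unprotected accounts."""
    return [t for a in accounts if not a.protected for t in a.stream]


you = Account("you", protected=True)         # step 1: protected stream
me = Account("me")                           # my stream is public
you.followers.add("me")                      # step 2: follow request accepted
you.post("private thought")
manual_retweet(me, you, "private thought")   # step 3: retweet into a public stream
archive = archive_public([you, me])          # step 4: archive public streams
print(archive)
```

In this model the protected account's own stream never enters the archive, yet its text does, carried by the public retweet. That is point (b) of the message in miniature.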
At least for users of Twitter.com--and today, at the Twitter conference Chirp, founder Ev Williams said that 57% of the 105 million users just use the website--you can't use the automatic ReTweet feature to resend a protected account message. A private account's message can always be cut & pasted & I'm not sure about how third party clients handle RTs of protected updates. But Twitter has put in some safeguards so it's not easy. You can't ReTweet a protected account message by accident. Liz Pullen nwjerseyliz@yahoo.com
Thanks, Liz.

So, for the 43% of users who don't use the website, 3rd party apps like TweetDeck make it quite easy to RT protected tweets. See, for example, http://support.tweetdeck.com/forums/63876/entries/83047, where it notes that the "Edit then Retweet" method places no restrictions on the ability to retweet.

Further, % of users isn't the right way to think about this (a small % of users might account for a large proportion of retweets). The key is where the majority of actual retweets take place: on Twitter.com or a 3rd party app. I suspect the latter.

-michael.
Williams said that although 57% of users just use Twitter.com, they account for 25% of the Tweets. Users who use third party clients were said to be "more engaged" and send a higher number of Tweets than the typical user of the website.

It would be so much simpler if Twitter regularly issued statistics regarding usage (like Facebook updates its publicly available stats) & I don't know why they play it so close to the vest.

Liz Pullen
nwjerseyliz@yahoo.com
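Taking the two Chirp figures at face value, the implied per-user tweeting rates can be worked out directly. The short calculation below is a rough sketch and rests on one assumption not stated in the thread: that the remaining 43% of users (those who use third party clients at all) produce the remaining 75% of Tweets. It covers all Tweets, not retweets specifically, so it only hints at the answer to Michael's question.

```python
# Figures as reported from the Chirp keynote in this thread.
web_only_user_share = 0.57   # share of users who only use Twitter.com
web_only_tweet_share = 0.25  # share of Tweets those users produce

# Assumption: every other user and every other Tweet belongs to
# people who use third-party clients.
client_user_share = 1 - web_only_user_share
client_tweet_share = 1 - web_only_tweet_share

# Tweets per user, relative to an overall average of 1.0.
web_rate = web_only_tweet_share / web_only_user_share
client_rate = client_tweet_share / client_user_share

print(f"web-only users:    {web_rate:.2f}x the average")
print(f"client users:      {client_rate:.2f}x the average")
print(f"client / web-only: {client_rate / web_rate:.1f}x")
```

Under that assumption a client user posts roughly four times as many Tweets as a web-only user, which fits the "more engaged" description.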
Michael,

To be fair to TweetDeck (and I presume many other twitter clients) while you can retweet tweets from private accounts, TweetDeck does issue a warning before you do so. I'm not saying everyone would take heed - or even necessarily read the warning - but there is some mechanism in place to warn folks if they are moving something ostensibly private into the public twittersphere. That doesn't detract from your point, of course, but it is worth noting that other clients do take steps to try and prevent completely accidental movement of tweets into the public (and thus the archive in question).

-Tama
-- Dr Tama Leaver Lecturer in Internet Studies Faculty of Humanities, Curtin University of Technology GPO Box U1987 Perth WA Australia 6845 Phone: (+61 8) 9266 1258 Fax: (+61 8) 9266 3166 Email: t.leaver@curtin.edu.au Web: www.tamaleaver.net
CRICOS Provider Code: 00301J (WA) 02637B (NSW)
Library of Congress to save Tweet http://www.nytimes.com/2010/04/15/technology/15twitter.html Some online commentators raised the question of whether the library’s Twitter archive could threaten the privacy of users. Mr. Raymond said that the archive would be available only for scholarly and research purposes. Besides, he added, the vast majority of Twitter messages that would be archived are publicly published on the Web.
Thanks for the discussion, Liz.

This is the classic "but the information is already public" argument that, while technically true, presumes a false dichotomy that information is either strictly public or private, ignoring any contextual norms that might have guided the initial release of information or how a person expects that information to flow. This is Nissenbaum's theory of contextual integrity, which Fred Stutzman has already invoked related to this case: http://fstutzman.com/2010/04/14/twitter-and-the-library-of-congress/

Further, it is interesting that the LOC seems to acknowledge that there are non-public tweets within the archive: "...the vast majority...are publicly published on the Web"

-michael.
I don't see how this could threaten the privacy of users any more so than the Internet Archive (archive.org funded by Library of Congress) is threatening when archiving a blog.
yes i just tweeted: "Fear of LOC getting Tweets Archive == Fear, a while back, of LOC getting all of Archive.org's web page crawling." a few seconds ago. deja vu all over again

On Thu, Apr 15, 2010 at 10:39 AM, live <human.factor.one@gmail.com> wrote:
I don't see how this could threaten the privacy of users any more so than the Internet Archive (archive.org funded by Library of Congress) is threatening when archiving a blog.
Thinking about this less from a Web-privacy perspective and more from the perspective of LoC-as-actor, this is their mission:

The Library's mission is to make its resources available and useful to the Congress and the American people and to sustain and preserve a universal collection of knowledge and creativity for future generations.

From that perspective, maintaining an archive of Twitter is pretty mundane.

Putting back on my own privacy hat, I think Michael has raised some very legitimate concerns, and what I hope (putting on LIS hat, here) this can lead to is more forethought, not just from users but especially from designers, about what their data and interactions will look like not just in the moment but 6 mos., 3, 10 years down the road. The fact that Twitter might not be able to do this themselves just underlines this point.

jkd

On 4/15/10 10:42 AM, paul jones wrote:
yes i just tweeted: "Fear of LOC getting Tweets Archive == Fear, a while back, of LOC getting all of Archive.org's web page crawling." a few seconds ago. deja vu all over again
It was interesting to virtually attend yesterday's Twitter Chirp conference (videos archived at http://www.justin.tv/twitterchirp/all#r=ljm4GHc~) and hear how the company owners see Twitter...as a flow of information to be mined and not as a social network with all of the connections that have been created between users. Despite having 105 million accounts created by users, they didn't talk about people as much as information flow. The two pillars in their business plan are Promoted Tweets (ads appearing in Search) and Premium Corporate accounts (presumably with additional features & analytics). They do not view Twitter as interrelated circles of followers & following but as massive amounts of user-generated content that can be analyzed for insight into consumer behavior and opinion. While it's expected that the company would try to make money, they took the "social" out of the Twitter experience and spoke as if using Twitter was like reading a newspaper or watching television with users responding to advertisers & companies with their thoughts & concerns. The value of conversations between users isn't the fact that people across the world are connecting but that information is generated & opinions expressed that might be of interest to third parties. Hearing their "vision" of the platform won't make me discontinue using Twitter but it does show a disconnect between how the company sees this communication tool vs. what users are doing with it (organizing face to face meetups, giving to charity, continuing blog conversations, maintaining social ties with distant friends, connecting with strangers, etc.). Liz Pullen nwjerseyliz@yahoo.com
Yes, that is a recent feature TweetDeck added.

On Apr 15, 2010, at 3:28 AM, Tama Leaver wrote:
Michael,
To be fair to TweetDeck (and I presume many other twitter clients) while you can retweet tweets from private accounts, TweetDeck does issue a warning before you do so. I'm not saying everyone would take heed - or even necessarily read the warning - but there is some mechanism in place to warn folks if they are moving something ostensibly private into the public twittersphere. That doesn't detract from your point, of course, but it is worth noting that other clients do take steps to try and prevent completely accidental movement of tweets into the public (and thus the archive in question).
-Tama
participants (9)
- Alex Leavitt
- Jacob Kramer-Duffield
- live
- Liz
- Michael Zimmer
- Nancy Baym
- paul jones
- Richard Forno
- Tama Leaver