Subject: Re: [External] Re: Buying tweets ?
Hello

I am genuinely curious about how the ethics of research on available personal data is implemented. As a relatively new academic I would love to do this type of research but see many ethical hurdles. I have stuck to the organizational level rather than looking at data at the level of individuals, which to me is fraught with ethical data extraction and exploitation issues.

When Clearview AI used data on publicly available images, many of us said it was unethical. I have always wondered how a research study such as the infamous gaydar experiment passed ethics protocols at a reputable post secondary academic institution.
https://thenextweb.com/artificial-intelligence/2018/02/20/opinion-the-stanfo...

And from studies such as the one done recently by Mozilla
https://www.zdnet.com/article/mozilla-research-browsing-histories-are-unique...
indicating an individual is identifiable from 50-150 favorite sites, "no exposure of any identifying information" is a meaningless phrase.

Sincerely
Ushnish Sengupta

Message: 3
Date: Thu, 10 Sep 2020 11:04:36 -0400
From: Deen Freelon <dfreelon@gmail.com>
To: "air-l@listserv.aoir.org" <air-l@listserv.aoir.org>
Subject: Re: [Air-L] [External] Re: Buying tweets ?
Message-ID: <83ec2d80-c91c-ea9e-6f2b-395297079452@gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

Sure, many countries have a right to be forgotten. The US doesn't, and AFAIK there's little clear case law that applies to individuals' presence in research datasets. If someone asks me to remove their data from my datasets, I'm happy to do so, but I'm not willing to prospectively monitor Twitter's platform for deletions so that my datasets always match what is currently available on Twitter. That is technically infeasible for me, and I suspect for many others as well.

The practicality aspect I mentioned applies also to users. You can ask AIR-L members to remove your data, but what assurances do you have that they've done so? It's impossible even to check that they've actually read your message. Now consider all the other researchers' datasets of which your data may be a part--there's no way to even know who to ask. And all of this to prevent your data from being one point among millions, with no exposure of any identifying information? It's little wonder yours is the first data removal request I've ever received, but as I said, I'll honor it.

/DEEN

On 9/10/2020 10:49 AM, Stuart Shulman wrote:
There is nothing theoretical about checking in real time for deletions. When you study a Tweet's content in the Twitter display, if a Tweet is deleted or an account suspended or deleted, the Tweet will not display. That is real time compliance. We have done it for many years now, all the while advising students and faculty on the ethical importance of this point.
The "right to be forgotten" is law in many countries, so I am unsure how that is unresolved. Something is either legal or it is not. If anyone reading this has any of my deleted Tweets from my deleted account, the Canadian part of me requests you immediately delete them. If you lack the ability to check for compliance in real time, should you be handling my data and violating my right to be forgotten under the broad banner of research? I have tweeted extensively about acts by a hostile foreign power to game the imminent election. I have recently deleted personal Facebook, YouTube and Twitter accounts. Nobody has any business holding that data. It is unethical.
There are various guidelines about legally sharing lists of Tweet IDs for rehydration and replication (something almost never done) versus sharing spreadsheets of complete data extracts or the raw JSON, which is done all the time in defiance of the Twitter ToS.
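As a rough illustration of the points raised in this exchange (sharing Tweet ID lists rather than full extracts, dropping tweets that are no longer available, and honoring a removal request), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout (`id`, `user_id`, `text`) is a simplified stand-in for the real JSON, the function names are hypothetical, and the set of still-available IDs is assumed to come from a separate rehydration pass that is not shown.

```python
from __future__ import annotations

from typing import Iterable


def dehydrate(tweets: Iterable[dict]) -> list[str]:
    """Reduce full tweet records to a list of Tweet IDs only.

    The ID list carries no text or user fields, so anyone replicating
    the study has to rehydrate it and will only get what still exists.
    """
    return [str(t["id"]) for t in tweets]


def drop_unavailable(tweets: Iterable[dict], available_ids: set[str]) -> list[dict]:
    """Keep only records whose ID is in `available_ids`.

    `available_ids` is assumed to be produced elsewhere, e.g. by a
    rehydration pass; no platform API call is made here.
    """
    return [t for t in tweets if str(t["id"]) in available_ids]


def remove_user(tweets: Iterable[dict], user_id: str) -> list[dict]:
    """Honor a removal request by dropping every record from one user."""
    return [t for t in tweets if str(t["user_id"]) != user_id]


# Hypothetical local dataset in the simplified layout described above.
dataset = [
    {"id": 101, "user_id": "u1", "text": "first"},
    {"id": 102, "user_id": "u2", "text": "second"},
    {"id": 103, "user_id": "u1", "text": "third"},
]

shareable_ids = dehydrate(dataset)                      # IDs only
still_live = drop_unavailable(dataset, {"101", "102"})  # 103 was deleted
without_u1 = remove_user(dataset, "u1")                 # removal request
```

The filtering itself is trivial; the hard part debated in the thread is obtaining an up-to-date `available_ids` set, which this sketch deliberately leaves out.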
--
Dear all,

Many thanks to everyone who took the time to reply to my mail and provided me with a lot of food for thought!

Have a nice week and thank you again,
Sandrine
--
There are protocols, laws and mathematical calculations at our national statistics agency in Canada, where I work, so that no one individual's or one business's data is ever exposed. We need to publish in aggregates. Our samples range from something like 100 to 100,000 respondents. The public trusts us not to misuse their data. 80% of my time is spent calculating whether a table of statistical estimates exposes data.

Now, we don't publish salient quotes from research subjects, but we still make a great deal of effort to protect confidentiality. Typically we do a few surveys about the Internet every few years, funded by government initiatives to help Internet access.
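To make the idea of checking whether a table exposes data a little more concrete, here is a minimal Python sketch of one simple and widely known technique, small-cell suppression: counts below a threshold are withheld from the published table. This is not the agency's actual method. The threshold `MIN_CELL_COUNT`, the function names and the sample data are all hypothetical, and real disclosure control involves further checks (for example, making sure a suppressed cell cannot be recovered from row or column totals).

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Optional

# Hypothetical threshold chosen for illustration; not an official rule.
MIN_CELL_COUNT = 5


def tabulate(responses: Iterable[dict], key: str) -> Counter:
    """Count respondents per category of `key` (one-way frequency table)."""
    return Counter(r[key] for r in responses)


def suppress_small_cells(
    table: dict, min_count: int = MIN_CELL_COUNT
) -> dict[str, Optional[int]]:
    """Return a publishable copy of `table`.

    Cells with fewer than `min_count` respondents are replaced by None,
    so a reader cannot single out the few individuals behind them.
    """
    return {cell: (n if n >= min_count else None) for cell, n in table.items()}


# Hypothetical survey responses: 7 from region A, 2 from region B.
responses = [{"region": "A"}] * 7 + [{"region": "B"}] * 2

raw_table = tabulate(responses, "region")
published = suppress_small_cells(raw_table)
```

Here region B would be withheld because only two respondents fall into it, while region A is published as an aggregate.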
--
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
participants (3)
- Peter Timusk
- Sandrine Roginsky
- Ushnish Sengupta