Cory, thanks for asking this important question, and Michael, thank you for pointing out some issues of intersectional re-identification in qualitative research, particularly for marginalized participants. This is a huge social justice issue where more research needs to be done.

As a member of multiple underrepresented groups, I've frequently been asked to be a participant in qualitative research. I've found that many very well-meaning and intelligent people, including both researchers and IRBs, are unaware of the risks to vulnerable populations, particularly when intersectionality is involved. Researchers often apply naïve techniques for anonymity, and IRBs assume they are sufficient.

Some IRB-approved projects clearly re-identified me and put me at risk, e.g. "anonymous participant X is a female computer science PhD student who is also an aerobatic pilot" (in a department where a professor had told me "women don't have the intellectual ability for computer science"). So I started requiring researchers to let me approve and/or edit the final text to go into published reports (and yes, it was a lot of labor for me).

Some of the techniques I ended up asking researchers to use:

- Apply a qualitative analog of differential privacy: add "noise" by obfuscating parts of the quotes and identifiers. For example, change the above description to "anonymous participant X is a female computer science PhD student who is also a competitive skier." This maintains the contextual integrity of the quote but obfuscates an easily identifiable characteristic.

- Don't retain identifiers throughout the document for the same individual. It can be easy to re-identify an individual from a series of quotes, particularly if they have intersectional identities. Instead, you can attribute some of the quotes to "participant X, a female computer science student" and the rest to "participant Y, a Latinx computer science student."
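
The second technique above (not keeping one identifier per individual) could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not a tool anyone in this thread has described: the function name `split_attribution`, the `labels_per_participant` parameter, and the "Participant N" label format are all hypothetical. It spreads each person's quotes across several reporting labels, and no label is ever shared by two people.

```python
import random


def split_attribution(quotes, labels_per_participant=2, seed=0):
    """Spread each participant's quotes over several reporting labels.

    quotes: list of (participant_id, quote_text) pairs, in reporting order.
    Returns a list of (label, quote_text) pairs in the same order.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    participant_ids = sorted({pid for pid, _ in quotes})

    # Shuffle the label numbers so that adjacent numbers do not hint
    # at which labels belong to the same person.
    label_numbers = list(range(1, len(participant_ids) * labels_per_participant + 1))
    rng.shuffle(label_numbers)

    # Give each participant a private pool of labels.
    pools = {}
    for i, pid in enumerate(participant_ids):
        start = i * labels_per_participant
        pools[pid] = label_numbers[start:start + labels_per_participant]

    # Rotate through the participant's own pool, quote by quote.
    used = {pid: 0 for pid in participant_ids}
    attributed = []
    for pid, text in quotes:
        pool = pools[pid]
        label = f"Participant {pool[used[pid] % len(pool)]}"
        used[pid] += 1
        attributed.append((label, text))
    return attributed
```

The mapping from real participants to labels (including the seed) would need to be protected like the recordings themselves. Choosing which descriptive details to attach to each label, or to obfuscate, still calls for human judgment and ideally the participant's own review.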

I'd love to see more research on this topic and would be happy to collaborate on projects!

Cecilia

--
Cecilia R. Aragon, Professor
Director, Human-Centered Data Science Lab
Department of Human Centered Design & Engineering, University of Washington, Seattle
http://faculty.washington.edu/aragon | @craragon <https://twitter.com/craragon>
New memoir *Flying Free <https://www.blackstonepublishing.com/flying-free-cecilia-aragon>: My Victory over Fear to Become the First Latina Pilot on the US Aerobatic Team*

On Fri, Apr 9, 2021 at 6:43 AM Michael Muller <michael_muller@us.ibm.com> wrote:

I think part of the thinking-process might be: How easily can someone figure out the identity of the informant? If I were to say that I interviewed people in our 8-person team, and if I report that one person was working from the Pacific timezone, then it's easy to determine which of us I am referring to. Or if I were to write that a disabled member of the team said... then that's me. I know that seems obvious for a tiny group, but these kinds of intersectional identities can operate in larger groups, too.

A second way-of-thinking may involve a focus on the risk of disclosure to the informant. However, this criterion often becomes a matter of the researcher's imagination regarding the Other. It's been shown again and again that people in a position of privilege and safety may not understand the very real risks that are experienced by people who have fewer safeguards - e.g., men writing about women's safety (how easily can a stalker act on the information?), or straight people writing about risk of identification of someone in one of the LGBTQIA+ spectra, or citizens making assumptions about legal protections (or lack of protections) for non-citizens.

Of course, it's a good idea to discuss these matters with people who are not ourselves, and who are not like ourselves. It's also a good idea not to put the burden of explaining bias on the person who is the target of that bias. Yes, I know that I said two things that somewhat contradict each other. There are no easy answers here.

A third possibility is to ask each informant to state what information about themself would be safe to share. This is sensible only if the informants understand publications and readerships, etc. But it may be a more radically democratic approach to demographic description.

I'm suggesting these ideas as among a larger number of *starting points* for thinking about difficult research questions.
Please think of them as heuristic questions - not as authoritative questions, and certainly not as answers!

best wishes,
--michael

-----
Michael Muller, PhD, IBM Research, Cambridge MA USA
pronouns: he/him/his
ACM Distinguished Scientist
ACM SIGCHI Academy

On Wed, Apr 7, 2021 at 3:47 AM Cory Robinson <cory.robinson@liu.se> wrote:

Hi all,
Two Master's students I recently met are conducting recorded interviews resulting in texts they will code and quote within their theses. I have given input about how to protect the recorded interviews (encrypted, password protected, not stored in the cloud). I do not work with qual data, so I need help recommending methodology or help for anonymizing quotes in their thesis.
(I am inquiring about this for a student who, unfortunately, has not received helpful advice from their supervisor.) ☹
The students assumed they would assign each participant an identification number, and then attribute each quote to an ID number in their thesis. However, I feel there is surely a better way to ensure anonymity. (It seems too easy to re-identify participants if the research data were obtained.)
What methods do you utilize for anonymizing individual interview data? Or manuscripts/books helpful for this? Sadly, the students are nearing the end of the study, but late is better than never. (It's indeed a failure of universities, as well as unequipped supervisors!)
Best, Cory