Re: [Air-L] Suggestions for anonymizing qual interview data

9 Apr 2021

Cory, thanks for asking this important question, and Michael, thank you for
pointing out some issues of intersectional re-identification in qualitative
research, particularly for marginalized participants. This is a huge social
justice issue where more research needs to be done.

As a member of multiple underrepresented groups, I've frequently been asked
to be a participant in qualitative research. I've found that many very
well-meaning and intelligent people, including both researchers and IRBs,
are unaware of the risks to vulnerable populations, particularly when
intersectionality is involved. Researchers often apply naïve techniques for
anonymity, and IRBs assume they are sufficient.

Some IRB-approved projects clearly re-identified me and put me at risk,
e.g. "anonymous participant X is a female computer science PhD student who
is also an aerobatic pilot" (in a department where a professor had told me
"women don't have the intellectual ability for computer science"). So I
started requiring researchers to let me approve and/or edit the final text
to go into published reports (and yes, it was a lot of labor for me). Some
of the techniques I ended up asking researchers to use:

   - Apply a qualitative analog of differential privacy: add "noise" by
   obfuscating parts of the quotes and identifiers. For example, change the
   above description to "anonymous participant X is a female computer science
   PhD student who is also a competitive skier." This maintains the contextual
   integrity of the quote but obfuscates an easily identifiable characteristic.
   - Don't retain identifiers throughout the document for the same
   individual. It can be easy to re-identify an individual from a series of
   quotes, particularly if they have intersectional identities. Instead, you
   can attribute some of the quotes to "participant X, a female computer
   science student" and the rest to "participant Y, a Latinx computer science
   student."
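
For anyone who keeps their quotes in a script or spreadsheet export, the two
techniques above could be sketched roughly as follows. This is only a minimal
illustration under my own assumptions, not an established tool: the names
`obfuscate`, `split_attributions`, and `SUBSTITUTIONS`, and the data layout,
are all hypothetical, and the substitutions and single-facet descriptions
should be chosen together with the participant rather than by the script.

```python
# Hypothetical stand-ins agreed with the participant: each replaces an easily
# identifiable detail with a comparable one that keeps the quote's context.
SUBSTITUTIONS = {"an aerobatic pilot": "a competitive skier"}


def obfuscate(description, substitutions):
    """Swap identifying details in a description for agreed stand-ins."""
    for real, stand_in in substitutions.items():
        description = description.replace(real, stand_in)
    return description


def split_attributions(quotes, facets):
    """Spread one person's quotes across several pseudonyms.

    quotes: list of (real_id, quote_text) pairs, in document order.
    facets: dict mapping real_id to a list of descriptions that each reveal
            only one facet of that person's identity.
    Each (person, facet) pair gets its own pseudonym, and a person's quotes
    rotate through their facets, so no single label carries the full
    intersectional description.
    """
    pseudonyms = {}  # (real_id, facet) -> "participant N"
    seen = {}        # real_id -> number of quotes attributed so far
    result = []
    for real_id, text in quotes:
        options = facets[real_id]
        n = seen.get(real_id, 0)
        seen[real_id] = n + 1
        facet = options[n % len(options)]
        key = (real_id, facet)
        if key not in pseudonyms:
            pseudonyms[key] = f"participant {len(pseudonyms) + 1}"
        result.append((f"{pseudonyms[key]}, {facet}", text))
    return result


if __name__ == "__main__":
    print(obfuscate(
        "a female computer science PhD student who is also an aerobatic pilot",
        SUBSTITUTIONS,
    ))
    facets = {"p1": ["a female computer science student",
                     "a Latinx computer science student"]}
    quotes = [("p1", "first quote"), ("p1", "second quote")]
    for label, text in split_attributions(quotes, facets):
        print(f"{label}: {text}")
```

A script like this only does the bookkeeping. The participant should still
review the final text, since the script cannot tell whether the quote itself
gives the person away.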

I'd love to see more research on this topic and would be happy to
collaborate on projects!

Cecilia

--
Cecilia R. Aragon, Professor
Director, Human-Centered Data Science Lab
Department of Human Centered Design & Engineering, University of
Washington, Seattle
http://faculty.washington.edu/aragon | @craragon
<https://twitter.com/craragon>
New memoir *Flying Free: My Victory over Fear to Become the First Latina
Pilot on the US Aerobatic Team*
<https://www.blackstonepublishing.com/flying-free-cecilia-aragon>

On Fri, Apr 9, 2021 at 6:43 AM Michael Muller <michael_muller@us.ibm.com>
wrote:
...
I think part of the thinking-process might be: How easily can someone
   figure out the identity of the informant? If I were to say that I
   interviewed people in our 8-person team, and if I report that one
   person was working from the Pacific timezone, then it's easy to
   determine which of us I am referring to. Or if I were to write that a
   disabled member of the team said... then that's me. I know that seems
   obvious for a tiny group, but these kinds of intersectional identities
   can operate in larger groups, too.
   A second way-of-thinking may involve a focus on the risk of disclosure
   to the informant. However, this criterion often becomes a matter of the
   researcher's imagination regarding the Other. It's been shown again and
   again that people in a position of privilege and safety may not
   understand the very real risks that are experienced by people who have
   fewer safeguards - e.g., men writing about women's safety (how easily
   can a stalker act on the information?), or straight people writing
   about risk of identification of someone in one of the LGBTQIA+ spectra,
   or citizens making assumptions about legal protections (or lack of
   protections) for non-citizens. Of course, it's a good idea to discuss
   these matters with people who are not ourselves, and who are not like
   ourselves. It's also a good idea not to put the burden of explaining
   bias on the person who is the target of that bias. Yes, I know that I
   said two things that somewhat contradict each other. There are no easy
   answers here.
   A third possibility is to ask each informant to state what information
   about themself would be safe to share. This is sensible only if the
   informants understand publications and readerships, etc. But it may be
   a more radically democratic approach to demographic description.
   I'm suggesting these ideas as among a larger number of *starting
   points* for thinking about difficult research questions. Please think
   of them as heuristic questions - not as authoritative questions, and
   certainly not as answers!
   best wishes,
   --michael
   -----
   Michael Muller, PhD, IBM Research, Cambridge MA USA
   pronouns: he/him/his
   ACM Distinguished Scientist
   ACM SIGCHI Academy
...
On Wed, Apr 7, 2021 at 3:47 AM Cory Robinson <cory.robinson@liu.se> wrote:
Hi all,
Two Master's students I recently met are conducting recorded interviews
resulting in texts they will code and quote within their theses. I have
given input about how to protect the recorded interviews (encrypted,
password protected, not stored in the cloud). I do not work with qual data,
so I need help recommending methodology or help for anonymizing quotes in
their thesis.
(I am inquiring about this for a student who, unfortunately, has not
received helpful advice from their supervisor.) ☹
The students assumed they would assign each participant an
identification number, and then attribute the quote and ID # in their
thesis. However, I feel there is surely a better way to ensure anonymity?
(Too easy to reidentify if research data was obtained.)
What methods do you utilize for anonymizing individual interview data? Or
manuscripts/books helpful for this? Sadly, the students are nearing the end
of the study, but late is better than never. (It's indeed a failure of
universities, as well as unequipped supervisors!)
Best,
Cory
