[Air-L] The Case Against Spreadsheets

20 Mar 2021

      The Case Against Spreadsheets
https://vimeo.com/526218014

When is a Tweet not a Tweet?

My answer: “When it appears in a spreadsheet, or is deleted, or the account
has been suspended.”

Though the video title sounds polemical, half of the argument strongly
supports doing certain parts of Twitter research using spreadsheets.
Indeed, Twitter is actively encouraging and training academic researchers
to transform raw JSON into CSV files for research purposes. Though I find
it counterintuitive to mechanically separate the data from the platform
that gives it vitality beyond the moment of the Tweet, I do it all the time
with Twitter metadata for a variety of research and model building
purposes.
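
As a minimal sketch of that JSON-to-CSV step, the Python below flattens a few metadata fields from each Tweet into a spreadsheet row. The field names (`id`, `created_at`, `author_id`, `text`, `public_metrics`) assume Twitter API v2 Tweet objects. The column list and the function names are hypothetical and chosen only for illustration. Media, replies, and the display context are all dropped along the way.

```python
import csv
import io
import json

# Hypothetical column set; field names assume Twitter API v2 Tweet objects.
COLUMNS = ["id", "created_at", "author_id", "text", "retweet_count", "like_count"]


def tweet_to_row(tweet):
    """Flatten one Tweet dict into a flat dict keyed by COLUMNS."""
    metrics = tweet.get("public_metrics", {})
    return {
        "id": tweet.get("id", ""),
        "created_at": tweet.get("created_at", ""),
        "author_id": tweet.get("author_id", ""),
        "text": tweet.get("text", ""),
        "retweet_count": metrics.get("retweet_count", ""),
        "like_count": metrics.get("like_count", ""),
    }


def tweets_json_to_csv(json_lines):
    """Convert an iterable of JSON strings (one Tweet each) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for line in json_lines:
        line = line.strip()
        if line:  # skip blank lines between records
            writer.writerow(tweet_to_row(json.loads(line)))
    return buffer.getvalue()
```

Passing an open file of line-delimited JSON, for example `tweets_json_to_csv(open("tweets.jsonl"))` with a hypothetical file name, returns CSV text that can be written to disk and opened in any spreadsheet program.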

However, it is the interactional, evolving, multimodal media content and
the distinctive, color-rich look and feel that make Twitter a quintessential
early 21st century social media platform. No spreadsheet can capture that.
The case against spreadsheets is rooted in a desire for reliable and
authentic interpretation practices for digital artifacts.

Tweets do not live and cannot make meaning in spreadsheets. Many
researchers wait weeks, months, or years to find the time to look at the
data closely. After the collection process stops, Tweets often get deleted.
Accounts are suspended or deleted. Though a sophisticated (usually
well-funded) research team might have a system and method for compliance
checking deletions and suspensions at scale, there is no practical way to
check and recheck the 10 million Tweets per month academics can now get
under Twitter’s new sponsored quotas.

Studying Tweet content in spreadsheets is like studying polar bears in a
zoo to explain their fate on the melting ice caps. Many things are
fundamentally different at the zoo. Inductive and qualitative methods
require seeing Tweets in the native display. The same is true for medium- to
large-scale content analysis, where machine-learning models informed by
features intrinsic to the Twitter display are often more robust than those
trained on text-only observations.

I have repeatedly heard the story of researchers clicking from spreadsheets
to Twitter and back to the spreadsheet to record their observations. This
common practice is a lot of work on the mouse. Anyone who has done it at
any scale has probably felt the carpal tunnel kick in, as it requires at
least 4-5 clicks, perhaps more depending on labeling variations. In
the video, I describe findings accumulated over ten years of studying and
coding Tweets using the Twitter display, obscuring deleted and suspended
content in real time, and avoiding whenever possible the computer mouse.

Dr. Stuart Shulman
U.S. Soccer Federation C-Licensed Coach