Methods to infer Twitter data location
Hi aoir,
Could anyone please share resources on methods/applications to infer Twitter data locations from non-geo-located messages?
Thanks.
Deena
____________________________
Deena Abul-Fottouh, PhD.
Postdoctoral Research Fellow
Digital Society Lab
Political Science Department
Phone: (905) 525-9140  Email: abulfodm@mcmaster.ca
McMaster University
1280 Main Street West
Hamilton, ON L8S 4M4
User location on Twitter is very tricky and unreliable. Users must opt in to share geo data or to "claim" a location when they set up or edit their account profile. Claimed locations range from 100% authentic to preposterous satirical banter and everything in between. Generally, if you restrict yourself to the study of geo-located or location-purported data, you are looking at a small fraction of the whole dataset. For example, here are the 30 most common "claimed" locations in a dataset of 421,000 Tweets about "Dominion Voting Systems" from November 2020, where the most common (United States) appears 15,000+ times and the 30th (New York, NY) almost 700 times. There are 239,051 purported locations in this dataset.
United States
USA
Texas, USA
Florida, USA
California, USA
Texas
Georgia, USA
Canada
Venezuela
New York, USA
Pennsylvania, USA
Arizona, USA
Los Angeles, CA
Michigan, USA
Ohio, USA
North Carolina, USA
Florida
New Jersey, USA
Washington, DC
Houston, TX
Virginia, USA
California
Tennessee, USA
South Carolina, USA
Atlanta, GA
Washington, USA
Earth
Illinois, USA
Chicago, IL
New York, NY
To study data like this you may need to combine entries that are written differently but are the same in practice at the geographical scale you are working with (for example, "Texas" and "Texas, USA"). Some locations are just Twitter handles. Others are Gab or Parler handles. Some look like this:
A Desk in Open Office Hell
A Field of White Roses
A Galaxy Far Away
A Galaxy far far away
A Galaxy far, far South
A Garden of Feelings &Hot Air
A Getsemani, Jerusalen
In reply to Deena Abul-Fottouh via Air-L, Wed, Feb 22, 2023 at 11:32 AM
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
--
Dr. Stuart W. Shulman
Founder and CEO, Texifter
Editor Emeritus, *Journal of Information Technology & Politics*
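The merging and filtering described in the message above could be sketched roughly as follows. This is only an illustration, not the method used on the dataset mentioned: the alias table, the "not a place" patterns, and the function names are all assumptions, and which variants to merge depends on the geographic scale of the study.

```python
import re
from collections import Counter

# Hypothetical alias table (an assumption): map spelling variants of the
# same place to one canonical label at the chosen geographic scale.
ALIASES = {
    "usa": "United States",
    "united states": "United States",
    "texas": "Texas, USA",
    "texas, usa": "Texas, USA",
    "florida": "Florida, USA",
    "florida, usa": "Florida, USA",
    "california": "California, USA",
    "california, usa": "California, USA",
}

# Hypothetical patterns for claims that are not usable places:
# bare handles, "Earth", and galaxy jokes.
NOT_A_PLACE = re.compile(r"^@\w+$|galaxy|^earth$", re.IGNORECASE)


def normalize_location(raw):
    """Return a canonical label, or None if the claim is not a usable place."""
    cleaned = re.sub(r"\s+", " ", raw).strip()
    if not cleaned or NOT_A_PLACE.search(cleaned):
        return None
    # Unknown strings pass through unchanged so they can be reviewed by hand.
    return ALIASES.get(cleaned.lower(), cleaned)


def count_locations(raw_locations):
    """Count canonical labels, skipping entries that were filtered out."""
    counts = Counter()
    for raw in raw_locations:
        label = normalize_location(raw)
        if label is not None:
            counts[label] += 1
    return counts


sample = ["United States", "USA", "Texas", "Texas, USA", "Earth",
          "A Galaxy far far away", "@some_handle", "Chicago, IL"]
counts = count_locations(sample)
```

Letting unknown strings pass through unchanged is a deliberate choice here: with hundreds of thousands of distinct claims, a hand-reviewed alias table will never be complete, so it seems safer to surface unmatched entries than to drop them silently.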
Depending on your tolerance for error and how specific you need the location to be, you may be able to use the user's location (from their profile, available through the users API if you still have access) rather than the tweet's geolocation. People usually tweet from "home".
Libby
In reply to Shulman, Stu via Air-L, Wed, Feb 22, 2023 at 2:38 PM
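One way to read the suggestion above is as a fallback: use the tweet's own geolocation when it exists, and the profile location otherwise. The sketch below assumes tweets are already stored as dictionaries shaped like the v1.1 API tweet object (`coordinates`, `place`, and `user.location`); the function name and the source labels are hypothetical, and the field names would need adjusting for other schemas.

```python
def best_available_location(tweet):
    """Return (source, value) for the most specific location signal present."""
    coords = tweet.get("coordinates")
    if coords and coords.get("coordinates"):
        lon, lat = coords["coordinates"]  # GeoJSON order: longitude, latitude
        return ("coordinates", (lat, lon))
    place = tweet.get("place")
    if place and place.get("full_name"):
        return ("place", place["full_name"])
    # Fall back to the free-text location claimed on the user's profile.
    profile = (tweet.get("user") or {}).get("location")
    if profile and profile.strip():
        return ("profile", profile.strip())
    return ("none", None)


# Hypothetical example records, not taken from any real dataset.
geo_tweet = {"coordinates": {"type": "Point", "coordinates": [-95.37, 29.76]},
             "user": {"location": "Houston, TX"}}
profile_only_tweet = {"coordinates": None, "place": None,
                      "user": {"location": " Chicago, IL "}}
```

Returning the source alongside the value keeps the error tolerance explicit downstream, since a profile claim is far noisier than a coordinate.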
Hi list,
I have been having a nice chat with Deena about this, but now that there have been a few technical responses to this query, I wanted to send this to the list.
I would urge researchers in this space to make sure they are considering the immediate ethical dimensions of attempting to infer the physical locations of users who have, by their settings and practice, explicitly said, “please don’t know my location,” especially if that location may in fact be their home…
What that location becomes linked to in any data set produced could potentially be deeply harmful, or at a minimum invasive.
Best of luck,
Dan
In reply to Libby Hemphill via Air-L, Wednesday, February 22, 2023 at 12:13 PM
A while back we used the concept of "Place" from Human Geography in a paper where we used the profile listing of where someone claims to be or be from. I've included the cite for the paper below.
Yes, people do claim to be from Mars sometimes, like Libby says. I expect that is because they wish to preserve privacy or they just enjoy being silly. Both are ok in my book. And like Libby, I think if your methods take into consideration a certain amount of noise, then it should be fine.
Unlike Dan, I'm not too worried about infringing on someone's privacy when using the place they list on their PUBLIC Twitter profile.
Here is the paper I mentioned: Hemsley, J. J., & Eckert, J. (2014, January). Examining the role of "place" in Twitter networks through the lens of contentious politics. In 2014 47th Hawaii International Conference on System Sciences (pp. 1844-1853). IEEE.
In reply to Gardner, Daniel via Air-L, Wednesday, February 22, 2023 3:34 PM
Actually, I don’t disagree with you, Jeff. If their location is public, then the conditions of my comment may not apply. However, it still does apply in the context and specificity of the original question: inferring tweet location in lieu of geo-tags. Geo-tags are much more specific than a profile’s “City, State/Province.” Even if I had my city publicly available, an extreme inference of location on a tweet might approximate my home, to refer to another comment. Even then, if we are speaking generally and only place users “at home,” the problem remains minimal. However, if the inference is closer to an actual geo-tag than to a general “place,” then it potentially goes beyond what they have made available publicly and increases the linkability of information about users without their consent or knowledge, and depending on the subject of the analysis could potentially cause harm. Technically, our addresses are publicly available if someone is on our street, but sharing that information linked to individuals online can be illegal.
Again, I agree that general, publicly available location information is pretty safe. I simply cautioned that any data that includes inferred specific locations, which the original question suggested may be a goal, is very likely to carry risks that should be mitigated.
Dan
In reply to Jeff Hemsley, Thursday, February 23, 2023 at 5:55 AM
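The message above does not name a specific mitigation, so the following is only one possible illustration, offered as an assumption: coarsen any inferred coordinates before they are stored, and report only cells that contain several users. One decimal place of latitude is roughly 11 km, which is closer to a general "place" than to a geo-tag. The function names, the rounding level, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter


def coarsen(lat, lon, decimals=1):
    """Round coordinates so they identify an area rather than an address."""
    return (round(lat, decimals), round(lon, decimals))


def aggregate_cells(points, min_count=5):
    """Count points per coarse cell and drop cells below min_count."""
    cells = Counter(coarsen(lat, lon) for lat, lon in points)
    return {cell: n for cell, n in cells.items() if n >= min_count}


# Hypothetical inferred points: five near one spot, one isolated.
points = [(40.7128, -74.0060)] * 5 + [(34.0522, -118.2437)]
reportable = aggregate_cells(points)
```

Dropping sparsely populated cells matters because a single user in an otherwise empty cell can still be linkable even after rounding.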
For what it's worth, while GPS-tagged tweets continue their slow descent, Place-tagged tweets were on an upward trajectory as of when we last looked, at the end of last year, though users with a populated Location field were continuing to decline: https://blog.gdeltproject.org/visualizing-a-decade-of-twitters-evolution-jan...
Kalev
In reply to Jeff Hemsley via Air-L, Thu, Feb 23, 2023 at 9:36 AM
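To see how these three signals are distributed in your own collection, a coverage tally along the following lines may help. It is a rough sketch under the assumption that tweets are dictionaries shaped like the v1.1 API tweet object; the function name and the result keys are hypothetical, and this is not how the linked analysis was produced.

```python
from collections import Counter


def location_signal_coverage(tweets):
    """Return the share of tweets carrying each kind of location signal."""
    counts = Counter()
    for tweet in tweets:
        if tweet.get("coordinates"):
            counts["gps"] += 1
        if tweet.get("place"):
            counts["place"] += 1
        # A profile Location field counts only if it is non-empty text.
        if ((tweet.get("user") or {}).get("location") or "").strip():
            counts["profile_location"] += 1
    total = len(tweets)
    if total == 0:
        return {}
    return {key: counts[key] / total
            for key in ("gps", "place", "profile_location")}


# Hypothetical records for illustration only.
tweets = [
    {"coordinates": {"type": "Point", "coordinates": [-95.37, 29.76]},
     "place": {"full_name": "Houston, TX"},
     "user": {"location": "Houston, TX"}},
    {"coordinates": None, "place": {"full_name": "Atlanta, GA"},
     "user": {"location": ""}},
    {"user": {"location": "Earth"}},
    {},
]
coverage = location_signal_coverage(tweets)
```

The signals are counted independently, so the shares can add up to more than 1 when a tweet carries several of them.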
A while back we used the concept of "Place" from Human Geography in a paper where we used the profile listing of where someone claims to be or be from. I've included the cite for the paper below.
Yes, people do claim to be from Mars sometimes, like Libby says. I expect that is because they wish to preserve privacy or they just enjoy being silly. Both are ok in my book. And like Libby, I think if your methods take into consideration a certain amount of noise, then it should be fine.
Unlike Dan, I'm not too worried about infringing on someone's privacy when using the place they list on their PUBLIC Twitter profile.
Here is the paper I mentioned: Hemsley, J. J., & Eckert, J. (2014, January). Examining the role of "place" in Twitter networks through the lens of contentious politics. In 2014 47th Hawaii International Conference on System Sciences (pp. 1844-1853). IEEE.
-----Original Message-----
From: Air-L <air-l-bounces@listserv.aoir.org> On Behalf Of Gardner, Daniel via Air-L
Sent: Wednesday, February 22, 2023 3:34 PM
To: Libby Hemphill <libbyh@umich.edu>; Shulman, Stu <stu@texifter.com>
Cc: aoir list <air-l@aoir.org>
Subject: Re: [Air-L] Methods to infer Twitter data location
Hi list,
I have been having a nice chat with Deena about this, but now that there have been a few technical responses to this query, I wanted to send this to the list.
I would urge researchers in this space to make sure they are considering the immediate ethical dimensions of attempting to infer the physical locations of users who have, by their settings and practice, explicitly said, "please don't know my location," especially if that location may in fact be their home...
What that location becomes linked to in any data set produced could be deeply harmful, or at a minimum invasive.
Best of luck, Dan
From: Air-L <air-l-bounces@listserv.aoir.org> on behalf of Libby Hemphill via Air-L <air-l@listserv.aoir.org>
Date: Wednesday, February 22, 2023 at 12:13 PM
To: Shulman, Stu <stu@texifter.com>
Cc: aoir list <air-l@aoir.org>
Subject: Re: [Air-L] Methods to infer Twitter data location
Depending on your tolerance for error and how specific you need the location to be, you may be able to use the user's location (from their profile, available through the users API if you still have access) rather than the tweet's geolocation. People usually tweet from "home".
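As a rough illustration of the fallback described above, here is a minimal Python sketch that picks the most specific location signal available for a tweet and otherwise falls back to the profile location. It assumes each tweet is a dict shaped like the Twitter v1.1 JSON payload (`coordinates`, `place`, `user.location`); the function name `best_available_location` and the source labels are hypothetical, and the field layout should be checked against whatever data you actually have.

```python
def best_available_location(tweet):
    """Return (source, value) for the most specific location signal found.

    Assumes a v1.1-style tweet dict; adjust the keys for other formats.
    """
    # Exact point, stored as GeoJSON [longitude, latitude].
    coords = tweet.get("coordinates")
    if coords and coords.get("coordinates"):
        lon, lat = coords["coordinates"]
        return ("gps", (lat, lon))

    # Place tag chosen by the user for this tweet.
    place = tweet.get("place")
    if place and place.get("full_name"):
        return ("place", place["full_name"])

    # Free-text location claimed on the user's profile (may be noise).
    profile = (tweet.get("user") or {}).get("location")
    if profile and profile.strip():
        return ("profile", profile.strip())

    return ("none", None)


example = {"coordinates": None, "place": None,
           "user": {"location": " Hamilton, ON "}}
print(best_available_location(example))
```

Keeping the source label alongside the value makes it possible to report later how much of a dataset rests on self-reported profile text rather than on tweet-level geodata.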
Libby
On Wed, Feb 22, 2023 at 2:38 PM Shulman, Stu via Air-L < air-l@listserv.aoir.org> wrote:
User location on Twitter is very tricky and unreliable. Users must opt in to share geo or to "claim" a location when they set up or edit their account profile. Claims of location range from 100% authentic to preposterous satirical banter and everything in between. Generally, if you restrict yourself to the study of geo-located or location-purported data, you are looking at a small fraction of the whole dataset. For example, here are the 30 most common "claimed" locations in a dataset of Tweets about "Dominion Voting Systems" from November 2020, where the most common claimed location among 421,000 Tweets appears 15,000+ times (United States) and the 30th appears almost 700 times (New York, NY). There are 239,051 purported locations in this dataset.
United States
USA
Texas, USA
Florida, USA
California, USA
Texas
Georgia, USA
Canada
Venezuela
New York, USA
Pennsylvania, USA
Arizona, USA
Los Angeles, CA
Michigan, USA
Ohio, USA
North Carolina, USA
Florida
New Jersey, USA
Washington, DC
Houston, TX
Virginia, USA
California
Tennessee, USA
South Carolina, USA
Atlanta, GA
Washington, USA
Earth
Illinois, USA
Chicago, IL
New York, NY
To study data like this you may need to merge entries that are written differently but are the same in practice at the geographical scale you are working with. Some locations are just Twitter handles. Others are Gab or Parler handles. Some look like this:
A Desk in Open Office Hell
A Field of White Roses
A Galaxy Far Away
A Galaxy far far away
A Galaxy far, far South
A Garden of Feelings &Hot Air
A Getsemani, Jerusalen
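One way to combine claimed locations that differ only in how they are written, as described above, can be sketched in Python. This is a minimal illustration and not the method used for the dataset in this message: the `ALIASES` table, the function names, and the rule that drops strings starting with "@" are all assumptions, and the table would need to be built for the geographic scale of your own study. Joke locations are deliberately left in place, since separating them from real ones needs a gazetteer or manual review.

```python
import re
from collections import Counter

# Hypothetical alias table: lower-cased variant -> canonical label.
ALIASES = {
    "usa": "United States",
    "united states": "United States",
    "texas": "Texas, USA",
    "florida": "Florida, USA",
    "california": "California, USA",
}


def normalize_location(raw):
    """Return a canonical label, or None for empty strings and handles."""
    text = re.sub(r"\s+", " ", raw).strip()
    if not text or text.startswith("@"):
        return None  # blank, or a handle rather than a place
    return ALIASES.get(text.lower(), text)


def count_locations(raw_locations):
    """Count claimed locations after merging known variants."""
    counts = Counter()
    for raw in raw_locations:
        label = normalize_location(raw)
        if label is not None:
            counts[label] += 1
    return counts


sample = ["United States", "USA", "Texas, USA", "Texas", " texas ",
          "@some_handle", "A Galaxy Far Away", "Earth"]
counts = count_locations(sample)
print(counts.most_common(3))
```

Entries such as "Earth" or "A Galaxy Far Away" pass through unchanged here, which keeps the noise visible in the counts rather than silently discarding it.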
On Wed, Feb 22, 2023 at 11:32 AM Deena Abul-Fottouh via Air-L < air-l@listserv.aoir.org> wrote:
Hi aoir,
Could anyone please share resources on methods/applications to infer Twitter data locations from non-geo-located messages?
Thanks.
Deena
____________________________
Deena Abul-Fottouh, PhD.
Postdoctoral Research Fellow
Digital Society Lab
Political Science Department
phone: (905) 525-9140
email: abulfodm@mcmaster.ca
McMaster University
1280 Main Street West
Hamilton, ON L8S 4M4
--
Dr. Stuart W. Shulman
Founder and CEO, Texifter
Editor Emeritus, *Journal of Information Technology & Politics*
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
participants (6)
- Deena Abul-Fottouh
- Gardner, Daniel
- Jeff Hemsley
- kalev leetaru
- Libby Hemphill
- Shulman, Stu