Re: [Air-L] Tool for collecting Instagram images/websites/data?
Maybe the R package instaR: https://github.com/pablobarbera/instaR/blob/master/examples.R

For long-term applications you could learn web scraping with the R package rvest (there are more options for web scraping).

The quick-and-dirty method is to search for a tag on Instagram in a browser, scroll down repeatedly to load as many pictures as you need, and then save the resulting Instagram page to a new folder of its own, which will then contain all the loaded pictures with that tag.

Hope that helps

On Sat, Sep 17, 2016 at 5:27 PM, Rainer Hillrichs <hillrichs@uni-mannheim.de> wrote:
Dear all,
I searched on the list and on the web but couldn't find anything: I'm looking for a tool that collects Instagram images, websites, and data associated with a specific tag. Basically, I want to type in a tag and end up with a folder full of images, websites, and a table with data (e.g. user name, date posted, URL, other tags). I already suspect that is a lot to ask for ;-) Even a simpler tool would be a good start, as long as I don't end up saving individual images and websites and typing/copying stuff into a table.
Suggestions very much appreciated! Rainer
-- Dr. Rainer Hillrichs Universität Mannheim https://uni-mannheim.academia.edu/RainerHillrichs
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/
-- ________________________________________________ Maurice Vergeer To contact me, see http://mauricevergeer.nl/node/5 To see my publications, see http://mauricevergeer.nl/node/1 ________________________________________________
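A minimal sketch of the scraping idea Maurice describes, written in Python with the standard library rather than with rvest. It assumes the tag page has already been saved from the browser as an HTML file and that the pictures appear as ordinary <img> tags with a src attribute; the real Instagram markup may differ, and the class and function names here are made up for illustration.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            # Skip images without a src and any URL already seen.
            if src and src not in self.sources:
                self.sources.append(src)


def image_sources(html_text):
    """Return the unique image URLs found in a saved page's HTML."""
    parser = ImageSrcCollector()
    parser.feed(html_text)
    return parser.sources
```

To use it, read the saved page into a string (for example with open(...).read()) and pass it to image_sources(); the result is a plain list of URLs that can be written to a file or downloaded.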
Hi Rainer,

When collecting data for my Harkive project (which collects on the #harkive hashtag across various platforms, including Instagram, Twitter, etc.), I recently started using Zapier, which I found solved a lot of time-consuming issues. The main advantage for my purposes was that it significantly reduced data cleaning/organising post collection.

As I’m sure you are aware, the API of each service uses slightly different terms for common elements of the data available. E.g. on Twitter the text content of a Tweet is contained within the <text> element, whereas on Tumblr the text written by a user is in the <body> element. Not only that, but date/time stamps often appear in different formats.

What Zapier allowed me to do was collect from the various APIs, reformat to common data formats, and then write all data to specific columns in a central Google Docs spreadsheet, so that all commonly formatted date/time stamps appeared in a single column, all usernames in another, and so on, regardless of the service they originated from. I was also able to add a column that labelled each row as originating from Twitter, Tumblr, Instagram, etc. This was all in an attempt to have my data rendered as ‘Tidy’, in the Hadley Wickham sense, automatically.

The disadvantage of Zapier is that there is a charge. I was collecting over a short period of time so I was able to keep this cost quite low. If you are collecting a lot of data over a sustained period of time, you may find it prohibitively expensive.

Let me know if you would like to know more. I am happy to share my rough workflow notes with you, or can show you in Berlin during the conference.

Kind regards
Craig
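The reformatting step Craig describes (map each service's field names onto common columns, convert the timestamps to one format, and label each row with its source) can be sketched in Python roughly as follows. This is not Zapier's mechanism, only an illustration of the idea. The text and body field names come from Craig's message; the other field names, the date formats, and the function name are assumptions made for the example.

```python
from datetime import datetime, timezone

# Per-service mapping onto common columns. Only "text" (Twitter) and
# "body" (Tumblr) are taken from the message above; the user and date
# fields and both date formats are assumptions for illustration.
FIELD_MAPS = {
    "twitter": {
        "text": "text",
        "user": "screen_name",
        "date": "created_at",
        "date_format": "%a %b %d %H:%M:%S %z %Y",
    },
    "tumblr": {
        "text": "body",
        "user": "blog_name",
        "date": "date",
        "date_format": "%Y-%m-%d %H:%M:%S GMT",
    },
}


def tidy_row(service, record):
    """Map one raw record onto the common columns, one observation per row."""
    fields = FIELD_MAPS[service]
    parsed = datetime.strptime(record[fields["date"]], fields["date_format"])
    if parsed.tzinfo is None:
        # Formats without an offset are treated as UTC here (an assumption).
        parsed = parsed.replace(tzinfo=timezone.utc)
    return {
        "service": service,  # label saying where the row came from
        "username": record[fields["user"]],
        "timestamp": parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "text": record[fields["text"]],
    }
```

Rows produced this way all share the same four columns, so they can be appended to a single spreadsheet or CSV file regardless of the originating service.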
Hi Rainer,

As others already noted, it may be hard to find a single tool to achieve this. In addition to some previous suggestions, here is another solution to collect/download Instagram images related to a particular hashtag/location:

1) Use Netlytic (https://netlytic.org) to collect Instagram posts/comments (and their associated metadata, including image URLs); there is a video tutorial on how to use it at https://netlytic.org/home/?page_id=11280

2) Export the dataset collected with Netlytic to your computer (as a CSV file) and open it in Excel;

3) Copy and paste the "medialink" column into a new text file (let's call it "imgurls.txt");
Note: since Netlytic collects all comments related to your search query (hashtag or location), there may be multiple comments posted in response to a single image/post, so you can use Excel's Data->"Remove Duplicates" feature to remove duplicate image URLs;

4) Open this text file "imgurls.txt" in Firefox (Ctrl+O), and then use the DownThemAll plugin in Firefox (right click on the page -> context menu) to download all of the links/images into a single folder on your computer (you need to install the DownThemAll plugin first from https://addons.mozilla.org/en-US/firefox/addon/downthemall/developers ).

Hope it helps,
Anatoliy
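For anyone who prefers a script to the Excel copy/paste in Anatoliy's steps 2) and 3), here is a minimal Python sketch of the same idea: read the exported CSV, keep each "medialink" value once, and write the result to "imgurls.txt". The column name and output file name come from Anatoliy's message; the function names and the UTF-8 encoding of the export are assumptions.

```python
import csv


def unique_media_links(csv_path, column="medialink"):
    """Return the distinct, non-empty values of one CSV column, in file order."""
    seen = set()
    links = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            link = (row.get(column) or "").strip()
            # Several comments can point at the same image, so keep each URL once.
            if link and link not in seen:
                seen.add(link)
                links.append(link)
    return links


def write_url_list(links, out_path="imgurls.txt"):
    """Write one URL per line, ready to open in a browser or download manager."""
    with open(out_path, "w", encoding="utf-8") as handle:
        for link in links:
            handle.write(link + "\n")
```

Calling write_url_list(unique_media_links("export.csv")) with the path of your own export (the file name here is only a placeholder) produces the text file used in step 4).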
participants (3)
- Anatoliy
- Craig Hamilton
- Maurice Vergeer