The API depth limit CAN be circumvented, but it is quite tedious to collect the full range of postings. It is probably easier and faster to make friends with someone who has already done the scraping and can run a query on their behalf. My own archive spans from Reddit's launch to about a month ago. I did collect upvotes and downvotes, but those numbers are always a bit fuzzy. I can query the top X most downvoted submissions for them if they want, but they should keep X to a reasonable number, as there are over a million submissions to r/politics.
Happy to chat further off the list,
Kelly
On Mon, Oct 21, 2013 at 2:40 PM, Alex Leavitt <alexleavitt@gmail.com> wrote:
To follow up with Kyle's comment, you can also use the API to get profile information (though in my attempts I've also set up a web scraper to get user profile info like karma scores, date joined, and awarded shields). Still, the API only allows you to see the user's most recent 1000 posts/comments.
Alexander Leavitt
PhD Student
USC Annenberg School for Communication & Journalism
http://alexleavitt.com
Twitter: @alexleavitt <http://twitter.com/alexleavitt>
On Mon, Oct 21, 2013 at 11:36 AM, Kyle Kontour <kkontour@gmail.com> wrote:
Alex is spot on. Also I would suggest that in /r/politics it may be fruitful to check the individual histories of the "controversial" posts and/or any downvoted posts within any of those threads: unpopular views/posts seem likely to me to be posted either by people who hold consistently unpopular views, or who are trolling (the difference may not be clear). That might give you some metadata to go on.
On Mon, Oct 21, 2013 at 12:29 PM, Alex Leavitt <alexleavitt@gmail.com> wrote:
I've done a lot of work recently with reddit data. It's actually a difficult issue of sampling unless you have access to their backend (which I believe nobody has ever gotten). Unfortunately what you're asking is near impossible.
One thing you can do is sort by "controversial" (though this isn't "most downvoted"; it only surfaces the posts with the most even ratio of upvotes to downvotes). Then you can collect via the API the "top 1000" posts (or here, the most controversial) for various date ranges (all time, past year, past month, today, etc.).
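The collection step described above could be sketched roughly as follows. This is a minimal illustration, not code from anyone in this thread: it assumes reddit's public JSON listing at /r/<subreddit>/controversial.json with the `t`, `limit`, and `after` query parameters, and the helper names, the User-Agent string, and the 100-per-page size are choices made for the example. The 1000-post cap mirrors the limit mentioned in this thread.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://www.reddit.com"
# Time windows accepted by the listing's "t" parameter (assumed for this sketch).
TIME_WINDOWS = ["all", "year", "month", "week", "day", "hour"]


def listing_url(subreddit, window, after=None, limit=100):
    """Build the URL for one page of the 'controversial' listing."""
    params = {"t": window, "limit": limit}
    if after:
        params["after"] = after  # pagination cursor returned by the previous page
    return f"{BASE}/r/{subreddit}/controversial.json?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Fetch one page over HTTP; the User-Agent value is a placeholder."""
    req = urllib.request.Request(url, headers={"User-Agent": "air-l-example/0.1"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def collect_controversial(subreddit, window, fetch=fetch_json, cap=1000):
    """Page through the listing until it runs out or the cap is reached."""
    posts, after = [], None
    while len(posts) < cap:
        page = fetch(listing_url(subreddit, window, after))["data"]
        posts.extend(child["data"] for child in page["children"])
        after = page.get("after")
        if not after or not page["children"]:
            break  # no further pages available
    return posts[:cap]
```

Calling `collect_controversial("politics", w)` for each `w` in `TIME_WINDOWS` and merging the results by post id would give the union of the per-window samples. A real run should also pause between requests to respect rate limits.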
As for "near" impossible: if you're interested in a sub other than /r/politics, you can start scraping every single post from a newly created subreddit, so that you have the full population of posts and their voting scores. But with a long-running, active sub like /r/politics, you can't get past the API limitations.
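Once a full population of posts has been collected this way, the original "most downvoted" question reduces to a ranking over the stored records. A minimal sketch, assuming each post is a dict with a `downs` count (the field name and the `most_downvoted` helper are assumptions for illustration, and the vote counts themselves are only approximate):

```python
import heapq


def most_downvoted(posts, n=25):
    """Return the n posts with the largest 'downs' count, highest first.

    Posts missing the (assumed) 'downs' field are treated as having 0 downvotes.
    """
    return heapq.nlargest(n, posts, key=lambda p: p.get("downs", 0))
```

For example, `most_downvoted(all_posts, 100)` would return the 100 most downvoted submissions without sorting the whole collection.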
Alex
On Mon, Oct 21, 2013 at 11:23 AM, jose marichal <marichal@callutheran.edu> wrote:
Colleagues,
A student of mine is doing a project looking at the most downvoted articles in the sub-reddit "r/politics." Does anyone know of a way to identify the most "downvoted" links on the site?
Many thanks, Jose
_______________________________________________________________________________________
josé marichal, ph.d. | associate professor | political science department | california lutheran university <http://about.me/marichal>
60 w. olsen road | #3800 | thousand oaks, ca 91360
805-493-3328
_______________________________________________
The Air-L@listserv.aoir.org mailing list is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers: http://www.aoir.org/