[HN Gopher] Most popular links in Hacker News comments, 2006-2015
       ___________________________________________________________________
        
       Most popular links in Hacker News comments, 2006-2015
        
       Author : simonebrunozzi
       Score  : 58 points
       Date   : 2020-11-12 20:13 UTC (2 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | zuhayeer wrote:
       | Is it possible to do this for 2015 until now as well?
        
       | dgritsko wrote:
       | If you're like me, and immediately clicked on all the XKCD links:
       | 
       | #3. http://xkcd.com/927/
       | 
       | #11. http://xkcd.com/386/
       | 
       | #12. http://xkcd.com/538/
       | 
       | #46. http://xkcd.com/936/
       | 
       | #60. http://xkcd.com/327/
       | 
       | #64. http://xkcd.com/605/
       | 
       | #69. http://xkcd.com/378/
       | 
       | #78. http://xkcd.com/810/
       | 
       | #91. http://xkcd.com/1053/
        
         | sleavey wrote:
         | I'm quite surprised #3 has only been mentioned 197 times (well,
         | a few more after this thread) in 14 years.
        
         | deeg wrote:
         | #11 literally changed my life. I used to debate people online
         | and get irrationally upset when I couldn't change their
         | opinions. Reading that xkcd was like getting a dope slap. I
         | still debate but I rarely let myself get upset and if I do I
         | try to hold that in my mind.
        
       | macieklaskus wrote:
       | Very cool! It's already useful, but you could make it even more
       | so by allowing sorting by smaller time ranges (e.g. 1 year).
       | 
       | It would also be interesting to see a version of this list
       | weighted by karma scores of users who posted the links.
       | 
       | Edit: even better, use your own h-index ranking
       | https://github.com/antontarasenko/smq/blob/master/reports/ha...
        
       | saagarjha wrote:
       | If I ever needed to summarize Hacker News into a single list,
       | this would absolutely be it.
        
       | minimaxir wrote:
       | A reminder that BigQuery (as used in the query in this link) is
       | the best way to play with Hacker News data; don't scrape HN data
       | manually!
       | 
       | The `bigquery-public-data.hacker_news.full` table appears to be
       | up to date with the most recent HN data as well (table last
       | updated today).
       | 
       | However, I'm not 100% sure the query is correct for unilaterally
       | getting all links, as running the query on the full dataset
       | returns the same results as running it from 2006-2015. And I
       | value my sanity enough to not fuss around with the regex.
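
As a rough illustration of the link-counting idea discussed above (not the query from the linked repository), here is a minimal Python sketch that pulls URLs out of comment text with a regular expression and tallies them. The pattern `URL_RE`, the helpers `extract_links` and `count_links`, and the sample comments are all assumptions made for illustration; a real run would read comment text from the BigQuery table instead.

```python
import re
from collections import Counter

# Hypothetical pattern: matches http(s) URLs up to the next whitespace,
# quote, or angle bracket. The regex in the original query may differ.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def extract_links(text):
    """Return the URLs found in one comment, minus trailing punctuation.

    Stripping a trailing ')' is a simplification: it would also damage
    URLs that legitimately end in a parenthesis.
    """
    return [m.rstrip(".,;:!?)") for m in URL_RE.findall(text)]

def count_links(comments):
    """Count how many comments mention each URL (at most once per comment)."""
    counts = Counter()
    for text in comments:
        counts.update(set(extract_links(text)))
    return counts

# Made-up sample comments, for illustration only.
sample = [
    "Obligatory: http://xkcd.com/927/",
    "See http://xkcd.com/927/ and http://xkcd.com/386/.",
    "No links here.",
]
top = count_links(sample).most_common()
```

Counting each URL once per comment is a design choice of this sketch; counting every occurrence would only require dropping the `set(...)` call.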
        
         | bigdict wrote:
         | By the way, what is the best way to download this dataset? Last
         | time I messed with it I had to pay for a Google Cloud bucket
         | and run through some awkward sequence of steps to eventually
         | get a local copy.
        
           | minimaxir wrote:
           | That's essentially it (export the BQ table as a CSV to a
           | Google Cloud Storage bucket, then download it from there),
           | but you can do that entirely in the web UI, no CLI needed.
        
       | simonsarris wrote:
       | Funny that searchyc.com was so necessary for so long, coming in
       | at #54 _and_ #68 (it seems it should be higher as these should be
       | combined). Now it just redirects to a spam/ad website, but
       | before HN had a search bar it was very useful.
       | 
       | Also interesting that it contains at #55:
       | https://news.ycombinator.com/best
       | 
       | But not the ostensibly more useful:
       | https://news.ycombinator.com/active
       | 
       | The URL "u.ly/73I" at #63 is very interesting: it's seen almost
       | nowhere else on the web (at least on Google) and is apparently
       | spam now. For that matter, when you click on its mentions you
       | get:
       | 
       | > We found no comments matching u.ly/73I
       | 
       | What's the deal with that one? Was it spam comments that all got
       | removed? It may be since some of the others were spam, like this
       | one: https://goo.gl/l5v0b
       | 
       | It's impressive how little spam spam (as opposed to submarines,
       | this is where I link to PG's essay) there is on HN.
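
The point about searchyc.com appearing at two ranks suggests merging URL variants before ranking. The sketch below is one hedged way to do that in Python. The `canonical` and `merge_counts` helpers are hypothetical, the exact variants that appeared in the list are not known from the thread, and the counts are invented purely for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

def canonical(url):
    """Reduce a URL to a comparison key.

    The key is the lowercased host without a leading 'www.', followed by
    the path without a trailing slash. Scheme, query, and fragment are
    dropped, which is a simplification that may merge too aggressively.
    """
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def merge_counts(raw_counts):
    """Sum the counts of all URLs that share a canonical key."""
    merged = Counter()
    for url, n in raw_counts.items():
        merged[canonical(url)] += n
    return merged

# Invented counts; only the URL shapes matter here.
raw = {
    "http://searchyc.com/": 10,
    "http://www.searchyc.com": 7,
    "https://news.ycombinator.com/best": 5,
}
merged = merge_counts(raw)
```

With these made-up numbers the two searchyc.com variants collapse into a single entry whose count is the sum of both.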
        
         | duckerude wrote:
         | The u.ly link seems to have been spam for shoes. The Wayback
         | Machine captured it:
         | https://web.archive.org/web/20110319115346/http://u.ly/73I
        
         | pgt wrote:
         | HN has a search bar? _scrolls down_ Oh, wow! I always use
         | hn.algolia.com.
        
       | anonytrary wrote:
       | Nice idea! Note the scraper seems to have produced a duplicate
       | entry:
       | 
       |     1. en.wikipedia.org/wiki/Betteridges_Law_of_Headlines
       |     2. en.wikipedia.org/wiki/Betteridge%2527s_law_of_headlines
       | 
       | The second is a bad link, so I am curious how that link got
       | shared so much.
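
One plausible reading of the second entry is double percent-encoding: `%25` is an encoded `%`, so `%2527` decodes first to `%27` and then to an apostrophe. Whether that is how the link ended up in the list is an assumption. The Python sketch below only demonstrates the decoding itself, and `fully_unquote` is a hypothetical helper.

```python
from urllib.parse import unquote

def fully_unquote(url, max_rounds=5):
    """Percent-decode repeatedly until the string stops changing.

    max_rounds is an arbitrary safety cap chosen for this sketch.
    """
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    return url

bad = "en.wikipedia.org/wiki/Betteridge%2527s_law_of_headlines"
once = unquote(bad)         # one round leaves an encoded apostrophe (%27)
twice = fully_unquote(bad)  # a second round yields the literal apostrophe
```

Even after full decoding, this entry still differs from the first one in case and in the apostrophe, so decoding alone would not merge the two.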
        
       ___________________________________________________________________
       (page generated 2020-11-12 23:02 UTC)