[HN Gopher] Most popular links in Hacker News comments, 2006-2015
___________________________________________________________________
Most popular links in Hacker News comments, 2006-2015

Author : simonebrunozzi
Score : 58 points
Date : 2020-11-12 20:13 UTC (2 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| zuhayeer wrote:
| Is it possible to do this for 2015 until now as well?

| dgritsko wrote:
| If you're like me, and immediately clicked on all the XKCD links:
|
| #3. http://xkcd.com/927/
|
| #11. http://xkcd.com/386/
|
| #12. http://xkcd.com/538/
|
| #46. http://xkcd.com/936/
|
| #60. http://xkcd.com/327/
|
| #64. http://xkcd.com/605/
|
| #69. http://xkcd.com/378/
|
| #78. http://xkcd.com/810/
|
| #91. http://xkcd.com/1053/

  | sleavey wrote:
  | I'm quite surprised #3 has only been mentioned 197 times (well,
  | a few more after this thread) in 14 years.

  | deeg wrote:
  | #11 literally changed my life. I used to debate people online
  | and get irrationally upset when I couldn't change their
  | opinions. Reading that xkcd was like getting a dope slap. I
  | still debate but I rarely let myself get upset and if I do I
  | try to hold that in my mind.

| macieklaskus wrote:
| Very cool! It's already useful, but you could make it even more
| so by allowing sorting by smaller time ranges (e.g. 1 year).
|
| It would also be interesting to see a version of this list
| weighted by karma scores of users who posted the links.
|
| Edit: even better, use your own h-index ranking
| https://github.com/antontarasenko/smq/blob/master/reports/ha...

| saagarjha wrote:
| If I ever needed to summarize Hacker News into a single list,
| this would absolutely be it.

| minimaxir wrote:
| A reminder that BigQuery (as used in the query in this link) is
| the best way to play with Hacker News data; don't scrape HN data
| manually!
|
| The `bigquery-public-data.hacker_news.full` table appears to be
| up to date with the most recent HN data as well (table last
| updated today).
|
| However, I'm not 100% sure the query is correct for unilaterally
| getting all links, as running the query on the full dataset
| returns the same results as running it from 2006-2015. And I
| value my sanity enough to not fuss around with the regex.

  | bigdict wrote:
  | By the way, what is the best way to download this dataset? Last
  | time I messed with it I had to pay for a Google Cloud bucket
  | and run through some awkward sequence of steps to eventually
  | get a local copy.

    | minimaxir wrote:
    | That's essentially it (export the BQ table as a CSV to a
    | Google Cloud Storage bucket, then download it from there),
    | but you can do that entirely in the web UI, no CLI needed.

| simonsarris wrote:
| Funny that searchyc.com was so necessary for so long, coming in
| at #54 _and_ #68 (it seems it should be higher as these should be
| combined). Now it just redirects to a spam/ad website, but
| before HN had a search bar it was very useful.
|
| Also interesting that it contains at #55:
| https://news.ycombinator.com/best
|
| But not the ostensibly more useful:
| https://news.ycombinator.com/active
|
| The url "u.ly/73I" #63 is very interesting, it's not seen almost
| anywhere else on the web (at least on Google), and is apparently
| spam, now, and when you click on mentions for that matter you
| get:
|
| > We found no comments matching u.ly/73I
|
| What's the deal with that one? Was it spam comments that all got
| removed? It may be since some of the others were spam, like this
| one: https://goo.gl/l5v0b
|
| It's impressive how little spam spam (as opposed to submarines,
| this is where I link to PG's essay) is on HN.

  | duckerude wrote:
  | The u.ly link seems to have been spam for shoes. The wayback
  | machine captured it:
  | https://web.archive.org/web/20110319115346/http://u.ly/73I

  | pgt wrote:
  | HN has a search bar? _scrolls down_ Oh, wow! I always use
  | hn.algolia.com.

| anonytrary wrote:
| Nice idea! Note the scraper seems to have produced a duplicate
| entry: 1.
| en.wikipedia.org/wiki/Betteridges_Law_of_Headlines 2.
| en.wikipedia.org/wiki/Betteridge%2527s_law_of_headlines
|
| The second is a bad link, so I am curious how that link got
| shared so much.
___________________________________________________________________
(page generated 2020-11-12 23:02 UTC)