[HN Gopher] Show HN: Marginalia - Exploration Mode
       ___________________________________________________________________
        
       Show HN: Marginalia - Exploration Mode
        
       I've been a bit obsessed with the idea of flipping through the
       internet a bit like you would a magazine, of undirected browsing as
       a discovery mechanism, and I think I'm approaching something that's
       beginning to feel pretty fun.

       The link at the top will return results out of a pool of
       approximately 10,000 domains; you can refresh to get new ones. You
       can also explore in a directed fashion by using the 'Similar
       Domains' buttons. These are not random.

       A sampler, beyond the random sites offered with the head link:

       https://search.marginalia.nu/explore/www.amiga-news.de
       https://search.marginalia.nu/explore/www.aaronsw.com
       https://search.marginalia.nu/explore/therealbitcoin.org

       I don't have thumbnails for all 500k domains in the database yet,
       but I think it's getting to a number where it's reasonably useful.
        
       Author : marginalia_nu
       Score  : 161 points
       Date   : 2022-01-23 16:29 UTC (6 hours ago)
        
 (HTM) web link (search.marginalia.nu)
        
       | ancientsofmumu wrote:
       | This feels like what StumbleUpon was (a positive correlation :) )
       | -- would you be willing to add what criteria "similar" is based
       | on in the info box upper left? For example I have no clue what
       | this below domain is, so would be curious as to what the
       | algorithm uses as "similar" to show me more (keywords? links?
       | domain names? hosting providers? tech used? country located?
       | etc.) https://search.marginalia.nu/explore/cblgh.org
        
         | marginalia_nu wrote:
         | It's mostly adjacency in the link graph. I use a mix of direct
         | neighbors and Personalized PageRank to produce the list.
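
A minimal sketch of how "direct neighbors plus Personalized PageRank" could
produce a similar-domains list. This is an illustration under assumptions, not
Marginalia's actual code: the function names, the damping factor, and the
rule of doubling the score of direct neighbors are all hypothetical.

```python
def personalized_pagerank(links, seeds, damping=0.85, iterations=50):
    """Power iteration where the random jump lands only on the seed domains."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    nodes |= set(seeds)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = links.get(src, ())
            if targets:
                # Split this node's rank evenly across its outgoing links.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: hand its rank back to the seeds.
                for n in nodes:
                    nxt[n] += damping * rank[src] * teleport[n]
        rank = nxt
    return rank


def similar_domains(links, domain, top_n=5):
    """Rank other domains by PageRank personalized to `domain`."""
    rank = personalized_pagerank(links, {domain})
    # Direct neighbors: domains linked from, or linking to, `domain`.
    neighbors = set(links.get(domain, ()))
    neighbors |= {src for src, targets in links.items() if domain in targets}
    # Hypothetical mixing rule: double the score of direct neighbors.
    scored = {
        d: r * (2.0 if d in neighbors else 1.0)
        for d, r in rank.items()
        if d != domain
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_n]
```

Here `links` maps each domain to the list of domains it links to. Because the
random jump always returns to the queried domain, scores fall off with
distance in the link graph, which matches the "adjacency" description above.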
        
       | arendtio wrote:
       | Feels pretty cool, more like the traditional internet.
       | 
       | After a few minutes I found that I would prefer a page that is
       | not left-aligned, something like
       |
       |     #article {
       |       margin: 0 auto;
       |     }
       | 
       | A minor change that makes it much more comfortable to use IMHO.
        
         | marginalia_nu wrote:
         | Hmm, are you on mobile, desktop? What's your browser and screen
         | resolution?
         | 
         | I did relatively recently redesign the whole stylesheet, so
         | there's probably a few minor problems to iron out.
        
           | arendtio wrote:
           | Desktop, Firefox 3840x2160 with window.devicePixelRatio = 1.5
           | 
           | So I run into the max-width of 160ch (which feels good), but
           | I have a lot of whitespace on the right.
        
             | marginalia_nu wrote:
             | Hmm, yeah. I think I see what you mean. Good call. I've
             | pushed a new CSS.
        
               | arendtio wrote:
               | Cool, it looks great now. Thanks :-)
        
       | laputan_machine wrote:
       | This is really cool, it's like being back in 1998 again when
       | browsing the internet was exciting, bookmarked! I'll be exploring
       | this for a while I think
        
       | gbuk2013 wrote:
       | I don't get it - I clicked on about 10 sites and none of them
       | look anything like the screenshot picture?
        
         | marginalia_nu wrote:
         | I wanted to provide an example of what the content of the
         | websites looks like, which you'll rarely find on the front page.
         | So the screenshots are of URLs that are actually indexed by my
         | search index. If you use the 'Info' link you can usually find
         | the particular page. On the flip side, actually linking to
         | those URLs may land you on a privacy policy or some weird deep
         | link.
         | 
         | Dunno, maybe it's a confusing choice.
        
       | broahmed wrote:
       | This is really cool. I have 7 tabs of quirky barely known
       | websites open after maybe less than 5 minutes of interacting with
       | Marginalia. This is so much fun!
        
         | marsa wrote:
         | any final fantasy series related fansites? for me there's
         | always at least 1-2 on the list whenever i reload.
        
           | marginalia_nu wrote:
           | Haha, the fansite-sphere is one of like 6-7 hotspots the
           | random function favors. Maybe there's a smidge too many
           | right now, but I've tried to get a good mix of bits and bobs
           | with hopefully a little bit for everyone.
        
       | MayeulC wrote:
       | It's awesome! I like most suggestions, it feels a bit like
       | https://wiby.me/surprise but generally less weird.
       | 
       | I've often wanted to have a go at making my own search engine,
       | and I think I would penalize any form of advertising (especially
       | big ad networks, referral links) or tracking (Google Analytics,
       | etc.) as these can create (or reveal) perverse incentives. This
       | would likely get rid of most of the "SEO spam" that we see
       | nowadays. Reading the about page[1], this seems like what you are
       | doing here, but I can't really tell as it's light on details.
       | 
       | Q: would this be able to handle foreign-language sites? I don't
       | yet have a blog/personal website, but if I did, I guess it would
       | be mixed-language. Should I submit some of my friends' blogs,
       | even though they might not be entirely (or at all) written in
       | English?
       | 
       | A relatively new sort of search-engine junk, especially visible
       | in non-English results from big search engines is also auto-
       | generated (or probably machine-translated) websites, full of
       | nonsensical content. They might be translated from genuine sites
       | in other languages, I'm not sure. It would seem hard to fend
       | these off, but luckily, fighting perverse incentives such as
       | advertisement revenue probably gets rid of them too.
       | 
       | I also wondered if this was a curated list, and if the list was
       | available somewhere, but it seems it's just a good old spider,
       | and I guess that exposing too much info about the metrics might
       | enable some to game the system? Not that marginalia is big enough
       | to make it an attractive target, of course!
       | 
       | [1]: https://memex.marginalia.nu/projects/edge/about.gmi
        
         | marginalia_nu wrote:
         | I'm keeping a few of the details intentionally sketchy, but in
         | general, I do think it's relatively resilient to manipulation.
         | I'm using a Personalized PageRank which uses the opinions of a
         | secret subset of websites to calculate a ranking. I've also
         | selected those websites to not be particularly likely to be
         | bribed.
         | 
         | Bilingual sites should be fine, I think. It will reject
         | individual pages that don't have enough English text on them,
         | but as long as it finds pages with English relatively easily
         | they ought to get indexed.
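
One way a per-page "enough English text" check could work is to measure what
share of a page's words are very common English words. The sketch below is
only a guess at the general idea; the word list, the 0.2 threshold, and the
function names are assumptions and are not taken from Marginalia.

```python
import re

# A tiny list of very common English words (hypothetical, for illustration).
COMMON_ENGLISH = frozenset(
    "the of and to a in is that it for was on with as be at by this "
    "have from or an not are but".split()
)


def english_ratio(text):
    """Fraction of the words in `text` that are common English words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in COMMON_ENGLISH for w in words) / len(words)


def looks_english(text, threshold=0.2):
    """Accept a page only if enough of its words are common English words."""
    return english_ratio(text) >= threshold
```

Applied per page rather than per site, a check like this would let a
mixed-language blog stay in the index through its English pages while the
other pages are skipped.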
        
       | joebob42 wrote:
       | It's gem after gem after gem, this is brilliant
        
       | marsa wrote:
       | > undirected browsing as a discovery mechanism
       | 
       | this seems to be a thoroughly underrated way of discovery and
       | it's so disappointing when websites focus solely on search,
       | forcing users to know and articulate what they came for.
       | 
       | browsing through these random sites you list is indeed a very fun
       | -- and liberating -- experience. thank you for putting this
       | together!
        
         | marginalia_nu wrote:
         | I do think a lot of recommendation algorithms these days are
         | a bit too good at finding things that are similar to what we
         | like. Which means you never discover things that you'll like,
         | but are not similar to the things you've tried before. It
         | becomes incredibly samey after a while.
         | 
         | The great joy of, say, flipping through a magazine or browsing
         | a library is that they are passive, and don't know who you are,
         | and can't adapt what you read based on what you're likely to
         | read. So you might read something unexpected, you might
         | discover something you didn't even know about yourself.
        
           | marsa wrote:
           | thing is they're often really really bad too. e.g. when i
           | 'explore' albums on youtube music it force feeds me a limited
           | selection of new releases based on popularity, perceived
           | genre preference, and my geographic location, probably some
           | other stuff as well. less than 5% of those recommendations
           | end up being of any interest to me.
           | 
           | meanwhile all i really desire is a full list of releases
           | ordered by date and just let me sift through that myself, but
           | there seems to be no way to get that list, at least not
           | through the regular user interface.
           | 
           | it's very frustrating.
        
             | blowski wrote:
             | For the last few months, I've consumed every piece of media
             | reviewed by the FT Weekend. It's been a mixed bag, but I've
             | made some wonderful discoveries.
        
               | marsa wrote:
               | thanks for the suggestion -- is this what you're referring
               | to? https://www.ft.com/arts/music/albums
               | 
               | i'll probably add it to my sources, but still a
               | comprehensive list of new releases would be a dream come
               | true.
        
         | ncpa-cpl wrote:
         | > browsing through these random sites you list is indeed a very
         | fun -- and liberating -- experience. thank you for putting this
         | together!
         | 
         | Yeah! I really liked the concept too.
         | 
         | This reminds me of the early Stumble Upon or even channel
         | surfing cable tv back when it was analog!
        
           | marginalia_nu wrote:
           | I really liked StumbleUpon before it kinda turned to shit. I
           | also kinda miss the feeling of not having everything be tuned
           | for user engagement. It's a big part of why there are no
           | vote-arrows, thumbs up, stars, et cetera involved here. You
           | shake the snow globe and get what you get.
        
       | phendrenad2 wrote:
       | Hey this works great. Found some new and interesting sites.
        
       | slx26 wrote:
       | Quirky sites alleviate my disdain for humanity, somewhat. Thanks.
        
       | rixed wrote:
       | It looks like a book shop and I like the idea.
       | 
       | A nitpick though: Shouldn't the "capture in progress" pages be
       | excluded from the random search?
        
         | marginalia_nu wrote:
         | Yeah, the whole thing isn't super polished, still a work in
         | progress. There's also a few thumbnails that were captured mid-
         | loading I'd like to improve down the line.
         | 
         | Right now it's a mix between domains I simply haven't captured
         | a thumbnail for yet, and domains that for some reason won't be
         | captured (errors, etc). Once I reduce the first category, I'll
         | look for a way of hiding the second category.
        
       | was_a_dev wrote:
       | I almost kinda didn't bother because of the cloudflare DDoS
       | protection. I know that can be petty, but I wouldn't have waited
       | if it was from a Google results page for example.
        
         | marginalia_nu wrote:
         | I just turned it up a notch right now for a moment, a lot of
         | people are really aggressively bot-scraping new HN submissions
         | for whatever reason. It's like a minor DoS every time you
         | submit a link.
         | 
         | That's fine for a blog I guess, but here I perform a non-
         | trivial calculation for each request, so I'd rather not have
         | bot spam. (This is hosted on a computer in my living room, so I
         | can't just scale it up)
        
           | Nextgrid wrote:
           | > I perform a non-trivial calculation for each request
           | 
           | Any reason why caching wouldn't work here? Do the results
           | have to be different on each request instead of being cached
           | for a short while (10 seconds)?
        
             | marginalia_nu wrote:
             | Oh yeah, you could probably do some sort of caching to that
             | effect. This is just a fun toy I hacked together, so it's
             | not super optimized.
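
A minimal sketch of the short-lived caching suggested above: compute a result
once, then serve the stored copy until it is 10 seconds old. The class and
method names are hypothetical, and nothing here reflects how the site is
actually built.

```python
import time


class TTLCache:
    """Keeps each computed value for a fixed number of seconds."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store = {}    # key -> (time stored, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: skip the expensive calculation
        value = compute()
        self._store[key] = (now, value)
        return value
```

Keyed by domain, something like
`cache.get_or_compute(domain, lambda: expensive_lookup(domain))` would run the
expensive step at most once per domain per 10 seconds, however many bots
request the same page. `expensive_lookup` stands in for whatever the real
per-request calculation is. Entries are never evicted in this sketch, so a
real deployment would also want a size limit.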
        
           | was_a_dev wrote:
           | I mean fair enough. I gave you the time since I came from HN
           | and knew the risk/reward for good content was strong.
           | 
           | If it was more than a toy, it would need to be less aggressive.
        
           | kreeben wrote:
           | >> This is hosted on a computer in my living room
           | 
           | Since you serve Swedish weather info from marginalia I'm
           | assuming you live in Sweden, is that correct? Could you very
           | briefly explain how you host and serve pages from your living
           | room and what your bandwidth is?
           | 
           | Does your ISP get cranky when you see DOS type of traffic?
           | 
           | I'm a fellow hobbyist search engine dev, also from Sweden.
           | Whenever I demonstrate my search engine by hosting in the
           | cloud the expenses get so big I have to go offline after a
           | short while and I've therefore been contemplating personal,
           | living room hosting.
        
             | marginalia_nu wrote:
             | > Since you serve Swedish weather info from marginalia I'm
             | assuming you live in Sweden, is that correct?
             | 
             | It is indeed.
             | 
             | > Could you very briefly explain how you host and serve
             | pages from your living room and what your bandwidth is?
             | 
             | 100/100 Mbit municipal broadband, through Bahnhof.
             | 
             | > Does your ISP get cranky when you see DOS type of
             | traffic?
             | 
             | Haven't heard a word from them, although you'd be surprised
             | how far I am from saturating my line. Your average
             | bittorrent enthusiast probably uses a lot more. I do try to
             | not be a nuisance though. Cloudflare helps take the edge
             | off things, as does running a local DNS cache.
             | 
             | > I'm a fellow hobbyist search engine dev, also from
             | Sweden. Whenever I demonstrate my search engine by hosting
             | in the cloud the expenses get so big I have to go offline
             | after a short while and I've therefore been contemplating
             | personal, living room hosting.
             | 
             | You might also consider server rental. Can get away with
             | SEK 2-4k/month. My server, including UPS and other expenses
             | is like SEK 40k, plus I expect to burn through an SSD once
             | a year or so.
        
               | kreeben wrote:
               | Very helpful, thank you. Which part of Sweden are you in,
               | by the way?
        
               | marginalia_nu wrote:
               | Up north.
        
       | [deleted]
        
       | gillesjacobs wrote:
       | You posted on related topics a few weeks back with your
       | Marginalia projects and I spent an hour browsing your sites.
       | Making the "small web" and its creative weirdness visible again
       | pulls on my nostalgia strings. Good work!
        
       | 1vuio0pswjnm7 wrote:
       | The <h1> banner is "Search the internet" but are we only
       | searching www servers?
       | 
       | Can we use marginalia.nu to search for servers offering other
       | protocols like ftp?
        
         | marginalia_nu wrote:
         | It's been my ambition to support Gemini and Gopher down the
         | line.
        
       ___________________________________________________________________
       (page generated 2022-01-23 23:00 UTC)