[HN Gopher] Show HN: Marginalia - Exploration Mode
___________________________________________________________________

Show HN: Marginalia - Exploration Mode

I've been a bit obsessed with the idea of flipping through the
internet a bit like you would a magazine, of undirected browsing as a
discovery mechanism, and I think I'm approaching something that's
beginning to feel pretty fun.

The link at the top will return results out of a pool of approximately
10,000 domains; you can refresh to get new ones.

You can also explore in a directed fashion by using the 'Similar
Domains'-buttons. These are not random.

A sampler, beyond the random sites offered with the head link:

https://search.marginalia.nu/explore/www.amiga-news.de
https://search.marginalia.nu/explore/www.aaronsw.com
https://search.marginalia.nu/explore/therealbitcoin.org

I don't have thumbnails for all 500k domains in the database yet, but
I think it's getting to a number where it's reasonably useful.

Author : marginalia_nu
Score  : 161 points
Date   : 2022-01-23 16:29 UTC (6 hours ago)

(HTM) web link (search.marginalia.nu)
(TXT) w3m dump (search.marginalia.nu)

| ancientsofmumu wrote:
| This feels like what StumbleUpon was (a positive correlation :) )
| -- would you be willing to add what criteria "similar" is based
| on in the info box upper left? For example I have no clue what
| this below domain is, so would be curious as to what the
| algorithm uses as "similar" to show me more (keywords? links?
| domain names? hosting providers? tech used? country located?
| etc.) https://search.marginalia.nu/explore/cblgh.org

| marginalia_nu wrote:
| It's mostly adjacency in the link graph. I use a mix of direct
| neighbors and Personalized PageRank to produce the list.

| arendtio wrote:
| Feels pretty cool, more like the traditional internet.
|
| After a few minutes I found that I would prefer a page that is
| not left-aligned, something like
|
|     #article { margin: 0 auto; }
|
| A minor change that makes it much more comfortable to use IMHO.

| marginalia_nu wrote:
| Hmm, are you on mobile or desktop? What's your browser and screen
| resolution?
|
| I did relatively recently redesign the whole stylesheet, so
| there are probably a few minor problems to iron out.

| arendtio wrote:
| Desktop, Firefox 3840x2160 with window.devicePixelRatio = 1.5
|
| So I run into the max-width of 160ch (which feels good), but
| I have a lot of whitespace on the right.

| marginalia_nu wrote:
| Hmm, yeah. I think I see what you mean. Good call. I've
| pushed a new CSS.

| arendtio wrote:
| Cool, it looks great now. Thanks :-)

| laputan_machine wrote:
| This is really cool, it's like being back in 1998 again when
| browsing the internet was exciting, bookmarked! I'll be exploring
| this for a while I think

| gbuk2013 wrote:
| I don't get it - I clicked on about 10 sites and none of them
| look anything like the screenshot picture?

| marginalia_nu wrote:
| I wanted to provide an example of what the content of the
| websites looks like, which you'll rarely find on the front page.
| So the screenshots are of URLs that are actually indexed by my
| search index. If you use the 'Info' link you can usually find
| the particular page. On the flip side, actually linking to
| those URLs may land you on a privacy policy or some weird deep
| link.
|
| Dunno, maybe it's a confusing choice.

| broahmed wrote:
| This is really cool. I have 7 tabs of quirky barely known
| websites open after maybe less than 5 minutes of interacting with
| Marginalia. This is so much fun!

| marsa wrote:
| any final fantasy series related fansites? for me there's
| always at least 1-2 on the list whenever i reload.

| marginalia_nu wrote:
| Haha, the fansite-sphere is one of like 6-7 hotspots the
| random function favors.
| Maybe there's a smidge too many
| right now, but I've tried to get a good mix of bits and bobs
| with hopefully a little bit for everyone.

| MayeulC wrote:
| It's awesome! I like most suggestions, it feels a bit like
| https://wiby.me/surprise but generally less weird.
|
| I've often wanted to have a go at making my own search engine,
| and I think I would penalize any form of advertising (especially
| big ad networks, referral links) or tracking (Google Analytics,
| etc.) as these can create (or reveal) perverse incentives. This
| would likely get rid of most of the "SEO spam" that we see
| nowadays. Reading the about page[1], this seems like what you are
| doing here, but I can't really tell as it's light on details.
|
| Q: would this be able to handle foreign-language sites? I don't
| yet have a blog/personal website, but if I did, I guess it would
| be mixed-language. Should I submit some of my friends' blogs,
| even though they might not be entirely (or at all) written in
| English?
|
| A relatively new sort of search-engine junk, especially visible
| in non-English results from big search engines, is also
| auto-generated (or probably machine-translated) websites, full of
| nonsensical content. They might be translated from genuine sites
| in other languages, I'm not sure. It would seem hard to fend
| these off, but luckily, fighting perverse incentives such as
| advertisement revenue probably gets rid of them too.
|
| I also wondered if this was a curated list, and if the list was
| available somewhere, but it seems it's just a good old spider,
| and I guess that exposing too much info about the metrics might
| enable some to game the system? Not that marginalia is big enough
| to make it an attractive target, of course!
|
| [1]: https://memex.marginalia.nu/projects/edge/about.gmi

| marginalia_nu wrote:
| I'm keeping a few of the details intentionally sketchy, but in
| general, I do think it's relatively resilient to manipulation.
| I'm using a Personalized PageRank which uses the opinions of a
| secret subset of websites to calculate a ranking. I've also
| selected those websites to not be particularly likely to be
| bribed.
|
| Bilingual sites should be fine, I think. It will reject
| individual pages that don't have enough English text on them,
| but as long as it finds pages with English relatively easily
| they ought to get indexed.

| joebob42 wrote:
| It's gem after gem after gem, this is brilliant

| marsa wrote:
| > undirected browsing as a discovery mechanism
|
| this seems to be a thoroughly underrated way of discovery and
| it's so disappointing when websites focus solely on search,
| forcing users to know and articulate what they came for.
|
| browsing through these random sites you list is indeed a very fun
| -- and liberating -- experience. thank you for putting this
| together!

| marginalia_nu wrote:
| I do think a lot of recommendation algorithms these days are
| a bit too good at finding things that are similar to what we
| like. Which means you never discover things that you'll like,
| but are not similar to the things you've tried before. It
| becomes incredibly samey after a while.
|
| The great joy of, say, flipping through a magazine or browsing
| a library is that they are passive, and don't know who you are,
| and can't adapt what you read based on what you're likely to
| read. So you might read something unexpected, you might
| discover something you didn't even know about yourself.

| marsa wrote:
| thing is they're often really really bad too. e.g. when i
| 'explore' albums on youtube music it force feeds me a limited
| selection of new releases based on popularity, perceived
| genre preference, and my geographic location, probably some
| other stuff as well. less than 5% of those recommendations
| end up being of any interest to me.
|
| meanwhile all i really desire is a full list of releases
| ordered by date and just let me sift through that myself, but
| there seems to be no way to get that list, at least not
| through the regular user interface.
|
| it's very frustrating.

| blowski wrote:
| For the last few months, I've consumed every piece of media
| reviewed by the FT Weekend. It's been a mixed bag, but I've
| made some wonderful discoveries.

| marsa wrote:
| thanks for the suggestion -- is this what you're referring
| to? https://www.ft.com/arts/music/albums
|
| i'll probably add it to my sources, but still a
| comprehensive list of new releases would be a dream come
| true.

| ncpa-cpl wrote:
| > browsing through these random sites you list is indeed a very
| fun -- and liberating -- experience. thank you for putting this
| together!
|
| Yeah! I really liked the concept too.
|
| This reminds me of the early StumbleUpon or even channel
| surfing cable tv back when it was analog!

| marginalia_nu wrote:
| I really liked StumbleUpon before it kinda turned to shit. I
| also kinda miss the feeling of not having everything be tuned
| for user engagement. It's a big part of why there are no
| vote-arrows, thumbs up, stars, et cetera involved here. You shake
| the snow globe and get what you get.

| phendrenad2 wrote:
| Hey this works great. Found some new and interesting sites.

| slx26 wrote:
| Quirky sites alleviate my disdain for humanity, somewhat. Thanks.

| rixed wrote:
| It looks like a book shop and I like the idea.
|
| A nitpick though: Shouldn't the "capture in progress" pages be
| excluded from the random search?

| marginalia_nu wrote:
| Yeah, the whole thing isn't super polished, still a work in
| progress. There's also a few thumbnails that were captured
| mid-loading I'd like to improve down the line.
|
| Right now it's a mix between domains I simply haven't captured
| a thumbnail for yet, and domains that for some reason won't be
| captured (errors, etc).
| Once I reduce the first category, I'll
| look for a way of hiding the second category.

| was_a_dev wrote:
| I almost kinda didn't bother because of the cloudflare DDoS
| protection. I know that can be petty, but I wouldn't have waited
| if it was from a Google results page for example.

| marginalia_nu wrote:
| I just turned it up a notch right now for a moment, a lot of
| people are really aggressively bot-scraping new HN submissions
| for whatever reason. It's like a minor DoS every time you
| submit a link.
|
| That's fine for a blog I guess, but here I perform a
| non-trivial calculation for each request, so I'd rather not have
| bot spam. (This is hosted on a computer in my living room, so I
| can't just scale it up)

| Nextgrid wrote:
| > I perform a non-trivial calculation for each request
|
| Any reason why caching wouldn't work here? Do the results
| have to be different on each request instead of being cached
| for a short while (10 seconds)?

| marginalia_nu wrote:
| Oh yeah, you could probably do some sort of caching to that
| effect. This is just a fun toy I hacked together, so it's
| not super optimized.

| was_a_dev wrote:
| I mean fair enough. I gave you the time since I came from HN
| and knew the risk/reward for good content was strong.
|
| If it was more than a toy, it would need to be less aggressive.

| kreeben wrote:
| >> This is hosted on a computer in my living room
|
| Since you serve Swedish weather info from marginalia I'm
| assuming you live in Sweden, is that correct? Could you very
| briefly explain how you host and serve pages from your living
| room and what your bandwidth is?
|
| Does your ISP get cranky when you see DOS type of traffic?
|
| I'm a fellow hobbyist search engine dev, also from Sweden.
| Whenever I demonstrate my search engine by hosting in the
| cloud the expenses get so big I have to go offline after a
| short while and I've therefore been contemplating personal,
| living room hosting.

| marginalia_nu wrote:
| > Since you serve Swedish weather info from marginalia I'm
| assuming you live in Sweden, is that correct?
|
| It is indeed.
|
| > Could you very briefly explain how you host and serve
| pages from your living room and what your bandwidth is?
|
| 100/100 Mbit municipal broadband, through Bahnhof.
|
| > Does your ISP get cranky when you see DOS type of
| traffic?
|
| Haven't heard a word from them, although you'd be surprised
| how far I am from saturating my line. Your average
| bittorrent enthusiast probably uses a lot more. I do try to
| not be a nuisance though. Cloudflare helps take the edge
| off things, as does running a local DNS cache.
|
| > I'm a fellow hobbyist search engine dev, also from
| Sweden. Whenever I demonstrate my search engine by hosting
| in the cloud the expenses get so big I have to go offline
| after a short while and I've therefore been contemplating
| personal, living room hosting.
|
| You might also consider server rental. Can get away with
| SEK 2-4k/month. My server, including UPS and other expenses,
| is like SEK 40k, plus I expect to burn through an SSD once
| a year or so.

| kreeben wrote:
| Very helpful, thank you. Which part of Sweden are you in,
| by the way?

| marginalia_nu wrote:
| Up north.

| [deleted]

| gillesjacobs wrote:
| You posted on related topics a few weeks back with your
| Marginalia projects and I spent an hour browsing your sites.
| Making the "small web" and its creative weirdness visible again
| pulls on my nostalgia strings. Good work!

| 1vuio0pswjnm7 wrote:
| The <h1> banner is "Search the internet" but are we only
| searching www servers?
|
| Can we use marginalia.nu to search for servers offering other
| protocols like ftp?

| marginalia_nu wrote:
| It's been my ambition to support Gemini and Gopher down the
| line.
___________________________________________________________________
(page generated 2022-01-23 23:00 UTC)
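Editor's note: the thread mentions that the 'Similar Domains' list is based mostly on adjacency in the link graph, using a mix of direct neighbors and Personalized PageRank. The sketch below shows what Personalized PageRank could look like on a toy link graph. It is not Marginalia's code. The function names, the `.example` domains, the damping factor of 0.85, and the choice to seed the walk with a single domain are all assumptions made for illustration.

```python
def personalized_pagerank(links, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank (illustrative sketch).

    links: dict mapping a domain to the list of domains it links to.
    seeds: domains the random walk teleports back to.
    """
    seeds = set(seeds)
    nodes = set(links) | seeds
    for targets in links.values():
        nodes.update(targets)

    # Teleport distribution: uniform over the seeds, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            out = links.get(n, [])
            if out:
                share = damping * rank[n] / len(out)
                for target in out:
                    nxt[target] += share
            else:
                # Dangling domain: hand its rank back to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] / len(seeds)
        rank = nxt
    return rank


def similar_domains(links, domain, top=3):
    """Rank other domains by a walk that always restarts at `domain`."""
    rank = personalized_pagerank(links, [domain])
    others = [n for n in rank if n != domain]
    # Sort by descending rank, then by name so ties are deterministic.
    others.sort(key=lambda n: (-rank[n], n))
    return others[:top]


# Hypothetical link graph, for illustration only.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["a.example"],
    "c.example": ["a.example", "d.example"],
    "d.example": [],
    "e.example": ["d.example"],
}

rank = personalized_pagerank(links, ["a.example"])
print(similar_domains(links, "a.example", top=3))
```

Because the walk always restarts at the chosen domain, directly linked neighbors score highest and domains that cannot be reached from it score zero. That matches the thread's description of the list as mostly link-graph adjacency.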