[HN Gopher] Uncertain Future for Marginalia Search
___________________________________________________________________
Uncertain Future for Marginalia Search
Author : panic
Score : 88 points
Date : 2022-04-29 01:27 UTC (1 days ago)
(HTM) web link (memex.marginalia.nu)
(TXT) w3m dump (memex.marginalia.nu)
| marginalia_nu wrote:
| Hopefully this will turn out to be a good thing. Maybe having
| some time to work on the project full time is exactly what's
| needed to push it forward.
|
| Still a bit uncomfortable how sketchy it feels in the longer
| term. But whatever. All I can do about it is do a good job.
| O_H_E wrote:
| This might be intentional on your part, but I couldn't find
| your Patreon linked anywhere from the blog.
|
| This might be a good time to start linking that in obvious
| places.
|
| Fwiw it was very easy to find it through Google, but ironically
| not through marginalia.
|
| I hope you the best in your endeavors.
| marginalia_nu wrote:
| Yeah I have it linked from the search engine as a top
| link[1], but I can only have 2-3 of them so I haven't linked
| to it anywhere in the blog.
|
| Haven't really been a priority to get donations since I've
| had more than plenty income.
|
| Maybe I should look over the design.
|
| [1] https://memex.marginalia.nu/projects/edge/supporting.gmi
| imiric wrote:
| I've been following the project for a while now, and while I
| don't use it yet, we need it and more like it to succeed if we
| ever hope to loosen Google's chokehold on the web.
|
| Best of luck to you and to the project!
|
| I'm curious about a few things:
|
| 1. What's your (planned) business model?
|
| 2. Have you tried asking for sponsorships, either from companies
| or individuals? You should have an easy way for people to donate.
| I'm sure you'd have some support there, especially if your day
| job situation is unstable.
|
| 3. Is it just you working on it right now?
| Have you considered
| open sourcing it to get community contributions, or hiring more
| devs (once donations pick up or maybe someone would be willing to
| work on it on their free time as you do)? I can imagine that
| writing a search engine is a gargantuan effort, and doing it
| alone must be close to impossible.
| marginalia_nu wrote:
| > 1. What's your (planned) business model?
|
| Dunno. In general I don't have a lot of faith in the
| profitability of search engines. Ads _can_ work if you're
| Google-scale, the other option is subscriptions, but in that
| case, you need to be _really_ good and my search engine just
| isn't, outside of some areas. That's actually one of my bigger
| design problems, how to let people understand which queries are
| likely to be useful. It looks like Google, and people assume it
| has the affordances of Google. It doesn't, and if you go in
| with those assumptions, you'll be disappointed.
|
| My model, as far as I've planned one, is just to keep the
| operation as cheap as possible and subsist on donations and
| maybe partnerships with other search engines. A big part of
| what I'm exploring is ways of doing as much as possible with
| low power hardware. I think rather than indexing 1 billion
| documents, 90% of which are garbage that will never be a good
| search result for any query ever, if I can index 100 million
| 50% of which are potentially good hits, then maybe that goes a
| decent way.
|
| > 2. Have you tried asking for sponsorships, either from
| companies or individuals? You should have an easy way for
| people to donate. I'm sure you'd have some support there,
| especially if your day job situation is unstable.
|
| I haven't really been fishing for this. I honestly didn't see
| having to change jobs as I am right now. I do have a donations
| page from before, but all of this was fairly sudden, so I
| haven't really gone over that whole process all too much yet.
|
| > 3. Is it just you working on it right now?
| Have you
| considered open sourcing it to get community contributions, or
| hiring more devs (once donations pick up or maybe someone would
| be willing to work on it on their free time as you do)? I can
| imagine that writing a search engine is a gargantuan effort,
| and doing it alone must be close to impossible.
|
| It's been just me up until now. Solo work can be ridiculously
| efficient when beginning a new project, especially when doing
| the sort of exploratory programming this has been. I also
| haven't felt I have had enough bandwidth to manage an open
| source project. But I am approaching a point where it's
| becoming a bit much to do all by myself, especially given this
| isn't my only project.
|
| So I am considering open sourcing it or bringing more people
| in, just need to think a bit about a good format for such a
| collaboration. It's relatively high maintenance and requires
| manual operations to keep going. As it stands, a lot of the
| code isn't trivially testable, running it (even with few
| documents) requires large language models and so on.
| Kye wrote:
| A handful of people have paid me about $100/month
| collectively for a few years expecting nothing in return via
| Patreon. Marginalia has a much, much, much bigger audience
| and I suspect you could manage a multiple of that sort of "I
| don't care what you make, don't expect anything in return,
| and I'm just glad to see you making stuff" support.
|
| I would recommend Ko-fi though since they make real live
| subscriptions right on your Stripe, so you can migrate them
| if you ever decide to do it in-house.
| hahnchen wrote:
| I just use google to search hn, "site:news.ycombinator.com
| <query>"
| NeutralForest wrote:
| Very important project, I hope you'll be able to settle into
| something comfortable!
| benwills wrote:
| In a very different way, I'm also involved in a search-related
| project.
| (edited to add: also going solo on my project as well)
| If you ever want to bounce ideas around, I'd totally be up for
| that.
|
| Related: you mention other sources than Common Crawl for WARC
| data. Is there a list of those somewhere?
| marginalia_nu wrote:
| Sure, my email is in my profile if you want to chat.
|
| Some WARCs that go into IA get published on archive.org, not
| all of them, but some:
| https://archive.org/search.php?query=warc
|
| It's also an all-around useful format as you can produce it
| from wget and other common tools. But the big reason I'm moving
| toward something relatively homomorphic to WARCs is to be able
| to (in the future) publish my own crawls.
| benwills wrote:
| Thanks for that link. I've done a bit of work with the Common
| Crawl data (and proposed moving to ZSTD with a proof of
| concept and performance metrics in C a few years ago).
|
| I'll send you an email later this weekend to connect.
| theobeers wrote:
| This is a great search engine. I entered "Persian
| transliteration," since that's what I was working on today. It
| sent me to the readme for a program written in 1996,[0] which
| takes a Latin-script transliteration of some Persian text, and
| generates ASCII art that resembles the way that text would be
| written in Persian script. Useful? Eh... Delightful? 100%. It
| would never have occurred to me that such a program would exist.
|
| Best wishes to you, Marginalia developer.
|
| [0]: http://www.payvand.com/gerdsooz/README.html
| jmclnx wrote:
| I never heard of it, but looks good. I hope they can succeed. And
| good luck to you too.
| ColinHayhurst wrote:
| I wish you well and we welcome what you are doing with
| marginalia. As you know search needs a shakeup. One vital
| approach to a real shakeup is true independence of crawler and
| index. If it's any encouragement, Marc our founder started Mojeek
| as a hobby project back in 2004.
| marginalia_nu wrote:
| Thanks, man.
| yuhong wrote:
| kumarsw wrote:
| Using Marginalia always reminds me just how much we have lost
| since the golden age (2000-2010) of the internet. Thanks for
| bringing it back in a small way.
| daxfohl wrote:
| Surprised Elon bought that dumpster fire instead of something
| like this.
| marginalia_nu wrote:
| Yeah, it would be far cheaper too :-/
___________________________________________________________________
(page generated 2022-04-30 23:00 UTC)