[HN Gopher] Infinity: open-source search engine
       ___________________________________________________________________
        
       Infinity: open-source search engine
        
       Author : freediver
       Score  : 179 points
       Date   : 2020-08-07 10:38 UTC (12 hours ago)
        
 (HTM) web link (infinitysearch.co)
 (TXT) w3m dump (infinitysearch.co)
        
       | fsflover wrote:
       | See also: peer to peer free search engine https://yacy.net
        
       | numpad0 wrote:
       | > When you search for something on our site, we take the results
       | from other search engines and our own indexes,
        
         | qihqi wrote:
         | Currently it's just Bing:
         | https://gitlab.com/infinitysearch/infinity-search/-/blob/mas...
        
           | harryf wrote:
           | Since Yahoo! got out of the game, Bing is the only major
           | search engine in the English speaking world providing an API
           | for developers against their index.
           | 
           | Yandex _might_ be another option (haven't looked) or even
           | Baidu. Of course there are unofficial ways to scrape Google
           | but you can't build a legit business on that.
           | 
           | Side note: a far simpler first step than trying to break up
           | Google would be requiring them to have a search API and
           | contractual obligations that enable others to do business on
           | top of it.
        
         | searchableguy wrote:
         | Here are the results: https://ibb.co/album/Mfc6br
         | 
         | Obviously one data point, but I searched a few more things that I
         | didn't include. Google is better for the same query.
         | 
         | Now that I think about it, Google's greatest strength is their
         | crawler and cache infrastructure. I never find the latest,
         | up-to-date content with the same speed on other search engines.
         | 
         | They don't seem to modify results from bing.
        
           | lisper wrote:
           | Yes, AFAICT this is just DDG with a less silly name.
        
             | searchableguy wrote:
             | I like duckduckgo's name though. It's playful. They also
             | own duck.com
             | 
             |  _Hey, duck this query_
        
               | ffpip wrote:
               | PS: Google was the company that sold duck.com to DDG.
               | 
               | DDG also owns ddg.gg
        
       | rmetzler wrote:
       | I find it really disturbing that you can't link to result pages.
       | I think this is some kind of REST principles violation.
        
       | adsjhdashkj wrote:
       | Neat, their ads are powered by: https://www.ethicalads.io/
       | 
       | This makes me wonder - what ads would be acceptable in the eyes
       | of privacy focused developers? Any?
       | 
       | And i don't mean UX - i mean, what ads would you, a privacy
       | focused developer, recommend i install on my site. Assuming i
       | want to fully respect my users' privacy, while still attempting to
       | pay for the server costs.
       | 
       | Maybe this warrants an Ask HN, but it seems relevant here as
       | they're implementing this exact concept. Thoughts?
        
         | ziddoap wrote:
         | >what ads would be acceptable in the eyes of privacy focused
         | developers
         | 
         | I think it's a pretty straight-forward and easy answer.
         | 
         | Don't put ads that breach user privacy. Just use context-based
         | advertisements.
         | 
         | If your ads are asking about my canvas size, mouse position,
         | etc. or otherwise are attempting to track me from website to
         | website, I'm not cool with that and will block it whatever way
         | possible. Other dark patterns (blending in to website e.g.
         | Reddit's sponsored posts) are also terrible, but not
         | necessarily related to your question re: privacy.
         | 
         | If I know a website is using context-based advertisements with
         | no tracking, I allow the ads and sometimes even follow them if
         | I need whatever is being advertised.
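
To make "context-based" concrete, here is a minimal sketch of choosing an ad from the page text alone, with no cookies, fingerprinting, or per-user state. The ad inventory, the keyword sets, and the names `tokenize` and `pick_ad` are all hypothetical; real contextual networks presumably use something more sophisticated than keyword overlap.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical inventory: ad label -> keywords the advertiser wants to match.
ADS: Dict[str, Set[str]] = {
    "Hypothetical Linux hosting ad": {"linux", "server", "hosting"},
    "Hypothetical Go conference ad": {"golang", "conference", "gopher"},
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_ad(page_text: str, ads: Dict[str, Set[str]] = ADS) -> Optional[str]:
    """Return the ad whose keywords overlap most with the page, or None.

    The only input is the content being viewed; nothing about the
    visitor (IP, user agent, history) is consulted.
    """
    words = tokenize(page_text)
    best, best_score = None, 0
    for name, keywords in ads.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = name, score
    return best


if __name__ == "__main__":
    print(pick_ad("How to set up a Linux server"))
    print(pick_ad("A recipe for pasta"))
```

Because the function never sees anything about the visitor, two people reading the same page get the same ad, which is the property being asked for here.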
        
           | cxr wrote:
           | OP seems to be asking for specific examples of ad services to
           | use.
           | 
           | (I'm not sure what's up with the number of responses to the
           | question, "what ads would you [...] recommend i install on my
           | site [a]ssuming i want to fully respect [m]y users privacy",
           | with answers such as, "Don't put ads that breach user
           | privacy".)
        
             | ziddoap wrote:
             | Obviously the question is ambiguous enough that there is
             | more than one interpretation.
             | 
             | Even if they said "What specific & exact ad service would
             | you recommend", a reply of "Any ad service that doesn't do
             | tracking -- consider contextual based ads" is still a valid
             | one, no?
        
               | cxr wrote:
               | No.
               | 
               | > Even if they said [...]
               | 
               | We don't have to imagine anything. We can look at what
               | was written (and I quoted it again above). No matter what
               | a person has in mind when writing out, "Don't put ads
               | that breach user privacy", it's not a meaningful reply to
               | the question that was actually asked.
        
               | ziddoap wrote:
               | Interesting...
               | 
               | You started with "OP seems [...]", which sort of implies
               | you are interpreting as well...
               | 
               | Whatever, I'll stand corrected I guess. My (and everyone
               | else except you!) reply is not meaningful.
               | 
               | Edit to add: If I ask for a product that does X and
               | respects Y, and someone says "Here are the criteria I use
               | to make sure any product respects your Y requirement", I
               | would accept that as a valid and helpful reply. Maybe
               | it's just me.
        
               | cxr wrote:
               | > You started with "OP seems [...]", which sort of
               | implies you are interpreting as well...
               | 
               | I edited that in to soften the message. It's like the
               | approach of phrasing things to avoid use of "you" when
               | addressing a topic, to prevent people from getting
               | defensive--which I also do a lot and did here. Clearly
               | that effort was wasted.
        
         | eternalban wrote:
         | We can also ask the alternative question which is never asked:
         | 
         | - what other mechanisms can serve the same utility as
         | advertising for businesses?
         | 
         | - what other mechanisms can support viable business models?
         | 
         | What is strange about advertising is that it enjoys a bizarre
         | outsized role in determining social norms that is never
         | questioned. Sure, businesses need to advertise their services
         | and goods but at the cost of even destroying privacy? I find it
         | strange.
        
           | erikrothoff wrote:
           | One aspect to consider is that it is the tried and true
           | solution. Surely many businesses and a lot of smart people
           | have considered alternatives. It seems, save for actually
           | paying money or living off the generosity of others, ads won.
        
         | utf_8x wrote:
         | Personally, I think ethicalads.io and contextcue.com are just
         | fine. Maybe even carbonads.net but good luck getting them to
         | even talk to you without a very large audience already in
         | place.
        
         | ximeng wrote:
         | Ads that can be turned off. If they're not adding enough value
         | to users for them to turn them on, get rid of them.
        
         | komali2 wrote:
         | Cory Doctorow just did a whole email on this. In short:
         | contextual ads. Here's the email:
         | https://mail.flarn.com/pipermail/plura-list/2020-August/0001...
        
           | skyfaller wrote:
           | Same content in a more web-friendly format:
           | https://pluralistic.net/2020/08/05/behavioral-v-
           | contextual/#...
        
         | chris_f wrote:
         | I've spent a lot of time over the last few months looking at
         | different privacy focused forms of monetization, specifically
         | aimed at search engines because of my use case. [0]
         | 
         | The main factor determining your options is whether you are
         | comfortable having a 3rd party ad provider have access to the
         | IP, user agent (and potentially search term in some cases) of
         | your users.
         | 
         | If that is not a dealbreaker for your use case then there are a
         | lot of different options, including EthicalAds which I believe
         | is a great service.
         | 
         | Understandably, the main reasons for this are because the ad
         | networks need to prevent fraud, and the ad sponsors usually
         | want some kind of measurable metric to determine ROI.
         | 
         | If you are not comfortable with the above compromise then your
         | options for contextual advertising are significantly limited
         | with the only real option being to sell direct ads to
         | companies.
         | 
         | [0] https://coil.com/p/runnaroo/Privacy-and-Search-Engine-
         | Moneti...
        
           | olah_1 wrote:
           | How does Coil not have _all_ of your data, though? They
           | maintain centralized knowledge of everything you look at and
           | for how long. Or am I missing something?
        
           | wintermutestwin wrote:
           | >with the only real option being to sell direct ads to
           | companies.
           | 
           | Maybe I don't understand the complexity behind the ad-network
           | curtain today, but direct ad sales worked fine pre-internet.
           | Serve the ads from the same domain as your content and you
           | can prevent ad blocking. Sounds like a win-win to me.
        
             | NotSammyHagar wrote:
             | Almost all companies won't bother with buying ads directly
             | from a random small company, because they don't know whether
             | you are trustworthy or whether you will show controversial
             | content that hurts their brand. Much easier to use an ad
             | bundler (from their standpoint) like Google.
        
             | chris_f wrote:
             | _> Sounds like a win-win to me._
             | 
             | Yeah, I agree. The challenge is finding the relevant
             | companies who wish to advertise and the ad 'sales' process.
             | It's significantly more work than just plugging in an
             | existing ad network who has a ready supply of ads, and it
             | requires a slightly different skillset that's not
             | necessarily a core competency. I'm going through the
             | process right now of reaching out to relevant companies
             | with an aligned privacy focus, and it is a mixed bag of
             | results.
             | 
             | Having said that, I think it is worth it in the long term
             | because it makes you less reliant on other companies, but
             | it is a tremendous amount of work upfront.
        
         | 1propionyl wrote:
         | In case anyone from EthicalAds is reading this, you've got a
         | copy error:
         | 
         | > We believe good advertising should _compliment_ content and
         | never get in the way. That's why we built a network that
         | complies with AcceptableAds, BetterAds, and DoNotTrack.
         | 
         | Complement, not compliment.
        
         | nicoburns wrote:
         | > This makes me wonder - what ads would be acceptable in the
         | eyes of privacy focused developers? Any?
         | 
         | Ads that:
         | 
         | - don't track you.
         | 
         | - Are either textual or images (non-animated)
         | 
         | - Are clearly marked as adverts and don't try to blend into the
         | page
        
         | lykr0n wrote:
         | I'm fine with Ads. What I'm not fine with is:
         | 
         | - Intrusive Ads that either demand attention or mask themselves
         | as something else
         | 
         | - Ads that are based on targeted information pulled from other
         | sources, such as other websites.
         | 
         | - Ads that are just completely off topic. I'm fine with general
         | targeting. "male" "computer science" "linux" stuff like that
        
         | Abishek_Muthian wrote:
         | Do the lists used in adblockers like uBlock Origin whitelist
         | ads from this company? Are there any other advertising
         | companies with similar philosophies?
        
           | ffpip wrote:
           | uBlock Origin whitelists nothing. You decide what to let in.
           | 
           | I've noticed it blocks carbonads, ethicalads and even ads on
           | DDG.
           | 
           | I turned it off on DDG and notepad++ sites, to let ads
           | through. If you want you can too.
        
             | Abishek_Muthian wrote:
             | Thanks. Has Carbon Ads gained a good reputation? I've been
             | seeing it increasingly on small project sites, but most of
             | the time the Ads have been from adobe.
        
               | ffpip wrote:
               | I don't whitelist whole ad networks. So don't really know
               | their reputation.
               | 
               | I whitelist specific sites (some sites I like, and
               | reddit)
        
         | skinkestek wrote:
         | You'll find that a number of HNers, including a number of high
         | profile ones are against any form of advertising.
         | 
         | Personally I am against tracking and unethical ads[0] but I
         | think purely contextual ads can be OK:
         | 
         | - ycombinator ads on news.ycombinator.com: OK with me even if
         | they blend in, it is part of what I want to see when I am here
         | 
         | - Webstorm/IntelliJ/upcoming programming conference ads on
         | developer blogs: Smashing
         | 
         | - "Sponsored by" ads on Open Source projects (as long as they
         | aren't bad[1]): fine with me
         | 
         | [0][1]: like pushing payday loans to poor, gambling to addicts,
         | cheating sites etc
        
           | oska wrote:
           | > You'll find that a number of HNers, including a number of
           | high profile ones are against any form of advertising.
           | 
           | Yep, this is me (although I'm not at all 'high profile').
           | Specifically, I'm against all _push_ advertising, where ads
           | are being served to a user when they are not actively looking
           | for product/service information. If it's a site that is
           | _all_ advertising or nearly all advertising with some filler
           | editorial then I'm fine with that because obviously a user
           | is not going to go to that site unless they want to be
           | marketed to and are _actively_ looking for product/service
           | information (including about products/services of which they
           | were previously unaware).
        
       | DoctorNick wrote:
       | is this just a searx instance that's been loaded with ads?
        
       | didip wrote:
       | The video search is broken for me, tested on Chrome with the
       | keyword "golang". It's just showing blank.
        
       | petra wrote:
       | Any info on search operators ? And do they work, unlike Google ?
       | 
       | And where's the source ? could i create search plugins just for
       | my own use ? because that would be a killer feature.
       | 
       | So it's only technically open-source, it uses Bing's API probably
       | for their "web" tab.[1]
       | 
       | Their "infinity tab" doesn't give good results at all, currently.
       | This is what may run on their own system. Maybe.
       | 
       | [1]"How It Works For the web application, we use Flask. For the
       | search results, we use the Microsoft Cognitive Services Search
       | API and the DuckDuckGo Instant Answers API. We will have more
       | detailed information about how our system works in the future."
       | 
       | -- from their source :
       | https://gitlab.com/infinitysearch/infinity-search
        
         | searchableguy wrote:
         | > If you have ever used DuckDuckGo before, we integrated the
         | same bang feature as them so that you can redirect your
         | searches to other websites directly from our site. Here is an
         | example: https://infinitysearch.co/results?q=!ddg privacy Right
         | now, we have the same ones as DuckDuckGo so you can use the
         | same ones found at https://duckduckgo.com/bang for now.
         | 
         | > The whole Infinity Search system is open source and the code
         | can be found at https://gitlab.com/infinitysearch.
         | 
         | From the FAQ here: https://infinitysearch.co/why
        
       | kgraves wrote:
       | I saw everything this search engine has to offer, all looks good
       | except their _'Ethical Ads'_. What does that even mean?
       | 
       | There should be NO ads. period.
       | 
       | I would go so far as to say that ALL ads are a distraction.
       | 
       | You're supposed to be an open source privacy focused search
       | engine, not a surveillance capitalist serving search ads like
       | Google, this is NOT OK.
        
         | hinkley wrote:
         | How do you propose to pay for the infrastructure?
        
           | oska wrote:
           | Apart from your question not being in good faith (because
           | there have always been business models to support publishing
           | and providing services that do not rely on advertising and
           | these models are well known), the main point is that they do
           | not have to provide the answer to this question. Push
           | advertising is unethical and culturally toxic and thus is not
           | justifiable using the simple pragmatic argument of "It pays
           | the bills". You can't justify unethical behaviour using a
           | simple pragmatic argument. (Quite obviously; this shouldn't
           | even need to be said.)
        
       | faitswulff wrote:
       | If GPT-3 is trained on a corpus of the entire internet, could you
       | just use it as a search engine by posing questions to it?
        
         | freediver wrote:
         | No, but it can look deceptively able to:
         | 
         | https://twitter.com/paraschopra/status/1284801028676653060
         | 
         | (and this is how to prime it:
         | 
         | https://twitter.com/vybhavram/status/1284803750146617344)
         | 
         | The main problem is that GPT3 has the free will of a creative but
         | delusional person and will distort facts as it pleases to
         | maximize its output function score.
         | 
         | So, similar to evening news.
        
           | worldsayshi wrote:
           | Hmm, can you turn GPT-3 api into a fact checker by priming it
           | with tuples of (in)correct statements, true/false? And then
           | maybe prime it to add explanations and references for the
           | responses as well.
        
             | wizzwizz4 wrote:
             | Yes, but it's only effective for things a few months before
             | its creation; anything after that is just "sounds plausible
             | to GPT-3". Likewise, its explanations aren't very good
             | unless it has good domain knowledge on the subject; if it
             | doesn't know something, it won't tell you so unless you've
             | primed it with not-knowing being an option.
             | 
             | https://www.gwern.net/GPT-3#expressing-uncertainty
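
The priming idea discussed above can be sketched as plain prompt construction. This is only an illustration: the example statements, the label names, and `build_prompt` are assumptions, and the call to the model itself is deliberately left out. The third example gives the model an explicit "unknown" option, following the point about not-knowing needing to be primed.

```python
from typing import List, Tuple

# Hypothetical priming examples as (statement, label) tuples.
EXAMPLES: List[Tuple[str, str]] = [
    ("Water boils at 100 degrees Celsius at sea level.", "true"),
    ("The Moon is larger than the Earth.", "false"),
    ("The winner of next year's chess championship is already decided.", "unknown"),
]


def build_prompt(examples: List[Tuple[str, str]], claim: str) -> str:
    """Build a few-shot prompt that ends where the model should fill in a label."""
    lines = ["Label each statement as true, false, or unknown.", ""]
    for statement, label in examples:
        lines.append(f"Statement: {statement}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The final entry has an empty label for the model to complete.
    lines.append(f"Statement: {claim}")
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(EXAMPLES, "Bing provides a search API for developers."))
```

Whatever label comes back is still only "sounds plausible to the model", so a sketch like this structures the question without making the answer trustworthy.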
        
       | softwaredoug wrote:
       | I wish there was a way to differentiate a "search engine" as in
       | one that crawls the web and a "search engine" like Elastic or
       | Solr.
       | 
       | Maybe the latter should be a search server?
        
         | nkristoffersen wrote:
         | wouldn't a crawler be a "web crawler"? It doesn't actually
         | handle the search engine aspects.
        
         | somurzakov wrote:
         | search engine vs full-text search engine?
        
           | searchableguy wrote:
           | Google would count as a full text search engine then.
        
             | penagwin wrote:
             | Web search engines generally are also full text search
             | engines.
             | 
             | This project is useless for people who have their own
             | private text to search.
        
         | ralusek wrote:
         | Web search vs application search?
        
         | tekkk wrote:
         | I think the correct terms would be "web search engine" or "web
         | search" and "search engine" or "full-text search engine".
         | Wikipedia says "web search engine". But I agree, the ubiquity
         | of Google as the "search engine" makes it a bit muddy to
         | acknowledge that a search engine is any program or system that
         | is used to search data. It's an interesting information
         | retrieval concept that can be applied to many things.
        
         | 1_person wrote:
         | I think it's a lot easier to maintain clarity when we identify
         | this as what it actually is: yet another low effort front end
         | for the Bing search engine's API, which aims to attract and
         | monetize privacy-conscious consumers with a big lie.
        
       | Dowwie wrote:
       | Very cool to see innovation emerging from Oklahoma!
        
       | tgb wrote:
       | If the devs are reading this, I found a bug: if you use the
       | wikipedia bang and search, say: "!w knot theory" the actual
       | search page it brings up is for "knottheory" without the space.
        
         | grey_earthling wrote:
         | Also if you search for just "!w", you don't arrive at the
         | English Wikipedia front page, which is the behaviour I'm used
         | to from DuckDuckGo. "!bbc" should also lead to the BBC front
         | page, etc.
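
For illustration, both reported behaviours (keeping the space in "knot theory", and a bare bang going to the site's front page) can be sketched as below. This is not Infinity's actual code; the `BANGS` table, its URL templates, and the function names are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import quote_plus

# Hypothetical bang table: bang -> (search URL template, front page URL).
BANGS: Dict[str, Tuple[str, str]] = {
    "w": ("https://en.wikipedia.org/w/index.php?search={query}",
          "https://en.wikipedia.org/"),
    "bbc": ("https://www.bbc.co.uk/search?q={query}",
            "https://www.bbc.co.uk/"),
}


def split_bang(raw: str) -> Tuple[Optional[str], str]:
    """Split '!w knot theory' into ('w', 'knot theory').

    Only the first run of whitespace is used as a separator, so spaces
    inside the remaining query are preserved.
    """
    text = raw.strip()
    if not text.startswith("!"):
        return None, text
    parts = text[1:].split(None, 1)
    if not parts:
        return None, text
    bang = parts[0].lower()
    query = parts[1].strip() if len(parts) > 1 else ""
    return bang, query


def resolve(raw: str) -> Optional[str]:
    """Return the redirect URL for a bang query, or None if it is not one."""
    bang, query = split_bang(raw)
    if bang is None or bang not in BANGS:
        return None
    template, front_page = BANGS[bang]
    if not query:
        # A bare bang such as "!w" goes to the site's front page.
        return front_page
    return template.format(query=quote_plus(query))


if __name__ == "__main__":
    print(resolve("!w knot theory"))
    print(resolve("!w"))
```

The query is URL-encoded only after the bang has been split off, which is one way to avoid collapsing "knot theory" into "knottheory".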
        
       | wirelessbrain wrote:
       | What's the difference between "web" and "infinity" search? It
       | doesn't seem to be explained anywhere.
        
       ___________________________________________________________________
       (page generated 2020-08-07 23:00 UTC)