[HN Gopher] Infinity: open-source search engine
___________________________________________________________________
Author : freediver
Score : 179 points
Date : 2020-08-07 10:38 UTC (12 hours ago)
(HTM) web link (infinitysearch.co)
(TXT) w3m dump (infinitysearch.co)
| fsflover wrote:
| See also: peer to peer free search engine https://yacy.net
| numpad0 wrote:
| > When you search for something on our site, we take the results from other search engines and our own indexes,
| qihqi wrote:
| Currently it's just Bing: https://gitlab.com/infinitysearch/infinity-search/-/blob/mas...
| harryf wrote:
| Since Yahoo! got out of the game, Bing is the only major search engine in the English speaking world providing an API for developers against their index.
|
| Yandex _might_ be another option (haven't looked) or even Baidu. Of course there are unofficial ways to scrape Google but you can't build a legit business on that.
|
| Side note: a far simpler first step than trying to break up Google would be requiring them to have a search API and contractual obligations that enable others to do business on top of it.
| searchableguy wrote:
| Here are the results: https://ibb.co/album/Mfc6br
|
| Obviously one data point but I searched a few more things that I didn't include. Google is better for the same query.
|
| Now that I think about it, google's best strength is their crawler and cache infrastructure. I never find latest and up-to-date content with the same speed on other search engines.
|
| They don't seem to modify results from bing.
| lisper wrote:
| Yes, AFAICT this is just DDG with a less silly name.
| searchableguy wrote:
| I like duckduckgo's name though. It's playful. They also own duck.com
|
| _Hey, duck this query_
| ffpip wrote:
| PS: Google was the company that sold duck.com to DDG.
|
| DDG also owns ddg.gg
| rmetzler wrote:
| I find it really disturbing that you can't link to result pages.
| I think this is some kind of REST principles violation.
| adsjhdashkj wrote:
| Neat, their ads are powered by: https://www.ethicalads.io/
|
| This makes me wonder - what ads would be acceptable in the eyes of privacy focused developers? Any?
|
| And i don't mean UX - i mean, what ads would you, a privacy focused developer, recommend i install on my site. Assuming i want to fully respect my users' privacy, while still attempting to pay for the server costs.
|
| Maybe this warrants an AskHN, but it seems relevant here as they're implementing this exact concept. Thoughts?
| ziddoap wrote:
| >what ads would be acceptable in the eyes of privacy focused developers
|
| I think it's a pretty straight-forward and easy answer.
|
| Don't put ads that breach user privacy. Just use context-based advertisements.
|
| If your ads are asking about my canvas size, mouse position, etc. or otherwise are attempting to track me from website to website, I'm not cool with that and will block it whatever way possible. Other dark patterns (blending in to website e.g. Reddit's sponsored posts) are also terrible, but not necessarily related to your question re: privacy.
|
| If I know a website is using context-based advertisements with no tracking, I allow the ads and sometimes even follow them if I need whatever is being advertised.
| cxr wrote:
| OP seems to be asking for specific examples of ad services to use.
|
| (I'm not sure what's up with the number of responses to the question, "what ads would you [...] recommend i install on my site [a]ssuming i want to fully respect [m]y users privacy", with answers such as, "Don't put ads that breach user privacy".)
| ziddoap wrote:
| Obviously the question is ambiguous enough that there is more than one interpretation.
|
| Even if they said "What specific & exact ad service would you recommend", a reply of "Any ad service that doesn't do tracking -- consider contextual based ads" is still a valid one, no?
| cxr wrote:
| No.
|
| > Even if they said [...]
|
| We don't have to imagine anything. We can look at what was written (and I quoted it again above). No matter what a person has in mind when writing out, "Don't put ads that breach user privacy", it's not a meaningful reply to the question that was actually asked.
| ziddoap wrote:
| Interesting...
|
| You started with "OP seems [...]", which sort of implies you are interpreting as well...
|
| Whatever, I'll stand corrected I guess. My (and everyone else except you!) reply is not meaningful.
|
| Edit to add: If I ask for a product that does X and respects Y, and someone says "Here are the criteria I use to make sure any product respects your Y requirement", I would accept that as a valid and helpful reply. Maybe it's just me.
| cxr wrote:
| > You started with "OP seems [...]", which sort of implies you are interpreting as well...
|
| I edited that in to soften the message. It's like the approach of phrasing things to avoid use of "you" when addressing a topic, to prevent people from getting defensive--which I also do a lot and did here. Clearly that effort was wasted.
| eternalban wrote:
| We can also ask the alternative question which is never asked:
|
| - what other mechanisms can serve the same utility as advertising for businesses?
|
| - what other mechanisms can support viable business models?
|
| What is strange about advertising is that it enjoys a bizarre outsized role in determining social norms that is never questioned. Sure, businesses need to advertise their services and goods but at the cost of even destroying privacy? I find it strange.
| erikrothoff wrote:
| One aspect to consider is that it is the tried and true solution.
| Surely many businesses and a lot of smart people have considered alternatives. It seems, save for actually paying money or living off the generosity of others, ads won.
| utf_8x wrote:
| Personally, I think ethicalads.io and contextcue.com are just fine. Maybe even carbonads.net but good luck getting them to even talk to you without a very large audience already in place.
| ximeng wrote:
| Ads that can be turned off. If they're not adding enough value to users for them to turn them on, get rid of them.
| komali2 wrote:
| Cory Doctorow just did a whole email on this. In short: contextual ads. Here's the email: https://mail.flarn.com/pipermail/plura-list/2020-August/0001...
| skyfaller wrote:
| Same content in a more web-friendly format: https://pluralistic.net/2020/08/05/behavioral-v-contextual/#...
| chris_f wrote:
| I've spent a lot of time over the last few months looking at different privacy focused forms of monetization, specifically aimed at search engines because of my use case. [0]
|
| The main factor determining your options is whether you are comfortable having a 3rd party ad provider have access to the IP, user agent (and potentially search term in some cases) of your users.
|
| If that is not a dealbreaker for your use case then there are a lot of different options, including EthicalAds which I believe is a great service.
|
| Understandably, the main reasons for this are because the ad networks need to prevent fraud, and the ad sponsors usually want some kind of measurable metric to determine ROI.
|
| If you are not comfortable with the above compromise then your options for contextual advertising are significantly limited with the only real option being to sell direct ads to companies.
|
| [0] https://coil.com/p/runnaroo/Privacy-and-Search-Engine-Moneti...
| olah_1 wrote:
| How does Coil not have _all_ of your data, though?
| They maintain centralized knowledge of everything you look at and for how long. Or am I missing something?
| wintermutestwin wrote:
| >with the only real option being to sell direct ads to companies.
|
| Maybe I don't understand the complexity behind the ad-network curtain today, but direct ad sales worked fine pre-internet. Serve the ads from the same domain as your content and you can prevent ad blocking. Sounds like a win-win to me.
| NotSammyHagar wrote:
| Almost all companies won't bother with buying ads directly from a random small company, because they don't know if you are trustworthy or won't show up some controversial content that hurts their brand. Much easier to use an ad bundler (from their standpoint) like google.
| chris_f wrote:
| _> Sounds like a win-win to me._
|
| Yeah, I agree. The challenge is finding the relevant companies who wish to advertise and the ad 'sales' process. It's significantly more work than just plugging in an existing ad network who has a ready supply of ads, and it requires a slightly different skillset that's not necessarily a core competency. I'm going through the process right now of reaching out to relevant companies with an aligned privacy focus and it is a mixed bag of results.
|
| Having said that, I think it is worth it in the long term because it makes you less reliant on other companies, but it is a tremendous amount of work upfront.
| 1propionyl wrote:
| In case anyone from EthicalAds is reading this, you've got a copy error:
|
| > We believe good advertising should _compliment_ content and never get in the way. That's why we built a network that complies with AcceptableAds, BetterAds, and DoNotTrack.
|
| Complement, not compliment.
| nicoburns wrote:
| > This makes me wonder - what ads would be acceptable in the eyes of privacy focused developers? Any?
|
| Ads that:
|
| - don't track you.
|
| - Are either textual or images (non-animated)
|
| - Are clearly marked as adverts and don't try to blend into the page
| lykr0n wrote:
| I'm fine with Ads. What I'm not fine with is:
|
| - Intrusive Ads that either demand attention or mask themselves as something else
|
| - Ads that are based on targeted information pulled from other sources, such as other websites.
|
| - Ads that are just completely off topic. I'm fine with general targeting. "male" "computer science" "linux" stuff like that
| Abishek_Muthian wrote:
| Do the lists used in adblockers like UBlock Origin whitelist Ads from this company? Are there any other advertising companies with similar philosophies?
| ffpip wrote:
| uBlock Origin whitelists nothing. You decide what to let in.
|
| I've noticed it blocks carbonads, ethicalads and even ads on DDG.
|
| I turned it off on DDG and notepad++ sites, to let ads through. If you want you can too.
| Abishek_Muthian wrote:
| Thanks, have carbon ads gained a good reputation? I've been seeing it increasingly on small project sites, but most of the time the Ads have been from adobe.
| ffpip wrote:
| I don't whitelist whole ad networks. So don't really know their reputation.
|
| I whitelist specific sites (some sites I like, and reddit)
| skinkestek wrote:
| You'll find that a number of HNers, including a number of high profile ones are against any form of advertising.
|
| Personally I am against tracking and unethical ads[0] but I think purely contextual ads can be OK:
|
| - ycombinator ads on news.ycombinator.com: OK with me even if they blend in, it is part of what I want to see when I am here
|
| - Webstorm/IntelliJ/upcoming programming conference ads on developer blogs: Smashing
|
| - "Sponsored by" ads on Open Source projects (as long as they aren't bad[1]): fine with me
|
| [0][1]: like pushing payday loans to poor, gambling to addicts, cheating sites etc
| oska wrote:
| > You'll find that a number of HNers, including a number of high profile ones are against any form of advertising.
|
| Yep, this is me (although I'm not at all 'high profile'). Specifically, I'm against all _push_ advertising, where ads are being served to a user when they are not actively looking for product/service information. If it's a site that is _all_ advertising or nearly all advertising with some filler editorial then I'm fine with that because obviously a user is not going to go to that site unless they want to be marketed to and are _actively_ looking for product/service information (including about products/services of which they were previously unaware).
| DoctorNick wrote:
| is this just a searx instance that's been loaded with ads?
| didip wrote:
| The video search is broken for me, tested on Chrome with the keyword "golang". It's just showing blank.
| petra wrote:
| Any info on search operators? And do they work, unlike Google?
|
| And where's the source? could i create search plugins just for my own use? because that would be a killer feature.
|
| So it's only technically open-source, it uses Bing's API probably for their "web" tab.[1]
|
| Their "infinity tab" doesn't give good results at all, currently. This is what may run on their own system. Maybe.
|
| [1]"How It Works For the web application, we use Flask.
| For the search results, we use the Microsoft Cognitive Services Search API and the DuckDuckGo Instant Answers API. We will have more detailed information about how our system works in the future."
|
| -- from their source: https://gitlab.com/infinitysearch/infinity-search
| searchableguy wrote:
| > If you have ever used DuckDuckGo before, we integrated the same bang feature as them so that you can redirect your searches to other websites directly from our site. Here is an example: https://infinitysearch.co/results?q=!ddg privacy Right now, we have the same ones as DuckDuckGo so you can use the same ones found at https://duckduckgo.com/bang for now.
|
| > The whole Infinity Search system is open source and the code can be found at https://gitlab.com/infinitysearch.
|
| From the FAQ here: https://infinitysearch.co/why
| kgraves wrote:
| I saw everything this search engine has to offer, all looks good except their _'Ethical Ads'_. What does that even mean?
|
| There should be NO ads. period.
|
| I would go so far as to say that ALL ads are a distraction.
|
| You're supposed to be an open source privacy focused search engine, not a surveillance capitalist serving search ads like Google, this is NOT OK.
| hinkley wrote:
| How do you propose to pay for the infrastructure?
| oska wrote:
| Apart from your question not being in good faith (because there have always been business models to support publishing and providing services that do not rely on advertising and these models are well known), the main point is that they do not have to provide the answer to this question. Push advertising is unethical and culturally toxic and thus is not justifiable using the simple pragmatic argument of "It pays the bills". You can't justify unethical behaviour using a simple pragmatic argument. (Quite obviously; this shouldn't even need to be said.)
| faitswulff wrote:
| If GPT-3 is trained on a corpus of the entire internet, could you just use it as a search engine by posing questions to it?
| freediver wrote:
| No, but it can look deceivingly able to:
|
| https://twitter.com/paraschopra/status/1284801028676653060
|
| (and this is how to prime it:
|
| https://twitter.com/vybhavram/status/1284803750146617344)
|
| The main problem is that GPT3 has a free will of a creative but delusional person and will distort facts as it pleases to maximize its output function score.
|
| So, similar to evening news.
| worldsayshi wrote:
| Hmm, can you turn GPT-3 api into a fact checker by priming it with tuples of (in)correct statements, true/false? And then maybe prime it to add explanations and references for the responses as well.
| wizzwizz4 wrote:
| Yes, but it's only effective for things a few months before its creation; anything after that is just "sounds plausible to GPT-3". Likewise, its explanations aren't very good unless it has good domain knowledge on the subject; if it doesn't know something, it won't tell you so unless you've primed it with not-knowing being an option.
|
| https://www.gwern.net/GPT-3#expressing-uncertainty
| softwaredoug wrote:
| I wish there was a way to differentiate a "search engine" as in one that crawls the web and a "search engine" like Elastic or Solr.
|
| Maybe the latter should be a search server?
| nkristoffersen wrote:
| wouldn't a crawler be a "web crawler"? It doesn't actually handle the search engine aspects.
| somurzakov wrote:
| search engine vs full-text search engine?
| searchableguy wrote:
| Google would count as a full text search engine then.
| penagwin wrote:
| Web search engines generally are also full text search engines.
|
| This project is useless for people who have their own private text to search.
| ralusek wrote:
| Web search vs application search?
| tekkk wrote:
| I think the correct terms would be "web search engine" or "web search" and "search engine" or "full-text search engine". Wikipedia says "web search engine". But I agree, the ubiquity of Google as the "search engine" makes it a bit muddy to acknowledge that a search engine is any program or system that is used to search data. It's an interesting information retrieval concept that can be applied to many things.
| 1_person wrote:
| I think it's a lot easier to maintain clarity when we identify this as what it actually is: yet another low effort front end for the Bing search engine's API, which aims to attract and monetize privacy-conscious consumers with a big lie.
| Dowwie wrote:
| Very cool to see innovation emerging from Oklahoma!
| tgb wrote:
| If the devs are reading this, I found a bug: if you use the wikipedia bang and search, say: "!w knot theory" the actual search page it brings up is for "knottheory" without the space.
| grey_earthling wrote:
| Also if you search for just "!w", you don't arrive at the English Wikipedia front page, which is the behaviour I'm used to from DuckDuckGo. "!bbc" should also lead to the BBC front page, etc.
| wirelessbrain wrote:
| What's the difference between "web" and "infinity" search? It doesn't seem to be explained anywhere.
___________________________________________________________________
(page generated 2020-08-07 23:00 UTC)