hngopher.com

       Launch HN: Andi (YC W22) - Q&A based, ad-free, anti-spam search
       engine
        
       Hi HN, we're Angie and Jed, and we're building Andi
       (https://andisearch.com), a new type of search engine with an AI
       assistant that answers complex questions, and gives you tools to
       fight spam and ad tech.
        
       There has been a lot of discussion on HN recently about how Google
       search is dying. If you do a Google search in a category like
       finance or health, the results are overwhelmed with spam, clickbait
       and ads. That's what we're working to fix.
        
       For us, the problem is personal. I'm a programmer who also trained
       and worked as a journalist and built a media SaaS startup. I
       watched first-hand as the media industry not-so-slowly starved, my
       startup failed, and my friends lost jobs and businesses. Google
       took all the revenue as media turned to trash, and ad tech,
       clickbait and content marketing took over the Internet, and made
       using the web awful.
        
       With Andi, we already apply spam blacklists server-side and we're
       adding tools to blacklist and report spam locally. Andi's free from
       ads and tracking. And it protects your privacy more than other
       search engines, because searches don't pass through browser history
       (they're encrypted POST requests). It's free and anonymous, and
       there are no usage limits. You don't need to register, or install
       an extension or app.
        
       Andi is not just another copy of Google. The UX is radically
       different, like messaging with a smart friend who answers questions
       and sends you useful links. It shows results in a cleaner, more
       visual way (or you can change view to a simple list). You can
       preview content from the web safely using a proxied Reader View
       with no ads or clutter. Our AI assistant uses a conversational
       interface to answer complex questions, explain topics, and find key
       information. We call these "deep answers". It is a significant
       break-through, as you'll see if you try it out for yourself. Try
       something factual and current, like "How many Ukrainian refugees
       will the US accept, and what humanitarian aid will it provide?",
       "What were the demands of the cybercriminals who breached Nvidia?",
       or "Why is elon musk considering creating a new social media
       platform?"
        
       We're doing much more here than GPT-based text generation. You've
       probably seen examples from GPT writing assistants that look
       impressive but make no sense. Large language models on their own
       generate plausible-sounding text that is often plain wrong or
       dangerous. That's because they predict the next word in a sequence
       based on training data. They have no understanding of factual
       correctness, or moral right or wrong. They're like human linguistic
       intuitive perception. Our approach works more like humans do,
       combining large language models (both commercial and open source
       gpt-based models) with reasoning (directed logic and classifiers)
       and common sense (heuristics). We answer many questions using APIs
       or knowledge graphs, or quoting extracted text. When the question
       is appropriate for complex question answering, we use the new
       approach. It works by finding the best sources, and extracting the
       content with the relevant facts. We then combine GPT-based models
       with the results to compose a conversational answer that is also
       factually correct, presented alongside the full search results.
        
       The way Andi searches is also different. We use classifiers and NLP
       to understand question intent, entities and topics, and predict the
       best sources for an answer. Then we query APIs and vertical
       searches directly, and retrieve content in real-time, before
       ranking and filtering the results. The content you see in search
       results is retrieved directly from each site in real-time.
        
       When we can't find good results, we fall back as an agent to legacy
       web search (Google, Bing and others - about 50% of the time now).
       Andi does best with natural language queries. We've trained
       classifiers for content quality and spam detection, and blacklist
       and downrank known bad sources and copycat sites (for non-political
       content). You can disable these.
        
       The stack is a serverless application hosted on AWS, using Lambda
       and Kubernetes, with inference moving to Sagemaker to improve
       speed. We use PyTorch, SpaCy, GPT-based models (GPT-2, GPT-J/NeoX
       and commercial providers) and HuggingFace, BERT-style transformers,
       plus AWS Lex for some initial intent routing. Classifiers are
       trained on custom-labeled public search data and content examples.
       We have a database of 30k+ top sites. We're building some custom
       vertical searches. Services are written in Python and Node. The
       front end is a Progressive Web App written in React.
        
       Some fun features to try include recent events ("Why was the James
       Webb telescope launched from French Guiana?"), direct navigation
       ("go hn google search dying" and allow pop-ups), or question
       answering ("what is the gdp per capita of china vs new zealand").
       You can also "Change View" on results between a visual feed, grid
       of cards, or simple list, or even like Hacker News or early Google.
       Also try View in Reader for a proxied ad-free way to read articles,
       including many behind paywalls like the NYT or Economist.
        
       Andi is a fairly stable alpha and still experimental (it sometimes
       misunderstands or gets things wrong). We plan to have a freemium
       model with some paid features and API use. We're a small team with
       two full-time founders and some help from friends. We've been live
       for a few weeks, and we're iterating fast based on feedback. We'd
       love to hear what you think about search and how to fix it, and
       answer any questions you have about what we're making.
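
The retrieval flow described in the post (classify the query's intent, query a source chosen for that intent, filter and rank, and fall back to general web search when nothing good turns up) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Andi's code: `Result`, `classify_intent`, `allowed`, `search`, the keyword rules, the `min_score` threshold and the blocklist are all hypothetical stand-ins for the trained classifiers and real backends the post mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    score: float  # hypothetical content-quality score in [0, 1]


# A backend takes a query string and returns candidate results.
Backend = Callable[[str], List[Result]]


def classify_intent(query: str) -> str:
    """Toy keyword rules standing in for a trained intent classifier."""
    q = query.strip().lower()
    if q.startswith("go "):
        return "navigation"
    question_words = ("who", "what", "when", "where", "why", "how")
    if q.endswith("?") or q.startswith(question_words):
        return "question"
    return "information"


def allowed(result: Result, blocklist: Set[str]) -> bool:
    """Reject results whose host is on the (hypothetical) spam blocklist."""
    return urlparse(result.url).netloc not in blocklist


def search(
    query: str,
    backends: Dict[str, Backend],
    fallback: Backend,
    blocklist: Set[str],
    min_score: float = 0.5,
) -> List[Result]:
    """Query the backend for the detected intent, else fall back."""
    backend = backends.get(classify_intent(query))
    candidates = backend(query) if backend else []
    results = [
        r for r in candidates
        if allowed(r, blocklist) and r.score >= min_score
    ]
    if not results:
        # Nothing usable from the direct sources: use general web search.
        results = [r for r in fallback(query) if allowed(r, blocklist)]
    return sorted(results, key=lambda r: r.score, reverse=True)
```

In this sketch the fallback results skip the quality threshold and only pass through the blocklist, which is one possible design choice among several.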
        
       Author : jedwhite
       Score  : 101 points
       Date   : 2022-03-28 16:52 UTC (6 hours ago)
        
       | melony wrote:
       | This is pretty good. I noticed the academia search to be
       | particularly strong. Can you update the router to include the
       | query in the URL so they can be shared?
        
         | jedwhite wrote:
         | Thank you! With the URL, we need to add that! You can start
         | searches via URL (with /?q= or /?query=), but we need to make
         | it easy to share and it's totally hidden currently and not
         | discoverable at all :)
         | 
         | One cool thing: if you want to, you can avoid your searches
         | going through browser history by keeping them within the search
          | session. A few of our early user community use Andi as a
          | standalone PWA (Settings > Install for Chrome-based browsers,
          | or the browser address bar Install button). If it's running in
          | a standalone window it acts a little like a command center and
          | links open in your default browser, but your search session
          | stays separate. That's how I normally use Andi too.
         | 
         | Thank you for the feedback on the search results too! Lots of
         | work to do there - it's broad but shallow at the moment, with a
         | few deeper pools. Over time we're working to improve in more
         | verticals. There are areas like local business, places, food
         | etc that still have to be worked on, not to mention proper
         | internationalization and localization. It is stronger on
         | knowledge and information searching. We hope to keep iterating
         | and improving the results quickly.
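
The reply above says a search can be started from a URL with `/?q=` or `/?query=`. A minimal sketch of building and reading such a link with Python's standard library is shown below. The base URL and the two parameter names come from the thread; the helper names `shareable_search_url` and `query_from_url` are made up here, and whether the site accepts `+`-encoded spaces is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

BASE_URL = "https://andisearch.com/"
SEARCH_PARAMS = ("q", "query")  # parameter names mentioned in the thread


def shareable_search_url(query: str, param: str = "q") -> str:
    """Build a link that starts a search for `query` (hypothetical helper)."""
    if param not in SEARCH_PARAMS:
        raise ValueError(f"param must be one of {SEARCH_PARAMS}")
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return BASE_URL + "?" + urlencode({param: query})


def query_from_url(url: str) -> Optional[str]:
    """Recover the search text from such a link, or None if it has none."""
    params = parse_qs(urlparse(url).query)
    for name in SEARCH_PARAMS:
        if name in params:
            return params[name][0]
    return None
```

For example, `shareable_search_url("why is the sky blue?")` yields a link whose query string is `q=why+is+the+sky+blue%3F`, and `query_from_url` on that link gives back the original text.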
        
       | endisneigh wrote:
       | What's the monetization strategy?
        
         | jedwhite wrote:
         | We're thinking three things, although the details will likely
         | change as we're 100% focused on building the product currently.
         | 
          | 1. Freemium model - always free and anonymous for anyone to use
          | without any restrictions. Then paid pro plans with extra
          | features for teams/professionals and developer use, including
          | paid APIs and developer tools.
          | 
          | 2. Business/enterprise use - Andi could act as a front end to
          | both public and organizational data (the API approach and
          | question answering could both work well with that).
          | 
          | 3. Possibly some other options to consider like anonymous
          | referral link attribution. We won't let commerce influence
          | search results, so that rules out most things that aren't paid
          | plans, APIs, licensing.
         | 
         | That's our thinking so far. Our main focus is on coding and
         | keeping it improving, but we know this is an important issue so
         | it is something that we'll think carefully about once Andi is
         | further along.
        
           | candiddevmike wrote:
            | Honestly, I wish you could've found some way to start your
            | business without taking VC money. I don't see how any of
            | those strategies will be as profitable as inline search
            | ads, and your funding model pretty much forces some kind of
            | monetization scheme as the end game.
           | 
           | A better search provider starts with a non profit company, I
           | think.
        
             | jedwhite wrote:
             | I think there is huge potential for a freemium approach.
             | And it means that searchers are the customer not the
             | product. We went a long time with no funding and living off
             | toast and vegemite to survive, so we're incredibly grateful
             | to have some support to help us pay for food and API and
             | model training costs now.
             | 
             | We've received backing and funds from YC and angels so far.
             | We're lucky that we're a small team of 2, and we're frugal
             | about things, so we've tried to come up with cost effective
             | ways of building Andi.
             | 
             | My own feeling is that search is such a huge problem, and
             | the more people trying different things, different
             | approaches, commercial and non-commercial, the better for
             | the whole world. It's really exciting to see projects like
              | https://search.marginalia.nu/ popping up and getting lots of
             | interest and support as well. It has to be good for
             | everyone to have some innovation in search after so long as
             | a tired monopoly.
        
       | throwaway2474 wrote:
       | For a laugh, I tried the query from the other trending GPT
       | thread:
       | 
       | Is it safe to walk downstairs backwards if I close my eyes?
       | 
       | Thanks for your question. Let me research that for you. Give me a
       | minute please
       | 
       | I found this deep answer on cdotrends.com:
       | 
       | Yes, there is nothing to worry about. It's safe because the
       | spiral stairs curve outwards, it will make your descent
       | uncomfortable.
        
         | jedwhite wrote:
         | That's an interesting (and humorous) example. And although it
         | is counter-intuitive, it points towards where we're trying to
         | go long term here. Our aim is to find, extract and summarize
         | information from the web, rather than generate content based
         | purely on a language model. So rather than saying that it's
         | safe to walk downstairs backwards, it's saying that it found
         | this information (an answer from deep in the page, if you like)
         | and extracted it, and then rephrased it to answer the question.
         | 
          | Framing is definitely one of the areas where there is still
          | much work to do!
         | 
         | With this approach, the search app is finding, extracting and
         | summarizing information from the web. So unlike a large
         | language model on its own, this has the advantage that there
         | will be a small number of attributable sources, and it can
         | point to where the answer was generated from.
         | 
         | So Andi is acting like a researcher here, and as a result it's
         | most useful with factual content. With factual answers, it can
         | extract concrete information with high confidence (eg a GDP
         | figure from a knowledge graph or source like Wikipedia). But if
         | you ask it humorous, "tricky" or subjective questions, it is
         | essentially coming back with a summary of what the highest
         | ranked results have to say about it.
         | 
         | We're calling the more subjective ones "IMHO".
         | 
         | We've only just deployed this feature today, so it's still new
         | and very much an experiment.
         | 
         | Good example! :)
        
         | dmsnell wrote:
         | Q: What is a color that isn't red? A: "The color that isn't red
         | is magenta."
         | 
         | (included is a link to a site explaining that the red color of
         | Mars is only a few inches deep into the soil)
         | 
         | Edit: Added another question
         | 
         | Q: when is an appropriate time to cast your loved ones entirely
         | in wax?
         | 
         | A: You can cast your loved ones in wax at any time.
         | 
         | Very glad to know!
        
           | jedwhite wrote:
           | I think these go in our "let's break the AI" hall of fame :)
           | 
            | Having search results alongside answers does lead to
            | interesting serendipity, particularly if you are asking
            | compound questions that span a few topics.
            | 
            | When I was testing the new Q&A feature, one of the test
            | questions was "where was elon musk born and how did he first
            | become interested in computers?" It gives a pretty good
            | answer but some of the stories about his childhood and
            | bullying that came up were surprising.
           | 
           | Now I need to ask Andi about casting your loved ones in wax
           | to see what shows alongside that query!
        
           | averagedev wrote:
            | Ok, the last one is a pretty funny question. I tried it, and
            | I got a different result:
           | 
           | Q: when is an appropriate time to cast your loved ones
           | entirely in wax?
           | 
           | A: When you are ready to make a commitment to them.
           | 
           | That said, I like the product, and I think it has a lot of
           | potential.
        
       | dewey wrote:
       | I just tried to search "what to eat tonight" and got:
       | 
       | "You could make nachos, cottage pie, beef stew, chili, cheese and
       | ham filled pockets, pork posole, pork chops with a glaze, pizza
       | on French bread, fish topped with Parmesan, cioppino, or a pizza
       | quesadilla."
       | 
       | Guess I'll be busy for the next few hours.
       | 
       | (It looks like it parsed some "32 Delicious Dinner Ideas for
       | Tonight - The Kitchen Community" into one dish)
        
         | detaro wrote:
         | > _or_ a pizza quesadilla. "
         | 
         | It's clearly proposing options, not that you eat all of this?
        
           | InCityDreams wrote:
           | Now you tell me.
        
         | AngelaHoover wrote:
         | Thanks, I changed the settings for complex question answering
         | to high depth but slower and got "I love sushi. I'm learning".
         | Then I tried medium depth and speed and got "pizza, I love it"
         | haha Then I went back to best bet, and it answered "I'm not
         | sure yet" !
        
       | charcircuit wrote:
        | This is able to answer the tricky queries I threw at it, but
        | it's so slow compared to other search engines like Google or
        | Bing, which are also able to answer these queries.
        
       | paxys wrote:
       | While I'm impressed by the results, I'm not convinced that they
       | are significantly different from what Google or any other search
       | engine can serve, which you need to prove if you make claims like
       | "Google search is dying" and "results are overwhelmed with spam,
       | clickbait and ads".
       | 
       | In fact I tried all of your sample queries in Google and the very
       | first result was exactly what I needed, with the answer to the
       | question returned in the highlighted blurb.
        
       | asdadsdad wrote:
        | It's too slow to be of any help.
        
         | jedwhite wrote:
         | Thanks for the feedback on the speed. You'll notice the speed
         | varies a ton.
         | 
         | There are two modes here.
         | 
          | 1. Information search - average speed for these traditional
          | keyword-style searches is around 800ms to 1s, with a 90th
          | percentile of about 1.3 seconds. These are pretty quick
          | although we still have work to do. That's probably 80% of
          | searches.
          | 
          | 2. Question answering - this can take anywhere from a second
          | up to 24 seconds for complex searches. With question
          | answering, searches are more like research - they're ingesting
          | live content and doing semantic search on the content,
          | extracting content in real-time direct from the sources, and
          | then finding and extracting information to return facts and
          | direct answers. So that's something totally different to
          | traditional links on a results page because it's returning
          | direct information (and then showing results beside it).
          | 
          | To do this, we give question answering a lot more time and we
          | allocate more resources the more complex the question is.
         | 
         | You can see this if you try 2 searches to compare:
         | 
         | 1. Information search, say "elon musk latest news"
         | 
         | and
         | 
         | 2. "why did elon musk open a tesla factory in germany?"
         | 
         | The aim of the second one is that the system does the work of
         | reading through and finding the right information, rather than
         | you having to.
         | 
         | So you get a direct answer, for me it was:
         | 
         | "There are two unique goals that the Berlin location serves.
         | The first is that it is strategic to lure German automotive
         | talent to Tesla. The second is that it is a statement that Elon
         | wants to one-up auto companies from that region."
         | 
         | Some of our early users tell us this is the key thing they use
         | Andi for, because when it works well, it can save them a
         | substantial amount of reading and research time.
         | 
         | Our aim is to try to support both types of use - complex
         | question answering, and quick information searches. And have
         | the search results work in the way that supports best what a
         | user is trying to do in both cases.
         | 
         | It's a similar idea with navigation (go twitter elonmusk
         | berlin, etc). We know this is a very different approach and has
         | risks because it is so different.
         | 
         | Also, one really cool thing is that as a user, you can actually
         | control the trade-off between speed and complexity/depth of
         | searching.
         | 
         | If you jump to settings, you'll see there's a cool feature
         | where you can adjust this yourself.
         | 
         | The default is "Best bet" - it tries to guess how much time and
         | resource to give to a query based on the question asked. But
         | you can set it to either higher depth/slower speed (but much
         | better quality answers), or faster/not-so-smart. Or in-between.
         | 
         | We have a lot more to do on this. But when you're researching,
         | it can really help to give the system another 10 seconds to
         | save yourself 10 minutes, so that's the theory behind letting
         | people choose. You can also turn off the complex question
         | answering (Deep Answers) entirely on the setting, and then it
         | will be regular speed keyword searching.
         | 
         | Check it out here:
         | 
         | https://andisearch.com/settings/
         | 
         | You can see the impact pretty quickly with asking questions.
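
The reply above quotes an average latency and a 90th percentile for keyword-style searches. For readers unsure how those two figures differ, here is a small sketch that computes both from a list of timings. The sample values are invented for illustration and are not Andi's measurements, and the nearest-rank percentile used here is just one common definition.

```python
from statistics import mean
from typing import Sequence


def nearest_rank_percentile(samples: Sequence[float], pct: int) -> float:
    """Smallest sample such that at least `pct` percent of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding issues.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Made-up response times in seconds, for illustration only.
latencies = [0.8, 0.9, 1.0, 0.85, 0.95, 1.3, 0.7, 1.1, 0.9, 1.0]

average = mean(latencies)
p90 = nearest_rank_percentile(latencies, 90)
print(f"mean={average:.2f}s p90={p90:.2f}s")
```

The mean describes a typical search, while the 90th percentile describes the slow tail that one search in ten reaches or exceeds, which is often what users actually notice.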
        
       | freediver wrote:
       | Congratulations on your launch.
       | 
       | Some bad examples of fairly "easy" questions:
       | 
       | "Who is the current president" - No answer
       | 
       | "Who is the previous president" - George H.W. Bush (!)
       | 
       | "Is this a leap year" - "I find sci-fi inspirational! While I
       | might not be as advanced, I'm trying to do good in the world, and
       | I'm here and you can use me today " (?!)
       | 
        | On the other hand, it's good that it got this one right, as
        | even Google struggles with it:
       | 
       | "When did Neil Armstrong land on Mars" - "Neil Armstrong never
       | landed on Mars".
       | 
       | Problem with basing a product on QA models is that they are
       | pretty much hit and miss (no real "understanding" of the real
       | world, and this is not likely to change any time soon). So you as
       | a user can never be sure if the model is right or wrong for an
       | actual question that you do not know the answer to (which is what
       | you would presumably pay for). It is like paying an assistant
       | that gets things right only half or so of the time. A much better
       | situation would be if the assistant could tell they are going to
       | be wrong and just keep quiet, and answer only questions they are
       | very certain they are right about.
       | 
       | Hope capabilities improve with time!
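
The suggestion above, that an assistant should stay quiet unless it is very certain, is usually implemented as a confidence threshold on candidate answers. A minimal sketch follows, assuming each candidate already carries a confidence score from some upstream model. The `Candidate` type, the `answer_or_abstain` name, the example scores and the 0.8 default are all hypothetical, and in practice the score would need to be well calibrated for the threshold to mean anything.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    text: str
    source: str        # where the answer was extracted from
    confidence: float  # hypothetical model confidence in [0, 1]


def answer_or_abstain(
    candidates: List[Candidate], threshold: float = 0.8
) -> Optional[Candidate]:
    """Return the most confident candidate, or None to show results only."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= threshold else None
```

Returning `None` here would correspond to skipping the direct answer and just displaying the ordinary search results.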
        
         | divbzero wrote:
         | Recent proofs of concept like GPT-3, though far from perfect,
         | make me optimistic that good Q&A models are within reach.
         | 
         |  _Q: Did Finland join NATO or the Warsaw Pact?_
         | 
         |  _A: Finland was not a member of either NATO or the Warsaw
         | Pact._
         | 
         |  _Q: What is the most populous city north of the Arctic
         | Circle?_
         | 
         |  _A: Barrow, Alaska_
         | 
         |  _Q: Does GPS accuracy depend on special relativity or general
         | relativity?_
         | 
         |  _A: GPS accuracy depends on general relativity._
         | 
         |  _Q: How may Boeing and Airbus planes have been built?_
         | 
         |  _A: Boeing has delivered over 10,500 aircraft, while Airbus
         | has delivered over 6,300 aircraft._
         | 
         |  _Q: What is an example of an undecidable question?_
         | 
         |  _A: The Post correspondence problem is an example of an
         | undecidable question._
         | 
         |  _Q: Is suicide ever the right thing to do?_
         | 
         |  _A: Suicide is never the right thing to do._
         | 
         |  _Q: Who broke up the Beatles?_
         | 
         |  _A: John Lennon, Paul McCartney, George Harrison and Ringo
         | Starr_
         | 
         |  _Q: Who has beaten Tom Brady in the Super Bowl?_
         | 
         |  _A: Eli Manning and Nick Foles have both beaten Tom Brady in
         | the Super Bowl._
         | 
         | Already not too shabby.
        
           | jedwhite wrote:
            | Thank you so much! Andi's strong point is these types of
            | factual answers, and I think you're 100% right to be
            | optimistic about models being within reach.
           | 
           | The question answering feature has only just been released
           | for the first time today for this post, and for an entire
           | field of questions it already surprises us that not only does
           | it work but it does *well*. We can iterate on the intent
           | error correction and verbal tricks. And we're just a tiny
           | team standing on the shoulders of giants. The entire field is
           | moving quickly and making astonishing progress.
           | 
           | The exciting thing in this area is the rate of improvement.
           | The thing language models have lacked is factual accuracy,
           | and that's definitely a hard challenge. We have problems to
           | solve with applying common sense and reason to things like
           | information safety/confidence, and fixing misunderstood
           | intents is mostly just iterative training. But the exciting
           | thing is that this already works in many cases.
           | 
           | It's interesting to try it out with current news too.
           | Something from today like "why does tesla want to split its
           | stock?"
           | 
           | You can see the progress in this space is real and getting
           | faster. The verbal tricks are fun to laugh at, but the
           | underlying progress is real.
        
         | AngelaHoover wrote:
         | Thanks, I tried:
         | 
         | Q: "who is the current president" A: The current president of
         | the United States is Joe Biden since January 20, 2021
         | 
         | Q: "Who is the previous president?" A: Pulls a snippet from the
         | top of the wikipedia article 'List of presidents of the United
         | States'
         | 
          | Q: Is this a leap year A: Got the same response about sci-fi.
          | LOL, thank you, added to the bugs list! He's a friendly bot :)
         | 
         | Q: When did Neil Armstrong land on Mars A: Pulled a snippet of
         | text from Neil Armstrong's wiki page and says I found this
         | information on wikipedia.org:
         | 
         | I tried a different query,
         | 
         | Q: did Neil Armstrong land on the Moon? A: Yes, Neil Armstrong
         | landed on the moon.
         | 
          | It's experimental, but we know it can only improve over time
          | and we're really excited people are trying it out! And I agree
          | that we need to add additional parameters for when to perform
          | a deep answer vs. when to just display the results.
        
         | jedwhite wrote:
         | Thanks very much for trying it out and for giving us feedback.
         | We can't see what people search, so it's really helpful to see
         | reports when things go wrong.
         | 
         | For "who is the current president" I get "The current president
         | of the United States is Joe Biden since January 20, 2021". Did
         | you see any results at all? I'm guessing something glitched
         | there. The previous president I just got the same as you, so
         | that's one re-training!
         | 
         | It's early days, but it tends to do best with factual
         | extraction with concrete and specific terms included.
         | Linguistic references (anaphors, previous, next) are still
         | works in progress. But it's heading in the right direction. So
         | you can join concrete questions with more complex phrases and
         | it has enough to work with.
         | 
         | So if you ask:
         | 
         | "who is the 45th president of the the USA and where was he
         | born?"
         | 
         | you get:
         | 
         | "Donald Trump is the 45th president of the United States and he
         | was born in Queens, New York City."
         | 
         | The leap year one was way off! The intent detection was way off
         | the mark there!
         | 
          | Thank you again for giving us feedback. It really does help
         | having examples because we can't see what folks search. The
         | encouraging thing there is that it is almost always
         | misunderstanding the question when things go wrong, rather than
         | identifying inaccurate facts (so answering the wrong question
         | correctly). You can see that with these examples.
         | 
         | Our Discord community likes to have challenges to break Andi's
         | answers, and it really helps us see where it is strong and
         | where it needs the most work. There's a whole school of thought
         | around "tricking AI" and we know how to trick Andi pretty well.
         | But one of the advantages of the search approach is that the
         | underlying results are always displayed alongside the answer.
         | As with many early products, having people use it and give us
         | feedback on what goes right and wrong is incredibly valuable
         | for us, as we can keep iterating and improving the things that
         | go wrong.
         | 
          | So we're grateful for every example of where it is off the mark
          | as well as where it does well.
        
       | spindle wrote:
       | I really really love having easy access to a reader view that
       | isn't tied to a specific web browser. Thank you! \o/
        
         | spindle wrote:
         | I also love the lack of advertising.
        
       | aiyen wrote:
       | I searched "Is Medicare Part B optional?"
       | 
       | And got back: I found this information on hhs.gov:
       | 
       | What is Medicare Part B? Medicare Part B helps cover medical
       | services like doctors' services, outpatient care, and other
       | medical services that Part A doesn't cover. Part B is optional.
       | Part B helps pay for covered medical services and items when they
       | are medically necessary.
       | 
       | Glad to see that reference is given!
       | 
       | Though not sure if I like the chat interface. A search interface
       | like Google would probably encourage me to use this more as I'm
       | already used to using an interface like Google everyday.
        
         | jedwhite wrote:
         | Thank you! Andi is starting to get really good at those sorts
         | of serious questions, and that's a great example.
         | 
         | The trick questions that everyone likes to ask any new AI give
         | us great training data for improving the intent-detection edge
         | cases. But Andi's mission is these more serious questions that
         | it is trying to help people with. Linguistic acrobatics are for
         | the next generation of gpt-based models. So your feedback
         | really lifts a coder's heart!
         | 
         | Users fall into two camps with the UI - pretty much 50/50
         | between "it makes my eyes bleed make it look like google" vs "I
         | love it, this is the future". We have some thoughts on how to
         | reconcile the two things :)
         | 
         | Thanks again for trying it out and grateful for all and any
         | feedback and suggestions you have!
        
       | btdmaster wrote:
       | What's with all the analytics and UUIDs + obfuscated data
       | transfer to analytics.andisearch.com?
        
         | jedwhite wrote:
         | Thanks for the chance to chat a bit about this, as it's super
         | important to us and our early user community.
         | 
         | We're using PostHog on our own domain. We've spent a lot of
         | time talking with our early user community about the right
         | approach to privacy while still trying to understand enough
         | about whether the app is useful to keep improving it. We've
         | tried really hard to come up with a good approach that allows
         | us to understand broadly how people use the app while
         | protecting individual privacy. Early on we tried building a
         | custom system using differential privacy and that failed badly,
         | so we're trying PostHog during our alpha. But this is very much
         | still something we're working out, and we're very open to
         | feedback on it.
         | 
         | So it's worth sharing some details and our thoughts on how
         | we're approaching understanding app-use enough to keep
         | improving the search product, while also protecting our
         | community's privacy.
         | 
         | With PostHog, we don't auto-capture events, and only a limited
         | set of specific engagement indicators is recorded (list below).
         | We don't log or store searches, personal information, IP
         | addresses, or users' geocoordinates (we capture location only to
         | the nearest city so we can see what internationalization
         | priorities matter).
         | 
         | You can also disable the in-app engagement on the Settings page
         | here (as well as disabling the question answering tech):
         | 
         | https://andisearch.com/settings/
         | 
         | We don't log or record searches in any way (either from the
         | address bar or within the search session). We don't log what is
         | typed, the links clicked on, or any personally identifying
         | information. Users are anonymous and the client identifiers
         | aren't connected in any way across browser profiles, devices,
         | or anonymous use. We use the client id in aggregate to
         | understand whether there is repeat use and roughly how many
         | visitors we have, without knowing anything about any user
         | individually, and then we're discarding it and just keeping
         | aggregate data (still figuring out how to do that properly as
         | we're only a team of two people and have no analytics
         | background). So lots of work to do here.
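         | 
         | (Illustrative sketch, not Andi's actual pipeline: one way to
         | reduce client identifiers to aggregate counts and then drop
         | them. The event shape and function name are assumptions.)
         | 
```python
from collections import defaultdict


def aggregate_usage(events):
    """Reduce (client_id, day) events to counts; no ids are returned."""
    days_seen = defaultdict(set)
    for client_id, day in events:
        days_seen[client_id].add(day)
    visitors = len(days_seen)
    # A "repeat" visitor is one seen on more than one distinct day.
    repeat_visitors = sum(1 for days in days_seen.values() if len(days) > 1)
    # Only the aggregate numbers leave this function; the per-client
    # mapping is discarded when it goes out of scope.
    return {"visitors": visitors, "repeat_visitors": repeat_visitors}
```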
         | 
         | Things we try to understand about app use:
         | 
         | 1. Broad search intent (e.g. it was a knowledge search, wiki
         | search, programming search, or a question asked), but not what
         | the search was or what the results were, and without logging
         | any searches or what was opened. This tells us what broad areas
         | we need to improve.
         | 
         | 2. Engagement - that someone clicked a type of link (but not
         | what the link was), or used a reader view (but not what was
         | read), and whether anyone uses the different views (grid etc).
         | This gives us signals to improve the app.
         | 
         | The things we do to try to help protect privacy:
         | 
         | We don't store any cookies.
         | 
         | We block Google's FLoC (Federated Learning of Cohorts) tracking
         | technology from this app.
         | 
         | We don't log or store user IP addresses. An IP address is used
         | to look up approximate location (nearest town) for location
         | searches only, then discarded. It is never passed to
         | third parties.
         | 
         | We only use GPS or detailed location for searches with express
         | user permission, and then only to approximate the area. GPS
         | location details are not stored or passed to any third-parties.
         | 
         | Searches are anonymous and private to users. We don't log
         | searches.
         | 
         | We only use analytics within our service to improve it for our
         | users, and only record broad aggregated engagement data. We are
         | using PostHog on our own domain, with data restricted to
         | specific engagement actions and no IP use.
         | 
         | We block referrers on external links and use "nofollow noopener
         | noreferrer" to protect you.
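         | 
         | (Illustrative sketch, not Andi's actual code: how an external
         | link might be rendered with those rel values. The helper name
         | and the extra referrerpolicy attribute are assumptions.)
         | 
```python
from html import escape
from urllib.parse import urlsplit

# The rel values quoted in the comment above.
SAFE_REL = "nofollow noopener noreferrer"


def external_link(url, text):
    """Render an anchor tag for an external URL with protective rel values."""
    scheme = urlsplit(url).scheme
    # Refuse anything that is not a plain web link (e.g. javascript: URLs).
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme: {scheme!r}")
    return (
        f'<a href="{escape(url, quote=True)}" rel="{SAFE_REL}" '
        f'referrerpolicy="no-referrer">{escape(text)}</a>'
    )
```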
         | 
         | We do not share or sell customer or personal data with any
         | third parties whatsoever.
         | 
         | We collect only the data needed to provide the service.
         | 
         | We don't use any off-site or third-party industry user
         | tracking. There is no ad tracking such as Facebook's or third-
         | party analytics platforms like Google Analytics.
         | 
         | No advertising display or advertising tracking.
         | 
         | We use randomized proxies to retrieve content for preview and
         | reader mode.
         | 
         | We use https encryption everywhere including for external links
         | wherever available.
         | 
         | We proxy images and try to strip third-party cookies from any
         | reader content as much as possible.
         | 
         | We use anonymous rotating proxies with all identifiers stripped
         | to connect to external APIs for searching.
         | 
         | We display embedded videos and content for our users'
         | convenience (so you can play a YouTube video in chat), but they
         | are in a sandbox to help protect a bit, and restricted to only
         | services that users have asked us to support (like YouTube or
         | Spotify). We use the no-cookie domains but an embedded video
         | might have cookies outside of our control.
         | 
         | Keeping searches within encrypted POST packets also helps with
         | privacy, because searches aren't being leaked to browser
         | vendors through browser history.
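         | 
         | (Illustrative sketch, not Andi's actual client: a search sent
         | as a POST keeps the query in the request body rather than in
         | the URL, so it is not part of the address that browser history
         | records. The endpoint and the JSON field name are
         | hypothetical.)
         | 
```python
import json
from urllib.request import Request


def build_search_request(endpoint, query):
    """Build (but do not send) a POST request carrying the query in the body."""
    body = json.dumps({"q": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The URL stays the same for every search; only the body changes.
request = build_search_request("https://search.example/api", "mars rover")
```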
         | 
         | So we have a long way to go, and we're still figuring this out.
         | Before we exit beta we've also committed to have our privacy
         | audited. But as an early alpha this is still very much a work
         | in progress.
         | 
         | There are some more details on our privacy page also:
         | 
         | https://andisearch.com/privacy/
         | 
         | We also shared an early prototype with some of the privacy
         | focused communities on reddit and got some great feedback there
         | as well, and we have an active Discord with a lot of discussion
         | around privacy also if you'd like to chat more about this and
         | are interested in what we're doing on this front. It's
         | something we want to try to get right.
         | 
         | https://discord.gg/qcCcrbMuex
         | 
         | I was hoping someone would ask this, and thank you for giving
         | us the chance to share a little more about it :)
        
       | [deleted]
        
       | pouyarad_ wrote:
       | Congrats to the Andi team! I'm excited to see the innovation
       | continuing in this space.
        
         | InCityDreams wrote:
         | Wow. You and poster picodguyo use very similar language "I love
         | the innovation going on in the search space lately." What are
         | the chances of that?
        
           | AngelaHoover wrote:
           | Hey, we don't know picodguyo but Pouya is one of our W22
           | batchmates and just trying to be supportive of us. It's my
           | bad, I'm sorry!
        
       | riidom wrote:
       | The social security agency of the USA seems to be in high praise
       | of you, is there any connection between you and them?
        
       | joefigura wrote:
       | Cool stuff! A low-noise, low-spam alternative to Google is
       | desperately needed. Excited to see where the product goes!
       | 
       | How do you plan to make money?
       | 
       | Found a bug on one search result - searching "chicken salad
       | recipes" produces a solid blue screen.
        
         | jedwhite wrote:
         | Oh no, so sadly we have our own Blue Screen of Death!
         | 
         | And it's reproducible as I just got it for that search too!
         | I'll report back once I've fixed that!
         | 
         | Thank you sincerely for trying out Andi. There is a lot more
         | coming with anti-spam. I hate spam.
         | 
         | We want to have a freemium model where anyone can use free
         | anonymous search without restriction, and then we have some
         | paid plans and features, like API use. We still have to work
         | out the details, but we want to have a model where our users
         | are the customer rather than the product. And we think that
         | aligns the company solidly with looking after real people,
         | rather than serving advertisers or corporations.
         | 
         | We have to still work out in detail how it would work though.
        
       | jsnell wrote:
       | I tried [when is ambulance coming out], and you did correctly
       | understand it was referring to the movie. But the result is
       | strictly worse than the same search on Google (which also knew it
       | was a movie, but actually gave an instant answer to the question
       | rather than just an IMDB link, and answered the question
       | correctly for my location, unlike the IMDB page). Bing also
       | returned an instant answer, but their answer is even more wrong
       | (the release was pushed back by a couple of months; Bing claims
       | that the movie would be released at the future date of February
       | 2022).
       | 
       | Then I tried [will there be a wheat shortage this year?], and the
       | top result was a lunatic fringe prepper blog, from which you
       | extracted a confident answer that there will definitely be a
       | wheat shortage. Google doesn't try to answer the question (which
       | I'd argue is correct for something this speculative), and only
       | links to reputable news sources all of which look relevant. Bing
       | is the same, but mixes in some more alternative results at the
       | bottom of the page.
       | 
       | Then, since you said that this worked best in a conversational
       | manner I asked [is it due to the war?], obviously referring to
       | the previous question. Rather than keep the context of that
       | previous search in mind, you replied with information on the
       | current local weather. (In Fahrenheit, for a location in Europe).
       | 
       | At this point, combined with how incredibly slow the searches
       | are, my goodwill would be gone as a real user.
        
         | jedwhite wrote:
         | Thank you for trying it out, and giving us feedback on the
         | searches. We don't log searches, and can't see what results
         | users get, so we rely heavily on community feedback to improve
         | things when searches go wrong.
         | 
         | Interestingly, with the Ambulance search I see:
         | 
         | ```I found this deep answer on imdb.com:
         | 
         | Laurits Munch-Petersen (based on the film "Ambulancen" written
         | by) Lars Andreas Pedersen (based on the film "Ambulancen"
         | written by) Stars Jake Gyllenhaal Yahya Abdul-Mateen II Eiza
         | Gonzalez See production, box office & company info Coming soon
         | Releases April 8, 2022 40 User reviews 28 Critic reviews Videos
         | 3 Trailer 2:58 Official Trailer ```
         | 
         | I'm not familiar with the movie, but one thing that really
         | helps us improve results is feedback on what the right result
         | *should* be to help retrain based on examples for ranking.
         | 
         | At the moment, the alpha is very US-centric (imperial units
         | only, no local business/maps etc), although our early user
         | community is global and they're pushing us to add better
         | international support quickly. Funnily enough, it's often
         | decent at searches in different languages as it uses APIs and
         | live content sources. But the front-end is not localized. As a
         | small two person team, we hope to get some help on this front
         | in future. There is a meaningful subset of folks who prefer not
         | to have locally filtered results. So we have some work to do
         | figuring out the right approach there.
         | 
         | With the wheat result, that's an interesting example. The other
         | results look reasonable but the ranking was out there
         | definitely. The Q&A process picked the wrong representation of
         | the consensus result from across the set of similar views from
         | Reuters, Fox etc.
         | 
         | This is a good case for user control of results I think.
         | 
         | We're working on anaphor resolution and better disambiguation
         | but have a lot of work to do there. There are some simple
         | things already in the alpha. eg try putting in "Paul Graham" :)
        
           | jsnell wrote:
           | > Interestingly, with the Ambulance search I see:
           | 
           | Right, that's a legit page info for that movie. But your
           | search engine doesn't actually extract the answer to my
           | question ("releases april 8th") from the page, and this
           | should have been about as easy as it gets. The answer happens
           | to be in the snippet, but the snippet is unreadable.
           | 
           | Now, I wasn't seriously expecting you to have localization of
           | results at this point, but the query wasn't meant to trip the
           | system up either. I only realized this wasn't a globally
           | synchronized release date when comparing the results. (And
           | again, this was about the easiest possible case for
           | localization. IMDB has a page with structured data for
           | release dates by country, and you clearly are using
           | geolocation for the query that produced local weather
           | results.)
        
             | jedwhite wrote:
             | Yes any localization is very much a future feature. There
             | are some small functions that use it that are things our
             | user community has asked for, but we're very much taking an
             | iterative approach. I think with a small team you have to
             | do that, and accept that early on you can't do everything
             | but that you can keep iterating and improving quickly based
             | on the things that users say are most important.
             | 
             | The most important feature for our early users so far,
             | based on the feedback we get, is factual information
             | question answering, and there are some classes of query
             | where Andi is already very useful. And as it gets better we
             | hope it can save people a lot of time. Reader view is
             | popular for the same reason because it saves people time
             | and distraction.
             | 
             | At the same time, I think everyone (myself included) likes
             | to play "let's trip up the AI" with something new like
             | this, and it's fun and a great way to get training data and
             | share a laugh on our Discord. Since we don't log searches,
             | it's some of the most valuable data we can get :)
        
       | O5vYtytb wrote:
       | I tried a softball:
       | 
       | > what instrument did jimi hendrix play
       | 
       | > Jimi Hendrix played the bass guitar, wooden recorder, and
       | keyboard.
       | 
       | Mostly accurate, but misses the main answer.
        
         | spindle wrote:
         | Andi gets this right now:
         | 
         | > "Hendrix played the electric guitar, and his main electric
         | guitar was the Stratocaster."
         | 
         | But this raises a new problem: the quotation marks there are
         | part of Andi's reply, and it says "Source: What instruments did
         | Jimi Hendrix play? - Musician Authority", but the answer it
         | gives is NOT actually a quote from that source, it's a
         | paraphrase of part of that source. It's a really good
         | paraphrase, so that's impressive, but I feel strongly that it
         | should not have quotation marks around it if it's not a quote.
        
       | CPancholiUS wrote:
       | Would be nice to have kid-safe option of results...
        
         | jedwhite wrote:
         | Yes! Safe Search is in the pipeline. Images especially.
         | 
         | Most of the information safety work so far is around sensitive
         | or unsafe topic handling, and trying to just fall back to
         | straight search results when a topic is unsafe. This approach
         | has some special challenges with that, because if someone
         | specifically searches for keywords on controversial topics, and
         | you're answering questions about them, then it quickly gets
         | into dangerous territory. In this case it's not because the
         | model is being offensive, but it could well be answering
         | specific questions about dangerous content. People are used to
         | getting search results matching dangerous keywords they enter.
         | They'll be less used to seeing answers based on the content. So I
         | think we just need to avoid Q&A on any sensitive topic. The
         | downside is that while it is already quite good at answering
         | questions about news and politics, it currently stays away from
         | political or sensitive topics, and just presents search results
         | for this.
        
       | 58x14 wrote:
        
       | krmmalik wrote:
       | I asked "How can I stop my kidneys being overactive" and got a
       | surprisingly satisfactory result and my question has been
       | answered.
       | 
       | Honestly speaking, I had very low expectations going in so this
       | was a nice surprise.
        
         | jedwhite wrote:
         | Thanks for trying it out. I'm a Type 1 Diabetic, so health
         | search is one of the areas I really want us to help improve for
         | the world. We have a long way to go but it's one of the
         | suckiest areas of search with spam and clickbait taking over
         | the rest of the web. We're still putting together our spam
         | blacklists for health so all ears on suggestions, but we have a
         | pretty good model for detecting decent content, and a start at
         | some vertical "meta indexes" of good resources.
         | 
         | Very grateful for you trying out Andi, and for your
         | encouragement!
        
       | ShowalkKama wrote:
       | I am impressed: I tried a couple of python libraries / tools and
       | got all perfect results (official docs / man page). I'm
       | considering adding this to my bookmarks to more easily find
       | documentation (since it's usually hidden by 50 km of spammy blog
       | posts) since the only issue I could notice was the speed.
        
         | jedwhite wrote:
         | Programming search is both one of its strong areas (we built
         | Andi using Andi), but also the one with the most still to
         | improve. So I'm grateful you tried it for that. We're doing
         | pretty well with binning copycat and link farms in programming
         | searches as well. Try asking some questions from documentation
         | as well. Something like:
         | 
         | "what is the maximum memory you can allocate to an aws lambda
         | function?"
         | 
         | Lots to do here. Thank you for your encouragement and the kind
         | feedback. We've been working hard especially to get rid of spam
         | from programming results, and it's one of the first areas we've
         | made real progress on, with more to come!
        
       | smt88 wrote:
       | I emphatically don't want this. I want literal keyword search
       | like we used to have in the 90s and early 2000s.
       | 
       | NLP isn't good enough (and may never be good enough, unless it's
       | at the level of an intelligent human) to add enough benefit for
       | the cost -- the "cost" being when the NLP classifies my words and
       | gives me "fuzzy" results that have nothing to do with my original
       | search.
       | 
       | NLP has made it literally impossible to search for certain things
       | if it's enough of a niche, or in some cases even if it isn't a
       | niche.
        
         | version_five wrote:
         | One thing that is almost always missed in conversational
         | interfaces is that real life conversation works because it
         | converges (ideally) to a meeting of the minds where both
         | parties refine their understanding of what the other wants
         | until they understand each other. I have not seen natural
         | language search that does this, it's all just some shitty
         | intent resolution on top of a list of things that can be
         | returned. Until that is fixed, it doesn't perform better than
         | keywords for anyone who is remotely literate.
         | 
         | I feel like the main use case for natural language search would
         | be a time traveler from 100 years ago. Otherwise, everyone
         | already understands better and is not impressed. The only
         | people I see who want chatbots are lazy executives who think it
         | will let them fire people. I don't see a place for them at all
         | in search.
        
         | jedwhite wrote:
         | Thank you for your thoughts and I miss literal searches too.
         | Our feeling is that they don't have to be mutually exclusive.
         | So we're trying to build both, including slash commands,
         | advanced operators and some ideas borrowed from the terminal.
         | 
         | While it's early days, we've already got some cool things
         | working on that front, including the ability to totally bypass
         | NLP and do advanced operators.
         | 
         | So for example, try some of the following:
         | 
         | Use '/e' to skip the AI (exact):
         | 
         | ```/e site:cnn.com +"mars rover"```
         | 
         | DDG-style !bang commands are supported. Try:
         | 
         | ```!a shoes```
         | 
         | Search X for Y - and see the results within Andi:
         | 
         | ```/s reddit duckduckgo```
         | 
         | ```/s google python list comprehension```
         | 
         | or just:
         | 
         | ```search reddit for duckduckgo```
         | 
         | Direct navigation to on-site searches with 'go' or '/g':
         | 
         | ```/g twitter elonmusk free speech```
         | 
         | ```go youtube cute pandas```
         | 
         | ```go politics on reddit```
         | 
         | You can use this to directly navigate to pretty much any site,
         | or any search result on any site.
         | 
         | There's quite a bit more, and more coming.
         | 
         | So it has both direct literal commands (including operators
         | like quotes etc where we can), and also natural language.
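         | 
         | (Illustrative sketch, not Andi's actual parser: one way the
         | prefixes above could be routed. The function name and the mode
         | labels are assumptions, and natural-language forms such as
         | "search reddit for duckduckgo" or "go politics on reddit" are
         | not handled here.)
         | 
```python
import re

# Operators that signal a literal search: site:example.com or +"phrase".
OPERATOR_PATTERN = re.compile(r'(\bsite:\S+|\+"[^"]+")')


def parse_query(raw):
    """Split a raw query into (mode, target, terms)."""
    text = raw.strip()
    head, _, rest = text.partition(" ")
    rest = rest.strip()
    if head in ("/e", "~e"):
        # Explicit request to skip the AI and search literally.
        return ("exact", None, rest)
    if head.startswith("!") and len(head) > 1:
        # DDG-style bang: "!a shoes" -> bang "a", terms "shoes".
        return ("bang", head[1:], rest)
    if head in ("/s", "/g", "go") and rest:
        site, _, terms = rest.partition(" ")
        mode = "search_site" if head == "/s" else "go"
        return (mode, site, terms.strip())
    if OPERATOR_PATTERN.search(text):
        # Advanced operators without a prefix also bypass the NLP path.
        return ("exact", None, text)
    return ("natural", None, text)
```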
         | 
         | Natural language has a lot of advantages working with language
         | models for things like question answering. There are subtle
         | clues in the way we say things to each other as humans that
         | help supply meaningful information, and for those sorts of
         | questions, Andi does much better when people ask questions in
         | plain language including extra information above keywords.
         | 
         | So we're trying to build a search that supports both
         | approaches.
         | 
         | Also, more terminal like things are coming. Try [space] and
         | up/down arrow to start navigating history.
        
         | alimoeeny wrote:
         | maybe another way to say this is: this is fantastic, keep up
         | the good work. However, can you also PLEASE, add a pure
         | "keyword search" where I can type a keyword and get a list of
         | pages that have that exact word in them? in some cases (which
         | might be the minority of searches, but still very important) I
         | need to search for an exact good old fashioned keyword or
         | phrase (similar to what `find` does except sorted by some
         | magical "page rank" like metric). Like the good old days of
         | google. Maybe this can be a paid feature?!
        
           | jedwhite wrote:
           | Hey just posted on another comment, but I'm with you 100%!
           | 
           | You can over-ride the NLP and do literal searches.
           | 
           | /e or ~e force it, but any search with things like +"phrase"
           | or site: etc will skip the AI and we'll *try* to do a literal
           | search. It's not always successful but we're trying to expand
           | coverage with it.
           | 
           | So try:
           | 
           | ```/e site:cnn.com +"mars rover"```
           | 
           | for example.
           | 
           | The Search Tools menu on the results panel lets you refine
           | searches and open related sites, cached versions etc., which
           | also helps with refining via literal patterns.
           | 
           | Thanks again!
        
       | picodguyo wrote:
       | I love the innovation going on in the search space lately.
       | Defaulting to a chat interface adds a lot of burden on your users
       | (need to formulate a longer query) and yourself (need to nail the
       | response in top1). Personally I think there's enough cool stuff
       | in your grid/list views to make it useful without trying to
       | shoehorn this into a chat UX.
        
         | AngelaHoover wrote:
         | Thanks for the great question. Andi works well for keyword
         | searches but where it really shines is when you give it more
         | context. The key reasons for a chat interface are that it's
         | simple, familiar and uncluttered, and reveals information
         | progressively.
         | 
         | * Chat is minimalist. Part of the problem with google is
         | "information overwhelm" - all the clutter and distraction in
         | results.
         | 
         | * It's a super easy and familiar interface. Gen-Z users
         | especially tell us they prefer messaging apps and visual feeds
         | like Instagram.
         | 
         | * Combined with visual cards, a chat UI gives you progressive-
         | reveal of information.
         | 
         | * Conversational search long-term provides a natural and very
         | human way to explore and refine results. Humans are really
         | great at querying each other and maintaining context. And long
         | term that's the aim here.
         | 
         | When you think of sci-fi AI, they are always conversational. It
         | seems likely that's what the future will look like, rather than
         | a page of truncated links with a lot of ads. So that's our
         | thinking for the conversational interface, and we're super open
         | and excited to hear more feedback on it! :)
        
       | quinncom wrote:
       | This looks great, and works well so far. I'm using it on mobile
       | where dictation lends itself better to using natural language.
       | 
       | Feature request: I'd like to limit results to a recent time
       | range, e.g., the last five years. For many queries, anything
       | older will be outdated.
        
         | AngelaHoover wrote:
         | Thank you so much! Agree with you that the direct question
         | answering on mobile fits nicely. We appreciate the feedback on
         | adding a feature to limit the results based on a time range,
         | it's a good idea and I can see a lot of use cases for that.
        
       | jedwhite wrote:
       | A few folks may have seen a preview of Andi before (there were a
       | couple of kind mentions on here before we launched). If you did
       | try it out (thank you!) and want to check out the new version
       | with the question answering, you may need to refresh/reload the
       | page (or quit browser) to force an update. It's a little
       | Progressive Web App so it does background-updates, which helps
       | performance but means updates don't happen immediately. You can
       | always try it out in a new Incognito/Private window. Thanks
       | again!
        
       | [deleted]
        
       | herodotus wrote:
       | I am glad to see work like this going on, but when I looked up
       | "BC Ferry Schedule" I got this answer:
       | 
       | "I found this information on bcferries.com:
       | 
       | Very weak Weak Medium Strong Very strong
       | password.strength.unsafepwd Too short Use %d - %m characters with
       | a mix of any 3 [upper case letters, lower case letters, numbers &
       | symbols] 8 32 Password must be more than eight characters and
       | contain a mix of any three of the following: upper case letters,
       | numbers and symbols. Password must be less than 32 characters and
       | contain a mix of any three of ... "
       | 
       | So you may want to watch for password protected websites...
        
         | AngelaHoover wrote:
         | Thanks for pointing this out and I added the query to our bugs
         | list. This is still an experimental feature so there are kinks
         | to work out but feedback helps us immensely. I tried a couple
         | of variations of the search with increased specificity like "bc
         | ferry schedule for Coastal Inspiration" and Andi is struggling
         | with this query! As Jed mentioned in a comment earlier, you can
         | also turn off complex question answering in settings.
        
         | jedwhite wrote:
         | Thanks - yes we don't have local searches working well yet and
         | it is very US-centric for those that are there.
         | 
         | The password information is interesting. In the old world,
         | you'd find out it is now password protected only when you go
         | there. Because Andi grabs content live from the source, you can
         | see they've changed it right there in the search results. I'm
         | guessing the page must have been open before, and now it's
         | password protected (or they let robots.txt in but not user
         | agents).
         | 
         | Andi is retrieving content to show in results via anonymous
         | proxies. So the search results page reflects what is on the
         | live site (so long as we can grab the latest content).
         | 
         | Thanks for trying out Andi and interesting edge case! I think
         | the best thing to do will be filter pages out of the live
         | results page on the fly if it turns out it's been password
         | protected very recently when a user sees the results.
        
       ___________________________________________________________________
       (page generated 2022-03-28 23:00 UTC)