[HN Gopher] Show HN: Answer Overflow  - Indexing Discord content...
       ___________________________________________________________________
        
       Show HN: Answer Overflow  - Indexing Discord content into the web
        
       Hi! I'm Rhys, and I develop Answer Overflow, a search engine for
       Discord channels. Answer Overflow indexes content from channels
       into Google, making it discoverable on the web.  I'm sharing this
       again after seeing a lot of discussion during the Reddit blackout
       about the inaccessibility of information sent in Discord servers.
       Answer Overflow is a verified bot in over 100 communities, fully
       complies with the Discord ToS, and is open source!
       https://github.com/AnswerOverflow/AnswerOverflow  Check out some of
       the communities here!  T3 Community -
       https://www.answeroverflow.com/c/966627436387266600  C# -
       https://www.answeroverflow.com/c/143867839282020352  Reactiflux -
       https://www.answeroverflow.com/c/143867839282020352  All -
       https://www.answeroverflow.com/browse  Please let me know what
       feedback you have, thanks for checking it out!
        
       Author : rhyssullivan1
       Score  : 80 points
       Date   : 2023-06-18 19:50 UTC (3 hours ago)
        
 (HTM) web link (www.answeroverflow.com)
 (TXT) w3m dump (www.answeroverflow.com)
        
       | berkle4455 wrote:
       | I'm sure Discord and their communities are absolutely ecstatic
       | about opening up the doors to openAI and others to scrape their
       | collective work for the latest LLM.
       | 
       | Walled gardens are going to get a whole lot stricter.
        
       | mdaniel wrote:
       | Welcome back. How does this compare to Linen
       | (https://github.com/linen-dev/linen.dev#readme), which claims to
       | support Slack and Discord? I do see the license difference, but
       | didn't know if that was the major differentiator
        
         | rhyssullivan1 wrote:
         | Couple key differences:
         | 
         | - Answer Overflow works on a consent basis for displaying
         | messages (https://docs.answeroverflow.com/user-
         | settings/displaying-mes...), while Linen displays all the
         | messages in a community. The consent system Answer Overflow
         | has helps a lot with respecting user privacy while still
         | getting content indexed.
         | 
         | - Linen appears to be building out a competitor to Slack &
         | Discord while Answer Overflow is focused on building on top of
         | those platforms, so we've got very different roadmaps. From
         | what I can gather from the Linen roadmap, they're implementing
         | things like voice chat, private channels, etc. With Answer
         | Overflow, some of the things I'm focused on are answer
         | automation, tracking outdated answers, and analytics for where
         | to improve your docs.
         | 
         | - Answer Overflow is pretty much only focused on Discord
         | servers. It wouldn't be too hard to support both Slack and
         | Discord, but focusing on Discord for now helps with our goal
         | of being the best indexing tool specifically for Discord.
         | 
         | - Global search (https://www.answeroverflow.com/search): you
         | can search all Answer Overflow communities at the same time.
         | 
         | The team at Linen have built out a great product though and
         | it's cool watching them succeed with it!
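
The consent-based display described above can be pictured as a filter applied
before anything is published. This is only a sketch under assumptions: the
`Message` type, the `publicly_displayable` function, and the idea of a set of
opted-in author IDs are hypothetical names for illustration, not Answer
Overflow's actual data model or API.

```python
from dataclasses import dataclass


@dataclass
class Message:
    author_id: str  # hypothetical: ID of the user who sent the message
    text: str


def publicly_displayable(messages, consenting_author_ids):
    """Keep only messages whose author has opted in to public display."""
    return [m for m in messages if m.author_id in consenting_author_ids]
```

With a filter like this, a thread can still be indexed while messages from
users who have not opted in are left out of the public page.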
        
       | mid-kid wrote:
       | I was talking about needing a solution like this just a second
       | ago. Down from the heavens, descends this. I'll be sure to give
       | it a try!
        
         | rhyssullivan1 wrote:
         | Send me a message if you have any questions! Happy to help with
         | getting it set up.
        
           | mid-kid wrote:
           | This might sound a little bit picky, but from a cursory look
           | around the project, it feels a bit too corporate and
           | platform-ey for my tastes. I'm only interested in two things:
           | generating (ideally static, and seo-friendly) web pages out
           | of a discord forum channel and selfhosting it so we can
           | archive the data ourselves (and won't be bound to content
           | policies of answeroverflow.com). All of the extra bells and
           | whistles (the bot auto-managing channels, analytics, AI and
           | whatever else) seem superfluous and make me sweat a little,
           | as I'll have to comb through the documentation to make sure
           | everything is set up correctly. It's also really a shame to
           | read that selfhosting will be a "Pro" feature. I'll give
           | props for considering users wanting to opt-out, however, and
           | it does at least seem rather simple to set up.
        
             | rhyssullivan1 wrote:
             | Where did you see self hosting is a pro feature? My bad if
             | the website gives that impression. It will be free; the
             | whole codebase is MIT licensed.
             | 
             | As for all the extra bells and whistles, they're mainly for
             | people doing community support at scale, who would be paid
             | customers. I do sort of need a way to support myself so I
             | can buy groceries. The core of the product that matters is
             | free and working well for indexing content, so now the
             | focus is "what else can we do to improve community support
             | as a whole?"
             | 
             | As for self hosting, if you submit a PR for supporting it
             | I'd be happy to get that merged, but it's not really a
             | priority at the moment. The codebase is set up so that
             | making a self hosted version should be pretty easy, though.
        
               | mid-kid wrote:
               | Haha, that's fair. I'll consider trying to set it up
               | myself and see how it goes.
               | 
               | I got the idea that it was a pro feature out of the
               | roadmap list on the website, where it's listed as "coming
               | soon", and "pro" is only mentioned when you click on the
               | waitlist join link. If it means custom domains, it might
               | be better off being listed as "custom domains" or
               | something similar. That's what it's called on Google Apps
               | and such. It also doesn't help that the roadmap on the
               | website doesn't match the one on the GitHub page; I
               | thought the roadmap features on the GitHub page might be
               | pro features as well.
        
               | rhyssullivan1 wrote:
               | Ah I see how that's confusing, sorry about that! I'll
               | update it in both places to make that clearer
        
       | tudorw wrote:
       | nice, there is a lot of good stuff on discord!
        
       | apignotti wrote:
       | Genuine question: I love Discord, but how on earth is it possible
       | that such functionality was not built-in to begin with?
       | 
       | I really don't understand how the need for indexing and search
       | was overlooked.
        
         | Kiro wrote:
         | It makes no sense to index the vast majority of content. You
         | would need to cherry pick really hard among all the noise to
         | find the stuff worth putting online.
        
           | jasonjmcghee wrote:
           | Interesting comment. I would think Reddit is similar in terms
           | of content, yet "site:reddit.com <query>" is common as a
           | general search pattern (pre-blackout)
        
           | michaelmior wrote:
           | I would argue it makes no sense to index the vast majority of
           | content _without good search_. If your search is good enough,
           | you can index everything and then surface only the good stuff
           | at query time.
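
The "index everything, surface the good stuff at query time" idea can be
sketched with a tiny inverted index. This is a minimal illustration under
assumptions: the `text` and `quality` fields and the function names are
hypothetical, and `quality` stands in for whatever signal a real system might
use (for example, an accepted-answer flag). Real search engines use far more
sophisticated ranking.

```python
from collections import defaultdict


def build_index(messages):
    """Map each lowercase term to the set of message positions containing it."""
    index = defaultdict(set)
    for i, msg in enumerate(messages):
        for term in msg["text"].lower().split():
            index[term].add(i)
    return index


def search(index, messages, query, limit=3):
    """Return positions of matching messages, best first.

    Ranking: more matched query terms first, then higher quality,
    then original order as a deterministic tie-breaker.
    """
    matches = defaultdict(int)
    for term in query.lower().split():
        for i in index.get(term, ()):
            matches[i] += 1
    ranked = sorted(
        matches,
        key=lambda i: (-matches[i], -messages[i]["quality"], i),
    )
    return ranked[:limit]
```

Nothing is filtered out at indexing time here; noise simply never matches, or
sinks below higher-quality results, when a query is run.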
        
         | thunky wrote:
         | What I wonder is why anyone who cares about archiving/search
         | would choose to use Discord.
        
         | esafak wrote:
         | It's not made for knowledge discovery; it's for gamers. Just
         | look at that busy UI! The content is assumed to have no
         | historical value.
        
         | thrashh wrote:
         | Discord is a chatroom first. What non-enterprise chat comes
         | with archives?
         | 
         | A forum is totally different.
         | 
         | And even then, forums weren't designed to be archived from the
         | start. People just wrote web crawlers and search engines.
         | 
         | (I know Discord has some forum-like functionality now but the
         | point stands.)
        
         | rhyssullivan1 wrote:
         | I think it's due to how Discord evolved as a platform.
         | 
         | Discord started as "your private place for your friends to
         | talk" at a time when there were a lot of privacy issues with
         | other communication methods.
         | 
         | Then, as it grew beyond being a private place for friends, it
         | would have been good for indexing to be added, but indexing a
         | normal text channel is really hard since you don't know where
         | a conversation starts or stops, so there is no clear unit to
         | submit to a sitemap.
         | 
         | Now we've got large public communities and forum channels so
         | it's possible they roll out their own version soon, but it does
         | still slightly go against how their product was originally
         | created so there may be some hesitation with adding it due to
         | not knowing what the community reaction will be like.
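
To make the "where does a conversation start or stop" problem concrete, here
is one naive heuristic: start a new conversation whenever the gap between
consecutive messages exceeds a threshold. This is an assumption for
illustration only, not how Discord or Answer Overflow segments channels; the
`sent_at` field, the function name, and the 30-minute default are all
hypothetical.

```python
from datetime import datetime, timedelta


def split_conversations(messages, max_gap=timedelta(minutes=30)):
    """Group messages into conversations separated by quiet periods.

    A message joins the current conversation if it was sent within
    `max_gap` of the previous message; otherwise it starts a new one.
    """
    conversations = []
    for msg in sorted(messages, key=lambda m: m["sent_at"]):
        if conversations and msg["sent_at"] - conversations[-1][-1]["sent_at"] <= max_gap:
            conversations[-1].append(msg)
        else:
            conversations.append([msg])
    return conversations
```

The heuristic breaks down quickly in a busy channel, where several unrelated
conversations interleave with no quiet period between them. Forum channels
avoid the problem because each thread is already an explicit unit.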
        
         | madeofpalk wrote:
         | Discord has 'indexing' and search, just like how Slack does.
         | It's just not on the public & open web - only searchable inside
         | of Discord.
        
       | easygenes wrote:
       | While I see the value here, I don't really think most Discord
       | communities are appropriate to be indexed. It breaks the whole
       | cozy web aspect of it. [1]
       | 
       | [1] https://maggieappleton.com/cozy-web
        
         | rhyssullivan1 wrote:
         | Most Discord communities aren't meant to be indexed, I agree!
         | Thanks for linking that article; it was interesting to read.
         | 
         | Lots of them have support channels, though, for programming
         | libraries, games, etc., and having all of that content locked
         | away can be really damaging.
         | 
         | One of the interesting things I've noticed is when a community
         | for a more niche game / programming library joins Answer
         | Overflow, they often shoot up to being top performers on the
         | site which is great to see.
         | 
         | Along with that, not all channels are indexed, mainly just help
         | channels. What's nice with this is it keeps that cozy feeling
         | of a private place to talk, while helping more people find a
         | community they will enjoy and keeping information accessible.
         | 
         | Long term, I'd like to implement forms of anti-abuse tools for
         | communities to use so they can understand what the types of
         | people who join their server from Answer Overflow are like. For
         | example, if it turns out that 90% of the people who join are
         | abusive, then it'd make sense for them to turn off indexing.
         | 
         | You could possibly make the argument that for the long term
         | health of some communities, having indexed content helps to
         | keep the community active.
        
         | TeMPOraL wrote:
         | The "cozy web" is out of control these days. A lot of social
         | utility is lost by default because everyone uses Whatsapp and
         | Discord and other such information black holes, places where
         | knowledge goes to die. It's OK if you're using these to chat
         | with your family or friends, but it's kind of... less OK, when
         | every open source project these days, including major
         | programming languages, tells you to join their Slack or Discord
         | for support and learning.
         | 
         | What's happening is that these "communities" demand that you
         | commit _first_, and deny providing value to passive
         | participants. If that sounds reasonable to some, let me point
         | out that the _entire value of the Internet_ is built on doing
         | the opposite. Wikipedia, Reddit, StackOverflow, everything that
         | you can find through a search engine - those are all resources
         | made available by people and groups that, for various reasons,
         | decided to _share_ knowledge instead of hoarding it, invite
         | passive participation instead of demanding active commitment.
         | The good days of the Internet, the ones people mourn, back
         | before it got fully commercialized? They were built on the
         | sentiment of openly sharing information, giving it away "pay
         | it forward" style, not gate-keeping it in webs of trust and/or
         | demanding that people pay with effort.
         | 
         | Maybe I'm too old, but I _hate_ the "cozy web" with a passion.
        
         | philippejara wrote:
         | Most discord communities that are big enough to get indexed
         | were supposed to be forums anyway, or part of one.
        
       ___________________________________________________________________
       (page generated 2023-06-18 23:00 UTC)