[HN Gopher] Show HN: Answer Overflow - Indexing Discord content...
___________________________________________________________________

Show HN: Answer Overflow - Indexing Discord content into the web

Hi! I'm Rhys, I develop Answer Overflow, a search engine for Discord
channels. Answer Overflow indexes content from channels into Google,
making them discoverable on the web.

I'm sharing this again after seeing a lot of discussion during the
Reddit blackout about the inaccessibility of information sent in
Discord servers.

Answer Overflow is a verified bot in over 100 communities, fully
complies with the Discord ToS, and is open source!
https://github.com/AnswerOverflow/AnswerOverflow

Check out some of the communities here!

T3 Community - https://www.answeroverflow.com/c/966627436387266600
C# - https://www.answeroverflow.com/c/143867839282020352
Reactiflux - https://www.answeroverflow.com/c/143867839282020352
All - https://www.answeroverflow.com/browse

Please let me know what feedback you have, thanks for checking it out!

Author : rhyssullivan1
Score  : 80 points
Date   : 2023-06-18 19:50 UTC (3 hours ago)

(HTM) web link (www.answeroverflow.com)
(TXT) w3m dump (www.answeroverflow.com)

| berkle4455 wrote:
| I'm sure Discord and their communities are absolutely ecstatic
| about opening up the doors to openAI and others to scrape their
| collective work for the latest LLM.
|
| Walled gardens are going to get a whole lot stricter.

| mdaniel wrote:
| Welcome back. How does this compare to Linen
| (https://github.com/linen-dev/linen.dev#readme), which claims to
| support Slack and Discord? I do see the license difference, but
| didn't know if that was the major differentiator

| rhyssullivan1 wrote:
| Couple key differences:
|
| - Answer Overflow works on a consent basis for displaying
| messages (https://docs.answeroverflow.com/user-settings/displaying-mes...),
| while Linen does all the messages in a community.
| The consent system Answer Overflow has helps a lot with respecting
| user privacy while also getting content indexed.
|
| - Linen appears to be building out a competitor to Slack &
| Discord while Answer Overflow is focused on building on top of
| those platforms, so we've got very different roadmaps. From
| what I can gather from the Linen roadmap, they're implementing
| things like voice chat, private channels, etc. Whereas with
| Answer Overflow some of the things I'm focused on are answer
| automation, tracking outdated answers, analytics for where to
| improve your docs etc
|
| - Answer Overflow is pretty much only focused on Discord
| servers, it wouldn't be too hard to support both Slack and
| Discord but what's nice about focusing on Discord for now is it
| helps with our goal of being the best indexing tool
| specifically for Discord
|
| - Global search (https://www.answeroverflow.com/search), you
| can search all Answer Overflow communities at the same time
|
| The team at Linen have built out a great product though and
| it's cool watching them succeed with it!

| mid-kid wrote:
| I was talking about needing a solution like this just a second
| ago. Down from the heavens, descends this. I'll be sure to give
| it a try!

| rhyssullivan1 wrote:
| Send me a message if you have any questions! Happy to help with
| getting it set up

| mid-kid wrote:
| This might sound a little bit picky, but from a cursory look
| around the project, it feels a bit too corporate and
| platform-ey for my tastes. I'm only interested in two things:
| generating (ideally static, and seo-friendly) web pages out
| of a discord forum channel and selfhosting it so we can
| archive the data ourselves (and won't be bound to content
| policies of answeroverflow.com).
| All of the extra bells and whistles with the bot auto-managing
| channels, analytics, AI and whatever else seem superfluous and
| make me sweat a little, as I'll have to comb through the
| documentation to make sure everything is set up correctly. It's
| also really a shame to read that selfhosting will be a "Pro"
| feature. I'll give props for considering users wanting to
| opt-out, however, and it does at least seem rather simple to
| set up.

| rhyssullivan1 wrote:
| Where did you see self hosting is a pro feature? My bad if
| the website gives that impression; it will be free, the
| whole codebase is MIT licensed.
|
| For all the extra bells and whistles, it's mainly for
| people who are doing community support at scale who need it
| which would be paid customers - I do sort of need a way to
| support myself so I can buy groceries. The core of the
| product that matters is free and working well for indexing
| content so now the focus is "what else can we do to improve
| community support as a whole?"
|
| As for self hosting, if you submit a PR for supporting it
| I'd be happy to get that merged but it's not really a
| priority at the moment. The codebase is set up to be pretty
| easy to make a self hosted version though.

| mid-kid wrote:
| Haha, that's fair. I'll consider trying to set it up
| myself and see how it goes.
|
| I got the idea that it was a pro feature out of the
| roadmap list on the website, where it's listed as "coming
| soon", and "pro" is only mentioned when you click on the
| waitlist join link. If it means custom domains, it might
| be better off being listed as "custom domains" or
| something similar. That's how it's called on google apps
| and such. It also doesn't help that the roadmap on the
| website doesn't match the one on the github page, I
| thought the roadmap features on the github page might be
| pro features as well.

| rhyssullivan1 wrote:
| Ah I see how that's confusing, sorry about that!
| I'll update it in both places to make that clearer

| tudorw wrote:
| nice, there is a lot of good stuff on discord!

| apignotti wrote:
| Genuine question: I love Discord, but how on earth is it possible
| that such functionality was not built-in to begin with?
|
| I really don't understand how the need for indexing and search
| was overlooked.

| Kiro wrote:
| It makes no sense to index the vast majority of content. You
| would need to cherry pick really hard among all the noise to
| find the stuff worth putting online.

| jasonjmcghee wrote:
| Interesting comment. I would think Reddit is similar in terms
| of content, yet "site:reddit.com <query>" is common as a
| general search pattern (pre-blackout)

| michaelmior wrote:
| I would argue it makes no sense to index the vast majority of
| content _without good search_. If your search is good enough,
| you can index everything and then surface only the good stuff
| at query time.

| thunky wrote:
| What I wonder is why anyone that cares about archiving/search
| would choose to use Discord?

| esafak wrote:
| It's not made for knowledge discovery; it's for gamers. Just
| look at that busy UI! The content is assumed to have no
| historical value.

| thrashh wrote:
| Discord is a chatroom first. What non-enterprise chat comes
| with archives?
|
| A forum is totally different.
|
| And even then, forums weren't designed to be archived from the
| start. People just wrote web crawlers and search engines.
|
| (I know Discord has some forum-like functionality now but the
| point stands.)

| rhyssullivan1 wrote:
| I think it's due to how Discord evolved as a platform
|
| Discord started as "your private place for your friends to talk"
| during a time where there were a lot of privacy issues with
| other communication methods.
|
| Then as it grew beyond this scope of being a private place for
| friends, it would have been good for indexing to be added but
| indexing a normal text channel is really hard since you don't
| know where the conversation starts / stops to submit to a
| sitemap.
|
| Now we've got large public communities and forum channels so
| it's possible they roll out their own version soon, but it does
| still slightly go against how their product was originally
| created so there may be some hesitation with adding it due to
| not knowing what the community reaction will be like.

| madeofpalk wrote:
| Discord has 'indexing' and search, just like how Slack does.
| It's just not on the public & open web - only searchable inside
| of Discord.

| easygenes wrote:
| While I see the value here, I don't really think most Discord
| communities are appropriate to be indexed. It breaks the whole
| cozy web aspect of it. [1]
|
| [1] https://maggieappleton.com/cozy-web

| rhyssullivan1 wrote:
| Most Discord communities aren't meant to be indexed I agree!
| Thanks for linking that article it was interesting to read
|
| There's lots that have support channels though for programming
| libraries, for games, etc and having all of that content locked
| away can be really damaging.
|
| One of the interesting things I've noticed is when a community
| for a more niche game / programming library joins Answer
| Overflow, they often shoot up to being top performers on the
| site which is great to see.
|
| Along with that, not all channels are indexed, mainly just help
| channels. What's nice with this is it keeps that cozy feeling
| of a private place to talk, while helping more people find a
| community they will enjoy and keeping information accessible.
|
| Long term, I'd like to implement forms of anti-abuse tools for
| communities to use so they can understand what the types of
| people who join their server from Answer Overflow are like.
| For example, if it turns out that 90% of the people who join are
| abusive, then it'd make sense for them to turn off indexing.
|
| You could possibly make the argument that for the long term
| health of some communities, having indexed content helps to
| keep the community active

| TeMPOraL wrote:
| The "cozy web" is out of control these days. A lot of social
| utility is lost by default because everyone uses Whatsapp and
| Discord and other such information black holes, places where
| knowledge goes to die. It's OK if you're using these to chat
| with your family or friends, but it's kind of... less OK, when
| every open source project these days, including major
| programming languages, tells you to join their Slack or Discord
| for support and learning.
|
| What's happening is that these "communities" demand you to
| commit _first_, and deny providing value to passive
| participants. If that sounds reasonable to some, let me point
| out that the _entire value of the Internet_ is built on doing
| the opposite. Wikipedia, Reddit, StackOverflow, everything that
| you can find through a search engine - those are all resources
| made available by people and groups that, for various reasons,
| decided to _share_ knowledge instead of hoarding it, invite
| passive participation instead of demanding active commitment.
| The good days of the Internet, the ones people mourn, back
| before it got fully commercialized? They were built on the
| sentiment of openly sharing information, giving them "pay it
| forward" style - not gate-keeping them in webs of trust, and/or
| demanding people to pay with effort.
|
| Maybe I'm too old, but I _hate_ the "cozy web" with passion.

| philippejara wrote:
| Most discord communities that are big enough to get indexed
| were supposed to be forums anyway, or part of one.
___________________________________________________________________
(page generated 2023-06-18 23:00 UTC)