[HN Gopher] Show HN: VideoMentions - Search YouTube based on the spoken words in videos Author : kellenmace Score : 230 points Date : 2022-05-25 13:10 UTC (9 hours ago) (HTM) web link (videomentions.com) | alpb wrote: | FWIW Google already shows videos by their transcripts if the | video has automatic captions or CC enabled in the Google search | results. Try searching for a very specific phrase/sentence in | quotes and it's likely you'll get the video in Google Search | result page. | TuringNYC wrote: | Dear @kellenmace - This seems like a great service! I tried to | sign up and check out the site but didn't notice any contact info. | Could you contact me? Curious if you offer an API facade and/or | enterprise subscription plans? | kellenmace wrote: | Hey @TuringNYC! Thanks for reaching out. Yes, I'd be happy to | speak with you regarding API access and enterprise subscription | plans. You can reach out to me at kellenmace at gmail dot com. | Thanks! Talk to you soon. | HidyBush wrote: | Are we sure YouTube doesn't already do this? Often I have | searched for a video while only remembering a certain phrase or | even a certain comment left under it, and YouTube was able to | actually find it | kellenmace wrote: | Hey @HidyBush! No, YouTube does not reliably include spoken | word/transcript matches in search results. I've run tests where | I open up the transcript of a video using these steps | (https://kb.swtc.edu/page.php?id=90230), copy a few of the | words, then perform a search using those words across that | channel, and the video I copied them from doesn't appear on the | list of results. | | That's why VideoMentions Search exists. It provides a way to | search the videos within a YouTube channel to reliably find | spoken word matches (and it includes matches in the title & | description, too). 
| | Thanks for checking out my project! Please let me know if I can | answer any questions about it. | cphoover wrote: | They do... contents of speech are one of multiple input data to | youtube indexing. | | Also you can search individual videos by showing transcript and | pressing ctrl-f | kellenmace wrote: | I have come to the opposite conclusion - YouTube does not | reliably include spoken word/transcript matches in search | results. | | As I wrote in another comment, I've run tests where I open up | the transcript of a video using these steps | (https://kb.swtc.edu/page.php?id=90230), copy a few of the | words, then perform a search using those words across that | channel, and the video I copied the words from doesn't appear | on the list of results. | | If Google decides that it's going to revamp YouTube search to | include spoken word/transcription matches, that would make | VideoMentions Search irrelevant- and I'd be okay with that! | | In the meantime, I think this is a useful free tool for | quickly finding spoken word matches within specific channels. | | Thanks for checking it out! | tomatowurst wrote: | I didn't know this...how do you show transcript? | kellenmace wrote: | Hey @tomatowurst! You can search within the spoken | words/transcript of a single YouTube video by following | these steps: https://kb.swtc.edu/page.php?id=90230 | HAL9OOO wrote: | Really useful tool! I watch a lot of cycling content and people | do reviews of random products in the middle of videos, this helps | narrow it down! | prg318 wrote: | s/reviews/paid promotions/ | dmead wrote: | red letter media likes to talk about star trek | | https://videomentions.com/search?channelUrl=https%253A%252F%... | ss108 wrote: | Very cool, seems useful. | truly wrote: | Nice! Is it possible to do it without fixing the channel? | kellenmace wrote: | YouTube doesn't provide a way to search _all_ of YouTube based | on the spoken words in videos, unfortunately. 
| | I could update VideoMentions Search to allow users to select | multiple channels, and then perform the search across all of | those. Like maybe auto-importing all the channels they're | subscribed to could be useful. One way or another though, it | would still require selecting specific channels to search | within. For this first iteration, I just kept things simple | with a single channel URL input. Despite that limitation, I | still think it's a useful tool, though; I plan to use it often | myself. | | Thanks for checking it out! | [deleted] | [deleted] | beeskneecaps wrote: | This field was a blocker for me as well. If the channel field | was an async select that helped autocomplete lookup channel | urls by channel name, this would be way more convenient. | | My use case would be for ham radio. Lots of ham radio | YouTubers film their QSOs (conversations) and mention the | callsigns that they make contact with. I want to find the | channels that mention my callsign and I'm sure lots of other | hams would want to know. Anywho cool project. GL 73 | kellenmace wrote: | I agree! This is how the paid VideoMentions.com service | works- users can search for channels by name, without the | need to paste in URLs. This auto-complete lookup requires | spending a finite number of API calls, though, which is why | it's restricted to customers and not available on this | freely accessible VideoMentions Search page. Thanks for | checking it out! | truly wrote: | Perhaps you could have channel owners register their channel | if they want to be indexed. That would be super useful. | dankwizard wrote: | I'm assuming it's using the API to download captions and | scanning them, which is why it'd need the Channel. It would be | so hard to know where to begin without it! | | Potentially future updates could search a logged in user's | history? | kellenmace wrote: | Yep, you're exactly right. 
The tool works by getting the | channel's videos, then fetching the voice-to-text transcripts | for them, then searching within the spoken words (along with | the title and description) for any keyword matches. So there | isn't a feasible way to do that across _all_ of YouTube. | | I like your idea of searching the logged-in user's history! | That could be handy. Another thing I've thought about is | auto-importing all the channels they're subscribed to so they | can search within those. | | For the first version of VideoMentions Search, I kept things | simple with a single "Channel URL" input field, but using | their history/subscriptions is totally doable. I'll see if | there's enough demand for that. | | In any event, I still think that this simple first version | has utility. I like being able to quickly pinpoint all the | moments when a certain topic was mentioned across all of a | channel's videos. | | Thanks so much for taking a look, and for your feedback! | fps_doug wrote: | Do you keep the transcripts around on the server? It | shouldn't matter much in terms of storage unless the site | becomes crazy popular, so you could offer a "best effort | search" or something along the lines, that just searches | everything you got so far, so the site would get better and | better over time. | corobo wrote: | YouTube kills your API key if you do this and make the | data available (eg via API). | | You're allowed to cache responses for a bit but not store | them long term. "How would they know", etc of course, but | if you're distributing the data they'll figure it out. | Some smart cookies over there at Google. | | My small website managed to get on their radar and I | didn't even post it to HN! | lrae wrote: | No, they'd have to index all youtube videos then. | mikewhy wrote: | I've been wanting to make a supercut of Civvie 11 mentioning John | Carmack but have been putting it off. This just did half the work | for me, amazing! 
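A rough sketch of the search step kellenmace describes earlier in the thread (get a channel's videos, then look for the keyword in each title, description, and transcript). The `Video` and `Segment` types and the `find_mentions` name are made up for illustration; this is not VideoMentions' actual code, and it assumes the transcripts have already been fetched as timestamped segments.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One caption cue: where it starts (in seconds) and what was said."""
    start: float
    text: str


@dataclass
class Video:
    """Hypothetical container for the data fetched per video."""
    video_id: str
    title: str
    description: str
    segments: list = field(default_factory=list)


def find_mentions(videos, keyword):
    """Return (video_id, where, start_seconds) for every case-insensitive match.

    `where` is "title", "description", or "transcript"; start_seconds is
    None for title and description matches.
    """
    needle = keyword.casefold()
    hits = []
    for video in videos:
        if needle in video.title.casefold():
            hits.append((video.video_id, "title", None))
        if needle in video.description.casefold():
            hits.append((video.video_id, "description", None))
        for segment in video.segments:
            if needle in segment.text.casefold():
                hits.append((video.video_id, "transcript", segment.start))
    return hits
```

One limitation of this simple version: a phrase that straddles two caption segments would be missed, so a real tool would probably need to join neighbouring segments before matching.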
| f0e4c2f7 wrote: | This is pretty amazing! Thank you for making it. | | I'm not sure if this would be against Google's terms so you might | want to check: | | If you started indexing such that I could do a search and it | would come back with any indexed content I think you would have | invented a new search engine. Seems extremely useful. | | I already use YouTube this way today but as you pointed out | searching by title can be tricky. | | Search engines supported by ads are also typically quite | profitable. | happy_pancake wrote: | Nice! My friends and I had this idea for a hackathon and we won! | We skipped lectures and we needed a way to fast forward to | relevant parts of a lecture | baxuz wrote: | This is amazing. I actually used yt-dlp to download all subtitles | and search in them a few times already. | | Thanks a lot! | nop_slide wrote: | Nothing of note to say other than this is really cool! | wenbin wrote: | Great job! | | How do you do search result ranking? Any signals that you use to | decide "this is a high quality video from reputable channel and | it's relevant"? I would imagine # of followers / likes / comments | / like-comment ratio or the likes are used? | atleta wrote: | I wanted to try it but for some reason it doesn't accept Lex | Fridman's channel as a valid URL: | https://www.youtube.com/c/lexfridman | pks016 wrote: | I tried with some other channels. Looks like It's not accepting | all channels as valid urls. | dmead wrote: | would you add support for regexes? | | I would like to search for rich evans saying "aiiiiiiiiids" with | varying word length but now i can only search for "aids" | | https://videomentions.com/search?channelUrl=https%253A%252F%... | kellenmace wrote: | Hey HN! I just launched VideoMentions Search, a free tool that | allows you to search YouTube to find videos that contain specific | spoken words. 
| | Here's how to quickly try it out - | | Let's say you want to find every video on The Verge's YouTube | channel within the last several months in which "MacBook Pro" is | mentioned. Here's a pre-populated search to accomplish that: | | https://videomentions.com/search?channelUrl=https%253A%252F%... | | If you follow that ^ link, then click the button to perform the | search, you'll see all matching videos, with every single mention | of "MacBook Pro" highlighted. | | My favorite part is that all the highlighted matches are | timestamped. So you can click on any of them to jump to that | exact moment in the video when that keyword was said! | | VideoMentions Search is great for these scenarios: | | - You want to find all videos within a YouTube channel that | mention your brand, your product, or the topics you care about. | | - You remember watching a video where a certain topic was | discussed, and now you're trying to remember which video it was | to rewatch it. | | - You run your own YouTube channel and want to quickly find the | exact moments in your videos where you cover certain topics, so | you can link others to that content. | | I have wanted a way to search YouTube based on spoken words for a | while. I couldn't find a tool that provides that capability | though, so I built it! | | I hope you find VideoMentions Search useful! I'd love to hear any | feedback you have on it. Please let me know! | loxias wrote: | This looks super cool! There's a need for tools to make the | millions of videos on the web actually useful for people like | me who would much rather skim search and read than watch. Bravo | to you for making a stab at it! | | I'm guessing you need a specific channel as a way of getting a | list of videos, then pulling the captioning, then searching? | (That's the only reason I could think of for the channel | restriction, which unfortunately removes 90% of the utility) | kellenmace wrote: | Hey @loxias! 
Yes, you're exactly right - VideoMentions | scrapes the video page markup and pulls out the "baseUrl" for | the English caption track. It converts that XML caption track | into JSON, then searches it for keyword matches. | | YouTube doesn't provide a way to search _all_ of YouTube | based on spoken words, unfortunately. Still, I think this is | a useful tool for quickly locating spoken word matches within | a certain channel. | | Some Hacker News commenters have suggested that I could auto- | import all the channels the user is subscribed to, or get | videos from their watch history, which are interesting ideas. | For this first iteration, I kept things simple with a single | "Channel URL" input. | | Thanks for checking out my project, and for your feedback! | loxias wrote: | > Still, I think this is a useful tool for quickly locating | spoken word matches within a certain channel. | | Absolutely, and I agree, however,.... it necessitates that | the user use or understand "channels". When I went to your | site my first thought was "what the hell's a channel?" ;-) | | But yeah, for whatever percentage of youtube users use | channels, I see this having a lot of use. My totally | unsolicited advice is that you'll have to find _some_ way | to remove the channel requirement. Perhaps suggesting | channels based on the search terms. (I 've seen your other | comments about adding features to a paid version) | | I haven't yet managed to get VideoMentions to actually | _work_ , but that's fine, I assume you're getting hugged to | death. :D Shipping _anything_ is hard, and congrats for | that. 
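The caption step kellenmace describes above (an XML caption track converted to JSON, then searched) could look roughly like the sketch below. It assumes the track is shaped like `<transcript><text start="..." dur="...">...</text></transcript>`; that layout, and the function names, are assumptions for illustration rather than anything confirmed in the thread. Scraping the "baseUrl" from the page markup is not shown.

```python
import html
import json
import xml.etree.ElementTree as ET


def captions_xml_to_json(xml_text: str) -> str:
    """Convert an XML caption track into a JSON array of cues.

    Assumes each cue is a <text start="..." dur="..."> element.
    """
    root = ET.fromstring(xml_text)
    cues = []
    for node in root.iter("text"):
        cues.append({
            "start": float(node.get("start", "0")),
            "duration": float(node.get("dur", "0")),
            # Caption text may still carry HTML entities after XML parsing.
            "text": html.unescape(node.text or "").replace("\n", " "),
        })
    return json.dumps(cues)


def timestamp_url(video_id: str, start: float) -> str:
    """Build a watch link that jumps to the cue, using the &t=<seconds>s form."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(start)}s"
```

Each cue's `start` value is what would make a timestamped link possible: a match found in a cue can be turned into a link that opens the video at that moment.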
| kellenmace wrote: | "I assume you're getting hugged to death" -- lol | | Yeah, maybe I'll consider having two versions of search - | one version for paid customers that allows you to search | for YouTube channels by name/keyword and allows you to | search across all the channels you're subscribed to or | across all the videos in your watch history, and a second | version that's free and similar to the current iteration. | | VideoMentions Search should work just fine for you - | there is no server and no database behind it. When | searches are performed, serverless functions are called | that source data from YouTube, then return a response to | the client. So it should scale just fine. | | Can you please visit this link, click the button to | perform a search and let me know if you see results pop | up? https://videomentions.com/search?channelUrl=https%253 | A%252F%... | | I'd love to know if you don't, or if you see any errors | in the browser console. Thanks again for your thoughts on | it! | loxias wrote: | (firstly, that search works fine and returns very | quickly! Now that I know it's client side, I've been able | to do a few searches of my own to see how it works. It | works well! The ones I did earlier were looking at a | large channel "PBS Newshour" for various news topics over | "All time") | | (secondly, DO consider putting your email address on your | HN profile!) | | Whoa! Client side!?! Far out man, far out. To be honest, | I was a little unimpressed before -- but didn't feel like | "your idea is _brilliant_, but implementation leaves a | lot to be desired. makes me want to try my hand at | writing my own." would be a constructive comment. ;-) | | But, knowing this is all done client side?? I tip my hat, | that's *clever*! The whole site could be a static and | local set of files? I had no idea things like this could | even be done without a server or other real program to do | the work. (though now that I think about it, I have an | idea how, duh. 
YouTube exposes a JS API I bet, so you can | have the client call it each time, do the searching work, | &c.) | | That avoids scalability issues and probably legal ones as | well! What a game changer... | | Was there an existing framework used for building client | side web apps like this, or did you roll your own? | | While I've got you, and I admit that graphical UIs are | ... an area where my opinion should carry no weight (I'm | the sort who.. doesn't use icons, and when writing | personal tools, the "user interface" is positional | parameters to a command line program ;-)), here are some | suggestions: | | * The UI, while clean and unbusy (yay!!), feels BLOATED | with white space. Why do the results occupy this tiny | narrow column, forcing me to scroll way more than I | should need to? | | * In addition to the narrow column, the entries IN the | column take up too much space, the UI could be arranged | to be much more efficient. | | * Since search results are within a channel, don't put | the channel name after each entry, there's no point. | | * Might be nice to have the time offset visible to the | left of the text excerpt. I can see it on mouseover, but | still. Might be nice to see. | | * Can the videos be made to play inline, without | redirecting to youtube, then navigating back, and redoing | the search? | | * Perhaps, after searching for a result, we could see a | graph at the top showing all the videos containing that | term over time, and easily click on it to go directly | there. | corobo wrote: | Very cool! Well done! | | I've not got a legit use for it right now but the few example | searches I thought up returned the videos I was thinking of | | Nice work! | kellenmace wrote: | Cool! Thank you! | | Here are some example use-cases that I included in another | comment, in case any of them are relevant to you: | | 1. You want to find all videos within a YouTube channel that | mention your brand, your product, or the topics you care | about. | | 2. 
You remember watching a video where a certain topic was | discussed, and now you're trying to remember which video it | was to rewatch it. | | 3. You run your own YouTube channel and want to quickly find | the exact moments in your videos where you cover certain | topics, so you can link others to that content. | | Maybe #2 happens occasionally? If so, maybe you can bookmark | VideoMentions Search for just such an occasion :) | | In any event, thanks for checking out my project, and for the | kind words! | CopOnTheRun wrote: | How are you sourcing the text for the videos? This search [1] | grabs some results for my query, but it does miss this [2] | video which contains the searched keyword multiple times, and | the video's subtitles indicates as much. | | [1] | https://videomentions.com/search?channelUrl=https%253A%252F%... | | [2] https://www.youtube.com/watch?v=3denP7wX2XU&t=296s | kellenmace wrote: | VideoMentions scrapes the video page markup and pulls out the | "baseUrl" for the English caption track. It converts that XML | caption track into JSON, then searches it for keyword | matches. You're right that this particular search for "toxic" | should find several spoken word matches, but it doesn't. It | seems like the tool isn't able to access the captions data | for that video for some reason. I made a note of this bug, | and I'll look into fixing it. Thanks for pointing it out, and | for checking out VideoMentions Search! | tcmb wrote: | yt-dlp [1] has command-line options to download only the | captions of a video, in available languages, if you want to | skip the scraping for the link. | | I built something similar [2] for a slightly different use | case. I wanted to be able to search through all Ram Dass | talks in the 'Here and Now' podcast series on YT. I'm | obviously not as skilled at CSS. :) And the display of | timestamps is still a bit shaky, but for me it fulfills its | purpose. 
| | Since I'm able to preload all caption files ahead of time, | I'm just using pcregrep for the search which does a pretty | good job. | | [1] https://github.com/yt-dlp/yt-dlp [2] https://ramdass- | search.net | rodnim wrote: | > - You remember watching a video where a certain topic was | discussed, and now you're trying to remember which video it was | to rewatch it. | | But I need to remember which channel I watched? Maybe I'm | missing something, but in my eyes it would make it tremendously | more useful if I didn't have to specify channel. | Nowado wrote: | I'm sure it would be, but then you're indexing whole youtube | with respect to words spoken in each video. A thing that | Google, arguably the best organization when it comes to | indexing stuff, is working on. | kellenmace wrote: | Yes, my thoughts exactly. If Google decides that it's going | to revamp YouTube search to include spoken | word/transcription matches, that would make VideoMentions | Search irrelevant- and I'd be okay with that! | | In the meantime, I think this is a useful free tool for | quickly finding spoken word matches within specific | channels. | | Thanks for checking it out! | kellenmace wrote: | Yeah, YouTube doesn't provide a way to search _all_ of | YouTube based on the spoken words in videos. | | I could update VideoMentions Search to allow users to select | multiple channels, and then perform the search across all of | those (maybe importing all the channels they're subscribed to | could be handy... ). One way or another though, it would | still require selecting specific channels to search within. | That limitation notwithstanding, I still think it's a useful | tool, though! | | Thanks for checking it out! | hanniabu wrote: | Do you cache the channel, search term, and results for | faster more efficient responses later? | kellenmace wrote: | Hey @hanniabu! I do client-side in-memory caching of | videos data, only. No server-side caching. 
In fact, there | is no database involved at all- the client-side app calls | serverless function API endpoints to fetch the YouTube | channel and video data it needs. Here are the tricks I'm | using to make it fast: | | - As soon as the "Channel URL" field loses focus, I start | fetching the most recent 30 videos on that channel in the | background. This way, by the time the user enters the | keyword and date range, I've already fetched some (maybe | even all!) of the data ahead of time, which means less | wait time for them. | | - Once a specific video's data (title, description, | transcript, etc.) has been fetched once, it is saved in | memory. All other searches the user performs from that | point on will pull the video data from the in-memory | cache, if it's there. Otherwise, it will fall back to | fetching the video data over the network. This in-memory | caching makes subsequent searches within the same date | range (or a shorter date range) take <1 second. | | - Network requests to fetch video data are processed | concurrently rather than one at a time. So the browser | fires off as many as it can in parallel to get them all | resolved as quickly as possible. | | - As soon as any matches are found, the UI updates to | show the user. This way, the user can start scrolling | through the matches and reviewing them while the search | is still in progress- they don't have to wait until it | finishes to start interacting with the matches. | | Thanks for checking out my project! | alex_smart wrote: | >I do client-side in-memory caching of videos data, only. | | This works only if the user does not navigate from and | back to your website or refresh the page, but if they do, | you make the same api calls all over again. You should | set HTTP Cache-Control headers in your response from the | server, so that the browser knows that it can serve that | data from its cache and does not need to make those | requests again. 
You would then probably not need the | client-side in-memory cache at all. | bredren wrote: | I get why you wouldn't be able to index all of YouTube, | that's a big ask. | | However, I don't use YouTube enough to mess w channels | much. I'm usually searching on a particular topic. | | For example, "drop ceiling panel replacement." | | Perhaps you could help users limit the channel scope by | making an intelligent channel selection by keyword. | | So I would put in "home improvement," and you could choose | some appropriate channels to search for my search terms. | kellenmace wrote: | Hey @bredren! Yeah, I agree that this makes for a nice | user experience. This is how the paid VideoMentions.com | service works- users can search for channels by name or | keywords, without the need to paste in URLs. It looks | like this: https://cloudup.com/cUgKqErcx8G | | That auto-complete lookup requires spending a finite | number of API calls, though, which is why it's restricted | to customers and not available on this freely accessible | VideoMentions Search page. | | Thanks for checking it out, and for your feedback! | bredren wrote: | Ah! Okay. Makes sense. Thanks for the reply, maybe I'll | try out the pro version. | mistermann wrote: | Could you do a search based on their watch history if they | have it enabled? | hanniabu wrote: | That'd certainly be useful | kellenmace wrote: | This is a cool idea! It wouldn't apply to people who want | to search all the videos on a given channel for specific | keywords (including those they haven't watched). I can | see it being useful for folks trying to locate a specific | video they remember watching in the past, though. | | One consideration is that getting the user's watch | history would likely require calls to the YouTube API. So | that means I would have to make this a paid service in | order to offset the cost of those API requests. 
The | beautiful thing about the current iteration is that it | doesn't rely on YouTube's API at all. By scraping YouTube | pages and leveraging a few NPM packages, I'm currently | able to offer free and unrestricted access to it. | | If enough people request that ability though, I'll | consider incorporating it. | | Thanks for checking out my project and for the great | idea! | joshvm wrote: | They don't offer direct search, but isn't this what "key | moments" is in search results? Try eg _how to change a | lightbulb_. | | I believe SeekToAction works even if the uploader didn't | put chapters in. This was a relatively recent update, to | make it fully automatic. So it's presumably doing some | audio/video analysis to figure it out. All you need to do | is tell Google how to seek your video (so it also works | with non youtube videos too). | | https://developers.google.com/search/blog/2021/07/new-way- | ke... | kellenmace wrote: | Oh, cool! I hadn't heard of the Key Moments/SeekToAction | feature before. I'll have to dig in and explore that a | bit. Thanks for the tip! | jobigoud wrote: | I tried on this channel which is in Spanish [1] and it returned | very fast with no results. | | Then tried on GeoWizard channel (English) and it worked great. | | There is a somewhat related tool called Youglish that works on | subtitles, it's great for checking the pronunciation or usage | of a word or expression in many languages, but it's based on a | curated list of channels with known good subtitles. I thought | yours could be a great complement to this as it works directly | on the audio. | | [1] https://www.youtube.com/c/Tercosmicqueen | kellenmace wrote: | Hey @jobigoud! Yeah, currently VideoMentions only uses the | auto-generated English video transcriptions when searching | for matches. If there was enough demand, it could be expanded | to support other languages. Doing so would add complexity and | would make searches take much longer, though. 
I'd probably | need to add more UI fields to allow users to select the | languages to target. Could be done, though! | | Yes, youglish.com is a neat tool! I stumbled upon it when | looking around for a way to search YouTube videos based on | spoken words. | | Thanks for checking it out! :) | [deleted] | barefeg wrote: | Are the video transcripts searchable? | kellenmace wrote: | Yes, this is how VideoMentions Search works. It scrapes the | video page markup and pulls out the "baseUrl" for the English | caption track. It converts that XML caption track into JSON, | then searches it for keyword matches. Is that what you're | asking about? | | If you want to search within the transcript of a single | video, you can accomplish that with these steps: | https://kb.swtc.edu/page.php?id=90230 | barefeg wrote: | Ah got it. I thought there would be some API where the | transcripts could be searched across videos. Maybe that | requires way too many resources for Google to index | kellenmace wrote: | Yeah, Google is of course the king of search, so they | could certainly decide to revamp YouTube search to | include spoken word/transcription matches. They have all | the data required to make that happen. That would make | VideoMentions Search irrelevant- and I'd be okay with | that! | | In the meantime, I think this is a useful tool for | quickly locating videos based on spoken words. | | Thanks for checking out my project! | atentaten wrote: | This is a nice tool. I wish one didn't have to specify the | channel, as some comments mentioned. If it's not possible to get | the channel when providing the video url via an API, it is | possible to get it from the video url's GET response data. The | latter may be a little slower, but might be worth it in terms of | UX. | filoeleven wrote: | This is amazing. | | I just tested it against a smallish British channel with a video | that I wanted to see again, and couldn't remember which one of | the 50-odd videos it was. 
It did not catch the full quote | directly, because YT read it as "Marvin nature" instead of | "Mother Nature," a consequence of Alfie's accent. But my search | for "Mother Nature" picked up another reference close enough to | it to show the text. Sort of unlucky with the miss and very lucky | with the proximity to the hit. | | I drew a blank regarding the other channels whose videos I want | to revisit, but I know I will think of them later because there | are so many. This is extremely useful. I've already bookmarked | the site. | ojiwan wrote: | Pretty cool. I could see this as a useful automated tool for | companies to monitor their brand, or competitors. Maybe a weekly | report of channels that have mentioned specific keywords. | anoncow wrote: | How is this different from what Google currently does for | in-video text search? | 1970-01-01 wrote: | I wanted this just yesterday! I was searching several channels to | recall how many bolts are holding the battery for the electric | F150. Sure enough, Top Gear told me how many bolts the F150 used | for its battery pack! | | Result: | | SPOKEN WORDS "...kept things simple it recycled the same basic | chassis and attach the battery which can be removed with just | eight bolts from underneath and used carryover components | wherever possible in fact..." | [deleted] | kumarski wrote: | Is this based on acoustic keyword recognition or SRT search? | wolongong942 wrote: | Very interesting tool could see myself using it a lot. | | Couldn't help but search the guy that uploaded 2 million vids to | yt, sorry if it breaks anything. | | It would be cool to search for a specific quote like "buddy of | mine" (frequently said by Joe Rogan). Also searching for parts of | a word (similar to regex options?) 
might be difficult to | implement but would be super useful as an option i.e. if I | searched for any word that starts with yc with a correct pattern | (or given option on the UI) I could find results for both "yc" | and "ycombinator". | kellenmace wrote: | Hey @wolongong942! Nice- I'm glad you're finding it useful! | | VideoMentions Search is built entirely using serverless | technologies, so it should scale really well. There's no single | server or database to act as a bottleneck. When you perform a | search, your browser fires off a number of network requests to | serverless function API endpoints that fetch the data from | YouTube, then return a response. So if you do an "All time" | search on a channel with 2 million videos, your laptop fan may | kick on while your browser works hard firing off thousands and | thousands of requests until it's fetched all 2 million videos, | you run out of memory, or you manually hit the "Cancel" button | - whichever comes first :) | | Your regex idea is interesting, and I'll consider implementing | some more complex rules like that if enough people request it. | For now, my answer would be to either perform separate searches | (like one for "yc" and another for "ycombinator"), or perform | one search, then use ctrl/cmd+f to search within the matches | displayed on the page. | | Thanks for checking out my project, and for the great feedback! | messe wrote: | What're the (approximate) costs for monthly usage? (Or if you | don't know yet: what're you budgeting for?) | PaulKeeble wrote: | One of the consequences of the effectiveness of click-bait titles | on youtube is that searching for videos you watched historically | is extremely difficult, because quite often the primary thing in | the video is never mentioned in the title. 
The descriptions are | almost universally used to store common links that are about the | channel not the video and the consequence of this is that click- | bait titled content is very hard to find after the fact. The | other problem is quite a lot of channels are testing one title | and then changing to another, the video won't then be found using | the initial title and so even if you do remember elements of the | title the channel may have made that impossible to use some hours | or days later as another title with better performance was used. | | I can see a need for search in the contents of the videos but I | don't want to specify the channel, I likely don't know especially | since channels can also change their name. | Akronymus wrote: | Also, the youtube built in search often fails to match videos | that have a word in the title that you search for. | Minor49er wrote: | I can't even find a specific channel when I spell it | correctly, even with double quotes. I have to go into the | filters and specify channels-only, and even then it puts a | bunch of wrong results in front of it. Their search on | YouTube used to be better, showing results for what was | actually searched for. But now it seems like they're trying | to drive views to high-performing videos since they now give | a single page of results for any query before backfilling the | results with a bunch of suggestions that have nothing to do | with the search | Akronymus wrote: | Up until a few days ago I had a redirect for search that | disabled the suggestions on the search page. Now, if I do | that, the search only gives 1, irrelevant, result. | nonameiguess wrote: | You should be able to just go directly to the channel using | the URL if you know the channel name. | youtube.com/c/<CHANNEL_NAME>. What on earth they use the | distinction for, I don't know, but if it's a user with | videos instead of a channel, then | youtube.com/user/<USER_NAME>.
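
The two URL shapes mentioned in the comment above can be built with a few lines of Python. This is only a sketch: the function name `channel_url`, the `legacy_user` flag, and the `https://www.` prefix are assumptions made for illustration, and nothing here checks that the resulting page exists.

```python
from urllib.parse import quote


def channel_url(name: str, legacy_user: bool = False) -> str:
    """Build a direct YouTube URL from a channel or user name.

    Hypothetical helper: uses the /c/<CHANNEL_NAME> form by default and
    the /user/<USER_NAME> form when legacy_user is True.
    """
    kind = "user" if legacy_user else "c"
    # Percent-encode the name so spaces and slashes cannot break the path.
    return f"https://www.youtube.com/{kind}/{quote(name, safe='')}"
```

Calling `channel_url("SomeChannel")` gives the `/c/` form, and passing `legacy_user=True` switches to the `/user/` form.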
| andai wrote: | YouTube search is almost completely useless, I just use | Google to find YouTube videos. It does seem to search in the | video transcript and the comments (I'm not sure but it seems | that way). | hbn wrote: | YouTube's search results will show you like 10 results of | the thing you actually searched for, then resort to a | section of completely random video suggestions unrelated to | your search in hopes some thumbnail will draw your | attention and suck you back in to wasting your time | watching videos and being served ads. | lostinroutine wrote: | I also suffered from this issue for a while but then | discovered "Unhook" browser extension, which in addition | to many other amazing features (e.g. hide recommended | videos/comments) has a feature to hide irrelevant search | results. With the feature on I get an endless list of | matching results and no junk. | PaulKeeble wrote: | Youtube is clearly blacklisting certain channels and they | can't be found via even precise searches. | sph wrote: | Unreasonable of you to expect search to be good in a Google | product. | htrp wrote: | people also edit the titles of videos from time to time... so | even the exact title doesn't help | [deleted] | corobo wrote: | Aye especially in modern day YouTube | | Once you're big enough your stats start being useful, folks | in the know (mrbeast's analytics guy, I can't think of more | rn, etc) advise you rotate out the thumb/titles until the | click through rate is decent | lewisjoe wrote: | Hi, Great job. I was almost about to build such a thing before I | realized the crazy amount of crawling and transcriptions that I | have to index. | | How exactly did you solve/approach the problem? | | 1. How did you crawl across those millions of videos from the | platform? | | 2. How are you indexing stuff like that | jonplackett wrote: | Bump. Would like the lowdown too! | SteveDR wrote: | I built something similar to this when I was first learning to | program. 
Except mine lets you perform a one-time search for | videos containing keywords, similar to how you would normally | search YouTube. So there are no notifications or anything. | | https://phrasefinder.net | | I don't know if it even works anymore and I'm sure the code is | atrocious. But I remember that I would just scrape YouTube | pages for video IDs, and then use an API that returned video | captions for a given ID [1]. I could see how OP would do | something similar. | | [1] https://pypi.org/project/youtube-transcript-api/ | alex_smart wrote: | Based purely on the speed of the results, I believe that the | crawling is happening in real time. | | The search is scoped by channel, so the closed-caption files | for all the videos in the channel are downloaded and searched | on the fly. | | Edit: Wow, thanks to dev tools, I can see that the website is | downloading the transcript and metadata for all the videos from | the channel to the client. So the search is happening client- | side!! | galori wrote: | It's only for a single channel. That's the answer, it queries in | real time. | galori wrote: | This is neat, but I would like it to search against the entire | archive of all youtube videos. I know you can't do that...but | perhaps a Google employee making use of Google's Big Data | offerings. ___________________________________________________________________ (page generated 2022-05-25 23:01 UTC)
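
The on-the-fly, per-channel search discussed in the thread (download every transcript for a channel, then scan it for the query) can be sketched in Python as below. This is not the site's actual code. It assumes the transcripts have already been fetched into a dict mapping a video ID to a list of caption segments, each a dict with "text" and "start" keys; that shape, and the names `Match`, `build_pattern`, `search_channel` and `SAMPLE`, are hypothetical. The `prefix` option illustrates the "yc" / "ycombinator" idea raised earlier in the thread.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    video_id: str
    start: float  # seconds into the video where the caption segment begins
    text: str     # the caption segment that contains the hit


def build_pattern(query: str, prefix: bool = False) -> re.Pattern:
    """Compile a case-insensitive whole-word pattern for the query.

    With prefix=True the final word may continue with more word
    characters, so "yc" also matches "ycombinator".
    """
    tail = r"\w*" if prefix else ""
    return re.compile(r"\b" + re.escape(query) + tail + r"\b", re.IGNORECASE)


def search_channel(transcripts, query, prefix=False):
    """Scan every caption segment of every video and collect the hits."""
    pattern = build_pattern(query, prefix)
    hits = []
    for video_id, segments in transcripts.items():
        for seg in segments:
            if pattern.search(seg["text"]):
                hits.append(Match(video_id, seg["start"], seg["text"]))
    return hits


# Made-up sample data in the assumed shape, for experimentation only.
SAMPLE = {
    "vid1": [
        {"start": 0.0, "text": "I applied to YC last year"},
        {"start": 4.2, "text": "ycombinator funded us"},
    ],
    "vid2": [
        {"start": 1.0, "text": "a buddy of mine said"},
    ],
}
```

For example, `search_channel(SAMPLE, "yc", prefix=True)` returns both segments from `vid1`, while the default exact-word search returns only the first. One limitation of this sketch: a phrase that is split across two caption segments will not be found, because each segment is matched on its own.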