[HN Gopher] Show HN: VideoMentions - Search YouTube based on the...
       ___________________________________________________________________
        
       Show HN: VideoMentions - Search YouTube based on the spoken words
       in videos
        
       Author : kellenmace
       Score  : 230 points
       Date   : 2022-05-25 13:10 UTC (9 hours ago)
        
 (HTM) web link (videomentions.com)
 (TXT) w3m dump (videomentions.com)
        
       | alpb wrote:
       | FWIW Google Search already surfaces videos by their transcripts
       | if the video has automatic captions or CC enabled. Try searching
       | for a very specific phrase/sentence in quotes and you'll likely
       | get the video on the Google Search results page.
        
       | TuringNYC wrote:
       | Dear @kellenmace - This seems like a great service! I tried to
       | sign up and check out the site but didn't notice any contact info.
       | Could you contact me? Curious if you offer an API facade and/or
       | enterprise subscription plans?
        
         | kellenmace wrote:
         | Hey @TuringNYC! Thanks for reaching out. Yes, I'd be happy to
         | speak with you regarding API access and enterprise subscription
         | plans. You can reach out to me at kellenmace at gmail dot com.
         | Thanks! Talk to you soon.
        
       | HidyBush wrote:
       | Are we sure YouTube doesn't already do this? Often I have
       | searched for a video while only remembering a certain phrase or
       | even a certain comment left under it, and YouTube was able to
       | actually find it
        
         | kellenmace wrote:
         | Hey @HidyBush! No, YouTube does not reliably include spoken
         | word/transcript matches in search results. I've run tests where
         | I open up the transcript of a video using these steps
         | (https://kb.swtc.edu/page.php?id=90230), copy a few of the
         | words, then perform a search using those words across that
         | channel, and the video I copied them from doesn't appear on the
         | list of results.
         | 
         | That's why VideoMentions Search exists. It provides a way to
         | search the videos within a YouTube channel to reliably find
         | spoken word matches (and it includes matches in the title &
         | description, too).
         | 
         | Thanks for checking out my project! Please let me know if I can
         | answer any questions about it.
        
         | cphoover wrote:
         | They do... the spoken content is one of several inputs to
         | YouTube's indexing.
         | 
         | Also you can search individual videos by showing transcript and
         | pressing ctrl-f
        
           | kellenmace wrote:
           | I have come to the opposite conclusion - YouTube does not
           | reliably include spoken word/transcript matches in search
           | results.
           | 
           | As I wrote in another comment, I've run tests where I open up
           | the transcript of a video using these steps
           | (https://kb.swtc.edu/page.php?id=90230), copy a few of the
           | words, then perform a search using those words across that
           | channel, and the video I copied the words from doesn't appear
           | on the list of results.
           | 
           | If Google decides that it's going to revamp YouTube search to
           | include spoken word/transcription matches, that would make
           | VideoMentions Search irrelevant - and I'd be okay with that!
           | 
           | In the meantime, I think this is a useful free tool for
           | quickly finding spoken word matches within specific channels.
           | 
           | Thanks for checking it out!
        
           | tomatowurst wrote:
           | I didn't know this...how do you show transcript?
        
             | kellenmace wrote:
             | Hey @tomatowurst! You can search within the spoken
             | words/transcript of a single YouTube video by following
             | these steps: https://kb.swtc.edu/page.php?id=90230
        
       | HAL9OOO wrote:
       | Really useful tool! I watch a lot of cycling content and people
       | do reviews of random products in the middle of videos, this helps
       | narrow it down!
        
         | prg318 wrote:
         | s/reviews/paid promotions/
        
       | dmead wrote:
       | red letter media likes to talk about star trek
       | 
       | https://videomentions.com/search?channelUrl=https%253A%252F%...
        
       | ss108 wrote:
       | Very cool, seems useful.
        
       | truly wrote:
       | Nice! Is it possible to do it without fixing the channel?
        
         | kellenmace wrote:
         | YouTube doesn't provide a way to search _all_ of YouTube based
         | on the spoken words in videos, unfortunately.
         | 
         | I could update VideoMentions Search to allow users to select
         | multiple channels, and then perform the search across all of
         | those. Maybe auto-importing all the channels they're
         | subscribed to could be useful. One way or another though, it
         | would still require selecting specific channels to search
         | within. For this first iteration, I just kept things simple
         | with a single channel URL input. Despite that limitation, I
         | still think it's a useful tool, though; I plan to use it often
         | myself.
         | 
         | Thanks for checking it out!
        
           | beeskneecaps wrote:
           | This field was a blocker for me as well. If the channel field
           | was an async select that helped autocomplete lookup channel
           | urls by channel name, this would be way more convenient.
           | 
           | My use case would be for ham radio. Lots of ham radio
           | YouTubers film their QSOs (conversations) and mention the
           | callsigns that they make contact with. I want to find the
           | channels that mention my callsign and I'm sure lots of other
           | hams would want to know. Anywho cool project. GL 73
        
             | kellenmace wrote:
             | I agree! This is how the paid VideoMentions.com service
             | works - users can search for channels by name, without the
             | need to paste in URLs. This auto-complete lookup requires
             | spending a finite number of API calls, though, which is why
             | it's restricted to customers and not available on this
             | freely accessible VideoMentions Search page. Thanks for
             | checking it out!
        
           | truly wrote:
           | Perhaps you could have channel owners register their channel
           | if they want to be indexed. That would be super useful.
        
         | dankwizard wrote:
         | I'm assuming it's using the API to download captions and
         | scanning them, which is why it'd need the Channel. It would be
         | so hard to know where to begin without it!
         | 
         | Potentially future updates could search a logged in user's
         | history?
        
           | kellenmace wrote:
           | Yep, you're exactly right. The tool works by getting the
           | channel's videos, then fetching the voice-to-text transcripts
           | for them, then searching within the spoken words (along with
           | the title and description) for any keyword matches. So there
           | isn't a feasible way to do that across _all_ of YouTube.
           | 
           | I like your idea of searching the logged-in user's history!
           | That could be handy. Another thing I've thought about is
           | auto-importing all the channels they're subscribed to so they
           | can search within those.
           | 
           | For the first version of VideoMentions Search, I kept things
           | simple with a single "Channel URL" input field, but using
           | their history/subscriptions is totally doable. I'll see if
           | there's enough demand for that.
           | 
           | In any event, I still think that this simple first version
           | has utility. I like being able to quickly pinpoint all the
           | moments when a certain topic was mentioned across all of a
           | channel's videos.
           | 
           | Thanks so much for taking a look, and for your feedback!
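
The pipeline described above (get the channel's videos, fetch their
transcripts, then match the keyword against spoken words, title, and
description) could be sketched roughly as follows. This is a minimal
sketch, not the tool's actual code: the Video and Segment types and the
search_channel function are hypothetical names, and the fetching step is
assumed to have already happened.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float  # seconds into the video where this caption begins
    text: str


@dataclass
class Video:
    video_id: str
    title: str
    description: str
    segments: list[Segment] = field(default_factory=list)


def search_channel(videos: list[Video], keyword: str) -> list[dict]:
    """Return one result per video that mentions the keyword anywhere."""
    needle = keyword.casefold()
    results = []
    for video in videos:
        # Case-insensitive substring match against each caption segment.
        spoken = [s for s in video.segments if needle in s.text.casefold()]
        in_title = needle in video.title.casefold()
        in_description = needle in video.description.casefold()
        if spoken or in_title or in_description:
            results.append({
                "video_id": video.video_id,
                "in_title": in_title,
                "in_description": in_description,
                "spoken": [(s.start, s.text) for s in spoken],
            })
    return results
```

One limitation of matching per segment: a phrase that straddles two
caption segments would be missed unless neighboring segments are joined
before searching.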
        
             | fps_doug wrote:
             | Do you keep the transcripts around on the server? It
             | shouldn't matter much in terms of storage unless the site
             | becomes crazy popular, so you could offer a "best effort
             | search" or something along the lines, that just searches
             | everything you got so far, so the site would get better and
             | better over time.
        
               | corobo wrote:
               | YouTube kills your API key if you do this and make the
               | data available (eg via API).
               | 
               | You're allowed to cache responses for a bit but not store
               | them long term. "How would they know", etc of course, but
               | if you're distributing the data they'll figure it out.
               | Some smart cookies over there at Google.
               | 
               | My small website managed to get on their radar and I
               | didn't even post it to HN!
        
         | lrae wrote:
         | No, they'd have to index all youtube videos then.
        
       | mikewhy wrote:
       | I've been wanting to make a supercut of Civvie 11 mentioning John
       | Carmack but have been putting it off. This just did half the work
       | for me, amazing!
        
       | f0e4c2f7 wrote:
       | This is pretty amazing! Thank you for making it.
       | 
       | I'm not sure if this would be against Google's terms so you might
       | want to check:
       | 
       | If you started indexing such that I could do a search and it
       | would come back with any indexed content I think you would have
       | invented a new search engine. Seems extremely useful.
       | 
       | I already use YouTube this way today but as you pointed out
       | searching by title can be tricky.
       | 
       | Search engines supported by ads are also typically quite
       | profitable.
        
       | happy_pancake wrote:
       | Nice! My friends and I had this idea for a hackathon and we won!
       | We skipped lectures and we needed a way to fast forward to
       | relevant parts of a lecture
        
       | baxuz wrote:
       | This is amazing. I actually used yt-dlp to download all subtitles
       | and search in them a few times already.
       | 
       | Thanks a lot!
        
       | nop_slide wrote:
       | Nothing of note to say other than this is really cool!
        
       | wenbin wrote:
       | Great job!
       | 
       | How do you do search result ranking? Any signals that you use to
       | decide "this is a high quality video from reputable channel and
       | it's relevant"? I would imagine # of followers / likes / comments
       | / like-comment ratio or the likes are used?
        
       | atleta wrote:
       | I wanted to try it but for some reason it doesn't accept Lex
       | Fridman's channel as a valid URL:
       | https://www.youtube.com/c/lexfridman
        
         | pks016 wrote:
         | I tried with some other channels. Looks like it's not accepting
         | all channels as valid URLs.
        
       | dmead wrote:
       | would you add support for regexes?
       | 
       | I would like to search for rich evans saying "aiiiiiiiiids" with
       | varying word length but now i can only search for "aids"
       | 
       | https://videomentions.com/search?channelUrl=https%253A%252F%...
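
For illustration, the kind of regex search requested here could look
like this in Python. It is only a sketch of the idea, not a feature of
the tool: the find_mentions helper and the (start, text) segment shape
are assumptions.

```python
import re

# "a", one or more "i", then "ds"; IGNORECASE so "AIIIDS" also matches.
PATTERN = re.compile(r"\bai+ds\b", re.IGNORECASE)


def find_mentions(segments):
    """Yield (start_seconds, matched_text) for every regex hit."""
    for start, text in segments:
        for match in PATTERN.finditer(text):
            yield start, match.group(0)
```

The `i+` quantifier is what covers the varying word length; a plain
substring search can only ever find the exact spelling typed in.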
        
       | kellenmace wrote:
       | Hey HN! I just launched VideoMentions Search, a free tool that
       | allows you to search YouTube to find videos that contain specific
       | spoken words.
       | 
       | Here's how to quickly try it out -
       | 
       | Let's say you want to find every video on The Verge's YouTube
       | channel within the last several months in which "MacBook Pro" is
       | mentioned. Here's a pre-populated search to accomplish that:
       | 
       | https://videomentions.com/search?channelUrl=https%253A%252F%...
       | 
       | If you follow that ^ link, then click the button to perform the
       | search, you'll see all matching videos, with every single mention
       | of "MacBook Pro" highlighted.
       | 
       | My favorite part is that all the highlighted matches are
       | timestamped. So you can click on any of them to jump to that
       | exact moment in the video when that keyword was said!
       | 
       | VideoMentions Search is great for these scenarios:
       | 
       | - You want to find all videos within a YouTube channel that
       | mention your brand, your product, or the topics you care about.
       | 
       | - You remember watching a video where a certain topic was
       | discussed, and now you're trying to remember which video it was
       | to rewatch it.
       | 
       | - You run your own YouTube channel and want to quickly find the
       | exact moments in your videos where you cover certain topics, so
       | you can link others to that content.
       | 
       | I have wanted a way to search YouTube based on spoken words for a
       | while. I couldn't find a tool that provides that capability
       | though, so I built it!
       | 
       | I hope you find VideoMentions Search useful! I'd love to hear any
       | feedback you have on it. Please let me know!
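
A timestamped match can be turned into a "jump to that moment" link by
adding a t parameter to the watch URL. The sketch below assumes the
common watch?v=...&t=...s URL form; timestamped_url is a hypothetical
helper name, not something taken from the tool.

```python
from urllib.parse import urlencode


def timestamped_url(video_id: str, start_seconds: float) -> str:
    """Build a YouTube watch URL that starts playback at start_seconds."""
    # Fractional seconds are truncated so the link lands at or just
    # before the first word of the match.
    query = urlencode({"v": video_id, "t": f"{int(start_seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"
```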
        
         | loxias wrote:
         | This looks super cool! There's a need for tools to make the
         | millions of videos on the web actually useful for people like
         | me who would much rather skim, search, and read than watch. Bravo
         | to you for making a stab at it!
         | 
         | I'm guessing you need a specific channel as a way of getting a
         | list of videos, then pulling the captioning, then searching?
         | (That's the only reason I could think of for the channel
         | restriction, which unfortunately removes 90% of the utility)
        
           | kellenmace wrote:
           | Hey @loxias! Yes, you're exactly right - VideoMentions
           | scrapes the video page markup and pulls out the "baseUrl" for
           | the English caption track. It converts that XML caption track
           | into JSON, then searches it for keyword matches.
           | 
           | YouTube doesn't provide a way to search _all_ of YouTube
           | based on spoken words, unfortunately. Still, I think this is
           | a useful tool for quickly locating spoken word matches within
           | a certain channel.
           | 
           | Some Hacker News commenters have suggested that I could auto-
           | import all the channels the user is subscribed to, or get
           | videos from their watch history, which are interesting ideas.
           | For this first iteration, I kept things simple with a single
           | "Channel URL" input.
           | 
           | Thanks for checking out my project, and for your feedback!
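
The XML-to-JSON step mentioned above might look something like the
sketch below. It assumes a caption document made of <text> elements
with start and dur attributes (in seconds); that layout, and the
captions_xml_to_json name, are assumptions for illustration rather than
a documented format.

```python
import html
import json
import xml.etree.ElementTree as ET


def captions_xml_to_json(xml_text: str) -> str:
    """Convert a caption XML document into a JSON array of segments."""
    root = ET.fromstring(xml_text)
    segments = []
    for node in root.iter("text"):
        segments.append({
            "start": float(node.get("start", "0")),
            "duration": float(node.get("dur", "0")),
            # The XML parser already decodes one level of entities;
            # unescape once more in case the text was double-escaped.
            "text": html.unescape(node.text or ""),
        })
    return json.dumps(segments)
```

Once the captions are a flat list of {start, duration, text} records,
the keyword search is a simple loop, and each hit keeps its timestamp.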
        
             | loxias wrote:
             | > Still, I think this is a useful tool for quickly locating
             | spoken word matches within a certain channel.
             | 
             | Absolutely, and I agree. However, it requires that the user
             | use or understand "channels". When I went to your site my
             | first thought was "what the hell's a channel?" ;-)
             | 
             | But yeah, for whatever percentage of youtube users use
             | channels, I see this having a lot of use. My totally
             | unsolicited advice is that you'll have to find _some_ way
             | to remove the channel requirement. Perhaps suggesting
             | channels based on the search terms. (I've seen your other
             | comments about adding features to a paid version)
             | 
             | I haven't yet managed to get VideoMentions to actually
             | _work_ , but that's fine, I assume you're getting hugged to
             | death. :D Shipping _anything_ is hard, and congrats for
             | that.
        
               | kellenmace wrote:
               | "I assume you're getting hugged to death" -- lol
               | 
               | Yeah, maybe I'll consider having two versions of search -
               | one version for paid customers that allows you to search
               | for YouTube channels by name/keyword and allows you to
               | search across all the channels you're subscribed to or
               | across all the videos in your watch history, and a second
               | version that's free and similar to the current iteration.
               | 
               | VideoMentions Search should work just fine for you -
               | there is no server and no database behind it. When
               | searches are performed, serverless functions are called
               | that source data from YouTube, then return a response to
               | the client. So it should scale just fine.
               | 
               | Can you please visit this link, click the button to
               | perform a search and let me know if you see results pop
               | up?
               | https://videomentions.com/search?channelUrl=https%253A%252F%...
               | 
               | I'd love to know if you don't, or if you see any errors
               | in the browser console. Thanks again for your thoughts on
               | it!
        
               | loxias wrote:
               | (firstly, that search works fine and returns very
               | quickly! Now that I know it's client side, I've been able
               | to do a few searches of my own to see how it works. It
               | works well! The ones I did earlier were looking at a
               | large channel "PBS Newshour" for various news topics over
               | "All time")
               | 
               | (secondly, DO consider putting your email address on your
               | HN profile!)
               | 
               | Whoa! Client side!?! Far out man, far out. To be honest,
               | I was a little unimpressed before -- but didn't feel like
               | "your idea is _brilliant_, but implementation leaves a
               | lot to be desired. makes me want to try my hand at
               | writing my own." would be a constructive comment. ;-)
               | 
               | But, knowing this is all done client side?? I tip my hat,
               | that's *clever*! The whole site could be a static and
               | local set of files? I had no idea things like this could
               | even be done without a server or other real program to do
               | the work. (though now that I think about it, I have an
               | idea how, duh. YouTube exposes a JS API I bet, so you can
               | have the client call it each time, do the searching work,
               | &c.)
               | 
               | That avoids scalability issues and probably legal ones as
               | well! What a game changer...
               | 
               | Was there an existing framework used for building client
               | side web apps like this, or did you roll your own?
               | 
               | While I've got you, and I admit that graphical UIs are
               | ... an area where my opinion should carry no weight (I'm
               | the sort who.. doesn't use icons, and when writing
               | personal tools, the "user interface" is positional
               | parameters to a command line program ;-)), here are some
               | suggestions:
               | 
               | * The UI, while clean and unbusy (yay!!), feels BLOATED
               | with white space. Why do the results occupy this tiny
               | narrow column, forcing me to scroll way more than I
               | should need to?
               | 
               | * In addition to the narrow column, the entries IN the
               | column take up too much space, the UI could be arranged
               | to be much more efficient.
               | 
               | * Since search results are within a channel, don't put
               | the channel name after each entry, there's no point.
               | 
               | * Might be nice to have the time offset visible to the
               | left of the text excerpt. I can see it on mouseover, but
               | still. Might be nice to see.
               | 
               | * Can the videos be made to play inline, without
               | redirecting to youtube, then navigating back, and redoing
               | the search?
               | 
               | * Perhaps, after searching for a result, we could see a
               | graph at the top showing all the videos containing that
               | term over time, and easily click on it to go directly
               | there.
        
         | corobo wrote:
         | Very cool! Well done!
         | 
         | I've not got a legit use for it right now but the few example
         | searches I thought up returned the videos I was thinking of
         | 
         | Nice work!
        
           | kellenmace wrote:
           | Cool! Thank you!
           | 
           | Here are some example use-cases that I included in another
           | comment, in case any of them are relevant to you:
           | 
           | 1. You want to find all videos within a YouTube channel that
           | mention your brand, your product, or the topics you care
           | about.
           | 
           | 2. You remember watching a video where a certain topic was
           | discussed, and now you're trying to remember which video it
           | was to rewatch it.
           | 
           | 3. You run your own YouTube channel and want to quickly find
           | the exact moments in your videos where you cover certain
           | topics, so you can link others to that content.
           | 
           | Maybe #2 happens occasionally? If so, maybe you can bookmark
           | VideoMentions Search for just such an occasion :)
           | 
           | In any event, thanks for checking out my project, and for the
           | kind words!
        
         | CopOnTheRun wrote:
         | How are you sourcing the text for the videos? This search [1]
         | grabs some results for my query, but it does miss this [2]
         | video which contains the searched keyword multiple times, and
         | the video's subtitles indicate as much.
         | 
         | [1]
         | https://videomentions.com/search?channelUrl=https%253A%252F%...
         | 
         | [2] https://www.youtube.com/watch?v=3denP7wX2XU&t=296s
        
           | kellenmace wrote:
           | VideoMentions scrapes the video page markup and pulls out the
           | "baseUrl" for the English caption track. It converts that XML
           | caption track into JSON, then searches it for keyword
           | matches. You're right that this particular search for "toxic"
           | should find several spoken word matches, but it doesn't. It
           | seems like the tool isn't able to access the captions data
           | for that video for some reason. I made a note of this bug,
           | and I'll look into fixing it. Thanks for pointing it out, and
           | for checking out VideoMentions Search!
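
Picking out the English track's "baseUrl" could be sketched as below,
assuming the player data embedded in the page has already been parsed
into a dict. The nesting used here (captions, then
playerCaptionsTracklistRenderer, then captionTracks) and the
languageCode field are assumptions about that data, and
english_caption_base_url is a hypothetical helper.

```python
from typing import Optional


def english_caption_base_url(player_response: dict) -> Optional[str]:
    """Return the baseUrl of the first English caption track, if any."""
    tracks = (
        player_response.get("captions", {})
        .get("playerCaptionsTracklistRenderer", {})
        .get("captionTracks", [])
    )
    for track in tracks:
        # Accept regional variants such as "en-US" or "en-GB" as well.
        if track.get("languageCode", "").startswith("en"):
            return track.get("baseUrl")
    return None  # no English track found in the parsed data
```

Returning None rather than raising lets the caller still match the
video's title and description when no caption track is available.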
        
             | tcmb wrote:
             | yt-dlp [1] has command-line options to download only the
             | captions of a video, in available languages, if you want to
             | skip the scraping for the link.
             | 
             | I built something similar [2] for a slightly different use
             | case. I wanted to be able to search through all Ram Dass
             | talks in the 'Here and Now' podcast series on YT. I'm
             | obviously not as skilled at CSS. :) And the display of
             | timestamps is still a bit shaky, but for me it fulfills its
             | purpose.
             | 
             | Since I'm able to preload all caption files ahead of time,
             | I'm just using pcregrep for the search which does a pretty
             | good job.
             | 
             | [1] https://github.com/yt-dlp/yt-dlp
             | [2] https://ramdass-search.net
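
Searching preloaded caption files can be approximated in a few lines.
The sketch below uses Python's re module in place of pcregrep, and it
assumes the captions were already saved as .vtt files in one directory
(for example with yt-dlp); grep_captions is a hypothetical name.

```python
import re
from pathlib import Path


def grep_captions(directory: str, pattern: str):
    """Return (file_name, line_number, line) for each matching line."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(directory).glob("*.vtt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for line_no, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append((path.name, line_no, line.strip()))
    return hits
```

Because the files are local, there is no per-search network cost, which
is the main advantage of preloading over fetching captions on demand.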
        
         | rodnim wrote:
         | > - You remember watching a video where a certain topic was
         | discussed, and now you're trying to remember which video it was
         | to rewatch it.
         | 
         | But I need to remember which channel I watched? Maybe I'm
         | missing something, but in my eyes it would make it tremendously
         | more useful if I didn't have to specify channel.
        
           | Nowado wrote:
           | I'm sure it would be, but then you're indexing whole youtube
           | with respect to words spoken in each video. A thing that
           | Google, arguably the best organization when it comes to
           | indexing stuff, is working on.
        
             | kellenmace wrote:
             | Yes, my thoughts exactly. If Google decides that it's going
             | to revamp YouTube search to include spoken
             | word/transcription matches, that would make VideoMentions
             | Search irrelevant - and I'd be okay with that!
             | 
             | In the meantime, I think this is a useful free tool for
             | quickly finding spoken word matches within specific
             | channels.
             | 
             | Thanks for checking it out!
        
           | kellenmace wrote:
           | Yeah, YouTube doesn't provide a way to search _all_ of
           | YouTube based on the spoken words in videos.
           | 
           | I could update VideoMentions Search to allow users to select
           | multiple channels, and then perform the search across all of
           | those (maybe importing all the channels they're subscribed to
           | could be handy... ). One way or another though, it would
           | still require selecting specific channels to search within.
           | That limitation notwithstanding, I still think it's a useful
           | tool, though!
           | 
           | Thanks for checking it out!
        
             | hanniabu wrote:
             | Do you cache the channel, search term, and results for
             | faster, more efficient responses later?
        
               | kellenmace wrote:
               | Hey @hanniabu! I do client-side in-memory caching of
               | videos data, only. No server-side caching. In fact, there
               | is no database involved at all - the client-side app calls
               | serverless function API endpoints to fetch the YouTube
               | channel and video data it needs. Here are the tricks I'm
               | using to make it fast:
               | 
               | - As soon as the "Channel URL" field loses focus, I start
               | fetching the most recent 30 videos on that channel in the
               | background. This way, by the time the user enters the
               | keyword and date range, I've already fetched some (maybe
               | even all!) of the data ahead of time, which means less
               | wait time for them.
               | 
               | - Once a specific video's data (title, description,
               | transcript, etc.) has been fetched once, it is saved in
               | memory. All other searches the user performs from that
               | point on will pull the video data from the in-memory
               | cache, if it's there. Otherwise, it will fall back to
               | fetching the video data over the network. This in-memory
               | caching makes subsequent searches within the same date
               | range (or a shorter date range) take <1 second.
               | 
               | - Network requests to fetch video data are processed
               | concurrently rather than one at a time. So the browser
               | fires off as many as it can in parallel to get them all
               | resolved as quickly as possible.
               | 
               | - As soon as any matches are found, the UI updates to
               | show the user. This way, the user can start scrolling
               | through the matches and reviewing them while the search
               | is still in progress - they don't have to wait until it
               | finishes to start interacting with the matches.
               | 
               | Thanks for checking out my project!
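
Three of the tricks listed above (in-memory caching, concurrent
fetches, and showing matches as soon as they arrive) can be sketched
together. This is an illustration in Python rather than the browser
code the tool actually runs; the fetch callback, the video dict shape,
and all function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# In-memory cache: video id -> fetched video data. It lives only as
# long as the process does, like a cache held in a browser tab.
_cache: dict[str, dict] = {}


def get_video(video_id, fetch):
    """Return cached video data, calling fetch() only on a cache miss."""
    if video_id not in _cache:
        _cache[video_id] = fetch(video_id)
    return _cache[video_id]


def search(video_ids, keyword, fetch, max_workers=8):
    """Yield matching video ids as soon as each video's data arrives."""
    needle = keyword.casefold()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Start all fetches up front so they run concurrently.
        futures = [pool.submit(get_video, vid, fetch) for vid in video_ids]
        # as_completed hands back results in finish order, so the
        # caller can display early matches while others still load.
        for future in as_completed(futures):
            video = future.result()
            if needle in video["transcript"].casefold():
                yield video["id"]
```

Because search is a generator, the caller can render each match as it
is yielded instead of waiting for the slowest request to finish.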
        
               | alex_smart wrote:
               | >I do client-side in-memory caching of videos data, only.
               | 
               | This works only if the user does not navigate from and
               | back to your website or refresh the page, but if they do,
               | you make the same api calls all over again. You should
               | set HTTP Cache-Control headers in your response from the
               | server, so that the browser knows that it can serve that
               | data from its cache and does not need to make those
               | requests again. You would then probably not need the
               | client-side in-memory cache at all.
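
As a rough illustration of this suggestion, a serverless handler could
attach a Cache-Control header to its response. The response shape
(statusCode, headers, body) and the one-hour max-age are assumptions
chosen for the sketch; the real platform and a sensible lifetime are
not stated in the thread.

```python
import json


def video_data_response(video: dict, max_age_seconds: int = 3600) -> dict:
    """Build an HTTP response that browsers may reuse from their cache."""
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # "public" allows shared caches to store the response too;
            # max-age is how long it may be reused without refetching.
            "Cache-Control": f"public, max-age={max_age_seconds}",
        },
        "body": json.dumps(video),
    }
```

With a header like this, a page refresh within the max-age window can
be answered from the browser's HTTP cache without calling the function
again.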
        
             | bredren wrote:
             | I get why you wouldn't be able to index all of YouTube,
             | that's a big ask.
             | 
             | However, I don't use YouTube enough to mess w channels
             | much. I'm usually searching on a particular topic.
             | 
             | For example, "drop ceiling panel replacement."
             | 
             | Perhaps you could help users limit the channel scope by
             | making an intelligent channel selection by keyword.
             | 
             | So I would put in "home improvement," and you could choose
             | some appropriate channels to search for my search terms.
        
               | kellenmace wrote:
               | Hey @bredren! Yeah, I agree that this makes for a nice
               | user experience. This is how the paid VideoMentions.com
               | service works - users can search for channels by name or
               | keywords, without the need to paste in URLs. It looks
               | like this: https://cloudup.com/cUgKqErcx8G
               | 
               | That auto-complete lookup requires spending a finite
               | number of API calls, though, which is why it's restricted
               | to customers and not available on this freely accessible
               | VideoMentions Search page.
               | 
               | Thanks for checking it out, and for your feedback!
        
               | bredren wrote:
               | Ah! Okay. Makes sense. Thanks for the reply, maybe I'll
               | try out the pro version.
        
             | mistermann wrote:
             | Could you do a search based on their watch history if they
             | have it enabled?
        
               | hanniabu wrote:
               | That'd certainly be useful
        
               | kellenmace wrote:
               | This is a cool idea! It wouldn't apply to people who want
               | to search all the videos on a given channel for specific
               | keywords (including those they haven't watched). I can
               | see it being useful for folks trying to locate a specific
               | video they remember watching in the past, though.
               | 
               | One consideration is that getting the user's watch
               | history would likely require calls to the YouTube API. So
               | that means I would have to make this a paid service in
               | order to offset the cost of those API requests. The
               | beautiful thing about the current iteration is that it
               | doesn't rely on YouTube's API at all. By scraping YouTube
               | pages and leveraging a few NPM packages, I'm currently
               | able to offer free and unrestricted access to it.
               | 
               | If enough people request that ability though, I'll
               | consider incorporating it.
               | 
               | Thanks for checking out my project and for the great
               | idea!
        
             | joshvm wrote:
             | They don't offer direct search, but isn't this what "key
             | moments" is in search results? Try eg _how to change a
             | lightbulb_.
             | 
             | I believe SeekToAction works even if the uploader didn't
             | put chapters in. This was a relatively recent update, to
             | make it fully automatic. So it's presumably doing some
             | audio/video analysis to figure it out. All you need to do
             | is tell Google how to seek your video (so it also works
             | with non youtube videos too).
             | 
             | https://developers.google.com/search/blog/2021/07/new-way-
             | ke...
        
               | kellenmace wrote:
               | Oh, cool! I hadn't heard of the Key Moments/SeekToAction
               | feature before. I'll have to dig in and explore that a
               | bit. Thanks for the tip!
        
         | jobigoud wrote:
         | I tried on this channel which is in Spanish [1] and it returned
         | very fast with no results.
         | 
         | Then tried on GeoWizard channel (English) and it worked great.
         | 
         | There is a somewhat related tool called Youglish that works on
         | subtitles, it's great for checking the pronunciation or usage
         | of a word or expression in many languages, but it's based on a
         | curated list of channels with known good subtitles. I thought
         | yours could be a great complement to this as it works directly
         | on the audio.
         | 
         | [1] https://www.youtube.com/c/Tercosmicqueen
        
           | kellenmace wrote:
           | Hey @jobigoud! Yeah, currently VideoMentions only uses the
           | auto-generated English video transcriptions when searching
           | for matches. If there was enough demand, it could be expanded
           | to support other languages. Doing so would add complexity and
           | would make searches take much longer, though. I'd probably
           | need to add more UI fields to allow users to select the
           | languages to target. Could be done, though!
           | 
           | Yes, youglish.com is a neat tool! I stumbled upon it when
           | looking around for a way to search YouTube videos based on
           | spoken words.
           | 
           | Thanks for checking it out! :)
        
         | [deleted]
        
         | barefeg wrote:
         | Are the video transcripts searchable?
        
           | kellenmace wrote:
           | Yes, this is how VideoMentions Search works. It scrapes the
           | video page markup and pulls out the "baseUrl" for the English
           | caption track. It converts that XML caption track into JSON,
           | then searches it for keyword matches. Is that what you're
           | asking about?
           | 
           | If you want to search within the transcript of a single
           | video, you can accomplish that with these steps:
           | https://kb.swtc.edu/page.php?id=90230
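The pipeline described above (caption-track XML converted to JSON, then searched for keywords) can be sketched roughly as follows. This is a minimal illustration and not VideoMentions' actual code: the sample XML layout, the `captions_to_json` and `search_cues` names, and the field names are all assumptions, and the download of the "baseUrl" is left out so the sketch runs offline.

```python
import json
import xml.etree.ElementTree as ET
from html import unescape

# Hypothetical caption track; the real XML layout is an assumption.
SAMPLE_XML = """<transcript>
<text start="0.0" dur="2.5">kept things simple</text>
<text start="2.5" dur="3.1">attach the battery which can be removed</text>
<text start="5.6" dur="2.0">with just eight bolts from underneath</text>
</transcript>"""


def captions_to_json(xml_text):
    """Convert caption XML into a list of JSON-friendly cue dicts."""
    root = ET.fromstring(xml_text)
    cues = []
    for node in root.iter("text"):
        cues.append({
            "start": float(node.get("start", "0")),
            "duration": float(node.get("dur", "0")),
            # Caption text may carry HTML entities, so unescape it.
            "text": unescape(node.text or "").strip(),
        })
    return cues


def search_cues(cues, keyword):
    """Return the cues whose text contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [cue for cue in cues if needle in cue["text"].lower()]


cues = captions_to_json(SAMPLE_XML)
matches = search_cues(cues, "eight bolts")
print(json.dumps(matches, indent=2))
```

Matching cue by cue misses a phrase that straddles two cues; joining neighbouring cues before searching would be one way around that.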
        
             | barefeg wrote:
             | Ah got it. I thought there would be some API where the
             | transcripts could be searched across videos. Maybe that
             | requires way too many resources for Google to index
        
               | kellenmace wrote:
               | Yeah, Google is of course the king of search, so they
               | could certainly decide to revamp YouTube search to
               | include spoken word/transcription matches. They have all
               | the data required to make that happen. That would make
               | VideoMentions Search irrelevant- and I'd be okay with
               | that!
               | 
               | In the meantime, I think this is a useful tool for
               | quickly locating videos based on spoken words.
               | 
               | Thanks for checking out my project!
        
       | atentaten wrote:
       | This is a nice tool. I wish one didn't have to specify the
       | channel, as some comments mentioned. If it's not possible to get
       | the channel when providing the video url via an API, it is
       | possible to get it from the video url's GET response data. The
       | latter may be a little slower, but might be worth it in terms of
       | UX.
        
       | filoeleven wrote:
       | This is amazing.
       | 
       | I just tested it against a smallish British channel with a video
       | that I wanted to see again, and couldn't remember which one of
       | the 50-odd videos it was. It did not catch the full quote
       | directly, because YT read it as "Marvin nature" instead of
       | "Mother Nature," a consequence of Alfie's accent. But my search
       | for "Mother Nature" picked up another reference close enough to
       | it to show the text. Sort of unlucky with the miss and very lucky
       | with the proximity to the hit.
       | 
       | I drew a blank regarding the other channels whose videos I want
       | to revisit, but I know I will think of them later because there
       | are so many. This is extremely useful. I've already bookmarked
       | the site.
        
       | ojiwan wrote:
       | Pretty cool. I could see this as a useful automated tool for
       | companies to monitor their brand, or competitors. Maybe a weekly
       | report of channels that have mentioned specific keywords.
        
       | anoncow wrote:
       | How is this different from what Google currently does for in-
       | video text search?
        
       | 1970-01-01 wrote:
       | I wanted this just yesterday! I was searching several channels to
       | recall how many bolts are holding the battery for the electric
       | F150. Sure enough, Top Gear told me how many bolts the F150 used
       | for its battery pack!
       | 
       | Result:
       | 
       | SPOKEN WORDS "...kept things simple it recycled the same basic
       | chassis and attach the battery which can be removed with just
       | eight bolts from underneath and used carryover components
       | wherever possible in fact..."
        
       | [deleted]
        
       | kumarski wrote:
       | Is this based on acoustic keyword recognition or SRT search?
        
       | wolongong942 wrote:
       | Very interesting tool, could see myself using it a lot.
       | 
       | Couldn't help but search the guy that uploaded 2 million vids to
       | yt, sorry if it breaks anything.
       | 
       | It would be cool to search for a specific quote like "buddy of
       | mine" (frequently said by Joe Rogan). Also searching for parts of
       | a word (similar to regex options?) might be difficult to
       | implement but would be super useful as an option i.e if i
       | searched for any word that starts with yc with a correct pattern
       | (or given option on the UI) i could find results for both "yc"
       | and "ycombinator".
        
         | kellenmace wrote:
         | Hey @wolongong942! Nice- I'm glad you're finding it useful!
         | 
         | VideoMentions Search is built entirely using serverless
         | technologies, so it should scale really well. There's no single
         | server or database to act as a bottleneck. When you perform a
         | search, your browser fires off a number of network requests to
         | serverless function API endpoints that fetch the data from
         | YouTube, then return a response. So if you do an "All time"
         | search on a channel with 2 million videos, your laptop fan may
         | kick on while your browser works hard firing off thousands and
         | thousands of requests until it's fetched all 2 million videos,
         | you run out of memory, or you manually hit the "Cancel" button
         | - whichever comes first :)
         | 
         | Your regex idea is interesting, and I'll consider implementing
         | some more complex rules like that if enough people request it.
         | For now, my answer would be to either perform separate searches
         | (like one for "yc" and another for "ycombinator"), or perform
         | one search, then use ctrl/cmd+f to search within the matches
         | displayed on the page.
         | 
         | Thanks for checking out my project, and for the great feedback!
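The fan-out described above (many requests fired in batches until everything is fetched or the user cancels) follows a common pattern that can be sketched as below. The real tool does this in the browser against serverless endpoints; here `fetch_transcript` is a hypothetical stand-in that returns fake data, and the batch size is an arbitrary assumption.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch_transcript(video_id):
    """Hypothetical stand-in for one request to a serverless endpoint."""
    return {"id": video_id, "text": f"transcript for {video_id}"}


def search_channel(video_ids, cancel, batch_size=4):
    """Fetch transcripts batch by batch, stopping early once cancel is set."""
    results = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in range(0, len(video_ids), batch_size):
            if cancel.is_set():
                break  # the user hit "Cancel"; keep what we have so far
            batch = video_ids[start:start + batch_size]
            results.extend(pool.map(fetch_transcript, batch))
    return results


cancel = threading.Event()
ids = [f"video{n}" for n in range(10)]
done = search_channel(ids, cancel)
print(len(done))  # 10
```

Checking the cancel flag between batches bounds how much work is in flight at once, which is also what keeps memory use from growing with the size of the channel.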
        
           | messe wrote:
           | What're the (approximate) costs for monthly usage? (Or if you
           | don't know yet: what're you budgeting for?)
        
       | PaulKeeble wrote:
       | One of the consequences of the effectiveness of click-bait titles
       | on youtube is that searching for videos you watched historically
       | is extremely difficult, because quite often the primary thing in
       | the video is never mentioned in the title. The descriptions are
       | almost universally used to store common links that are about the
       | channel not the video and the consequence of this is that click-
       | bait titled content is very hard to find after the fact. The
       | other problem is quite a lot of channels are testing one title
       | and then changing to another, the video won't then be found using
       | the initial title and so even if you do remember elements of the
       | title the channel may have made that impossible to use some hours
       | or days later as another title with better performance was used.
       | 
       | I can see a need for search in the contents of the videos but I
       | don't want to specify the channel, I likely don't know especially
       | since channels can also change their name.
        
         | Akronymus wrote:
         | Also, the youtube built in search often fails to match videos
         | that have a word in the title for which you search.
        
           | Minor49er wrote:
           | I can't even find a specific channel when I spell it
           | correctly, even with double quotes. I have to go into the
           | filters and specify channels-only, and even then it puts a
           | bunch of wrong results in front of it. Their search on
           | YouTube used to be better, showing results for what was
           | actually searched for. But now it seems like they're trying
           | to drive views to high-performing videos since they now give
           | a single page of results for any query before backfilling the
           | results with a bunch of suggestions that have nothing to do
           | with the search
        
             | Akronymus wrote:
             | Up until a few days ago I had a redirect for search that
             | disabled the suggestions on the search page. Now, if I do
             | that, the search only gives 1, irrelevant, result.
        
             | nonameiguess wrote:
             | You should be able to just go directly to the channel using
             | the URL if you know the channel name.
             | youtube.com/c/<CHANNEL_NAME>. What on earth they use the
             | distinction for, I don't know, but if it's a user with
             | videos instead of a channel, then
             | youtube.com/user/<USER_NAME>.
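The two URL shapes mentioned above can be built with a small helper. This is a sketch that only mirrors the patterns given in the comment; `channel_url` is a made-up name, and whether a given channel answers on `/c/` or `/user/` is not something the code can know.

```python
from urllib.parse import quote


def channel_url(name, kind="c"):
    """Build a direct channel URL; kind is 'c' (channel) or 'user'."""
    if kind not in ("c", "user"):
        raise ValueError("kind must be 'c' or 'user'")
    # Percent-encode the name so spaces or slashes cannot break the path.
    return f"https://www.youtube.com/{kind}/{quote(name, safe='')}"


print(channel_url("Tercosmicqueen"))
print(channel_url("SomeUser", kind="user"))
```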
        
           | andai wrote:
           | YouTube search is almost completely useless, I just use
           | Google to find YouTube videos. It does seem to search in the
           | video transcript and the comments (I'm not sure but it seems
           | that way).
        
             | hbn wrote:
             | YouTube's search results will show you like 10 results of
             | the thing you actually searched for, then resort to a
             | section of completely random video suggestions unrelated to
             | your search in hopes some thumbnail will draw your
             | attention and suck you back in to wasting your time
             | watching videos and being served ads.
        
               | lostinroutine wrote:
               | I also suffered from this issue for a while but then
               | discovered "Unhook" browser extension, which in addition
               | to many other amazing features (e.g. hide recommended
               | videos/comments) has a feature to hide irrelevant search
               | results. With the feature on I get an endless list of
               | matching results and no junk.
        
           | PaulKeeble wrote:
           | Youtube is clearly blacklisting certain channels and they
           | can't be found via even precise searches.
        
           | sph wrote:
           | Unreasonable of you to expect search to be good in a Google
           | product.
        
         | htrp wrote:
         | people also edit the titles of videos from time to time... so
         | even the exact title doesn't help
        
           | [deleted]
        
           | corobo wrote:
           | Aye especially in modern day YouTube
           | 
           | Once you're big enough your stats start being useful, folks
           | in the know (mrbeast's analytics guy, I can't think of more
           | rn, etc) advise you rotate out the thumb/titles until the
           | click through rate is decent
        
       | lewisjoe wrote:
       | Hi, Great job. I was almost about to build such a thing before I
       | realized the crazy amount of crawling and transcriptions that I
       | have to index.
       | 
       | How exactly did you solve/approach the problem?
       | 
       | 1. How did you crawl across those millions of videos from the
       | platform?
       | 
       | 2. How are you indexing stuff like that?
        
         | jonplackett wrote:
         | Bump. Would like the lowdown too!
        
         | SteveDR wrote:
         | I built something similar to this when I was first learning to
         | program. Except mine lets you perform a one-time search for
         | videos containing keywords, similar to how you would normally
         | search YouTube. So there are no notifications or anything.
         | 
         | https://phrasefinder.net
         | 
         | I don't know if it even works anymore and I'm sure the code is
         | atrocious. But I remember that I would just scrape YouTube
         | pages for video IDs, and then use an API that returned video
         | captions for a given ID [1]. I could see how OP would do
         | something similar.
         | 
         | [1] https://pypi.org/project/youtube-transcript-api/
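The first half of the approach described above, scraping a page for video IDs, might be sketched like this. The `"videoId":"..."` pattern is an assumption about how IDs appear in the page markup, and the sample string is fabricated for the example; each ID found would then be handed to a caption fetcher such as the package linked at [1].

```python
import re

# Assumed markup pattern: an 11-character ID inside a "videoId" JSON field.
VIDEO_ID = re.compile(r'"videoId":"([A-Za-z0-9_-]{11})"')


def extract_video_ids(page_html):
    """Return the unique video IDs found in the markup, in first-seen order."""
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(VIDEO_ID.findall(page_html)))


sample = (
    '{"videoId":"aaaaaaaaaaa"},'
    '{"videoId":"bbbbbbbbbbb"},'
    '{"videoId":"aaaaaaaaaaa"}'
)
print(extract_video_ids(sample))  # ['aaaaaaaaaaa', 'bbbbbbbbbbb']
```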
        
         | alex_smart wrote:
         | Based purely on the speed of the results, I believe that the
         | crawling is happening in real time.
         | 
         | The search is scoped by channel, so the closed-caption files
         | for all the videos in the channel are downloaded and searched
         | for on the fly.
         | 
         | Edit: Wow, thanks to dev tools, I can see that the website is
         | downloading the transcript and metadata for all the videos from
         | the channel to the client. So the search is happening client-
         | side!!
        
         | galori wrote:
         | It's only for a single channel. That's the answer, it queries in
         | real time.
        
       | galori wrote:
       | This is neat, but I would like it to search against the entire
       | archive of all youtube videos. I know you can't do that...but
       | perhaps a Google employee making use of Google's Big Data
       | offerings.
        
       ___________________________________________________________________
       (page generated 2022-05-25 23:01 UTC)