[HN Gopher] Show HN: Podcast API
___________________________________________________________________
Show HN: Podcast API

Author : wenbin
Score : 165 points
Date : 2020-11-18 17:48 UTC (5 hours ago)

(HTM) web link (www.listennotes.com)
(TXT) w3m dump (www.listennotes.com)

| mikece wrote:
| Is this service cooperating with or in competition to the Podcast
| Index?
|
| https://podcastindex.org/
| a_brawling_boo wrote:
| I was wondering the same thing. This one is a paid $$$ API.
| stevoski wrote:
| As an avid podcast creator and consumer, I'd love to take a full
| look at this, but I kept getting the full captcha experience. You
| know what I mean, "select the squares with sidewalks" :(
|
| OP, are you aware of this?
| petercooper wrote:
| It's not a replacement for this by any means, but in case anyone
| would find a reasonably up-to-date list of around 600,000
| podcasts useful, here you go: https://gofile.io/d/MjYVy7 - No
| episodes, just the name, creator, and feed URLs for further
| crawling on your own.
| darrenwestall wrote:
| Happy customer here. I've seen a few people suggest a free
| service but from our testing this is far more comprehensive and
| with better search quality.
|
| We use the service in conjunction with iframely to load podcast
| episodes that can be listened to with ease.
|
| Great product, customer service and documentation.
|
| Thanks from team Paiger <3
| [deleted]
| mongol wrote:
| Suggestion for pivot: add a podcast playing web application
| (basically podcast subscriptions, you already have most of the
| rest in place), and charge a more reasonable amount for that plus
| unlimited regular search. The pro subscription is way too
| expensive for me.
|
| Edit: I didn't notice that this is about a new API service
| OJFord wrote:
| > API Terms of Use
|
| > Applications using the Listen API must not pre-fetch, cache,
| index, or store any content on the server side. Note that the id
| and the pub_date (e.g., latest_pub_date_ms, pub_date_ms...)
| of a podcast or an episode are exempt from the caching restriction.
|
| Is that.. common? I've never knowingly come across anything like
| that before, seems weird to me. Sort of makes sense, in a 'you
| must not try to avoid needing to pay us more because we want more
| money' sort of way, but.. really? Also, almost entirely
| (basically, except OSS) undetectable, surely.
|
| [Edit: failure to read my own quote correctly, thanks xd1936] ---
| And if you really take it seriously - 'must not [...] store any
| content' - it really limits what you could even use it for, not
| being able to store the `id` even for a later reference. I don't
| think that's what's intended, but it seems to be what's written.
| ---
|
| (Just so I don't sound like a grumpy old git (I'm not old, at
| least!) - I really _really_ really like the docs page
| https://www.listennotes.com/api/docs/ only thing I'd suggest
| perhaps is embedding the OpenAPI 'HTML' contents below the other
| options, rather than it being a link to follow. Awesome though.)
| loosetypes wrote:
| I've noticed similar recently with many paid book search apis
| out there and was also grossed out.
|
| You're not paying for a data source at all, you're paying for
| an expensive embedded application.
|
| I don't see how it's remotely reasonable. The person managing
| this api has stricter protections on this data (though they're
| not even his podcasts) than we have on our personal data.
| PragmaticPulp wrote:
| You're not paying for the data, you're paying for the
| service.
|
| This is common. Companies that provide the data for offline
| use tend to have a separate licensing and subscription fee
| structure. Companies that provide the API tend to forbid
| offline caching/storage of the data.
| loosetypes wrote:
| I commend the service provider for aggregating the data and
| making a business - hope that person is able to make a
| living from it.
|
| It's an interesting service that I would be very interested
| in using in providing a service of my own. And I'd be more
| than happy to pay for it, but those terms are a non-starter,
| at least for me.
|
| The year is 2040. There's no running water. Grocery stores
| mandate that all purchased liquids must be consumed prior
| to leaving the premises.
| Fnoord wrote:
| Thing is, you provide an API so that people don't use some
| kind of data harvester. If your API is terrible, people
| resort to harvesting.
| OJFord wrote:
| The service, though, is 'convenient access to the data
| [which is already out there]'. And once I've used it, I
| don't need it 100/sec just because that's how frequently
| people are using my downstream service to do something with
| some popular 'trending' podcast; I'm perfectly happy (and
| it would be a good practice to be!) caching it for some
| period, until I need the service again to conveniently see
| if the data that's already out there has changed.
| PragmaticPulp wrote:
| > The service, though, is 'convenient access to the data
| [which is already out there]'.
|
| The service is whatever is described in the contract you
| agree to when you purchase it.
|
| If you don't like the terms of the contract, you can
| always try to negotiate an alternate agreement. Or you
| can choose not to purchase the service.
|
| The seller isn't obligated to provide their services on
| your terms, just as you're not obligated to purchase the
| seller's services on their terms if you don't agree to
| them.
| hombre_fatal wrote:
| Map tiling APIs do this, like Mapbox and Google. Else you could
| circumvent all but their lowest-tier subscription plans with a
| brain-dead caching proxy and a large disk, which is what they
| want to avoid.
|
| Amazon's API famously does this as well (or used to, it's been
| a while) by requiring any prices you show to be no more out of
| date than N minutes, forcing you to basically request on demand
| every time you need to show it. They'd rather you just send the
| traffic their way for people to see the price.
| klysm wrote:
| Depending on the use case, possibly a whole lot of disks.
| ehsankia wrote:
| Right, I would assume that even just the tiles for the
| biggest cities alone would still be way more than most
| would want to store. On the other hand, let's assume on the
| client-side, can you not even cache a tile a user just saw
| 10s ago but went off screen? Or is it assumed the browser
| will cache that tile?
| OJFord wrote:
| Heh, yeah. I think my reaction's still similar though - why
| _shouldn't_ I be allowed to do that?
|
| The alternative of course is to charge more per tile, or have
| a base 'access fee' + small incremental charge. Pay per usage
| doesn't work best for everything, IMO.
|
| (And I'd likely still want to come back occasionally to check
| it hasn't changed, even if I cached every tile forever.
| (Which I probably wouldn't, if the hit rate was really low,
| like it was a one-off, and I'm being cheap about my API usage
| why wouldn't I also be cheap about my disk usage.))
| PragmaticPulp wrote:
| > why shouldn't I be allowed to do that?
|
| Short answer: Because that's the contract.
|
| Companies that provide data for offline use will have a
| separate licensing model, usually with subscriptions for
| updates or perhaps a finite license term. MaxMind's GeoIP
| database is a popular example.
| OJFord wrote:
| That's not really an answer though, that was the starting
| point.
|
| And this isn't a one-off dataset, we're discussing an API
| pricing model - there _will_ be new podcasts, existing
| podcasts' metadata _will_ change; people using this API
| _will_ want to make repeated calls, they just might also
| reasonably want to cache results.
|
| If this were my service, I just wouldn't do pay-per-API-call,
| or at least not _only_. Of course, the free tier
| presents more of a problem then, but I'd probably just
| restrict it more making it less attractive, and have a
| lower entry point than the $100pcm that's a flat-fee for
| some but not all extra features, showing images at all
| (and not in free), for example.
|
| As it is, I reckon loads of users cache results - not
| maliciously, just because they haven't read that they're
| not supposed to - and that OP has no idea (because how
| would they).
| hombre_fatal wrote:
| Pay-per-use is just the simplest and most straightforward
| and possibly fairest way to couple the value your API
| gives someone with the amount they pay in return.
|
| Or, from the eyes of the user, they get full access to
| the API yet don't have to pay much if their project gets
| no traction.
|
| The downside is that users can lie, but it's mainly just
| low-end users who would lie. Pay-per-user licenses are
| similar: a startup or a hackathon is most likely to share
| the license between a few people while larger companies
| are going to be honest because (1) they can afford it and
| (2) they don't want trouble at scale.
|
| So you can ignore most abuse.
|
| The problem with other payment structures for ListenNotes
| is that it's a relatively small database. You can clone
| the whole thing trivially. It doesn't even mirror/host
| the audio feeds. Its only value is that it put in the
| work of structuring and normalizing the metadata.
|
| If you built a business on top of ListenNotes, you'd save
| more and more money as you grow bigger and bigger if you
| were simply cloning the whole thing with your own
| crawler. So the more value you would get from
| ListenNotes, the less you're actually paying them. Or
| ListenNotes would have to price their per-call fee so
| high that they could somehow capture a fair price for
| that value yet shut out smaller users.
|
| Turns out "courtesy agreements" generally do work at
| scale as larger companies become less and less likely to
| lie just like they become less and less likely to pirate
| Photoshop.
|
| > have a lower entry point than the $100pcm that's a
| flat-fee for some but not all extra features, showing
| images at all (and not in free), for example.
|
| The downside of this is that now you limit what people
| can build on cheaper tiers. In fact maybe they can't even
| build their compelling product without whatever content
| you're paywalling behind tiers they can't afford on day
| 1, while the goal is to let someone build anything they
| want on day 1 so that they are a large end-user on day
| 1000.
|
| After all, the ideal isn't that you scale value with your
| customer's income but rather you scale in price as they
| convert value into income. It, of course, is all just
| trade-offs.
| nefitty wrote:
| I worked on a food tracking PWA, and getting it to be useful
| offline was horrible. We'd have to hit the API at least once a
| day to grab commonly used foods and refresh our temporary
| cache. The data did not change at all... eggs don't suddenly
| have a different calorie count the next time you eat them lol
| OJFord wrote:
| A database of all of the world's foods though could easily be
| larger than I'd like a calorie counting app on my phone to be
| though, for example. So it's not necessarily silly - network
| can be cheaper than disk.
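
A minimal sketch of the caching exemption quoted from the API terms earlier in this thread, assuming a response item is a plain dictionary. Only the field names `id`, `pub_date_ms`, and `latest_pub_date_ms` come from the quoted terms; the helper name `cacheable_fields`, the sample payload, and the rule "any key containing `pub_date`" are illustrative assumptions, not the provider's own implementation or an official reading of the terms.

```python
from typing import Any, Dict


def cacheable_fields(item: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only keys the quoted terms exempt from the caching restriction.

    Assumption: "pub_date" fields are recognised by substring match on the
    key name, since the terms list them with a trailing "...".
    """
    return {
        key: value
        for key, value in item.items()
        if key == "id" or "pub_date" in key
    }


# Hypothetical episode payload, invented for illustration.
episode = {
    "id": "abc123",
    "title": "Example episode",
    "pub_date_ms": 1605721680000,
    "audio": "https://example.com/episode.mp3",
}

# Only this reduced record would be written to server-side storage.
stored = cacheable_fields(episode)
print(stored)  # {'id': 'abc123', 'pub_date_ms': 1605721680000}
```

Under this reading, a server could keep the `id` for later reference and use the stored publication timestamp to decide when to ask the API again, while everything else is fetched on demand.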
| xd1936 wrote:
| > Note that the id and the pub_date of a podcast or an episode
| are exempt from the caching restriction
|
| > it really limits what you could even use it for, not being
| able to store the `id` even for a later reference
| OJFord wrote:
| :facepalm: - thanks, I'll (keep it but) edit my comment to
| reflect that correction.
| sjs382 wrote:
| IIRC, Mapbox has similar terms for both their map tiles and
| their geo lookup results.
| Gaelan wrote:
| At least for the actual audio, I understand that podcasters get
| grumpy when people cache that server-side, because they depend
| on server logs to get viewership numbers for advertisers, so if
| a popular client downloads the audio once and distributes it to
| all their customers, they can't make money off any of those
| customers.
| dtran wrote:
| Great work Wenbin! What's the hardest part about maintaining this
| API?
| wenbin wrote:
| Thanks for asking!
|
| The hardest part is to make small incremental improvements over
| a long period of time :)
|
| Like most software projects, this API is never a finished
| product. It's always work-in-progress.
|
| Small incremental improvements are not glamorous, typically not
| newsworthy to share to the public.
|
| Some examples of small incremental improvements:
|
| 1. Improve API docs. I heard that many API-focused startups
| have a dedicated team to maintain their API doc page.
|
| 2. Dealing with edge cases. As more apps/websites use our API,
| we'll see some edge cases that we would never have known about,
| which could be as simple as adding a data field in the response
| with a 2-line code change, or changing the search index, which
| requires re-indexing the whole thing for a few days. There could
| also be some strange edge cases with billing, e.g. what if a user
| subscribes to the paid plan, then unsubscribes, then subscribes
| again, then does something strange, then unsubscribes...
|
| 3. Customer support.
| This involves adding FAQs (tweaking the texts) and preparing
| email templates to answer frequently asked questions from users.
|
| 4. Doing things to keep the service robust & performant, e.g.,
| adding new alerts via Datadog/Pagerduty so we can know what goes
| wrong in time. We also need a mechanism to know if a particular
| app sends tons of requests (e.g., sends requests in an infinite
| loop) in a short amount of time, and we should be able to do
| something about it (e.g., suspend the account).
| hashamali wrote:
| Are the docs custom or are you using a third party product?
| Doesn't look like Swagger UI or Slate.
| wenbin wrote:
| It's built from scratch, which was easier than customizing
| from some open source projects back then (early 2019).
|
| But the doc is codified in openapi format:
| https://www.listennotes.com/api/docs/#openapi
|
| So you can feed the openapi spec into other doc viewers,
| e.g., Postman, or redoc
| https://listen-api.listennotes.com/api/v2/openapi.html
| colinprince wrote:
| When trying the Postman Web View I get:
|
| Profile cannot be found This public profile may have been
| disabled or deleted
| wenbin wrote:
| Already contacted Postman customer support :)
| jcims wrote:
| I was tinkering a bit recently in an effort to build a simple
| system that finds 'related' podcasts and see if I can see the
| network effect play out over time. I did this by building a graph
| of people (hosts/guests) and episodes and started folding in
| tags/topics. None of this is in my wheelhouse, and I found:
|
| - It takes a _lot_ of work to curate a substantial collection of
| podcasts. There are lists all over the place but it's hard to
| know what's really in there.
|
| - I attempted to use SpaCy and/or NLTK to do some 'Named Entity
| Recognition' in order to extract topics/people/organizations from
| episode titles and descriptions. This was surprisingly brittle.
| The string 'Sean Carroll', for example, wasn't detected as a
| person by either framework (IIRC). It also seems quite brittle to
| punctuation and other context (e.g. beginning or end of a
| sentence). This was using the default models shipped with both. I
| started off with just the English models but expanded as there
| were _lots_ of names being skipped silently. That helped less
| than I had hoped.
|
| - I have yet to find a good UI for exploring a graph. I used
| Neo4j and the built-in 'browser' is not intended for that
| purpose. Gephi has good capability for filtering and analytics,
| but it takes quite a bit of getting used to and the graph itself
| isn't amenable to dynamic navigation.
|
| That's all. Bookmarking this as it would really help.
| wenbin wrote:
| Many people use our Listen Later playlists to curate podcasts /
| episodes by topics.
|
| Here are some examples:
| https://www.listennotes.com/podcast-playlists/
|
| Each playlist has an RSS feed. So you can subscribe to the
| playlist on any podcast app (except for Spotify or the like)
| jcims wrote:
| Yes this looks great, thanks!!!
| eahlberg wrote:
| Big fan of Listennotes in its entirety, but this feature is a
| real gem!
| cjlm wrote:
| I've been thinking of doing the same for my graph visualization
| newsletter source/target [0].
|
| I'd love to connect if you're interested in collaborating!
| sourcetarget@cjlm.ca
|
| [0] https://sourcetarget.email
| pedro1976 wrote:
| Cool project! I wonder if your price points are randomly picked
| or if 100 is the sweet spot, which would be odd.
|
| Do you plan to add some speech-to-text magic, so one can search
| for the actual podcast content? That would be a killer feature
| for me :)
| nshm wrote:
| Yeah, modern open source speech recognition like Vosk can cost
| around $0.02 per hour (70 times less than Google STT's
| $1.4/hour) and should be just enough for search.
|
| @wenbin do you need any help with it?
| bredren wrote:
| Great work.
|
| I first heard about Listen Notes when you were interviewed on the
| Django chat podcast.
|
| Here's that episode if people want to learn more about the tech
| behind the site. https://lnns.co/Td9vzk47qQ3
| ccvannorman wrote:
| Super happy to see a BoostVC "cockroach" still at it 2.5 years
| later! Keep up the good work. I use ListenNotes myself to
| discover and play podcasts.
| droopybuns wrote:
| More interesting alternative:
| https://podcastindex.org/?utm_source=podnews.net&utm_medium=...
| dsco wrote:
| We used ListenNotes a while back in a web based podcast player
| and have only good things to say about the API. It's reasonably
| priced, much easier to deal with than Apple's API and email
| support is speedy!
| [deleted]
| pkamb wrote:
| I'd like to see a crowdsourced "Genius for Podcasts".
|
| Most podcast producers are terrible about correctly adding
| metadata: Chapters, images, episode notes, descriptions, etc.
|
| Let the superfans upload custom metadata to be displayed
| alongside the episode as it's playing in your podcast player.
| willcodeforfoo wrote:
| I'm building something very similar to this!
| praveenperera wrote:
| I wish there was a question in the FAQ asking:
|
| "Why would I use this over the iTunes API?"
| wenbin wrote:
| Great question! Will add.
|
| For itunes api: 1. You can't search episodes 2. You can't get a
| lot of search results of podcasts. 3. Their terms of use may
| not allow you to do what you want to do
| bredren wrote:
| If that is what the official Podcasts app returns, it's bad
| search results.
| parondea wrote:
| Love seeing development in the podcast space. One specific
| problem I've been wanting solved for a long while is difficulty
| with sharing podcasts with friends across podcast apps. If you're
| not using the same podcast app as your friend, it's always a pain
| to manually search and find the podcast in your own app.
| I'd love a universal podcast url, something like
| `podcast://<podcast_url>` that individual podcast apps can
| understand, which links you to the podcast within your desired
| app, similar to the "default browser" behaviour on mobile and
| desktop. Has anyone come across something like this?
| DoctorOW wrote:
| I mean podcasts are an extension of RSS if I recall
| correctly... I don't see why this wouldn't be possible.
| adolph wrote:
| This is a big problem for iOS. My spouse uses the default
| Podcast app. I use Overcast. Anytime she sends me something to
| listen to, iOS tries to open it in Podcasts. When I send
| something from Overcast it gets sent as an Overcast URL.
| Macha wrote:
| Podcasts are just RSS feeds. Nothing is stopping an app from
| registering itself as a handler for the RSS mime type, at least on
| desktop/Android (I don't know how iOS works here). I doubt most
| users would have an RSS reader installed at this stage, so most
| users wouldn't even have a risk of getting it revealed as a
| list of links to audio files by using the wrong app.
| OJFord wrote:
| > Trusted by 2,007 companies and developers.
|
| Haven't seen this before, an actual figure rather than 'these big
| names' (and you have no idea if it's just some small team
| somewhere for some toy test/demo, or a significant piece of the
| whole organisation's puzzle).
|
| I'm (just idly) curious what number you waited for (assuming you
| did) before making that public. Because, and obviously it'll vary
| a bit for different people, there's going to be some number below
| which it has negative impact, not just (probably some other, with
| a 'meh' range between, number) above which it has the positive
| impact that is its raison d'etre.
| cbowal wrote:
| When I see usage figures touted in the form "Trusted by X
| companies" I assume X is the total number of signups they've
| had.
| mrweasel wrote:
| So like Podcastindex.org, but not free?
___________________________________________________________________
(page generated 2020-11-18 23:00 UTC)