[HN Gopher] Video streaming at scale with Kubernetes and RabbitMQ
___________________________________________________________________

Video streaming at scale with Kubernetes and RabbitMQ

Author : thunderbong
Score  : 129 points
Date   : 2023-10-09 17:51 UTC (5 hours ago)

(HTM) web link (alexandreolive.medium.com)
(TXT) w3m dump (alexandreolive.medium.com)

| com2kid wrote:
| This is nice if you only have to deliver in one format, but as soon as you want to show up on TVs you are stuck delivering in a _lot_ of formats, and life gets complicated quickly.
|
| Throw subtitles in multiple languages, and different audio tracks, into the mix, and all of a sudden streaming video becomes a nightmare.
|
| Finally, if you are dealing with copyrighted materials, you have to be aware as to what country your user is physically residing in while accessing the videos, as you likely don't have a license to stream all your videos in every country all at once.
|
| Throw this all into a blender and what is needed is a very fancy asset catalog management system, and that part right there ends up being annoyingly complicated.
| grzes wrote:
| "fancy asset catalog management system" - was thinking about building such a solution lately - do you know any open-source solutions of this kind?
| dbrueck wrote:
| Oh, this is just the tip of the iceberg. Many parts of on-demand video streaming are largely commoditized at this point. Add in support for linear (live) streaming and ad insertion and things start to get really interesting. :)
| mannyv wrote:
| Fuck K8. You literally don't need it. Maybe he needs it because he's building on google cloud.
|
| AWS is easier, but you can do it with anything. The basic steps are:
|
| 1. Upload the file somewhere
| 2. Transcode it
| 3. Put the parts somewhere
| 4. Serve the parts
|
| You should really transcode everything into HLS. It's 2023, and everything that matters supports it.
| If you want 4k you can use HLS or the other thing (which I keep forgetting the acronym for).
|
| If you want to get fancy you can do rendition audio, which not everything supports. Rendition audio means sharing one audio stream amongst N number of video streams.
|
| You can use FFMPEG to transcode, but I'd suggest using AWS MediaConvert. It's cheap, fast, and probably does everything you want. Using FFmpeg directly works, but why bother. You will get an option wrong and screw everything up. You don't want your video to not work on some random device that 50k people are using in some country you didn't think about.
|
| He's using RabbitMQ but you should use SQS, because SQS can trigger lambdas...which means no polling required. But use whatever queue you want.
|
| You can kick the process off by attaching a Lambda to S3, which will start the process when the file is uploaded.
|
| You can kick your "availability activation" off by attaching a Lambda to the S3 output bucket.
|
| Background: I help run a streaming service and built the backend pipeline.
|
| This omits the entire "metadata management and analytics" side as well. That's left as an exercise for the user.
| [deleted]
| jonnycoder wrote:
| What would you recommend using as an alternative to being locked into AWS?
| jiggawatts wrote:
| This post is somewhat unfairly voted down.
|
| Cloud services like S3 and Azure Storage were invented specifically for hosting images and video. That's their origin story, their foundation, their very reason for being.
|
| Similarly, cloud functions / lambda were invented for background processing of blobs. The first demos were always of resizing images!
|
| Building out this infrastructure yourself is a little insane. Unless you're Netflix, don't bother. Just dump your videos into blobs.
|
| It's like driving to your cousin's place, but step one is building your own highway because you couldn't be bothered to check the map to see if one already existed.
|
| PS: Netflix serves video from BSD running directly on bare metal because at that scale efficiency matters! If efficiency doesn't matter _that much_, use blobs. Kubernetes is going to be even worse.
| totallyunknown wrote:
| While the article provides guidance on utilizing standard software and services to construct a basic video upload platform, it lacks deeper insights into advanced scaling techniques.
| andrewstuart wrote:
| I have to ask, why bother with Kubernetes and all the associated config and pain? Why not just start a new spot instance? I can't see any reason for Kubernetes in this architecture even though it's the title of the post.
|
| Also personally I wouldn't use rabbitmq ... it's pretty heavyweight... there's lots of lightweight queues out there. Overall this architecture looks like it could be simplified.
|
| Also, the post doesn't mention if the video encoding uses GPU hardware acceleration. Makes a big difference especially if using spot instances .... ffmpeg in CPU is extremely computationally expensive.
|
| Presumably all input videos need reencoding to convert them to HLS.
| malux85 wrote:
| This is what I was wondering, in the article it looks like kubernetes is just used to launch the node containers - why is the database and rabbitmq outside of kubernetes? This architecture looks like it's been cobbled together by a junior
| baq wrote:
| There are some of us who still perform four extra steps before putting any DB in k8s and we have good reasons.
| robertlagrant wrote:
| Kubernetes loves stateless services. Zero wrong with moving RabbitMQ or a database outside of it.
| malux85 wrote:
| Except kubernetes has a whole storage provisioning system that gives you redundancy and automatic failover, if you're going to the trouble of running kubernetes why not just run your whole infra on it?
|
| I run https://atomictessellator.com solo, using kubernetes, and my database, Minio object store, application servers, quantum workers, everything is all on kubernetes, it's self-healing and much simpler to run all the infrastructure the same.
|
| Recently I had a node failure while I was sleeping and the whole system healed itself while I slept, the monitoring system didn't even alarm me because the small blip of increased latency while the pods rebalanced wasn't above the alert threshold so it didn't even wake me up.
|
| What happens in the article infra when the rabbitmq or database nodes fail? The whole system goes offline, which seems a very silly setup when you have kubernetes sitting right there, whose primary function is to handle all of this.
| robertlagrant wrote:
| What happens when your storage detaches from your k8s cluster? Your services start 503ing, hopefully, because you didn't design your system thinking that k8s == 100% uptime.
| malux85 wrote:
| Anybody can invent random problems ad nauseam - that doesn't prove anything.
|
| I'm not claiming that it's totally bulletproof, I never said that - I'm saying that if you had a kubernetes cluster anyway why not benefit from its abilities? Especially when the alternative is single node, single points of failure, which is clearly inferior.
|
| The "what if the storage detaches" argument could easily apply to the single node VMs too, in which case the outcome would be a total system failure.
|
| We are discussing the contrast between the article's architecture and running everything on K8s ... and I'm saying that running everything on K8s is clearly better
| pyrophane wrote:
| Why do you say RabbitMQ is heavyweight?
| What queues do you consider more lightweight and what would be your go-to in a situation like this?
| alexandreolive wrote:
| Hello, I'm the writer of the article. We are using Kubernetes for our whole architecture, consisting of around 40 microservices and cron jobs. I just wanted in this article to give an example of asynchronous architecture using Kubernetes and RabbitMQ.
|
| We are using RabbitMQ because it's my company's target solution. There might be a better or lighter solution that would fit us, but having just one for every use case is easier to maintain.
|
| Great comment about GPU hardware acceleration for encoding, I'm going to look this up.
| andrewstuart wrote:
| So Kubernetes is only in this architecture because other systems use it and it's required by the parent company but not needed.
|
| That's pretty important context.
| alexandreolive wrote:
| That's not what I said; sorry if that was not clear. The parent company requires RabbitMQ, we are using Kubernetes because managing 40 microservices without it would be hell. In the article, I only showed 1 user-facing API, but it's actually multiple services, I just did not want to complicate it too much.
| [deleted]
| mihaitodor wrote:
| I believe loads of auxiliary microservices have been omitted for brevity. Of course, those also don't require Kubernetes, but maybe they have some standardised deployment system which keeps things manageable. Don't forget about Observability and whatnot.
| [deleted]
| schott12521 wrote:
| I thoroughly appreciated this article as I've been building a short-form video content streaming service and the performance hasn't been what I expected.
|
| Granted, I knew that my service needs to be able to scale at different bottlenecks, but a lot of "build your own video service!"
| tutorials start with:
|
| - Build a backend, return a video file
|
| - Build a frontend, embed the video
|
| And that leaves a lot to be desired in terms of performance. I think the actual steps should be:
|
| - Build a backend that consists of:
|     - Video Ingestion service
|     - Video Upload / Processing Service that saves the video into chunks
|     - Build a streaming service that returns video chunks
|
| - Build a frontend that consists of:
|     - Build or use a video streaming library that can play video chunks as a stream
|
| Edit: From the author's links, I found this website which is very informative: https://howvideo.works/
| John23832 wrote:
| I built a similar project, and had great results with cloudflare stream.
| mmcclure wrote:
| I helped work on howvideo.works, fun to see it helping people! The world of video is, I'd argue, one of those technical spaces that is extremely iceberg-y. You can get decently far using S3 + the HTML5 video tag, which I think creates a perception among some that video is just images but a little bigger, but that couldn't be further from the truth. You can really pick just about any step along the video pipeline from production to playback and go as deep for as many years as you'd like.
|
| This is both a semi-shameless plug _and_ probably a few levels deeper than what you're looking for, but I organize a conference for video developers called Demuxed. The YouTube channel[1] has 8 years' worth of conference videos about streaming video (and the 9th year is happening in a couple of weeks). The bullet points you mentioned are definitely covered across a few talks, but it's certainly not in any kind of "how to" format.
|
| [1]: https://youtube.com/demuxed
| alexandreolive wrote:
| I'm the writer of the article; I LOVE howvideo.works. It helped me quite a lot when I started working on video processing.
| I'm still a beginner and always fall back to it when I'm unsure about something fundamental. Thanks for your work. I'll take a look at your YouTube channel.
| Uehreka wrote:
| Something like this?
| https://github.com/streamlinevideo/streamline
| andrewstuart wrote:
| >> I've been building a short-form video content streaming service
|
| What does it do?
| schott12521 wrote:
| Right now I'm basically trying to just re-create the TikTok / Youtube Shorts / Instagram Reels experience of infinitely scrolling videos.
|
| Mostly just building for fun though.
| alexandreolive wrote:
| I'm the writer of the article; thanks for your lovely comment. I skipped many essential parts of the architecture in the article to keep it concise. The following articles will be about the technical implementation of what I discussed in this one.
| FaisalMahmoud wrote:
| For general video streaming, Mux.com has greatly decreased my development time. Getting playback working is straightforward. And for advanced use cases, like real time editing and preview in a web browser, it works as expected and doesn't get in the way.
| dvliman wrote:
| I built a similar video pipeline, not on Kubernetes but using EC2 instances for those hungry FFMPEG encoders.
|
| The system differs in that it was not user generated video content. It was coming from the cameras in our fitness studio.
|
| Here is the article if anyone is interested in reading about it:
| https://dev.to/dvliman/building-a-live-streaming-app-in-cloj...
| devgoth wrote:
| awesome article! curious -- why clojure?
| robinduckett wrote:
| Because CS Degree
| dvliman wrote:
| No specific reason. It could have been built in any language. It was just the language we were using and enjoyed at that time.
| thomasjudge wrote:
| OP mentions that "I would love to be a little mouse and peek at YouTube's complete architecture to see how far we are from them."
| You can occasionally find posts - often linked here - from another player in streaming video which you might have heard of, discussing technical architecture. For example, this might be a little lower level than you may be interested in as it relates to kernel optimizations to jack bit throughput rates, but I dig this sort of thing -
|
| https://www.youtube.com/watch?v=36qZYL5RlgY
| andrewstuart wrote:
| Must be expensive to run on Google Cloud.
|
| Also looks pretty complex.
|
| The stabilization step presumably does a video encode .... that's extremely expensive in terms of time, compute and money. I wonder why it's necessary.
| tehlike wrote:
| I was thinking the same. CF on the front would improve on it but still.
|
| Hetzner or other bare metal providers would probably be a better idea.
| hotnfresh wrote:
| CF meaning Cloudflare? If you're serving video through them, then you're in "enterprise plan" territory. You can't do that on the free or "self-serve" paid plans. $5k+/m depending on bandwidth needs (and if you just need a cdn to push bits, CF won't be competitive on price--their enterprise prices are tailored for companies that want all sorts of managed services and private networking stuff)
| alexandreolive wrote:
| Hello, I'm the writer of the article. Our solution gets videos from random people who present products we sent them. We get dodgy videos filmed on bad devices, and the process of contacting the user and getting him to re-upload another video in better quality is time-consuming for our team. We'd rather spend a little bit more in computing to try and save time overall. I hope this answers your question.
| latchkey wrote:
| Not necessarily. GCP, when used correctly, can be super cheap. You also don't know the contractual deals they have with GCP.
| klaussilveira wrote:
| I wonder if it wouldn't be cheaper to run an on-prem farm of BestBuy-grade "gamer PC" for smaller scale networks like that.
| andrewstuart wrote:
| Slap one of these puppies in....
|
| AMD Alveo MA35D Media Accelerator
|
| https://www.xilinx.com/applications/data-center/video-imagin...
| goeiedaggoeie wrote:
| I've used xilinx a fair bit for encoding. Once you get past the pain of compiling your tooling for it, it does speed up VOD encode significantly.
| [deleted]
___________________________________________________________________
(page generated 2023-10-09 23:00 UTC)