[HN Gopher] Linux's Stateless H.264 Decode Interface Ready to Be...
___________________________________________________________________
Linux's Stateless H.264 Decode Interface Ready to Be Deemed Stable

Author : mfilion
Score  : 92 points
Date   : 2020-11-17 19:02 UTC (3 hours ago)

(HTM) web link (www.phoronix.com)
(TXT) w3m dump (www.phoronix.com)

| zerocrates wrote:
| Since H.264 has P-frames and B-frames and so on, which I think
| would be by definition stateful, I assume that processing is just
| pushed out to the client/userspace somehow?

| stormer2000 wrote:
| Stateless refers to the HW interface. The stateless HW can
| accept decoding jobs in any order as long as all the
| information is properly provided (parameters extracted and
| deduced from the bitstream, along with the previously decoded
| references). As a side effect, it is trivial to multiplex
| multiple streams using this type of HW.
|
| The V4L2 layer keeps a bit of the state (more like caching, to
| avoid re-uploading too much information for each job).
| Userspace is responsible for bitstream parsing and DPB
| management (including re-ordering).

| zerocrates wrote:
| Oh so, you just provide whatever reference frames, if any,
| are needed, and it's just on you to make sure you've decoded
| what's necessary first? The difference here basically being
| that the hardware will not do the "bookkeeping"?

| stormer2000 wrote:
| Correct.

| zerocrates wrote:
| Thanks for explaining.

| vlovich123 wrote:
| Are there performance implications of needing to upload the
| entire state needed for a single frame? Or do none of these
| decoders have such caching anyway & thus it's just pushing
| the complex pieces of resource management out to user space
| where it belongs better?

| megous wrote:
| I don't think anything is uploaded anywhere, you just need
| more RAM to keep frames around as long as they are
| necessary. The decoder operates on data in the system's RAM.
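
The split described above (the decoder gets everything it needs per job, userspace keeps the DPB) can be sketched as a toy model in Python. This is not the V4L2 API: `DecodeJob`, `stateless_decode`, and `UserspaceDpb` are hypothetical names, a "frame" is reduced to a single integer, and the oldest-first eviction is a crude stand-in for real H.264 reference marking.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DecodeJob:
    """Everything the (hypothetical) stateless decoder needs for one frame."""
    frame_id: int                   # display position of the frame
    residual: int                   # stand-in for the parsed slice data
    ref_ids: Tuple[int, ...] = ()   # frames this one predicts from (empty for an I-frame)


def stateless_decode(job: DecodeJob, refs: Dict[int, int]) -> int:
    """Toy 'hardware': the result depends only on the job and the references passed in."""
    missing = [r for r in job.ref_ids if r not in refs]
    if missing:
        raise KeyError(f"frame {job.frame_id}: missing references {missing}")
    if job.ref_ids:
        prediction = sum(refs[r] for r in job.ref_ids) // len(job.ref_ids)
    else:
        prediction = 0
    return prediction + job.residual


class UserspaceDpb:
    """Userspace bookkeeping: the decoded picture buffer lives here, not in the decoder."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.frames: Dict[int, int] = {}

    def decode(self, job: DecodeJob) -> int:
        # Hand the decoder only the references this job asks for.
        refs = {r: self.frames[r] for r in job.ref_ids if r in self.frames}
        value = stateless_decode(job, refs)
        self.frames[job.frame_id] = value
        # Simplified eviction: drop the oldest decoded frame once over capacity.
        while len(self.frames) > self.capacity:
            del self.frames[next(iter(self.frames))]
        return value


dpb = UserspaceDpb(capacity=2)
decoded = {}
# Decode order differs from display order: the B-frame (position 1) needs I0 and P2 first.
for job in [DecodeJob(0, 100), DecodeJob(2, 10, (0,)), DecodeJob(1, -3, (0, 2))]:
    decoded[job.frame_id] = dpb.decode(job)
print([decoded[i] for i in sorted(decoded)])  # frames re-ordered for display
```

In this model the decoder itself never remembers a previous frame; submitting a job whose references were already evicted fails in `stateless_decode`, which is the "bookkeeping is on you" part of the exchange above.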
| vlovich123 wrote:
| Is that generally faster than having dedicated RAM alongside
| the ASIC? Or are the unit economics not worth it, and unified
| memory systems are the currently dominant design?

| londons_explore wrote:
| Considering most pixels in the reference frames will be
| read, on average, less than once per generated frame, it
| makes no sense to have dedicated RAM.

| bob1029 wrote:
| I believe this is still an interframe approach, but something
| regarding the actual underlying code is different from a
| traditional implementation.
|
| From an information theory perspective, you absolutely must
| have some way to pass intermediate frames around, otherwise you
| are just talking about some variant of an intraframe technique
| and all of the video efficiency losses that would go along with
| that (i.e. encoding each video frame as JPEG).

| stormer2000 wrote:
| All H.264 (and other CODECs) prediction modes are covered *.
| As stated correctly in a previous comment, userspace does the
| bookkeeping. The kernel is aware of the reference buffers and
| their attachments (HW-specific buffers). All references needed
| for predictive decoding (regardless of whether it's B or P) are
| programmed for each job.
|
| * Except for FMO and ASO modes, which are rarely supported in
| HW; even the FFmpeg SW decoder didn't bother implementing them.

| acchow wrote:
| Should be the reverse - these are pulled in behind the
| interface so you don't need to deal with them in userspace.

| Matt3o12_ wrote:
| Can someone explain the purpose of this decoder? As far as I
| know, decoding H.264 is already pretty solid on Linux and I don't
| know what the benefits of making it stateless are. I could
| definitely see why a stateless encoder would be beneficial (i.e.
| to spread out the load), but isn't decoding H.264 already a
| solved problem?

| [deleted]

| CameronNemo wrote:
| Certain hardware does not have accelerated video decoding on
| mainline Linux.
| In particular, ARM chips with a Rockchip VPU. The
| Pinebook Pro is one such device.

| stagger87 wrote:
| This isn't a decoder, this is an interface. The benefits of
| stateless decoding are simpler hardware and more flexibility
| in decoding (among others).

| Thaxll wrote:
| H.264 decoding in the kernel in 5.11, isn't that too late? My
| 12y/o laptop can decode H.264 in hardware, so what's the point
| of adding that to the kernel in 2020?

| MaxBarraclough wrote:
| From what I can tell this will be able to take advantage of
| hardware acceleration; for that matter, I'm not sure that
| software decoding will be supported at all. The novel point
| here is the _stateless_ part.
|
| Relevant reading:
| https://www.kernel.org/doc/html/latest/userspace-api/media/v...

| renewiltord wrote:
| Each qualifier is important, my dude. 'Stateless' is important
| here.
|
| Explanation in meme format below
|
| --------------------------------
|
| Scientists: Alien life found!
|
| You: Life found? I am life. I've been life for 30 years. This
| is not a big deal.

| megous wrote:
| So that I can enjoy HW decoding on my SBCs that use SoCs for TV
| boxes (Allwinner H5, H6), on Pinebook Pro, and on Pinephone.
| Your 12y/o laptop will not do that for me.

| CameronNemo wrote:
| I think/hope this will be reused for H.265, VP9, and AV1.

| stormer2000 wrote:
| VP9 and HEVC (H.265) kernel user APIs exist and are being
| cleaned up. This takes a lot of time and a lot of testing, so
| bear with us. We don't have any silicon that supports AV1 with
| enough of a spec to write a driver for at the moment.
| When this happens, we'll definitely get that up and running.
|
| Even though most ancient CODECs and their existing content
| decode fine on the CPU, the HW decoder uses less power and is
| better for battery life.
| This work mostly enables lower-power
| SoCs like Allwinner, Rockchip, i.MX8M, RPi4 (HEVC), Mediatek,
| Microchip, and so on, but also higher-capacity chips that can
| be connected through PCIe to surpass your CPU capacity
| (Blaize).
|
| Also, understand the difference between V4L2 and the GPU
| accelerators. GPUs use a command stream channel, which needs to
| be centrally managed. That landed in DRM + Mesa, under
| VA-API. DRM drivers could have been an option, but would have
| required per-HW userspace in Mesa. VA-API also being a misfit
| for some of the silicon (Hantro-based) would have made
| things more complex than needed.

| CameronNemo wrote:
| > bear with us
|
| Certainly. Very happy to see this work progressing. Hope I
| can video call on my Pinebook Pro without it burning a
| hole in my laptop one day haha.

| reggieband wrote:
| > but also higher capacity chips that can be connected
| through PCIe to surpass your CPU capacity (Blaize).
|
| How do things like Nvidia NVENC fit in?

| eptcyka wrote:
| Yes, and your 12y/o laptop can happily chug along with VDPAU
| or libva.

| exabrial wrote:
| Sorry to ask a dumb question, but why is this in the kernel
| instead of user space?

| ahupp wrote:
| It's a hardware decoder.

| th0ma5 wrote:
| Isn't most everything going on in custom decoding silicon now?

| dahfizz wrote:
| Yup, which is exactly why this is in the kernel and not just a
| userspace library. The kernel is how programs interface with
| hardware.

| throwaway2048 wrote:
| This would be a front-end to said silicon.

| CameronNemo wrote:
| This can be really useful for decoding multiple video chat
| streams.

| stormer2000 wrote:
| Indeed, this type of HW allows decoding an unlimited number of
| streams (to a certain extent; it won't be real time, but will
| still work until you run out of RAM). Also, suspended streams
| don't hold any resources in firmware that would prevent
| other streams from decoding in HW.
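
The multiplexing point can be illustrated with a small Python sketch, under the same caveat that nothing here is the real kernel interface: `decode` and `run_interleaved` are hypothetical, each "frame" is an integer, and every frame has at most one reference. Because the toy decoder keeps no memory between calls, jobs from different streams can be interleaved freely, and an idle stream costs only the frames its caller keeps in RAM.

```python
from itertools import zip_longest
from typing import Dict, List, Optional, Tuple

# (frame_id, residual, ref_id or None for an I-frame); a made-up job format.
Job = Tuple[int, int, Optional[int]]


def decode(job: Job, ref_value: Optional[int]) -> int:
    """Toy stateless decoder: no memory of earlier calls or of which stream is calling."""
    _, residual, ref_id = job
    if ref_id is not None and ref_value is None:
        raise ValueError("reference frame was not supplied")
    return (ref_value or 0) + residual


def run_interleaved(streams: Dict[str, List[Job]]) -> Dict[str, List[int]]:
    """Round-robin jobs from several streams through the one decoder."""
    # Per-stream decoded frames are held by the caller, not by the decoder.
    frames: Dict[str, Dict[int, int]] = {name: {} for name in streams}
    outputs: Dict[str, List[int]] = {name: [] for name in streams}
    for batch in zip_longest(*streams.values()):
        for name, job in zip(streams, batch):
            if job is None:          # this stream has no more jobs
                continue
            frame_id, _, ref_id = job
            ref = frames[name].get(ref_id) if ref_id is not None else None
            value = decode(job, ref)
            frames[name][frame_id] = value
            outputs[name].append(value)
    return outputs


streams = {
    "call_a": [(0, 10, None), (1, 1, 0), (2, 1, 1)],
    "call_b": [(0, 50, None), (1, -5, 0)],
}
result = run_interleaved(streams)
print(result)
```

The only per-stream state in this sketch is the `frames` dictionary owned by the caller, which mirrors the remark that the practical limit is RAM rather than decoder sessions.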
| megous wrote:
| This is most excellent and huge thanks to everyone working on
| this. I tried it recently on a Pinephone connected via USB-C dock
| to an external monitor, and it's just awesome what it can do:
| https://www.youtube.com/watch?v=dHOgVmxH_dA (using GStreamer)
| Hopefully, the support will start trickling down to FFmpeg and
| common players, like mpv, now that the API will be stable.

| swiley wrote:
| As a Pinephone owner, I would like to be able to charge the
| phone while watching videos. What do you have to do to get this
| working on postmarketOS? (Or will I have to recompile
| everything?)
___________________________________________________________________
(page generated 2020-11-17 23:00 UTC)