[HN Gopher] Linux's Stateless H.264 Decode Interface Ready to Be...
       ___________________________________________________________________
        
       Linux's Stateless H.264 Decode Interface Ready to Be Deemed Stable
        
       Author : mfilion
       Score  : 92 points
       Date   : 2020-11-17 19:02 UTC (3 hours ago)
        
 (HTM) web link (www.phoronix.com)
 (TXT) w3m dump (www.phoronix.com)
        
       | zerocrates wrote:
       | Since H.264 has P-frames and B-frames and so on which I think
       | would be by definition stateful, I assume that processing is just
       | pushed out to the client/userspace somehow?
        
         | stormer2000 wrote:
         | Stateless refers to the HW interface. The stateless HW can
         | accept decoding jobs in any order as long as all the
         | information is properly provided (parameters extracted and
         | deduced from the bitstream along with the previously decoded
         | references). As a side effect, it is trivial to multiplex
         | multiple streams using this type of HW.
         | 
         | The V4L2 layer keeps a bit of the state (more like caching, to
         | avoid re-uploading too much information for each job). The
         | userspace is responsible for bitstream parsing and DPB
         | management (including re-ordering).
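
Editor's sketch of the split described above, as a toy model in Python. It is not the V4L2 API: `Job`, `decode_job`, `UserspaceDecoder`, and `STREAM` are hypothetical names, and the "decoding" is integer addition standing in for real prediction. The only point is that the decode step keeps no memory between calls, while the userspace side owns the DPB and the output re-ordering.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Job:
    """Everything the toy hardware needs for one frame (hypothetical type)."""
    frame_id: int        # position in decode order
    display_order: int   # position in output order
    payload: int         # stand-in for the parsed slice data
    ref_ids: tuple = ()  # earlier frames this one predicts from


def decode_job(job, references):
    """Stateless 'hardware': the result depends only on the arguments."""
    return job.payload + sum(references)


@dataclass
class UserspaceDecoder:
    """Userspace bookkeeping: the DPB and the display order."""
    dpb: dict = field(default_factory=dict)      # frame_id -> decoded picture
    display: dict = field(default_factory=dict)  # frame_id -> display_order

    def submit(self, job):
        # Raises KeyError if a reference has not been decoded yet;
        # getting the order right is the caller's job, not the hardware's.
        refs = [self.dpb[r] for r in job.ref_ids]
        self.dpb[job.frame_id] = decode_job(job, refs)
        self.display[job.frame_id] = job.display_order

    def output_in_display_order(self):
        return [self.dpb[f] for f in sorted(self.dpb, key=self.display.get)]


# Decode order is I, P, B; display order is I, B, P.
STREAM = [
    Job(frame_id=0, display_order=0, payload=10),
    Job(frame_id=1, display_order=2, payload=5, ref_ids=(0,)),
    Job(frame_id=2, display_order=1, payload=1, ref_ids=(0, 1)),
]

decoder = UserspaceDecoder()
for job in STREAM:
    decoder.submit(job)
print(decoder.output_in_display_order())  # [10, 26, 15]
```

Submitting the B-frame before its references fails in `submit`, which mirrors the point that the ordering responsibility sits with userspace.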
        
           | zerocrates wrote:
           | Oh so, you just provide whatever reference frames, if any,
           | are needed, and it's just on you to make sure you've decoded
           | what's necessary first? The difference here basically being
           | that the hardware will not do the "bookkeeping"?
        
             | stormer2000 wrote:
             | Correct.
        
               | zerocrates wrote:
               | Thanks for explaining.
        
           | vlovich123 wrote:
           | Are there performance implications of needing to upload the
           | entire state needed for a single frame? Or do none of these
           | encoders have such caching anyway & thus it's just pushing
           | the complex pieces of resource management out to user space
           | where it belongs better?
        
             | megous wrote:
             | I don't think anything is uploaded anywhere, you just need
             | more RAM to keep frames around as long as they are
             | necessary. The decoder operates on data in system's RAM.
        
               | vlovich123 wrote:
               | Is that generally faster than having dedicated RAM
               | alongside the ASIC? Or are the unit economics not
               | worth it, so that unified memory systems are the
               | currently dominant design?
        
               | londons_explore wrote:
               | Considering most pixels in the reference frames will be
               | read, on average, less than once per generated frame, it
               | makes no sense to have dedicated RAM.
        
         | bob1029 wrote:
         | I believe this is still an interframe approach, but something
         | regarding the actual underlying code is different than a
         | traditional implementation.
         | 
         | From an information theory perspective, you absolutely must
         | have some way to pass intermediate frames around, otherwise you
         | are just talking about some variant of an intraframe technique
         | and all of the video efficiency losses that would go along with
         | that (i.e. encoding each video frame as JPEG).
        
           | stormer2000 wrote:
           | All H.264 (and other codec) prediction modes are covered*.
           | As stated correctly in the previous comment, userspace does
           | the bookkeeping. The kernel is aware of the reference buffers
           | and their attachments (HW-specific buffers). All references
           | needed for predictive decoding (regardless of whether it's B
           | or P) are programmed for each job.
           | 
           | * Except for the FMO and ASO modes, which are rarely
           | supported in HW; even the FFmpeg SW decoder didn't bother
           | implementing them.
        
         | acchow wrote:
         | Should be the reverse - these are pulled in behind the
         | interface so you don't need to deal with them in userspace.
        
       | Matt3o12_ wrote:
       | Can someone explain the purpose of this decoder? As far as I
       | know, decoding H.264 is already pretty solid on Linux and I don't
       | know what benefits of making it stateless there are. I could
       | definitely see why a stateless encoder would be beneficial (i.e.
       | to spread out the load), but isn't decoding h.264 already a
       | solved problem?
        
         | CameronNemo wrote:
         | Certain hardware does not have accelerated video decoding on
         | mainline Linux. In particular, ARM chips with a rockchip VPU.
         | Pinebook Pro is one such device.
        
         | stagger87 wrote:
         | This isn't a decoder, this is an interface. The benefits of
         | stateless decoding are simpler hardware, and more flexibility
         | in decoding (among others).
        
       | Thaxll wrote:
       | h264 decoding in the kernel in 5.11, isn't that too late? My
       | 12y/o laptop can decode h264 in hardware. What's the point of
       | adding that in the kernel in 2020?
        
         | MaxBarraclough wrote:
         | From what I can tell this will be able to take advantage of
         | hardware acceleration. For that matter, I'm not sure that
         | software decoding will be supported at all. The novel point
         | here is the _stateless_ part.
         | 
         | Relevant reading:
         | https://www.kernel.org/doc/html/latest/userspace-api/media/v...
        
         | renewiltord wrote:
         | Each qualifier is important, my dude. 'Stateless' is important
         | here.
         | 
         | Explanation in meme format below
         | 
         | --------------------------------
         | 
         | Scientists: Alien life found!
         | 
         | You: Life found? I am life. I've been life for 30 years. This
         | is not a big deal.
        
         | megous wrote:
         | So that I can enjoy HW decoding on my SBCs that use SoCs for TV
         | boxes (Allwinner H5, H6), on Pinebook Pro, and on Pinephone.
         | Your 12y/o laptop will not do that for me.
        
         | CameronNemo wrote:
         | I think/hope this will be reused for h265, VP9, and AV1.
        
           | stormer2000 wrote:
           | VP9 and HEVC (H265) kernel user API exist and are being
           | cleaned up. This takes a lot of time and a lot of testing, so
           | bear with us. At the moment we don't have any silicon that
           | supports AV1 with enough spec to write a driver for.
           | When this happens, we'll definitely get that up and running.
           | 
           | Even though most ancient codecs and their existing content
           | decode fine on CPU, the HW decoder uses less power and is
           | better for battery life. This work mostly enables lower-power
           | SoCs like Allwinner, Rockchip, i.MX8M, RPi4 (HEVC), Mediatek,
           | Microchip, and so on, but also higher capacity chips that can
           | be connected through PCIe to surpass your CPU capacity
           | (Blaize).
           | 
           | Also, understand the difference between V4L2 and the GPU
           | accelerators. GPUs use a command stream channel, which needs
           | to be centrally managed. That landed in DRM + Mesa, under
           | VA-API. DRM drivers could have been an option, but would have
           | required per-HW userspace in Mesa. VA-API also being a
           | misfit for some of the silicon (Hantro based) would have made
           | things more complex than needed.
        
             | CameronNemo wrote:
             | > bear with us
             | 
             | Certainly. Very happy to see this work progressing. Hope I
             | can video call on my pinebook pro without it burning a
             | hole in my laptop one day haha.
        
             | reggieband wrote:
             | > but also higher capacity chips that can be connected
             | through PCIe to surpass your CPU capacity (Blaize).
             | 
             | How do things like Nvidia Nvenc fit in?
        
         | eptcyka wrote:
         | Yes, and your 12y/o laptop can happily chug along with vdpau
         | or libva.
        
       | exabrial wrote:
       | Sorry to ask a dumb question, but why is this in the kernel
       | instead of user space?
        
         | ahupp wrote:
         | It's a hardware decoder.
        
       | th0ma5 wrote:
       | Isn't most everything going on in custom decoding silicon now?
        
         | dahfizz wrote:
         | Yup, which is exactly why this is in the kernel and not just a
         | userspace library. The kernel is how programs interface with
         | hardware.
        
         | throwaway2048 wrote:
         | This would be a front-end to said silicon
        
       | CameronNemo wrote:
       | This can be really useful for decoding multiple video chat
       | streams.
        
         | stormer2000 wrote:
         | Indeed, this type of HW allows decoding an unlimited number
         | of streams (past a certain point it won't be real time, but
         | it will still work until you run out of RAM). Also, suspended
         | streams don't hold any resources in firmware that would
         | prevent other streams from decoding in HW.
        
       | megous wrote:
       | This is most excellent and huge thanks to everyone working on
       | this. I tried it recently on Pinephone connected via USB-C dock
       | to an external monitor, and it's just awesome what it can do:
       | https://www.youtube.com/watch?v=dHOgVmxH_dA (using gstreamer)
       | Hopefully, the support will start trickling down to ffmpeg and
       | common players, like mpv, now that the API will be stable.
        
         | swiley wrote:
         | As a pinephone owner who would like to be able to charge the
         | phone while watching videos, what do I have to do to get this
         | working on PostmarketOS? (Or will I have to recompile
         | everything?)
        
       ___________________________________________________________________
       (page generated 2020-11-17 23:00 UTC)