[HN Gopher] Guide to Adopting AV1 Encoding
       ___________________________________________________________________
        
       Guide to Adopting AV1 Encoding
        
       Author : andyfrancis
       Score  : 124 points
       Date   : 2023-11-03 15:17 UTC (7 hours ago)
        
 (HTM) web link (bitmovin.com)
 (TXT) w3m dump (bitmovin.com)
        
       | jampekka wrote:
       | Maybe this is not such a concern to the audience of this article,
       | but at least for me I'm stuck with h264 because VP9/AV1 encoding
       | is really really slow. I'd love to use the codecs that are open
       | and technically better, but when the video encodes at 1fps, it's
       | just too convenient to use the orders-of-magnitude faster h264.
       | 
       | I'm probably not up to date with newer/hardware encoders and
       | please let me know if my view is outdated.
        
         | brucethemoose2 wrote:
         | The hardware encoders are very fast and generally better than
         | x264 (but not by as much as you'd think with the x264 slow
         | preset).
         | 
         | In addition, there are fast threaded AV1 encoders you may be
         | overlooking, like SVT-AV1. For non-realtime, my favorite is
         | av1an, which also yields better quality than is possible from
         | aomenc and works with pretty much any encoder/codec:
         | https://github.com/master-of-zen/Av1an
        
           | Osiris wrote:
           | Hardware encoders are fine for streaming but not archiving. A
           | CPU encoder can produce a file that is half the size or less
           | for the same quality.
        
             | brucethemoose2 wrote:
             | Yes, but AV1 hardware encoding beats x264 software
             | encoding, if that's the only other option (albeit not by
             | much).
        
               | netol wrote:
               | Do you have a way to reproduce this?
        
               | brucethemoose2 wrote:
               | This comparison is not mine, and it's excellent:
               | 
               | https://giannirosato.com/blog/post/nvenc-v-qsv/
               | 
               | x264 used the medium 10 bit preset, which is a bit of an
               | oddball because 10 bit AVC is "unofficial," and many
               | hardware decoders don't support it.
        
               | mihaic wrote:
               | The benchmark should be against x265 though, which is
               | mainstream now
        
               | bick_nyers wrote:
               | x264 opponents are earlier versions of AV1, namely VP8
               | and VP9.
               | 
               | x265 should be what is compared against AV1 when
               | discussing quality and encode speeds.
               | 
               | We use x264 for compatibility purposes, if your device is
               | intended to play video, it will decode x264. x265
               | decoders are in a lot of devices at this point, and AV1
               | is just now starting to see representation.
               | 
               | x264 is like .jpeg and will probably never die.
        
               | brucethemoose2 wrote:
               | Yeah y'all are preaching to the choir, but AVC is still
               | the default for... Well, everything, including OP.
               | 
               | It will probably eventually be like mp3, where users are
               | still reflexively encoding to it without a good reason.
        
           | querez wrote:
           | av1an is essentially a "wrapper" around other encoders. I've
           | played around with it a lot in the past, but have never seen
           | any real quality gains from using it (as measured by VMAF)
           | instead of just using the encoder directly. Am I doing
           | something wrong? What exactly does av1an buy me (except maybe
           | for better independent threads?)
        
             | brucethemoose2 wrote:
             | First some background. Av1an works by splitting videos into
             | scenes/chunks, which it then encodes in parallel. But this
             | also allows it to change encoding settings for each scene.
             | 
             | Specifically, it has a "VMAF target" mode that encodes a
             | few frames from each scene as samples, measures their VMAF,
             | and then boosts or reduces the encoding preset for that
             | individual scene based on the result. It also has a "black
             | boost" feature to allocate more bitrate to dark scenes,
             | which encoders and various metrics tend to misrepresent.
             | 
             | Also, the possibilities with the VapourSynth support are
             | infinite. Some simple examples include denoising a noisy
             | video on the GPU for better quality than the encoders' CPU
             | denoising, or deblocking with custom parameters, or
             | upscaling with some model in PyTorch. And that's just the
             | start:
             | 
             | https://vsdb.top/
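
The per-scene "VMAF target" idea described above can be sketched as a small search. This is not Av1an's actual code: it is a minimal Python illustration that assumes the knob being adjusted is a CRF-like quantizer value, and that VMAF never rises as CRF rises. The names pick_crf_for_scene and fake_probe are hypothetical; a real probe would encode a few sample frames of the scene and measure their VMAF.

```python
from typing import Callable


def pick_crf_for_scene(
    probe_vmaf: Callable[[int], float],
    target_vmaf: float,
    crf_min: int = 10,
    crf_max: int = 50,
) -> int:
    """Return the highest CRF (fewest bits) whose probe still meets target_vmaf.

    Assumes probe_vmaf(crf) does not increase as crf increases, so a binary
    search over the CRF range is valid.
    """
    best = crf_min  # fallback: best available quality if the target is never met
    lo, hi = crf_min, crf_max
    while lo <= hi:
        mid = (lo + hi) // 2
        if probe_vmaf(mid) >= target_vmaf:
            best = mid      # good enough, so try spending fewer bits
            lo = mid + 1
        else:
            hi = mid - 1    # too lossy, so lower the CRF
    return best


def fake_probe(crf: int) -> float:
    """Toy stand-in for 'encode sample frames at this CRF and measure VMAF'."""
    return 100.0 - 0.5 * crf


print(pick_crf_for_scene(fake_probe, target_vmaf=90.0))  # 20
```

Running the search once per scene is what lets an easy scene end up with a higher CRF than a difficult one.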
        
         | malloc-0x90 wrote:
         | 1. Compile this: https://gitlab.com/AOMediaCodec/SVT-AV1
         | 
         | 2. ffmpeg -i infile.mp4 -map 0:v:0 -pix_fmt yuv420p10le -f
         | yuv4mpegpipe -strict -1 - | SvtAv1EncApp -i stdin --preset 6
         | --keyint 240 --input-depth 10 --crf 30 --rc 0 --passes 1
         | --film-grain 0 -b infile.ivf
         | 
         | 3. ffmpeg -i infile.ivf -i infile.mp4 -map 0:v -map 1:a:0 -c
         | copy outfile.mp4
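
The three steps above can also be driven from a script. What follows is a minimal Python sketch, assuming ffmpeg and SvtAv1EncApp are both on PATH. The flags are copied unchanged from the comment; the function names build_pipeline and run_pipeline are illustrative only.

```python
import subprocess


def build_pipeline(src: str, ivf: str, out: str):
    """Return the decode, encode and mux command lines as argument lists."""
    # Step 2a: decode the first video stream to 10-bit Y4M on stdout.
    decode = ["ffmpeg", "-i", src, "-map", "0:v:0", "-pix_fmt", "yuv420p10le",
              "-f", "yuv4mpegpipe", "-strict", "-1", "-"]
    # Step 2b: SVT-AV1 reads the Y4M stream from stdin and writes an IVF file.
    encode = ["SvtAv1EncApp", "-i", "stdin", "--preset", "6", "--keyint", "240",
              "--input-depth", "10", "--crf", "30", "--rc", "0", "--passes", "1",
              "--film-grain", "0", "-b", ivf]
    # Step 3: mux the AV1 video with the first audio stream of the source.
    mux = ["ffmpeg", "-i", ivf, "-i", src, "-map", "0:v", "-map", "1:a:0",
           "-c", "copy", out]
    return decode, encode, mux


def run_pipeline(src: str, ivf: str, out: str) -> None:
    """Run decode | encode, then mux. Raises if any step fails."""
    decode, encode, mux = build_pipeline(src, ivf, out)
    dec = subprocess.Popen(decode, stdout=subprocess.PIPE)
    enc = subprocess.Popen(encode, stdin=dec.stdout)
    dec.stdout.close()  # so ffmpeg sees a broken pipe if the encoder exits early
    enc_rc = enc.wait()
    dec_rc = dec.wait()
    if enc_rc != 0 or dec_rc != 0:
        raise RuntimeError(f"encode failed (ffmpeg={dec_rc}, SvtAv1EncApp={enc_rc})")
    subprocess.run(mux, check=True)
```

Calling run_pipeline("infile.mp4", "infile.ivf", "outfile.mp4") would reproduce the commands as posted.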
        
         | dylan604 wrote:
         | Tricks of the trade: why have one computer compress the file
         | when you can split it up into logical segments and have each
         | segment sent to its own encoder?
        
           | tombert wrote:
           | Back when I still cared about saving disk space, I made a
           | cluster of NVidia Jetson Nanos running in a docker swarm
           | configuration [1] to compress my blu-ray rips, but honestly
           | even when you have six computers working at once, H264 on a
           | single computer is still often faster.
           | 
           | On the Jetson Nanos I was lucky to get maybe 1fps in ffmpeg
           | using VP9. Multiply that by six boards and that's about 6fps
           | in total; ffmpeg running x264 _in software mode_ was getting
           | around 11fps on a single board, not even counting using the
           | onboard encoder chip, meaning that I was getting better
           | performance from one board using x264 than all six using VP9.
           | 
           | Now obviously this is a single anecdote on specific hardware,
           | so I'm not saying that this applies to every single case, but
           | it's a big reason why I personally have not used VP9 for
           | anything substantial yet.
           | 
           | [1] https://gitlab.com/tombert/distributed-transcode
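
To put the frame rates above in wall-clock terms, here is a small Python calculation. The 2-hour, 24fps film is a hypothetical input, and the cluster figure assumes the six boards scale perfectly, which is the best case.

```python
def encode_hours(total_frames: int, fps: float) -> float:
    """Hours needed to encode total_frames at a sustained encode rate of fps."""
    return total_frames / fps / 3600


film_frames = 2 * 3600 * 24      # hypothetical 2-hour film at 24 fps
vp9_cluster_fps = 6 * 1.0        # six boards at roughly 1 fps each
x264_single_fps = 11.0           # one board, x264 in software

print(encode_hours(film_frames, vp9_cluster_fps))             # 8.0
print(round(encode_hours(film_frames, x264_single_fps), 2))   # 4.36
```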
        
             | dylan604 wrote:
             | h.264 is from the 90s, so of course it's fast after ~30
             | years of use. hell, when I first got into encoding, we had
             | dedicated expansion cards to do MPEG-1/MPEG-2 encoding
             | because it was so difficult at the time. New codecs always
             | take time in the beginning while the encoding software is
             | tweaked/optimized. Eventually, it becomes part of the CPU
             | hardware and then we all make comments like "remember when
             | ____ was so slow?" One day, you'll regale the young
             | whippersnappers on internet forums about how painfully slow
             | AV1 encodes were when they start complaining that
             | newHotnessEncoder5000 is so slow.
        
               | tombert wrote:
               | Oh definitely, no argument here, I'm 100% ok with AV1
               | becoming the standard "video codec to rule them all", but
               | I'm saying that in the short term, it's difficult to
               | recommend AV1 or VP9 over h264 (at least for personal
               | use). H264 encodes 10x faster, still gives reasonably
               | decent compression, is supported by basically every
               | consumer device [1] and browser out of the box, and very
               | soon will have all the patents for it expired meaning
               | that it will be truly royalty-free. x264 in particular is
               | extremely nice in my experience, doing a lot to really
               | squeeze out a lot of quality in a relatively small amount
               | of space.
               | 
               | That said, AV1 is very obviously the future, and I'm
               | perfectly happy with it taking over the market from h264,
               | and I think that due to the bandwidth savings it's only a
               | matter of time before all the major video services make
               | it the default, especially as the speed of encoders
               | increases to a useable level, which I'm sure it will soon
               | enough.
               | 
               | [1] I know the most recent Raspberry Pi doesn't have a
               | decoder chip for h264, but I think it's fast enough to do
               | it in software.
        
               | padenot wrote:
               | Raspberry Pis have had a hardware decoder for h264 for as
               | long as they've existed (I think?), but dropped it in the
               | most recent version. I don't understand why.
               | 
               | They've recently contributed non-trivial patches to
               | Firefox to use the embedded Linux API for video hardware
               | acceleration (V4L2, vs. VAAPI on desktop that we also
               | support), and are shipping the h264ify extension with
               | their Firefox build so their users get that codec more
               | often and the experience is good on older devices.
               | 
               | Maybe the 5 is so much faster that it's not needed as
               | much, but h264 represents so much content that it feels a
               | bit surprising anyway.
               | 
               | But I'm just a software person, hardware is complicated
               | differently.
        
               | orra wrote:
               | > h.264 is from the 90s, so of course it's fast after ~30
               | years of use.
               | 
               | If only! Then the patents would have expired. But H.264
               | is newer than MPEG-4 Part 2.
               | 
               | But you're right: H.264 has had the advantage of time, to
               | gain fast hardware support.
        
           | bick_nyers wrote:
           | This is done particularly when you are implementing adaptive
           | bitrates (the thing that Netflix uses where it automatically
           | sends you a higher or lower quality picture depending on your
           | Internet connection).
           | 
           | In adaptive bitrate world, you split a video up into
           | fragments, say 2-10 seconds large, and encode each segment in
           | multiple bitrates, so that every say 5 seconds the video
           | player can make a decision to download a different quality
           | for the next 5 seconds.
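
The per-segment decision described above can be sketched in a few lines of Python. The bitrate ladder, the throughput samples and the 0.8 safety factor are all hypothetical values chosen for illustration; real players use more elaborate heuristics.

```python
def pick_rendition(ladder_kbps, measured_kbps, safety=0.8):
    """Pick the highest-bitrate rendition that fits the measured throughput.

    safety leaves headroom so a small throughput dip does not stall playback.
    Falls back to the lowest rendition when nothing fits.
    """
    budget = measured_kbps * safety
    affordable = [b for b in sorted(ladder_kbps) if b <= budget]
    return affordable[-1] if affordable else min(ladder_kbps)


ladder = [400, 1200, 3000, 6000]    # one encode of every segment per bitrate
for throughput in (8000, 2000, 300):  # re-measured before each segment
    print(throughput, "->", pick_rendition(ladder, throughput))
# 8000 -> 6000
# 2000 -> 1200
# 300 -> 400
```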
           | 
           | Ok, but why not split the file up for standard encoding?
           | Well, you can't just concatenate two .mp4 together without
           | re-encoding and have it make sense to most media players (as
           | far as I am aware), and moreover, it's inefficient from a RAM
           | perspective when doing that. 1 second of RAW uncompressed 4k
           | (24 fps) video is about 600MB. Source content for a single
           | episode/movie at Netflix (I don't work there, just something
           | I read once) can reach into the terabytes easily.
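
The "about 600MB" figure above can be checked with a short calculation. This Python sketch assumes 3840x2160 frames with three 8-bit channels per pixel and no chroma subsampling; other pixel formats give different numbers.

```python
def raw_bytes_per_second(width, height, fps, channels=3, bits_per_channel=8):
    """Bytes per second of uncompressed video with no chroma subsampling."""
    bytes_per_frame = (width * height * channels * bits_per_channel) // 8
    return bytes_per_frame * fps


uhd_8bit = raw_bytes_per_second(3840, 2160, 24)
uhd_10bit = raw_bytes_per_second(3840, 2160, 24, bits_per_channel=10)
print(f"{uhd_8bit / 1e6:.0f} MB/s")    # 597 MB/s
print(f"{uhd_10bit / 1e6:.0f} MB/s")   # 746 MB/s
```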
        
             | dylan604 wrote:
             | okay, i don't agree with anything in your reply. segmenting
             | a video file for HLS/DASH delivery is not at all the same
             | thing I'm suggesting. Just for the sake of round numbers,
             | i'm saying to take a 90 minute feature into nine 10-minute
             | segments. fire up 10 separate instances to encode each
             | 10-minute segment. you've just increased the encode time
             | 10x. also, DASH/HLS does not require segmented files. you
             | can have a single contiguous file like an fmp4 as a
             | DASH/HLS source.
             | 
             | >Ok, but why not split the file up for standard
             | encoding?<snip>
             | 
             | at this point, you would be better served by just writing
             | an elementary stream rather than a muxed mp4 file since
             | it's just a segment anyways so why waste the cycles on
             | muxing? you then absolutely 100% can concat those streams
             | (even if you did mux them into a container). if you think
             | you can't, you clearly have not tried very hard.
             | 
             | >I don't work there, just something I read once
             | 
             | I don't work there either, but do have 30+ years of
             | experience with this subject. Sadly, you're not as well
             | informed as you might think. People don't tend to encode to
             | AV1 from RAW. Instead they are dealing with a deliverable
             | file, most typically ProRes in today's world, after the
             | post process has been completed. Nowhere near terabytes
             | for a feature. Closer to a couple hundred gigabytes
             | for UHD HDR content. You seem to be unnecessarily
             | exaggerating.
             | 
             | Edit: it's a 10x increase in encode speed, not time. That
             | would be the opposite effect.
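
The split-and-encode-in-parallel idea can be sketched as a planning step. This Python example uses the 90-minute feature and 10-minute segments from the comment; the function names are illustrative, and the speedup is an idealized upper bound that assumes one equally fast worker per segment and ignores splitting and joining overhead.

```python
def plan_segments(total_s: int, segment_s: int):
    """Return (start, duration) pairs, in seconds, covering the whole input."""
    segments = []
    start = 0
    while start < total_s:
        duration = min(segment_s, total_s - start)
        segments.append((start, duration))
        start += segment_s
    return segments


def ideal_speedup(segments) -> float:
    """Speedup with one worker per segment: total length / longest segment."""
    total = sum(duration for _, duration in segments)
    longest = max(duration for _, duration in segments)
    return total / longest


segments = plan_segments(total_s=90 * 60, segment_s=10 * 60)
print(len(segments))            # 9
print(ideal_speedup(segments))  # 9.0
```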
        
               | bick_nyers wrote:
               | Why did the encode time increase by 10x in that instance?
               | Can't you just seek in the video to the I-frame before
               | the cut point and start your encode there?
               | 
               | I've never tried merging streams across computers so was
               | naively just thinking that your output from each computer
               | would be an MP4 but that makes sense.
               | 
               | I pulled that info from a Netflix talk. Perhaps video
               | cameras back when that talk occurred didn't compress
               | the video for you? Besides, isn't IMAX all intra-encoded?
               | It was my understanding that IMAX films are actually just
               | a series of J2K images, so I would imagine that the video
               | cameras used there would also be intra-encoded.
        
               | dylan604 wrote:
               | s/increase/decrease/
               | 
               | i was thinking increased the encode speed 9x, but typed
               | increased encode time. i also swapped the number of
               | segments by segment duration. 9 segments of 10mins = 9x
               | increase in performance.
               | 
               | Sounds like you are confusing Netflix' recommended
               | formats for acquisition vs delivery. Cameras capture RAW
               | formats (rarely is it uncompressed though), and the post
               | houses use that as sources. The post house/color
               | correction will then create the delivery format, which is
               | typically ProRes. RAW is not a friendly format for
               | distribution in the slightest. The full workflow from
               | camera original to what ends up being streamed to the end
               | viewer changes formats multiple times through the
               | process.
        
               | bick_nyers wrote:
               | Gotcha, that makes much more sense
        
             | matsemann wrote:
             | For certain containers / codecs you can concatenate files
             | without re-encoding. I do it quite often with ffmpeg using
             | -c copy and it's basically at the speed of the disk.
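
One common way to do this kind of stream-copy join is ffmpeg's concat demuxer, which reads a text file listing the inputs. The Python sketch below only builds that list and the command line; the segment file names are hypothetical, and it assumes every segment was encoded with the same codec and parameters.

```python
def concat_list_text(parts) -> str:
    """Build the list file for ffmpeg's concat demuxer, one input per line."""
    lines = []
    for part in parts:
        # The concat list syntax closes and reopens the quote around a
        # literal single quote.
        escaped = str(part).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_command(list_path: str, output: str):
    """ffmpeg arguments that join the listed files without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path,
            "-c", "copy", output]


print(concat_list_text(["seg0.mp4", "seg1.mp4"]), end="")
print(" ".join(concat_command("segments.txt", "joined.mp4")))
```

Writing the list text to segments.txt and running the printed command performs the join.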
        
               | bick_nyers wrote:
               | The parent comment I was responding to was discussing
               | splitting encodes across multiple computers then
               | recombining which is what I was referring to. Still sounds
               | like it is possible and I was incorrect.
        
             | scottlamb wrote:
             | > Ok, but why not split the file up for standard encoding?
             | Well, you can't just concatenate two .mp4 together without
             | re-encoding and have it make sense to most media players
             | (as far as I am aware)
             | 
             | You can't just literally `cat foo-[123].mp4 > foo.mp4` with
             | old-school non-fragmented .mp4 files, but you just have to
             | shuffle the container stuff around a bit. You don't need to
             | re-encode.
             | 
             | One downside is if you decide ahead of time that you're
             | going to divide the video into fixed 5-second
             | fragments/segments/chunks to encode independently, you're
             | going to end up with that-length closed GOPs that don't
             | match scene transitions or the like. IDR frame every 5
             | seconds. So no B/P frames that reference stuff 10 seconds
             | ago, no periodic incremental refresh, nothing fancy.
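
The downside described above can be made concrete with frame indices. In this Python sketch the 20-second clip, the 24fps rate and the scene-cut positions are all hypothetical; it lists where fixed 5-second chunking forces IDR frames and which scene cuts do not land on one.

```python
def fixed_idr_frames(total_frames: int, fps: int, segment_s: int):
    """Frame indices where fixed-length chunking forces an IDR frame."""
    gop = fps * segment_s  # closed GOP length in frames
    return list(range(0, total_frames, gop))


def cuts_without_idr(scene_cuts, idr_frames):
    """Scene cuts that do not coincide with a forced IDR frame."""
    idr = set(idr_frames)
    return [cut for cut in scene_cuts if cut not in idr]


idr = fixed_idr_frames(total_frames=480, fps=24, segment_s=5)
print(idr)                                         # [0, 120, 240, 360]
print(cuts_without_idr([0, 100, 250, 360], idr))   # [100, 250]
```

A scene-aware splitter would instead place chunk boundaries at the cuts themselves.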
        
           | afvid wrote:
           | yep, that's what bitmovin does
        
             | dylan604 wrote:
             | I'm pretty sure that's what everyone does after the first
             | time they try a test encode and see the dismal speeds. It's
             | a trick as old as time. The trick is to make that segment
             | decision better than something like the YT algo that
             | decides where to place an ad break.
        
         | shmerl wrote:
         | Haven't tested it yet. I'm waiting for OBS to enable hardware
         | accelerated AV1 encoding on Linux.
        
         | sergiotapia wrote:
         | How are you getting 1fps encoding? I see 332fps on my 12900K
         | with preset 10.
         | 
         | -vf scale=1280:720 -c:v libsvtav1 -crf 30 -preset 7 -c:a
         | libopus -b:a 96k -ac 2
        
           | RealStickman_ wrote:
           | Preset 10 is pretty fast, but you're disabling a lot of the
           | features that allow better quality and compression ratios
           | (see readme on the svt-av1 repo). For archiving 3 or 4 is
           | probably good.
        
         | blihp wrote:
         | AV1 hardware encode just started rolling out with the most
         | recent GPU architectures so the majority of hardware still has
         | to do software encoding. I'd also guess that a very large
         | fraction of the hardware out there doesn't even have hardware
         | decode for AV1 since it was only the last generation of GPUs
         | that got that. AV1 is mainly solving problems for the platform
         | owners (google/netflix/facebook etc) and h264 will probably
         | serve typical users for years to come especially if they have
         | that one device they want to keep using that only supports
         | h264.
        
           | rb2k_ wrote:
           | In the past (with h265 / h264 at least), hardware encoding
           | always ended up with visibly worse quality (and often even
           | bigger file sizes) compared to a software encoder like
           | x264/x265.
           | 
           | Do you happen to know if that's still the case?
           | 
           | (I guess for use-cases such as live streaming it doesn't
           | matter that much, but for video that ends up in some archive,
           | it's probably less acceptable)
        
             | blihp wrote:
             | That's usually the case as the hardware encoders tend to
             | make tradeoffs in the direction of lower transistor count /
             | faster frame processing while software encoders have the
             | luxury of going for higher quality.
        
             | bick_nyers wrote:
             | It's around 5% (maybe 10%?) larger file sizes for same
             | visual quality at the moment. For archival I think that's
             | fine, as storage is cheap, but it can still be a problem
             | when you pay for outbound bandwidth to users.
        
             | adgjlsfhk1 wrote:
             | hardware encoding gives up a little quality and filesize,
             | but hardware encoding of AV1 will generally beat software
             | encoding of x264 on all axes.
        
           | andyfrancis wrote:
           | Yeah, it will still be a while before anyone can fully get
           | away from h264, but with Apple adding AV1 decoders to their
           | latest chips, hopefully all the wheels are at least in motion
           | now.
        
           | clouddrover wrote:
           | > _I'd also guess that a very large fraction of the hardware
           | out there doesn't even have hardware decode for AV1 since it
           | was only the last generation of GPUs that got that_
           | 
           | Don't underestimate dav1d. It's a highly optimized software
           | AV1 decoder:
           | 
           | https://code.videolan.org/videolan/dav1d
           | 
           | On my nine year old system, 1080p60 AV1 video was unwatchable
           | with early releases of dav1d due to too many dropped frames.
           | 
           | Eventually dav1d got enough AVX2 optimizations to play the
           | same video on the same hardware with zero dropped frames.
           | 
           | It was an impressive demonstration of what can be achieved
           | when software makes the most of the available hardware.
        
         | aidenn0 wrote:
         | I care mainly about archiving, so 1FPS encodes for 24fps
         | content is within the range that I consider okay.
         | 
         | However, I ran into issues with decoder speed too; whatever
         | codec Kodi was using 18 months ago struggled to decode high-
         | bitrate 1080p AV1 video last time I tried. Maybe this weekend
         | I'll try again on the current version of Kodi.
        
           | bick_nyers wrote:
           | Try encoding with a "low-latency" or a "fast-decode"
           | parameter, and see if that is acceptable to you. Keep in mind
           | not all AV1 encoders are created equal.
           | 
           | If you're on Windows I recommend using StaxRip for encoding.
        
             | aidenn0 wrote:
             | I'm on Linux, currently using ffmpeg for encoding. I'll
             | look for options that might improve decodability, but I
             | seem to recall I had to go pretty aggressive on both of the
             | two av1 encoders I tried to get any savings vs H.265
        
               | bick_nyers wrote:
               | It depends on what bitrate you are targeting, and the
               | source content.
               | 
               | AV1 does not outperform H265 at high bitrates (and in
               | certain cases, medium bitrates). What is considered "high
               | bitrate" is dependent on source content, but a good rule
               | of thumb is 40 MBit (think BluRay quality) or more for 4k
               | content almost always goes to H265.
               | 
               | For AV1, depending on what you are encoding, and how much
               | extra time you want to dedicate to experimentation, take
               | a look at grain synthesis (you will want to test decode
               | capabilities on that one).
        
         | crazygringo wrote:
         | That sounds crazy slow. Look into what flags you're using, and
         | choose a setting that reduces CPU in exchange for compressing
         | less.
         | 
         | All the major codecs have flags for you to balance speed
         | against quality and compression, and it's up to you to pick the
         | right tradeoffs for your use case.
         | 
         | And for most purposes, you want to use software encoders
         | because they're much more flexible in terms of flags/options
         | than hardware encoders. (Hardware encoders are usually
         | optimized for speed rather than quality -- they're for live
         | capture more than for video conversion.)
        
         | turtlebits wrote:
         | Same here. I'm not willing to pony up money for
         | hardware/compute to encode AV1.
         | 
         | I encode x265 on a $2.5/month VPS w/ 4gb ram during the off
         | hours. AV1 is almost an order of magnitude slower.
        
         | cbhl wrote:
         | My understanding is there is some support for GPU-accelerated
         | AV1 encoding in top-of-the-line latest-gen discrete GPUs that
         | can be used for live streaming with OBS Studio 29.1 or higher.
         | Helpful if you are in a situation where you have extra GPU and
         | are tight on bandwidth.
         | 
         | In my opinion it's still in the early-adopter phase though, and
         | it's perfectly valid to use tried-and-true codecs for user-
         | interactive rendering and encoding cases, or where the existing
         | codec meets your requirements for the compute vs disk/bandwidth
         | trade-off.
        
       | shmerl wrote:
       | _> YouTube for example, encodes content in H.264/AVC, VP9 and
       | AV1_
       | 
       | Why don't I see AV1 in many YouTube videos though? Checking with
       | yt-dlp. It looks like they were planning to use it, but didn't
       | really roll it out.
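
A check like this can be scripted against the format list that yt-dlp reports for a video. The Python sketch below works on plain dictionaries and assumes each entry carries "vcodec" and "height" keys, as yt-dlp's format entries are understood to; the sample data is made up, and codecs_by_height is an illustrative name.

```python
def codecs_by_height(formats):
    """Map each video height to the set of codec families offered at it."""
    families = {"av01": "AV1", "vp09": "VP9", "vp9": "VP9", "avc1": "H.264"}
    result = {}
    for fmt in formats:
        vcodec = fmt.get("vcodec") or "none"
        height = fmt.get("height")
        if vcodec == "none" or height is None:
            continue  # audio-only or otherwise not a video stream
        prefix = vcodec.split(".")[0]  # e.g. "av01.0.08M.08" -> "av01"
        result.setdefault(height, set()).add(families.get(prefix, prefix))
    return result


# Hypothetical listing; a real one could come from
# yt_dlp.YoutubeDL().extract_info(url, download=False)["formats"].
sample_formats = [
    {"vcodec": "none", "height": None},
    {"vcodec": "avc1.4d401e", "height": 360},
    {"vcodec": "vp09.00.21.08", "height": 360},
    {"vcodec": "vp09.00.51.08", "height": 2160},
    {"vcodec": "av01.0.12M.08", "height": 2160},
]
result = codecs_by_height(sample_formats)
for height in sorted(result):
    print(height, sorted(result[height]))
```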
        
         | Almondsetat wrote:
         | AV1 is almost always present in >2k videos
        
         | adgjlsfhk1 wrote:
         | the new codecs tend to only get deployed for higher resolution.
         | 5% better 360p video doesn't help if you are also sending 4k
         | video.
        
           | shmerl wrote:
           | Ah, that makes sense. Just checked one with high resolution
           | available and I can see AV1 there.
        
           | andyfrancis wrote:
           | Yeah, that's true. I did see something from Meta recently
           | where they were delivering lower resolution/bandwidth AV1 to
           | devices that didn't have AV1 hardware decoders, but had
           | enough CPU for software decoding of the smaller renditions.
        
         | Jap2-0 wrote:
         | My (not super informed) understanding is that it may also
         | depend on how popular the video is - the increase in encoding
         | time is less worth it if there are few people seeing quality or
         | bandwidth benefits (depending on how they tune it).
        
       | dmillar wrote:
       | OPUS should have an honorable mention here. Same case could be
       | made for OPUS vs AAC/AC3/etc.
        
         | Dwedit wrote:
         | OPUS is everywhere where WebRTC is used, as a Mandatory to
         | Implement codec.
        
         | poser-boy wrote:
         | I too wish Opus got more love in the video space. Outside of
         | YouTube and VOIP, it doesn't get used.
         | 
         | Shame since Opus is smarter with bitrate allocation with
         | surround sound.
        
       | beastman82 wrote:
       | The only part of this that is a "guide to encoding" is them
       | pitching their encoding mechanism. Quite a misleading title
        
       | Ayesh wrote:
       | Instead of being a "guide" as it says in the title, this is
       | merely comparing AV1 to other codecs, and ends by promoting their
       | own thing. I don't think AV1 needs any "convincing", it's
       | theoretically just better in every aspect. It's the tooling and
       | hardware that needs work.
        
         | tverbeure wrote:
         | One of my take-aways after going through it is the cross-over
         | point between h.264 and AV1: it depends on the number of
         | expected views. AV1 is computationally more intensive and
         | there's a dollar number assigned to that.
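
The cross-over point can be written as a break-even calculation. Every number in this Python sketch is hypothetical (extra encode cost, data saved per view, delivery price); it only illustrates the shape of the trade-off and is not the article's model. Fraction is used so the example arithmetic is exact.

```python
import math
from fractions import Fraction


def break_even_views(extra_encode_cost, gb_saved_per_view, cost_per_gb) -> int:
    """Views needed before delivery savings repay the extra encode cost."""
    saving_per_view = gb_saved_per_view * cost_per_gb
    if saving_per_view <= 0:
        raise ValueError("AV1 must save bandwidth for a break-even to exist")
    return math.ceil(extra_encode_cost / saving_per_view)


views = break_even_views(
    extra_encode_cost=Fraction("5.00"),   # dollars of extra compute for AV1
    gb_saved_per_view=Fraction("0.25"),   # GB less data sent per view
    cost_per_gb=Fraction("0.02"),         # dollars per GB delivered
)
print(views)  # 1000
```

Below that view count the cheaper h.264 encode wins; above it the bandwidth savings dominate.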
        
       | mastax wrote:
       | I'm surprised the HEVC Support figure is so low: 15%! I suppose
       | HEVC is practically only supported by relatively modern devices
       | which have hardware support for it. None of the OS or browser
       | makers want to ship software support because they'd have to pay
       | the license fee. (I paid $1 for the HEVC extension in the Windows
       | Store but I'm sure that's exceedingly rare). Browsers can ship
       | software AV1 support without needing to pay.
        
         | clouddrover wrote:
         | In addition to codec licensing fees, another problem with HEVC
         | is the threat of content licensing fees:
         | 
         | https://streaminglearningcenter.com/codecs/codec-royalties-o...
         | 
         | It's just not worth it. Royalty-free formats (like AV1) are the
         | way to go on the web.
        
       | poser-boy wrote:
       | I'm sure AV1 will have its time. But for now I'm preferring x265
       | HEVC for hi-fi video (movies etc.).
        
       ___________________________________________________________________
       (page generated 2023-11-03 23:00 UTC)