[HN Gopher] Replacing WebRTC: real-time latency with WebTranspor...
       ___________________________________________________________________
        
       Replacing WebRTC: real-time latency with WebTransport and WebCodecs
        
       Author : kixelated
       Score  : 180 points
       Date   : 2023-10-30 14:38 UTC (8 hours ago)
        
 (HTM) web link (quic.video)
 (TXT) w3m dump (quic.video)
        
       | tamimio wrote:
        | I got excited for a second that something new would replace
        | WebRTC for media/video... it is not. Around two years ago,
        | in a project where I wanted to transfer a raw 4K stream with
        | low latency (sub-50 ms) over a cellular network, WebRTC
        | performed so poorly it was a no-go. I ended up writing my
        | own algorithm that enables FEC (Forward Error Correction)
        | based on IETF payload scheme 20, piped it through UDP with
        | GStreamer, and managed to make it work, but obviously it
        | wasn't in the browser.
        
         | englishm wrote:
         | That sounds like it was an interesting project! Jean-Baptiste
         | Kempf also presented something at Demuxed last week using FEC
         | and QUIC datagrams to get similarly low latency delivery over
         | WebTransport into a browser. There's also a draft[1] for adding
         | FEC to QUIC itself so it's quite possible Media over QUIC (MoQ)
         | could benefit from this approach as well.
         | 
         | I'm not sure why you say "it is not." We have a working MoQ
         | demo running already on https://quic.video that includes a
         | Rust-based relay server and a TypeScript player. The BBB demo
         | video is being ingested using a Rust-based CLI tool I wrote
         | called moq-pub which takes fragmented MP4 input from ffmpeg and
         | sends it to a MoQ relay.
         | 
         | You can also "go live" from your own browser if you'd like to
         | test the latency and quality that way, too.
         | 
         | [1]: https://www.ietf.org/archive/id/draft-michel-quic-
         | fec-01.htm...
        
           | tamimio wrote:
            | Thanks! It was a big project, and streaming 4K in real
            | time from a flying drone was one of the challenging
            | parts! I have a write-up about it; nothing too
            | technical, but there are some videos demonstrating some
            | of the differences (1).
            | 
            | > I'm not sure why you say "it is not."
            | 
            | Pardon my ignorance; it looked like it isn't replacing
            | WebRTC entirely yet, but I'm glad I was wrong. I've
            | never tried anything QUIC-based for media and would love
            | to try the MoQ tool you wrote; I like that it's Rust-
            | based too, as the one I wrote was also in Rust. I will
            | give it a test for sure; it's been two years and I
            | haven't been following updates, so hopefully there's an
            | improvement compared to what it was back then.
           | 
           | > takes fragmented MP4 input from ffmpeg and sends it to a
           | MoQ relay.
           | 
            | Just a quick question: is ffmpeg a "requirement" per se
            | for that CLI tool? As I remember, I had to ditch ffmpeg
            | in favor of GStreamer since the former was eating up a
            | lot of resources in comparison, and that was a crucial
            | issue since the server was basically an SBC on a flying
            | drone.
           | 
           | (1) https://tamim.io/professional_projects/nerds-heavy-lift-
           | dron...
        
             | englishm wrote:
             | The current moq-pub implementation only requires valid fMP4
             | (a la CMAF) to be provided over stdin. I haven't tested,
             | but I imagine you can do the same with gstreamer.
             | 
             | Separately, I've been working on a wrapper library for moq-
             | rs that I've been calling 'libmoq'. The intent there is to
             | provide a C FFI that non-Rust code can link against. The
             | first integration target for libmoq is ffmpeg. (I have a
             | few bugs to work out before I clean up and shout about that
             | code, but it does mostly work already.)
             | 
             | I gave a presentation about some of this work last week at
             | Demuxed, but the VoDs probably won't be available on
             | YouTube until Decemberish.
             | 
             | Also, I understand the gstreamer project has better support
             | for Rust so I'll be looking at that soon, too.
        
         | Sean-Der wrote:
         | That's a cool problem!
         | 
         | I could see how WebRTC out of the box would perform poorly. It
         | wants to NACK + delay to give a good 'conferencing experience'.
         | 
         | I bet with FlexFEC [0] and playout-delay [1] you would get the
         | behavior you were looking for. The sender would have to be
         | custom, but the receivers (browsers) should just work! If you
         | are interested in giving it a shot again would love to help :)
         | 
         | [0] https://datatracker.ietf.org/doc/html/rfc8627
         | 
         | [1] https://webrtc.googlesource.com/src/+/main/docs/native-
         | code/...
        
           | tamimio wrote:
            | It was an interesting problem indeed! I have a write-up
            | about it (and the whole project) at a link in my comment
            | above, which might give more context.
            | 
            | > I bet with FlexFEC [0]
            | 
            | The previous draft (1) of this was my basis when I built
            | the FEC sender. I didn't manage to stream 4K into the
            | browser though; my client was OBS with GStreamer, as it
            | was far more performant than a browser in my tests. Do
            | you have any demo I can try to stream into the browser?
            | That would be a really major improvement! And I
            | appreciate the help, I would definitely give it another
            | shot!
           | 
           | (1) https://datatracker.ietf.org/doc/html/draft-ietf-payload-
           | fle...
        
             | Sean-Der wrote:
              | I would love to! Join https://pion.ly/slack and I am
              | Sean-Der.
              | 
              | If you prefer Discord, there is also a dedicated space
              | for 'real-time broadcasting':
              | https://discord.gg/DV4ufzvJ4T
        
         | imtringued wrote:
          | Well, a raw 4K stream has a bitrate of about 11.9 Gbps. I
          | would be surprised if that worked over a cellular network
          | at all.
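          | 
          | (That figure assumes 3840x2160 at 60 fps with 24 bits per
          | pixel: 3840 x 2160 x 24 x 60 ≈ 11.9 Gbit/s.)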
        
           | vlovich123 wrote:
            | At 60 fps, but yeah. I think they mean they still passed
            | the raw 4K stream through a lossy codec before putting
            | it over cellular.
        
           | tamimio wrote:
            | I think it was around 3 Gbps or even less, if I remember
            | correctly; it wasn't 60 fps and I can't remember the
            | color depth either. The cellular network was an SA
            | (standalone) mmWave private network, not a commercial
            | one, so it did work in our project eventually.
        
         | modeless wrote:
          | Yeah. From what I can see, WebRTC is basically a ready-
          | made implementation of a video calling app, bolted onto
          | the side of the browser, that you can slap some UI on top
          | of. If you want to do anything that isn't a straight Zoom-
          | style video calling app (or Stadia, RIP), you'll
          | immediately run into a wall of 5-year-old known but
          | unfixed bugs, or worse.
         | 
         | To be fair, video calling is an incredibly important use case
         | and I'm glad it can be done in the browser. I am grateful to
         | the WebRTC team every time I can join a Zoom call without
         | installing a bunch of extra crap on my machine. I just hope
         | that WebRTC really can someday be replaced by composing
         | multiple APIs that are much smaller in scope and more flexible,
         | to allow for use cases other than essentially Zoom and Stadia.
         | 
         | I guess I'm just repeating what the article said, but it's so
         | right that it's worth repeating.
        
       | DaleCurtis wrote:
        | Thanks for the nice write-up! I work on the WebCodecs team
        | at Chrome. I'm glad to hear it's mostly working for you. If
        | you (or anyone else) have specific requests for new knobs
        | regarding "We may need more encoding options, like non-
        | reference frames or SVC", please file issues at
        | https://github.com/w3c/webcodecs/issues
        
         | vlovich123 wrote:
          | There are a few that would be neat:
         | 
         | * maybe possible already, but it's not immediately clear how to
         | change the bitrate of the encoder dynamically when doing
         | VBR/CBR (seems like you can only do it with per-frame
         | quantization params which isn't very friendly)
         | 
         | * being able to specify the reference frame to use for encoding
         | p frames
         | 
          | * being able to generate slices efficiently / display them
          | easily. For example, Oculus Link encodes 1/n of the video in
          | each of n parallel encoders and decodes similarly. This way
          | your encoding time only contributes 1/n of a frame's
          | encode/decode worth of latency, because the rest is amortized
          | with tx+decode of the other slices. I suspect the biggest
          | requirement here is to be able to cheaply and easily get N
          | VideoFrames, OR to be able to cheaply split a VideoFrame into
          | horizontal or vertical slices (rough sketch below).
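          | 
          | For the third point, something like cropping with
          | visibleRect might already get partway there (a rough,
          | untested sketch; encoder setup omitted):
          | 
          |     // Split one frame into n horizontal slices, one per
          |     // parallel encoder; visibleRect crops without copying.
          |     function slice(frame: VideoFrame, n: number): VideoFrame[] {
          |       const h = frame.codedHeight / n;
          |       const out: VideoFrame[] = [];
          |       for (let i = 0; i < n; i++) {
          |         out.push(new VideoFrame(frame, {
          |           visibleRect: {
          |             x: 0, y: i * h,
          |             width: frame.codedWidth, height: h,
          |           },
          |         }));
          |       }
          |       return out; // feed out[i] to encoders[i]
          |     }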
        
           | DaleCurtis wrote:
           | * Hmm, what kind of scheme are you thinking beyond per frame
           | QP? Does an abstraction on top of QP work for the case you
           | have in mind?
           | 
           | * Reference frame control seems to be
           | https://github.com/w3c/webcodecs/issues/285, there's some
           | interest in this for 2024, so I'd expect progress here.
           | 
           | * Does splitting frames in WebGPU/WebGL work for the use case
           | here? I'm not sure we could do anything internally (we're at
           | the mercy of hardware decode implementations) without
           | implementing such a shader.
        
             | vlovich123 wrote:
             | > what kind of scheme are you thinking beyond per frame QP
             | 
             | Ideally I'd like to be able to set the CBR / VBR bitrate
             | instead of some vague QP parameter that I manually have to
             | profile to figure out how it corresponds to a bitrate for a
             | given encoder. Of course, maybe encoders don't actually
             | support this? I can't recall. It's been a while.
             | 
             | > Does splitting frames in WebGPU/WebGL work for the use
             | case here? I'm not sure we could do anything internally
             | (we're at the mercy of hardware decode implementations)
             | without implementing such a shader.
             | 
             | I don't think you need a shader. We did it at Oculus Link
             | with existing HW encoders and it worked fine (at least for
             | AMD and NVidia - not 100% sure about Intel's capabilities).
             | It did require some bitmunging to muck with the NVidia H264
             | bitstream to make the parallel QCOM decoders happy with
             | slices coming from a single encoder session* but it wasn't
             | that significant a problem.
             | 
              | For video streaming, supporting a standard for webcams
              | to deliver slices with timestamped information about
              | the rolling shutter (+ maybe IMU for mobile use cases)
              | would help create a market for premium low-latency
              | webcams. You'd need to figure out how to implement
              | just-in-time rolling shutter corrections on the
              | display side to mitigate the downsides of rolling
              | shutter, but the extra IMU information would be very
              | useful (many mobile camera display packages support
              | this functionality). VR displays often have rolling
              | shutter, so a rolling-shutter webcam + display
              | together would really make it possible to do "just in
              | time" corrections for where pixels end up, to adjust
              | for latency. I'm not sure how much you'd get out of
              | that, but my hunch is that if you work out all the
              | details you should be able to shave off nearly a frame
              | of latency glass to glass.
             | 
             | Speaking of adjustments, extracting motion vectors from the
             | video is also useful, at least for VR, so that you can give
             | the compositor the relevant information to apply last-
             | minute corrections for that "locked to your motion" feeling
             | (counteracts motion sickness).
             | 
             | On a related note, with HW GPU encoders, it would be nice
             | to have the webcam frame sent from the webcam directly to
             | the GPU instead of round-tripping into a CPU buffer that
             | you then either transport to the GPU or encode on the CPU -
             | this should save a few ms of latency. Think NVidia's Direct
             | standards but extended so that the GPU can grab the frame
             | from the webcam, encode & maybe even send it out over
             | Ethernet directly (the Ethernet part would be particularly
             | valuable for tech like Stadia / GeForce now). I know the HW
             | standards for that don't actually exist yet, but it might
             | be interesting to explore with NVidia, AMD, and Intel what
             | HW acceleration of that data path might look like.
             | 
              | * NVidia's encoder supports slices directly and has an
              | artificial limit on the number of encoder sessions on
              | consumer drivers (they raised it in the past few years
              | but IIRC it's still anemic). That however means that
              | the generated slices have some incorrect parameters in
              | the bitstream if you want to decode them
              | independently, so you have to muck with the bitstream
              | in a trivial way so that the decoders see independent,
              | valid H264 bitstreams they can decode. On AMD you
              | don't have a limit on the number of encoder sessions.
        
               | DaleCurtis wrote:
               | > Ideally I'd like to be able to set the CBR / VBR
               | bitrate
               | 
               | What's wrong with the existing VBR/CBR modes?
               | https://developer.mozilla.org/en-
               | US/docs/Web/API/VideoEncode...
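                | 
                | For example, a minimal sketch (mode support varies
                | by codec and platform; send() is a placeholder):
                | 
                |     const config: VideoEncoderConfig = {
                |       codec: "avc1.42001f",
                |       width: 1280,
                |       height: 720,
                |       bitrate: 2_000_000,
                |       bitrateMode: "constant", // or "variable"
                |     };
                |     const encoder = new VideoEncoder({
                |       output: (chunk) => send(chunk),
                |       error: (e) => console.error(e),
                |     });
                |     encoder.configure(config);
                |     // Reconfiguring mid-stream changes the target:
                |     encoder.configure({ ...config, bitrate: 1_000_000 });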
               | 
               | > I don't think you need a shader...
               | 
                | Ah, I see what you mean. It'd probably be hard for
                | us to standardize this in a way that worked across
                | platforms, which likely precludes us from doing
                | anything quickly here. The stuff easiest to
                | standardize for WebCodecs is stuff that's already
                | standardized as part of the relevant codec spec
                | (e.g., AVC, AV1, etc.) and well supported on a
                | significant range of hardware.
               | 
               | > ... instead of round-tripping into a CPU buffer
               | 
                | We're working on optimizing this in 2024; we do
                | avoid CPU buffers in some cases, but not as many as
                | we could.
        
         | kixelated wrote:
         | Thanks for WebCodecs!
         | 
         | I'm still just trying to get A/V sync working properly because
         | WebAudio makes things annoying. WebCodecs itself is great; I
         | love the simplicity.
        
           | padenot wrote:
           | https://blog.paul.cx/post/audio-video-synchronization-
           | with-t... has some background, https://github.com/w3c/webcode
            | cs/blob/main/samples/lib/web_a... is part of a full
            | example that you can run using WebCodecs, Web Audio,
            | AudioWorklet, and SharedArrayBuffer, and it does A/V
            | sync.
           | 
           | If it doesn't answer your question let me know because I
           | wrote both (and part of the web audio spec, and part of the
           | webcodecs spec).
        
             | kixelated wrote:
             | I'm using AudioWorklet and SharedArrayBuffer. Here's my
             | code: https://github.com/kixelated/moq-
             | js/tree/main/lib/playback/w...
             | 
             | It's just a lot of work to get everything right. It's kind
             | of working, but I removed synchronization because the
             | signaling between the WebWorker and AudioWorklet got too
             | convoluted. It all makes sense; I just wish there was an
             | easier way to emit audio.
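              | 
              | The worklet side boils down to pulling PCM out of a
              | shared ring buffer. A minimal sketch (the shared
              | read/write cursors and the A/V clock are elided):
              | 
              |     // Registered on the AudioWorklet thread; the
              |     // SharedArrayBuffer is filled by the decoder worker.
              |     class RingPlayer extends AudioWorkletProcessor {
              |       samples: Float32Array;
              |       read = 0; // real code shares this cursor too
              |       constructor(options: any) {
              |         super();
              |         this.samples =
              |           new Float32Array(options.processorOptions.sab);
              |       }
              |       process(_: Float32Array[][],
              |               outputs: Float32Array[][]): boolean {
              |         const out = outputs[0][0]; // mono, 128 samples
              |         for (let i = 0; i < out.length; i++) {
              |           out[i] = this.samples[this.read];
              |           this.read = (this.read + 1) % this.samples.length;
              |         }
              |         return true; // keep the processor alive
              |       }
              |     }
              |     registerProcessor("ring-player", RingPlayer);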
             | 
             | While you're here, how difficult would it be to implement
             | echo cancellation? The current demo is uni-directional but
             | we'll need to make it bi-directional for conferencing.
        
         | hokkos wrote:
          | I use the WebCodecs API with VideoDecoder for a very
          | specific use case: getting data arrays that have temporal
          | coherency, using the great compression of video codecs.
          | Demo here:
          | https://energygraph.info/d/f487b4fd-45ad-4f94-8e7e-ea32fc280...
         | 
          | And I have some issues with the copyTo method of
          | VideoFrame: on mobile (Pixel 7 Pro) it is unreliable and
          | outputs all-zero Uint8Arrays beyond 20 frames, to the
          | point that I am forced to render each frame to an
          | OffscreenCanvas. Also, the many frame output formats
          | around RGBA/R8, with reduced range (16-235) or full range
          | (0-255), make it hard to use in my convoluted way.
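          | 
          | The workaround looks roughly like this (a sketch; assumes
          | a frame already decoded):
          | 
          |     // Fallback: draw the frame, then read the pixels back.
          |     const canvas = new OffscreenCanvas(
          |       frame.displayWidth, frame.displayHeight);
          |     const ctx = canvas.getContext("2d")!;
          |     ctx.drawImage(frame, 0, 0);
          |     const rgba = ctx.getImageData(
          |       0, 0, canvas.width, canvas.height).data;
          |     frame.close();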
        
           | DaleCurtis wrote:
           | Please file an issue at https://crbug.com/new with the
           | details and we can take a look. Are you rendering frames in
           | order?
           | 
            | Android may have some quirks due to legacy MediaCodec
            | restrictions around how we more commonly need frames for
            | video elements: frames only work in sequential order,
            | since they must be released to an output texture to be
            | accessed (and releasing invalidates prior frames, to
            | speed up very old MediaCodecs).
        
             | hokkos wrote:
              | I will try to do a simple reproduction, and yes, the
              | frames are decoded in order.
        
         | jampekka wrote:
        | I'm currently working with WebCodecs to get (the long-
        | awaited) frame-by-frame seeking and reverse playback working
        | in the browser. It even seems to work, albeit the
        | VideoDecoder queueing logic gives this some grief. Any tips
        | on figuring out how many chunks have to be queued for a
        | specific VideoFrame to pop out?
        | 
        | An aside: to work with video/container files, be sure to
        | check out the libav.js project, which can be used to demux
        | streams (WebCodecs doesn't do this) and even as a polyfill
        | decoder for browsers without WebCodecs support!
         | 
         | https://github.com/Yahweasel/libav.js/
        
           | DaleCurtis wrote:
            | The number of frames necessary is going to depend on the
            | codec and bitstream parameters. If it's H264 or H265, there's
           | some more discussion and links here: https://github.com/w3c/w
           | ebcodecs/issues/698#issuecomment-161...
           | 
           | The optimizeForLatency parameter may also help in some cases:
           | https://developer.mozilla.org/en-
           | US/docs/Web/API/VideoDecode...
        
             | jampekka wrote:
              | Thanks. I appreciate that making an API that can be
              | implemented on top of a wide variety of decoding
              | implementations is not an easy task.
              | 
              | But to be specific, this is a bit problematic even
              | with I-frame-only videos and with optimizeForLatency
              | enabled (which does make the queue shorter). I can of
              | course .flush() to get the frames out, but this is too
              | slow for smooth playback.
              | 
              | I think I could just keep pushing chunks until I see
              | the frame I want coming out, but it has to be done in
              | an async "busy loop", which feels a bit nasty. This is
              | done in the "official" examples too, I think.
              | 
              | Something like an "enqueue" event (analogous to
              | "dequeue"), signaling that more chunks are needed
              | after the last .decode() to saturate the decoder,
              | would allow for a clean implementation. I don't know
              | if this is possible with all backends, though.
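              | 
              | For reference, the workaround I mean looks roughly
              | like this (a sketch; I-frame-only input and a
              | placeholder codec string assumed):
              | 
              |     // Feed chunks until the wanted timestamp falls
              |     // out of the decoder's internal queue.
              |     function frameAt(target: number,
              |                      chunks: EncodedVideoChunk[]) {
              |       return new Promise<VideoFrame>((resolve) => {
              |         const decoder = new VideoDecoder({
              |           output: (f) => f.timestamp === target
              |             ? resolve(f) : f.close(),
              |           error: (e) => console.error(e),
              |         });
              |         decoder.configure({
              |           codec: "avc1.42001f",
              |           optimizeForLatency: true,
              |         });
              |         for (const c of chunks) decoder.decode(c);
              |       });
              |     }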
        
           | Rodeoclash wrote:
            | Wow, great to see some work in this space. I've been
            | wanting to do reverse playback, frame-accurate seeking,
            | and step-by-step forward and back rendering in the
            | browser for esports game analysis. The regular video tag
            | gets you _somewhat_ of the way there, but navigating
            | frame by frame will sometimes jump an extra frame.
            | Likewise, trying to stop at an exact point will often
            | land 1 or 2 frames off where you should be. Firefox is
            | much worse: when pausing at a given time you can be
            | +-12 frames off from where you should be.
           | 
           | I must find some time to dig into this, thanks for sharing
           | it.
        
             | jampekka wrote:
              | I have it working with WebCodecs, but currently only
              | for I-frame-only videos, and all the decoded frames
              | are read into memory. Not impossible to lift these
              | restrictions, but the current WebCodecs API will
              | likely make it a bit brittle (and/or janky). For my
              | current case this is not a big problem, so I haven't
              | fought with it too much.
              | 
              | Figuring out libav.js demuxing may be a bit of a
              | challenge, even though the API is quite nice as
              | traditional AV APIs go. I'll put out my small wrapper
              | for this in a few days.
              | 
              | Edit: to be clear, I don't have anything to do with
              | libav.js other than happening to find it and using it
              | to scratch my itch. Most demuxing examples for
              | WebCodecs use mp4box.js, which makes one a bit
              | uncomfortably intimate with the guts of the MP4
              | format.
        
         | krebby wrote:
         | Encoding alpha, please!
         | https://github.com/w3c/webcodecs/issues/672
         | 
         | Thanks for the great work on WebCodecs!
        
       | fenesiistvan wrote:
        | I don't get this. After the initial ICE negotiation you can
        | just send raw RTP packets (encrypted with SRTP/DTLS). There
        | is no need for any ACK packets. FEC can be done at the codec
        | level. What am I missing?
        
         | kixelated wrote:
          | Are you referencing this line?
          | 
          | > 2x the packets, because libsctp immediately ACKs every
          | "datagram".
          | 
          | That section is about data channels, which use SCTP and
          | are ACK-based. Yes, you can use RTP with NACK and/or FEC
          | with the media stack, but not with the data stack.
        
           | the8472 wrote:
            | TCP can coalesce ACKs for multiple packets; can't SCTP
            | do the same?
        
             | Sean-Der wrote:
              | SCTP can (and does); a SACK [0] isn't needed for each
              | DATA chunk.
             | 
             | [0]
             | https://datatracker.ietf.org/doc/html/rfc4960#section-3.3.4
        
             | kixelated wrote:
             | The protocol can do it, but libsctp (used by browsers) was
             | not coalescing ACKs. I'm not sure if it has been fixed yet.
        
         | pthatcherg wrote:
          | From a server or a native client, you can send whatever
          | RTP packets you want. But you cannot send whatever RTP
          | packets you want from a web client, nor can you access the
          | RTP packets from a web client and do whatever you want
          | with them, at least not very easily. We are working on an
          | extension to WebRTC called RtpTransport that would allow
          | for just that, but it's in the early stages of design and
          | standardization.
        
           | Sean-Der wrote:
           | That's so exciting! I had no idea you were working on this :)
           | 
           | Here [0] is a link for anyone that is looking for it.
           | 
           | [0] https://github.com/w3c/webrtc-
           | rtptransport/blob/main/explain...
        
       | gvkhna wrote:
        | Unfortunately Safari is still a major holdout on
        | WebTransport, with no clear update, even though the other
        | major browsers have supported it in GA for some time.
        
         | englishm wrote:
         | I don't know the timeline, but Apple has committed [1] to
         | adding WebTransport support to WebKit.
         | 
         | [1]: https://github.com/WebKit/standards-
         | positions/issues/18#issu...
        
           | kixelated wrote:
           | Eric Kinnear (linked post; Apple) is the author of the HTTP/2
           | fallback for WebTransport, so it's safe to say that
           | WebTransport will be available in WebKit at some point.
        
         | gs17 wrote:
         | Yeah, I have a project where WebTransport would be a huge
         | improvement, but not being able to support Safari is a
         | dealbreaker.
        
         | doctorpangloss wrote:
         | Google Chrome never implemented trailers, so no gRPC. Every
         | browser is guilty.
         | 
          | > I spent almost two years building/optimizing a partial
          | WebRTC stack @ Twitch using pion. Our use-case was quite
          | custom and we ultimately scrapped it, but your mileage may
          | vary.
         | 
         | So many words about protocols. Protocols aren't hard or
         | interesting. QA is hard. libwebrtc has an insurmountable lead
         | on QA. None of these ad-hoc things, nor WebRTC implementations
         | like Pion, will ever catch up, let alone be deployed in Mobile
         | Safari.
        
           | Sean-Der wrote:
            | I am not at Twitch anymore, so I can't speak to the
            | state of things today.
            | 
            | WebRTC was deployed and is in use. For twitch.tv you
            | have Guest Star [0].
            | 
            | You can use that same back end via IVS Stages [1]. I
            | always call it 'white-label Twitch', but it is more
            | nuanced than that! Some pretty big/interesting customers
            | were using it.
           | 
           | [0] https://help.twitch.tv/s/article/guest-
           | star?language=en_US
           | 
           | [1] https://docs.aws.amazon.com/ivs/latest/RealTimeUserGuide/
           | wha...
        
           | fidotron wrote:
            | I agree with the general thrust of this, but libwebrtc
            | tends to have Google-convenient defaults which are
            | distinctly non-obvious, such as the quality-vs-framerate
            | tradeoffs being tied to what Hangouts needs rather than
            | exposed as useful general-purpose hooks. (I hope they
            | have updated that API since I last looked.) Once you
            | know how to poke it to make it do what you want, which
            | tends to require reading the C++ source, it definitely
            | has a massive head start, especially in areas like low-
            | level hardware integration. Even bolting on ML inference
            | and so on is not hard. The huge point, though, is that
            | everyone knows everyone has to talk to libwebrtc at some
            | point, so it is the de facto implementation of SDP etc.
            | 
            | Curiously, I was working on a WebRTC project for about
            | 18 months which also hit the wall; since then, however,
            | I have learned of several high-profile data-only
            | libwebrtc deployments, which really just use it the way
            | classic libjingle was intended to be used: P2P, NAT
            | punching, etc. I'd go so far as to say that if you don't
            | have a P2P aspect to what you're doing with libwebrtc,
            | you're missing the point.
            | 
            | The big picture, though, is that there seems to be a
            | general denial of the fact that web-style CDNs and
            | multidirectional media streaming graphs are two totally
            | different beasts.
        
           | johncolanduoni wrote:
           | I think they'll catch up because libwebrtc is huge and can't
           | take advantage of most of the existing browser protocol
           | stack. WebTransport is a pretty thin layer over HTTP3, which
           | already gets used much more often than WebRTC's SCTP-over-
           | DTLS ever will. Not to mention the fact that it takes like
           | two orders of magnitude less code to implement WebTransport
           | than the whole WebRTC stack.
        
         | mehagar wrote:
         | It's being worked on now:
         | https://github.com/WebKit/WebKit/pull/17320
        
       | sansseriff wrote:
        | What's a good place to learn about all the networking jargon
        | in this article?
        | 
        | Also, a nitpicky point: as a community, it would be nice to
        | stop thinking of links to Wikipedia pages, MDN docs, or
        | GitHub issues as useful or pertinent forms of citations or
        | footnotes. If I'm immersed in an article and I come across a
        | term I don't know, like 'SCTP', is a link to some dense
        | standards-proposal webpage really the right way to provide
        | the background info I need?
        | 
        | Of course academia is guilty of this too. Canonical
        | citations don't tell the reader much beyond 'I know what I'm
        | talking about' and 'if you want to know more, spend 30
        | minutes reading this other article'.
        | 
        | But since we're web-based here, I think we can do better.
        | Tooltips that expand into a helpful paragraph about e.g.
        | SCTP would be a start.
        
         | Sean-Der wrote:
         | For the WebRTC jargon check out
         | https://webrtcforthecurious.com/
         | 
          | If that still doesn't cover enough, I would love to hear
          | about it! Always trying to make it better.
        
         | englishm wrote:
         | Here are a couple resources which may be helpful:
         | 
         | - https://www.mux.com/video-glossary
         | 
         | - https://howvideo.works/
         | 
         | Also, if you do want to take the time to read a longer and
         | denser article, but come away with an understanding of much of
         | the breadth of modern streaming tech, there's a really great
         | survey paper available in pre-print here:
         | 
         | https://doi.org/10.48550/arXiv.2310.03256
        
         | IKantRead wrote:
          | I really like _High Performance Browser Networking_ by
          | Ilya Grigorik. It's published by O'Reilly but is also free
          | online [0]. What's particularly great about it is that,
          | unlike most other networking books, it approaches the
          | topic from the browser/web developer perspective, which is
          | particularly helpful for WebRTC and generally applicable
          | to daily web dev work.
         | 
         | 0. https://hpbn.co/
        
       | seydor wrote:
        | > Back then, the web was a very different place. Flash was
        | the only way to do live media and it was a mess.
        | 
        | Not sure why people say that. WebRTC is far harder to make
        | work, and peer-to-peer is a black-hole-level CPU hog.
        
         | kixelated wrote:
         | I maintained the Flash video player at Twitch until I couldn't
         | take it any longer and created an HTML5 player. Flash was a
         | mess. :)
        
           | seydor wrote:
           | is html5 video part of webrtc?
        
             | yalok wrote:
             | no, they are not related
        
       | mehagar wrote:
       | An additional benefit of WebTransport over WebRTC DataChannels is
       | that the WebTransport API is supported in Web Workers, meaning
       | you can send and receive data off the main thread for better
       | performance.
        
         | kixelated wrote:
         | Absolutely!
         | 
         | I'm doing that in my implementation: the main thread
         | immediately transfers each incoming QUIC stream to a WebWorker,
         | which then reads/decodes the container/codec and renders via
         | OffscreenCanvas.
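          | 
          | The handoff is simple because streams are transferable (a
          | sketch; the relay URL and worker are placeholders):
          | 
          |     const wt = new WebTransport("https://relay.example");
          |     await wt.ready;
          |     const streams =
          |       wt.incomingUnidirectionalStreams.getReader();
          |     for (;;) {
          |       const { value: stream, done } = await streams.read();
          |       if (done) break;
          |       // zero-copy transfer to the worker
          |       worker.postMessage({ stream }, [stream]);
          |     }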
         | 
         | I didn't realize that DataChannels were main thread only.
         | That's good to know!
        
       | znpy wrote:
       | > The best and worst part about WebRTC is that it supports peer-
       | to-peer.
       | 
       | I hope P2P stays around in browsers.
        
         | englishm wrote:
         | > I hope P2P stays around in browsers.
         | 
         | I do, too.
         | 
         | There was a W3C draft [1] ~pthatcherg was working on for P2P
         | support for QUIC in the browser that could have maybe become a
         | path to WebRTC using QUIC as a transport, but I think it may
         | have been dropped somewhere along the line to the current
         | WebTransport spec and implementations. (If Peter sees this, I'd
         | love to learn more about how that transpired and what the
         | current status of those ideas might be.)
         | 
         | A more recent IETF individual draft [2] defines a way to do
         | ICE/STUN-like address discovery using an extension to QUIC
         | itself, so maybe that discussion indicates some revived
         | interest in P2P use cases.
         | 
         | [1]: https://w3c.github.io/p2p-webtransport/ [2]:
         | https://datatracker.ietf.org/doc/html/draft-seemann-quic-add...
        
       | Fischgericht wrote:
       | A wonderful write-up, thank you very much.
       | 
        | However, it needs to be said that with currently available
        | technologies, there is no need for a 5-second buffer.
        | 
        | Have a look at the amazing work of the OvenMedia team (not
        | affiliated). Using their stack with Low-Latency HLS (LLHLS),
        | I have easily reached an end-to-end latency of <500 ms from
        | the camera on site to the end user viewing it in the
        | browser, at 20 Mbit/s, using either SRT or RTMP for stream
        | upload.
        | 
        | I understand that your huge buffers most likely come from
        | expecting your streamers to use RTMP over crappy links, so
        | you already need to buffer THEIR data. Twitch really should
        | invest in supporting SRT. It's supported in OBS.
       | 
       | Anyway:
       | 
       | Once you have the stream in your backend, the technology to have
       | sub-second latency live streaming using existing web standards is
       | there.
       | 
       | https://airensoft.gitbook.io/ovenmediaengine/
       | 
       | But all of this being said: What you are doing there is looking
       | amazing, so keep up the good work!
        
         | kixelated wrote:
         | Glad you liked it!
         | 
         | It's really difficult to compare the latency of different
         | protocols because it depends on the network conditions.
         | 
         | If you assume flawless connectivity, then real-time latency is
         | trivial to achieve. Pipe frames over TCP like RTMP and bam,
         | you've done it. It's almost meaningless to compare the best-
         | case latency.
         | 
         | The important part is determining how a protocol behaves during
         | congestion. LL-HLS doesn't do great in that regard; frankly it
         | will perform worse than RTMP if that's our yardstick because of
         | head-of-line blocking, large fragments, and the playlist in the
         | hot path. Twitch uses a fork of HLS called LHLS which should
         | have lower latency, but we were still seeing 3-5s in some parts
         | of the world.
         | 
         | But yeah, P90 matters more than P10 when it comes to latency.
         | One late frame ruins the broth. A real-time protocol needs a
         | plan to avoid queues at all costs and that's just difficult
         | with TCP.
        
       | moffkalast wrote:
       | > The core issue is that WebRTC is not a protocol; it's a
       | monolith.
       | 
       | > The WebRTC media stack is designed for conferencing and does an
       | amazing job at it. The problems start when you try to use it for
       | anything else.
       | 
          | The main problem with WebRTC is that it's designed to be
          | overcomplicated and garbage on purpose: a way to leverage
          | UDP for video streaming without any possibility of anyone
          | ever sending an actual UDP packet, so that it's not
          | possible to turn every mildly popular website into a god-
          | tier DDoS botnet.
          | 
          | WebRTC is what we get when we can't have nice things.
        
         | englishm wrote:
         | I very much disagree with the characterization of WebRTC being
         | "designed to be overcomplicated and garbage on purpose" but I
         | do think your point about needing to not open the door to DDoS
         | botnet capabilities is something worth highlighting.
         | 
         | There are a number of very challenging constraints placed on
         | the design of browser APIs and this is one of them that is
         | often grappled with when trying to expose more powerful
         | networking capabilities.
         | 
         | Another that's particularly challenging to deal with in the
         | media space is the need to avoid adding too much additional
         | fingerprinting surface area.
         | 
         | For each, the appropriate balance can be very difficult to
         | strike and often requires a fair bit of creativity that can
         | look like "complication" without the context of the constraints
         | being solved for.
        
       | mosfets wrote:
        | This is not going to replace the P2P functionality of
        | WebRTC, right?
        
       ___________________________________________________________________
       (page generated 2023-10-30 23:00 UTC)