[HN Gopher] How to build large-scale end-to-end encrypted group ...
       ___________________________________________________________________
        
       How to build large-scale end-to-end encrypted group video calls
        
       Author : jiripospisil
       Score  : 147 points
       Date   : 2021-12-15 20:06 UTC (2 hours ago)
        
 (HTM) web link (signal.org)
 (TXT) w3m dump (signal.org)
        
       | johnisgood wrote:
       | Great, now they should just stop using telephone numbers as
       | identifiers.
        
         | maxwell wrote:
         | What do you suggest?
        
           | sam_lowry_ wrote:
           | A login+password, like in IRC.
        
             | tptacek wrote:
             | IRC tracks metadata serverside!
        
               | johnisgood wrote:
               | I do not think that OP was referring to implementing it
               | the same, or even similar way, but to use a
               | username/password pair. OP is free to correct me if I am
               | wrong though.
        
           | zamadatix wrote:
           | Signal has had standard usernames on the roadmap for years.
        
           | johnisgood wrote:
           | Usernames work. You could even use UUIDs these days as QR is
           | an increasingly common way of sharing data. But yeah,
           | usernames would be a great improvement.
        
             | tptacek wrote:
             | Usernames _do not just work_. The Signal team is not
             | unaware of usernames and Signal is not a weird scheme to
             | get all your phone numbers. The difference between Signal
             | and systems that use usernames (or email addresses) is that
             | Signal deliberately doesn't operate a serverside directory
             | or buddy list service. By contrast, other relatively
             | popular messengers essentially keep a plaintext database of
             | who talks to who on their service.
             | 
             | What phone numbers allow Signal to do is to piggyback off
             | the contact lists people already have on their devices.
        
               | kitkat_new wrote:
               | > is that Signal deliberately doesn't operate a
               | serverside directory or buddy list service.
               | 
               | how do people again discover each other on Signal most of
               | the time?
               | 
               | Anyways, nothing prevents Signal from creating its own
               | contact list within the app, perhaps bootstrapped from
               | the existing one
        
               | tptacek wrote:
               | They can do that, but then when you switch devices, you
               | lose your contact list. That's not what happens with the
               | built-in contact list.
               | 
               | This issue has been rehashed dozens of times on HN before
               | (use the search bar below) and has basically nothing to
               | do with the article.
        
               | kitkat_new wrote:
               | actually, the contact list could include the signal
               | identifiers
        
               | stormbrew wrote:
               | I mean, phone numbers also don't really "work." Do you
               | know how many old phone numbers I have in my phone's
               | contact list that aren't actually owned by the person
               | they're listed on anymore? Using signal I get "Person You
               | Knew 10 Years Ago Is On Signal!" notifications every now
               | and then and.. yeah I can assure you that's not them.
               | 
               | For example, I have literally 6 phone numbers in my phone
               | for my sister because every time she job hops she ends up
               | with a new number. I'm not even sure which one is
               | actually her.
               | 
               | Phone numbers are not permanent identities, any more than
               | usernames or email addresses are. There's no single
               | perfect answer to identity online and if there is, I'm
               | sorry, it's not a number that can be changed, stolen,
               | lost, etc.
        
               | [deleted]
        
       | remus wrote:
       | I don't know what their threat model is but it's interesting that
       | they don't seem too bothered about reducing metadata collection
       | potential on the server. I bet you could put together some pretty
       | interesting graphs of who is talking to who, how much they talk
       | and when.
        
         | tptacek wrote:
         | Their messaging substrate is Signal itself, for whatever that's
         | worth, so at least the signaling component of the system should
         | inherit the guarantees Signal already makes. But it's a good
         | question.
        
       | Naac wrote:
       | >> There is no off the shelf software that would allow us to
       | support calls of that size while ensuring that all communication
       | is end-to-end encrypted, so we built our own open source Signal
       | Calling Service to do the job
       | 
       | But wasn't there Jitsi? [0]
       | 
       | I think it's great we have competition among Free Software
       | projects so that both can improve. But sometimes I feel like
       | maybe duplicated efforts create two 5/10 solutions. Instead what
       | we really want is one 8/10 solution, or better.
       | 
       | [0] https://meet.jit.si/
        
         | estaseuropano wrote:
         | While I love Jitsi, I don't think it is E2E?
        
           | dest wrote:
           | AFAIK it's E2E for 1:1 video chats, but not when more are
           | there.
        
             | bilal4hmed wrote:
             | Jitsi does support e2ee for groups as well
             | https://jitsi.org/e2ee-in-jitsi/
        
               | [deleted]
        
           | Naac wrote:
           | AFAIK this _was_ a work in progress[0]. I am not sure what
           | the status of this is now.
           | 
           | [0] https://jitsi.org/blog/e2ee/
        
           | jkepler wrote:
           | I think Jitsi group calls can be end to end encrypted,
           | provided all participants use Chromium 83, per
           | https://jitsi.org/security/.
        
         | Vinnl wrote:
         | It's the first of the links where they say "When building
         | support for group calls, we evaluated many open source SFUs",
         | so I suppose it's either not one of the two with "adequate
         | congestion control", or is the one that did not reliably scale
         | past 8 participants?
        
         | landstrom wrote:
         | Daily.co has a developer friendly offering that accomplishes
         | this as well. Many offerings available and many reasons to not
         | take on this added complexity.
        
         | jcelerier wrote:
         | As much as I like Jitsi conceptually, it has consistently
         | performed much more poorly than Zoom starting from 5/6 ppl
        
         | skybrian wrote:
         | There is some duplication of effort but sometimes progress
         | happens via rewrites and that might actually be a faster way to
         | an 8/10 system than direct collaboration?
         | 
         | Also I think it's interesting to see how this builds on
         | Google's work (the googcc algorithm). Which of course builds on
         | previous open source work. The underlying technical
         | collaboration happens even with quite different organizational
         | goals and different codebases.
        
         | [deleted]
        
         | johnisgood wrote:
         | There is also https://jami.net/. I have no clue how group video
         | calls are implemented though. It seems like it is not an easy
         | thing to do.
         | 
         | https://wire.com/en/ seems to support it, too, although not
         | exactly "large-scale". Audio calls allow for up to 100
         | participants, for one.
        
       | 1vuio0pswjnm7 wrote:
       | "Full mesh: Each call participant sends its media (audio and
       | video) directly to each other call participant. This works for
       | very small calls, but does not scale to many participants. Most
       | people just don't have an Internet connection fast enough to send
       | 40 copies of their video at the same time.
       | 
       | Server mixing: Each call participant sends its media to a server.
       | The server "mixes" the media together and sends it to each
       | participant. This works with many participants, but is not
       | compatible with end-to-end encryption because it requires that
       | the server be able to view and alter the media.
       | 
       | Selective Forwarding: Each participant sends its media to a
       | server. The server "forwards" the media to other participants
       | without viewing or altering it. This works with many
       | participants, and is compatible with end-to-end-encryption."
       | 
       | Imagine an end user who is interested in "very small calls" with
       | friends and family. She is not interested in communicating to an
       | infinitely large audience ("broadcasting"). She never has group
       | calls on Signal with 40 people. We have to use our imagination
       | because this user does not actually exist.
       | 
       | The imaginary user reads this blog post and she thinks to herself
       | "Full mesh sounds like the best design. There is less/no reliance
       | on a third party, traffic does not need to be sent to a third
       | party server." With full mesh, there is no need to mention the
       | caveat "without viewing or altering it" (or selectively choosing
       | not to forward it to certain recipients). Full mesh seems to give
       | the user the most control and require the least dependence on
       | third party servers (not necessarily none, but the least).
       | 
       | Then she reads this line: "Because Signal must have end-to-end
       | encryption _and scale to many participants_, we use selective
       | forwarding."
       | 
       | The make-believe user wonders "Why must Signal scale to many
       | participants?" For this user, "scal[ing] to many participants"
       | appears to be an artificial constraint. She has no such need.
       | "Perhaps Signal is not designed for users like me. Maybe Signal
       | is trying to compete with Facebook, TikTok, Zoom, etc. Signal is
       | supposedly non-commercial and should be free from such pressures
       | to compete. Does this mean that if I make a call to two people,
       | the traffic has to be sent to third party servers so they can
       | "forward" the audio/video to the appropriate recipients?"
       | 
       | "Why can't I be the one to choose at run-time whether full mesh
       | or selective forwarding is used."
       | 
       | Finally she comes to her senses. "This blog post was not written
       | for me. It seems to be a form of show and tell by the people
       | working at Signal, not a bidirectional dialogue with Signal users."
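
The bandwidth argument in the quoted passage can be made concrete with a little arithmetic. The following is a minimal sketch, not anything from Signal's code: the function names are hypothetical, and the 1500 kbps per-stream video bitrate is an assumed figure chosen only for illustration.

```python
# Hypothetical helper names; the per-stream bitrate is an assumption.
ASSUMED_VIDEO_KBPS = 1500


def upload_streams(participants: int, topology: str) -> int:
    """Copies of its own media that one participant has to upload."""
    if participants < 1:
        raise ValueError("need at least one participant")
    if participants == 1:
        return 0  # nobody to send to
    if topology == "full_mesh":
        # One copy goes directly to every other participant.
        return participants - 1
    if topology in ("server_mixing", "selective_forwarding"):
        # A single copy goes to the server, which fans it out.
        return 1
    raise ValueError(f"unknown topology: {topology}")


def upload_kbps(participants: int, topology: str,
                stream_kbps: int = ASSUMED_VIDEO_KBPS) -> int:
    """Total uplink one participant needs, in kbps."""
    return upload_streams(participants, topology) * stream_kbps


if __name__ == "__main__":
    for size in (3, 8, 40):
        mesh = upload_kbps(size, "full_mesh")
        sfu = upload_kbps(size, "selective_forwarding")
        print(f"{size:>2} participants: full mesh {mesh} kbps, "
              f"selective forwarding {sfu} kbps")
```

Under these assumptions a 3-person call costs each sender two uploads in a full mesh, which is why a mesh can look attractive for small family calls, while a 40-person call costs 39 uploads per sender against a single upload with selective forwarding.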
        
         | prophesi wrote:
         | Just an FYI full mesh would still require communicating with a
         | third-party server, at the very least for initial networking
         | when joining/leaving a group call.
         | 
         | The whole point of E2E encryption is so that passing data
         | through a third party shouldn't matter in the first place.
         | 
         | And lastly, even when you have just a 1:1 video chat, sending
         | and receiving full resolution/quality multimedia can still be
         | way too much for some people's internet connections. UX is
         | extremely important for Signal, as unreliable video chat is a
         | surefire way for those less caring about privacy to hop back
         | over to a privacy-violating alternative.
         | 
         | I feel sorry for those working on bringing security/privacy to
         | everyone, as they have to appease power users and privacy
         | absolutists, along with one's grandmother and the TikTok
         | generation.
        
       | sneak wrote:
       | They have the bandwidth for relaying video streams to 40 people
       | but won't let me send full res jpegs in 1:1 messages?
       | 
       | And no, I can't just rebuild my client, because I'm on iOS and
       | non-official builds won't receive push notifications from the
       | official developers.
        
         | Vinnl wrote:
         | That's not really related to this article, but I can select
         | photo quality if I send a photo on Android. Appears to have
         | been added in May.
        
           | sneak wrote:
           | The article specifically mentions that they operate the
           | infrastructure for relaying encrypted video streams for up to
           | 40 participants.
           | 
           | I can also select media quality on iOS. My options are
           | "compressed way too much" and "compressed too much". I assume
           | you have the same options.
           | 
           | I would like to be able to attach images as files and have
           | them come through unmodified. It is a general purpose
           | communications tool, it should not be editorializing over my
           | attachments.
           | 
           | I use Signal to communicate privately with my attorney. Why
           | does anyone think tampering with evidence in transit is okay?
           | 
           | Apple also doesn't support open source in the App Store, so I
           | can't fix the problem myself.
        
       | wyager wrote:
       | How does signal get money to cover costs of running compute-
       | intensive services?
        
         | sandstrom wrote:
         | They recently added support for in-app donations:
         | https://www.theverge.com/2021/12/2/22814934/signal-launches-...
         | 
         | I hope they'll take it a step further and require payment for
         | certain functionality (maybe video calls?, or desktop client
         | support?).
        
         | keewee7 wrote:
         | One of the WhatsApp founders, Brian Acton, donated $100
         | million to them as an unsecured loan due to be repaid in 2068:
         | 
         | https://en.wikipedia.org/wiki/Brian_Acton#Signal
         | 
         | https://en.wikipedia.org/wiki/Signal_(software)#Developers_a...
        
           | sorenjan wrote:
           | How long does that last? Telegram uses a few hundred million
           | dollars each year, although they are significantly larger.
           | 
           | > As Telegram approaches 500 million active users, many of
           | you are asking the question - who is going to pay to support
           | this growth? After all, more users mean more expenses for
           | traffic and servers. A project of our size needs at least a
           | few hundred million dollars per year to keep going.
           | 
           | https://t.me/durov/142
        
             | new_stranger wrote:
             | > needs at least a few hundred million dollars per year to
             | keep going
             | 
             | I'm pretty sure that is not server cost. This is probably
             | the standard approach of companies hiring tons of personnel
             | and spending tens of thousands or hundreds of thousands on
             | ads every single day.
        
       | benlivengood wrote:
       | To scale to thousands (is this even useful?) of e2e users build a
       | tree of participants who can remix each other's video.
       | 
       | Pick a handy mixing ratio like 4:1 or 9:1 (a square helps, since
       | they compose nicely if downscaled to a grid vs. active talker
       | stays fullscreen) and nodes with the highest bandwidth and lowest
       | latency take M-1 streams and add it to their own to make an M:1
       | mix which can be forwarded to a node closer to the root which
       | produces another M:1 stream, and the root sends a single mixed
       | stream down the tree until every participant has the mix. Max
       | bandwidth at each node is M down and M up. Minimal spanning tree
       | with max M edges per node recomputed as participants leave and
       | join. Build 3 or 4 distinct trees and leave the connections open
       | for more rapid switching if intermediate nodes stop
       | participating.
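
How deep such a mixing tree gets can be estimated with a short calculation. This is a rough sketch under one reading of the comment above, namely that every participant is a node and each node mixes its own stream with up to M-1 child streams; the function name and that interpretation are assumptions, not part of the original proposal.

```python
def tree_depth(participants: int, mix_ratio: int) -> int:
    """Smallest depth (root = 0) of a tree that holds all participants
    when each node has at most mix_ratio - 1 children."""
    if participants < 1:
        raise ValueError("need at least one participant")
    if mix_ratio < 2:
        raise ValueError("mix_ratio must be at least 2")
    children = mix_ratio - 1
    capacity = 1    # the root on its own
    level_size = 1  # number of nodes on the deepest full level
    depth = 0
    while capacity < participants:
        level_size *= children
        capacity += level_size
        depth += 1
    return depth


if __name__ == "__main__":
    for ratio in (4, 9):
        print(f"{ratio}:1 mixing, 1000 participants: "
              f"depth {tree_depth(1000, ratio)}")
```

Each level is another decode, mix and re-encode hop on the way to the root and back down, so the depth gives a feel for how added latency might grow with call size. With 1000 participants this model gives a depth of 6 for 4:1 mixing and 4 for 9:1 mixing.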
        
       | JoeAltmaier wrote:
       | Oh this all brings back memories, of Sococo in the 2000's. We
       | faced all these problems and had similar solutions to them all.
       | 
       | We even had a rapidly adapting network make-and-break recovery
       | layer. You unplug your laptop from a wired connection, switch to
       | wireless - we recovered in milliseconds. You heard barely a
       | click.
       | 
       | The encryption issue is fun - we had a rotate-key message in-
       | band. The receiver loaded new keys and tried them in sequence to
       | ease the turnover time - out-of-order packets etc could make it
       | ambiguous for a short while which key to use. A cache and aging
       | keys out made it work pretty well.
       | 
       | Remixing on user stations proved to be problematic (mentioned
       | elsewhere on this thread). You'd think if 6 people at one site
       | were conferencing with a dozen elsewhere, you could elect one at
       | each site to mix-and-forward. But corporate networks made it hard
       | to determine who was 'adjacent' - they were often layered and
       | without UPnP (is that what the router protocol is called?) you
       | couldn't tell if somebody at the next desk was even in your
       | company.
       | 
       | We had up to 100 people in a conference, and our enter-the-
       | conference time was on the order of 100ms. Click into an all-
       | hands, and be able to hear everybody before your finger left the
       | mouse button. It was wonderful.
       | 
       | Sococo today is a sad shadow of that. They went open-source and
       | lost all our IP instantly. Just another WebRTC client last I
       | knew.
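
The key-rotation approach described in this comment (keep a small cache of recent keys, try them in sequence, age old ones out) might look roughly like the sketch below. It is not Sococo's implementation: the class and function names are hypothetical, and an HMAC-SHA256 tag check stands in for "this key decrypts the packet", where a real system would use an authenticated cipher.

```python
import hashlib
import hmac
from collections import deque

TAG_LEN = 32  # size of a SHA-256 digest in bytes


def seal(key: bytes, payload: bytes) -> bytes:
    """Sender side of the stand-in scheme: append an HMAC tag."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


class KeyRing:
    """Receiver-side cache of the most recent keys, newest first."""

    def __init__(self, max_keys: int = 3):
        # A bounded deque drops the oldest key when a new one is
        # pushed on the left, which gives the aging behaviour.
        self._keys = deque(maxlen=max_keys)

    def rotate(self, new_key: bytes) -> None:
        """Handle a rotate-key message by loading the new key."""
        self._keys.appendleft(new_key)

    def open(self, packet: bytes):
        """Try cached keys newest first; return the payload or None."""
        if len(packet) < TAG_LEN:
            return None
        payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
        for key in self._keys:
            expected = hmac.new(key, payload, hashlib.sha256).digest()
            if hmac.compare_digest(expected, tag):
                return payload
        return None  # key unknown or already aged out


if __name__ == "__main__":
    ring = KeyRing(max_keys=2)
    ring.rotate(b"old-key")
    ring.rotate(b"new-key")
    # A packet sealed before the rotation arrives late and still opens.
    print(ring.open(seal(b"old-key", b"late audio frame")))
```

Trying the newest key first keeps the common case cheap, and the bounded cache limits how long a retired key stays usable once packets sealed with it have stopped arriving.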
        
         | narush wrote:
         | > They went open-source and lost all our IP instantly.
         | 
         | Can you explain what this means? Like - other people copied
         | your work?
         | 
         | Genuinely wondering, OSS noob here...
        
           | JoeAltmaier wrote:
           | There was little or nothing in WebRTC to match what we'd
           | spent 5 years creating. So they were back to 1-5 people in a
           | conference, with 1-3 second connect times, and no resilience
           | to network changes.
           | 
           | The excuse they gave was "We can't rely on 6 people in Iowa
           | for our core IP". So they switched to some open source mix
           | node that was the pet project of 2 guys in Italy. Two
           | academics, who gave it hardly any attention. And it had zero
           | IP; just a collection of APIs stitched together to give you
           | the impression of having a mix node.
           | 
           | We said all that at the time. But such was the power of the
           | magic words "Open Source" that it all bounced off their
           | mental shields.
        
       | BitPirate wrote:
       | Are there any plans to add VP9 support?
        
       | kitkat_new wrote:
       | Next step: decentralizing encrypted group calls [0]
       | 
       | [0]: https://2021.commcon.xyz/talks/extending-matrix-s-e2ee-calls...
        
       ___________________________________________________________________
       (page generated 2021-12-15 23:00 UTC)