[HN Gopher] How Google Meet's noise cancellation works
       ___________________________________________________________________
        
       How Google Meet's noise cancellation works
        
       Author : theanirudh
       Score  : 198 points
       Date   : 2020-06-09 16:29 UTC (6 hours ago)
        
 (HTM) web link (venturebeat.com)
 (TXT) w3m dump (venturebeat.com)
        
       | xchaotic wrote:
       | I am still waiting until AI reaches the ultimate in noise
       | cancelling- that meeting could have been an email. AI will
       | automatically send meeting cancellation and most likely meeting
       | notes.
        
       | jokoon wrote:
       | Weirdly, it seems the simplest phones already solved this
       | problem a long time ago.
       | 
       | Seems like over-engineering. The issue is either with the
       | microphone, with the hi-def stuff or something else.
       | 
       | No normal phone ever had an inch of a problem, so I'm really
       | confused why computers have this issue.
        
       | bigtones wrote:
       | I was not so impressed with this demo - especially when he was
       | scrunching his potato chip packet, the degradation in his voice
       | quality made it almost impossible to understand what he was
       | saying and his voice sounded very synthesized and processed, and
       | that's through a $200 Yeti professional microphone. Seems like
       | some of the other noise cancellation technology options from
       | Nvidia RTX and others are more effective.
        
         | jack_pp wrote:
         | However that wasn't a realistic scenario, you are unlikely to
         | talk while scrunching plastics, ideally you should get muted
         | 100% when you do it.
         | 
         | Only way this could happen is if someone is standing 3 feet
         | away from you and does it while you talk which would just be
         | rude and probably be stopped by you immediately anyway.
         | 
         | I'm more curious how this could work in the metro or with a
         | washing machine nearby
        
         | SquareWheel wrote:
         | Yeti microphones pick up _everything_. If anything it would
         | make the test more difficult.
        
         | vkou wrote:
         | You are correct. The sound quality of Nvidia RTX is amazing,
         | compared to this.
         | 
         | Unfortunately, you need a $400 graphics card, and >100 watts of
         | power to run RTX...
        
           | sp332 wrote:
           | Unofficially, it's pretty easy to get it to run on non-RTX
           | Nvidia GPUs.
           | https://www.pcgamer.com/nvidia-rtx-voice-performance/
        
             | bhauer wrote:
             | Not only that, but even when running RTX Voice on a GTX,
             | the amount of GPU horsepower used is minuscule; it hardly
             | even registers. The fans on my GTX 9-series GPU don't turn
             | on when running RTX Voice.
        
               | bentcorner wrote:
               | +1. I'm using it on a 1060 and I've monitored sensors
               | with OpenHardwareMonitor while RTX Voice is running. The
               | only observable effect is that my GPU's clocks are turned
               | up, there isn't even any significant load.
        
           | cecja wrote:
           | My RTX 2080ti uses 5 Watt idle and 15 Watt with RTX Voice.
           | Stop spewing FUD.
        
         | crazygringo wrote:
         | Microphone quality has very little to do with it. Noise
         | separation is just an incredibly hard problem, particularly
         | when a noise is loud. Scrunching potato chips, there's no
         | scenario where his voice won't become degraded unless you can
         | isolate the scrunching sound separately (microphone beamforming
         | can help here, but is still never perfect).
         | 
         | Running this economically on servers at scale in realtime, I
         | consider this very impressive. I can't say how it compares with
         | RTX, but I wonder if it has anything to do with the amount of
         | computing resources that can be dedicated to it. A single
         | expensive card dedicated to one audio stream, versus a single
         | Google server that needs to process hundreds (thousands?) of
         | audio streams.
        
       | miki123211 wrote:
       | To all those here who complain about algorithms messing with
       | their audio when they don't want them to. Use an app called
       | TeamTalk. It lets you disable all that processing, so it works
       | great for high-quality music transmission etc. I have no
       | affiliation with them, I have been using it for a few years and
       | I'm very happy.
        
       | ben7799 wrote:
       | I've been doing online guitar lessons since Covid-19 started and
       | all these algorithms just suck hard for that. Even in a 1:1 call.
       | 
       | Two repeated notes and the noise cancellation just immediately
       | shuts you down... we've been using Zoom and luckily you can turn
       | all the audio processing off if you go in "Advanced" and enable
       | "Turn on original audio".
        
       | Vaslo wrote:
       | Or you could just mute your damn phones
        
       | pier25 wrote:
       | As someone with 2 dogs this is going to be a good reason to
       | switch to Meet whenever possible.
        
       | ruffrey wrote:
       | Serious question - what's the risk that someone with a high
       | pitched, outside-the-norm voice will get denoised? If it filters
       | out kids in the background, will kids no longer be able to use
       | google meet?
        
         | bradstewart wrote:
         | I haven't been able to confirm this, but I swear it happens to
         | my mom on Zoom. When I video chat my family on Zoom and she
         | isn't sitting directly in front of the laptop, her words rarely
         | come through. I can see her lips moving, I can hear my dad
         | grunting in agreement next to her--but she's silent. If they
         | switch places, I can hear my dad without issue.
         | 
         | I don't know if it's a combination of her cheap hardware or
         | what, but it's... odd.
         | 
         | EDIT: grammar
        
         | arielserafini wrote:
         | I think the key here is "in the background". I would assume if
         | you're speaking close to the microphone, with no other voices
         | going on, it will not filter anything.
        
       | [deleted]
        
       | arielserafini wrote:
       | I'd say this is more like a demo. From the "how it works" in the
       | title I was expecting to see some implementation details.
       | 
       | Edit: I had only watched the video. The article does indeed
       | contain a lot more detail.
        
         | notatoad wrote:
         | from the "venturebeat.com" domain though, this is about what i
         | would have expected.
        
           | stefan_ wrote:
           | This is not the time to be snarky, I think we should
           | congratulate the Google Meet team on placing this great
           | article, not to mention the obnoxious integrations into other
           | beloved Google products.
           | 
           | Can't wait to see what the Google _Duo_ team will come up
           | with in response. I mean, we saw the blog post on their great
           | new video codec (AOMedia Video 1 was it?) but I personally
           | felt it left much to be desired.
           | 
           | What happened to the Hangout guys? Are they still in this
           | one? Product middle management wants to be wooed.
        
             | Orphis wrote:
             | Duo is targeted to the general public and has E2E
             | encryption though, so cloud denoising is not a possibility.
        
             | mav3rick wrote:
             | Ah, the HN way: keep putting down technical achievement just
             | because it goes against a personal agenda.
        
       | jpalomaki wrote:
       | Might be interesting to train the model using user's own voice.
       | Maybe this would help filtering out co-workers in open office or
       | family members.
       | 
       | Maybe you could also use this personal model to hide very short
       | network interruptions. Other party could use this model to
       | constantly predict my next piece of audio and switch to
       | prediction in case packet is lost.
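
The idea above of covering a lost packet with a prediction can be illustrated with a much simpler stand-in. This is a minimal sketch, not the personal-model approach the comment proposes: it swaps the predictive model for plain "repeat the last good frame and fade it out" concealment. The function name `conceal`, the `decay` parameter, and the use of `None` to mark a lost packet are all assumptions made for illustration.

```python
def conceal(frames, decay=0.5):
    """Fill lost frames by repeating the last good frame at a fading gain.

    frames: list of sample lists; None marks a lost packet (an assumption
    of this sketch, not a real transport API).
    """
    out = []
    last = None   # most recent frame that actually arrived
    gain = 1.0    # fades toward silence while the loss continues
    for frame in frames:
        if frame is not None:
            last, gain = frame, 1.0
            out.append(list(frame))
        elif last is not None:
            gain *= decay
            out.append([s * gain for s in last])
        else:
            out.append([])  # nothing received yet, so nothing to repeat
    return out


received = [[1.0, -1.0], None, None, [0.2, 0.2]]
print(conceal(received))
```

A learned per-speaker predictor would replace the "repeat and fade" branch; the bookkeeping around it would likely stay about the same.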
        
       | crazygringo wrote:
       | Most people have no idea of the amount of incredibly advanced
       | signal processing that goes into echo cancellation and noise
       | cancellation in videoconferencing.
       | 
       | This post is on noise cancellation specifically, and it actually
       | has the potential to be a _huge_ step forward.
       | 
       | One of the big audio problems with group meetings is that the
       | background noise from each participant adds up, to a point where
       | it quickly becomes unbearable. For that reason, videoconferencing
       | generally only plays audio from one or two participants at most,
       | using a fairly simple estimation of whichever audio signal is
       | currently loudest. The problem is that this can make it really
       | hard to interrupt (people will literally not hear you), or tell
       | the difference between two people going "mm-hmm" versus the whole
       | group. If you've ever been in a group meeting where everybody
       | applauds something, this is why you _see_ everyone applauding but
       | only hear a smattering.
       | 
       | But if this noise cancellation really succeeds, it could be a
       | huge leap forward because audio cues and overlap will actually
       | work for the first time -- hearing the "mm-hmms", hearing
       | everyone pipe up, and so on. Videoconferencing will feel more
       | like an actual single shared audio environment, rather than the
       | kind of "walkie-talkie" effect it so often feels like now.
       | 
       | I'm really looking forward to this.
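
A minimal sketch of the "play only the one or two loudest participants" behaviour described in this comment, assuming loudness is estimated as the RMS level of each participant's current audio frame. The names `rms`, `pick_loudest`, `mix`, and the `max_active` parameter are hypothetical; no conferencing product is claimed to work exactly this way.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples (floats in [-1, 1])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def pick_loudest(frames_by_participant, max_active=2):
    """Return the ids of the loudest participants, loudest first."""
    ranked = sorted(
        frames_by_participant,
        key=lambda pid: rms(frames_by_participant[pid]),
        reverse=True,
    )
    return ranked[:max_active]


def mix(frames_by_participant, active):
    """Sum only the active participants' frames, sample by sample."""
    length = max((len(frames_by_participant[p]) for p in active), default=0)
    out = [0.0] * length
    for p in active:
        for i, s in enumerate(frames_by_participant[p]):
            out[i] += s
    return out


frames = {
    "alice": [0.5, -0.5, 0.5],
    "bob": [0.1, 0.1, -0.1],
    "carol": [0.01, 0.0, -0.01],
}
active = pick_loudest(frames)
print(active, mix(frames, active))
```

With this kind of selection, carol's quiet "mm-hmm" never reaches the mix, which is the effect the comment complains about.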
        
         | amelius wrote:
         | Hasn't the problem been solved decades ago, with car-kits?
         | 
         | It seems that what's old is new again ...
        
           | qmmmur wrote:
           | No, not even close. The problem space is still mostly
           | unsolved.
        
             | krapht wrote:
             | Of all companies poised to solve it, I think Apple could do
             | it if they wanted. They could embed microphones in the
             | laptop frame and integrate it with the camera as a premium
             | feature, since they control the hardware and software
             | stack.
             | 
             | It's more practical than a touch bar, at least.
        
           | throwaway9482 wrote:
           | What's a car-kit? Just googled but didn't find anything
           | relevant
        
             | amelius wrote:
             | Those adapters that let you use your phone through the
             | audio system of your car (adapter does the noise-
             | cancelling).
        
               | throwaway9482 wrote:
               | Ah, I see. Never heard of them.
        
           | pathseeker wrote:
           | No, as evidenced by the fact that you can almost always hear
           | when a participant calls in from their car. When they start
           | talking, you hear the road noise in the background.
           | 
           | The only thing car-kits seemed to do was add minimum cut-offs
           | before transmitting and make use of directional microphones.
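
The "minimum cut-offs before transmitting" mentioned here are usually called a noise gate. A minimal sketch under stated assumptions: frames are lists of float samples, and the `threshold` and `hold_frames` values are illustrative, not taken from any real car-kit.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate(frames, threshold=0.05, hold_frames=2):
    """Pass frames at or above the threshold; replace the rest with silence.

    After a loud frame the gate stays open for `hold_frames` more frames,
    so word endings are not chopped off.
    """
    out = []
    hold = 0
    for frame in frames:
        if frame_rms(frame) >= threshold:
            hold = hold_frames
            out.append(list(frame))
        elif hold > 0:
            hold -= 1
            out.append(list(frame))
        else:
            out.append([0.0] * len(frame))
    return out
```

This also shows why road noise is still audible while someone talks: the gate only decides whether to transmit, it does not separate voice from noise.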
        
           | hn_throwaway_99 wrote:
           | What do you mean by car-kits?
        
         | microcolonel wrote:
         | Any thoughts on spatialization or panning? I feel like it could
         | help a lot, but also making it a good experience could involve
         | head tracking, since most people are (hopefully) not accustomed
         | to speaking in a different direction than the person they're
         | speaking to.
        
           | squeaky-clean wrote:
           | I used to use panning back when I played WoW "seriously" and
           | did 25 man raids, it makes hectic audio chat sooo much more
           | clear. Some gaming voice apps can dynamically pan voices in
           | 3D to match where the characters are, but I didn't use any of
           | that. Just simply putting Guild Leader center, Tanks ~25%
           | right, Healers ~25% left, and then randomly throwing everyone
           | else somewhere wider in the stereo field. It sounds like an
           | actual group of people rather than a single overlapping mono
           | mess.
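
The role-based panning described above can be sketched with a constant-power pan law. This is an illustration only: the mapping of roles to pan positions follows the comment (leader center, tanks about 25% right, healers about 25% left), while the function names and the choice of pan law are assumptions, not what TeamSpeak or Mumble actually do.

```python
import math


def pan_gains(pan):
    """Constant-power (left, right) gains for pan in [-1.0 left, 1.0 right]."""
    pan = max(-1.0, min(1.0, pan))
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def mix_stereo(voices):
    """voices: list of (mono_samples, pan). Returns a list of (left, right)."""
    length = max((len(samples) for samples, _ in voices), default=0)
    out = [[0.0, 0.0] for _ in range(length)]
    for samples, pan in voices:
        gain_left, gain_right = pan_gains(pan)
        for i, s in enumerate(samples):
            out[i][0] += s * gain_left
            out[i][1] += s * gain_right
    return [tuple(pair) for pair in out]


# Hypothetical role layout taken from the comment above.
ROLE_PAN = {"leader": 0.0, "tank": 0.25, "healer": -0.25}
```

Constant-power panning keeps each voice at roughly the same perceived loudness wherever it sits in the stereo field, which a simple linear crossfade does not.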
        
             | frosted-flakes wrote:
             | A podcast I listen to with 4-5 people talking in a room
             | does that with its audio. It sounds like a good idea, but
             | in practice, it drives me nuts and makes me feel like I
             | have plugged ears. I always enable mono audio when I listen
             | to it.
             | 
             | I think if it was dynamic, where turning my head towards
             | the person speaking balanced the audio (like in real life),
             | I would not have a problem with it. A super simple form of
             | virtual reality that would only require a simple head-
             | mounted gyroscope or motion sensor.
             | 
             | Another podcast I listen to has two people with very
             | similar voices, and I sometimes have a hard time figuring
             | out who's speaking, so I welcome any advancements in this
             | space.
        
             | tomlagier wrote:
             | How did you set this up?
        
               | squeaky-clean wrote:
               | It was built into whichever voice-chat software we were
               | using, just a simple right-click action. This was a long
               | time ago, so I don't totally remember, probably 2008-2012
               | or so? Trying to jog my memory with Google and I think it
               | was TeamSpeak and the "3D Sound" feature. I feel like
               | Mumble may also have been able to do this.
        
               | tomlagier wrote:
               | That's a really cool idea, thanks for the heads up.
               | Didn't realize it was possible but man would it make a
               | world of difference in meetings.
        
             | baq wrote:
             | i hope product managers of ms teams/zoom/meets are reading
             | this thread, this is pure gold right here
        
           | krapht wrote:
           | I dream of the day when our laptops come with an integrated
           | microphone array with automatic beamforming based on head
           | tracking.
        
             | lowdose wrote:
             | I wish Tinder came with the ability to beam me up.
        
             | microcolonel wrote:
             | I mean, automatic beamforming microphones are actually
             | fairly common now, in laptops. Head tracking is probably a
             | detour if your goal is just to get good clear voice input.
        
         | asdfman123 wrote:
         | This will be a game changer for online gaming, too (pun
         | intended, I guess). I don't even like playing games that
         | require headsets and teamwork because the background noise
         | makes my ears physically hurt after long enough.
        
         | rb808 wrote:
         | I wish everyone would just get a headset. It drives me nuts
         | when people call me on speakerphone, of course they never hear
         | the problem.
        
           | pas wrote:
           | Tell them you can't hear them due to noise, tell them to pick
           | up the phone or plug in a headset, etc.
        
         | the_af wrote:
         | > _The problem is that this can make it really hard to
         | interrupt (people will literally not hear you)_
         | 
         | This is driving me crazy with Google Meet in these COVID19
         | times. Even in a relatively small conference, I have a really
         | hard time interrupting someone to ask a quick question, even
         | when the speaker is expecting interruptions. It's always
         | "excuse me!"; delay as person continues speaking; I stop; the
         | other person says "yes, please ask away"; when I restart my
         | question the other person already assumed I've changed my mind
         | and continues speaking; repeat ad infinitum. And this is _if_
         | they even hear me over the audio breaking up.
         | 
         | It's very, very frustrating. If they solve this it would hugely
         | improve quality of life in remote conferencing for me.
        
           | 01100011 wrote:
           | It would be nice if there was a 'raise your hand' button
           | which put you in a queue to speak. Even better if it let you
           | take a quick note in case you forget what you wanted to say.
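
A minimal sketch of the "raise your hand" queue with an attached note, as suggested in this comment. The class and method names are hypothetical and do not correspond to any conferencing product's API.

```python
from collections import OrderedDict


class HandQueue:
    """First-come, first-served queue of raised hands with optional notes."""

    def __init__(self):
        self._hands = OrderedDict()  # participant -> note, in raise order

    def raise_hand(self, participant, note=""):
        # Re-raising only updates the note; assigning to an existing key
        # keeps the participant's original place in line.
        self._hands[participant] = note

    def lower_hand(self, participant):
        self._hands.pop(participant, None)

    def waiting(self):
        return list(self._hands)

    def next_speaker(self):
        """Pop and return (participant, note) for whoever raised first."""
        if not self._hands:
            return None
        return self._hands.popitem(last=False)


queue = HandQueue()
queue.raise_hand("A", "What about X?")
queue.raise_hand("B")
print(queue.next_speaker())
```

Keeping the note with the queue entry means the reminder is still there when the speaker finally calls on you.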
        
           | closeparen wrote:
           | I always feel this with Zoom. Interestingly it did not seem
           | to be an issue on a recent Discord call.
        
             | kenhwang wrote:
             | Discord targets gaming which absolutely prioritizes low-
             | latency. Zoom has a very noticeable amount of latency which
             | makes it really awkward to have multiple people talking at
             | the same time.
        
               | closeparen wrote:
               | Is Zoom picking some other point on an optimization
               | curve, and if so, what's more important to it?
               | 
               | Or is it just worse?
        
               | kenhwang wrote:
               | Zoom seems to be optimizing for bandwidth use, and by
               | extension, cost to them. Its typical use case is a shared
               | office internet connection.
               | 
               | Discord users are more likely to have a dedicated fast
               | internet connection, and Discord doesn't seem to care about
               | profitability at the moment.
               | 
               | It's just the difference in designing for a 100/10
               | connection to yourself vs sharing a 100/100 connection
               | with 20 other people. Zoom reasonably gracefully degrades
               | on choppy/slow connections while Discord becomes straight
               | up unusable.
        
               | matsemann wrote:
               | We're trying karaoke through Zoom tomorrow as a standup
               | gag, wonder how that will work with the delay haha.
        
               | notatoad wrote:
               | my perception with zoom (based only on use, not actual
               | knowledge of how it works) is that it has two modes: one
               | where it tries to isolate the speaker and auto-mute
               | everybody else, and another where it can't figure out who
               | the speaker is and just lets all audio through. so if
               | everybody on the call is singing together, it should all
               | come through.
        
               | gxqoz wrote:
               | Does anyone have more details on Zoom vs. Discord
               | latency? We've been experimenting with Zoom and Discord
               | for online trivia tournaments where if one participant
               | had better latency than another that would give a big
               | advantage. I'm sure that has to happen on any platform,
               | but if there's a bigger variance on one platform vs. the
               | other that would be good to know.
        
           | fossuser wrote:
           | Yeah, I wish there was a simple non-verbal option to signal
           | intent-to-talk.
           | 
           | I want to just be able to hit my self-view and have it have a
           | big icon on it or something so the person currently speaking
           | (and everyone else) can see that I want to say something.
           | (Maybe sort these in chronological order so the speaker can
           | see who wanted to talk first?)
           | 
           | In theory you could do this with a good chat, but for some
           | reason the chat in Zoom and the others is kind of an
           | afterthought and nobody uses it.
           | 
           | One of the reasons I prefer text based chat is multiple
           | people can talk at the same time without needing to deal with
           | interrupting audio. If you can type well, the bandwidth is
           | higher for group communication (and you get a log).
           | 
           | At least with video you can kind of tell when someone is
           | waiting to speak by seeing their expression. Audio only is
           | worse (but maybe wouldn't be, if you had good intent-to-speak
           | tools built into the app?)
        
             | erichurkman wrote:
             | We rolled out
             | https://chrome.google.com/webstore/detail/nod-reactions-
             | for-... to all of our devices. The quick 'raise hand'
             | button is great for what you're looking for.
        
             | ghaff wrote:
             | >At least with video you can kind of tell when someone is
             | waiting to speak by seeing their expression. Audio only is
             | worse (but maybe wouldn't be, if you had good intent-to-
             | speak tools built into the app?)
             | 
             | Which is one good reason to use video. At least with
             | smaller meetings, someone can raise their hand or just look
             | really pained. (Bigger meetings, you probably need to use
             | chat.)
        
             | kooshball wrote:
             | >Yeah, I wish there was a simple non-verbal option to
             | signal intent-to-talk.
             | 
             | this is a solved problem. webex has had a hand raise
             | feature for a decade now. trivial for google to just copy.
        
             | boogies wrote:
             | I really like how Jitsi Meet puts the hand raising/lowering
             | button right on the bottom bar, where there's just empty
             | black/white space in Zoom/Google Meet, and not buried
             | inside a menu labeled "Participants" (???), where it's a
             | hassle to access (Zoom).
        
             | llampx wrote:
             | Microsoft Teams has a "raise your hand" function that's
             | pretty handy.
        
           | crazygringo wrote:
           | If the speaker did hear you interrupt, then that's actually a
           | latency issue, not a noise/mixing issue.
           | 
           | When a conference call is made up of people all in the same
           | city on decent internet connections, latency is usually not a
           | big issue.
           | 
           | But when a conference call has people from New York, San
           | Francisco, and Japan on it, even if it's only 3 participants,
           | latency can be bad just because of the speed of light,
           | essentially (on top of what is otherwise reasonable
           | hardware/software latency). Latency may be bad even if you're
           | talking with a colleague in the same city, since the audio is
           | "mixed" on the server, and that server might be across the
           | world if a participant from across the world started the
           | meeting. (Counterintuitively, the latency with your local
           | colleague could be _twice_ as bad as with the colleague from
           | across the world.)
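
The "twice as bad" point can be checked with back-of-the-envelope arithmetic. The sketch below counts only propagation delay and assumes light in fiber covers about 200 km per millisecond (roughly two thirds of its speed in vacuum). The 10,000 km distance is a made-up example, and real calls add routing detours, codec delay, and jitter buffers on top.

```python
FIBER_KM_PER_MS = 200.0  # assumption: ~2/3 of the speed of light in vacuum


def propagation_ms(path_km):
    """One-way propagation delay over a fiber path of the given length."""
    return path_km / FIBER_KM_PER_MS


def via_server_ms(sender_to_server_km, server_to_listener_km):
    """Delay when audio is mixed on a server between sender and listener."""
    return propagation_ms(sender_to_server_km + server_to_listener_km)


# Hypothetical setup: you and a local colleague are both 10,000 km from the
# mixing server; the remote colleague sits right next to it.
local_colleague = via_server_ms(10_000, 10_000)
remote_colleague = via_server_ms(10_000, 0)
print(local_colleague, remote_colleague)
```

Under these assumptions your audio travels to the server and all the way back to reach the colleague next door, so that path is twice as long as the one to the colleague near the server.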
        
             | the_af wrote:
             | You're probably right! Though I've just experienced this
             | issue with three participants, all in the same city (not in
             | the US though). It's really annoying.
        
               | avianlyric wrote:
               | I've experienced some pretty horrific latency in Google
               | Meet that seemed to originate from my local device, where
               | only my connection would suffer from high latency.
               | 
               | Typical restart-all-the-things usually made it go away.
               | But it wasn't unusual for 500ms of latency to slowly
               | build up during a 30min call. Unfortunately I have
               | nothing more useful to add, the issue resolved itself
               | before I could track down a definitive cause.
        
               | bdamm wrote:
               | Bluetooth.
               | 
               | Stop using it.
               | 
               | Without even looking at your setup, I would bet $100
               | minimum that it's Bluetooth latency. It adds a lot of
               | latency, 500ms is not unusual, and many folks have no
               | idea that all that latency is really just the last 18
               | inches. This is why you're seeing more and more cases
               | where people are using good old iPhone wired earphones for
               | conference calling, especially when skyping a TV
               | interview.
        
               | close04 wrote:
               | This is also a problem of people not understanding that
               | audio conferences aren't just a regular conference but
               | with headphones.
               | 
               | There are a few things that most meetings could benefit
               | from. Having an organizer who's aware of the differences
               | between leading an in person meeting and a remote
               | meeting, cutting video to save bandwidth if the meeting
               | doesn't absolutely need it (the organizer can usually
               | just disable the function), _muting when you're not
               | speaking_ (by far _the best_ quality of life improvement,
               | can be done silently by the organizer if someone is just
               | doing their Vader impersonation throughout the meeting),
               | using the "raise hand" function (again, the organizer
               | plays a huge role here), using the native app instead of
               | the web one usually provides better quality and
               | performance, using a wired connection instead of wireless
               | if possible, sometimes even starting meetings at non-
               | standard hours (like 15 to/past the hour) helps avoid the
               | rush of people logging in at the same time, etc.
        
               | blackoil wrote:
               | You can check your Wi-Fi and try with an Ethernet cable.
               | Wi-Fi has a tendency to add unpredictable latency.
        
               | dddddaviddddd wrote:
               | This applies to everyone on the call
        
           | jmole wrote:
           | The biggest reason for this is people not using headsets. If
           | someone is just using their laptop speakers and mic, Meet
           | will prioritize the mic if they're talking and will duck any
           | audio that comes through the speakers.
        
             | jonpurdy wrote:
             | 100%. I wrote a post about this (among other basic tweaks)
             | a couple of months ago: http://jonpurdy.com/2020/03/how-to-
             | improve-your-zoomskype-te...
             | 
             | I have some screenshots of waveforms showing laptop mic vs
             | headset, and the signal-to-noise ratio with the headset
             | destroys even good noise-cancelling using a laptop mic
             | that's farther away from one's mouth.
        
             | znpy wrote:
             | I have headsets and tried pretty much everything. there's
             | always background noise from me. I'm even using the audio
             | cable with my bluetooth (!) headsets, turning bluetooth
             | off.
             | 
             | I don't know what else I can do.
        
               | rob-olmos wrote:
               | I use a Plantronics Legend bluetooth headset, which is
               | pretty good at cutting out background noise. Tested with
               | a phone.
               | 
               | Cheaper bluetooth headsets seem to pick up everything
               | around them. Had that issue with a coworker where the
               | headset was worse than using the internal mic.
               | 
               | The biggest and most annoying issue, though, is consistent
               | Bluetooth disconnect/reconnect problems, even on different
               | macOS machines. Latest firmware and such. Pretty sure it's
               | not 2.4 GHz interference.
        
               | gxqoz wrote:
               | I've heard that the original Bluetooth standard is pretty
               | terrible for audio, especially for microphones. On
               | Windows PCs at least, old protocols can cause a bad
               | experience:
               | 
               | "Modern high-end Bluetooth headsets support AptX, an
               | audio codec compression scheme that offers better sound
               | quality. But AptX is only enabled if it's supported on
               | both the transmitter and receiver. When using a Bluetooth
               | headset with a PC, it only works if your PC's hardware
               | and drivers are compatible."
               | (https://www.howtogeek.com/354321/why-bluetooth-headsets-
               | are-...)
               | 
               | Not sure if this applies to a Mac though.
        
               | rectang wrote:
               | There can still be background noise, but if you're
               | wearing a headset there is not a feedback loop where
               | noise from the _other_ participants gets looped through
               | your mic and speakers.
               | 
               | Participants often don't realize that they're the culprit
               | when _somebody else_ sounds terrible.
        
             | cellularmitosis wrote:
             | So much this. I'm almost at the point of stating that echo
             | cancellation has done more harm than good, because we are
             | now in a situation where 80% of people have no idea that
             | wearing earbuds could make a tremendous difference in call
             | quality, and everyone just expects the software to
             | magically take care of it.
             | 
             | Sadly, the software does not just magically take care of
             | it. Anytime two people talk, a typical echo canceler just
             | starts decimating frequencies until both of them are
             | unintelligible.
             | 
             | Add in a couple of clueless teams who mount a camera/mic
             | against a conference room wall and introduce massive
             | amounts of room echo into the mix, and I'm at the point
             | where a conference call becomes an absolutely mentally
             | exhausting experience just trying to decipher what is being
             | said. I have no hope of contributing, because I can only
             | hear 2/3 of the syllables, and my brain is running on
             | overdrive trying to turn those back into words. By the time
             | I've figured out what they just said, they're half-way into
             | the next sentence. What a stressful hellscape.
             | 
             | Ironically, if we had no echo cancellation, it would force
             | everyone to use ear buds, and the average call quality
             | would be a lot better.
        
           | skybrian wrote:
           | Some folks find that switching from WiFi to Ethernet for your
           | home office can help:
           | 
           | https://www.jefftk.com/p/ethernet-is-worth-it-for-video-
           | call...
        
             | bdamm wrote:
             | 100% agree. Latency is a killer to natural conversation so
             | if you want to be your best on a conference call, no Wi-Fi
             | and no Bluetooth.
        
           | bentcorner wrote:
           | I think part of the problem is that the tooling and the
           | societal norms still need to evolve. The tooling is getting
           | there - Zoom/Teams (I don't know about Meets) have buttons to
           | communicate out-of-band beyond just text chat. We need to
           | have more of that, I imagine eventually we'll have a wide
           | range of ways to express ourselves (and customs/norms to
           | match). Although I don't know if that'll happen before most
           | people stop working from home.
        
           | jfim wrote:
           | > Even in a relatively small conference, I have a really hard
           | time interrupting someone to ask a quick question, even when
           | the speaker is expecting interruptions. It's always "excuse
           | me!"; delay as person continues speaking; I stop; the other
           | person says "yes, please ask away"; when I restart my
           | question the other person already assumed I've changed my
           | mind and continues speaking; repeat ad infinitum.
           | 
           | One way to solve this is to have the speaker name the person,
           | and then wait until that person speaks. For example, if
           | someone interrupts:
           | 
           |     Speaker: [Talks]
           |     Person A: Excuse me!
           |     Speaker: Yes, Mr. A? [waits]
           |     Person A: What about X?
           | 
           | Or if there are two people talking at the same time:
           | 
           |     Speaker: [Talks]
           |     Person A and B: Excuse me!
           |     Speaker: Yes, Mr. A? Mr. B, I'll come back to you after A.
           |         [waits]
           |     Person A: What about X?
           |     Speaker: [Talk about X]. Mr B, you were saying?
           |     Person B: What about Y?
           | 
           | Treat it as a synchronization problem, with the speaker
           | breaking the ties. As long as it's obvious to everyone whose
           | turn it is to speak, it works well (assuming people aren't
           | too rowdy/impolite).
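           | 
           | A minimal sketch of that synchronization idea, assuming a
           | hypothetical Floor class (not taken from any conferencing
           | product): requests queue in arrival order, and the speaker
           | hands the floor to one named person at a time.

```python
from collections import deque


class Floor:
    """Hypothetical floor-control helper: the speaker breaks ties."""

    def __init__(self):
        self.waiting = deque()   # people who said "excuse me", in order
        self.current = None      # who has been named and may talk now

    def request(self, person):
        # Simultaneous requests simply end up next to each other in the
        # queue; the speaker resolves the tie by naming one at a time.
        if person != self.current and person not in self.waiting:
            self.waiting.append(person)

    def name_next(self):
        # The speaker names the next person, then waits for them to talk.
        self.current = self.waiting.popleft() if self.waiting else None
        return self.current


floor = Floor()
floor.request("A")
floor.request("B")          # A and B interrupted at the same time
order = [floor.name_next(), floor.name_next(), floor.name_next()]
```

           | Here `order` ends with None once nobody is left waiting, which
           | is the speaker's cue to carry on talking.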
        
         | m463 wrote:
         | there are products that do this now with all kinds of apps:
         | 
         | https://www.nvidia.com/en-us/geforce/guides/nvidia-rtx-voice...
         | 
         | I think you can do it not only to your microphone (outgoing
         | audio), but to the other participants in the meeting (incoming
         | audio)
        
         | Terretta wrote:
         | See also https://krisp.ai
         | 
         | Mute your or participants' background noise in any
         | communication app
         | 
         | https://krisp.ai/technology/
        
         | dmos62 wrote:
         | That's exciting. Isn't this difficulty one of the reasons why
         | open-source VOIP clients are rare?
        
         | vorpalhex wrote:
         | > videoconferencing generally only plays audio from one or two
         | participants at most
         | 
         | I have noticed this and I _hate_ it. It makes normal
         | conversation absolutely impossible.
         | 
         | Discord, which is an audio first product, is much better than
         | other solutions in this regard and their video conferencing
         | while new has been very enjoyable to use.
        
         | ghaff wrote:
         | >Most people have no idea of the amount of incredibly advanced
         | signal processing that goes into echo cancellation and noise
         | cancellation in videoconferencing.
         | 
         | We pretty much take echo cancellation for granted at this
         | point. Using something better than your laptop microphone on a
         | call is still a good idea but I'm not sure that wearing
         | headphones/earphones is that big a deal at this point.
         | 
         | You don't need to go back _that_ far to find speakerphones, other
         | than very expensive Polycoms and the like, that were pretty
         | mediocre and kept cutting out because of echo.
        
       | neximo64 wrote:
       | Any battery life tests of this tech on phones?
        
         | kccqzy wrote:
         | None. Because the processing doesn't happen on a phone.
         | 
         | > When you're on a Google Meet call, your voice is sent from
         | your device to a Google datacenter, where it goes through the
         | machine learning model on the TPU, gets reencrypted, and is
         | then sent back to the meeting.
        
       | jdm2212 wrote:
       | I might be unusual, but my experience with videoconferencing has
       | been that ambient noise is rarely a major problem. The big issue
       | is audio cutting out due to a shaky network. When ambient noise
       | is a problem, it's not so much someone typing as their spouse
       | talking in the background or a fire engine going by -- and at
       | that point the solution is for them to hit mute.
        
         | gav wrote:
         | Most of the issues I have with ambient noise on calls could be
         | solved with people investing in a better headset. It's a
         | significant improvement over using your laptop's built in one.
         | 
         | The issue I find a bigger problem is lag causing people to talk
         | over one another. I've been on a lot of calls where the call
         | quality was fine but conversations were difficult because it
         | was hard to judge when the other person had stopped talking.
        
           | Wowfunhappy wrote:
           | All voip seems to have terrible latency and it's so
           | frustrating! Mumble does a _really_ good job with this and
           | has for years, why can't we get that in a mainstream
           | solution?
        
         | kyriakos wrote:
         | In my experience kids randomly talking is worse than any
         | constant background noise.
        
         | adrianmonk wrote:
         | > _at that point the solution is for them to hit mute_
         | 
         | From a technical point of view, that is really the best thing.
         | It works, and sometimes it's the only thing that works.
         | 
         | But if you try to get people to actually do it, you run into
         | problems:
         | 
         | (1) They don't realize it's them. AFAIK the system doesn't play
         | their audio back to them, so while everyone else hears the
         | noise, they don't. The one person who needs to take action is
         | the one person who doesn't know action is necessary.
         | 
         | (2) They are distracted. When their spouse is talking, they are
         | focused on whatever their spouse is saying, not on how it
         | affects the meeting audio. Or the meeting is boring and they're
         | not paying attention.
         | 
         | (3) They just don't care enough. They are there to attend a
         | meeting, not fiddle with computer stuff. Some people will never
         | take the time to learn where the mute button is in the
         | software.
         | 
         | Perhaps #1 could be improved, though, with some kind of
         | blindingly obvious indicator in the UI. If "YOUR MIC IS WHAT
         | EVERYONE IS HEARING RIGHT NOW" flashes when your mic takes the
         | floor, maybe you'd notice it lighting up when you didn't intend
         | for it to.
        
           | ShroudedNight wrote:
           | AWS Chime has significant drawbacks, but one of the things I
           | most liked about it was that anybody could mute anybody else.
           | The number of calls where that significantly cut down on
           | audio discomfort was surprising.
           | 
           | For those wondering, unmuting is a privileged operation that
           | only the user could do themselves.
        
           | dddddaviddddd wrote:
           | An attentive moderator can address all these issues, but it's
           | frustrating to be just a regular participant who can't mute
           | others.
        
       | trboyden wrote:
       | Not very well. Watched a Google Meet meeting for the neighbor's
       | Honor Society induction and the quality was horrible. Video kept
       | freezing and audio cut in and out. Was probably only about a
       | dozen attendees in the meeting room. Wasn't the neighbor's
       | connection either, they have a solid Fios 200/200 service.
        
         | thebeefytaco wrote:
         | Did you even read the article? This is talking about a new
         | noise cancelation feature being introduced today.
        
       | xeno42 wrote:
       | I've been using https://krisp.ai/ to great effect with Zoom while
       | sitting outside on the laptop with road traffic, birds, etc
       | nearby - My team really had a "wow" moment when I turned it on
       | the first time
        
         | BadassFractal wrote:
         | It would be amazing if there was a tool like Krisp that could
         | automatically noise cancel outside noise in your headphones for
         | people who work with audio in loud environments. Not clear if
         | that's at all possible without your headphones having
         | microphones built into them to accurately detect incoming
         | outside signal.
        
           | bradstewart wrote:
           | How is this different from noise cancelling headphones
           | currently available? Or do you mean something like this to
           | add the feature to non-noise cancelling headphones?
        
             | woofcat wrote:
             | >Not clear if that's at all possible without your
             | headphones having microphones built into them to accurately
             | detect incoming outside signal.
             | 
             | I'm guessing they mean to add the feature set to standard
             | headphones. Leveraging say the laptop microphone to provide
             | active noise canceling to someone with a standard set of
             | earbuds.
        
               | nuccy wrote:
               | Noise cancelling works by playing an inverted (phase-shifted)
               | copy of the noise that reaches your ears. The ups and downs
               | (of pressure) in the two sound waves add together,
               | cancelling the wave altogether. Each ear gets different
               | noise, so the microphones should be as close as possible
               | to each ear and work absolutely independently. That's why
               | the microphone of your laptop is not of any help here: it
               | simply picks up completely different noise, which cannot
               | cancel out the noise reaching your ears. This is more
               | physics than software.
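               | 
               | A small numerical sketch of why placement matters, assuming
               | a pure 200 Hz tone and a 48 kHz sample rate (both picked
               | for illustration): a perfectly inverted copy sums to
               | silence, while a copy that arrives half a period late
               | (2.5 ms, roughly 0.86 m of extra travel at 343 m/s) doubles
               | the noise instead.

```python
import math

RATE = 48_000   # samples per second (assumed)
FREQ = 200.0    # frequency of the noise tone in Hz (assumed)
N = 4800        # 0.1 s of audio, i.e. 20 full periods of the tone


def tone(delay_samples=0):
    """Sine tone, optionally delayed by a whole number of samples."""
    return [math.sin(2 * math.pi * FREQ * (n - delay_samples) / RATE)
            for n in range(N)]


def rms(signal):
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def residual(delay_samples):
    """RMS left at the ear when the anti-noise is late by delay_samples."""
    noise = tone()
    anti_noise = [-x for x in tone(delay_samples)]
    return rms([a + b for a, b in zip(noise, anti_noise)])


noise_level = rms(tone())
at_ear = residual(0)          # microphone at the ear: exact inverse
far_away = residual(120)      # 120 samples = half a period of 200 Hz
```

               | With these numbers `at_ear` is zero and `far_away` comes out
               | at twice `noise_level`, so a badly placed microphone can
               | make things worse rather than better.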
        
               | dmurray wrote:
               | With two different microphones on the laptop, you could
               | triangulate sources of noise and figure out what will
               | reach your ears. With three or more, even better. This
               | sounds like a difficult and interesting signal processing
               | problem, but I wouldn't rule it out.
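               | 
               | A rough sketch of the first step, assuming two mics 30 cm
               | apart, a 48 kHz sample rate and a single far-away source
               | (all illustrative values): cross-correlating the two
               | signals gives the arrival-time difference, which maps to a
               | bearing. It only estimates direction; predicting the
               | waveform at each ear is a much harder problem.

```python
import math
import random

SPEED_OF_SOUND = 343.0   # m/s at room temperature
RATE = 48_000            # samples per second (assumed)
MIC_SPACING = 0.30       # metres between the two mics (assumed)


def best_lag(a, b, max_lag):
    """Lag in samples by which b trails a (brute-force cross-correlation)."""
    def score(lag):
        return sum(a[i] * b[i + lag]
                   for i in range(max_lag, len(a) - max_lag))
    return max(range(-max_lag, max_lag + 1), key=score)


random.seed(0)
source = [random.uniform(-1, 1) for _ in range(2000)]   # broadband noise
true_lag = 20
mic_a = source
mic_b = [0.0] * true_lag + source[:-true_lag]   # same sound, 20 samples later

# Largest delay the geometry allows, rounded up to whole samples.
max_lag = int(MIC_SPACING / SPEED_OF_SOUND * RATE) + 1
lag = best_lag(mic_a, mic_b, max_lag)

# Far-field approximation: path difference = spacing * sin(angle).
ratio = SPEED_OF_SOUND * (lag / RATE) / MIC_SPACING
angle_deg = math.degrees(math.asin(max(-1.0, min(1.0, ratio))))
```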
        
               | StavrosK wrote:
               | It would also have to know where each of your ears is in
               | relation to the microphone with millimeter accuracy.
        
           | Kirby64 wrote:
           | It's not possible. The only reason ANC works is because the
           | microphones are located (physically) at your ears and so are
           | the speakers/headphone drivers. If they're in some random
           | location you can't inject anti-noise and you can't detect the
           | noise accurately.
        
         | meritt wrote:
         | Krisp is embedded into Discord (enable beta settings) and the
         | voice chat quality far exceeds any of the "business" focused
         | software I've ever used.
         | 
         | Not to mention the screensharing is infinitely better as well.
         | It's pretty pathetic of the business apps, we went through a
         | day where I was trying to screenshare something and my remote
         | coworkers kept complaining of lag, blurriness, or the app would
         | just crash (slack). We went through ms teams, zoom, slack, and
         | google meet. All had issues. Convinced everyone to install
         | Discord and suddenly I was able to share my desktop perfectly
         | at 1080p without noticeable lag and crystal clear audio.
        
           | the_pwner224 wrote:
           | +1
           | 
           | Discord's lack of lag in audio makes a huge difference for
           | voice comms. I've only used it for gaming, but you can really
           | tell the difference when you switch to the game's voice chat
           | feature which has probably a third of a second of latency.
           | And of course Zoom et al. have a lot more lag and it really
           | hurts the experience. In addition to low latency, the sound
           | is also very good quality.
        
           | Kirby64 wrote:
           | I will say, using Krisp, it has the same problem that
           | basically all these 'AI' based noise cancelling seem to
           | exhibit: sound quality deteriorates when outside noise is
           | suppressed, and people seem to sometimes not meet the
           | threshold and get completely cut out from talking in some
           | scenarios.
           | 
           | It's still better than food noises, but I have noticed that
           | as a disadvantage.
        
       | kemayo wrote:
       | > Google also made a conscious decision to put the machine
       | learning model in the cloud, which wasn't the immediately obvious
       | choice.
       | 
       | Oh good. Meet is already a _huge_ battery-hog on my laptop, so
       | adding fancy signal processing client-side was worrying me.
        
       | dekhn wrote:
       | What I'd really like to see is effective source separation and
       | nulling. For example, if you could mute the screaming baby in the
       | background of a VC speaker (this has been a fairly common
       | occurrence now that we are WFH and it's hard to get day care).
        
         | cyrux004 wrote:
         | I was really looking for the baby noise test in the demo, but I
         | guess for now it's human vs non-human cancellation?
         | 
         | Edit: apparently can also remove kids crying; just not included
         | in demo
        
       | david_draco wrote:
       | It would be fun if it canceled out screaming.
       | 
       | We somehow have this sexist social expectation that women who
       | show their feelings (crying, screaming) are "hysterical" (really
       | a nasty word) and not taken seriously. If so, men screaming
       | should be equally considered a sign of immaturity and lack of
       | self-control.
       | 
       | Also could help with customers ("Sorry, I can't hear you!").
        
       | skybrian wrote:
       | > A musical instrument will probably also get filtered out. "To a
       | pretty large degree, it does," Lachapelle said. "Especially
       | percussion instruments. Sometimes a guitar can sound very much
       | like a voice -- you're starting to touch the limits there. But if
       | you have music playing in the background, usually it'll cut it
       | all out."
       | 
       | This is a big issue with hearing aids. The whole industry is
       | focused on optimizing for voice intelligibility and as a musician
       | you end up doing trial-and-error with the audiologist to turn all
       | that stuff off.
       | 
       | We need more open source hearing aids - I've read of a few but
       | they're not mainstream.
        
         | aaronAgain wrote:
         | First, I'll say this to everyone: Get hearing aids if you need
         | them. They can change your world.
         | 
         | About music, this is getting much better in hearing aids. I've
         | been from analog thru digital over 15+ years of hearing aids,
         | and my latest (3 months ago) pair from Phonak (no affiliation)
         | is an honest leap forward. It has a built in Music profile that
         | disables all sound optimizations in general, while still
         | attempting to correct the hearing ranges that you have a
         | deficit in. I was on the verge of no longer being able to hear
         | with hearing aids, that has probably been extended by 3-5 years
         | with these new models. At that point I will be approaching
         | cochlear implant level hearing loss. I happily embrace my
         | cyborg future!
         | 
         | On top of Music, I have a Walking profile that attempts to
         | focus on the person that is walking to the left or right of me
         | and can pick which side on the fly. And they make great ear
         | plugs when things are loud.
         | 
         | The Normal program, auto-magically selects between 8'ish
         | profiles to pick the best one for the environment. And it has
         | finally got it right. Older models I would daily need to force
         | it into the best mode because it guessed wrong. The latest
         | model I only have to tell it what to do once every few weeks.
         | 
         | And to the original topic, noise cancellation, hearing aids
         | bluetooth'ed to the phone/PC for conference calls is hands down
         | the best possible audio experience. Built in noise
         | cancellation, amazing microphones that can be used for your
         | voice portion of the call, tuned to your hearing, with some of
         | the finest sound output possible. Just amazing. These things
         | are so good these days that they are finally being labeled as
         | assistive devices for people without hearing loss. They can
         | give someone with normal range hearing essentially bionic
         | hearing. Tinnitus? They play customized white noise to make the
         | ringing less noticeable. Doesn't help everyone, but it's really
         | nice for me. I hear more ringing when I take my aids out.
         | 
         | Oh, and it does all of this on a device that fits in your ear
         | with a battery the size of a few grains of rice and all in a
         | few milliseconds so your brain sees the mouth move at the same
         | time it actually hears the audio.
         | 
         | Again, get them if you need them.
        
           | skybrian wrote:
           | They have some great features and I agree that people should
           | get them (or upgrade), but they are not optimized for
           | musicians.
           | 
           | I just got a similar model I assume (M90-R) and it's
           | definitely not switching to music mode automatically when I
           | play music. (Maybe it's different for listening.) I just had
           | the audiologist add a music mode that I can switch to
           | manually, but getting acceptable timbre for the instruments I
           | play (accordion, melodica, and piano) is work in progress.
           | Making an expensive instrument sound like cheap trash is
           | disappointing, though of course I can take them out.
           | 
           | Having Bluetooth is nice, particularly for phone calls, but I
           | find the sound quality is unsatisfying for listening to
           | music, so it won't be replacing speakers for me.
        
         | tonystride wrote:
         | This is also a problem with teaching online piano lessons,
         | sometimes the piano gets filtered out and makes it hard to hear
         | what the student is doing.
        
           | secabeen wrote:
           | Zoom has a "use original audio" toggle that helps a lot with
           | this.
        
           | MengerSponge wrote:
           | I don't know if a similar option is available for other
           | products, but Zoom has a direct audio option:
           | https://support.zoom.us/hc/en-
           | us/articles/115003279466-Enabl...
        
       | pierrebai wrote:
       | As far as my experience goes, the single best way to deal with
       | background noise is... the mute button.
       | 
       | In every video conf I've been in, you can instantly tell when "one
       | of them" who can't be bothered to mute themselves joins. The
       | audio quality immediately goes down the drain. It's always the
       | same subset of people who do it, too. As soon as they're enjoined
       | to please mute, the audio quality is restored.
       | 
       | No amount of magic signal processing will ever match it.
       | 
       | While perhaps misguided to use it that way, the mute button thus
       | acts as a social-cluelessness meter.
        
         | leokennis wrote:
         | Wholeheartedly agree. I often Skype-call with a group of 10 and
         | we all have perfect mute-discipline. It's just like a regular
         | in person meeting, except there is just audio.
        
         | Terretta wrote:
         | "Hold space to talk" is not a bad solve for this, also makes
         | folks ramble less.
        
       | GhostVII wrote:
       | I wonder how much benefit you would get from targeting specific
       | microphone/speaker setups for noise cancelling rather than
       | treating everything the same. I would imagine that the noise
       | cancellation requirements are far different for someone video
       | conferencing over a laptop mic and speaker versus a good pair of
       | Bose headphones. If you could specify what type of device you are
       | using it could tune the noise cancellation accordingly - if I am
       | using a good pair of headphones, I don't need echo cancellation,
       | but I still need to filter out some amount of background noise.
        
       | newfeatureok wrote:
       | None of this fancy technology is necessary IMO.
       | 
       | Just implement push to talk with mute-by-default. 90% of the
       | audio issues would be resolved. Another 5% could be solved by
       | buying everyone a decent headset which hopefully has a push-to-
       | talk button on it as well.
        
         | buttersbrian wrote:
         | In theory yes. But what about users that are mobile on a call
         | and even when they "push" to talk, ambient noise from the
         | metro, crowd, or traffic is present enough to be troubling?
         | 
         | You don't always get to choose if background noise is present.
         | 
         | Also, you just asked that people push a button, and wear a
         | headset. That is a lot, and this is about lowering the bar
         | needed to get a good experience.
        
       | monkey26 wrote:
       | Funny timing. Just got off my first Google Meet call an hour ago
       | and was thinking they need to add noise cancellation. It was
       | awful.
        
         | washadjeffmad wrote:
         | We've been holding 25+ participant Teams sessions, and the only
         | rules are that only one person can speak at a time and no one
         | unmutes unless they're speaking. The noise cancellation might
         | as well not even exist.
         | 
         | Comparatively, I was impressed that we could even have a Meet
         | without everyone needing to be on mute.
        
         | the_af wrote:
         | I feel you! Did you also have your kid shouting in the
         | background? If they found a way to specifically mute kids
         | crying or shouting it would be a huge deal :P
        
       | JoeAltmaier wrote:
       | Everybody makes a stab at this, and very little of it works
       | consistently. I applaud Google for attacking this head-on! It is
       | a big issue and deserves attention.
       | 
       | My biggest issue (when I worked in videoconferencing) was
       | echoing, and locking onto the delay window where echoes could
       | occur. Depending on the distance from a conference room speaker
       | to all the walls, echoes could occur at one or more offsets
       | (appear at microphone input with some delay after presenting at
       | the speaker). And ambient noises could masquerade as echoes. The
       | filters tend to be IIR filters, and get wound up easily. It was
       | awful.
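       | 
       | A minimal sketch of the delay-window problem, using a textbook NLMS
       | adaptive FIR filter rather than whatever that product shipped. The
       | tap count, step size and simulated single-reflection room are all
       | assumptions; an echo arriving later than the tap window cannot be
       | cancelled at all.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=32, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of far_end from mic."""
    weights = [0.0] * taps
    history = [0.0] * taps   # recent far-end samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate            # signal left after cancelling
        norm = sum(h * h for h in history) + eps
        # Normalised LMS update: step size scaled by recent input power.
        weights = [w + mu * error * h / norm
                   for w, h in zip(weights, history)]
        out.append(error)
    return out, weights


random.seed(1)
far_end = [random.uniform(-1, 1) for _ in range(4000)]   # loudspeaker signal
# Simulated room: one reflection, 10 samples late, at 60% amplitude.
mic = [0.0] * 10 + [0.6 * x for x in far_end[:-10]]

cleaned, weights = nlms_echo_cancel(far_end, mic)
peak_tap = max(range(len(weights)), key=lambda i: abs(weights[i]))
```

       | In this noise-free toy case the filter locks onto tap 10, the
       | simulated echo delay. Real rooms add many reflections, ambient
       | noise and double-talk, which is where the trouble described above
       | begins.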
        
       | m0zg wrote:
       | > How Google Meet's noise cancellation works
       | 
       | Very poorly. Of all the available alternatives (Zoom, Skype,
       | FaceTime), Google Meet seems to have the worst audio _and_ video
       | quality. This is inexplicable for a company very easily capable
       | of technological and product leadership in both of those things.
        
         | thebeefytaco wrote:
         | Did you even read the article? This is talking about a new
         | noise cancelation feature being introduced today.
        
           | m0zg wrote:
           | Quite obviously not, just like almost everybody else in the
           | comments.
           | 
           | Shouldn't the title be "How the _new_ Google Meet noise
           | cancellation works" then?
        
       | pkaye wrote:
       | I need to get hearing aids soon and heard about all the advances
       | and limitations. Particularly the problems with noise
       | cancellation. I hope this kind of technology trickles into
       | hearing aids also.
        
       ___________________________________________________________________
       (page generated 2020-06-09 23:00 UTC)