[HN Gopher] NoiseTorch: Real-time microphone noise suppression o... ___________________________________________________________________ NoiseTorch: Real-time microphone noise suppression on Linux written in Go Author : ClawsOnPaws Score : 314 points Date : 2020-07-18 09:32 UTC (13 hours ago) (HTM) web link (github.com) (TXT) w3m dump (github.com) | hu3 wrote: | I'm curious about the impact of Go's Garbage Collection in a | real-time project like this. | | From reading past comments in other Go related threads I was led | to believe this was impossible to achieve with Go. | | I'm talking about threads like this: | https://news.ycombinator.com/item?id=21036037 | gingerlime wrote: | Anything similar for MacOS ? I tried krisp.ai which is nice but | seems too heavy on my 2015 MacBook Air together with zoom | wenc wrote: | Very nice. Krisp.ai is a commercial option, and NVIDIA RTX is | free but requires a CUDA card, so this is a great alternative. | | Noise suppression is becoming more and more common. My Jabra | headset has it built in. | kbouck wrote: | When testing Krisp.ai, I recorded myself speaking inches away | from a noisy water boiler. In the playback, I could not even | hear the water boiler, but voice came through clearly. Signed | up for the service immediately after that. | orware wrote: | I signed up for it too last weekend after coming across it | after doing some research (I had been making a bunch of video | recordings a few days prior, and once the videos were added | into Camtasia and the audio played back I noticed a lot of | background hum coming from my HVAC return outside of the room | I'm in). | | Was impressed with the Krisp.ai tech as well and probably | works similarly to this tool and the other Nvidia solution | that I can't try out since I don't have an RTX card (main | difference might be the overall training set that Krisp has | already run their algorithm through?). 
| | I haven't had any Zoom meetings since purchasing Krisp, but I | had been using the built-in mic from my LG Tone headset for | those meetings. | | Since making those video recordings I've been using my blue | Yeti mic (and a pair of headphones connected to the mic for | listening) as my primary and I've continued running a bunch | of small tests to try and see if I can be happy with using | Krisp enabled all the time. | | Currently, I don't feel comfortable with leaving it on all of | the time though for recordings, particularly with something | like the blue Yeti mic which is able to capture pretty rich | audio. In my testing, Krisp did a great job of eliminating | the background HVAC humming noise, but replaced that issue | with two others: some minor (but distracting) hiss/noise | between words as I'm playing back the recorded audio, and | also currently is limited to 16000mhz frequency (not sure if | mhz is correct or not in this case...this is what support | shared with me when I asked about audio quality degradation). | The support person did respond though and say that the team | is working on the increasing the frequencies they are able to | work with though so I guess there might be some improvements | in the near future on it? | | After seeing the latency figures on the NoiseTorch page it | makes me wonder if the Krisp latency is similar or not (so | far I haven't noticed any latency issues with Krisp). | | As far as remaining thoughts...I kind of wish there was a bit | more configuration options available for Krisp, but the | simplicity of it is also a benefit (for others that might not | be as technical and just want a simple solution that does | appear to work overall). I haven't gotten it to work for | playback needs (it has the toggle for it, but nothing seems | to happen when I try and toggle that on). 
Also, still not | sure what the overall differences/improvements with Krisp | Rooms enabled (I am recording in a room, but after reading | their description/blog announcement page it kind of seems | like it's more for conference rooms where multiple people are | speaking and extra echo cancellation might be useful? ref: | https://krisp.ai/blog/krisp-rooms-launch/) | | Since I'm already out with a year subscription with them I'll | continue to try and figure out how to use it effectively, but | not as excited about it at the moment compared to how I was | last weekend initially (impressive overall though...hopefully | it continues to improve :-). | fred123 wrote: | 16kHz sample rate (= max frequency 8kHz) should be enough | for speech only. Human voice is mostly <0.5kHz. You may | hear some difference for hisses or for room sounds etc. but | I'm sure you're unable to hear any difference to higher | sample rate in a voice chat setting | tazjin wrote: | I've recently built the inverse of this using NSFV | (https://github.com/werman/noise-suppression-for-voice), i.e. | suppressing noise in _incoming_ audio. | | A lot of people - despite being forced to work from home - simply | don't seem to care about the way their audio sounds. Many don't | even try to tackle these problems after it's been pointed out to | them that they're being a nuisance in online meetings. | | I gave up on trying to help people fix their setups, or | convincing them that it matters, and switched to doing this on | the receiver end. It's been a massive quality-of-life | improvement. | | If you're interested in the setup, you basically just need a | small script that loads the pulseaudio plugin and wires up the | sources/sinks correctly. | | My setup script is here: | https://cs.tvl.fyi/depot@canon/-/blob/tools/nsfv-setup/defau... 
| | And some more context: https://cl.tvl.fyi/c/depot/+/578 | sgt wrote: | Would this work as a general suppressor against noisy neighbors | across the street listening to bass? Can't hear the music, only | feel the bass. | g_p wrote: | For anyone wanting to try this, it is pretty straightforward - | first install noise-suppression-for-voice (AUR package | available, binaries available via Github releases). You want to | have librnnoise_ladspa.so and librnnoise_lv2.so available. | | Then identify your current output sink by running `pactl list | sinks short`. One will be "RUNNING", and this is your active | sink. Keep this name to hand. | | Create and enable an output sink using this plugin: | | pacmd load-module module-ladspa-sink | sink_name=denoise_sink_for_apps.stereo | sink_master=YOUR_OUTPUT_SINK_FROM_ABOVE_HERE | label=noise_suppressor_stereo plugin=librnnoise_ladspa.so | control=0 | | The value of sink_master should be the output sink name from | above, and the control=0 parameter can be adjusted - that seems | to be the voice auto detect threshold. | https://github.com/werman/noise-suppression-for-voice/issues... | suggested 0, but I found it benefited from being higher. You | can compare before/after by changing your pulseaudio output | sink at system level (or application level) back and forth. | tazjin wrote: | Thanks for these notes! | | If you have Nix installed, you can also try my script | directly with this command: | | nix-build -E '(import (builtins.fetchGit | "https://cl.tvl.fyi/depot") {}).tools.nsfv-setup' | basilgohar wrote: | I think this is an out-of-sight, out-of-mind kind of issue. | They simply don't understand how their noise, which they do not | perceive, can be so detrimental to others. Moreover, a lot of | people simply can't grasp the difference good hardware or even | just a different setup (moving away from noise sources like | fans, open windows, appliances running, etc.) can impact the | quality of their sound. 
Lastly, a lot of people either cannot | or think they cannot do anything about it, so they dismiss | others' concerns because "everyone else has problems too", | equating their noise with others'. | g_p wrote: | Another issue I've seen is people using their device's built- | in speaker and microphone (to form a loopback-fest) that the | onboard echo cancellation tries its best to deal with. | | I think there's certainly a part around "but I can't see the | difference" - it's hard to get rapid feedback on whether it's | better or worse, since you won't notice the difference | after a change to your setup. | | In any case, being able to create a pulseaudio sink that puts | the audio from one application through a noise removal chain | sounds to me like a decent "quick fix" - I've tried to | listen to some webinars with such horrible audio it was | pretty much impossible to listen to, yet with otherwise | worthwhile content. I wonder if this would be enough to | improve it, or if the issues lie elsewhere (low quality | transcodes). | sdwvit wrote: | Or they simply don't care or don't want to invest effort into | solving it | btashton wrote: | Why is this the company employees' problem? Seems like work | should be supplying good audio hardware if this is a real | issue. | kazagistar wrote: | After getting good hardware it took hours to get it set | up just right. | basilgohar wrote: | This is always a possibility, but we can kill ourselves if | we try to figure out who's sincere and who's not. | ponker wrote: | Why don't the services like Zoom and Teams fix this on the | server side? | draugadrotten wrote: | Teams has something in the pipeline for this that will cancel | out noise. I think it was a YouTube video linked here a | while back. | | Here's another https://www.youtube.com/watch?v=oCrCkgjZEXQ | spacechild1 wrote: | Zoom actually does noise and echo cancellation by default, | but you can turn it off. 
| tazjin wrote: | Google Meet supports noise-cancelling in outgoing audio | (for business customers), but the user needs to enable it | once in their settings. In my experience this last bit is | already a hurdle ... | | (disclaimer: I work at Alphabet) | ralphm wrote: | If services are indeed doing, or moving to, end-to-end | encrypted media, there's nothing the server can do here. | bigiain wrote: | Doesn't mean they can't do anything about the noise, just | that they can't do it on the server. | | The RNNoise link from the bottom of that post runs the | noise suppression in real time in JavaScript. Zoom et al. | could do this client side while still doing proper E2E. | (Although Zoom already uses more CPU than I think it | needs to...) | RMPR wrote: | Does it work with PipeWire? | closeparen wrote: | I find most of the nuisance on Zoom calls is from children and | pets. For example, the mailman usually comes to my boss's house | during standup, and he's out of commission for a solid five | minutes due to the barking. | Abishek_Muthian wrote: | Nicely done! | | I went through some core libraries being used in the project, | there's a pure Go pulseaudio implementation[1] which seems to | deserve a few more stars and the GUI framework nucular[2] seems to | support even Metal rendering on macOS. I like how the native GUI | frameworks for Go are becoming a viable alternative to Qt. | | Off-topic, since this thread might attract audio programmers- | | I was looking at ambient noise cancellation, audio amplification | implementation for TWS earphones (BL 5.0) without those features | on Android[3], would the latency defeat the purpose because it | isn't implemented on device and do Android Bluetooth/audio APIs | provide necessary access to implement such features in an app? | | [1]https://github.com/lawl/pulseaudio | | [2]https://github.com/aarzilli/nucular | | [3]https://needgap.com/problems/22-enabling-hearing-aid- | feature... | freedomben wrote: | Is this using GTK? 
What bindings? | hu3 wrote: | Not GTK but https://github.com/aarzilli/nucular which is a Go | port of https://github.com/vurtun/nuklear | jcastro wrote: | I've been using this for the past few days and it's been | fantastic, every distro should just do this out of the box. | kochthesecond wrote: | This is pretty cool! | formerly_proven wrote: | Most noise suppression I've seen so far can shave off a few dB | (worth gold already), but when you try to suppress more noise it | always starts to impact the signal very negatively. Interesting | to see whether these ML approaches can do better. I suspect they | might depend even more on the type of your voice than | conventional noise suppression. | fred123 wrote: | Note that most state-of-the-art machine-learning-based | denoising models perform MUCH better than RNNoise quality wise, | but they are mostly not tuned for real time use. | | If you're interested, have a look at some of the Interspeech | 2020 Deep Noise Suppression submissions. | fred123 wrote: | Some examples here: https://paperswithcode.com/task/speech- | enhancement | | Some of them have audio samples. | drblah wrote: | As far as I can see this uses RNNoise. If you haven't checked it | out yet you should, because it is simply amazing. It is a super | effective noise gate / noise removal tool that does not require | any configuration whatsoever. | | My study mates and I have been using it over the last four months | when working from home. It removes the noise of keyboards, | seagulls and vacuum cleaners. | | It is essentially the same as Nvidia RTX Voice except it is much | lighter on the system and does not require an Nvidia GPU. In our | testing RNNoise performs similarly. | | This project looks super cool. It seems to make RNNoise much more | accessible. Normally you would have to manually set up the | pulseaudio plumbing for this to work. | swyx wrote: | Do you need Linux to run your version? Would love to get this | running on my Mac. 
| drblah wrote: | I have mainly used the built in RNNoise support in Mumble. | But you can use https://github.com/werman/noise-suppression- | for-voice/ and build the VST plugin (This is also what | NoiseTorch uses i think). Then use any application that can | load VST plugins to pipe your mic through. I have had | reasonably good luck with it on Windows with Equalizer APO. | tyfon wrote: | Someone also recently made a plugin [1] for OBS using this. | | [1] https://gitlab.com/gravydanger/obs-rnnoise/ | thomasfedb wrote: | I read NoseTorch, was intrigued. | dsteinman wrote: | This might be useful to use along side with DeepSpeech | (https://github.com/mozilla/DeepSpeech), which doesn't work very | well in noisy environments. | bhouston wrote: | This should be included in Linux by default it is this good. :) | | Or at least available via apt-get. | kaielvin wrote: | Alternatively there is the pulseaudio module: module-echo-cancel | (https://askubuntu.com/questions/18958/realtime-noise- | removal...), which I have been using so far. | | I haven't tried NoiseTorch yet. How do the two compare? | lawl wrote: | NoiseTorch uses RNNoise, which uses a mix of deep learning and | DSP to remove noise. I haven't used module-echo-cancel yet, but | it's probably "just" classical DSP, rnnoise may deliver better | results. | kaielvin wrote: | Indeed, after some testing, the filtering is much better. | lawl wrote: | Hey everyone author here! Awesome to see this on HN. | | I'm happy to answer any questions, but this is a slightly | inopportune moment to hit HN for me as I need to leave soon :) | Some responses might be delayed by a day or so! | kaielvin wrote: | Thanks for the work. | | Is incorporating NoiseTorch into pulseEffects something that | could be considered? The interest being to have all filters | managed under one app. | lawl wrote: | I've seen pulseeffects mentioned a few times, I must admit | that I don't know exactly what it is and will need to | research it first. 
| 42droids wrote: | Thank you for making this, I really can't wait to try it. In | fact, I am now shocked this didn't exist before... :) | nickjj wrote: | I'm not saying this tool is bad but I would be really careful | about using tools like this in an environment where audio quality | really matters (Youtube videos, podcasts, etc.). | | Noise reduction tools work by removing specific frequencies from | the source, some of which overlap with your natural voice. | | This is why you start to sound robotic and get weird cutouts if | you try to use tools to remove too much noise or background | sounds. It's one of those things where, if you're not used to | hearing your entire vocal range, you might not be aware at how | much is getting cut out from tools that reduce noise. | | It's too bad they don't have a before / after with a few voice | samples in the readme. | formerly_proven wrote: | Yes, classic noise suppression sounds very poor very quickly. | Noisy or poor audio is like blurred photos or videos, very hard | to fix, while noisy or shaky videos are easily fixed | (especially temporal de-noising on videos is akin to magic, it | can extend the performance of the camera by multiple stops with | very low IQ impact). | | That's why these ML tools are potentially huge, good ol' noise | suppression just isn't good. | bigiain wrote: | Click through to the RNNoise link at the bottom. Lots of | tweakable demos, and a real time JavaScript implementation to | play with too... | ipunchghosts wrote: | That's not how this works. It's much more sophisticated than | that. | brownbat wrote: | How long until we can get some kind of open AI project to take | in incoming bad quality voice and output clear noiseless human | speech (in our, or whoever's voice we want), so podcasters | don't have to buy expensive microphones and try to soundproof | their rooms anymore? | | I know we're not there yet, but I feel like we're about to | break "garbage in garbage out" with AI. 
| nickjj wrote: | I'm just a video course creator / podcaster who spent a decent amount | of time researching audio and I'm not a deep down audio | engineer. | | But based on the results I see with automated software tools | that only try to reduce noise, I would say we're nowhere | near there and a really good solution would involve things | that haven't been invented yet. I think we'll have manned | trips to Mars well before you have a software solution that | can emulate the sound of a moderately treated room with ~2ms | of latency or less. | | With that said, I think we're there today if all you want to | do is help reduce the noise of an air conditioner so you can | chat with a friend on Hangouts, Discord or Zoom. This is a | scenario where audio quality doesn't matter, but not hearing | an A/C or lawn mower is worth having the person talking sound | like a choppy robot. You probably won't even notice it too | much with earbuds. | manojlds wrote: | This demonstration with Nvidia RTX Voice sounds pretty good | | https://youtu.be/Q-mETIjcIV0 | nickjj wrote: | Definitely sounds better than I thought it would have and | I've watched tons of this guy's videos in the past. | | It really distorts his voice / range in some cases, such as | when he taps his desk with that orange hammer. The difference | there is night and day. It chops out his natural voice's | range. It seems to degrade his voice the more intense the | background noise is, such as the leaf blower (lol), but | that's reasonable to expect. But at the same time, even the | mechanical keyboard has a very noticeable negative effect on | his range. | | It's one of those things where I wish so much that it worked | perfectly, but I couldn't realistically think about using it | for any recording work due to things like the above. There are | just too many common noises (typing, etc.) that drastically | distort your voice. | | 9:23 in that video is hilarious though. Have to love Jerry! 
| amcoastal wrote: | I wonder if it's the algorithm degrading his voice or if the | input sound is already degraded. Is it possible a leaf | blower or a hammer would cause enough "noise" to make it so | our ears couldn't hear his voice clearly as well? Then when | you subtract out the portion of the sound attributed to the | leaf blower, you're hearing the parts of his voice that | weren't being jumbled by the leaf blower? | nickjj wrote: | Hard to say because softer noises like typing still make | his voice sound like it's cutting out unnaturally. It's | like the frequencies are being subtracted out of his | normal tone, but it's more subtle than the leaf blower so | you may not notice it without good headphones. It makes | him sound very choppy and mechanical. | jacobush wrote: | Like the blown out whites of a photograph. You can adjust | levels, but if the input peaked, there's just no | information left in the data. | simias wrote: | With the leaf blower I suspect that when it gets too | close the microphone/ADC is saturating, which clips his | voice. I wonder if it would've sounded better had he | attempted to lower the gain on the microphone. | rcxdude wrote: | Results can be mixed. Personally when I tried it it gave me a | lisp. | exhilaration wrote: | That's pretty amazing | asutekku wrote: | The difference here is that RNNoise does not just remove | some specific frequency, it uses neural networks to remove the noise, | which results in much higher quality compared to what you were | implying. | lawl wrote: | Hey (author here) | | I have personally not noticed voice quality suffering too much, | but you are of course right. And this is not what it was made | for. My personal use case is mostly VoIP where RNNoise (imo) | does an amazing job. | nickjj wrote: | Would it be possible to upload a few before / after samples | with varying degrees of background noise? Even if it's all | the same person that would be a huge help to gauge the | quality. 
| ClawsOnPaws wrote: | Here are some demos, I believe this is the same algorithm: | https://jmvalin.ca/demo/rnnoise/ | lawl wrote: | Yes and no. NoiseTorch also has VAD (Voice Activity | Detection). RNNoise also returns the probability of a | sound sample being voice; I use that to clamp the | microphone completely if it's below the configured | probability. | | This works really well for situations like Discord or | TeamSpeak where you're usually not constantly talking, | but doing things that can still set off "normal" voice | activation. RNNoise's model often knows it's not voice, | but cannot denoise it completely. | lawl wrote: | Yes! https://github.com/lawl/NoiseTorch/issues/19 | | I just won't get to it today unfortunately. | nickjj wrote: | Cool, thanks. | | Just a suggestion if you do it, please include realistic | room noises in some of the samples. | | I looked at the RNNoise examples and it was pretty bad. I | mean, the audio quality of the speaker got completely | mangled but the background noise was also comically high. | It sounded like the person just sat down in the middle of | the street in NYC or was inside of a busy train terminal. | g_p wrote: | Looks excellent and keen to delve into the code a bit. | | One quick question since you'll clearly know the codebase - | do you think this could easily be adapted to create a | "playback-side" noise filter? | | Use-case rationale here is noisy and poor quality podcasts or | "other people's" audio - it would be awesome to be able to | configure your tool as the output for Chrome or Firefox or | whatever program I'm listening to, then route the cleaned | audio from your tool to the physical audio port. | | Is that something which would be feasible to do here? | lawl wrote: | > do you think this could easily be adapted to create a | "playback-side" noise filter? | | Yes, the hardest part about this is making the UI not | confusing when you now have two separate instances loaded | in PulseAudio. 
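The VAD gating that lawl describes earlier in this subthread (mute the microphone whenever the voice probability falls below a configured value) can be sketched roughly as follows. This is a minimal illustration and not NoiseTorch's actual code: the function name `gate_frames` and its parameters are hypothetical, and the per-frame voice probabilities are assumed to be handed in by the caller (for example, from whatever denoiser produced the frames) rather than computed here.

```python
from typing import List, Sequence


def gate_frames(frames: Sequence[Sequence[float]],
                voice_probs: Sequence[float],
                threshold: float) -> List[List[float]]:
    """Mute every frame whose voice probability is below `threshold`.

    `frames` holds already-denoised audio frames; `voice_probs` holds one
    probability in [0, 1] per frame. Both are assumptions of this sketch.
    """
    if len(frames) != len(voice_probs):
        raise ValueError("need exactly one voice probability per frame")

    gated = []
    for frame, prob in zip(frames, voice_probs):
        if prob < threshold:
            # Probably not speech: replace the frame with silence.
            gated.append([0.0] * len(frame))
        else:
            # Probably speech: pass the denoised samples through unchanged.
            gated.append(list(frame))
    return gated
```

A hard per-frame cut like this is the simplest possible gate; a real implementation would likely want some hold time or smoothing so that the ends of words are not chopped off, but that goes beyond what the thread describes.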
| g_p wrote: | Agreed, but now this has piqued my interest in a good | way. | | Having two instances loaded might be a bit confusing as | you say - I imagine it would need to be something like | "NoiseTorch for Recording" and "NoiseTorch for Playback". | | I'd need to go and play around with Pulse but I guess it | would be possible to present 2 interfaces into Pulse with | different names, then hope users can see the distinction | when selecting a microphone versus the output device. | charliebrownau wrote: | G'day | | Anyone know of some good audio/sound tools for those who still use | ALSA and have stopped using or uninstalled PulseAudio? | ACAVJW4H wrote: | It might be a stupid question but, aside from the obvious | benefits of saving bandwidth by omitting useless noise in | transport, doesn't it make sense to employ these technologies | server-side? One could maybe make Jitsi or BigBlueButton use | similar technologies? It would make it much more ubiquitous, | better platform support (would work on mobile or low CPU/GPU | clients) and also save on system provisioning as maybe the neural | net could be utilized better by running for different audio | sources concurrently. | kaielvin wrote: | I believe Discord does a lot of noise filtering and cutting- | off. I suspect it is server-side (given that they have a web | app), but I am not certain. | spacechild1 wrote: | I know that Zoom does noise reduction and echo cancellation by | default, but I don't know if they do it client-side or server- | side (for peer-to-peer calls it has to be client-side, | obviously) | bufferoverflow wrote: | As a system owner, it makes financial sense to do it on the | client. Imagine you're managing Zoom. You will need tens of | thousands of GPUs running 24/7 just for noise suppression. | manojlds wrote: | Any of these remove dog barking noise? | speedgoose wrote: | I would guess. RTX Voice removes my cat's sounds. 
| manojlds wrote: | Yeah but with my rudimentary skills I struggled with dog | barks as they are closer to our speech. | sandworm101 wrote: | Does noise suppression work in reverse? Can I use it to isolate | the noise from the human voices? There are lots of situations | where someone might want to isolate and analyse background noises | or conversations. | fred123 wrote: | Yes. Noise suppression is very similar to speech separation | (separating multiple speaker voices that talk at the same | time). For example you can use ConvTasNet for both speech | separation and denoising; in the denoising case you set target | track 1 = speech, track 2 = noise, hence you get a noise-only | track. | | I guess you can also simply subtract the clean speech from the | original mixture to get the noise-only track. | captn3m0 wrote: | noisetorch-bin and noisetorch-git packages already on AUR: | https://aur.archlinux.org/packages/?O=0&SeB=nd&K=noisetorch&... | sahoo wrote: | Only if the sound card was detected in Linux. Sigh. | shock wrote: | What do you mean? NoiseTorch deals with PulseAudio, it doesn't | deal with hardware directly, so, yes, Linux needs to have a | driver for your soundcard. | formerly_proven wrote: | Are you implying I can't do any sound I/O without having a | driver for said I/O? Preposterous. | PaulDavisThe1st wrote: | That's trivially correct. You can't get anything on a screen | without a driver for your graphics card. You can't get any | input from a keyboard without a driver for the keyboard. | | However, Linux comes with drivers for more or less every | audio interface that is possible to use on Linux. That is, | there are essentially no 3rd party drivers - it either works | with the drivers in the kernel(1) or it doesn't. | | (1) depending on how your distro built the drivers. for the | most part, things are OK. ___________________________________________________________________ (page generated 2020-07-18 23:00 UTC)