[HN Gopher] Deep Learning for Guitar Effect Emulation
       ___________________________________________________________________
        
       Deep Learning for Guitar Effect Emulation
        
       Author : teddykoker
       Score  : 277 points
       Date   : 2020-05-11 12:02 UTC (10 hours ago)
        
 (HTM) web link (teddykoker.com)
 (TXT) w3m dump (teddykoker.com)
        
       | exabrial wrote:
       | Pretty cool! Is this how Kemper amplifiers work when they do a
       | capture?
        
         | ratww wrote:
         | AFAIK Kemper performs multiple passes of impulse-response
         | capture, all at multiple signal levels in order to model non-
         | linearities (like distortion). This is called dynamic
         | convolution. [1] [2]
         | 
         | There are other ways to do that, like Volterra Series, used by
         | Nebula plugins [3]
         | 
         | [1] https://www.uaudio.com/webzine/2004/july/text/content2.html
         | 
         | [2] http://www.sintefex.com/docs/appnotes/dynaconv.PDF
         | 
         | [3] https://en.wikipedia.org/wiki/Volterra_series
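         | 
         | A crude sketch of the idea in Python (definitely not
         | Kemper's actual algorithm; the IRs and levels are things
         | you'd measure yourself):
         | 
         |   import numpy as np
         | 
         |   def dynamic_convolve(x, irs, levels):
         |       # irs[i] = IR measured at input level levels[i]
         |       # (ascending). Per sample, blend the two nearest
         |       # IRs so louder input meets a "dirtier" response.
         |       y = np.zeros(len(x) + len(irs[0]) - 1)
         |       for n, s in enumerate(x):
         |           a = abs(s)
         |           i = int(np.clip(np.searchsorted(levels, a),
         |                           1, len(levels) - 1))
         |           w = np.clip((a - levels[i - 1]) /
         |                       (levels[i] - levels[i - 1]), 0, 1)
         |           ir = (1 - w) * irs[i - 1] + w * irs[i]
         |           y[n:n + len(ir)] += s * ir
         |       return y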
        
         | [deleted]
        
       | veenkar wrote:
       | B-but a simple convolution would do the same. Or, for faster
       | operation, a transfer function obtained using the least squares
       | method. A NN is kinda overkill for this, but it's a cool PoC
       | anyway ;)
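       | 
       | For reference, the least-squares version is a few lines of
       | numpy (a sketch; it only captures the linear part, which is
       | exactly why it can't reproduce the distortion):
       | 
       |   import numpy as np
       | 
       |   def fit_fir(x, y, taps=256):
       |       # Solve X @ h ~= y for an FIR "transfer function" h,
       |       # where column k of X is the input delayed k samples.
       |       X = np.column_stack([np.roll(x, k)
       |                            for k in range(taps)])
       |       X[:taps, :] = np.tril(X[:taps, :])  # undo roll wrap
       |       h, *_ = np.linalg.lstsq(X, y, rcond=None)
       |       return h  # wet estimate: np.convolve(x, h)[:len(x)]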
        
       | willis936 wrote:
       | A neat approach for sure. I am more interested in SPICE style
       | modeled VSTs though. There's no need to throw ML at a simple math
       | problem to get a bad approximation. I have not found many VSTs
       | that seem like they're doing proper simulation of analog
       | circuits. The VST space is filled with people claiming awesome
       | results, but never revealing the sauce. If you're making a
       | convincing sounding zener limiter, what are you actually doing?
       | There are a dozen different levels of approximations you could
       | make. Shouldn't a VST that is really simulating the analog
       | circuit advertise that? On paper it should be easy, right? I've
       | sat down with pen and paper to try to write out a simple
       | input/output equation for a zener limiter circuit and I decided
       | it was probably more worth my time to just plop a zener SPICE
       | model into some language that could evaluate expressions and
       | compile to VST (or use a systems of equations solver).
       | 
       | And then there's the real holy grail of analog simulation: the
       | tube amplifier. I'm not sure SPICE models really capture the
       | limiting behavior of tubes very well. You might need to implement
       | the spec sheet in code. All fun sounding problems, and I'm not
       | sure anyone has even done them yet.
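       | 
       | For flavor, the component-level version of a shunt diode
       | clipper is just Newton's method per sample (a sketch with
       | guessed component values, not any particular pedal; a zener
       | pair is the same machinery with a different i-v curve):
       | 
       |   import numpy as np
       | 
       |   R, IS, VT = 2200.0, 1e-12, 0.0253
       | 
       |   def clip_sample(vin, v=0.0):
       |       # KCL at the node where antiparallel diodes shunt
       |       # to ground: (vin - v)/R = 2*IS*sinh(v/VT).
       |       for _ in range(50):
       |           f = (vin - v) / R - 2 * IS * np.sinh(v / VT)
       |           df = -1.0 / R - 2 * IS / VT * np.cosh(v / VT)
       |           step = np.clip(f / df, -0.1, 0.1)  # SPICE-ish limit
       |           v -= step
       |           if abs(step) < 1e-9:
       |               break
       |       return v
       | 
       | Seeding v with the previous sample's solution makes this
       | converge in a couple of iterations at audio rates.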
        
         | ben7799 wrote:
         | Right... the SPICE-modeled version has a much better chance
         | of catching the oddball behavior of guitar effects across the
         | wide span of possible inputs.
        
       | saadalem wrote:
       | This is actually impressive. I'm wondering if we could use AI
       | to make a smartphone mic sound like a high-quality one.
        
       | svantana wrote:
       | End-to-end modelling is very enticing for the lazy engineer.
       | Unfortunately, parameter control (knobs) is an important
       | feature of most audio effects, and sampling enough of the
       | parameter space becomes prohibitive for more complex effects.
       | That's why the traditional approach is divide-and-conquer.
       | 
       | Also, I don't think this approach will work well with time-
       | varying effects such as chorus, although I'm happy to be proven
       | wrong.
        
         | phn wrote:
         | Even without parameterisation, it might be interesting as a
         | "make my guitar sound like Jimmy Page" kind of tool.
         | 
         | Like you said, it will most likely have limitations, but it's
         | still one more tool in the belt, regardless.
        
           | tricky wrote:
           | You're right in the "this is one more tool in the belt" sense
           | but there are modelers like the Kemper and Fractal already
           | out there that make your guitar sound like... anyone... and
           | they are really convincing. I'd argue this is almost a solved
           | problem. Still cool, nonetheless.
        
             | luckydata wrote:
             | Building the model is not solved though. Kemper sort of
             | does that, though I'm not sure how, but an approach that
             | simply measures the effect and creates a complete model
             | in hours would radically change the industry. Companies
             | like Yamaha (Line 6) would be able to add hundreds of
             | simulations in months instead of a couple.
        
               | ratww wrote:
               | There's an Italian company already doing something like
               | that for outboard gear: Acustica Audio. But they're
               | using convolution instead of neural networks. It's
               | instant instead of taking hours.
               | 
               | An acquaintance who builds boutique studio gear had
               | some of his creations modeled by them, and we were
               | quite impressed.
               | 
               | https://www.acustica-audio.com/store/en
        
           | hashkb wrote:
           | > make my guitar sound like...
           | 
           | That isn't reasonable. There are too many variables beyond
           | the effect, like room, fingers, guitar, and amp. Without the
           | knobs, you haven't delivered the effect.
        
         | fxtentacle wrote:
         | Sadly, I believe you will be proven correct.
         | 
         | What that neural network learns is basically an approximation
         | of a static impulse response. So while it can simulate linear
         | time-invariant effects such as reverb quite nicely, it'll
         | surely have issues with chorus.
        
           | sk0g wrote:
           | Reverb is time invariant? You can set custom decay time,
           | rate, etc., so one note can be heard for, say, 10 seconds
           | if you want to go full Devin Townsend. I'd think chorus
           | would work better.
           | 
           | I wanted to do a very similar project, but with an overdrive.
           | Let's see if I get time anytime soon!
        
             | vnorilo wrote:
             | Reverb is indeed linear time invariant (sans some rarer
             | internal modulation techniques) but it's quite a high order
             | filter.
        
               | sk0g wrote:
               | Ah righto, the reverb pedal I'm most familiar with turns
               | out to not be just reverb - EQD Afterneath does a whole
               | bunch of funky stuff. Plain reverb though, yeah. I was
               | approaching this more from the angle of training a neural
               | network, where the input and output waves have to be
               | correlated over a great span of time/samples.
        
             | InitialLastName wrote:
             | >Reverb is time invariant?
             | 
             | You might want to familiarize yourself with [0]. Time-
             | invariance is a specific property of a system, where the
             | output (for any given input) has no dependency on whether
             | the input signal happens now, 1 second from now, or 100
             | years from now (except for the corresponding delay). Most
             | reverb models are, to a first approximation, time
             | invariant, because the effect will have the same sound
             | for the same guitar line, no matter when you play it.
             | 
             | Chorus, on the other hand, has a (perhaps subtle) modulator
             | to get that warbly (scientific word!) sound. It doesn't
             | feel like a time-based effect, but it certainly is and that
             | makes it quite a lot more difficult to mimic with a system
             | that (as others have noted) boils down to an impulse
             | response.
             | 
             | [0] https://en.wikipedia.org/wiki/Time-invariant_system
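             | 
             | A toy chorus makes the time dependence concrete: the
             | delay the signal meets is a function of absolute time
             | via the LFO, so shifting the input does not just shift
             | the output (a sketch; parameter values are arbitrary):
             | 
             |   import numpy as np
             | 
             |   def chorus(x, sr=44100, base_ms=10.0,
             |              depth_ms=2.0, rate_hz=0.8):
             |       n = np.arange(len(x))
             |       lfo = np.sin(2 * np.pi * rate_hz * n / sr)
             |       delay = (base_ms + depth_ms * lfo) * sr / 1000
             |       idx = np.clip(n - delay, 0, len(x) - 1)
             |       lo = np.floor(idx).astype(int)
             |       hi = np.minimum(lo + 1, len(x) - 1)
             |       frac = idx - lo
             |       wet = (1 - frac) * x[lo] + frac * x[hi]
             |       return 0.5 * (x + wet)  # dry/wet mix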
        
               | TheOtherHobbes wrote:
               | Impulse response reverbs (Bricasti, etc) are based on
               | time-invariant convolution.
               | 
               | Studio reverbs famously aren't, and some of the most
               | popular models (notably Lexicon) have included time-
               | variant algorithms since the late 70s. The processing
               | power to handle IR convolution didn't exist back then,
               | and it turned out some time variation added lushness and
               | to the sound that simpler models couldn't capture.
               | 
               | Modelling a chorus or time-variant reverb with any form
               | of convolution - including any convolution-based neural
               | net - is a complete waste of time, because most chorus
               | algos are trivial and convolution is completely the wrong
               | tool for the job.
               | 
               | It's literally about as useful as taking a still picture
               | of a 90 minute movie.
        
               | sk0g wrote:
               | Thanks for the info! Did some reading and I can see plain
               | reverb being time invariant indeed :) I never realised
               | chorus pedals did more than just stack frequency offsets
               | onto your signal, but I only really play distorted so
               | choruses are of limited use to me.
               | 
               | Pasting the other response below as well:
               | 
               | > Ah righto, the reverb pedal I'm most familiar with
               | turns out to not be just reverb - EQD Afterneath does a
               | whole bunch of funky stuff. Plain reverb though, yeah. I
               | was approaching this more from the angle of training a
               | neural network, where the input and output waves have to
               | be correlated over a great span of time/samples.
        
           | luckydata wrote:
           | Then what you need is a multi dimensional matrix of impulse
           | responses, one for each combination of parameter. You can
           | further simplify the model by limiting to only useful
           | combination ranges etc...
        
         | Gatsky wrote:
         | Not saying that it is in any way practical or useful in the
         | real world, but I think there are approaches which are more
         | geared towards what we might understand as 'emulating' rather
         | than modelling an effect. It seems that emulators can be
         | learned from data with surprising efficiency [1]. These would
         | be amenable to parameter control.
         | 
         | [1] https://arxiv.org/abs/2001.08055
        
       | Tade0 wrote:
       | Sounds great and I had to listen to both of the samples to guess
       | correctly.
       | 
       | That being said, the Tube Screamer is a somewhat simple effect:
       | it's just a distortion with the clipping diodes moved to the
       | feedback loop.
       | 
       | How possible would it be to get the famous class AB amplifier
       | voltage sag and the associated changes in parameters of the
       | whole amplifier - or, in other words, "will it chug"?
        
         | cesaref wrote:
         | I think this would be very possible - there was quite a bit
         | of discussion of using NN techniques for modelling fx at
         | DAFx2019 (http://dafx2019.bcu.ac.uk/). There are a number of
         | papers discussing different techniques in the paper archive.
         | 
         | Many of the techniques discussed were variations on image
         | processing - transforming the input to the frequency domain,
         | converting this to an image, applying standard techniques to
         | transform the image, then going back to the time domain.
         | There are many compromises with this approach (losing phase
         | information, for example) but with a suitable overlap/add
         | the results were better than I expected, and certainly
         | there's room for further investigation to see if there's
         | useful stuff in there.
         | 
         | Another time domain approach that was applicable to your
         | amplifier model question was an attempt to determine hidden
         | variables in a circuit. Basically, the circuit under test is
         | examined, and rather than build a SPICE model (which can be
         | laborious) the technique was to expose the internal voltages
         | following components with memory (capacitors, for example).
         | These outputs were included in the NN training model, and so
         | in effect the normally hidden internal state was exposed and
         | allowed for a very good approximation.
         | 
         | Here's the paper:
         | 
         | http://dafx2019.bcu.ac.uk/papers/DAFx2019_paper_42.pdf
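         | 
         | The skeleton of that spectrogram-as-image pipeline is
         | short, and shows where the phase compromise lives (a
         | sketch; fx stands in for whatever 2D model you apply):
         | 
         |   import numpy as np
         |   from scipy.signal import stft, istft
         | 
         |   def process_spectrogram(x, sr, fx=lambda mag: mag):
         |       # analysis -> "image" -> overlap-add resynthesis,
         |       # reusing the original phase (the lossy part).
         |       f, t, Z = stft(x, fs=sr, nperseg=1024)
         |       mag, phase = np.abs(Z), np.angle(Z)
         |       Z2 = fx(mag) * np.exp(1j * phase)
         |       _, y = istft(Z2, fs=sr, nperseg=1024)
         |       return y[:len(x)]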
        
           | Tade0 wrote:
           | Thank you very much.
           | 
           | Do you know if there will be a DAFx2020? That would make it
           | the first conference in years that I would really want to
           | attend.
        
             | cesaref wrote:
             | Unfortunately not, it's been delayed. DAFx2020 was due to
             | be in Vienna, and I'm assuming they are still planning on
             | being there, but it's now scheduled for 2021.
             | 
             | It's a great conference, well worth attending. It's heavy
             | on the maths, but that's DSP for you!
        
       | sdenton4 wrote:
       | It has been said that if we achieve the ability to fully
       | simulate the universe from initial conditions, the first
       | application will be creating a perfect recreation of Marvin
       | Gaye's Roland TR-808 drum machine in a 1982 performance.
        
       | ericfrederich wrote:
       | So this seems similar to an IR (impulse response) where you get a
       | snapshot of an amp mic'd up in a room with knobs fixed at a
       | particular position. In the end, you don't get knobs to fiddle
       | with.
       | 
       | Awesome, I'd love to hear Josh from JHS Pedals' opinion on this.
        
         | ratww wrote:
         | This is even more impressive, since regular IRs can't
         | duplicate the distortion effect itself, only the frequency
         | response.
        
           | munificent wrote:
           | What is the difference between "distortion itself" and "only
           | the frequency response"? Are you saying the phase response is
           | important?
        
             | ratww wrote:
             | Impulse responses can only represent linear time-invariant
             | systems. Like delays, reverbs, equalization curves.
             | 
             | Distortion is non-linear: it is something like a _max(-1,
             | min(1, input))_ function (a waveshaper, like you said),
             | and it produces harmonics when applied to audio signals.
             | 
             | However, guitar pedals also have some additional
             | circuitry to "sweeten" the distortion, removing the extra
             | harmonics added by the clipping diodes. Tube Screamers
             | are notable for cutting bass and enhancing mids. An IR is
             | able to capture this. This is important for guitar
             | pedals, and the reason so many of them exist.
             | 
             | If you capture the impulse response of an overdrive pedal
             | you'll be capturing only the frequency response of a
             | distorted impulse. If you process clean guitar through
             | this you'll simulate the frequency response but not the
             | distortion itself, so it will just be a clean guitar with
             | a tinny, shrill sound, not an overdriven guitar sound.
             | 
             | One way around it (other than the idea in this article!)
             | is doing multiple passes of impulse response capture with
             | different amplitudes; this captures the distortion's non-
             | linearity. This is supposedly how a Kemper Profiler
             | works.
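             | 
             | In code, that decomposition is tiny (a toy, with first-
             | order filters standing in for the real Tube Screamer
             | curves):
             | 
             |   import numpy as np
             |   from scipy.signal import butter, lfilter
             | 
             |   def toy_screamer(x, sr=44100, drive=20.0):
             |       b, a = butter(1, 720 / (sr / 2), 'high')
             |       x = lfilter(b, a, x)           # cut bass
             |       x = np.clip(drive * x, -1, 1)  # waveshaper
             |       b, a = butter(1, 3200 / (sr / 2), 'low')
             |       return lfilter(b, a, x)        # tame fizz
             | 
             | An IR capture of this chain recovers the two filters,
             | but for any single impulse the clip is just a gain, so
             | convolving clean guitar with that IR gives the "tinny,
             | shrill" result described above.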
        
       | jelling wrote:
       | > We find that the model is able to reproduce a sound nearly
       | indistinguishable from the real analog pedal.
       | 
       | Maybe for the average person or buried in the mix, but the
       | audio samples were easy to distinguish for me as a guitarist.
       | The NN sample's unnatural decay was a dead giveaway.
        
         | hashkb wrote:
         | Confirmation bias overrides ear training. Always have an
         | unbiased tone junkie do your blind test.
        
         | sailfast wrote:
         | Yeah - the difference was clearly audible on my phone
         | speakers, especially during more muddy / multi-note
         | sequences.
         | 
         | While it may not be able to emulate a real pedal to create
         | one's own sound, it would be interesting / fun for amateurs
         | when applied as a post-filter with an interface that says "make
         | this sound like X famous incredible track" coming out of a
         | stock guitar signal.
        
         | mrob wrote:
         | I could also tell the difference, but I preferred the more
         | staccato sound of the NN version.
        
         | zwieback wrote:
         | Yeah, real pedal sounds much "better" but maybe we're just used
         | to how they sound.
        
         | jefftk wrote:
         | Not really a guitarist, but listening to them I couldn't hear
         | a specific difference. Yet I still liked one of them more,
         | and when I clicked "reveal", it turned out that one was the
         | real one.
        
           | zuppy wrote:
           | The real one has longer fading tones; the one generated by
           | machine learning cuts the sound abruptly.
           | 
           | It seems easy for me to differentiate them, and I'm a
           | beginner with guitars (~1 month, so I'm your average Joe).
           | It's pretty good though; I'm sure it can be improved
           | greatly.
        
         | Tade0 wrote:
         | Also, there's the pretty obvious quantization noise, which
         | sounds as if the effect had a wide bandwidth - impossible
         | with the pedal's op-amps at these gains.
        
         | magicalhippo wrote:
         | Even as a regular Joe it was easy for me to distinguish them,
         | and though I was not very confident in my guess, I did guess
         | correctly as well.
         | 
         | It was close though, so maybe for, say, a beginner on a
         | shoestring budget it would be perfectly acceptable.
        
         | finder83 wrote:
         | Agreed, the NN had that "digital" sound you typically get from
         | a simulated tube screamer, such as in a POD HD or something.
         | 
         | Very impressive given it's from a NN, but I specifically moved
         | to analog for that reason.
        
       | baylessj wrote:
       | Excellent writeup, I love seeing real engineering applied to
       | guitar pedals rather than black magic tone chasing.
       | 
       | I'd be really curious to see if the model could be expressed as a
       | transfer function and compared to the schematic for the pedal.
       | The Tubescreamer is a fairly simple circuit but the mystery
       | surrounding it indicates that there are some weird variables at
       | play with the component properties that would lead to additional
       | factors in the transfer function. Wonder if those variables could
       | be identified somehow.
        
         | hashkb wrote:
         | The "weird variables" may have to do with the various changes
         | in manufacturing over the years. "Tube screamer" refers to at
         | least 10 different units. Maxon, Ibanez, TS9, TS808, and
         | zillions of clones.
        
       | ben7799 wrote:
       | I play guitar and own a tube amp & a tube screamer.
       | 
       | All of this sounds horrible... It doesn't even sound like his
       | input is an actual guitar; it sounds like he's using a synth
       | guitar sound or something. There are no dynamics, almost no
       | sustain, no articulations. The outputs barely even sound
       | distinguishable as a guitar through a tube screamer, even his
       | actual tube screamer samples. (Possibly because his interface
       | is terrible?)
       | 
       | The conclusion is ridiculous given how simplistic everything is.
       | 
       | You can't use two tiny little clips to justify your model being
       | high quality.
       | 
       | The true test would have to let a bunch of guitarists move all
       | the knobs, plug the model into different amp & guitar
       | combinations, put other effects in front of and behind it, etc.
       | 
       | The Tube Screamer is called a Tube Screamer because its
       | intended use case is to make the tubes in a tube amp "scream".
       | Using it with all the knobs at noon is not consistent with
       | this: it usually gets used with a tube amp that is already on
       | the verge of distortion, with the TS volume turned up a lot
       | (3/4 to max) and the gain quite low. This might be part of why
       | this sounds so bad to me.
       | 
       | There are actually two different trains of thought on guitar
       | effect modeling:
       | 
       | - Model it based on input & output waveforms like he's doing
       | 
       | - Actually model the circuit as an electrical simulation and then
       | pass the signal through that.
       | 
       | I have personally found the second approach to be way more
       | realistic and satisfying. The Yamaha THR amps work this way and
       | they're really amazing.
       | 
       | One of the tricks here is that a listener might not be able to
       | tell a difference, but the guitar player picks up on a
       | perceived change in how the guitar feels with these effects. A
       | tube screamer has a lot of compression built into it, for
       | example. It causes everything you play to sound a little
       | dirtier for the same amount of picking energy you put into the
       | guitar. It will cause the player to play a little more lightly
       | than they would without the effect. This is the kind of thing
       | that makes a player reject the model and want to stick with
       | the real thing, whereas the naive guy in the lab building the
       | model thinks it's great because they're not even playing an
       | actual guitar through it. Once a skilled player tries it, the
       | "feel" is a dead giveaway which is which.
       | 
       | It's easy for some of this stuff to get lost on the
       | electronics crowd if their background is electronic music. An
       | actual acoustic piano is the only keyboard-based instrument
       | that has anywhere near the nuance that a guitar has, and a
       | guitar still has way more weird stuff going on with dynamics
       | and articulation. The range of inputs you have to feed into
       | any kind of computer model to simulate a guitar well is huge.
        
       | mgamache wrote:
       | It would be interesting to see how this responds to dynamics. For
       | example, a favorite guitar sound is a fuzz cranked, but with the
       | guitar volume turned down. This results in a compressed dirty
       | sound that can overdrive into distortion if you hit the strings
       | harder (attack).
        
       | SeanFerree wrote:
       | Love this! Great read!
        
       | [deleted]
        
       | EamonnMR wrote:
       | Add the ability to train on arbitrary effects as inputs and
       | this will be a best-selling VST for whoever makes it first.
        
       | TrackerFF wrote:
       | Isn't this essentially just the case of learning one function,
       | with set parameters?
       | 
       | I.e., if you want to build a complete model of the Tube
       | Screamer, you'd essentially have to train a model for each
       | possible setting on the pedal - or in other words, every
       | combination of the knobs.
       | 
       | Sounds like a real chore, if you were to actually do that
       | physically - and in the end, don't you just want to learn the
       | impulse response of the circuit?
       | 
       | I know some tools - like the Kemper modelling gear - are made
       | for that exact purpose, and with extremely convincing results.
        
         | Scene_Cast2 wrote:
         | Not quite. As long as the knobs make consistent changes,
         | just feed in a large number of test samples and the model
         | should generalize (smartly interpolate) to the rest.
         | 
         | What I do have a problem with is that if the pedal is already
         | implemented digitally, then all the human interpretability,
         | along with the classic DSP machinery, is thrown out the window.
         | A better approach would be to build the pedal via a
         | differentiable programming language and then try to gradient
         | descent toward some analog "can't get this juicy tube sound
         | digitally" variant.
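         | 
         | A minimal sketch of that differentiable-pedal idea in
         | PyTorch (two made-up parameters; a real circuit would
         | contribute more, but they'd all stay nameable):
         | 
         |   import torch
         | 
         |   class DiffPedal(torch.nn.Module):
         |       def __init__(self):
         |           super().__init__()
         |           self.drive = torch.nn.Parameter(torch.tensor(1.0))
         |           self.level = torch.nn.Parameter(torch.tensor(1.0))
         |       def forward(self, x):
         |           return self.level * torch.tanh(self.drive * x)
         | 
         |   dry = torch.linspace(-1, 1, 4096)
         |   wet = torch.tanh(5.0 * dry)  # stand-in "analog" target
         |   pedal = DiffPedal()
         |   opt = torch.optim.Adam(pedal.parameters(), lr=1e-2)
         |   for _ in range(2000):
         |       opt.zero_grad()
         |       loss = ((pedal(dry) - wet) ** 2).mean()
         |       loss.backward()
         |       opt.step()  # drive/level head toward ~5 and ~1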
        
           | ben7799 wrote:
           | The knobs actually don't behave linearly on a tube
           | screamer. Even the "tone" knob (EQ) doesn't behave at all
           | linearly, the way you might expect out of consumer audio
           | gear. Tube Screamers use an S-curve potentiometer for that
           | knob.
           | 
           | That would be part of the problem with this approach.
           | 
           | Also, with this approach you pretty much have to train the
           | model with a near-infinite collection of guitars and a
           | near-infinite number of other effects turned on and off in
           | front of it.
        
             | Scene_Cast2 wrote:
             | The knobs don't have to be linear at all, just
             | differentiable - that's the beauty of ML.
             | 
             | As for the collection of guitars and samples - not
             | necessarily, it would depend on how you set up the
             | training.
        
       | ZoomZoomZoom wrote:
       | For anyone planning to try this, don't forget about impedance
       | matching and use a transformer/active reamper. Some pedals may
       | react very differently.
        
       | mrob wrote:
       | This isn't bad, but the note decays sound noticeably
       | different. My guess is that the NN doesn't know that human
       | ears have a non-linear response that makes them more sensitive
       | to errors in the decay than in the attack, so it treats them
       | equivalently. If this is the case then it might be fixable by
       | using logarithmic-scale audio samples instead of linear.
       | 
       | The non-linearity of the ear is frequency dependent[0], but in
       | practice I suspect it would be sufficient to pre-process the
       | linear PCM data with x=sqrt(x) and undo before playback with
       | x=x^2.
       | 
       | [0] https://en.wikipedia.org/wiki/Equal-loudness_contour
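       | 
       | Concretely (with one wrinkle: samples are signed, so the
       | square root has to preserve sign):
       | 
       |   import numpy as np
       | 
       |   def compress(x):  # apply before computing training loss
       |       return np.sign(x) * np.sqrt(np.abs(x))
       | 
       |   def expand(x):    # undo before playback
       |       return np.sign(x) * x ** 2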
        
         | rubatuga wrote:
         | Why square root and not log?
        
           | mrob wrote:
           | Cheap and dirty fast calculation. I don't actually know what
           | the best mapping is, so I'd start with this.
        
       | wintermutestwin wrote:
       | "many purists argue that the sound of analog pedals can not be
       | replaced by their digital counterparts."
       | 
       | Truly effective modelling of analog pedals, tube amps and
       | guitar cabs has been around for years and is way more cost-
       | effective, from the bedroom to touring bands.
       | 
       | The "purists" are hipsters who value the rarity of some pedals,
       | massive pedalboards and their tube amps. I'm not knocking them -
       | I understand why there is a nostalgia factor and tweaking dials
       | is cool. As a computer guy though, I much prefer the ability to
       | make things like this in my bedroom:
       | https://i.imgur.com/OqMoBxz.png And when I want to tweak a dial,
       | I program an expression foot controller to tweak any parameter
       | (or multiple).
       | 
       | All that said, great to be looking at modelling techniques...
        
         | maeln wrote:
         | A lot of bands stopped using analogue hardware for sound
         | also because it tends to be way less reliable than its
         | digital counterpart. A lot of analogue amps, pedals and
         | synths will change their sound as the hardware ages; digital
         | stays virtually the same. And the same can be said about
         | weather conditions: changes in temperature and humidity
         | affect analogue hardware, not so much digital.
         | 
         | You will have the same sound from gig to gig, and a lot of
         | bands really value this.
        
         | willis936 wrote:
         | Effective modeling, yes, but not necessarily accurate modeling.
         | The analog circuits are imperfect in many subtle ways, and
         | component level simulation is rare (if it exists at all, I have
         | not seen it). It's all a bunch of high level approximations
         | that don't nail the feel to the point of beating blind tests.
         | 
         | It can be done. I don't know why we aren't there.
         | 
         | The audio world is halfway to the alien truther community:
         | the closer a rational outsider looks at it, the crazier they
         | feel. Technically, it's a trivial field. Yet here we are,
         | with snake oil saturation and subpar solutions.
        
           | wintermutestwin wrote:
           | https://www.fractalaudio.com/iii/
           | 
           | This guy has been doing component level simulation from the
           | beginning. I have one and it is accurate enough to convince
           | some pretty big players to ditch their tube amps.
        
         | selykg wrote:
         | I am by no means a musician, or at least not an experienced
         | one. I tinker and enjoy playing and learning. But I have
         | limited experience overall.
         | 
         | My personal experience with electronic tools is the lack of
         | feel. Can I make music with digital tools like AxeFX and
         | similar? Absofreakinglutely. No doubt about it.
         | 
         | But those digital tools feel VERY different to me than the real
         | thing. I'm not just talking about a speaker moving air, though
         | that's certainly part of it. My tube amp simply responds
         | differently than any digital model of a similar amp.
         | 
         | I find tools like the Kemper to be amazing, but they're just a
         | snapshot of an amp in a particular configuration in a
         | particular room.
         | 
         | From a technical standpoint, all this modeling stuff is super
         | cool. But it doesn't feel the same at the end of the day and
         | this is a personal opinion and preference on my part.
         | 
         | I look forward to the day that I can get an amp in a pedal
         | (like the Strymon Iridium) that behaves the same as the real
         | amp. I think Fender's Deluxe Reverb (Tone Master model) is as
         | close as it has ever gotten, but it very specifically
         | emulates a single amp and does so within a real amp cabinet
         | rather than pushing it out to an audio interface.
         | 
         | Anyway, anything that gets people playing guitar is, in my
         | opinion, a great thing. We live in a golden age of guitar
         | equipment. I don't think it can honestly get much better than
         | it is right now. It's an amazing time to be a guitar player and
         | incredible options are available at amazing prices.
        
           | wintermutestwin wrote:
           | I find that the AxeFX gets enough of the tube amp feel right
           | by modelling amp sag.
           | 
           | It is indeed an amazing time to be a guitar player!
        
           | melq wrote:
           | >My tube amp simply responds differently than any digital
           | model of a similar amp.
           | 
           | Can you expand on this a bit? Curious what you mean by
           | responds and what the difference is.
        
           | renaudg wrote:
           | >Anyway, anything that gets people playing guitar is, in my
           | opinion, a great thing. We live in a golden age of guitar
           | equipment. I don't think it can honestly get much better than
           | it is right now. It's an amazing time to be a guitar player
           | and incredible options are available at amazing prices.
           | 
           | It sure is a great time for guitar equipment, as the digital
           | revolution has made its way there too.
           | 
           | But being a guitar player is also increasingly lonely:
           | https://www.washingtonpost.com/graphics/2017/lifestyle/the-s...
           | 
           | And it's arguably an opportunity cost for a kid to be pouring
           | so much effort today learning the iconic (but tired)
           | instrument of the boomer generation, when they could be
           | breaking new musical ground instead, mastering Ableton's Push
           | for instance. But to each their own, of course.
        
         | anodyne33 wrote:
         | In a similar vein, I worked (i.e. interned) at a few
         | recording studios in my 20s. Most tracking was done to a 2"
         | 24-track analog tape deck and the majority of post and
         | mixing was done digitally. I don't know what progress has
         | been made in plug-ins in 20 years - I suspect a lot - but at
         | that point there was nothing digital that came close to the
         | sound of electric guitars overdriven into the tape deck,
         | saturating the tape to an extreme. Now I'm curious if
         | anybody has gotten it right, but there are fewer and fewer
         | studios with 2" tape decks to do a true A/B.
        
         | hashkb wrote:
         | This is ad hominem. You haven't included any data. You are just
         | as superstitious about your bedroom rig as I am about my
         | basement rig.
        
           | wintermutestwin wrote:
           | I guess you are technically right, but this part of the
           | discussion is highly subjective. I was merely pointing out
           | that the quoted statement was subjective and I wasn't using
           | "hipster" as a pejorative - I was actually being somewhat
           | sympathetic to their view.
           | 
           | Overall, my goal was to add to this discussion by pointing
           | out the massive progress that has been made and also to show
           | off my supercool signal path in the hopes that it would be
           | inspirational to fellow geeks like me.
        
             | hashkb wrote:
             | It would have gone over better with me (your average analog
             | hipster) if you'd just mentioned the positive aspects of
             | the thing you like. I'm eagerly awaiting the day when
             | modeling is actually good enough for me; and your (common)
             | attitude (that it is, obviously, and anyone who can't hear
             | it is nostalgic/superstitious/hipster) is one of the
             | reasons I don't give modelers a try more often.
        
               | melq wrote:
               | It wasn't an ad hominem attack. Ad hominem quite
               | literally refers to an attack against a specific person.
               | Not only was he not attacking a person, but referring to
               | 'hipsters' is not necessarily pejorative.
               | 
               | I believe he was incorrect to call guitar players who use
               | analog equipment hipsters, as using analog equipment is
               | the status quo, not some niche subculture outside of the
               | mainstream.
               | 
               | I would like to respectfully suggest being a little
               | less sensitive, though. Not giving new things a chance
               | because of other people's attitudes seems very silly
               | to me.
        
         | mikevm wrote:
         | I'm a guitar noob, but have been wanting to pick up an electric
         | for ages :).
         | 
         | Quick question - how does that Axe-FX compare to various Amp
         | emulators such as AmpliTube, Line 6 Helix Native, Guitar Rig,
         | Positive Grid BIAS Amp, S-Gear, etc... ?
        
           | luckydata wrote:
           | AxeFX is the most true to life, the Helix is quite a bit
           | simpler to use, the Kemper has the best "feel" of every
           | simulator. They achieve very similar results sound-wise, all
           | of them can be used in record production no problem.
           | 
           | IMHO for the bedroom player the Helix is the best solution as
           | it's good enough and significantly cheaper than the other
           | options.
        
           | wintermutestwin wrote:
           | For a guitar noob, I'd say that all of those options are
           | great (and yes, I've used them all). If you were more than
           | a noob and had specific needs, I might recommend a specific
           | one to match those needs. I went with the AxeFx because it
           | is insanely tweakable...
        
         | playingchanges wrote:
         | I think what you mean by 'hipsters' is 'professionals'. As
         | someone who's made several records and been in many recording
         | studios, I would challenge you to name a single record that
         | does not utilize an analog signal chain, for mastering at the
         | very least. VST modeling is great when you want a super clear
         | tone and is very popular in certain genres. But definitely not
         | ubiquitous and certainly not superior tech. Digital just don't
         | SLAP like analog.
        
       | hashkb wrote:
       | Trey Anastasio of Phish famously uses 2 stacked tube screamers.
       | (And so do many of us phans). He deserves to be mentioned because
       | more notes have hit audience ears through his screamers than
       | anyone else's.
       | 
       | Also, the modern TS9 isn't exactly right. I'd love to see this
       | work applied to vintage vs current TS vs modded units.
        
       | 317070 wrote:
       | That is very cool. Though part of the pedal is, of course, the
       | knobs. You'd need to condition the wavenet on the knobs. Did
       | that work well (I assume that you tried that already)?
       | 
       | Also, what is the inference latency on your model? A nice thing
       | about analog guitar effects is that they are blazingly fast.
        
       | fab1an wrote:
       | Pretty cool, though I wonder what the latency of this would be if
       | used as a plugin?
       | 
       | The author says it works in real-time, but to non music/audio
       | folks this could mean '100 ms latency is real-time enough,
       | right?'
       | 
       | Generally, I think the audio VST business is a really fun space
       | to be in for a lifestyle business, as it is way too small to be
       | attractive for VCs. It seems like a space that provides many
       | niches for lots of small players to thrive in.
       | 
       | As an aside, it's really quite interesting that a lot of
       | cutting-edge tech is now used to emulate the hardware-based
       | tech of yesteryear. Think film filters for Photoshop, and the
       | roughly 90% of audio plugins that emulate high-end hardware:
       | compressors, pedals, etc.
        
         | [deleted]
        
         | TheRealPomax wrote:
         | while training? terrible. As finalised model running in an
         | AU/VST3 wrapper? probably extremely low.
        
         | whiddershins wrote:
         | Do solo or small shop vst plugin developers make any money?
         | 
         | I'm curious if anyone has any direct knowledge about that.
         | 
         | There are so many professional activities similar to that where
         | no one makes any money and people really just do it for the
         | love, and then there are seemingly similar things like that
         | where people make surprisingly large amounts of money.
        
           | ff7f00 wrote:
           | There are definitely big players making a lot of money from
           | plugins they develop. Here are a few to check out:
           | 
           | ($1200) https://www.native-instruments.com/en/products/komplete/bund...
           | 
           | ($300) https://www.soundtoys.com/product/soundtoys-5/
           | 
           | ($500) https://www.arturia.com/products/analog-classics/v-collectio...
           | 
           | However, piracy is also pretty big when it comes to plugins.
        
             | moralestapia wrote:
             | Sure, but,
             | 
             | >Do solo or small shop vst plugin developers make any
             | money?
        
               | kleer001 wrote:
               | The implication seems to be "no".
        
               | TheRealPomax wrote:
               | That implication would be wildly incorrect.
        
               | kleer001 wrote:
               | With the operative word being "developers" I disagree.
        
               | TheRealPomax wrote:
               | Yeah but facts don't really care about agreement: there
               | are loads of renowned single-person VST shops, and many
               | more "just a handful of folks" ones. Chris Heinz, Steve
               | Duda, Strezov Sampling, Matt Tytel, heck even Plugin
               | Guru, etc. etc. are all renowned folks in the VST/VSTi
               | world, and that doesn't even scratch the surface.
        
               | rogerclark wrote:
               | Steve Duda wrote Serum, probably the most popular synth
               | plugin in modern electronic music. everyone I know has a
               | license. so "yes", with the caveat that it's difficult to
               | actually create products of this level of quality
        
               | fab1an wrote:
               | Quite a few small developers in this space. It's not like
               | indie gaming, but there's also less competition. I think
               | you need to be a musician/producer to be successful here
               | though.
        
             | williamdclt wrote:
             | Having a friend working at Arturia, I don't think they're
             | "small shop" nor do they make "a lot of money" to be honest
        
           | gregsadetsky wrote:
           | I was in talks with a (new-style) 'label' that sells samples,
           | sound packs, and VST plugins. Some of their plugins have been
           | purchased 25k times.
           | 
           | One of the things I've also heard from labels is that not
           | only is there money in the VST world (it's also very
           | crowded, piracy is rampant as noted, etc.), but a lot of
           | plugins are also ported over to iOS and sold as "virtual
           | pedals". The number of sales and the revenue there were
           | described as very interesting.
        
             | williamdclt wrote:
             | When I had an active band, our guitarist went from bringing
             | his amp to rehearsal, to having a bunch of pedals, to
             | having a digital pedal board, to having an iPhone with some
             | sort of tiny adapter.
             | 
             | I made fun of him and we wouldn't have trusted it to be
             | used live, but damn it worked impressively well
        
           | jeremyjh wrote:
           | AFAIK, Mike Scuffham (www.scuffhamamps.com) earns a living
           | developing and selling S-Gear. It might be a semi-
           | retirement or lifestyle type of living - not sure - but
           | he's been doing it for over a decade now. He doesn't charge
           | as much as he could and gives away free updates for far too
           | long. Despite being a (mostly, at least) solo effort, it's
           | widely regarded as a top-tier amp sim. I personally think
           | it sounds better than both Helix and Bias, which are both
           | heavily bank-rolled outfits.
           | 
           | It doesn't have their breadth, but the tones it does have are
           | nearly as good as it gets without serious air movement.
        
           | TheRealPomax wrote:
           | They do. Strezov sampling is one guy. Serum is one guy. Chris
           | Heinz is one guy, etc. etc.
           | 
           | But you have to be willing to put in the time and make
           | phenomenal products, because no one wants average
           | instruments and effects; we can get those for free.
        
           | munificent wrote:
           | Steve Duda, the developer of Serum, is kind of the poster
           | child for this. He contracts out pieces of the synth (UI
           | design, resampler, filters), but he's mostly a one-man shop
           | and, as I understand it, Serum pays the bills.
           | 
           | It's hard to tell how much Duda is an outlier, though, and
           | how many other people could successfully follow his path.
        
           | abaga129 wrote:
           | I'm fairly new to the game, but I'm a solo developer.
           | Currently I don't make enough to quit my day job, but it is
           | a nice supplementary income, and it's nice to get paid a
           | bit for something I truly enjoy.
           | 
           | There are also several solo/small shop developers that do
           | make a living from selling plug-ins. Here are a few that I
           | can think of off the top of my head.
           | 
           | Auburn Sounds: https://www.auburnsounds.com/
           | Valhalla DSP: https://valhalladsp.com/
           | Kilohearts: https://kilohearts.com/
        
             | fab1an wrote:
             | what's your link?
        
               | abaga129 wrote:
               | Cut Through Recordings:
               | https://cutthroughrecordings.com/home
        
             | false_kermit wrote:
             | +1 Valhalla makes some of my favorite reverbs!
        
         | alexlarsson wrote:
         | I assume by real-time he meant "able to produce samples at a
         | rate equal to or higher than the audio output sample rate".
        
           | amelius wrote:
           | Latency is equally important.
        
           | TheRealPomax wrote:
           | That's not what real time means, though. Real-time
           | processing means taking signals as they come in and
           | outputting the transformed result such that there is as
           | close to no signal lag as possible. The output can in fact
           | be of wildly lower or higher resolution; real-time does not
           | particularly say anything about that. It's all about
           | whether the output plays (for practical purposes) at the
           | perceived "same time" as the input signal. There will
           | always be some delay, but that delay can't become
           | perceivable, and for obvious reasons there can't be any
           | (significant) buffering.
        
             | jeffbee wrote:
             | Is that your private definition of "real-time"? I think it
             | is common to define real-time processing by a specified,
             | finite time between input and output. Many real-time
             | processes are concerned more with the consistency of the
             | latency than with its absolute value.
        
               | avisser wrote:
               | For guitar pedals, there is an implied sub-
               | perceptibility. The output needs to happen as I play - if
               | the delay is too long, it's now a delay pedal.
               | 
               | So realtime might match your definition, but it is
               | consistent in audio production.
               | 
               | For humans, you can start to notice the lag @ 50ms. (A
               | selection of experimental results summarized here
               | https://gamedev.stackexchange.com/a/74975)
        
               | NobodyNada wrote:
               | Latency is _much_ more noticeable when you're playing a
               | musical instrument; 25-30ms is the point at which it
               | becomes distracting in my (anecdotal) experience as a
               | keyboardist. 50ms would be literally unplayable --- I
               | cannot keep in time if latency is that severe. And that's
               | _total_ output latency from the moment a key is depressed
               | to the moment the sound comes out the speakers, so it's
               | important for every component in the signal chain to have
               | the lowest possible latency. A bunch of 5-10ms delays
               | adds up _really_ quickly.
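               | 
               | The arithmetic behind those numbers is just buffer
               | size over sample rate (48 kHz assumed here):
               | 
               |   for n in (64, 128, 256, 512):
               |       ms = 1000.0 * n / 48000
               |       print(n, "samples =", round(ms, 1), "ms")
               |   # 64 -> 1.3, 128 -> 2.7, 256 -> 5.3, 512 -> 10.7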
        
               | strbean wrote:
               | Throw on some headphones, and mix your clean signal with
               | the rest of the ensemble on a 50ms delay.
               | 
               | (jk)
        
               | TheOtherHobbes wrote:
               | 1ms is usually considered inaudible. 5ms is bearable.
               | 10ms will start to annoy some people. 25ms is actually
               | pretty bad.
        
             | nitrogen wrote:
             | I think "rate" in the parent comment was just referring to
             | speed, not sample rate. But yes, latency is critical for
             | anything used during recording or performance. However way
             | back when I used to make my own music I used non-realtime
             | plugins sometimes and it was okay.
        
             | jfkebwjsbx wrote:
             | RT does not necessarily mean a small latency but a
             | guarantee on a maximum one, whatever that is.
        
         | qppo wrote:
         | I know of a few shops that took VC money. The big problem isn't
         | the market size so much as how slow the market moves. The
         | product lifetime of a plugin is around a decade. And users hate
         | subscriptions. And it's really hard to determine the value you
         | add to your customers. And no one wants to pay you.
         | 
         | It's basically a terrible place to be a developer in it for the
         | money. Really fun work otherwise. The cool gigs are the ones
         | where you build custom plugins for someone's crazy idea.
         | 
         | In consumer applications, plugins are used all the time for
         | prototyping before you go to hardware. MATLAB is way too slow
         | for anything useful.
        
           | nil-sec wrote:
           | The success of Splice would disagree with your notion that
           | "users hate subscriptions". Given the horrendous price
           | point of many of these plugins, it seems perfect for a
           | subscription-based model. To me it always seemed there is
           | more pushback from the industry producing VSTs than from
           | the consumers.
        
             | qppo wrote:
             | Splice's numbers aren't public so I can't comment on their
             | success. Avid's are, and they had a terrible quarter - and
             | they're the poster child (alongside Adobe) for subscription
             | licensing in creative software. But I'd be interested to
             | see what the breakdown in revenue is for plugin licenses
             | versus preset/sample packs (bit of a razor-and-blades
             | model there).
             | 
             | The price points really aren't horrendous if you consider
             | how expensive the engineering is, how little demand there
             | is, and how long you need to maintain a product. You aren't
             | being ripped off by spending a couple hundred bucks on a
             | plugin. I think we'll end up at a place where everything is
             | a subscription, but I can tell you from experience that it
             | creates friction for the users.
        
               | nil-sec wrote:
               | Agreed. The business model seems to be to give access to
               | the rent to own deals via the sample subscription fee.
               | Don't think they make any money of their plugin deals.
               | I'm also not arguing it's too expensive or a rip off. But
               | it's still a large amount of money for software, in the
               | private space at least. The rent to own thing seems like
               | a smart tool to get rid of the barrier of entry.
        
         | eru wrote:
         | Real-time has a few slightly different meanings. So it's hard
         | to say what the author means.
         | 
         | One meaning is just that you can guarantee specific deadlines.
         | So if your programme can react within an hour guaranteed, that
         | would be real-time. (Though usually we are talking about
         | tighter deadlines, like what's needed to make ABS brakes work.)
         | 
         | For 'real time' music usage you wouldn't need strict
         | guarantees, but something that's usually fast enough.
        
           | aea12 wrote:
           | Implementing a VST plugin is literally the _exact_ definition
           | of requiring strict latency guarantees. Your comment winds
           | through a lot of unrelated comparisons to ultimately not make
           | any sense.
           | 
           | "Usually fast enough" are three words that guarantee failure
           | in a live show/MIDI environment, which is a large use case of
           | VST and its peers beyond production. By extension, "usually
           | fast enough" further guarantees nobody will ever use your
           | software. That's noticeable right away.
           | 
           | The question isn't about compsci real-time theorycrafting,
           | it's "here's a buffer of samples, if you don't give it back
           | in a dozen milliseconds the entire show collapses." That's
           | pretty clearly meant by "real time" contextually.
        
             | swebs wrote:
             | No, the other guy is right. Technically the definition of
             | real-time can have a lot of leeway. Here's the paper linked
             | in the article. Note how the authors never define what they
             | really mean by real-time. They even make statements like
             | "runs 1.9 times faster than real-time". They certainly
             | _imply_ your definition, but there 's plenty of wiggle room
             | to say "Well technically, I wasn't lying"
             | 
             | https://www.mdpi.com/2076-3417/10/3/766/pdf
        
               | [deleted]
        
               | zodiac wrote:
               | In the context of e.g. offline video encoders "1.9x
               | realtime" is a statement about throughput, not latency
        
             | eru wrote:
             | The show won't collapse, if you have one glitch an hour.
        
               | munificent wrote:
               | If you drop an audio buffer and fire off a 22kHz impulse
               | into a 50,000 watt soundsystem, you are going to have
               | thousands of very unhappy people and likely some hearing
               | damage.
        
               | melq wrote:
               | Point taken, but 22khz is too high for people to hear I
               | think.
        
               | munificent wrote:
               | You don't need to perceive a sound to have your ears be
               | damaged by it.
               | 
               | (This goes in both directions on the spectrum too. You
               | can have your hearing damaged by infrasound as well.)
        
               | melq wrote:
                | Ah, this makes a lot of sense, thank you. Much like
                | there are parts of the light spectrum we can't see that
                | can damage the eyes.
        
               | aea12 wrote:
                | Yes, it absolutely 100% will, depending on what exactly
                | you mean by the hand-wavy "glitch". VST is built into
                | chains, and a flaky plugin will derail an entire
                | performance, often making downstream plugins crash. I'm
                | speaking from extensive experience writing plugins and
                | performing with them in multiple hosts and trigger
                | setups. It's not a robust protocol, but it gets the job
                | done.
               | 
               | Are you speaking from some experience with which I'm
               | unfamiliar where it's okay for DSP code to fail hourly?
               | Trying to understand your viewpoint.
        
               | gregsadetsky wrote:
               | Hey @aea12, would you be available for a quick chat? My
               | email is in my profile. Thank you!
        
               | nseggs wrote:
               | Agreed. If anyone wants to see some of the more
               | successful DSP work being done today for pro or prosumer
               | audio, I recommend checking out Strymon and Universal
                | Audio products. Both make use of SHARC DSPs and achieve
               | great results.
        
               | boomlinde wrote:
               | A VST that doesn't fill its buffer on time shouldn't
               | crash another plugin. It's your other plugins that are
               | flaky.
        
               | boomlinde wrote:
                | To expand on this, each plugin receives host-managed
                | buffers that it's requested to fill, along with the
                | input it's expected to process. If a plugin doesn't do
                | that in time for the host to mix the buffers and
                | deliver the mixed buffer to the audio driver, the host
                | simply won't. Nowhere in this process do the plugins
                | directly interact.
               | 
                | If your plugins are crashing because of an underrun,
                | you have a much more serious problem than underruns:
                | plugins writing to or reading from memory that was
                | neither handed to them by the host nor allocated by
                | themselves. That bad code running in your process can
                | cause it to crash is a problem orthogonal to buffer
                | underruns causing skips or stuttering in audio.
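                | 
                | To make the flow concrete, here's a toy sketch in
                | Python (hypothetical names, nothing like the actual
                | VST API): the host pre-zeroes the driver buffer, so a
                | missed deadline means silence, not a crash.
                | 
                |   import numpy as np
                |   import time
                | 
                |   BLOCK = 256  # samples per buffer
                | 
                |   def run_block(plugins, in_buf, deadline_s):
                |       out = np.zeros(BLOCK)      # driver buffer, pre-zeroed
                |       mix = np.zeros(BLOCK)
                |       start = time.monotonic()
                |       for plugin in plugins:     # plugins never touch
                |           mix += plugin(in_buf)  # each other's buffers
                |       if time.monotonic() - start <= deadline_s:
                |           out[:] = mix           # deadline met: deliver
                |       # else: underrun; the zeroed buffer plays back as
                |       # silence, and nothing here crashes anything
                |       return out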
        
               | [deleted]
        
               | remcob wrote:
                | Are there any VST containers? Something that would wrap
                | the VST, intercept underruns or other bad behaviour,
                | and substitute some alternative signal (zero,
                | passthrough, etc.). This could also be part of the host
                | software.
                | 
                | The article and your comments gave me the idea of a
                | wave-net based VST learning wrapper: if the real plugin
                | fails, substitute a wave-net based simulation of the
                | plugin.
        
               | boomlinde wrote:
               | Underruns are not bad behavior. It's the host
               | application's responsibility to hand VSTs buffers to
               | process, and the VSTs themselves have no concept of how
                | much processing time is available to them (except a
                | method that distinguishes real-time processing from
                | offline processing) or what it means to underrun the
               | buffer.
               | 
                | The behavior you describe (zero signal on underruns) is
                | a common mitigation. The DAW or the driver itself
                | zeroes the buffer that'll eventually be handed to the
                | sound card before the host application asks the plugins
                | to process, and if it doesn't have time to mix the
                | plugin outputs it'll play back that initialized buffer
                | instead.
               | 
                | From aea12's comments one might think it's normal for
                | an underrun to be fatal. Because underruns are not an
                | exceptional occurrence during production (where you
                | might occasionally load one plugin too many, or run a
                | different application with unpredictable load
                | characteristics like a web browser), this really isn't
                | an unexplored area, and although underruns are a pretty
                | jarring degradation I've never experienced crashes that
                | directly correlated with them.
        
             | boomlinde wrote:
             | "Usually fast enough" is unfortunately the only guarantee a
             | preemptive multitasking OS can give you. Unless your system
             | is guaranteeing your program x cycles of uninterrupted
             | processing per frame of audio and you can consistently
             | process the frame in that amount of cycles, the only
             | mitigation is to deliver frames in large enough chunks that
             | you never run out of time in practice under agreeable
             | circumstances.
             | 
             | That said, I agree that the question of what "real-time"
             | might mean is irrelevant given the context.
        
               | aea12 wrote:
               | It is completely irrelevant, given the context. The only,
               | only, only thing real-time means here is "can be run on a
               | live signal passing through it" rather than "is a slow,
               | offline effect for a DAW". No hard real-time, no soft
               | real-time, no QNX, no pulling out the college compsci
                | textbook. There IS real-time in that sense in DSP; it
                | just isn't in a VST plugin.
               | 
                | I'll repeat again that compsci theorycrafting is not
                | the concern here; real-time has a very specific meaning
                | in DSP. Computer science does not own the concept of
                | real-time, and the only people tripping over the
                | terminology are those with more compsci experience than
                | DSP. I appreciate everyone trying to explain this to
                | me, but (a) I understand both, and (b) this is like
                | saying "no, Captain, a vector could mean any
                | mathematical collection; air traffic control should
                | learn a thing or two from mathematics."
        
               | boomlinde wrote:
                | Just to be perfectly clear, because I'm not sure
                | whether you're just using my post as a soapbox or have
                | misunderstood my argument: I agree that it's clear what
                | real-time means in this context. I disagree that
                | "usually fast enough" guarantees failure for a VST,
                | because in the case of VST, "usually fast enough" is
                | the only guarantee the host operating system will offer
                | your software.
               | 
               | It's not "theorycrafting" to say that real-time music
               | software running in a preemptive multitasking operating
               | system without deterministic process time allocation will
               | have to suffer the possibility of occasional drops. It
               | happens in practice and audio drivers have to be
               | implemented to account for the bulk of it, and the VST
               | API is designed in such a way that failure to fill a
               | buffer on time needn't be fatal.
        
               | TheOtherHobbes wrote:
               | It usually doesn't happen in practice unless you're doing
               | a lot of other things at the same time. Which you
               | shouldn't be.
               | 
               | Of course audio is block buffered over (mostly) USB, and
               | as long as the buffers are being filled more quickly than
               | they're being played out, the odd ms glitch here and
               | there is irrelevant.
               | 
                | As real-time systems, Windows, macOS and Linux are
                | terrible from a theoretical POV, and they're useless for
               | the kinds of process control applications where even a ms
               | of lag can destroy your control model.
               | 
               | But with adequate buffering and conservative loading they
               | work well enough to handle decent amounts of audio
               | synthesis processing without glitching - live, on stage.
        
               | boomlinde wrote:
               | _> It usually doesn 't happen in practice unless you're
               | doing a lot of other things at the same time. Which you
               | shouldn't be._
               | 
               |  _> Of course audio is block buffered over (mostly) USB,
               | and as long as the buffers are being filled more quickly
               | than they 're being played out, the odd ms glitch here
               | and there is irrelevant._
               | 
                | As I've noted earlier in the thread. In fact, my entire
                | point is that the only thing you can offer under such
                | circumstances is "it usually doesn't happen", because
                | "it's usually fast enough".
               | 
               |  _> As real-time systems Windows, MacOS and Linux are
               | terrible from a theoretical POV, and they 're useless for
               | the kinds of process control applications where even a ms
               | of lag can destroy your control model._
               | 
                | You could employ the same strategies for process-
                | control problems where latency is not a problem so much
                | as jitter. You don't, because unlike in a music
                | performance, an occasional once-a-week buffer underflow
                | caused by a system that runs tens to hundreds of
                | processes already at boot can actually do lasting
                | damage there.
        
             | microcolonel wrote:
              | Not to mention that if the inference is done on the CPU,
              | it shouldn't be that hard to bound its execution time.
              | The matrices are of a fixed size by the time you're
              | running a VST; this is the simple answer.
              | 
              | The medium answer is "this is a wavenet model, so
              | inference is probably really expensive unless the
              | continuous output is a huge improvement to performance".
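              | 
              | For a rough sense of scale (illustrative layer sizes, not
              | the paper's), multiply-accumulates per second for a small
              | stack of 1-D convolutions at audio rate:
              | 
              |   layers, channels, kernel = 10, 16, 3  # made-up sizes
              |   macs_per_sample = layers * channels * channels * kernel
              |   print(macs_per_sample * 44100 / 1e6, "M MAC/s")
              |   # ~339 M MAC/s: modest for a CPU, but it grows
              |   # quadratically with the channel count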
        
               | mochomocha wrote:
               | Indeed. Having myself spent some time in the "VST
               | lifestyle business" when I was in grad school (was
               | selling a guitar emulation based on physical modelling
               | synthesis), and now working in ML, I think there's no
               | chance for such an approach to hit "mainstream" anytime
               | soon. Even if you do your inference on CPU, most deep
               | learning libraries are designed for throughput, not
               | latency. In a VST plugin environment, you're also only
               | one of the many components requiring computation, so your
                | computational requirements had better be low.
        
               | jerf wrote:
               | You might be able to combine it with the recent work on
               | minimizing models to obtain something that is small
               | enough to run reliably in real time.
               | 
               | Although the unusual structure of the net here may mean
               | you're doing original and possibly publication-level work
               | to adapt that stuff to this net structure.
               | 
               | If you were really interested in this, there could also
               | be some profit in minimizing the model and then figuring
               | out how to replicate it in a non-neural net way. Direct
               | study of the resulting net may be profitable.
               | 
                | (I'm not in the ML field. I haven't seen anyone report
                | this, but I may just not be seeing it. But I'd be
                | intrigued to see the result of running the size
                | reduction on the net, running training on _that_
                | network, then seeing if maybe you can reduce the
                | resulting network again, then training _that_, and
                | iterating until you either stop getting reduced sizes
                | or the quality degrades too far. I've also wondered if
                | there is something you could do to a net to discourage
                | redundancies in it... although in this case the
                | structure itself may do that job.)
        
               | microcolonel wrote:
               | I wonder if teddykoker has looked at applying FFTNet or
               | similar methods as a replacement for Wavenet. I'm not
               | sure but it seems to me like FFTNet is a lot more
               | tractable than Wavenet, and not necessarily that much
               | worse for equivalent training data.
        
         | samplenoise wrote:
         | There's latency and there's the somewhat separate question of
         | how much time is needed to make a prediction. Wavenet is causal
         | (no look-ahead) and operates on the sample level so there are
         | no buffers and thus no latency in the strict sense, beyond
         | encoding/decoding into the sample rate and format required by
         | the ML model, which should take <1ms. Whether a model manages
         | to make a prediction in that amount of time depends on things
         | like the receptive field and number of layers. The linked paper
         | says their custom implementation runs at 1.1x real-time. I
         | guess this isn't impossible; their receptive field is ~40ms,
          | vs. ~300ms for the original (notoriously slow) wavenet, and
          | the model is likely to have fewer layers and channels.
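          | 
          | A quick receptive-field sanity check (illustrative kernel
          | size and dilation schedule, not necessarily the paper's):
          | 
          |   def receptive_field(kernel_size, dilations):
          |       # each dilated causal conv adds (k - 1) * d samples
          |       # of history
          |       return 1 + sum((kernel_size - 1) * d for d in dilations)
          | 
          |   dilations = [2 ** i for i in range(10)]  # 1, 2, ..., 512
          |   n = receptive_field(3, dilations)
          |   print(n, "samples =", 1000 * n / 44100, "ms")
          |   # -> 2047 samples, ~46 ms at 44.1 kHz: the right ballpark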
        
           | cjlars wrote:
           | "Round trip," or guitar to processing to speakers needs to be
           | sub 10ms to be transparent to the musician. Source: spent
           | years playing guitar through my guitar -> DAC -> PC -> DAC ->
           | speaker signal chain
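            | 
            | Back-of-the-envelope (illustrative numbers): one input
            | buffer to fill, one output buffer queued, plus converter
            | latency on each end.
            | 
            |   sr, block = 44100, 128      # sample rate, buffer size
            |   buf_ms = 1000 * block / sr  # ~2.9 ms per buffer
            |   converters_ms = 1.5         # ADC + DAC, ballpark figure
            |   print(2 * buf_ms + converters_ms)
            |   # ~7.3 ms: inside a 10 ms budget, with little to spare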
        
             | samplenoise wrote:
              | The receptive field size is how much 'history' the
              | algorithm requires; it doesn't affect the round-trip
              | time, which can still be sub-ms.
        
       | munificent wrote:
       | I'm not an expert on machine learning or DSP, but I do know just
       | enough of each to suspect this isn't anywhere near as impressive
       | as it seems.
       | 
       | A distortion pedal is essentially just a waveshaper [1]. Think of
       | audio in digital terms as just a series of numbers. A waveshaper
       | is just a simple mathematical function. To apply it, you
       | literally just apply the function to each value in the input
       | stream and there's your output stream. There's no memory or
        | interesting algorithms going on. It's the audio equivalent of
       | calling map() on your list of samples with some lambda to produce
       | a new list of samples.
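        | 
        | A minimal sketch of that idea (tanh soft clipping is one
        | common waveshaping choice, not any particular pedal's curve):
        | 
        |   import numpy as np
        | 
        |   def waveshape(samples, drive=5.0):
        |       # memoryless: each output depends on exactly one input
        |       return np.tanh(drive * samples)
        | 
        |   t = np.arange(44100) / 44100     # 1 second at 44.1 kHz
        |   x = np.sin(2 * np.pi * 440 * t)  # clean 440 Hz tone
        |   y = waveshape(x)                 # "distorted" tone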
       | 
       | Of course distortion pedals do that in the analogue domain using
       | circuitry, which has some additional complexity because
       | transistors and diodes and friends don't behave exactly like
       | mathematical functions. There's "sag" and some other physical
       | effects that cause the output to also somewhat depend on previous
       | input.
       | 
       | Even so, that can generally be modelled using a simple
       | convolution. Each output sample is calculated by taking some
       | finite number of previous input samples, multiplying each of them
       | by a weight factor, and then summing the results.
       | 
        | Does that sound like a neural net? It is. That's why we call
       | them _convolutional_ neural networks. Convolution is bread and
       | butter in DSP. You can easily generate one that produces the same
       | effect as some piece of hardware or acoustic environment by
       | running an impulse (a single 1.0 sample surrounded by silence)
        | through the system and then recording the result. That "impulse
       | response" essentially _is_ your set of convolution weights.
       | 
        | So using a _deep_ neural network and then _training_ it sounds
        | like overkill to me. You could accomplish much the same by
        | using a "depth-1 network" and running an impulse through it.
       | 
       | Caveat, though: I am just a novice here, so there could very well
       | be a lot of subtlety I'm missing out on.
       | 
       | [1]: https://en.wikipedia.org/wiki/Waveshaper
        
         | afro88 wrote:
          | I think you're hand-waving away all the complexity. You're
         | right that distortion is pretty much waveshaping. But all the
         | nuance, "warmth" and lovely non-linearities that make these
         | pedals highly sought after is the really really hard part. It
         | can't be simply solved with convolution.
         | 
          | The same pedal from this post has been painstakingly circuit-
         | modeled by Cytomic[1] over the past few years and still isn't
         | out of beta. Analog circuit modeling is a huge thing in DSP
         | right now because it's the closest we have to proper 1:1
         | software clones of analog hardware. But it's incredibly time
         | consuming.
         | 
         | I'm really excited by this use of WaveNet. It could drastically
          | cut down the time to clone old, costly-to-maintain hardware.
         | it will have some way to go before you can tweak the parameters
         | in realtime. Or so I assume?
         | 
         | [1]: https://cytomic.com/#plugins
        
         | ndm000 wrote:
         | I think the real innovation here is that this was done on just
         | a few minutes of training data, opening up the possibility for
         | all kinds of effects / amps to be modeled through this same
          | method somewhat easily. I'm not sure how current DSP effects
          | are designed, but this is likely orders of magnitude simpler
         | than designing the audio transformations (digital or analog)
         | manually.
        
           | [deleted]
        
         | samplenoise wrote:
         | > You could accomplish much the same by using a "depth-1
         | network" and running an impulse through it
         | 
          | This would be true for a linear impulse response; however,
          | for this kind of effect you need both state/memory (like a
          | convolution) and non-linearity (like a waveshaper), which is
          | why people use RNNs and CNNs.
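          | 
          | A minimal sketch of combining the two (made-up weights and
          | drive; real models learn these, which is where the networks
          | earn their keep):
          | 
          |   import numpy as np
          | 
          |   def filter_then_shape(x, h, drive=4.0):
          |       memory = np.convolve(x, h)[:len(x)]  # linear: state
          |       return np.tanh(drive * memory)       # nonlinearity
          | 
          |   y = filter_then_shape(np.random.randn(1000),
          |                         np.array([0.6, 0.3, 0.1]))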
        
           | munificent wrote:
           | Ah, good point. Thank you for mentioning non-linearity. This
           | has helped clarify my novice thinking on this.
        
         | rrss wrote:
         | From a certain point of view, modern deep neural networks for
          | audio are 'just' nonlinear adaptive filters on steroids.
         | 
         | Linear adaptive filters have been around for a long long time,
         | and nowadays are everywhere. They can't capture the nonlinear
          | behavior of effect pedals, not even the waveshaper part alone.
         | 
          | The model you are describing sounds like a 'Wiener model,'
         | which refers to a linear filter followed by some nonlinearity
         | (i.e. the waveshaper).
         | 
         | There are other approaches to nonlinear adaptive filters, like
         | Volterra series and kernel methods.
         | 
         | People have been using all of these techniques, and more, to
         | approximate analog audio effects for decades.
         | 
         | A 'trained deep neural network' is not in principle that much
         | different or 'less pure' than other nonlinear adaptive
         | filtering techniques, just with a load more parameters. What
          | matters is whether the results are sufficiently improved to
          | justify the computation.
        
         | Exmoor wrote:
         | Also not an expert, but that sounds about right to me.
         | 
          | I imagine the difficulty in designing these models comes from
          | modeling the variable factors, i.e. the parameters normally
          | controlled by the knobs on the amp or effect. Some of these
          | should be straightforward (for example, "gain" increasing the
          | volume of the input signal), but I suspect that in some
          | pedals changing one parameter can affect how other parameters
          | behave. I don't see any mention of how this "deep learning"
          | model handles that.
         | 
          | Guitar modeling gear has been around for about 25 years (the
          | first Line6 amp debuted in 1996; I'm not sure if there were
          | earlier products brought to market). It's been derided by
          | purists, but has kind of turned a corner in recent years and
          | is now becoming very mainstream.
         | 
         | Some modern products, such as those sold by Kemper, actually
         | allow you to plug in to your existing gear and generate a
         | profile based on the impulse response. The results, at least
         | according to the reviews I've read, are actually very
         | impressive.
        
         | dontreact wrote:
          | I believe you are vastly oversimplifying this.
         | 
         | An impulse response will characterize only a system that is
         | 
         | * linear
         | 
         | * time-invariant
         | 
         | Many effects are not linear (especially distortion: the
         | crunchiness comes from the nonlinearity). f(a) + f(b) != f(a+b)
         | 
         | And many effects are time varying, for example phasers and
         | choruses which have low frequency oscillators controlling how
         | the sound is shaped depending on when it comes in. Chorus for
         | example will vary the pitch up and down.
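          | 
          | A concrete check that clipping breaks linearity (the hard
          | clipper here is arbitrary):
          | 
          |   import numpy as np
          | 
          |   f = lambda x: np.clip(x, -1.0, 1.0)  # hard clipper
          |   a, b = 0.8, 0.8
          |   print(f(a) + f(b), "vs", f(a + b))   # 1.6 vs 1.0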
        
           | aerospace_guy wrote:
           | Yup! This covers the basics of control theory; a simple
           | concept that most don't understand.
        
       | dharma1 wrote:
        | Here is the original paper from 2019 by Eero-Pekka Damskägg:
       | https://research.aalto.fi/en/publications/realtime-modeling-...
       | 
       | It was also published as a realtime JUCE project, which might be
       | more useful for actual (realtime VST/AU) use:
       | 
       | https://github.com/damskaggep/WaveNetVA
       | 
       | Alec Wright has done more work on this since then, using it for
       | amplifiers:
       | 
       | https://www.aalto.fi/en/news/deep-learning-can-fool-listener...
       | 
        | And time-varying effects:
       | 
       | https://github.com/Alec-Wright/NeuralTimeVaryFx
        
       ___________________________________________________________________
       (page generated 2020-05-11 23:00 UTC)