[HN Gopher] Pedalboard: Spotify's Audio Effects Library for Python
       ___________________________________________________________________
        
       Pedalboard: Spotify's Audio Effects Library for Python
        
       Author : bobbiechen
       Score  : 167 points
       Date   : 2021-09-08 15:56 UTC (7 hours ago)
        
 (HTM) web link (engineering.atspotify.com)
 (TXT) w3m dump (engineering.atspotify.com)
        
       | aVx1uyD5pYWW wrote:
       | It would be cool if there was a way to use this tool in a shell
       | pipeline, e.g.:
       | 
       | cat sound.wav | distortion | reverb | aplay
        
         | gregsadetsky wrote:
         | That's an interesting idea that you should be able to build on
          | top of Pedalboard -- i.e., implement a CLI that accepts
          | input and output file paths (optionally supporting piped
          | data, as in your example) and exposes the VST plugin names
          | and their options, e.g.:
         | 
         | $ cat sound.wav | pedalboard --compressor --compressor-
         | threshold-db=-50 --gain --gain-db=30 | aplay
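          | 
          | A minimal sketch of that CLI idea, with hypothetical flag
          | names and a toy gain effect standing in for real Pedalboard
          | plugins -- it parses "--effect --effect-param=value" pairs
          | into a chain of callables applied to a sample buffer:

```python
# Hypothetical CLI sketch: parse "--effect --effect-param=value"
# flags into a chain of functions over a float sample buffer.
# Only a toy "gain" effect is implemented here.
import numpy as np

def make_gain(gain_db=0.0):
    factor = 10.0 ** (gain_db / 20.0)
    return lambda samples: samples * factor

EFFECTS = {"gain": make_gain}  # a real tool would register VST names here

def build_chain(args):
    """Turn ['--gain', '--gain-db=6'] into a list of callables."""
    chain, current, params = [], None, {}

    def flush():
        if current is not None:
            chain.append(EFFECTS[current](**params))

    for arg in args:
        name = arg.lstrip("-")
        if "=" in name:
            key, value = name.split("=", 1)
            # '--gain-db=6' becomes the 'gain_db' keyword argument
            params[key.replace("-", "_")] = float(value)
        else:
            flush()
            current, params = name, {}
    flush()
    return chain

def process(samples, chain):
    for effect in chain:
        samples = effect(samples)
    return samples
```

          | (A real implementation would populate EFFECTS from each
          | loaded plugin's parameter list rather than hard-coding it.)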
        
         | PaulDavisThe1st wrote:
         | That's easy to do if you first strip the WAV formatting stuff
         | and then at the end add back the format information.
         | 
         | You can't really do it without that, because sound.wav contains
         | both actual audio data and "metadata".
         | 
          | In the real world, however, almost nobody who has done this sort
         | of thing actually wants to do it that way. The processing
         | always has a lot of parameters and you are going to want to
         | play with them based on the actual contents of sound.wav. Doing
         | this in realtime (listening while fiddling) is much more
         | efficient than repeatedly processing then listening.
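          | 
          | The strip-process-re-add approach can be sketched with just
          | the standard library (assuming 16-bit PCM input; real code
          | would handle other sample widths too):

```python
# Decode a WAV's header (the "metadata"), run an effect over the raw
# 16-bit samples, then write the header back out. Stdlib only.
import io
import struct
import wave

def process_wav(wav_bytes, effect):
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        params = reader.getparams()            # the format information
        frames = reader.readframes(reader.getnframes())
    count = len(frames) // 2                   # 16-bit = 2 bytes/sample
    samples = struct.unpack("<%dh" % count, frames)
    processed = effect(samples)
    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setparams(params)               # add the header back
        writer.writeframes(struct.pack("<%dh" % count, *processed))
    return out.getvalue()

def half_volume(samples):
    return [s // 2 for s in samples]
```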
        
         | jcelerier wrote:
          | you can do that trivially with sox
          | (http://sox.sourceforge.net/sox.html):
          | 
          | alias fx="sox - -t wav -"
          | cat foo.wav | fx overdrive | fx reverb | play -
         | 
          | considering that sox has existed since the early 1990s, I'd
          | wager that the demand for that isn't exactly huge
         | 
         | (note that in practice you'd directly use sox's play command to
         | apply effects as it's certainly muuuuuuuuuuuch more efficient
         | than to spin up a ton of processes which'll read from
         | stdin/stdout)
        
           | dimatura wrote:
           | I've never actually used sox for fx, I wonder how good they
           | are. (I have used it plenty for resampling, normalization,
           | trimming, etc). But regardless, there's thousands of VSTs out
           | there -- a lot more options than whatever sox has built in.
        
             | jcelerier wrote:
              | IIRC sox can apply plugins, maybe not VST, but LV2 or LADSPA
        
       | bayesian_horse wrote:
       | I'm still hoping someone will implement a custom node editor for
       | sound effects in Blender...
       | 
       | I thought about doing it, but don't need it that badly and you
       | know, so many ideas so little time!
        
       | PaulDavisThe1st wrote:
       | > This ability to play with sound is usually relegated to DAWs,
       | and these apps are built for musicians, not programmers. But what
       | if programmers want to use the power, speed, and sound quality of
       | a DAW in their code?
       | 
       | Well then, they could:
       | 
       | * use Faust or Soul
       | 
       | * use existing plugins in LV2, VST3 or AU formats
       | 
       | * write a new plugin in LV2, VST3 or AU formats
       | 
       | * use SuperCollider, or PureData or any of more than a dozen
       | live-coding languages
       | 
       | * use VCV Rack or Reaktor or any of at least half-dozen other
       | modular environments to build new processing pathways.
       | 
       | Oh wait ...
       | 
       | > Artists, musicians, and producers with a bit of Python
       | knowledge can use Pedalboard to produce new creative effects that
       | would be extremely time consuming and difficult to produce in a
       | DAW.
       | 
        | So it's not actually for programmers at all, it's for people
       | a bit of Python knowledge".
       | 
       | OK, maybe I'm being a bit too sarcastic. I just get riled up by
       | the breathless BS in the marketing copy for this sort of thing.
       | 
       | It's a plugin host, with the ability to add your own python code
       | to the processing pathway. Nothing wrong with that, but there's
       | no need to overstate its novelty or breadth.
       | 
       | [ EDIT: if I hadn't admitted to my own over-snarkiness, would you
       | still have downvoted my attempt to point out other long-available
       | approaches for the apparent use-case? ]
        
       | 41209 wrote:
       | Any reason for picking GPL.
       | 
       | I was very excited to see this , but with a GPL license I can't
       | use it in my projects .
        
         | lsb wrote:
         | I'm sure they'd love to sell you access under a different
         | license!
        
         | psobot wrote:
         | Pedalboard is a wrapper around the JUCE framework
         | (https://juce.com), which is dual-licensed under the GPLv3 or a
         | custom paid commercial license. We chose to license it with the
         | GPLv3 rather than coming up with a dual-license solution
         | ourselves, given that this is an audio processing tool in
         | Python and will usually be used in scripts, backends, and other
         | scenarios where users of Pedalboard are not likely to
         | distribute their code in the first place.
        
           | 41209 wrote:
           | Ahh.
           | 
           | That makes sense.
           | 
            | I definitely do appreciate it; I couldn't figure out JUCE
            | when I tried to use it.
        
       | psobot wrote:
       | Wow, didn't expect this to hit HN! I'm the author of this project
       | and super glad that it's getting some traction.
       | 
       | Under the hood, this is essentially just a Python wrapper around
       | JUCE (https://juce.com), a comprehensive C++ library for building
       | audio applications. We at Spotify needed a Python library that
       | could load VSTs and process audio extremely quickly for machine
       | learning research, but all of the popular solutions we found
       | either shelled out to command line tools like sox/ffmpeg, or had
       | non-thread-safe bindings to C libraries. Pedalboard was built for
       | speed and stability first, but turned out to be useful in a lot
       | of other contexts as well.
        
         | Mizza wrote:
         | Out of curiosity, did you use any of the code produced by Echo
         | Nest? They were a Boston audio tech company that had lots of
         | features like this, but they got swallowed by Spotify many
         | years ago. I built some tools on top of their service, I always
         | wondered what happened to it.
        
           | psobot wrote:
           | No Echo Nest code was included in this project specifically,
           | but my team owns a lot of the old Echo Nest systems, data,
           | and audio magic (i.e.: what used to be the Remix API, audio
           | features, audio analysis, etc.). Pedalboard is being used to
           | continue a lot of the audio intelligence research that
           | started way back with the Echo Nest!
           | 
           | (Fun fact: the Echo Nest's Remix API was what got me
           | interested in writing code way back in high school. Now, more
           | than a decade later, I'm the tech lead for the team that owns
           | what's left of it. I still can't believe that sometimes.)
        
         | gregsadetsky wrote:
         | This is great, congrats and thank you (& Spotify) for releasing
         | this!
         | 
         | I was just about to look for a library to layer 2 tracks (a
         | text-to-speech "voice" track, and a background music track) and
         | add compression to the resulting audio.
         | 
         | A few questions if you don't mind:
         | 
         | - Pedalboard seems more suited to process one layer at a time,
         | correct? I would be doing muxing/layering (i.e. automating the
         | gain of each layer) elsewhere?
         | 
         | - Do you have a Python library recommendation to mux and add
         | silence in audio files/objects? pydub seems to be ffmpeg-based.
         | Is that a better option than a pure-Python implementation such
         | as SoundFile?
         | 
         | Thanks
        
           | psobot wrote:
           | Thanks!
           | 
           | That's correct: Pedalboard just adds effects to audio, but
           | doesn't have any notion of layers (or multiple tracks, etc).
           | It uses the Numpy/Librosa/pysoundfile convention of
           | representing audio as floating-point Numpy arrays.
           | 
           | Mixing two tracks together could be done pretty easily by
           | loading the audio into memory (e.g.: with soundfile.read),
           | adding the signals together (`track_a * 0.5 + track_b *
           | 0.5`), then writing the result back out again.
           | 
           | Adding silence or changing the relative timings of the tracks
           | is a bit more complex, but not by much: the hardest part
           | might be figuring out how long your output file needs to be,
           | then figuring out the offsets to use in each buffer (i.e.:
           | `output[start:end] += gain * track_a[:end - start]`).
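            | 
            | A minimal numpy sketch of that buffer arithmetic (mono
            | float tracks; in practice soundfile.read would supply
            | the arrays):

```python
# Mix mono tracks into one output buffer: each track is summed in
# at a start offset with a per-track gain, truncated to the output.
import numpy as np

def mix(tracks, length):
    """tracks: iterable of (samples, start_offset, gain) tuples."""
    output = np.zeros(length, dtype=np.float32)
    for samples, start, gain in tracks:
        end = min(start + len(samples), length)
        output[start:end] += gain * samples[: end - start]
    return output
```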
        
             | gregsadetsky wrote:
              | Makes sense, so I'd be doing everything at the sample level.
             | 
             | For layers, I could have an array that represents "gain
             | automation" for each layer, and then let numpy do `track_a
             | * gain_a + track_b * (1-gain_a)` for the whole output in
             | one go.
             | 
              | And I'd create silences by inserting 0's (and making
              | sure that I'm inserting them after a zero-crossing
              | point to avoid clicks).
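              | 
              | That gain-automation idea in one numpy expression (a
              | linear fade here; any per-sample envelope array works):

```python
# Crossfade two equal-length mono tracks with a per-sample gain
# envelope: track_a fades out as track_b fades in.
import numpy as np

def crossfade(track_a, track_b):
    gain_a = np.linspace(1.0, 0.0, len(track_a))
    return track_a * gain_a + track_b * (1.0 - gain_a)
```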
             | 
              | I'm prone to NIH :-) but I'll also try to see if
              | something like this exists. But at least it's clearly
              | doable/prototype-able!
             | 
             | Thank you
        
         | PaulDavisThe1st wrote:
         | Are you using python to do realtime audio processing, or is
         | this all offline ("batch") processing? It wasn't entirely clear
         | from reading the blurb ...
        
           | psobot wrote:
           | We use Pedalboard (and Python) for offline/batch processing -
           | mostly ML model training.
           | 
           | Pedalboard would also be usable in situations that are
           | tolerant of high latency and jitter, though, given that all
           | audio gets handed back to Python (which is both garbage
           | collected and has a global interpreter lock) after processing
           | is complete.
        
         | odiroot wrote:
         | Hey, did you consider releasing a wrapper for VST instruments?
         | 
          | There's definitely a lack of cross-platform VST hosts
          | (without the need to use a DAW).
         | 
         | Also can Pedalboard support VST GUIs?
        
           | psobot wrote:
           | Instruments wouldn't be that hard to add to Pedalboard, but
           | we don't have a use case for them on my team just yet. I
           | might give that a try in the future, or might let someone
           | else in the community contribute that.
           | 
           | Pedalboard doesn't support GUIs at the moment, but there's an
           | issue on GitHub to track that:
           | https://github.com/spotify/pedalboard/issues/8
        
         | ironrabbit wrote:
         | Slightly off-topic, but is there a good overview of machine
         | learning research being done at Spotify?
        
           | psobot wrote:
           | There is! Check out http://research.atspotify.com/.
        
       | squarefoot wrote:
        | I see some criticism; however, keep in mind that the news is
        | that the library has just been open-sourced, so it's a good
        | thing even just for learning.
        
       | 12ian34 wrote:
        | Whilst this seems cool, I'm struggling to understand the
        | real-world use cases.
       | 
       | > Machine Learning (ML): Pedalboard makes the process of data
       | augmentation for audio dramatically faster and produces more
       | realistic results ... Pedalboard has been thoroughly tested in
       | high-performance and high-reliability ML use cases at Spotify,
       | and is used heavily with TensorFlow.
       | 
       | What are the actual use cases internally at Spotify and for the
       | public here?
       | 
       | > Applying a VST3(r) or Audio Unit plugin no longer requires
       | launching your DAW, importing audio, and exporting it; a couple
       | of lines of code can do it all in one command, or as part of a
       | larger workflow.
       | 
       | I wonder how many content creators are more comfortable with
       | Python than with a DAW or Audacity?
       | 
       | > Artists, musicians, and producers with a bit of Python
       | knowledge can use Pedalboard to produce new creative effects that
       | would be extremely time consuming and difficult to produce in a
       | DAW.
       | 
       | Googling "how to add reverb" yields Audacity as the first option.
        | A free, open-source tool available on Linux+Win+Mac. In what
        | world is it easier to do this in Python for artists,
        | musicians, and producers?
       | 
       | As a music producer that's well versed in Python myself (even if
       | I hadn't switched to producing almost entirely out-of-the-box and
       | on modular/hardware synths) I'd much rather just apply basic
       | effects like these in a DAW/Audacity, where accessing and
       | patching a live audio stream is much easier than figuring out how
       | to do that in Python and only being able to apply effects to .wav
       | files rather than live audio.
        
         | bayesian_horse wrote:
         | For machine learning on audio data, it is often (always?)
         | useful to modify the original dataset to make the model more
         | general.
         | 
         | A way to do such manipulation that is both convenient to use
         | from Python (a major programming language in the field and well
         | tied in to the major frameworks) and performant is extremely
         | welcome.
        
         | thibaut_barrere wrote:
         | Not a lot of creators are necessarily comfortable with Python
         | or other coding, but there are definitely people (including me)
          | interested in whatever can be done programmatically with
          | DAW quality, without a DAW.
         | 
          | This opens up possibilities such as version control,
          | collaboration via PRs, the regular coding workflow, etc.
         | 
         | (I am dabbling with music and Elixir + Rust at the moment, and
         | definitely interested by what Pedalboard brings, including
         | programmatic VST hosting etc).
        
           | PaulDavisThe1st wrote:
           | There are plenty of standalone plugin hosts, so just write a
           | plugin (JUCE offers a perfectly fine framework and workflow
           | for that, as do some others like DPF), load it into a
           | standalone plugin host, done.
        
             | hannasanarion wrote:
             | That's what pedalboard does. It's a python wrapper and
             | plugin host for JUCE.
        
             | thibaut_barrere wrote:
              | I have used various options for that (using a VST SDK
              | host, or various libraries, etc.), but I am happy to
              | have options, actually.
        
         | dimatura wrote:
          | On the ML front (which is probably their primary
          | motivation) it's pretty useful for the kinds of things
          | Spotify is interested in. As a basic example, say you want
          | to train a model
         | to classify songs by genre. If you have say, a country song,
         | adding a bit of reverb or compression to it will not change
         | what genre it sounds like. So augmenting their training data
         | with small transformations such as these can make their models
          | more robust to these transformations. Obviously, this has
          | to be done judiciously: e.g., if you add tons of
          | distortion and reverb
         | to a country song it might sound like some experimental noise
         | and not country. This kind of thing also can help with
         | duplicate detection, song recommendations, playlist generation,
         | autotagging, etc.
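          | 
          | A toy sketch of that label-preserving augmentation (random
          | gain and a little noise standing in for Pedalboard's
          | reverb/compression; the genre label is left untouched):

```python
# Label-preserving augmentation: a small random gain change plus
# low-level noise. Real pipelines would use Pedalboard effects
# (reverb, compression) instead of these toy perturbations.
import numpy as np

def augment(samples, rng, max_gain_db=3.0, noise_level=0.001):
    gain_db = rng.uniform(-max_gain_db, max_gain_db)
    out = samples * 10.0 ** (gain_db / 20.0)
    out = out + rng.normal(0.0, noise_level, size=samples.shape)
    return np.clip(out, -1.0, 1.0)  # keep samples in [-1, 1]
```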
         | 
         | As for creators, maybe not a large fraction of music creators
          | are coders, but there's certainly an intersection in that Venn
         | diagram, though I have no idea how large it is. And I imagine
         | this could be used to create other tools that don't require
         | coding.
         | 
         | Clearly, most of the time it makes more sense to apply FX
         | interactively in your DAW of choice, but I find it useful to
         | programmatically modify audio sometimes. For example, I've
         | written quick scripts using sox and other tools to
          | normalize/resample audio, as well as slice loops. I could
          | see programmatically adding other fx, such as compression
          | or maybe even reverb, being occasionally useful.
        
           | tekromancr wrote:
           | > maybe not a large fraction of music creators are coders
           | 
            | I think you would be surprised to know how large that
            | middle spot in the Venn diagram is
        
             | ace2358 wrote:
              | I don't think I would. Of the music producers and
              | musicians I know, the majority are relatively poor at
              | 'tech' and definitely not coders/programmers/software
              | engineers. They definitely know their way around their
              | tools, but they're not coders.
        
         | minxomat wrote:
         | > Applying a VST3(r) or Audio Unit plugin no longer requires
         | launching your DAW, importing audio, and exporting it; a couple
         | of lines of code can do it all in one command, or as part of a
         | larger workflow.
         | 
          | This is also not really true. There are plenty of
          | scriptable VST hosts and libraries. BASS (the library),
          | for instance, has been around for ages and I've used it to
          | host VSTs in script workflows.
        
         | interestica wrote:
         | > I wonder how many content creators are more comfortable with
         | Python than with a DAW or Audacity?
         | 
            | This opens up the potential for a simple GUI: for a
            | basic user, drag and drop an audio file and flip virtual
            | switches. Or easier integration into a mobile "podcast
            | creator" app.
        
           | PaulDavisThe1st wrote:
            | Some years ago, when Ardour (a cross-platform FLOSS DAW) was
           | being sponsored by SSL (famous for their large scale mixing
           | consoles), I got to attend a meeting designed to float and
           | discuss "blue sky" ideas.
           | 
           | Somebody who had been with the company for a long time
           | predicted that the broadcast world was going to end up
            | demanding a box with just 3 buttons:
            | 
            |     [ That was worse ]
            |     [ That was better ]
            |     [ Try something else ]
           | 
           | Everybody laughed, but everybody also knew that this was
           | indeed the direction that audio engineering was going to go
           | in.
           | 
           | And now, 12 years later ...
        
             | interestica wrote:
             | One of my favourite things with Winamp ~20 years ago was
             | the ability to stack DSP/sound plugins and then output to
             | WAV. It was a weird but great way to quickly create CD-
              | ready tracks that were crossfaded with effects (e.g. speed or
             | stereo separation or vocal removal) etc. I was basically
             | 'batch processing' through a GUI without even realizing it.
        
             | dimatura wrote:
             | Yeah, sounds like the kind of thing Landr and Izotope are
             | offering these days.
        
       ___________________________________________________________________
       (page generated 2021-09-08 23:00 UTC)