[HN Gopher] Show HN: I made a volumetric audio visualizer
       ___________________________________________________________________
        
       Show HN: I made a volumetric audio visualizer
        
       I'm developing Hyperstep[0], a spatial language for music
       production. I find using existing DAWs frustrating because they
       don't allow me to navigate and operate intuitively on the latent
       spaces behind my musical ideas. This is why I've decided to build
       my own set of "seeing tools" (Bret Victor)[1]. I'm also convinced
       that by framing music as processes and interactions in the 3D
       world, spatialization and mixing should become fairly pain-free.
       I'm still early in development and I would love to build this into
       an actual product that can be integrated into existing DAWs or even
       turn it into a musical framework itself for AR and VR experiences.
       If you're interested in working on it or if you simply want to know
       more, feel free to contact me.
       [0] https://github.com/a-sumo/hyperstep
       [1] https://www.youtube.com/watch?v=klTjiXjqHrQ
        
       Author : rslice
       Score  : 55 points
       Date   : 2022-11-01 18:18 UTC (4 hours ago)
        
 (HTM) web link (a-sumo.github.io)
 (TXT) w3m dump (a-sumo.github.io)
        
       | pvg wrote:
       | You might want to include some audio (or maybe it's there and not
       | loading in my browser) so people who try it can see what it is
       | right away. That way people also don't have to wonder what
       | happens to their audio file just to check it out.
        
         | eimrine wrote:
         | And in some cases, like on a work PC, finding music can be
         | challenging.
        
           | archontes wrote:
           | https://commons.wikimedia.org/wiki/Category:Audio_files_of_c.
           | ..
        
             | Kaibeezy wrote:
             | https://commons.wikimedia.org/wiki/Category:Audio_files_of_
             | c...
        
       | knaekhoved wrote:
       | What is the point of making this 3D when it seems to be
       | circularly symmetric? Just replace the circle with a line and
       | make it 2D, yeah?
        
         | rslice wrote:
         | By building symmetries and lifting low-dimensional data into
         | higher dimensional spaces (here, 3D), you can better leverage
         | the human perceptual system. You also unlock more intuitive
         | modes of interaction that are closer to the real-world
         | processes that originated the sounds.
        
           | knaekhoved wrote:
           | > lifting low-dimensional data into higher dimensional spaces
           | 
           | This is only useful if you actually do something with the
           | extra dimensions, e.g. lifting with a kernel. Duplicating the
           | exact same data into more dimensions is not helpful.
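
For readers unfamiliar with "lifting with a kernel", here is a minimal sketch of the idea. The toy data, the quadratic feature map `lift`, and the helper `threshold_separates` are all illustrative assumptions, not anything from Hyperstep. The point is that the extra dimension carries new information (x squared) rather than a copy of the input.

```python
# Toy 1D data (made up for illustration): the inner class sits between
# the two halves of the outer class, so no single threshold on x works.
inner = [-0.5, -0.2, 0.1, 0.4]
outer = [-2.0, -1.5, 1.6, 2.2]


def threshold_separates(a, b):
    """True if one cut point on the number line puts all of a on one side
    and all of b on the other."""
    return max(a) < min(b) or max(b) < min(a)


def lift(x):
    """Hypothetical explicit feature map: 1D point -> 2D point (x, x^2)."""
    return (x, x * x)


def kernel(x, y):
    """Kernel induced by lift: equals the dot product lift(x) . lift(y)
    without ever building the lifted vectors."""
    return x * y + (x * y) ** 2


# In the original 1D space the classes are interleaved.
separable_before = threshold_separates(inner, outer)

# After lifting, the second coordinate (x^2) alone separates them.
separable_after = threshold_separates(
    [lift(x)[1] for x in inner],
    [lift(x)[1] for x in outer],
)
```

Duplicating x into a second coordinate, i.e. mapping x to (x, x), would leave the classes just as interleaved, which is the objection raised in the comment above.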
        
             | rslice wrote:
             | I agree. As of now, the extra dimensions are redundant.
             | 
             | However they are still somewhat useful. By using distance
             | functions I localize sound sources in space but I also turn
             | them into symbols. This helps me keep track of them and I
             | plan on building a set of rules (grammar) for how these
             | symbols can interact.
             | 
             | Now, building the semantically relevant 'spatial symbols'
             | from data is where the real challenge is, and the first
             | step is to actually gather such data. Unfortunately I don't
             | have access to a photogrammetry setup so all I can do is
             | wait for companies/research institutes to make appropriate
             | data accessible. The alternative is to generate the data
             | synthetically, but you hit a procedural audio generation
             | wall.
             | 
             | >lifting with a kernel
             | 
             | I am not familiar enough with the lifting trick to know
             | whether or not it is relevant to this context, which is
             | that of 'embodying' sounds, and not of classifying them. I
             | think it would be silly to think the 3D space would be
             | sufficient to perform sound source separation and/or music
             | transcription. If I were to add those features I would
             | definitely use existing models (neural nets), which properly
             | leverage much higher dimensional spaces.
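
As a rough illustration of the distance functions mentioned in this comment, the sketch below defines signed distance functions for a sphere and a capsule-shaped tube, plus a mapping from distance to opacity. It is not taken from the Hyperstep source. The function names, the capsule as a stand-in for a "tube", and the exponential falloff are all assumptions made for the example.

```python
import math


def sdf_sphere(p, center, radius):
    """Signed distance from point p to a sphere: negative inside,
    zero on the surface, positive outside."""
    return math.dist(p, center) - radius


def sdf_capsule(p, a, b, radius):
    """Signed distance from p to a tube of the given radius around the
    segment a-b (a capsule; an assumed stand-in for a 'tube' shape)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    # Project p onto the segment and clamp to its endpoints.
    t = 0.0 if denom == 0 else sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest) - radius


def density(distance, falloff=4.0):
    """Hypothetical mapping from signed distance to opacity in [0, 1]:
    fully opaque inside the shape, exponential decay outside."""
    return 1.0 if distance <= 0 else math.exp(-falloff * distance)
```

A volume renderer could evaluate `density(sdf_sphere(p, center, radius))` at each sample point, with the center marking where a sound source sits in the scene. How audio features would drive the radius or falloff is left open here.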
        
       | eimrine wrote:
       | I tried to open an mp3 but nothing happened. Outdated Chrome on
       | 32-bit Windows. Will try on a decent PC later.
        
       | nsxwolf wrote:
       | I tried it on my iPhone with an mp3. It displayed what looked
       | like a blue aerogel. I could zoom in and rotate but I couldn't
       | tell what kind of information it was trying to convey. It wasn't
       | playing the audio or changing with time in any way either.
        
       | naillo wrote:
       | Worked for me :) Neat!
        
         | rslice wrote:
         | Thanks! Glad you're enjoying it!
        
       | techbro92 wrote:
       | I agree a 3d visual would be useful for spatialization and mixing
       | but I don't see how it would be useful for anything else.
        
         | AlecSchueler wrote:
         | I mean mixing is a whole industry itself, why do you think
         | that's an unsatisfactory target market?
        
           | techbro92 wrote:
           | I don't
        
         | rslice wrote:
         | In the repository https://github.com/a-sumo/hyperstep#organic-
         | drums-through-ag..., I explain how one can generate drums by
         | framing them as agent locomotion.
         | 
         | Moreover, music and dancing are two sides of the same coin and
         | in styles such as hip hop and popping you have 'beat-
         | killers'[0] who basically represent music through their motion.
         | 
         | If you build an appropriate spatial language, you can compose
         | music from dancing or you can build a form of generalized
         | dancing. I've explored this concept artistically[1],[2] and I'm
         | currently working on formalizing it and making it
         | computational.
         | 
         | [0] https://www.youtube.com/watch?v=IeNn4XG0QHo
         | [1] https://imgur.com/gallery/pfrvIot
         | [2] https://imgur.com/gallery/MoDVr6v
        
       | knaik94 wrote:
       | I am not sure what is causing it, but it takes a solid 2 to 3
       | minutes on my computer before it does anything. I load a file
       | and it feels like it freezes, and Firefox gave me a warning
       | banner that the tab was causing all of Firefox to slow down. Same
       | thing in Chrome, and I have an i7-10750H. Some people might
       | mistake that for it not working since there is no UI feedback of
       | anything happening. Windows 10.
       | 
       | I got two different tracks to work, and it's clear that one was a
       | lot harder to process than the other. The second one took
       | noticeably more time to start, and the CPU utilization was
       | higher as well. They were both instrumental tracks in the same
       | format and around the same length. The one simpler to process was
       | the instrumental of Britney Spears' Baby One More Time. The
       | harder one was Porter Robinson's Divinity.
       | 
       | Neither track produced an effect similar to the one in the demo
       | video, but both were interesting regardless. They both looked
       | like how I imagine sound waves would echo and bounce around if
       | contained in a cube.
       | 
       | I appreciate the notebook writeup where you described the goals
       | because the visualization wasn't inherently intuitive with the
       | sound. I chose much more complex tones than your demo. I imagine
       | the feature extraction is much easier on isolated sounds. This
       | reminds me a lot of MilkDrop, so I was expecting it to be
       | closer to that but in 3D. That was probably a misunderstanding
       | on my part of the goals for this.
       | 
       | I think exposing more parameters about how features get mapped
       | and scaled would be really helpful in making it feel more
       | intuitive. Zooming the cube in and out is nice but didn't seem to
       | help convey more information with the tracks I chose. If anything
       | it got in the way because on my computer the zoom sensitivity was
       | very very high.
       | 
       | I look forward to seeing where this goes.
        
         | rslice wrote:
         | Thank you for your thoughtful comment! This is by no means a
         | product, it's more of a way for me to test an idea and share it
         | with the community.
         | 
         | This demo is meant for small audio samples. My initial goal was
         | to use it to visualize and compare drum samples by looking at
         | their 'spatial signature'.
         | 
         | Right now, I'm using arbitrarily defined 'shapes' (sphere and
         | tube) but the goal is to recover those from real-world data.
         | Unfortunately, building the appropriate data-set and the model
         | to go with it is currently out of my reach but I know that's
         | how my brain sort of learned these audio/spatial associations.
         | 
         | In order for this to work on complete tracks, I would need to
         | add source separation, transcription and some form of
         | information compression.
         | 
         | Additionally, I'm working on ways to deal with richer sounds,
         | by laying them out in space or by splitting them into 'voices
         | of unison'. Here's a demo of what it would look like:
         | https://imgur.com/gallery/gkFPXXu
         | 
         | There are two directions I could take this, either making music
         | from interacting with spatial representations or building
         | spatial representations from music. I don't have the bandwidth
         | to do both alone so I would love to work on it with other
         | developers.
        
       | dvh wrote:
       | Works exactly once but only on short files, then I have to close
       | the browser (all windows) and restart. Chrome, Ubuntu.
        
         | rslice wrote:
         | Thanks for the feedback, I'm investigating issues on Linux. Try
         | opening the website on a smartphone.
        
       | packetlost wrote:
       | Broke for me on Firefox and Chrome
        
         | rslice wrote:
         | I am working on using audio worklets to perform the audio
         | processing without blocking the main UI thread, but support on
         | browsers other than Chrome is still limited.
         | 
         | Try loading smaller samples or audio snippets.
        
       ___________________________________________________________________
       (page generated 2022-11-01 23:01 UTC)