[HN Gopher] Using ASCII waveforms to test real-time audio code
       ___________________________________________________________________
        
       Using ASCII waveforms to test real-time audio code
        
       Author : jwosty
       Score  : 70 points
       Date   : 2021-10-13 18:30 UTC (4 hours ago)
        
 (HTM) web link (goq2q.net)
 (TXT) w3m dump (goq2q.net)
        
       | spicybright wrote:
       | Why would you use ascii for something like a waveform, something
       | that's inherently a graph?
       | 
       | Sure, maybe you don't need that much resolution for what the use
       | case is. But it's the equivalent of looking at a graph and
       | squinting your eyes to blur it.
        
         | jwosty wrote:
         | In short, because text is much easier to deal with than
         | bitmaps, and there is much more tooling that "just works" for
         | text than actual graphics, like Expecto's textual diffing in
         | assertions. @MayeulC said it well:
         | https://news.ycombinator.com/item?id=28856884
        
       | bitwize wrote:
       | That's so cool, and reminds me of how I used Gnuplot as a
       | makeshift oscilloscope to test and evaluate some (not real time)
       | software synthesis I was doing.
        
       | munchler wrote:
       | This is great. People are doing very cool things with F# these
       | days.
        
         | jwosty wrote:
         | Thanks, I like to think so! I didn't see other people doing
         | much audio programming in F#, so I figured someone would be
         | interested in seeing what it can look like.
        
           | brianberns wrote:
           | FWIW, you might like this:
           | https://github.com/brianberns/FYampaSynth
        
             | jwosty wrote:
             | That looks right up my alley, thanks for the link!
        
       | phab wrote:
       | This approach is neat for observability, but it's worth noticing
       | that it essentially quantises all of your samples down to the
       | vertical resolution of your graph. If you somehow introduced a
       | bug that caused an error that was smaller than the step size then
       | these tests wouldn't catch it.
       | 
       | (e.g. if you somehow managed to introduce a constant DC-offset of
       | +0.05, with the shown step size of 0.2, these tests would
       | probably never pick it up, modulo rounding.)
       | 
       | That said, these tests are great for asserting that specific
       | functionality does broadly what it says on the tin, and making it
       | easy to understand why not if they fail. We'll likely start using
       | this technique at Fourier Audio (shameless plug) as a more
       | observable functionality smoke test to augment finer-grained
       | analytic tests that assert properties of the output waveform
       | samples directly.
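
A minimal sketch of the masking effect described above. It assumes a renderer that rounds each sample to the nearest multiple of the step size; the name `quantise` and the sample values are illustrative and not taken from the article.

```python
def quantise(samples, step=0.2):
    """Map each sample to the index of its nearest row (assumed rendering rule)."""
    return [round(s / step) for s in samples]


clean = [0.0, 0.4, 0.0, -0.4]
offset = [s + 0.05 for s in clean]  # constant DC offset of +0.05

# Both signals land in the same rows, so a diff of the rendered text sees nothing.
same_picture = quantise(clean) == quantise(offset)

# A direct numeric comparison of the samples does see the offset.
max_error = max(abs(a - b) for a, b in zip(clean, offset))
```

Samples that sit close to a row boundary would flip rows, which is the "modulo rounding" caveat.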
        
         | PaulDavisThe1st wrote:
         | A more accurate and only slightly more complex process for this
         | is to generate numerical text representations of the desired
         | test waveforms and then feed them through sox to get actual
         | wave files. The numerical text representations are likely even
         | easier to generate programmatically than the ascii->audio
         | transformation.
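
A rough sketch of the numeric-text route. It assumes sox's plain-text `.dat` format, which as far as I know starts with `; Sample Rate` and `; Channels` comment lines followed by one `time value` row per sample. The function name and file names are hypothetical.

```python
def to_sox_dat(samples, sample_rate):
    """Serialise mono samples as text in the (assumed) sox .dat layout."""
    lines = [f"; Sample Rate {sample_rate}", "; Channels 1"]
    for n, s in enumerate(samples):
        # One row per sample: time in seconds, then the sample value.
        lines.append(f"{n / sample_rate:.8f} {s:.8f}")
    return "\n".join(lines) + "\n"


text = to_sox_dat([0.0, 0.5, -0.5], 8000)
# Hypothetical usage: save `text` as fade.dat, then run `sox fade.dat fade.wav`.
```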
        
         | jwosty wrote:
         | It's true that it quantizes (aka bins) the samples, so it
         | isn't right for tests that need to be 100% sample-perfect, at
         | least vertically speaking. I suppose it is a compromise between
         | a few tradeoffs - easy readability just from looking at the
         | code itself (you could do images, but then there's a separate
         | file you have to keep track of, or you're looking at binary
         | data as a float[]) vs strict correctness. The evaluation of
         | these tradeoffs would definitely depend on what you're doing,
         | and in my case, most of the potential bugs are going to relate
         | to horizontal time resolution, not vertical sample depth
         | resolution.
         | 
         | If the precise values of these floats are important in your
         | domain (which it very well may be), a combination of approaches
         | would probably be good! Would love to hear how well this
         | approach works for you guys. Keep me updated :)
        
           | phab wrote:
           | I'm not sure it makes sense to separate "vertical"
           | correctness from "horizontal" correctness when it comes to
           | "did the feature behave" though; to extend the example in
           | TFA, if your fade progress went from 0->0.99 but then stopped
           | before it actually reached 1 for some reason, you might find
           | that you still had a (small, but still present) signal on the
           | output, which, if the peak-peak amplitude was < 0.1, the test
           | wouldn't catch.
           | 
           | Obviously any time you're working with floating-point sample
           | data the precise values of floats will almost always not be
           | bit-accurate against what your model predicts (sometimes even
           | if that model is a previous run of the same system with the
           | same inputs as in this case); it's about defining an
           | acceptable deviation. I guess what I'm saying is that for
           | audio software, a peak-peak error of 0.1 equates to a signal
           | at -20 dBFS (ref DBFS@1.0) (which of course is quite a large
           | amount of error for an audio signal), so perhaps using
           | higher-resolution graphs would be a good idea.
           | 
           | (Has anyone made a tool to diff sixels yet? /s)
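
The -20 dBFS figure can be checked with a short calculation. This sketch takes 0.1 as an amplitude relative to a full scale of 1.0, as the comment does; the helper name `dbfs` is illustrative.

```python
import math


def dbfs(amplitude, full_scale=1.0):
    """Level of a linear amplitude in decibels relative to full scale."""
    return 20.0 * math.log10(amplitude / full_scale)


error_level = dbfs(0.1)  # about -20 dBFS
step_level = dbfs(0.2)   # about -14 dBFS for a 0.2 step
```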
        
             | jwosty wrote:
             | Fair points here. Unfortunately adding more vertical
             | resolution starts to get a little unwieldy to navigate
             | through. Maybe it could start using different characters to
             | multiply the resolution to something sufficiently less
             | forgiving of errors. If it could choose between even 3
             | chars, for example, it would effectively squash 3 possible
             | values into one line, tripling the resolution.
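
One way the multi-character idea could look, as a sketch only: each row is split into three sub-levels and a different glyph is drawn for each. The glyph set `_-^` and the helper `cell` are assumptions, not part of the author's renderer.

```python
import math


def cell(sample, step=0.2, chars="_-^"):
    """Return (row, glyph) for a sample; rows count upward from zero."""
    sub_step = step / len(chars)
    level = math.floor(sample / sub_step)
    # divmod keeps negative samples consistent: row -1 still uses all glyphs.
    row, which = divmod(level, len(chars))
    return row, chars[which]
```

With a step of 0.2, samples of 0.0, 0.07 and 0.14 share row 0 but render as `_`, `-` and `^`, so a drift of one third of a step changes the text.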
        
       | necubi wrote:
       | This is such a great idea! I've really struggled with how to test
       | real-time audio code in the live looper I've been working on [0].
       | Most of my tests use either very small, hand-constructed arrays,
       | or arrays generated by some function.
       | 
       | This is both tedious and makes it very hard to debug test
       | failures (especially with cases like crossfades, pan laws, and
       | looping). I love the idea of having a visual representation that
       | lets me see what's going wrong in the test output, and I'm
       | definitely going to try to implement some similar tests.
       | 
       | I'm also curious what the state-of-the-art is for these sorts of
       | tests. Does anyone have insight into what e.g., ableton's test
       | suite looks like?
       | 
       | [0] http://github.com/mwylde/loopers
        
         | jwosty wrote:
         | > I'm also curious what the state-of-the-art is for these sorts
         | of tests. Does anyone have insight into what e.g., ableton's
         | test suite looks like?
         | 
         | I don't know, but if I were to make an educated guess, maybe
         | rendering stuff to actual audio files is a common approach?
         | That way when something goes wrong, they can inspect it in a
         | standard waveform editor?
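
A sketch of the "render to an actual audio file" guess, using only Python's standard `wave` and `struct` modules. It assumes mono float samples in [-1.0, 1.0] written as 16-bit PCM; `write_wav` is a hypothetical helper, not anything from Ableton or the article.

```python
import io
import struct
import wave


def write_wav(samples, sample_rate, target):
    """Write mono float samples to `target` (a path or file object) as 16-bit PCM."""
    frames = b"".join(
        # Clamp to [-1, 1], then scale to the signed 16-bit range.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(target, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(frames)


buf = io.BytesIO()
write_wav([0.0, 0.5, -0.5], 8000, buf)
```

Passing a file path instead of the in-memory buffer would give a file that opens in a waveform editor when a test fails.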
        
       | rbanffy wrote:
       | Am I the only one almost offended by Braille not being ASCII?
       | 
       | edit: Yes. I miscalculated the dot density.
       | 
       | /me slaps forehead
        
         | thewakalix wrote:
         | Aren't those asterisks?
        
           | rbanffy wrote:
           | Oh... The shame...
           | 
           | Yes. I miscalculated the dot density. :-(
        
       | rbanffy wrote:
       | If we go beyond ASCII, Unicode has long specified 2x2 mosaics
       | (they were present in DEC terminals) and 2x3 mosaics (from
       | Teletext and the TRS-80) since version 13. Some more enlightened
       | terminals (such as VTE) implement those symbols without the need
       | of font support.
       | 
       | Or you can use Braille to get 2x4 mosaics, but they usually look
       | terrible.
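
For the Braille option, a small sketch of packing a 2x4 block of on/off dots into one character. It relies on the Unicode Braille Patterns block starting at U+2800, with one bit per dot; the helper name `braille_cell` is illustrative.

```python
# Bit for each dot, indexed as DOT_BITS[row][col] in a 4-row, 2-column cell.
DOT_BITS = [
    [0x01, 0x08],
    [0x02, 0x10],
    [0x04, 0x20],
    [0x40, 0x80],
]


def braille_cell(grid):
    """Pack a 4x2 grid of truthy/falsy dots into one Braille character."""
    code = 0x2800  # blank Braille pattern
    for r in range(4):
        for c in range(2):
            if grid[r][c]:
                code |= DOT_BITS[r][c]
    return chr(code)
```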
        
         | jwosty wrote:
         | I just might have to try this next.
        
       | focom wrote:
       | Would love to use it as a library! Is it open source?
        
         | jwosty wrote:
         | I've added an fssnip for the ASCII renderer. It uses NAudio.
         | Should be pretty easy to use. http://www.fssnip.net/85g
        
         | jwosty wrote:
         | Not yet, but it certainly could be. Would it be useful to
         | publish the helper classes that render the waves out to ASCII?
         | That's really the guts of the thing. After that, you just use
         | whatever testing framework you want to do the actual diffing
         | (in my case Expecto for F#).
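
For readers who want a feel for what such a helper does, here is a minimal Python sketch. It is not the F# snippet linked in this thread: the function name, the `*` glyph and the rounding rule are all assumptions.

```python
def render_ascii(samples, step=0.5, lo=-1.0, hi=1.0):
    """Draw one '*' per sample; the top row is `hi`, the bottom row is `lo`."""
    rows = int(round((hi - lo) / step)) + 1
    grid = [[" "] * len(samples) for _ in range(rows)]
    for x, s in enumerate(samples):
        clamped = max(lo, min(hi, s))
        row = int(round((hi - clamped) / step))
        grid[row][x] = "*"
    # Trailing spaces are stripped so expected strings are easy to type.
    return "\n".join("".join(r).rstrip() for r in grid)


picture = render_ascii([0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0])
```

The resulting string can be compared against an expected multi-line string in any test framework, which then shows a plain text diff on failure.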
        
       | robotsteve2 wrote:
       | Once you've got the waveforms as arrays, what do you need the
       | ASCII rendering for?
       | 
       | Instead of diffing ASCII-rendered waveforms, save the arrays and
       | diff the arrays (and then use any kind of numerical metric on the
       | residual). Scientist programmers have all sorts of techniques for
       | testing and debugging software that processes sampled signals.
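
A small sketch of the residual approach, assuming two equal-length sample lists and a tolerance chosen by the test author (the `1e-3` here is arbitrary, and `residual_metrics` is a hypothetical helper):

```python
import math


def residual_metrics(expected, actual):
    """Return (peak absolute error, RMS error) of actual minus expected."""
    residual = [a - e for e, a in zip(expected, actual)]
    peak = max(abs(r) for r in residual)
    rms = math.sqrt(sum(r * r for r in residual) / len(residual))
    return peak, rms


# A fade that stops at 0.99 instead of reaching 1.0.
peak, rms = residual_metrics([0.0, 0.5, 1.0, 0.5], [0.0, 0.5, 0.99, 0.5])
passed = peak <= 1e-3
```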
        
       | user-the-name wrote:
       | Imagine if we had terminals that could handle graphical data. We
       | wouldn't have to do weird kludges like this, we could just plot
       | the waveforms in the output of our tools.
       | 
       | But it's 2021, and not only is this not possible, there is not
       | even a path forward to a world where this would be possible. It's
       | just not an option. Nobody is working on this, nobody is trying
       | to make this happen. We're just sitting here with our text
       | terminals, and we can't even for a second imagine that there
       | could be anything else.
       | 
       | It's sad, is what it is.
        
         | HPsquared wrote:
         | Notebook interfaces are basically that, e.g. Jupyter or
         | Mathematica.
        
         | voldacar wrote:
         | In TempleOS you can mix text, images, hyperlinks, and 3d models
         | in the terminal. This is true for the whole system: you could
         | literally have a spinning 3d model of a tank as a comment in a
         | source file. That's right, it took a literal schizophrenic to
         | make an OS with a feature that should have been standard
         | decades ago.
         | 
         | Nobody tries to make actually interesting new operating systems
         | anymore. OS research today is just "let's implement unix with
         | $security_feature", nobody is actually trying to make computers
         | more powerful or fun to use, or design a system based off of a
         | first-principles understanding of what a computer should be.
         | 
         | God I wish I was born in the lisp machine timeline
        
           | woodrowbarlow wrote:
           | the downside of rich terminal output is that media formats
           | become the system's responsibility. applications can't output
           | media in formats that aren't provided by the system, because
           | then the terminal wouldn't know how to display it and interop
           | with other applications (e.g. piping) wouldn't work either.
        
             | voldacar wrote:
             | You could let a program create an API for manipulating a
             | new type of data and inform the system about it so that
             | other programs could use it. This is more or less what
             | AmigaOS did; you installed a datatype for e.g. a PSD file,
             | then all your programs that worked with images could read
             | PSD files. I think it's a nice idea.
        
           | PeterisP wrote:
           | The features you describe belong to the app ecosystem, not to
           | the OS - IMHO the OS is about hardware and drivers, and what
           | kind of graphics is supported by your terminal and source
           | file editor is orthogonal to the OS and could be done in any
           | of the current OS'es; but that would require a
           | rewrite/redesign/reimagining of the whole standard
           | application package which seems a much larger project than
           | "merely" an OS.
        
             | voldacar wrote:
             | "There are more things in heaven and earth, Horatio, than
             | are dreamt of in your philosophy"
             | 
             | An OS facilitates communication between programs running on
             | a computer. Unix lets those programs communicate by sending
             | characters of text to each other. You could just as easily
             | imagine an OS that lets them communicate by sending images,
             | audio/video, 3d models, etc. An OS can be way more than
             | what you think it is. To detox your brain from this unix
             | worldview, spend some time in a VM and play around with
             | amigaOS or opengenera. Those were actual coherent OSes with
             | an actual view of what a computer should be and how it
             | should behave. Unix isn't.
             | 
             | > reimagining of the whole standard application package
             | which seems a much larger project than "merely" an OS.
             | 
             | By OS, I don't mean kernel. I mean the base set of software
             | that lets you interact with your computer and do
             | interesting stuff with it.
        
             | rbanffy wrote:
             | The line between app platform and OS is a blurry one. The
             | Amiga OS, for instance, has libraries for specific file
             | types that expose standardized entry points. This way, if
             | you install the library for Photoshop files, all graphics
             | programs that adhere to that protocol will be able to read
             | and write Photoshop PSD files. Microsoft had DDE and,
             | later, OLE, for embedding objects from one program into
             | data from another in a standard way all programs were
             | supposed to share. It was a pain.
             | 
             | This blurry line is present in other environments as well.
             | In the Apple Lisa, installing a program resulted in new
             | templates in the Stationery folder. In Smalltalk,
             | installing a program adds its class definitions to the
             | system as independent entities you could use in your own
             | programs.
             | 
             | Not all operating systems are the children of Unix and VMS.
        
               | bitwize wrote:
               | Smalltalk's components were so tightly interdependent
               | that their integration smoke test was 4+3, because
               | evaluating a simple addition expression exercised like
               | 3/4 of the entire system.
        
         | outworlder wrote:
         | Some terminals can.
         | 
         | https://iterm2.com/documentation-images.html
         | 
         | That's iterm's own implementation. There's also sixel, as
         | pointed out by another comment.
        
         | MayeulC wrote:
         | In truth, it's because text is quite easy to handle. It's easy
         | to make a program that handles text, too.
         | 
         | And so we have a lot of text editors, diff tools, efficient
         | compression, tools like sort and uniq: the whole unix
         | ecosystem.
         | 
         | So if you transform sound to text, you can then use text tools
         | to compare the output to catch differences. A simple
         | serialization of numerical sample values would have caught the
         | bug, but I agree that having a way of visualizing the output is
         | nice.
         | 
         | Command line input, programming, etc. is also still mostly done
         | with text, because it's easy to transform. Of course, you can
         | imagine working at a higher level with objects (like powershell
         | does IIRC), mimetypes, etc.
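
The "simple serialization" route can be sketched with Python's standard `difflib`. It assumes one fixed-precision number per line; three decimals is an arbitrary choice, and `serialise` is an illustrative name.

```python
import difflib


def serialise(samples):
    """One fixed-precision number per line, so ordinary text tools can diff it."""
    return [f"{s:.3f}" for s in samples]


expected = serialise([0.0, 0.5, 1.0])
actual = serialise([0.0, 0.45, 1.0])

# Any line-based diff tool works; difflib stands in for one here.
diff = list(difflib.unified_diff(expected, actual, "expected", "actual", lineterm=""))
```

The diff pinpoints the changed sample, though unlike a rendered waveform it gives no picture of the overall shape.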
        
         | thanatos519 wrote:
         | Maybe you would like to support https://ctx.graphics/
        
         | zokier wrote:
         | > Imagine if we had terminals that could handle graphical data.
         | 
         | We have. They are called "browsers". You might even be using
         | one right now!
        
         | charlesdaniels wrote:
         | I would point out that sixels[0] exist. There is a nice
         | library, libsixel[1] for working with it, which includes
         | bindings into many languages. If the author of sixel-tmux[2][3]
         | is to be believed[4], the relative lack of adoption is a result
         | of unwillingness on the part of maintainers of some popular
         | open source terminal libraries to implement sixel support.
         | 
         | I can't comment on that directly, but I will say, it's pretty
         | damn cool to see GnuPlot generating output right into one's
         | terminal. lsix[5] is also pretty handy as well.
         | 
         | But yeah, I agree, I'm not a fan of all the work that has gone
         | into "terminal graphics" that are based on unicode. It's a
         | dead-end, as was clear to DEC even back in '87 (and that's
         | setting aside that the VT220[6] had its own drawing
         | capabilities, though they were more limited). Maybe sixel isn't
         | the best possible way of handling this, but it does have the
         | benefit of 34 years of backwards-compatibility, and with the
         | right software, you can already use it _now_.
         | 
         | 0 - https://en.wikipedia.org/wiki/Sixel
         | 
         | 1 - https://saitoha.github.io/libsixel/
         | 
         | 2 - https://github.com/csdvrx/sixel-tmux
         | 
         | 3 - https://news.ycombinator.com/item?id=28756701
         | 
         | 4 - https://github.com/csdvrx/sixel-tmux/blob/main/RANTS.md
         | 
         | 5 - https://github.com/hackerb9/lsix
         | 
         | 6 - https://en.wikipedia.org/wiki/VT220
        
           | user-the-name wrote:
           | That's a protocol that's a good forty years old, and even
           | that is not supported. And I can see why, why on earth would
           | you want to be adding support for that in 2021? What a
           | ridiculous state of affairs.
        
           | jwosty wrote:
           | That's interesting. Do you think sixels could work for the
           | baseline tests? Would it be feasible to have them display
           | nicely in an IDE, like VS Code or Visual Studio?
        
           | MayeulC wrote:
           | I find kitty's graphics protocol to be a superior
           | implementation of the idea:
           | https://sw.kovidgoyal.net/kitty/graphics-protocol/
        
         | rbanffy wrote:
         | The venerable xterm and a lot of later physical terminals
         | (those things with CRTs) can emulate Tektronix (Tektronix, that
         | today makes instruments, also made computer terminals with
         | fancy storage CRTs that were kind of e-paper-like, but green -
         | and sometimes yellow - screen) graphics. iTerm2 and some
         | others, as pointed out, can do Sixel graphics (a format
         | designed originally for DEC dot-matrix printers that some DEC
         | terminals also implement).
        
           | user-the-name wrote:
           | I mean, yes, that is how sad the current state is.
        
             | rbanffy wrote:
             | VTE and, with it, almost every Linux distro, will get Sixel
             | support soon. I volunteered to add Tektronix graphics to it
             | too, but this is neither a dire need, nor something I have
             | done before, so it'll take some time.
        
               | user-the-name wrote:
               | It's forty years old. Why on earth would you be adding
               | that in 2021?
               | 
               | Why are we not focusing our energy on making something
               | that is actually up to date?
        
               | rbanffy wrote:
               | Because things that existed 40 years ago are useful,
               | already have software written for them, are compatible in
               | sometimes unforeseen ways (a DEC dot-matrix graph can be
               | printed as is on a Sixel-compatible terminal!) and have
               | been battle tested for ages.
               | 
               | There is a reason the Unix way of bytestream-based shell
               | and pipes is still useful and present these days to the
               | point that That Other OS is now embedding Linux in it.
               | 
               | Also, these ancient terminals often had some interesting
               | typography options that are encoded in the ANSI standard
               | that most modern terminals don't bother with (line attributes
               | that generate wider and taller cells are one such
               | example).
               | 
               | These formats may be more desirable than more modern and
               | complete ones such as PostScript for other reasons. I
               | wouldn't advise implementing a terminal capable of
               | rendering PostScript graphics because it's one more way
               | to infiltrate malware in your computer by rendering
               | untrusted inputs (There are a lot of RCE opportunities in
               | exploiting vulnerable decoders).
        
         | gwbas1c wrote:
         | > It's sad, is what it is.
         | 
         | With graphics being everywhere in 2021, I wouldn't call this
         | situation "sad," I'd think a lot more critically about _why._
         | 
         | To start with, fixed-width text is significantly easier to work
         | with than graphics.
         | 
         | Nothing's stopping anyone from writing a CI tool that outputs
         | to HTML with embedded images. The bigger question is why it's
         | uncommon.
        
       ___________________________________________________________________
       (page generated 2021-10-13 23:00 UTC)