[HN Gopher] Using ASCII waveforms to test real-time audio code ___________________________________________________________________ Using ASCII waveforms to test real-time audio code Author : jwosty Score : 70 points Date : 2021-10-13 18:30 UTC (4 hours ago) (HTM) web link (goq2q.net) (TXT) w3m dump (goq2q.net) | spicybright wrote: | Why would you use ascii for something like a waveform, something | that's inherently a graph? | | Sure, maybe you don't need that much resolution for what the use | case is. But it's the equivalent of looking at a graph and | squinting your eyes to blur it. | jwosty wrote: | In short, because text is much easier to deal with than | bitmaps, and there is much more tooling that "just works" for | text than actual graphics, like Expecto's textual diffing in | assertions. @MayeulC said it well: | https://news.ycombinator.com/item?id=28856884 | [deleted] | bitwize wrote: | That's so cool, and reminds me of how I used Gnuplot as a | makeshift oscilloscope to test and evaluate some (not real time) | software synthesis I was doing. | munchler wrote: | This is great. People are doing very cool things with F# these | days. | jwosty wrote: | Thanks, I like to think so! I didn't see other people doing | much audio programming in F#, so I figured someone would be | interested in seeing what it can look like. | brianberns wrote: | FWIW, you might like this: | https://github.com/brianberns/FYampaSynth | jwosty wrote: | That looks right up my alley, thanks for the link! | phab wrote: | This approach is neat for observability, but it's worth noticing | that it essentially quantises all of your samples down to the | vertical resolution of your graph. If you somehow introduced a | bug that caused an error that was smaller than the step size then | these tests wouldn't catch it. | | (e.g. if you somehow managed to introduce a constant DC-offset of | +0.05, with the shown step size of 0.2, these tests would | probably never pick it up, modulo rounding.)
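A minimal sketch of the quantisation effect described in this comment, assuming each sample is binned by rounding to the nearest multiple of the 0.2 step size. The `to_rows` helper and the sample values are hypothetical and are not taken from the article:

```python
STEP = 0.2  # vertical step size quoted in the comment above


def to_rows(samples, step=STEP):
    """Map each sample to the index of the nearest graph row (assumed rounding rule)."""
    return [round(s / step) for s in samples]


clean = [0.0, 0.2, 0.4, 0.2, 0.0, -0.2, -0.4, -0.2]
offset = [s + 0.05 for s in clean]  # constant DC offset, smaller than one step

rows_clean = to_rows(clean)
rows_offset = to_rows(offset)

# Both signals land on the same rows, so a text diff of the two
# renderings would report no difference.
same_rendering = rows_clean == rows_offset
print(same_rendering)
```

Samples that already sit close to a row boundary would flip to the neighbouring row under the same offset, which is the "modulo rounding" caveat.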
| | That said, these tests are great for asserting that specific | functionality does broadly what it says on the tin, and making it | easy to understand why not if they fail. We'll likely start using | this technique at Fourier Audio (shameless plug) as a more | observable functionality smoke test to augment finer-grained | analytic tests that assert properties of the output waveform | samples directly. | PaulDavisThe1st wrote: | A more accurate and only slightly more complex process for this | is to generate numerical text representations of the desired | test waveforms and then feed them through sox to get actual | wave files. The numerical text representations are likely even | easier to generate programmatically than the ascii->audio | transformation. | jwosty wrote: | It's true that it quantizes (aka bins) the samples, so it | isn't right for tests that need to be 100% sample-perfect, at | least vertically speaking. I suppose it is a compromise between | a few tradeoffs - easy readability just from looking at the | code itself (you could do images, but then there's a separate | file you have to keep track of, or you're looking at binary | data as a float[]) vs strict correctness. The evaluation of | these tradeoffs would definitely depend on what you're doing, | and in my case, most of the potential bugs are going to relate | to horizontal time resolution, not vertical sample depth | resolution. | | If the precise values of these floats are important in your | domain (which they very well may be), a combination of approaches | would probably be good! Would love to hear how well this | approach works for you guys.
Keep me updated :) | phab wrote: | I'm not sure it makes sense to separate "vertical" | correctness from "horizontal" correctness when it comes to | "did the feature behave" though; to extend the example in | TFA, if your fade progress went from 0->0.99 but then stopped | before it actually reached 1 for some reason, you might find | that you still had a (small, but still present) signal on the | output, which, if the peak-peak amplitude was < 0.1, the test | wouldn't catch. | | Obviously any time you're working with floating-point sample | data the precise values of floats will almost always not be | bit-accurate against what your model predicts (sometimes even | if that model is a previous run of the same system with the | same inputs as in this case); it's about defining an | acceptable deviation. I guess what I'm saying is that for | audio software, a peak-peak error of 0.1 equates to a signal | at -20 dBFS (ref DBFS@1.0) (which of course is quite a large | amount of error for an audio signal), so perhaps using | higher-resolution graphs would be a good idea. | | (Has anyone made a tool to diff sixels yet? /s) | jwosty wrote: | Fair points here. Unfortunately adding more vertical | resolution starts to get a little unwieldy to navigate | through. Maybe it could start using different characters to | multiply the resolution to something sufficiently less | forgiving of errors. If it could choose between even 3 | chars, for example, it would effectively squash 3 possible | values into one line, tripling the resolution. | necubi wrote: | This is such a great idea! I've really struggled with how to test | real-time audio code in the live looper I've been working on [0]. | Most of my tests use either very small, hand-constructed arrays, | or arrays generated by some function. | | This is both tedious and makes it very hard to debug test | failures (especially with cases like crossfades, pan laws, and | looping). 
I love the idea of having a visual representation that | lets me see what's going wrong in the test output, and I'm | definitely going to try to implement some similar tests. | | I'm also curious what the state-of-the-art is for these sorts of | tests. Does anyone have insight into what e.g., ableton's test | suite looks like? | | [0] http://github.com/mwylde/loopers | jwosty wrote: | > I'm also curious what the state-of-the-art is for these sorts | of tests. Does anyone have insight into what e.g., ableton's | test suite looks like? | | I don't know, but if I were to make an educated guess, maybe | rendering stuff to actual audio files is a common approach? | That way when something goes wrong, they can inspect it in a | standard waveform editor? | rbanffy wrote: | Am I the only one almost offended by Braille not being ASCII? | | edit: Yes. I miscalculated the dot density. | | /me slaps forehead | thewakalix wrote: | Aren't those asterisks? | rbanffy wrote: | Oh... The shame... | | Yes. I miscalculated the dot density. :-( | rbanffy wrote: | If we go beyond ASCII, Unicode specifies 2x2 mosaics since ever | (they were present in DEC terminals) and 2x3 mosaics (from | Teletext and the TRS-80) since version 13. Some more enlightened | terminals (such as VTE) implement those symbols without the need | of font support. | | Or you can use Braille to get 2x4 mosaics, but they usually look | terrible. | jwosty wrote: | I just might have to try this next. | focom wrote: | Would love to use it as a library! Is it open source? | jwosty wrote: | I've added an fssnip for the ASCII renderer. It uses NAudio. | Should be pretty easy to use. http://www.fssnip.net/85g | jwosty wrote: | Not yet, but it certainly could be. Would it be useful to | publish the helper classes that render the waves out to ASCII? | That's really the guts of the thing. After that, you just use | whatever testing framework you want to do the actual diffing | (in my case Expecto for F#).
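For readers who want to try the idea outside F#, here is a rough sketch of an ASCII renderer. It is not the author's implementation (that one is the fssnip linked above); `render_ascii`, the default row count, and the rounding rule are assumptions made purely for illustration:

```python
def render_ascii(samples, rows=5, lo=-1.0, hi=1.0):
    """Render samples as text: one column per sample, top row = highest value.

    Hypothetical helper: each sample is clamped to [lo, hi] and snapped to
    the nearest of `rows` evenly spaced levels.
    """
    step = (hi - lo) / (rows - 1)
    grid = [[" "] * len(samples) for _ in range(rows)]
    for col, s in enumerate(samples):
        clamped = min(max(s, lo), hi)
        level = round((clamped - lo) / step)  # 0 = bottom row
        grid[rows - 1 - level][col] = "*"
    # Strip trailing spaces so the expected text is easy to type in a test.
    return "\n".join("".join(line).rstrip() for line in grid)


# The expected waveform is written as plain text, so a failing test can
# show an ordinary textual diff between `expected` and `actual`.
expected = "\n".join([
    "  *",
    " * *",
    "*   *",
    "     * *",
    "      *",
])
actual = render_ascii([0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5])
assert actual == expected
```

The comparison itself is left to whatever test framework is in use; the only job of the helper is to turn a sample buffer into a string that is stable enough to diff.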
| robotsteve2 wrote: | Once you've got the waveforms as arrays, what do you need the | ASCII rendering for? | | Instead of diffing ASCII-rendered waveforms, save the arrays and | diff the arrays (and then use any kind of numerical metric on the | residual). Scientist programmers have all sorts of techniques for | testing and debugging software that processes sampled signals. | user-the-name wrote: | Imagine if we had terminals that could handle graphical data. We | wouldn't have to do weird kludges like this, we could just plot | the waveforms in the output of our tools. | | But it's 2021, and not only is this not possible, there is not | even a path forward to a world where this would be possible. It's | just not an option. Nobody is working on this, nobody is trying | to make this happen. We're just sitting here with our text | terminals, and we can't even for a second imagine that there | could be anything else. | | It's sad, is what it is. | HPsquared wrote: | Notebook interfaces are basically that, e.g. Jupyter or | Mathematica. | voldacar wrote: | In TempleOS you can mix text, images, hyperlinks, and 3d models | in the terminal. This is true for the whole system: you could | literally have a spinning 3d model of a tank as a comment in a | source file. That's right, it took a literal schizophrenic to | make an OS with a feature that should have been standard | decades ago. | | Nobody tries to make actually interesting new operating systems | anymore. OS research today is just "let's implement unix with | $security_feature", nobody is actually trying to make computers | more powerful or fun to use, or design a system based off of a | first-principles understanding of what a computer should be. | | God I wish I was born in the lisp machine timeline | woodrowbarlow wrote: | the downside of rich terminal output is that media formats | become the system's responsibility. 
applications can't output | media in formats that aren't provided by the system, because | then the terminal wouldn't know how to display it and interop | with other applications (e.g. piping) wouldn't work either. | voldacar wrote: | You could let a program create an API for manipulating a | new type of data and inform the system about it so that | other programs could use it. This is more or less what | AmigaOS did; you installed a datatype for e.g. a PSD file, | then all your programs that worked with images could read | PSD files. I think it's a nice idea. | PeterisP wrote: | The features you describe belong to the app ecosystem, not to | the OS - IMHO the OS is about hardware and drivers, and what | kind of graphics is supported by your terminal and source | file editor is orthogonal to the OS and could be done in any | of the current OS'es; but that would require a | rewrite/redesign/reimagining of the whole standard | application package which seems a much larger project than | "merely" an OS. | voldacar wrote: | "There are more things in heaven and earth, Horatio, than | are dreamt of in your philosophy" | | An OS facilitates communication between programs running on | a computer. Unix lets those programs communicate by sending | characters of text to each other. You could just as easily | imagine an OS that lets them communicate by sending images, | audio/video, 3d models, etc. An OS can be way more than | what you think it is. To detox your brain from this unix | worldview, spend some time in a VM and play around with | amigaOS or opengenera. Those were actual coherent OSes with | an actual view of what a computer should be and how it | should behave. Unix isn't. | | > reimagining of the whole standard application package | which seems a much larger project than "merely" an OS. | | By OS, I don't mean kernel. I mean the base set of software | that lets you interact with your computer and do | interesting stuff with it. 
| rbanffy wrote: | The line between app platform and OS is a blurry one. The | Amiga OS, for instance, has libraries for specific file | types that expose standardized entry points. This way, if | you install the library for Photoshop files, all graphics | programs that adhere to that protocol will be able to read | and write Photoshop PSD files. Microsoft had DDE and, | later, OLE, for embedding objects from one program into | data from another in a standard way all programs were | supposed to share. It was a pain. | | This blurry line is present in other environments as well. | In the Apple Lisa, installing a program resulted in new | templates in the Stationery folder. In Smalltalk, | installing a program adds its class definitions to the | system as independent entities you could use in your own | programs. | | Not all operating systems are the children of Unix and VMS. | bitwize wrote: | Smalltalk's components were so tightly interdependent | that their integration smoke test was 4+3, because | evaluating a simple addition expression exercised like | 3/4 of the entire system. | outworlder wrote: | Some terminals can. | | https://iterm2.com/documentation-images.html | | That's iterm's own implementation. There's also sixel, as | pointed out by another comment. | MayeulC wrote: | In truth, it's because text is quite easy to handle. It's easy | to make a program that handles text, too. | | And so we have a lot of text editors, diff tools, efficient | compression, tools like sort and uniq: the whole unix | ecosystem. | | So if you transform sound to text, you can then use text tools | to compare the output to catch differences. A simple | serialization of numerical sample values would have caught the | bug, but I agree that having a way of visualizing the output is | nice. | | Command line input, programming, etc. is also still mostly done | with text, because it's easy to transform. 
Of course, you can | imagine working at a higher level with objects (like powershell | does IIRC), mimetypes, etc. | thanatos519 wrote: | Maybe you would like to support https://ctx.graphics/ | zokier wrote: | > Imagine if we had terminals that could handle graphical data. | | We have. They are called "browsers". You might be even using | one right now! | charlesdaniels wrote: | I would point out that sixels[0] exist. There is a nice | library, libsixel[1] for working with it, which includes | bindings into many languages. If the author of sixel-tmux[2][3] | is to be believed[4], the relative lack of adoption is a result | of unwillingness on the part of maintainers of some popular | open source terminal libraries to implement sixel support. | | I can't comment on that directly, but I will say, it's pretty | damn cool to see GnuPlot generating output right into one's | terminal. lsix[5] is also pretty handy as well. | | But yeah, I agree, I'm not a fan of all the work that has gone | into "terminal graphics" that are based on unicode. It's a | dead-end, as was clear to DEC even back in '87 (and that's | setting aside that the VT220[6] had its own drawing | capabilities, though they were more limited). Maybe sixel isn't | the best possible way of handling this, but it does have the | benefit of 34 years of backwards-compatibility, and with the | right software, you can already use it _now_. | | 0 - https://en.wikipedia.org/wiki/Sixel | | 1 - https://saitoha.github.io/libsixel/ | | 2 - https://github.com/csdvrx/sixel-tmux | | 3 - https://news.ycombinator.com/item?id=28756701 | | 4 - https://github.com/csdvrx/sixel-tmux/blob/main/RANTS.md | | 5 - https://github.com/hackerb9/lsix | | 6 - https://en.wikipedia.org/wiki/VT220 | user-the-name wrote: | That's a protocol that's a good forty years old, and even | that is not supported. And I can see why, why on earth would | you want to be adding support for that in 2021? What a | ridiculous state of affairs.
| jwosty wrote: | That's interesting. Do you think sixels could work for the | baseline tests? Would it be feasible to have them display | nicely in an IDE, like VS Code or Visual Studio? | MayeulC wrote: | I find kitty's graphics protocol to be a superior | implementation of the idea: | https://sw.kovidgoyal.net/kitty/graphics-protocol/ | rbanffy wrote: | The venerable xterm and a lot of later physical terminals | (those things with CRTs) can emulate Tektronix (Tektronix, that | today makes instruments, also made computer terminals with | fancy storage CRTs that were kind of e-paper-like, but green - | and sometimes yellow - screen) graphics. iTerm2 and some | others, as pointed out, can do Sixel graphics (a format | designed originally for DEC dot-matrix printers that some DEC | terminals also implement). | user-the-name wrote: | I mean, yes, that is how sad the current state is. | rbanffy wrote: | VTE and, with it, almost every Linux distro, will get Sixel | support soon. I volunteered to add Tektronix graphics to it | too, but this is neither a dire need, nor something I have | done before, so it'll take some time. | user-the-name wrote: | It's forty years old. Why on earth would you be adding | that in 2021? | | Why are we not focusing our energy on making something | that is actually up to date? | rbanffy wrote: | Because things that existed 40 years ago are useful, | already have software written for it, are compatible in | sometimes unforeseen ways (a DEC dot-matrix graph can be | printed as is on a Sixel-compatible terminal!) and have | been battle tested for ages. | | There is a reason the Unix way of bytestream-based shell | and pipes is still useful and present these days to the | point that That Other OS is now embedding Linux in it. 
| | Also, these ancient terminals often had some interesting | typography options that are encoded in the ANSI standard | that most modern terminals don't bother (line attributes | that generate wider and taller cells are one such | example). | | These formats may be more desirable than more modern and | complete ones such as PostScript for other reasons. I | wouldn't advise implementing a terminal capable of | rendering PostScript graphics because it's one more way | to infiltrate malware in your computer by rendering | untrusted inputs (There are a lot of RCE opportunities in | exploiting vulnerable decoders). | gwbas1c wrote: | > It's sad, is what it is. | | With graphics being everywhere in 2021, I wouldn't call this | situation "sad," I'd think a lot more critically about _why._ | | To start with, fixed-width text is significantly easier to work | with than graphics. | | Nothing's stopping anyone from writing a CI tool that outputs | to HTML with embedded images. The bigger question is why it's | uncommon. ___________________________________________________________________ (page generated 2021-10-13 23:00 UTC)