Design and main ideas

Notes on the design of the saugns software and the SAU language, the main ideas involved, and how both have evolved. (The SAU language has evolved in parallel with the software implementing it.)

Contents

    Roots: Early 2011 design
        Timing when running and generating
        Limitations of the early design
    Trunk: Experimentation and 2012 ideas
        Growing the early design
        Odds and ends from clean-up work

Roots: Early 2011 design

The program was developed from scratch, the language starting out as the most straightforward way I could get a trivial program to generate sounds and handle timing flexibly according to a script.

If a single capital letter (W) is used to start a wave oscillator, and a small letter followed by a number assigns a value for frequency (f), amplitude (a), or time duration (t), then parsing is trivial. No abstractions for lexing or syntax trees, etc., are necessary. For each W the parser can simply add an oscillator node to a list, and go on to set values in the current node according to any parameter assignments which follow while that oscillator is still being dealt with. Additional syntax can tell the parser that the next node should be given a time offset value, so that it's not interpreted as parallel with the previous one. After parsing is done, the resulting list of nodes specifies what to run or do to generate audio.
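
As a rough illustration (not the actual early code; the names and details here are hypothetical), such a parser can be little more than a loop over characters. A made-up script line like "W f440 a0.5 t1" would then, per the above, add one oscillator node and assign its three parameters:

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical node type; the real early program's fields differed. */
    typedef struct Node {
        double freq, amp, time; /* f, a, t parameter values */
        double delay;           /* time offset before this node */
        struct Node *next;
    } Node;

    /* Parse a script: 'W' adds a node; 'f', 'a', 't' set values in it. */
    static Node *parse(FILE *f) {
        Node *first = NULL, *last = NULL;
        int c;
        while ((c = getc(f)) != EOF) {
            double *dst = NULL;
            switch (c) {
            case 'W': {
                Node *n = calloc(1, sizeof *n);
                if (last) last->next = n; else first = n;
                last = n;
                continue;
            }
            case 'f': dst = last ? &last->freq : NULL; break;
            case 'a': dst = last ? &last->amp  : NULL; break;
            case 't': dst = last ? &last->time : NULL; break;
            default: continue; /* skip whitespace and unknowns */
            }
            if (dst && fscanf(f, "%lf", dst) != 1)
                break; /* malformed number; stop simply */
        }
        return first;
    }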

The list of nodes produced can be viewed as a sequential list of "steps" to take, "events" to handle, or "instructions" to follow, for setting up or changing audio generation – where a sequence of nodes with no time offsets between them, or with overlapping time durations specified inside them for what to run, configures things to run in parallel. To run it all, the list can be examined once to figure out what data structures need to be allocated and prepared, and a second time to actually use those for running the simplistic "program" the script has been translated into.
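
A minimal sketch of that two-pass idea, with a node type like the one above (again hypothetical):

    #include <stdlib.h>

    typedef struct Node { /* as in the parsing sketch */
        double freq, amp, time, delay;
        struct Node *next;
    } Node;

    typedef struct OscState { double freq, amp, time_left; } OscState;

    /* Pass 1: count steps so run-time state can be allocated up front. */
    static size_t count_nodes(const Node *n) {
        size_t count = 0;
        for (; n; n = n->next) ++count;
        return count;
    }

    /* Pass 2: walk the list again, preparing state for each step;
     * the actual sample generation would then consume these states. */
    static OscState *prepare(const Node *n) {
        OscState *states = calloc(count_nodes(n), sizeof *states);
        for (size_t i = 0; n; n = n->next, ++i) {
            states[i].freq = n->freq;
            states[i].amp = n->amp;
            states[i].time_left = n->time;
        }
        return states;
    }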

To allow implementing modulation techniques, support for nested lists was then added to the parser and the audio generation code – lists of oscillators assigned in connection with the parameters for frequency, phase, amplitude, etc. of an oscillator – naturally parsed with recursion. This was a fairly simple extension of the language, with time proceeding along a linear list like before, though the data used in connection with each node and time position may thereafter branch out.
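
The recursive shape can be sketched as follows; the bracket delimiters and the single "mods" list here are hypothetical simplifications:

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Node {
        double freq, amp, time;
        struct Node *next; /* next oscillator in the same list */
        struct Node *mods; /* nested list: modulators for a parameter */
    } Node;

    /* Parse a (possibly nested) list of oscillators; the recursion
     * mirrors the nesting. '[' and ']' as delimiters are made up. */
    static Node *parse_list(FILE *f) {
        Node *first = NULL, *last = NULL;
        int c;
        while ((c = getc(f)) != EOF && c != ']') {
            switch (c) {
            case 'W': {
                Node *n = calloc(1, sizeof *n);
                if (last) last->next = n; else first = n;
                last = n;
                break;
            }
            case '[': /* nested list attached to current oscillator */
                if (last) last->mods = parse_list(f);
                break;
            case 'f': if (last) fscanf(f, "%lf", &last->freq); break;
            case 'a': if (last) fscanf(f, "%lf", &last->amp);  break;
            case 't': if (last) fscanf(f, "%lf", &last->time); break;
            default: break; /* skip whitespace and unknowns */
            }
        }
        return first;
    }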

But complexity, and difficulty in keeping it all working, then seemed to increase exponentially as more features were added to the early language – even as the code size grew much more modestly than that. The design was far too simple to include most of the classic main structure of compilers and interpreters, and I didn't have the experience to scale up my own design well. The language also looked quite different from typical well-known and well-described paradigms.

Yet before complexity turns into a problem, it's very doable to support, in a small and tidy way, things like parallel audio generation (several "voices"), combined with a sequential "time separator" and a more flexible "insert time delay for what's placed after this" modifier. That kind of terse and simple language, which a fairly trivial program can run, could be ideal for growing into a flexible tone generator language. But there's a huge chasm between that and a powerful musical language.

Timing when running and generating

The basic design for how time works is very simple. Time for a script begins at 0 ms. A script is translated into a list of timed update instruction nodes, or "steps", each new step taking place after the previous, with or without time (samples to generate) passing between any two steps. Each step configures the system, e.g. telling it to start generating something or changing parameter values for some generator object.
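
In sketch form, a step might look like the following; the exact fields are hypothetical, but the wait before a step is relative to the previous one as described, and a time in ms converts to samples as ms * srate / 1000:

    #include <stdint.h>

    /* Hypothetical update step: how long to wait after the previous
     * step, and which parameter values to set on which object. */
    typedef struct Step {
        uint32_t wait_samples; /* samples to generate before this step */
        uint32_t obj_id;       /* which generator object to configure */
        uint32_t set_params;   /* flags: which of the values below apply */
        double freq, amp;
        uint32_t time_samples; /* play duration being set, in samples */
    } Step;

    /* Convert a time in ms to samples, given the sample rate. */
    static inline uint32_t ms_to_samples(uint32_t ms, uint32_t srate) {
        return (uint32_t)(((uint64_t)ms * srate) / 1000);
    }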

The running of a script primarily advances through time, and secondarily through the timed update steps, which are laid out in a list like a timeline of events. After parsing, times are translated into numbers of samples to generate, or which should be generated before a time position is reached. Time proceeds as output samples are written, while update events merely come with time and do not advance it. The handling of such updates takes priority over output generation, pausing it until the updates at that time position have been handled.
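
A sketch of such a run loop, continuing with the hypothetical Step type (the three functions are stand-ins for the real machinery):

    #include <stddef.h>
    #include <stdint.h>

    typedef struct Step Step; /* as in the previous sketch */

    /* Hypothetical stand-ins for the audio generation machinery. */
    void gen_samples(uint32_t n);          /* write n samples of output */
    void handle_step(const Step *s);       /* apply one update step */
    uint32_t remaining_play_samples(void); /* time left for active things */

    /* Alternate between letting time pass (writing samples) and
     * handling update steps; updates due at a time position are
     * handled before any further output is generated. */
    void run(const Step *steps, size_t n_steps) {
        for (size_t i = 0; i < n_steps; ++i) {
            gen_samples(steps[i].wait_samples); /* wait may be zero */
            handle_step(&steps[i]);
        }
        gen_samples(remaining_play_samples()); /* play out what remains */
    }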

Each thing which generates output, such as a wave oscillator, has a time duration of use. Thus, its output will begin at one time position and end at another. The script has ended when no things remain in use, and no further update steps remain to be waited for (in sound or in quiet). In other words, the duration of a script is equal to the sum of the times to wait before each new update step, plus the remaining duration of play, after the last update step, for still-active "things" (e.g. oscillators).
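
That total can be computed directly from the step list before running anything. A sketch, simplistically assuming each step carries the play duration it sets:

    #include <stddef.h>
    #include <stdint.h>

    typedef struct Step { /* minimal fields from the earlier sketch */
        uint32_t wait_samples;
        uint32_t time_samples; /* play duration set by this step */
    } Step;

    /* Script duration in samples: the waits before each step summed
     * up, plus however long any thing plays past the last step. */
    uint64_t script_duration(const Step *steps, size_t n_steps) {
        uint64_t now = 0, end = 0;
        for (size_t i = 0; i < n_steps; ++i) {
            now += steps[i].wait_samples;
            /* play started or extended here may move the end out */
            uint64_t this_end = now + steps[i].time_samples;
            if (this_end > end) end = this_end;
        }
        return end > now ? end : now;
    }

For example, waits of 0 + 500 + 500 ms with 1000 ms of play set by the last step give a 2000 ms script.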

Limitations of the early design

Some functionality cannot be reached without complicating the core design and beginning to move beyond it. For example, trying to support syntax for more flexible timing arrangements in a script than globally, flatly time-ordered steps led to moving out and growing some data processing between the parsing and the interpreting/audio generation main parts of the program, so that time may branch during parsing while timed nodes are still arranged into a flat timeline of steps before interpreting. That's a main example of something with a solution arrived at early on. (However, the early program had quite a few timing bugs.)
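
The general flattening idea can be sketched simply (this is not the actual pass): give nodes absolute times during parsing, however branched their creation order, then arrange them into one time-ordered list afterwards. A stable sort keeps same-time nodes in script order:

    #include <stddef.h>

    typedef struct TimedNode {
        unsigned long time; /* absolute time assigned during parsing */
        struct TimedNode *next;
    } TimedNode;

    /* Insertion sort into a flat timeline; stable, so nodes sharing
     * a time position keep their script order. Fine for modest list
     * sizes, as in a sketch like this. */
    TimedNode *flatten_timeline(TimedNode *list) {
        TimedNode *sorted = NULL, **at;
        while (list) {
            TimedNode *n = list;
            list = list->next;
            for (at = &sorted; *at && (*at)->time <= n->time;
                 at = &(*at)->next) ;
            n->next = *at;
            *at = n;
        }
        return sorted;
    }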

Another limitation not dealt with early on is the nature of nested lists, i.e. tree structures, as the form of a script. Early on, the capabilities of old FM synthesizer systems had been an inspiration, but it was seen that they support connecting oscillators in arrangements other than the tree structures of carriers and modulators provided for by nested lists; e.g. several carriers may share a modulator, and in general the oscillator connection schema is a DAG (directed acyclic graph) in Yamaha's old FM "algorithms". (Technically, self-modulation could however be viewed as adding self-loops to an otherwise acyclic graph. Possibilities for going beyond acyclic graphs, by supporting feedback loops more generally, also exist and are used in some synthesizer systems.)
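
In data-structure terms the difference is small: with nodes pointing to their modulators, a tree never shares a pointee while a DAG may, and a self-loop is just a node pointing to itself. A sketch (hypothetical types):

    #include <stddef.h>

    typedef struct Osc {
        double freq;
        struct Osc *fmod; /* single frequency modulator, for brevity */
    } Osc;

    int main(void) {
        Osc mod  = { 220.0, NULL };
        Osc car1 = { 440.0, &mod }; /* two carriers sharing one */
        Osc car2 = { 660.0, &mod }; /* modulator: a DAG, not a tree */
        Osc self = { 110.0, NULL };
        self.fmod = &self;          /* self-modulation as a self-loop */
        (void)car1; (void)car2; (void)self;
        return 0;
    }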

However, among ideas not implemented in the early program, it is mostly not audio generation features which suggest great departures in design. Mostly, it is programming language ideas, like defining and using functions for creating sounds, or other ways of turning audio definitions entered in a script into a "template" of sorts for further audio in that script, and other ideas increasing language expressiveness.

Relative to the early language, some kinds of extensions to it would mainly require reworking and complicating the design closer to the parsing end of the program, while others would mainly require reworking the other end, which ultimately runs what is produced. (When considering creating a more powerful interpreter, it's also worth noting that some big, basic limitations on features are necessary for e.g. time durations of scripts to be pre-calculable. A Turing-complete language would not allow it.)

Trunk: Experimentation and 2012 ideas

Experimenting further in 2011 and beyond, and also researching potentially useful ideas for programming languages and compilers in 2012, led to a series of old notes, mainly containing a list of thoughts on a possible new language, and ideas for possible design elaborations. I discounted my own early language as a starting point for something better after learning more theory, in part because basic standard concepts are usually connected to different-looking syntaxes and I couldn't see how what I'd already come up with might correspond to those concepts. For years I put it all aside; then, roughly a decade after making those old notes, I bridged that basic gap in thought to a large extent, but still haven't worked out much more in practice.

Various good little ideas made it into the program design in the year between April 2011 and April 2012 (when the old work ended), but in a rough and sometimes buggy form, requiring later clean-up. First steps were also taken towards various things which were then left unfinished.

Ideas from 2012, and later, remain to be explored more thoroughly for extending and reworking features.

Growing the early design

The April 2012 program had grown a curious approach to parsing, producing a combination of flat list structures (for time-ordering) and a simplified tree structure (for nested syntax elements). This corresponds, in a fairly basic way, to the nature of the language: time-ordering is one dimension of structure, and nesting, as in nested elements such as lists, is another. (Nested lists of oscillator objects are the main and most obvious type of nesting.) As for the program structure, a middle layer between the parsing end and the interpreting and audio generation end of the program had come along by 2012, initially from moving semantic handling out of the parser after it became overgrown. This middle layer developed to track timing, and to count and allocate voices for audio generation prior to running it.
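
The voice-counting part of such a middle pass can be sketched like this (hypothetical types): walk the events for top-level generators, tracking how many overlap in time; the maximum is the number of voices to allocate:

    #include <stddef.h>

    typedef struct Ev { /* hypothetical: one top-level generator's use */
        unsigned long start, end; /* in samples; end = start + duration */
    } Ev;

    /* Maximum number of simultaneously playing top-level generators,
     * i.e. voices to allocate. Assumes events sorted by start time;
     * quadratic, but simple enough for a sketch. */
    size_t count_voices(const Ev *evs, size_t n) {
        size_t max_active = 0;
        for (size_t i = 0; i < n; ++i) {
            size_t active = 1; /* this event itself */
            for (size_t j = 0; j < i; ++j)
                if (evs[j].end > evs[i].start) ++active;
            if (active > max_active) max_active = active;
        }
        return max_active;
    }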

In the way time placement and nesting work together in the language, one main node from the parser, describing something placed at a time (in a list of such timed nodes), may provide a tree of attached data, and another main node for a later time may provide an update for part of that data, including a sub-tree which is to become connected to the data tree (say, a modulator added after a time delay to an oscillator which in turn was, and remains, used as a modulator for a carrier). Such a main node with an update could also include data which clears links, i.e. removes branches, from an object that is part of a tree (say, modulators of a given type removed from an oscillator).
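
Applying such an update might be sketched as follows, with hypothetical types covering both cases named above (connecting a sub-tree, and clearing links):

    #include <stddef.h>

    typedef struct Osc {
        double freq, amp;
        struct Osc *fmods; /* list of frequency modulators */
        struct Osc *next;  /* next in whatever list this node is in */
    } Osc;

    /* Hypothetical update record: modulator changes for one object,
     * applied when its time position is reached. */
    typedef struct Update {
        Osc *target;
        Osc *add_fmods;  /* sub-tree to connect, or NULL */
        int clear_fmods; /* if set, drop the old modulator list first */
    } Update;

    void apply_update(const Update *u) {
        Osc *o = u->target;
        if (u->clear_fmods)
            o->fmods = NULL; /* removes those branches from the tree
                                (a real program would recycle them) */
        if (u->add_fmods) {
            /* append the new sub-tree to the modulator list */
            Osc **at = &o->fmods;
            while (*at) at = &(*at)->next;
            *at = u->add_fmods;
        }
    }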

Another type of nesting which applies to time, originally only for a syntax extension in the early design, places a series of those main nodes in a side-list attached to a main node; after parsing, such side-lists are merged into a single time-ordered list of the main nodes. This allows time placement to branch out several times in a script and during parsing, while the data to later be interpreted or "run" remains a flat update-after-update list. This is used for the old "compound steps" feature, and for a related "subshift" feature added in early 2022 to replace an earlier oscillator parameter (s) for padding time durations with leading silent time.
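
Such a merge might be sketched as follows (hypothetical types): a side-list carries times relative to its main node, and its nodes are spliced into the flat timeline by absolute time, the outer timing never having been advanced:

    #include <stddef.h>

    typedef struct MNode {
        unsigned long time; /* absolute time, once resolved */
        struct MNode *next; /* next node in the flat timeline */
        struct MNode *side; /* side-list: compound follow-up steps,
                               with times relative to this node */
    } MNode;

    /* Splice one node's side-list into the timeline after parsing.
     * Times in the side-list are made absolute, then each node is
     * inserted in time order, stably, starting from the main node. */
    void merge_side_list(MNode *main_node) {
        MNode *s = main_node->side;
        main_node->side = NULL;
        while (s) {
            MNode *n = s;
            s = s->next;
            n->time += main_node->time; /* relative -> absolute */
            MNode *at = main_node;
            while (at->next && at->next->time <= n->time)
                at = at->next;
            n->next = at->next;
            at->next = n;
        }
    }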

The basic idea of the compound steps feature is to allow a series of timed steps in a script which update the same object (an oscillator, say) to be written together as a kind of unit, without advancing the timing for later-placed steps for other objects. That way, timed updates can be grouped per object, rather than rigidly according to the flow of time. (The 2012 program had several types of bugs in that feature, which could however all be fixed without greatly complicating anything.) Once that functionality is there, it can also be used for different-looking things of an essentially similar kind.

Odds and ends from clean-up work

Some general ideas have evolved since reviving the project in November 2017, concerning making and keeping it cleaner and more modular. Small-scale experiments in how to (re)structure or otherwise improve little pieces of the whole have taken a large part of the focus as of November 2021, in a slow-going pedantic clean-up phase for the project. (It's not pragmatic at all, but it's been a good-enough way to spend some time in a longer, slow and somewhat glum period of life.)

To be continued...