Interactive Machine Learning for Music
2. Machine learning: A good tool for creating mappings
2.2 No need for programming models by hand
The most exciting thing about using supervised learning to build a mapping function is that a learning algorithm (i.e., a classification or regression algorithm) can produce the mapping function for us, rather than requiring a programmer to write the function by hand (Figure 3). Specifically, the learning algorithm takes a set of paired inputs and outputs, called the training data, and does its best to infer a model function that accurately captures the relationship between those inputs and outputs.
Figure 3: A supervised learning algorithm uses training data to build a model. It attempts to build a model that captures the relationships between inputs and outputs in the training dataset.
For instance, as shown in Figure 4, if a regression algorithm sees the example inputs “4, -11, 13” paired with the example outputs “5, -10, 14”, it may produce the model function “output = 1 + input”, reflecting the relationships between the inputs and outputs in these training data. If this model sees the value “20” as an input, it will thus produce the value “21” as its output.
Figure 4: This regression algorithm has produced the model function “output = 1 + input” using these training examples.
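The example in Figure 4 can be reproduced in a few lines of code. The sketch below assumes ordinary least-squares linear regression as the learning algorithm; the text does not say which regression algorithm is used, so this is one plausible choice rather than the method behind the figure. The names `fit_line` and `model` are hypothetical and chosen for illustration.

```python
def fit_line(inputs, outputs):
    """Least-squares fit of: output = intercept + slope * input."""
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    # Sum of squared deviations of the inputs, and of input/output co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in inputs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Training data from Figure 4: each input is paired with one output.
train_inputs = [4, -11, 13]
train_outputs = [5, -10, 14]

intercept, slope = fit_line(train_inputs, train_outputs)


def model(x):
    """The learned model function; nobody wrote 'output = 1 + input' by hand."""
    return intercept + slope * x


prediction = model(20)
print(f"output = {intercept} + {slope} * input")
print(prediction)  # 21.0
```

The programmer supplies only the example pairs; the intercept of 1 and slope of 1 are inferred from the data.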
An immediately obvious advantage of this approach is that, if we can supply examples of musician actions—each paired with the sound synthesis or processing parameters we would like to correspond to those actions—the learning algorithm can produce the mapping function for us, without the instrument designer needing to grapple with writing code that appropriately handles high-dimensional and noisy data.
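To make this concrete, here is a minimal sketch of a mapping learned from demonstrations with more than one input dimension. It assumes k-nearest-neighbor regression, chosen only because it is short enough to write out in full; the sensor features, synthesis parameters (a frequency in Hz and a filter amount), and the function name `knn_map` are all hypothetical and do not come from the text.

```python
import math


def knn_map(examples, query, k=2):
    """Average the output parameters of the k training examples nearest to query."""
    ranked = sorted(examples, key=lambda ex: math.dist(ex[0], query))
    nearest = ranked[:k]
    n_params = len(nearest[0][1])
    return [sum(ex[1][i] for ex in nearest) / len(nearest) for i in range(n_params)]


# Hypothetical demonstrations: (3-D sensor features, (frequency in Hz, filter amount)).
# The features are deliberately slightly inconsistent, as noisy sensor data would be.
examples = [
    ((0.0, 0.1, 0.9), (220.0, 0.2)),
    ((0.1, 0.0, 1.0), (220.0, 0.3)),
    ((0.9, 0.8, 0.1), (880.0, 0.9)),
    ((1.0, 0.9, 0.0), (880.0, 0.8)),
]

# A new, unseen gesture that resembles the last two demonstrations.
params = knn_map(examples, (0.95, 0.85, 0.05), k=2)
print(params)
```

No rule relating sensor values to sound parameters is written anywhere in this code. The mapping is defined entirely by the demonstrations, and averaging over neighbors gives some tolerance to noise in the inputs.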
Mapping creation was first identified as a potential use case for supervised learning by Lee et al. in 1991. However, using machine learning was far from straightforward at that time, and their proposed technique never took off. For one thing, while they showed that mappings created with a neural network could run in real time during performance, they mention that training the network was “computationally intensive” (which I assume means that it required hours, at minimum). Further, while they published a mathematical description of their approach, neither they nor anyone else created ML software tools that instrument builders could use, so a large technical barrier remained for anyone wanting to experiment with ML mappings.