Interactive Machine Learning for Music

Prof. Rebecca Fiebrink

1. Motivation: Why IML for music?

1.1 The "mapping problem"

The computational processing that determines how a DMI’s sound should be influenced by its sensor values is often referred to as the instrument’s “mapping.” Hunt et al. (2003) describe this mapping as defining “the very essence of an instrument”; the mapping strongly influences what sounds can be played, what musician gestures may be used to play them, how comfortable a musician is, how a musician looks to an audience, and how easily a musician can learn or repeat particular musical material.

Yet, designing effective (e.g., musically satisfying) mappings can be very difficult in practice. For instance, an instrument may use many sensors to capture a musician’s actions, and/or an individual sensor may produce many dimensions of numbers (e.g., we might want the x, y, and z coordinates of a point tracked in 3D space, or to compute the energy in multiple frequency bands of a signal). Likewise, our sound synthesis or processing algorithm(s) may require many parameters to be set simultaneously to enable rich control over the sound. This can present an instrument designer with a very difficult task: determining a mapping function (Figure 1) that may take dozens (or even hundreds or more) of inputs and use these to compute dozens (or more) of sound parameters, while also ensuring that the instrument is capable of making musically appropriate sounds with feasible, comfortable (enough), and aesthetically appropriate movements. This task becomes even harder when using any of the many sensors that produce noisy signals; for instance, cameras, microphones, and many types of hardware sensors inevitably produce slightly different values at different times, even for the same musician actions.

Fig. 1: A mapping translates sensor input values into sound synthesis or processing parameters.


As a result, implementing a mapping function by writing code is often a painstaking process, even for instrument builders who are expert programmers. For decades, researchers have explored alternative mechanisms for designing these functions, including using clever mathematical operations (e.g., Bencina 2005, Bevilacqua et al. 2005) as well as machine learning, as we will discuss in the next section.
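To make the difficulty concrete, the sketch below (a hypothetical Python example, not from the original text) hand-codes a tiny mapping with only three sensor inputs and three synthesis parameters. The sensor names, scaling constants, and parameter ranges are all assumptions chosen for illustration; even at this small scale, every curve, constant, and threshold must be chosen, tested, and re-tuned by hand, and a real instrument with dozens of noisy inputs multiplies that effort.

# Hypothetical hand-coded mapping: three sensor values in, three synthesis parameters out.
# Every constant below must be chosen and re-tuned by hand whenever the instrument,
# the performer, or the desired sound changes.

def map_sensors_to_synth(accel_x, accel_y, pressure):
    """Translate raw sensor readings (assumed to lie roughly in 0.0-1.0) into synthesis parameters."""
    # Pitch: tilt on the x axis sweeps two octaves above 220 Hz.
    frequency_hz = 220.0 * (2.0 ** (accel_x * 2.0))

    # Loudness: pressure controls amplitude, with a hand-tuned curve and a noise gate
    # to suppress jitter from a noisy sensor at rest.
    amplitude = 0.0 if pressure < 0.05 else pressure ** 1.5

    # Brightness: y tilt opens a low-pass filter between 500 Hz and 8 kHz.
    cutoff_hz = 500.0 + accel_y * 7500.0

    return {"frequency_hz": frequency_hz,
            "amplitude": amplitude,
            "cutoff_hz": cutoff_hz}

# Example: a gentle tilt with moderate pressure.
print(map_sensors_to_synth(accel_x=0.3, accel_y=0.6, pressure=0.4))

In practice, the resulting parameter values would then be sent on to a synthesis engine (for example over OSC or MIDI); that transport layer is omitted here to keep the sketch focused on the mapping itself.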

Of course, this view of the technological components of an instrument as inputs, a mapping, and outputs is quite simplistic, and in practice instruments may have many such mappings, pieced together in different ways, as well as other processes that do not fit into this framework. Nevertheless, mappings are a good starting point for thinking and reasoning about DMI design, and other interactions that you might find in live performance can align well with this framework too (e.g., visuals or robots that respond to live audio or gesture).