
AI Composition: New, artistic workflows in dealing with generative audio tools

Genoël von Lilienstern

IV. The Individual Areas of Work

4.5 Selection and 4.6 Composition

4.5 The Selection of Results

4.5.1 The problem of countless sound files

If you have not, as described above, found a way to use the generation process itself as the basis of a composition or a live performance, you are often confronted with a large number of results from training and generation, and must invest considerable time in deciding which of them to use as the basis for a composition or performance.

4.5.2 Conceptual approaches to outputs

In general, working with generative tools demands conceptual imagination from the user. Without clear ideas and goals, these tools quickly come to resemble a vehicle with a powerful engine but no steering wheel. Both the selection of material and the form of its sonic presentation require ideas that give the work a direction and the compositional result a shape.

Alexander Schubert has found a very elegant way of dealing with this multitude of results as part of his Avery17 project. The associated website contains ten thousand folders, each holding short pieces of music that Avery has generated. Visitors to the website can listen their way through this seemingly unmanageable amount of AI music.

In his piece Despacito Variations (2023) for electric guitar and effect pedals, Malte Giesen had a recording of a short sequence from the guitar hit "Despacito" processed by the sample-rnn18 network over one thousand epochs. The results of this training, however, cannot be heard directly in the piece itself. Instead, they form the basis of a transcription characterised by peculiar, glitch-like sounds and strange, otherwise unimaginable loops.

4.5.3 Measuring deviations

At this point, one could well imagine an automated comparison between the source material and the newly generated sounds, i.e. a measurement of the degree of deviation. In his notebooks k-best and longest output,19 Antoine Daurat has attempted a solution to this problem.
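A minimal sketch of such a deviation measure could compare the averaged log spectra of a source and a generated file. Everything here, the function names and the chosen feature alike, is an illustrative assumption, not the method actually used in Daurat's notebooks:

```python
import numpy as np

def log_spectrum(signal, n_fft=1024):
    """Average magnitude spectrum over non-overlapping Hann-windowed
    frames, on a log scale."""
    frames = [signal[i:i + n_fft] for i in range(0, len(signal) - n_fft + 1, n_fft)]
    mags = [np.abs(np.fft.rfft(f * np.hanning(n_fft))) for f in frames]
    return np.log1p(np.mean(mags, axis=0))

def spectral_deviation(source, generated, n_fft=1024):
    """Euclidean distance between the average log spectra of two signals:
    a single number for how far the generation strayed from the source."""
    return float(np.linalg.norm(log_spectrum(source, n_fft) - log_spectrum(generated, n_fft)))

# Synthetic demo: a 440 Hz tone compared with itself and with noise
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
noise = np.random.default_rng(0).normal(0.0, 0.3, sr)
d_same = spectral_deviation(tone, tone)
d_noise = spectral_deviation(tone, noise)
```

With such a number per file, one could rank hundreds of generated results by their distance from the training material and keep only the k closest, or the k most distant, depending on the aesthetic goal.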

4.5.4 Back to step two

Of course, a clusterer like Audiostellar could also be brought into play at this point. One could imagine a kind of loop in the composition process: jumping back to step two, preparation and selection, and using the results of the generation as a database for further training.

4.5.5 Live clustering and concatenative synthesis

Another way to utilise sound results in live electronics is the IRCAM/Ableton plug-in Coalescence,20 which is based on the principle of concatenative synthesis.

Here you can split one or more sound files into segments, distribute them on a two-dimensional graphical plane and then match them against an incoming live signal. The first step is more or less identical to Audiostellar, with the addition that the resulting matrix can now be played as a live instrument.
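The matching step can be sketched in a few lines: segments are described by a small feature vector, and each incoming live frame triggers the nearest segment. The fixed-length segmentation and the two hand-picked features (RMS energy and spectral centroid) are simplifying assumptions, not Coalescence's actual analysis:

```python
import numpy as np

def describe(frame):
    """Two coarse features for a mono frame: RMS energy and spectral centroid."""
    mag = np.abs(np.fft.rfft(frame))
    bins = np.arange(len(mag))
    centroid = np.sum(bins * mag) / (np.sum(mag) + 1e-12)
    rms = np.sqrt(np.mean(frame ** 2))
    return np.array([rms, centroid])

def build_corpus(signal, seg_len):
    """Split a sound file into equal segments and describe each one."""
    segs = [signal[i:i + seg_len] for i in range(0, len(signal) - seg_len + 1, seg_len)]
    return segs, np.stack([describe(s) for s in segs])

def match(live_frame, feats):
    """Index of the corpus segment whose features are closest to the live frame."""
    return int(np.argmin(np.linalg.norm(feats - describe(live_frame), axis=1)))

# Demo: corpus from a low and a high tone; the live input resembles the high tone
sr, seg_len = 8000, 512
t = np.arange(sr) / sr
corpus_sound = np.concatenate([np.sin(2 * np.pi * 220 * t), np.sin(2 * np.pi * 1760 * t)])
segs, feats = build_corpus(corpus_sound, seg_len)
live = np.sin(2 * np.pi * 1760 * np.arange(seg_len) / sr)
idx = match(live, feats)  # lands on a high-frequency segment of the corpus
```

In a real live setting, the loop over incoming frames would run in an audio callback, and the matched segments would be cross-faded rather than butted together.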

In the composition Holometabola21 for the clarinettist Carola Schaal, I arranged processed clarinet sounds in four different coalescence modules in mimikit, which Carola Schaal then explores by listening and playing during an improvisational passage in the piece.

Another example of live playing with clustered sound segments is the sound performance Tomomibot22 by Tomomi Adachi and the Berlin programmer Andreas Dzialocha.23


4.6 Composition

4.6.1 Reassembling tonal passages

The final assembly of results into a composition is probably the point at which most musicians would prefer to take matters into their own hands. Nevertheless, it is also exciting to consider whether the arrangement of larger parts could be handled by an AI as well.

In auto-arrangement apps such as Jamahook, the main aim seems to be to select parts from an associated library that already match each other. It would be more interesting to be able to associate any kind of sound with other sounds according to various criteria.

4.6.2 Concatenative synthesis

Another possibility is concatenative synthesis as implemented in the programme Audioguide24 by Ben Hackbarth.

In target operations, individual sounds are mapped onto a long basic sound: not from a single latent space, as in timbre transfer, but mosaic-like, through many matches of individual sound files. Here, too, a strategy could be devised to create entire compositions this way.
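The mosaic idea can be illustrated with a toy example that rebuilds a target sound frame by frame from a pool of corpus segments. Matching on RMS energy alone is a deliberately crude stand-in for the richer audio descriptors a real tool would use; all names here are illustrative:

```python
import numpy as np

def mosaic(target, corpus, seg_len):
    """Rebuild a long target sound frame by frame from a pool of short
    corpus segments, picking for each target frame the segment whose
    RMS energy is closest (a deliberately crude matching criterion)."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    corpus_rms = np.array([rms(s) for s in corpus])
    frames = []
    for i in range(0, len(target) - seg_len + 1, seg_len):
        frame = target[i:i + seg_len]
        frames.append(corpus[int(np.argmin(np.abs(corpus_rms - rms(frame))))])
    return np.concatenate(frames)

# Demo: the target is a fade-in; the corpus holds noise at three fixed levels,
# so the mosaic approximates the fade as a staircase of loudness levels
seg_len = 256
rng = np.random.default_rng(1)
corpus = [a * rng.uniform(-1, 1, seg_len) for a in (0.1, 0.5, 1.0)]
target = np.linspace(0, 1, 4 * seg_len) * rng.uniform(-1, 1, 4 * seg_len)
result = mosaic(target, corpus, seg_len)
```

Scaling this up, from one criterion to many descriptors and from a single target to a formal plan of successive targets, is one way such a mosaic could grow into an entire composition.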


4.7 Autospatialisation

So far, there are few examples of AI-based multichannel design. The idea of an AI-transformed spatial auditory impression, or of an automated spatial audio dramaturgy, seems appealing. If any readers have heard of approaches in this direction, please share them in the Discord channel.



11 Ensemble Generator.


12 Prokonsori, Dohi Moon, 2022.


13 KICamp23, AI Concert, Hexorcismus.

14 Github, Acids, IRCAM, nn~.
15 Intelligent Instruments Lab.
16 Huggingface, Intelligent Instruments, rave models.
17 Av3ry.
18 ktonal, mimikit.
19 Colab, Github, ktonal, mimikit.
20 Coalescence.
21 Elbphilharmonie, Decoder Ensemble.

22 Tomomibot with Anton Bruckner 1.

23 adz.garden.
24 Audioguide.