AI Composition: New, artistic workflows in dealing with generative audio tools

V. New Qualities in Working with Generative AI Tools

5.1 New artistic workflows with selection and generation tools

It has become clear that there are various areas of application for working with audio data and AI. As with new art and music movements of the past, technical practices can establish themselves at various points in these processes, which can lead to the development of new, genuine styles and genres. Examples discussed in this text include generative sound transformations, AI covers, live interpretations of latent spaces, conceptual arrangements of sounds, transcriptions of AI, and improvisation with live concatenative synthesis. There are already many more ways to work with AI artistically and compositionally, and further ones will emerge in the future.

Each area of the scheme in itself can be the basis or main method of an artistic concept. For example, a conceptual work could be created from the very first step, the systematic and automated search for sounds.

The different areas can be combined; some steps can also be swapped or performed several times in sequence. This results in new, sometimes unusual workflows that can offer new perspectives on sound source materials, on the organisation and production of sound and on forms of composition.

5.2 Automated, but not automatic

However, this also means that we are saying goodbye to the idea that AI composition with audio files is done in one go by a complete automation system.

Occasionally you hear the somewhat smug question of what percentage of a work is AI-generated and what percentage is your own input. One can clearly say that no composition can be created without your own concept, or at least your own idea of what you want to try out. This is why, as of 2024, an AI composition with self-trained sounds is the composer's own creation as a whole. As already mentioned, this does not apply to large, commercial models, in which the user is more likely to trigger results that are pre-standardised for coherence.

5.3 Resistance and distance

It should also have become clear that we can say goodbye to the idea that anything is easier to achieve with AI. On the contrary: many work steps are difficult to control and you can only really achieve something by trial and error and with conceptual objectives.

This also brings us to the interesting aspect of working with these tools: a dimension is inserted between our intentional actions and their result. This creates a distance that can lead us astray, but in positive cases can also surprise us. In any case, we are forced to think carefully about what we actually want and how we can reach our goal with the means at hand. Whether we ultimately succeed is another question.

5.4 New editing techniques

On the other hand, AI audio tools enable a variety of concrete, new editing techniques that we could only have dreamed of a few years ago. The most important are:

sound separation
sound continuation
sound cleaning (of reverberation, secondary sounds, etc.)
segment groupings through clustering
timbre transfer
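
To make one of these techniques more tangible, segment grouping through clustering can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the method used by any particular tool: the function name kmeans, the two features per segment (loudness and brightness) and all numbers are hypothetical, and a real workflow would first extract much richer features from the audio.

```python
import math

def kmeans(points, k, iterations=20):
    """Group feature vectors into k clusters with plain k-means."""
    # Deterministic start: use the first k points as initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each segment joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

# Hypothetical features per segment: (loudness, brightness), both scaled 0..1.
segments = [
    (0.10, 0.20), (0.15, 0.25), (0.12, 0.18),  # quiet, dark segments
    (0.90, 0.80), (0.85, 0.90), (0.95, 0.85),  # loud, bright segments
]
labels = kmeans(segments, k=2)
print(labels)  # one cluster index per segment
```

Segments that share a label can then be treated as one group of sound material, for example arranged together or contrasted with another group. The number of clusters k remains a compositional decision rather than something the algorithm provides.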

5.5 New types of authorship

New types of collaboration and collective authorship will also be of interest. Take AI covers, for example, where anyone can now try to compose music that imitates not only the style but also the sound of a particular music group. This certainly raises a lot of questions, but should not be viewed negatively in all cases; it could also be seen as a fun, collective continuation of cultural narratives.

5.6 Outlook

I assume that more and more artistic disciplines will emerge in the field of AI-based tools in the future. It can be assumed that artistic innovations in this area will primarily come from people who deal with this technology in more detail or perhaps even specialise in it.

Photography, a technology newly introduced in the 19th century and partly utilised artistically, offers an interesting comparison. Photography is simple: anyone can take a photograph, but that doesn't automatically make them a photographer. Activating AI models can also be child's play, depending on the interface. However, a person who generates images with Midjourney today is not seen as someone who achieves anything special artistically. That recognition might rather go to someone who has specialised in working with Midjourney and knows exactly how to use special formats and special prompts to get something unexpected and above-average out of the model, but this will most likely come at the cost of more time. The pain of invention does not seem to be reducible.

Photography as an analogy is also interesting because it shows how long it has taken for a new technology to establish itself as an art form in its own right. In the case of AI, however, this will probably not take a hundred years.

It can also be assumed that the impact of AI will be felt in two directions: in the results of using its tools, but also in those areas that consciously refrain from using AI or that use their own resources to form equivalents to AI paradigms. Every innovation motivates a series of counter-proposals.