AI Composition: New, artistic workflows in dealing with generative audio tools

Genoël von Lilienstern

II. AI Tools In the Context of Society and Collectivity

2.1 Differentiation from the commercial use of large models

It should first be noted which types of AI tools are not covered in this discussion. The Suno.ai platform¹ will serve as a current example. From text prompts, it generates songs that could pass for human-made and be played on the radio on a Thursday afternoon.

Suno is building a future where anyone can make great music. Whether you're a shower singer or a charting artist, we break barriers between you and the song you dream of making. No instrument needed, just imagination. From your mind to music.

If you spend more time with Suno, however, the limitations and the relative standardisation of the musical results become apparent. In general, the user only operates the top level of a large, fully trained model. The interface is also optimised so that the results meet an expectation of coherence oriented towards a mainstream idiom. The aim is to meet our expectations, and the large models do this very well, especially within a certain standard.

[Figure: my latest cuisine]

The question remains whether such music carries a genuine sense of intention. Whether music means and wants something becomes a relevant musical parameter. A convincing avant-garde orchestral work or a touching Delta-style blues that you want to listen to more than once is unlikely to be created if there is no subjective orientation or, to use the somewhat overused term, no agency.

It is important to remember that no completely new sounds or new musical structures are being created here, but that labelled, human-made music forms the basis.

It is also interesting to see who Suno is primarily aimed at. The fact that it allows non-musicians to influence a musical creation by means of text prompts is probably one of the app's main selling points. This form of recompiling pre-trained results is more of an amusement; however, it probably offers a practical opportunity to try out musical and especially lyrical ideas. We can certainly expect this type of generation to produce even more differentiated, more astonishing results in the future.

A musician usually doesn't want to just decide between A, B, C or D, but wants to make the most far-reaching decisions possible. This starts with the question of which sounds form the basis of the training. In concrete terms, our own creative work with machine learning and neural networks is about training models ourselves.

The question of artistic control can be viewed from a political or moral perspective. It is often asked whether musicians, whose music or voice is used as training data for commercial models, should share in the profit from the results. There is also an ecological question, which should not be neglected, as training large models requires a lot of energy.

Anyone interested in working with neural networks in the long term should (a) train by themselves and (b) work with small models.


2.2 Collectivity, knowledge exchange in online forums

There is another, social aspect to working with AI-based tools, that of collectivity.

For example, learning on the internet is an activity based on networking. Since programming languages and neural networks are so complex and time-consuming to work with, it can be helpful to organise oneself as a small group or collective in order to benefit from different interests and specialisations.

In the group ktonal,² which I founded a few years ago with composer friends, we benefited from the fact that some members had stronger mathematical skills and knowledge of networks while others were very good at giving feedback on possible musical goals. In this way, speculative ideas could be expressed which were then brought down to earth by the technical team. Conversely, the members with primarily compositional interests could test new ideas in collaborative online notebooks (e.g. Jupyter) and report back on whether these would be helpful for sound experiments.

Online forums are generally extremely helpful for solving problems, and problems with code and training are very common. Discord has been particularly helpful. There are channels with various sub-topics for AI projects where you can read about the experiences of others working on similar projects. Often you can also post your own code or error messages and get helpful responses.

I have set up a public channel on Discord, MUTOR AI, for this text and the associated video, to invite an interactive exchange about the content presented here.


2.3 Problem solving with the help of language models

Last but not least, ChatGPT and similar language models are a great resource for finding solutions to problems with code or other neural network questions. Personally, I am a bit sceptical about ChatGPT and its ability to help with human questions. However, when it comes to the machine providing information about other machine problems, such as Python code, it must be admitted that ChatGPT works very well. It is also a useful instrument for putting your own questions into words as precisely as possible.


1 https://suno.com/; see also https://www.udio.com/
2 ktonal website: https://ktonal.com/