Self-organizing maps (SOMs) as an AI tool for music analysis and production
4. Application
4.1. Harpsichords and Grand Pianos
A SOM was trained to distinguish between the sounds of historical harpsichords, hammer pianos, and modern grand pianos. It can be explored in an interactive online demo [to do: link].
This example uses timbral parameters derived from the spectrum, such as the spectral centroid and the spectral spread, as input features to train the SOM. The u-matrix shows a large black area in the middle of the map. A dark background means that neighboring nodes are similar to each other, and so are the items sorted into this region. A closer look reveals that these items are mainly historical hammer pianos. Clicking on the items confirms that they sound quite similar (apart from differences in tuning), especially when compared to the other items on the map.
In contrast, where the background is lighter, neighboring nodes of the map differ from each other, and the assigned items may sound different even though they are sorted close to each other. This can be heard by clicking on items in the white region on the lower left, where various historical instruments can be found.
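The relation between background shade and node similarity can be illustrated with a minimal u-matrix sketch. It assumes a rectangular grid with four direct neighbors per node and Euclidean distances; the demo's actual grid topology and distance measure are not stated here, and the function name `u_matrix` and the toy weights are hypothetical.

```python
import math

def u_matrix(weights):
    """Mean Euclidean distance from each node to its direct grid neighbors.

    `weights` is a rows x cols grid (nested lists) of weight vectors.
    Assumption: rectangular grid, 4-neighborhood.
    """
    rows, cols = len(weights), len(weights[0])
    umat = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            # A single-node map has no neighbors; report 0.0 in that case.
            umat[r][c] = sum(dists) / len(dists) if dists else 0.0
    return umat

# Toy 3x3 map with 2-D weights (made-up values): the left column
# is homogeneous, the right column lies far away in feature space.
toy_weights = [
    [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]],
    [[0.0, 0.1], [0.1, 0.1], [1.0, 0.9]],
    [[0.0, 0.2], [0.1, 0.2], [1.0, 0.8]],
]
toy_umat = u_matrix(toy_weights)
```

With the color scheme described above, small values would be drawn dark (similar neighbors) and large values light (dissimilar neighbors). In the toy map, the nodes of the middle column border the distant right column and therefore receive larger values than the nodes of the left column.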
The buttons above the map change the background of the SOM to show the individual component planes. Each component plane describes how one specific feature is distributed over the whole Kohonen map. When the sound pressure level is selected, for example, it is clearly much higher in the region on the upper left. This stands to reason, as the modern grand pianos are found there, which is easily verified by clicking on the individual items.
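A component plane is simply one feature's slice through the node weights. The following sketch assumes the weights are stored as a rows x cols grid of feature vectors; the feature ordering in `feature_names`, the helper `component_plane`, and the toy values are hypothetical and not taken from the demo.

```python
def component_plane(weights, feature_index):
    """Extract the value of one feature at every node of the map."""
    return [[node[feature_index] for node in row] for row in weights]

# Hypothetical feature ordering within each weight vector.
feature_names = ["spectral_centroid", "sound_pressure_level"]

# Toy 2x2 map (made-up values): the second feature is highest
# in the upper-left node.
toy_weights = [
    [[0.2, 0.9], [0.3, 0.7]],
    [[0.6, 0.3], [0.8, 0.1]],
]

spl_plane = component_plane(
    toy_weights, feature_names.index("sound_pressure_level")
)

# Grid position (row, column) of the node with the highest value.
peak = max(
    ((r, c) for r in range(len(spl_plane)) for c in range(len(spl_plane[0]))),
    key=lambda rc: spl_plane[rc[0]][rc[1]],
)
```

Rendering `spl_plane` as the map background corresponds to pressing one of the buttons: the plane shows where on the map a single feature takes high or low values, independently of all other features.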
Another parameter that usually has a strong influence on distinguishing between modern and historical pianos is the spectral flux [4]. It describes how the spectrum changes over time after a key is struck. At the beginning the sound is usually bright; as it decays, it becomes darker. This can be heard when listening to the modern pianos. The sound of the historical pianos also changes, but in a different way. Hence, the instruments can be distinguished by humans as well as by AI.
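The change from a bright attack to a darker decay can be sketched with two frame-wise features. The example below assumes magnitude spectra have already been computed for successive time frames, and it uses one common definition of spectral flux (the Euclidean norm of the difference between consecutive magnitude spectra); the exact definition used in [4] may differ. The function names and the toy spectra are hypothetical.

```python
import math

def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency of one spectrum (in Hz)."""
    total = sum(mags)
    if total <= 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total

def spectral_flux(frames):
    """Euclidean distance between each pair of consecutive spectra."""
    return [
        math.sqrt(sum((b - a) ** 2 for a, b in zip(prev, cur)))
        for prev, cur in zip(frames, frames[1:])
    ]

# Toy data (made-up values): four frequency bins, three time frames
# of a decaying tone whose high partials fade fastest.
freqs = [250.0, 500.0, 1000.0, 2000.0]
frames = [
    [1.0, 0.8, 0.6, 0.4],  # attack: strong high partials
    [0.8, 0.5, 0.2, 0.1],  # early decay
    [0.6, 0.3, 0.1, 0.0],  # late decay
]

centroids = [spectral_centroid(freqs, frame) for frame in frames]
flux = spectral_flux(frames)
```

In this toy example the centroid falls from frame to frame, which corresponds to the sound becoming darker, and the flux is largest right after the attack, where the spectrum changes most.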
While the u-matrix reveals similarities and differences between the items, the component planes show the reasons for those differences.