Sound Gesture Intelligence: Multimodal data

Dr. Greg Beller

Gesture recognition and following

Multimodal data

MuBu for Max is a toolbox for multimodal analysis of sound and motion, interactive sound synthesis, and machine learning. It is distributed as a Max package, available through the Max Package Manager and on the IRCAM Forum. It includes:

real-time and batch data processing (sound descriptors, motion features, filtering);
granular, concatenative and additive synthesis;
data visualization;
static and temporal recognition;
regression and classification algorithms.

The MuBu multi-buffer is a container that provides structured memory for real-time processing. Each buffer of a MuBu container combines multiple tracks of aligned data into a complex data structure such as:

segmented audio data with audio descriptors and annotations;
annotated motion capture data;
aligned audio and motion capture data.

Each track of a MuBu buffer represents a regularly sampled data stream or a sequence of time-tagged events such as audio samples, audio descriptors, motion capture data, markers, segments and musical events.
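
To make the container, buffer and track hierarchy described above more concrete, here is a minimal conceptual sketch in Python. It is not the MuBu API (MuBu is used through Max objects); the names `Track`, `Buffer`, `Container`, `index_at` and `at` are hypothetical and chosen only for illustration. The sketch assumes that a track is either regularly sampled (it has a sample rate) or time-tagged (it has a sorted list of times), and that aligned tracks share one time axis in seconds.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Track:
    """One data stream inside a buffer (illustrative model, not the MuBu API)."""
    name: str
    frames: List[Any] = field(default_factory=list)
    sample_rate: Optional[float] = None  # Hz, set for regularly sampled tracks
    times: Optional[List[float]] = None  # seconds, sorted, set for time-tagged tracks

    def frame_time(self, index: int) -> float:
        """Return the time in seconds of the frame at `index`."""
        if self.times is not None:
            return self.times[index]
        return index / self.sample_rate

    def index_at(self, t: float) -> int:
        """Return the index of the last frame at or before time `t`, or -1 if none."""
        if t < 0 or not self.frames:
            return -1
        if self.times is not None:
            # Time-tagged events: binary search in the sorted time list.
            return bisect_right(self.times, t) - 1
        # Regularly sampled stream: the index follows directly from the rate.
        return min(int(t * self.sample_rate), len(self.frames) - 1)


@dataclass
class Buffer:
    """A set of named tracks aligned on a common time axis."""
    tracks: Dict[str, Track] = field(default_factory=dict)

    def add(self, track: Track) -> None:
        self.tracks[track.name] = track

    def at(self, t: float) -> Dict[str, Any]:
        """Return, for every track, the most recent frame at time `t`."""
        snapshot = {}
        for name, track in self.tracks.items():
            i = track.index_at(t)
            snapshot[name] = track.frames[i] if i >= 0 else None
        return snapshot


@dataclass
class Container:
    """A multi-buffer: an ordered collection of buffers (e.g. one per recording)."""
    buffers: List[Buffer] = field(default_factory=list)


# Example: a motion track sampled at 4 Hz aligned with time-tagged markers.
buf = Buffer()
buf.add(Track("accel",
              frames=[[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]],
              sample_rate=4.0))
buf.add(Track("markers", frames=["start", "peak"], times=[0.0, 0.6]))
container = Container(buffers=[buf])

snapshot = buf.at(0.5)
print(snapshot)
```

Querying the buffer at a given time returns one frame per track, which is the sense in which the tracks are "aligned": a motion frame and the marker active at that moment can be read together even though the two tracks are stored with different time bases.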

Different data visualizations using imubu from the MuBu package.

Let's dive into the package to find the tools for gesture tracking and recognition.

MuBu overview available in the MuBu package.