Abstract
Marmosets, with their highly social nature and complex vocal communication system, are important models for comparative studies of vocal communication and, eventually, language evolution. However, our knowledge of marmoset vocalisations originates predominantly from playback studies or vocal interactions between dyads, and there is a need to move towards studying group-level communication dynamics. Efficient source identification from marmoset vocalisations is essential for meeting this challenge, and machine learning algorithms (MLAs) can help achieve it. Here we built a pipeline capable of extensive feature extraction, meaningful feature selection, and supervised classification of vocalisations of up to 18 marmosets. We optimised the classifier by building a hierarchical MLA that first determined the sex of the source, then narrowed the candidate source individuals to those of that sex, and finally determined the source identity. We were able to correctly identify the source individual with high precision (87.21% – 94.42%, depending on call type, and up to 97.79% after the removal of twins from the dataset). We also examined the robustness of identification across varying sample sizes. Our pipeline is a promising tool not only for source identification from marmoset vocalisations but also for analysing vocalisations and tracking vocal learning trajectories of other species.
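The two-stage idea behind the hierarchical MLA (predict sex first, then choose among individuals of that sex) can be illustrated with a minimal sketch. This is not the pipeline itself: the nearest-centroid classifier is a stand-in chosen only to keep the example dependency-free, and the caller labels (`F1`, `F2`, `M1`, `M2`), the two-dimensional features, and the class names `NearestCentroid` and `HierarchicalCallerID` are hypothetical.

```python
import math
import random
from collections import defaultdict


class NearestCentroid:
    """Stand-in classifier (an assumption): predicts the label of the closest class mean."""

    def fit(self, X, y):
        sums = {}
        counts = defaultdict(int)
        for row, label in zip(X, y):
            prev = sums.get(label, [0.0] * len(row))
            sums[label] = [s + v for s, v in zip(prev, row)]
            counts[label] += 1
        self.centroids_ = {
            label: [s / counts[label] for s in total]
            for label, total in sums.items()
        }
        return self

    def predict_one(self, row):
        return min(self.centroids_,
                   key=lambda label: math.dist(row, self.centroids_[label]))


class HierarchicalCallerID:
    """Stage 1 predicts sex; stage 2 picks an individual among callers of that sex."""

    def fit(self, X, caller_ids, sex_of):
        sexes = [sex_of[c] for c in caller_ids]
        self.sex_model_ = NearestCentroid().fit(X, sexes)
        self.id_models_ = {}
        for sex in set(sexes):
            # Train one identity model per sex, using only that sex's calls.
            subset = [(x, c) for x, c in zip(X, caller_ids) if sex_of[c] == sex]
            self.id_models_[sex] = NearestCentroid().fit(
                [x for x, _ in subset], [c for _, c in subset]
            )
        return self

    def predict(self, X):
        out = []
        for row in X:
            sex = self.sex_model_.predict_one(row)  # narrow the candidates
            out.append(self.id_models_[sex].predict_one(row))
        return out


# Synthetic toy data (hypothetical): four callers, two acoustic features each.
rng = random.Random(0)
sex_of = {"F1": "F", "F2": "F", "M1": "M", "M2": "M"}
true_means = {"F1": (0.0, 0.0), "F2": (0.0, 3.0),
              "M1": (10.0, 0.0), "M2": (10.0, 3.0)}
X, callers = [], []
for caller, (mx, my) in true_means.items():
    for _ in range(20):
        X.append([rng.gauss(mx, 0.3), rng.gauss(my, 0.3)])
        callers.append(caller)

model = HierarchicalCallerID().fit(X, callers, sex_of)
predictions = model.predict(X)
accuracy = sum(p == c for p, c in zip(predictions, callers)) / len(callers)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

One consequence of this design is that a sex misclassification at the first stage cannot be corrected at the second, so the first-stage model needs to be reliable for the hierarchy to help.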