Cross-Lingual Transferability of Voice Analysis Models: a Parkinson’s Disease Case Study
DOI:
https://doi.org/10.17469/O2111AISV000007
Keywords:
Natural Language Processing, Deep Learning, Speech Analysis, Parkinson's Disease, Domain Adaptation, Multilingual
Abstract
Traditionally, speech analysis has relied on a set of highly informative features, such as the (Mel) spectrogram, Mel Frequency Cepstral Coefficients (MFCC), pitch, or intensity, to build speech-powered applications. Recently, deep learning-based models for acoustic feature extraction have significantly improved the state of the art in many speech-related applications. In this work, we analyse the cross-lingual transferability of speech analysis features. The aim is to understand whether, and how well, a classification model trained on speech features in a source language works on an unseen target language. We evaluate these properties by analysing models for Parkinson's disease detection from speech, adapting the models from English to Telugu. Results show that multilingual pre-trained deep learning-based features do not require explicit adaptation and work well out of the box. In contrast, models that do not transfer out of the box respond well even to unsupervised adaptation on a small data set.
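The evaluation setup described in the abstract (train on a source language, test on an unseen target language, then adapt without target labels) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic stand-ins for acoustic feature vectors, the nearest-centroid classifier is a placeholder, and per-language standardization is only one simple form of unsupervised adaptation. The abstract does not state which classifier or adaptation technique the authors used.

```python
import random
import statistics

def make_domain(rng, n, shift, scale, dim=4):
    """Synthetic stand-in for one language's feature vectors (hypothetical data).
    Label 1 differs from label 0 only in the first feature; shift and scale
    mimic a language-dependent change in the feature distribution."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feats = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        feats[0] += 2.0 * label
        data.append(([f * scale + shift for f in feats], label))
    return data

def standardize(data):
    """Assumed adaptation step: z-score each feature using the statistics of
    the language it came from. No labels are used."""
    cols = list(zip(*(x for x, _ in data)))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [([(v - m) / s for v, m, s in zip(x, means, stds)], y) for x, y in data]

def fit_centroids(data):
    """Placeholder classifier: one mean feature vector per class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in data if y == label]
        centroids[label] = [statistics.fmean(c) for c in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label of the closest class centroid."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, data):
    return sum(predict(centroids, x) == y for x, y in data) / len(data)

rng = random.Random(0)
source = make_domain(rng, 400, shift=0.0, scale=1.0)  # labelled source language
target = make_domain(rng, 200, shift=3.0, scale=2.0)  # unseen target language

# Zero-shot transfer: fit on raw source features, test on raw target features.
acc_raw = accuracy(fit_centroids(source), target)

# Unsupervised adaptation: standardize each language separately, then transfer.
acc_adapted = accuracy(fit_centroids(standardize(source)), standardize(target))

print(f"target accuracy without adaptation: {acc_raw:.2f}")
print(f"target accuracy with per-language standardization: {acc_adapted:.2f}")
```

The target labels are used only to score the two settings, never to fit or adapt the model, which is what makes the adaptation step unsupervised in this sketch.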
Published
29-12-2023
Issue
Section
Articles
License
Copyright (c) 2023 AISV - Associazione Italiana di Scienze della Voce [Italian Association for Speech Sciences]

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.