Gender bias in voice recognition: An i- and x-vector-based gender-specific automatic speaker recognition study

Thayabaran Kathiresan

doi:10.17469/O2108AISV000006

Authors

Thayabaran Kathiresan Department of Computational Linguistics, University of Zurich, Switzerland https://orcid.org/0000-0002-7721-2699

DOI:

https://doi.org/10.17469/O2108AISV000006

Keywords:

speaker recognition, i-vectors, x-vectors, gender difference, speaker embeddings

Abstract

Physiological differences between adult males and females lead to acoustic differences in speech production. This acoustic variability between the genders affects automatic speech processing applications, especially automatic speaker recognition systems. In this paper, the gender-specific performance of two state-of-the-art automatic speaker recognition algorithms, i-vector and x-vector, is studied by training the algorithms on a gender-balanced multilingual dataset and testing them on gender-separated data from two languages (English and Mandarin). Furthermore, the high-dimensional distributions of the generated i- and x-vector speaker embeddings are analysed using the t-SNE technique. The area distribution of the speaker embeddings aids interpretation of the speaker recognition performance of both algorithms.
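
As an illustration of what an area-based reading of a t-SNE map could look like, the following minimal sketch computes the convex-hull area occupied by each speaker's points in a two-dimensional projection. It rests on several assumptions that are not stated in the abstract: the embeddings are taken to be already projected to 2-D (for example by t-SNE), convex-hull area is used as a stand-in for the study's area measure, and the function names, speaker labels (`f01`, `m01`) and toy coordinates are hypothetical.

```python
from collections import defaultdict


def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; degenerate polygons (fewer than 3 vertices) have area 0."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def hull_area_by_speaker(points, speaker_ids):
    """Group 2-D points by speaker and return each speaker's convex-hull area."""
    groups = defaultdict(list)
    for point, speaker in zip(points, speaker_ids):
        groups[speaker].append(tuple(point))
    return {spk: polygon_area(convex_hull(pts)) for spk, pts in groups.items()}


# Hypothetical toy data: 2-D coordinates standing in for projected embeddings.
toy_points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (5, 5), (6, 5), (5, 6)]
toy_speakers = ["f01"] * 5 + ["m01"] * 3
areas = hull_area_by_speaker(toy_points, toy_speakers)
```

In this toy case the speaker labelled `f01` covers a larger region than `m01`. Areas computed this way could be compared across gender groups, bearing in mind that t-SNE distances depend on its settings and are not preserved from the original embedding space.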