Phonological patterns in the predictions of a syllable-based end-to-end ASR system

Authors

  • Sara Picciau, Facoltà di Scienze della Formazione, Libera Università di Bolzano, Italia
  • Domenico De Cristofaro, Facoltà di Scienze della Formazione, Libera Università di Bolzano, Italia
  • Alessandro Vietti, Facoltà di Scienze della Formazione, Libera Università di Bolzano, Italia, https://orcid.org/0000-0002-4166-540X

DOI:

https://doi.org/10.17469/O2112AISV000014

Keywords:

syllable, ASR, pretrained neural models, discrete speech units

Abstract

This paper explores the role of syllables in automatic speech recognition (ASR) systems, with a focus on the linguistic implications of that role. Traditionally, ASR has processed speech at the segmental level, but recent research suggests that syllabic processing is important for robust recognition. We trained a neural ASR model to recognize phonological syllables and conducted a linguistic analysis of its output. Our objective was to observe how factors such as syllable token frequency, lexical accent position, syllable type, and part of speech influence the neural representation of syllables. To achieve this, we developed a fine-grained linguistic annotation system that overcomes the limitations of quantitative metrics such as Word Error Rate. By applying Multiple Correspondence Analysis, we identified patterns of association between the neural network's output behavior and the linguistic features of speech. Specifically, the study shows that the network compensates for low-frequency syllables through substitution strategies, particularly with absent tokens, which are complex syllables that often occur in proper nouns, common nouns, or numerals. Unstressed high-frequency tokens, such as subordinating conjunctions and determiners, tend toward deletion, while mid-frequency syllables with simple structures (CV) achieve optimal recognition. This indicates the network's ability to reflect natural language processing patterns based on token frequency and syllabic complexity. Our findings provide insights into the role of syllables in ASR and contribute to ongoing research in this field.
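
The outcome categories named in the abstract (correct recognition, substitution, deletion) can be illustrated with a minimal sketch. It assumes a standard edit-distance alignment between a reference syllable sequence and the model's predicted sequence; the abstract does not describe the authors' actual annotation procedure, so this is not their method. The function name `align_syllables` and the example syllables are hypothetical.

```python
from collections import Counter


def align_syllables(ref, hyp):
    """Align two syllable sequences with Levenshtein distance.

    Returns a list of (operation, reference_syllable, predicted_syllable).
    This is a generic alignment, assumed here only for illustration.
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right cell to recover the operations.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "correct" if ref[i - 1] == hyp[j - 1] else "substitution"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("deletion", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("insertion", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


# Hypothetical reference and predicted syllable sequences.
reference = ["la", "ca", "sa", "di", "strin", "ge"]
predicted = ["ca", "sa", "di", "stin", "ge"]

ops = align_syllables(reference, predicted)
counts = Counter(op for op, _, _ in ops)
```

Each aligned syllable could then be paired with its linguistic features (token frequency, stress, syllable type, part of speech) to build the categorical table on which an association analysis such as Multiple Correspondence Analysis operates.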

Published

30-12-2024
