The DIRHA-ENGLISH corpus and related tasks for distant speech recognition in domestic environments

Authors

  • Mirco Ravanelli, Fondazione Bruno Kessler (FBK-irst)
  • Luca Cristoforetti, Fondazione Bruno Kessler (FBK-irst)
  • Roberto Gretter, Fondazione Bruno Kessler (FBK-irst)
  • Marco Pellin, Fondazione Bruno Kessler (FBK-irst)
  • Alessandro Sosi, Fondazione Bruno Kessler (FBK-irst)
  • Maurizio Omologo, Fondazione Bruno Kessler (FBK), Povo, Trento

DOI:

https://doi.org/10.17469/O2102AISV000017

Keywords:

distant speech recognition, microphone arrays, corpora, Kaldi, DNN

Abstract

This paper describes the contents and possible uses of the DIRHA-ENGLISH multi-microphone corpus, created within the EC DIRHA project. The reference scenario is a domestic environment equipped with a large number of microphones distributed in space. The corpus comprises both real and simulated material and includes utterances from 12 US and 12 UK native English speakers. Each speaker uttered different sets of phonetically rich sentences, newspaper articles, conversational speech, keywords, and commands. From this material, a large set of 1-minute sequences was generated, which also include typical domestic background noise and inter- and intra-room reverberation effects. Development and test sets were derived from these sequences. The paper reports a first set of baseline results obtained with different techniques, including Deep Neural Networks (DNNs), in line with the international state of the art. Various tasks and Kaldi recipes have already been developed.

Published

31-12-2016