The DIRHA-ENGLISH corpus and related tasks for remote speech recognition in domestic environments

Authors

  • Mirco Ravanelli Fondazione Bruno Kessler (FBK-irst)
  • Luca Cristoforetti Fondazione Bruno Kessler (FBK-irst)
  • Roberto Gretter Fondazione Bruno Kessler (FBK-irst)
  • Marco Pellin Fondazione Bruno Kessler (FBK-irst)
  • Alessandro Sosi Fondazione Bruno Kessler (FBK-irst)
  • Maurizio Omologo Fondazione Bruno Kessler (FBK), Povo, Trento

DOI:

https://doi.org/10.17469/O2102AISV000017

Keywords:

distant speech recognition, microphone arrays, corpora, Kaldi, DNN

Abstract

This paper describes the contents and possible uses of the DIRHA-ENGLISH multi-microphone corpus, created under the EC DIRHA project. The reference scenario is a domestic environment equipped with a large number of microphones distributed in space. The corpus comprises both real and simulated material and includes utterances from 12 US and 12 UK native English speakers. Each speaker uttered different sets of phonetically rich sentences, newspaper articles, conversational speech, keywords, and commands. From this material, a large set of 1-minute sequences was generated, which also includes typical domestic background noise and inter- and intra-room reverberation effects. Development and test sets were then derived. The paper reports a first set of baseline results obtained with different techniques, including Deep Neural Networks (DNNs), in line with the international state of the art. Various tasks and Kaldi recipes have already been developed.

Published

31-12-2016
