音声・非音声の信頼度を利用した雑音に頑健な音声認識デコーダの検討

Translated title of the contribution: Noise-robust speech recognition decoder using speech/non-speech confidence measures

大西 翼, 岩野 公司, 古井 貞煕, Koji IWANO

Research output: Contribution to journal (Misc)

Abstract

In a speech recognition system, Voice Activity Detection (VAD) is a crucial component for maintaining accuracy. This paper proposes an approach that uses speech/non-speech confidence measures to adjust the scores of recognition hypotheses. To achieve good search performance, it is important to properly adapt the Gaussian mixture models (GMMs) to input utterances and environmental noise. This paper therefore also proposes an unsupervised on-line GMM adaptation method based on maximum a posteriori (MAP) estimation. The robustness of the proposed method is further improved by weighting the GMM update parameters according to the confidence measure for the adaptation data, and adaptation is greatly accelerated by caching the statistics used to adapt the GMMs. Experimental results on the Drivers' Japanese Speech Corpus in a Car Environment (DJSC) show that our approach improves accuracy significantly compared with typical front-end-based VAD methods. The adaptation method significantly improves word accuracy. Moreover, the weighting method improves the robustness of the unsupervised adaptation, and the cache method greatly accelerates the decoding process. Consequently, the proposed adaptive decoding method significantly improves word accuracy under noise with only a minor increase in computational cost.
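The abstract does not give the update equations, so the following is only a minimal sketch of how confidence-weighted on-line MAP adaptation with cached statistics might look. It assumes a one-dimensional GMM, adapts the means only, and uses the textbook MAP mean update with a prior weight `tau`. The class name `OnlineMapGmm`, the per-frame `confidence` value in [0, 1], and the choice of cached quantities are all hypothetical and are not taken from the paper.

```python
import math


class OnlineMapGmm:
    """1-D GMM whose means are MAP-adapted on-line (illustrative assumption)."""

    def __init__(self, weights, means, variances, tau=10.0):
        self.weights = list(weights)
        self.prior_means = list(means)  # prior means stay fixed
        self.means = list(means)        # adapted means
        self.variances = list(variances)
        self.tau = tau                  # prior weight (assumed hyperparameter)
        # Cached sufficient statistics, one entry per mixture component.
        self.occ = [0.0] * len(self.means)    # sum of c_t * gamma_k(t)
        self.first = [0.0] * len(self.means)  # sum of c_t * gamma_k(t) * x_t

    def _log_gauss(self, x, k):
        var = self.variances[k]
        diff = x - self.means[k]
        return -0.5 * (math.log(2.0 * math.pi * var) + diff * diff / var)

    def posteriors(self, x):
        """Component posteriors gamma_k for one frame under the current model."""
        logs = [math.log(w) + self._log_gauss(x, k)
                for k, w in enumerate(self.weights)]
        top = max(logs)  # subtract the maximum for numerical stability
        exps = [math.exp(v - top) for v in logs]
        total = sum(exps)
        return [e / total for e in exps]

    def adapt(self, x, confidence):
        """Update the cached statistics with one frame, then refresh the means.

        `confidence` scales the frame's contribution, so low-confidence
        frames move the model less (assumed weighting scheme).
        """
        for k, gamma in enumerate(self.posteriors(x)):
            self.occ[k] += confidence * gamma
            self.first[k] += confidence * gamma * x
            # MAP mean: interpolate the prior mean and the weighted data mean.
            self.means[k] = ((self.tau * self.prior_means[k] + self.first[k])
                             / (self.tau + self.occ[k]))


gmm = OnlineMapGmm(weights=[0.5, 0.5], means=[0.0, 5.0],
                   variances=[1.0, 1.0], tau=2.0)
for frame in [1.0, 1.2, 0.8]:
    gmm.adapt(frame, confidence=1.0)
```

Because the running sums `occ` and `first` are kept between frames, each update costs time proportional to the number of mixture components rather than to the number of frames seen so far. This is one plausible reading of the caching idea in the abstract, not a description of the authors' implementation.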
Original language: Japanese
Pages (from-to): 49-54
Journal: IEICE technical report
Volume: 110
Issue number: 81
State: Published - 10 Jun 2010
