Using multiple visual tandem streams in audio-visual speech recognition

Topkaya, İbrahim Saygın and Erdoğan, Hakan (2011) Using multiple visual tandem streams in audio-visual speech recognition. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2011), Prague, Czech Republic

Official URL: http://dx.doi.org/10.1109/ICASSP.2011.5947476


The "tandem approach" in speech recognition has been shown to increase performance by using classifier posterior probabilities as observations in a hidden Markov model. We study the effect of visual tandem features in audio-visual speech recognition with a novel setup that uses multiple classifiers to obtain multiple visual tandem features. We adopt multi-stream hidden Markov models, in which visual tandem features from two different classifiers are treated as additional streams in the model. Our experiments show that using multiple visual tandem features improves recognition accuracy in various noise conditions. In addition, to handle asynchrony between audio and visual observations, we employ coupled hidden Markov models and obtain improved performance compared to the synchronous model.
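
The multi-stream combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that tandem features are log posteriors from a classifier, that each stream is scored by a diagonal-covariance Gaussian for a single HMM state, and that the per-stream log-likelihoods are combined as a weighted sum. All function names, scores, weights, and model parameters below are hypothetical values chosen for illustration.

```python
import math


def log_softmax(scores):
    """Turn raw classifier scores into log posteriors (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
        for xi, mu, v in zip(x, mean, var)
    )


def multistream_log_likelihood(stream_log_likes, stream_weights):
    """Weighted sum of per-stream log-likelihoods for one HMM state."""
    if len(stream_log_likes) != len(stream_weights):
        raise ValueError("one weight per stream is required")
    return sum(w * ll for w, ll in zip(stream_weights, stream_log_likes))


# Hypothetical observations for one frame (all numbers are made up).
audio_obs = [0.2, -0.1]                     # stand-in for audio features
tandem_a = log_softmax([2.0, 0.5, -1.0])    # visual tandem stream, classifier A
tandem_b = log_softmax([1.5, 1.0, -0.5])    # visual tandem stream, classifier B

# Hypothetical per-stream Gaussian parameters (mean, variance) for one state.
state_model = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([-0.3, -1.5, -3.0], [1.0, 1.0, 1.0]),
    ([-0.6, -1.1, -2.6], [1.0, 1.0, 1.0]),
]
observations = [audio_obs, tandem_a, tandem_b]

per_stream = [
    diag_gauss_logpdf(obs, mean, var)
    for obs, (mean, var) in zip(observations, state_model)
]

# Assumed stream weights: audio first, then the two visual tandem streams.
weights = [0.6, 0.2, 0.2]
state_score = multistream_log_likelihood(per_stream, weights)
```

In a full recognizer, a score of this kind would be computed for every state and frame and passed to the decoder. The sketch covers only the synchronous multi-stream case, not the coupled hidden Markov model used to handle audio-visual asynchrony.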

Item Type: Papers in Conference Proceedings
Uncontrolled Keywords: Audio-Visual Speech Recognition, Coupled Hidden Markov Models, Hidden Markov Models, Neural Networks, Support Vector Machines, Tandem Approach
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
ID Code: 18469
Deposited By: Hakan Erdoğan
Deposited On: 25 Dec 2011 16:17
Last Modified: 31 Jul 2019 09:41