Title: Speech Recognition with Prediction-Adaptation-Correction Recurrent Neural Networks
Speaker: Dr. Dong Yu (Adjunct Professor, Department of Information and Electronic Engineering)
Time: 11:00-12:00, Nov. 24, 2014
Address: Conference Room 215, Department of Information and Electronic Engineering
Abstract:
In this talk, I will describe our recently proposed prediction-adaptation-correction RNN (PAC-RNN), in which a correction DNN (the main DNN) estimates the state posterior probability based on both the current frame and the prediction that a prediction DNN makes from past frames. The result from the main DNN is fed back to the prediction DNN so that it can make better predictions for future frames. The PAC-RNN can be viewed in two ways: given the new information in the current frame, the main DNN corrects the prediction made by the prediction DNN; alternatively, the prediction DNN's output adapts the main DNN's behavior. Experiments on the TIMIT phone recognition task indicate that the PAC-RNN outperforms DNN, RNN, and LSTM by 2.4%, 2.1%, and 1.9% absolute phone accuracy, respectively. We found that incorporating the prediction objective and including the recurrent loop are both important for boosting the performance of the PAC-RNN.
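The loop described in the abstract can be illustrated with a small NumPy sketch. It is only a toy forward pass under stated assumptions, not the architecture presented in the talk: the single-layer networks, the tanh and softmax choices, the layer sizes, and the names `init_params` and `pac_rnn_forward` are all hypothetical, and training (including the prediction objective) is omitted.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def init_params(rng, n_feat, n_hidden, n_states, n_pred):
    # Hypothetical single-layer stand-ins for the correction and prediction DNNs.
    def layer(n_in, n_out):
        return rng.standard_normal((n_out, n_in)) * 0.1, np.zeros(n_out)
    return {
        "corr": layer(n_feat + n_pred, n_hidden),  # correction (main) network
        "out": layer(n_hidden, n_states),          # state posterior output
        "pred": layer(n_hidden, n_pred),           # prediction network
    }

def pac_rnn_forward(frames, params, n_pred):
    Wc, bc = params["corr"]
    Wo, bo = params["out"]
    Wp, bp = params["pred"]
    pred = np.zeros(n_pred)  # no prediction exists before the first frame
    posteriors = []
    for x in frames:
        # Correction step: combine the current frame with the prediction
        # that was made from past frames.
        h = np.tanh(Wc @ np.concatenate([x, pred]) + bc)
        posteriors.append(softmax(Wo @ h + bo))
        # Recurrent loop: the main network's result is fed back to the
        # prediction network, which updates its prediction for later frames.
        pred = softmax(Wp @ h + bp)
    return np.stack(posteriors)

rng = np.random.default_rng(0)
params = init_params(rng, n_feat=4, n_hidden=8, n_states=3, n_pred=3)
frames = rng.standard_normal((5, 4))  # 5 random frames of 4 features each
post = pac_rnn_forward(frames, params, n_pred=3)
```

Here `post` holds one posterior distribution over the (assumed) three states per frame. Replacing `pred` with zeros inside the loop would remove the recurrent feedback, which is one of the two components the abstract reports as important.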
Biography:
Dr. Dong Yu is a principal researcher at Microsoft Research. His research interests include speech processing, robust speech recognition, discriminative training, and machine learning. He has published over 140 papers in these areas and is the inventor or co-inventor of more than 50 granted or pending patents. His work on the context-dependent deep neural network hidden Markov model (CD-DNN-HMM) has helped shape a new direction in large-vocabulary speech recognition research and was recognized by the IEEE SPS 2013 Best Paper Award. Most recently, he has focused on applying computational networks, a generalization of many neural network models, to speech recognition.
Dr. Dong Yu is currently serving as a member of the IEEE Speech and Language Processing Technical Committee (2013-) and as an associate editor of IEEE Transactions on Audio, Speech, and Language Processing (2011-). He has served as an associate editor of IEEE Signal Processing Magazine (2008-2011) and as the lead guest editor of the IEEE Transactions on Audio, Speech, and Language Processing special issue on deep learning for speech and language processing (2010-2011).