
“A Reverberation-Time-Aware DNN Approach to Speech Dereverberation and Its Application to Robust Automatic Speech Recognition” by Prof. Chin-Hui Lee

Dec 12, 2016. We are honored to invite

Prof. Chin-Hui Lee to give a lecture entitled "A Reverberation-Time-Aware DNN Approach to Speech Dereverberation and Its Application to Robust Automatic Speech Recognition".

Chin-Hui Lee is a professor at the School of Electrical and Computer Engineering, Georgia Institute of Technology. Before joining academia in 2001, he had 20 years of industrial experience, ending at Bell Laboratories, Murray Hill, as a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. Dr. Lee is a Fellow of the IEEE and a Fellow of ISCA. He has published over 450 papers and holds 30 patents, with close to 30,000 citations and an h-index of 65 on Google Scholar. He has received numerous awards, including the Bell Labs President's Gold Award in 1998. He won the IEEE Signal Processing Society's 2006 Technical Achievement Award for "Exceptional Contributions to the Field of Automatic Speech Recognition". In 2012 he gave an ICASSP plenary talk on the future of speech recognition. In the same year he was awarded the ISCA Medal for scientific achievement for "pioneering and seminal contributions to the principles and practice of automatic speech and speaker recognition".

Prof. Chin-Hui Lee’s talk is mainly about casting the classical speech dereverberation problem into a regression setting by mapping log power spectral features of reverberant speech to time-delayed features of anechoic speech. His group found that, depending on the reverberation time of the acoustic environment, different signal processing parameters are needed to deliver good quality in the dereverberated speech. Furthermore, reverberation-time-aware DNN training and decoding procedures can be designed to optimize dereverberation performance across a wide range of reverberation times. In addition, a single DNN can also be trained to perform simultaneous beamforming and dereverberation for microphone array speech. As a side benefit, using DNN-based speech dereverberation as a pre-processor in the REVERB Challenge automatic speech recognition (ASR) task yields the lowest word error rate without retraining either the dereverberation front-end or the ASR back-end. It is expected that ASR accuracy and robustness could be further improved with joint training of an integrated dereverberation-ASR system.
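To make the regression setting concrete, the sketch below shows the typical frame-level pipeline for this kind of approach: extract log power spectral (LPS) features from the reverberant waveform, splice each frame with neighboring context frames, and feed the spliced vectors through a feedforward network that regresses to clean-speech LPS frames. This is an illustrative toy implementation only, not Prof. Lee's actual system; the frame sizes, context width, network shape, and random placeholder weights are all assumptions (in practice the weights would come from minimum-mean-square-error training on parallel reverberant/anechoic data).

```python
import numpy as np

def log_power_spectra(signal, frame_len=256, hop=128):
    """Frame the waveform and compute log power spectral (LPS) features."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    spectra = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(spectra + 1e-10)  # small floor avoids log(0)

def splice(features, context=2):
    """Stack each frame with +/- `context` neighbors, the usual
    input layout for a frame-level DNN regressor."""
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + len(features)]
                      for i in range(2 * context + 1)])

class TinyRegressionDNN:
    """One-hidden-layer network mapping spliced reverberant LPS frames
    to clean (anechoic) LPS frames. Weights here are random
    placeholders standing in for trained parameters."""
    def __init__(self, in_dim, hidden, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.standard_normal((in_dim, hidden)) * 0.01
        self.b1 = np.zeros(hidden)
        self.W2 = rng.standard_normal((hidden, out_dim)) * 0.01
        self.b2 = np.zeros(out_dim)

    def forward(self, x):
        h = np.maximum(x @ self.W1 + self.b1, 0.0)  # ReLU hidden layer
        return h @ self.W2 + self.b2                # linear output (regression)

# Toy usage: a 4000-sample "reverberant" signal in, enhanced LPS frames out.
sig = np.sin(np.linspace(0, 100, 4000))
lps = log_power_spectra(sig)          # (frames, bins) = (30, 129)
spliced = splice(lps)                 # (30, 129 * 5) = (30, 645)
net = TinyRegressionDNN(in_dim=spliced.shape[1], hidden=32, out_dim=lps.shape[1])
enhanced = net.forward(spliced)       # predicted clean LPS, (30, 129)
```

The reverberation-time-aware idea described in the talk would, under this framing, condition the choice of such parameters (frame length, context width, trained model) on an estimate of the room's reverberation time.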