SpeechLab Member Wins Best Student Paper Award at ISCSLP 2016

The 10th International Symposium on Chinese Spoken Language Processing (ISCSLP 2016) was held in Tianjin from October 17 to October 20. ISCSLP is a flagship IEEE conference on Chinese spoken language processing. SpeechLab members, supervised by Prof. Kai Yu and Prof. Yanmin Qian, published four papers in total, one of which, on multi-task joint learning for robust voice activity detection, received a Best Student Paper Award. Of the 137 papers accepted at ISCSLP 2016, 10 were shortlisted for the Best Student Paper Award, and only two ultimately received it.

Here is the abstract of the awarded paper:

Model-based VAD approaches have been widely used and have achieved success in practice. These approaches usually cast VAD as a frame-level classification problem and employ statistical classifiers, such as a Gaussian Mixture Model (GMM) or Deep Neural Network (DNN), to assign a speech/silence label to each frame. Due to the frame-independence assumption in classification, the VAD results tend to be fragile. To address this problem, this paper proposes a new structured multi-frame prediction DNN approach to improve segment-level VAD performance. During DNN training, the VAD labels of multiple consecutive frames are concatenated together as targets and jointly trained with a speech enhancement task to achieve robustness under noisy conditions. During testing, the VAD label for each frame is obtained by merging the prediction results from neighbouring frames. Experiments on the Aurora 4 dataset showed that conventional DNN-based VAD has poor and unstable prediction performance, while the proposed multi-task trained VAD is much more robust.
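To make the merging step concrete: in the multi-frame prediction scheme, the network at frame t outputs labels for a whole window of neighbouring frames, so each frame ends up covered by several overlapping predictions that must be combined. The sketch below is only an illustration of one plausible merging rule (averaging overlapping speech probabilities, then thresholding); the function name, window layout, and threshold are assumptions for the example, not the authors' actual implementation.

```python
import numpy as np

def merge_multiframe_predictions(window_probs, context=2, threshold=0.5):
    """Merge overlapping multi-frame VAD predictions into per-frame labels.

    window_probs: array of shape (T, 2*context + 1), where row t holds the
    network's speech probabilities for frames t-context .. t+context.
    Returns a length-T array of 0/1 speech labels.

    This is an illustrative sketch; the paper does not specify this exact rule.
    """
    T, W = window_probs.shape
    assert W == 2 * context + 1, "window width must match the context size"

    acc = np.zeros(T)  # accumulated speech probability per frame
    cnt = np.zeros(T)  # number of windows covering each frame
    for t in range(T):
        for k in range(-context, context + 1):
            j = t + k  # absolute index of the frame this column predicts
            if 0 <= j < T:
                acc[j] += window_probs[t, k + context]
                cnt[j] += 1

    # Average the overlapping predictions, then threshold to speech/silence.
    probs = acc / cnt
    return (probs >= threshold).astype(int)
```

Because every frame is voted on by up to 2*context + 1 overlapping windows, a single noisy frame-level decision is smoothed out by its neighbours, which is one way to see why the structured prediction yields more stable segment-level output than independent per-frame classification.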