Papers


2021

Heinrich Dinkel, Mengyue Wu and Kai Yu Towards Duration Robust Weakly Supervised Sound Event Detection Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Yanmin Qian, Zhengyang Chen and Shuai Wang Audio-Visual Deep Neural Network for Robust Person Verification Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu Audio Caption in a Car Setting with a Sentence-Level Loss Conference ISCSLP PDF
Xun Gong, Zhengyang Chen, Yexin Yang, Shuai Wang, Lan Wang and Yanmin Qian Speaker Embedding Augmentation with Noise Distribution Matching Conference ISCSLP PDF
Shuai Wang, Yexin Yang, Yanmin Qian and Kai Yu Revisiting the Statistics Pooling Layer in Deep Speaker Embedding Learning Conference ISCSLP PDF
Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Boeddeker, Zhuo Chen and Shinji Watanabe ESPnet-SE: end-to-end speech enhancement and separation toolkit designed for ASR integration Conference SLT PDF
Chenda Li, Yi Luo, Cong Han, Jinyu Li, Takuya Yoshioka, Tianyan Zhou, Marc Delcroix, Keisuke Kinoshita, Christoph Boeddeker, Yanmin Qian, Shinji Watanabe and Zhuo Chen Dual-path RNN for Long Recording Speech Separation Conference SLT PDF
Chenpeng Du, Hao Li, Yizhou Lu, Lan Wang and Yanmin Qian Data Augmentation for End-to-end Code-Switching Speech Recognition Conference SLT PDF

2020

Chen Zhang, Daihui Peng, Lu Lv, Kaiming Zhuo, Kai Yu, Tian Shen, Yifeng Xu and Zhen Wang Individual Perceived Stress Mediates Psychological Distress in Medical Workers During COVID-19 Epidemic Outbreak in Wuhan Journal Neuropsychiatric Disease and Treatment PDF
Kai Yu, Rao Ma, Kaiyu Shi and Qi Liu Neural Network Language Model Compression With Product Quantization and Soft Binarization Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Zhi Chen, Lu Chen, Xiaoyuan Liu and Kai Yu Distributed Structured Actor-Critic Reinforcement Learning for Universal Dialogue Management Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Shuai Wang, Yexin Yang, Zhanghao Wu, Yanmin Qian and Kai Yu Data Augmentation using Deep Generative Models for Embedding based Speaker Recognition Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Qi Liu, Zhehuai Chen, Hao Li, Mingkun Huang, Yizhou Lu and Kai Yu Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Su Zhu, Ruisheng Cao and Kai Yu Dual Learning for Semi-Supervised Natural Language Understanding Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Wangyou Zhang, Xuankai Chang, Yanmin Qian and Shinji Watanabe Improving End-to-End Single-Channel Multi-Talker Speech Recognition Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Su Zhu, Zijian Zhao, Rao Ma and Kai Yu Prior Knowledge Driven Label Embedding for Slot Filling in Natural Language Understanding Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Wangyou Zhang and Yanmin Qian Learning Contextual Language Embeddings for Monaural Multi-Talker Speech Recognition Conference INTERSPEECH PDF
Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Shinji Watanabe and Yanmin Qian End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming Conference INTERSPEECH PDF
Zhijun Liu, Kuan Chen and Kai Yu Neural Homomorphic Vocoder Conference INTERSPEECH PDF
Yefei Chen, Heinrich Dinkel, Mengyue Wu and Kai Yu Voice activity detection in the wild via weakly supervised sound event detection Conference INTERSPEECH PDF
Chen Liu, Su Zhu, Zijian Zhao, Ruisheng Cao, Lu Chen and Kai Yu Jointly Encoding Word Confusion Network and Dialogue Context with BERT for Spoken Language Understanding Conference INTERSPEECH PDF
Yizhou Lu, Mingkun Huang, Hao Li, Jiaqi Guo and Yanmin Qian Bi-encoder Transformer Network for Mandarin-English Code-switching Speech Recognition using Mixture of Experts Conference INTERSPEECH PDF
Zhengyang Chen, Shuai Wang and Yanmin Qian Adversarial Domain Adaptation for Speaker Verification Using Partially Shared Network Conference INTERSPEECH PDF
Zhengyang Chen, Shuai Wang and Yanmin Qian Multi-Modality Matters: A Performance Leap on VoxCeleb Conference INTERSPEECH PDF
Hongji Wang, Heinrich Dinkel, Shuai Wang, Yanmin Qian and Kai Yu Dual-Adversarial Domain Adaptation for Generalized Replay Attack Detection Conference INTERSPEECH PDF
Chenda Li and Yanmin Qian Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation Conference INTERSPEECH PDF
Zihan Zhao, Yuncong Liu, Lu Chen, Qi Liu, Rao Ma, and Kai Yu An Investigation on Different Underlying Quantization Schemes for Pre-trained Language Models Conference NLPCC PDF
Zihan Xu, Zhi Chen, Lu Chen, Su Zhu and Kai Yu Memory Attention Neural Network For Multi-Domain Dialogue State Tracking Conference NLPCC PDF
Chen Liu, Su Zhu, Lu Chen and Kai Yu Robust Spoken Language Understanding with RL-Based Value Error Recovery Conference NLPCC PDF
Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu A CRNN-GRU Based Reinforcement Learning Approach to Audio Captioning Conference DCASE PDF
Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu and Weiyao Lin Multiple Sound Sources Localization from Coarse to Fine Conference ECCV PDF
Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu and Weiyao Lin A Two-Stage Framework for Multiple Sound-Source Localization Conference CVPR PDF
Lu Chen, Yanbin Zhao, Boer Lv, Lesheng Jin, Zhi Chen, Su Zhu and Kai Yu Neural Graph Matching Networks for Chinese Short Text Matching Conference ACL PDF
Yanbin Zhao, Lu Chen, Zhi Chen, Ruisheng Cao, Su Zhu and Kai Yu Line Graph Enhanced AMR-to-Text Generation with Mix-Order Graph Attention Networks Conference ACL PDF
Ruisheng Cao, Su Zhu, Chenyu Yang, Chen Liu, Rao Ma, Yanbin Zhao, Lu Chen and Kai Yu Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing Conference ACL PDF
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux and Shinji Watanabe End-To-End Multi-Speaker Speech Recognition With Transformer Conference ICASSP PDF
Heinrich Dinkel and Kai Yu Duration Robust Weakly Supervised Sound Event Detection Conference ICASSP PDF
Yexin Yang, Shuai Wang, Xun Gong, Yanmin Qian and Kai Yu Text Adaptation for Speaker Verification with Speaker-Text Factorized Embeddings Conference ICASSP PDF
Chenda Li and Yanmin Qian Deep Audio-Visual Speech Separation with Attention Mechanism Conference ICASSP PDF
Zhengyang Chen, Shuai Wang, Yanmin Qian and Kai Yu Channel Invariant Speaker Embedding Learning with Joint Multi-Task and Adversarial Training Conference ICASSP PDF
Chenpeng Du and Kai Yu Speaker Augmentation for Low Resource Speech Recognition Conference ICASSP PDF
Shuai Wang, Johan Rohdin, Lukáš Burget, Oldřich Plchot, Kai Yu and Jan Černocký Investigation of SpecAugment for Deep Speaker Embedding Learning Conference ICASSP PDF
Federico Landini, Shuai Wang, Mireia Diez, Lukáš Burget, Pavel Matějka, Kateřina Žmolíková, Ladislav Mošner, Anna Silnova, Oldřich Plchot, Ondřej Novotný, Hossein Zeinali and Johan Rohdin BUT System for the Second DIHARD Speech Diarization Challenge Conference ICASSP PDF
Mireia Diez, Lukáš Burget, Federico Landini, Shuai Wang and Honza Černocký Optimizing Bayesian HMM Based X-Vector Clustering for the Second DIHARD Speech Diarization Challenge Conference ICASSP PDF
Rao Ma, Hao Li, Qi Liu, Lu Chen and Kai Yu Neural Lattice Search for Speech Recognition Conference ICASSP PDF
Rao Ma, Lesheng Jin, Qi Liu, Lu Chen and Kai Yu Addressing the Polysemy Problem in Language Modeling with Attentional Multi-Sense Embeddings Conference ICASSP PDF
Lu Chen, Boer Lv, Chi Wang, Su Zhu, Bowen Tan and Kai Yu Schema-Guided Multi-Domain Dialogue State Tracking with Graph Attention Neural Networks Conference AAAI PDF
Yanbin Zhao, Lu Chen, Zhi Chen and Kai Yu Semi-Supervised Text Simplification with Back-Translation and Asymmetric Denoising Autoencoders Conference AAAI PDF

2019

Yanmin Qian and Xu Xiang Binary Neural Networks for Speech Recognition Journal FITEE PDF
Yanmin Qian, Hu Hu and Tian Tan Data Augmentation Using Generative Adversarial Networks for Robust Speech Recognition Journal Speech Communication PDF
Lu Chen, Zhi Chen, Bowen Tan, Sishan Long, Milica Gasic and Kai Yu AgentGraph: Toward Universal Dialogue Management With Structured Deep Reinforcement Learning. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 9, pp. 1378-1391, Sep. 2019 Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Shuai Wang, Zili Huang, Yanmin Qian and Kai Yu Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Xu Xiang, Shuai Wang, Houjun Huang, Yanmin Qian and Kai Yu Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition Conference APSIPA PDF
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux and Shinji Watanabe MIMO-Speech: End-to-End Multi-Channel Multi-Speaker Speech Recognition Conference ASRU PDF
Wangyou Zhang, Man Sun, Lan Wang and Yanmin Qian End-to-End Overlapped Speech Detection and Speaker Counting with Raw Waveform Conference ASRU PDF
Mingkun Huang, Yizhou Lu, Lan Wang, Yanmin Qian and Kai Yu Exploring Model Units and Training Strategies for End-to-End Speech Recognition Conference ASRU PDF
Rao Ma, Qi Liu and Kai Yu Highly Efficient Neural Network Language Model Compression Using Soft Binarization Training Conference ASRU PDF
Peiyao Sheng, Zhuolin Yang and Yanmin Qian GANs for Children: A Generative Data Augmentation Strategy for Children Speech Recognition Conference ASRU PDF
Zijian Zhao, Su Zhu and Kai Yu Data Augmentation with Atomic Templates for Spoken Language Understanding Conference EMNLP PDF
Yefei Chen, Shuai Wang, Yanmin Qian and Kai Yu End-to-End Speaker-Dependent Voice Activity Detection Conference NCMMSC PDF
Hao Li, Zhehuai Chen, Qi Liu, Yanmin Qian and Kai Yu OOV Words Extension for Modular Neural Acoustics-to-Word Model Conference NCMMSC PDF
Hao Li, Chen Liu, Su Zhu and Kai Yu Robust Spoken Language Understanding with Acoustic and Domain Knowledge Conference ICMI PDF
Su Zhu, Zijian Zhao, Tiejun Zhao, Chengqing Zong and Kai Yu CATSLU: The 1st Chinese Audio-Textual Spoken Language Understanding Challenge Conference ICMI PDF
Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael Seltzer and Christian Fuegen Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR Conference InterSpeech PDF
Jiaqi Guo, Yongbin You, Yanmin Qian and Kai Yu Joint Decoding of CTC Based Systems for Speech Recognition Conference InterSpeech PDF
Chenda Li and Yanmin Qian Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech Conference InterSpeech PDF
Wangyou Zhang, Xuankai Chang and Yanmin Qian Knowledge Distillation for End-to-End Monaural Multi-talker ASR System Conference InterSpeech PDF
Wangyou Zhang, Ying Zhou and Yanmin Qian Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking Conference InterSpeech PDF
Zhanghao Wu, Shuai Wang, Yanmin Qian and Kai Yu Data Augmentation using Variational Autoencoder for Embedding based Speaker Verification Conference InterSpeech PDF
Yexin Yang, Hongji Wang, Heinrich Dinkel, Zhengyang Chen, Shuai Wang, Yanmin Qian and Kai Yu The SJTU Robust Anti-spoofing System for the ASVspoof 2019 Challenge Conference InterSpeech PDF
Hongji Wang, Heinrich Dinkel, Shuai Wang, Yanmin Qian and Kai Yu Cross-domain Replay Spoofing Attack Detection using Domain Adversarial Training Conference InterSpeech PDF
Shuai Wang, Johan Rohdin, Lukáš Burget, Oldřich Plchot, Yanmin Qian, Kai Yu and Jan Černocký On the Usage of Phonetic Information for Text-independent Speaker Embedding Extraction Conference InterSpeech PDF
Mireia Diez, Lukáš Burget, Shuai Wang, Johan Rohdin and Jan Černocký Bayesian HMM based x-vector clustering for Speaker Diarization Conference InterSpeech PDF
Ruisheng Cao, Su Zhu, Chen Liu, Jieyu Li and Kai Yu Semantic Parsing with Dual Learning Conference ACL PDF
Mengyue Wu, Heinrich Dinkel and Kai Yu Audio Caption: Listen and Tell Conference ICASSP PDF
Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael Seltzer and Christian Fuegen End-to-end Contextual Speech Recognition using Class Language Models and a Token Passing Decoder Conference ICASSP PDF
Zijian Zhao, Su Zhu and Kai Yu A Hierarchical Decoding Model for Spoken Language Understanding from Unaligned Data Conference ICASSP PDF
Shuai Wang, Yexin Yang, Tianzhe Wang, Yanmin Qian and Kai Yu Knowledge Distillation for Small Foot-print Deep Speaker Embedding Conference ICASSP PDF
Xuankai Chang, Yanmin Qian, Kai Yu and Shinji Watanabe End-to-end Monaural Multi-speaker ASR System without Pretraining Conference ICASSP PDF

2018

Zhehuai Chen, Jasha Droppo, Jinyu Li, Wayne Xiong Progressive Joint Modeling in Unsupervised Single-channel Overlapped Speech Recognition Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Tian Tan, Yanmin Qian, Hu Hu, Wen Ding, Ying Zhou, Kai Yu Adaptive very deep convolutional residual network for noise robust speech recognition Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Kai Yu, Zijian Zhao, Xueyang Wu, Hongtao Lin and Xuan Liu Rich Short Text Conversation Using Semantic Key Controlled Sequence Generation Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Xuan Liu, Di Cao and Kai Yu Binarized LSTM Language Model Conference NAACL PDF
Ying Zhou and Yanmin Qian Robust Mask Estimation by Integrating Neural Network-Based and Clustering-Based Approaches for Adaptive Acoustic Beamforming Conference ICASSP PDF
Lu Chen, Cheng Chang, Zhi Chen, Bowen Tan, Milica Gasic and Kai Yu Policy Adaptation for Deep Reinforcement Learning-based Dialogue Management Conference ICASSP PDF
Shuai Wang, Yanmin Qian and Kai Yu Focal KL-Divergence based Dilated Convolutional Neural Networks for Co-channel Speaker Identification Conference ICASSP PDF
Ruinian Chen and Kai Yu Fast OOV Words Incorporation using Structured Word Embeddings for Neural Network Language Model Conference ICASSP PDF
Hu Hu, Tian Tan and Yanmin Qian Generative Adversarial Networks based Data Augmentation for Noise Robust Speech Recognition Conference ICASSP PDF
Wen Ding, Tian Tan and Yanmin Qian Fast Adaptation on Deep Mixture Generative Network Based Acoustic Modeling Conference ICASSP PDF
Xuankai Chang, Yanmin Qian and Dong Yu Adaptive Permutation Invariant Training with Auxiliary Information for Monaural Multi-Talker Speech Recognition Conference ICASSP PDF
Tian Tan, Yanmin Qian and Dong Yu Knowledge Transfer in Permutation Invariant Training for Single-channel Multi-talker Speech Recognition Conference ICASSP PDF
Zhehuai Chen, Qi Liu, Hao Li, Kai Yu On Modular Training of Neural Acoustics-to-word Model for LVCSR Conference ICASSP PDF
Zhehuai Chen, Jasha Droppo Sequence Modeling in Unsupervised Single-channel Overlapped Speech Recognition Conference ICASSP PDF
Su Zhu, Ouyu Lan, Kai Yu Robust Spoken Language Understanding with Unsupervised ASR-error Adaptation Conference ICASSP PDF
Ouyu Lan, Su Zhu, Kai Yu Semi-Supervised Training Using Adversarial Multi-Task Learning for Spoken Language Understanding Conference ICASSP PDF
Zili Huang, Shuai Wang and Yanmin Qian Joint I-Vector with End-to-End System for Short Duration Text-Independent Speaker Verification Conference ICASSP PDF

2017

Yanmin Qian, Nanxin Chen, Heinrich Dinkel and Zhizheng Wu Deep Feature Engineering for Noise Robust Spoofing Detection Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Zhehuai Chen, Yimeng Zhuang, Yanmin Qian and Kai Yu Phone Synchronous Speech Recognition with CTC Lattices Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing PDF
Ruinian Chen, Ying Zhou and Yanmin Qian Emotion Recognition Using Support Vector Machine and Deep Neural Network Conference NCMMSC PDF
Cheng Chang, Huifeng Zhang, Zhangxuan Gu and Yanmin Qian Fusion Model for Speech Emotion Recognition with Low Level Descriptor Features Conference NCMMSC PDF
Yue Wu, Qi Liu, Kai Yu The adaptive adjustment of learning rate is applied in the language model Conference NCMMSC PDF
Xuan Liu, Kai Yu GLEU-Guided Multi-resolution Network for Short Text Conversation Conference NCMMSC PDF
Kaiyu Shi, Xuan Liu and Yanmin Qian Speech Emotion Recognition Based on SVM and GMM-HMM Hybrid System Conference NCMMSC PDF
Yue Wu, Tianxing He, Zhehuai Chen, Yanmin Qian, Kai Yu Multi-view LSTM Language Model with Word-synchronized Auxiliary Feature for LVCSR Conference CCL PDF
Zhehuai Chen, Yanmin Qian, and Kai Yu A unified confidence measure framework using auxiliary normalization graph Conference IScIDE PDF
Di Cao and Kai Yu Deep Attentive Structured Language Model Based on LSTM Conference IScIDE PDF
Xiaowei Jiang, Shuai Wang, Xu Xiang and Yanmin Qian Integrating Online i-vector into GMM-UBM for Text-dependent Speaker Verification Conference APSIPA PDF
Qi Liu, Yanmin Qian and Kai Yu Future Vector Enhanced LSTM Language Model for LVCSR Conference ASRU PDF
Lu Chen, Xiang Zhou, Cheng Chang, Runzhe Yang and Kai Yu Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning Conference EMNLP PDF
Cheng Chang, Runzhe Yang, Lu Chen, Xiang Zhou and Kai Yu Affordable On-line Dialogue Policy Learning Conference EMNLP PDF
Xu Xiang, Yanmin Qian and Kai Yu Binary Deep Neural Networks for Speech Recognition Conference InterSpeech PDF
Dong Yu, Xuankai Chang and Yanmin Qian Recognizing Multi-Talker Speech with Permutation Invariant Training Conference InterSpeech PDF
Bo Chen, Jiahao Lai and Kai Yu Comparison of Modeling Target in LSTM-RNN Duration Model Conference InterSpeech PDF
Bo Chen, Tianling Bian and Kai Yu Discrete Duration Model For Speech Synthesis Conference InterSpeech PDF
Shuai Wang, Yanmin Qian and Kai Yu What Does the Speaker Embedding Encode? Conference InterSpeech PDF
Heinrich Dinkel, Yanmin Qian, Kai Yu Small-footprint convolutional neural network for spoofing detection Conference IJCNN PDF
Lu Chen, Runzhe Yang, Cheng Chang, Zihao Ye, Xiang Zhou and Kai Yu On-line Dialogue Policy Learning with Companion Teaching Conference EACL PDF
Heinrich Dinkel, Nanxin Chen, Yanmin Qian and Kai Yu End-To-End Spoofing Detection With Raw Waveform CLDNNs Conference ICASSP PDF
Su Zhu and Kai Yu Encoder-decoder with Focus-mechanism for Sequence Labelling Based Spoken Language Understanding Conference ICASSP PDF
Zhehuai Chen, Yimeng Zhuang and Kai Yu Confidence Measures for CTC-based Phone Synchronous Decoding Conference ICASSP PDF