Publications

Thesis

  • Yanmin Qian (2012).
    Study on New Speech Recognition Technology under Low Data Resource Conditions.

Journal Papers

  1. Zhehuai Chen, Yimeng Zhuang, Yanmin Qian (corresponding author), Kai Yu. Phone Synchronous Speech Recognition with CTC Lattices. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 1, 86-97, 2017.
  2. Yanmin Qian, Mengxiao Bi, Tian Tan, Kai Yu. Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 12, 2263-2276, 2016.
  3. Yanmin Qian, Tian Tan and Dong Yu. Neural Network Based Multi-Factor Aware Joint Training for Robust Speech Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 12, 2231-2240, 2016.
  4. Tian Tan, Yanmin Qian (corresponding author) and Kai Yu. Cluster Adaptive Training for Deep Neural Network Based Acoustic Model. IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 3, 459-468, 2016.
  5. Yanmin Qian, Nanxin Chen, Kai Yu. Deep Features for Automatic Spoofing Detection. Speech Communication, vol. 85, 43-52, 2016.
  6. Yuan Liu, Yanmin Qian (corresponding author), Nanxin Chen, Tianfan Fu, Ya Zhang and Kai Yu. Deep Feature for Text-dependent Speaker Verification. Speech Communication, vol. 73, 1-13, 2015.
  7. Yanmin Qian, Jia Liu, Michael T Johnson. Efficient embedded speech recognition for very large vocabulary Mandarin car-navigation systems. IEEE Transactions on Consumer Electronics, 2009, 55(3):1496-1500.
  8. Yanmin Qian, Yuxiang Shan, Linfang Wang, Jia Liu. Improvement comparison of different lattice-based discriminative training methods in Chinese-Monolingual and Chinese-English Bilingual speech recognition. ACTA AUTOMATICA SINICA, 2012, 38(7):1162-1168.
  9. Yanmin Qian, Ji Xu, Jia Liu. Multi-Stream posterior features and combining Subspace GMMs for low resource LVCSR. Chinese Journal of Electronics, 2013, 22(2): 291-295.
  10. Yanmin Qian, Jia Liu. Cross-entropy OSF-based voice activity detection algorithm. Tsinghua Science and Technology, 2009, 49(10):87-90.
  11. Yanmin Qian, Jia Liu. Optimized data selection strategy based unsupervised acoustic modeling for low data resource speech recognition. Tsinghua Science and Technology, 2013, 53(7): 1001-1004+1010.
  12. Hong Liu, Yanmin Qian, Jia Liu. English speech recognition system on chip. Tsinghua Science and Technology, 2011, 16(1):95-99.
  13. Yan Deng, Weiqiang Zhang, Yanmin Qian, Jia Liu. Language recognition based on acoustic diversified phone recognizers and phonotactic feature fusion. IEICE Transactions on Information and Systems, 2011, E94-D(3):679-689.
  14. Yan Deng, Weiqiang Zhang, Yanmin Qian, Jia Liu. Time-Frequency cepstral features and combining discriminative training for phonotactic language recognition. Journal of Computers, 2011, 6(2):178-183.
  15. Hua Yuan, Yanmin Qian, Junhong Zhao, Jia Liu. Mispronunciation detection with an optimized detection network and multi-layer perceptron based features. Tsinghua Science and Technology, 2012, 52(4):557-560+570.

Conference Papers

  1. Heinrich Dinkel, Yanmin Qian and Kai Yu. End-to-End Spoofing Detection with Raw Waveform CLDNNs. In proceedings of ICASSP, New Orleans, Louisiana, USA, 2017.
  2. Yanmin Qian, Philip C. Woodland. Very Deep Convolutional Neural Networks for Robust Speech Recognition. In proceedings of SLT, San Diego, California, USA, 2016.
  3. Yanmin Qian, Tian Tan. The SJTU CHiME-4 system: Acoustic noise robustness for real single or multiple microphone scenarios. In proceedings of CHiME4, San Francisco, California, USA, 2016.
  4. Yanmin Qian, Tian Tan and Dong Yu. An investigation into using parallel data for far-field speech recognition. In proceedings of ICASSP, Shanghai, China, 2016: 5725-5729.
  5. Yanmin Qian, Tian Tan, Dong Yu and Yu Zhang. Integrated adaptation with multi-factor joint-learning for far-field speech recognition. In proceedings of ICASSP, Shanghai, China, 2016: 5770-5774.
  6. Tian Tan, Yanmin Qian, Dong Yu, Souvik Kundu, Liang Lu, Khe Chai Sim, Xiong Xiao and Yu Zhang. Speaker-aware training of LSTM-RNNs for acoustic modeling. In proceedings of ICASSP, Shanghai, China, 2016: 5280-5284.
  7. Souvik Kundu, Gautam Mantena, Yanmin Qian, Tian Tan, Marc Delcroix and Khe Chai Sim. Joint acoustic factor learning for robust deep neural network based automatic speech recognition. In proceedings of ICASSP, Shanghai, China, 2016: 5025-5029.
  8. Xie Chen, Xunying Liu, Yanmin Qian, Mark J. F. Gales, Philip C. Woodland. CUED-RNNLM-An open-source toolkit for efficient training and evaluation of recurrent neural network language models. In proceedings of ICASSP, Shanghai, China, 2016: 6000-6004.
  9. Linlin Wang, Chao Zhang, Phil Woodland, Mark Gales, Penny Karanasou, Pierre Lanchantin, Xunying Liu, Yanmin Qian. Improved DNN-based segmentation for multi-genre broadcast audio. In proceedings of ICASSP, Shanghai, China, 2016: 5700-5704.
  10. Yimeng Zhuang, Xuankai Chang, Yanmin Qian, Kai Yu. Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC. In proceedings of InterSpeech, San Francisco, California, USA, 2016: 938-942.
  11. Pierre Lanchantin, Mark Gales, Penny Karanasou, Xunying Liu, Yanmin Qian, Linlin Wang, Phil Woodland and Chao Zhang. Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition. In proceedings of InterSpeech, San Francisco, California, USA, 2016: 3057-3061.
  12. Yimeng Zhuang, Sibo Tong, Maofan Yin, Yanmin Qian, Kai Yu. Multi-Task Joint-Learning for Robust Voice Activity Detection. In proceedings of ISCSLP, Tianjin, China, 2016.
  13. Pavel Korshunov, Sébastien Marcel, Hannah Muckenhirn, Andre Ricardo Goncalves, Alan Godoy Souza Mello, Ricardo Paranhos Velloso Violato, Flavio Olmos Simoes, Marcus de Assis Angeloni, Jose Augusto Stuchi, Heinrich Dinkel, Nanxin Chen, Yanmin Qian, Dipjyoti Paul, Goutam Saha, and Md Sahidullah. Overview of BTAS 2016 Speaker Anti-spoofing Competition (BTAS2016). In proceedings of BTAS, Niagara Falls, Buffalo, New York, USA, 2016.
  14. Tian Tan, Yanmin Qian, Maofan Yin, Yimeng Zhuang and Kai Yu. Cluster Adaptive Training For Deep Neural Network. In proceedings of ICASSP, Brisbane, Australia, 2015: 4325-4329.
  15. Tianxing He, Xu Xiang, Yanmin Qian and Kai Yu. Recurrent Neural Network Language Model with Structured Word Embeddings for Speech Recognition. In proceedings of ICASSP, Brisbane, Australia, 2015: 5396-5400.
  16. Suliang Bu, Yunxin Zhao, Yanmin Qian and Kai Yu. A Novel Static Parameter Calculation Method for Model Compensation. In proceedings of ICASSP, Brisbane, Australia, 2015: 4510-4514.
  17. Yanmin Qian, Tianxing He, Wei Deng and Kai Yu. Automatic Model Redundancy Reduction for Fast Back-Propagation for Deep Neural Networks in Speech Recognition. In proceedings of IJCNN, Killarney, Ireland, 2015: 1-6.
  18. Yongbin You, Yanmin Qian, Tianxing He, Kai Yu. An Investigation on DNN-Derived Bottleneck Features for GMM-HMM Based Robust Speech Recognition. In proceedings of ChinaSIP, Chengdu, China, 2015: 30-34.
  19. Yongbin You, Yanmin Qian, Kai Yu. Local Trajectory Based Speech Enhancement for Robust Speech Recognition With Deep Neural Network. In proceedings of ChinaSIP, Chengdu, China, 2015: 5-9.
  20. Wengong Jin, Tianxing He, Yanmin Qian, Kai Yu. Paragraph Vector based Topic Model for Language Model Adaptation. In proceedings of InterSpeech, Dresden, Germany, 2015: 3516-3520.
  21. Nanxin Chen, Yanmin Qian, Kai Yu. Multi-Task Learning for Text-dependent Speaker Verification. In proceedings of InterSpeech, Dresden, Germany, 2015: 185-189.
  22. Nanxin Chen, Yanmin Qian, Heinrich Dinkel, Bo Chen, Kai Yu. Robust Deep Feature for Spoofing Detection - The SJTU System for ASVspoof 2015 Challenge. In proceedings of InterSpeech, Dresden, Germany, 2015: 2097-2101.
  23. Mengxiao Bi, Yanmin Qian, Kai Yu. Very Deep Convolutional Neural Networks for LVCSR. In proceedings of InterSpeech, Dresden, Germany, 2015: 3259-3263.
  24. Yanmin Qian, Maofan Yin, Yongbin You and Kai Yu. Multi-Task Joint-Learning of Deep Neural Networks for Robust Speech Recognition. In proceedings of ASRU, Scottsdale, Arizona, USA, 2015: 310-316.
  25. Philip C. Woodland, Xunying Liu, Yanmin Qian, Chao Zhang, Penny Karanasou, Pierre Lanchantin, Linlin Wang and Mark J. F. Gales. Cambridge University Transcription Systems for the Multi-Genre Broadcast Challenge. In proceedings of ASRU, Scottsdale, Arizona, USA, 2015: 639-646.
  26. Pierre Lanchantin, Mark Gales, Penny Karanasou, Xunying Liu, Yanmin Qian, Linlin Wang, Phil Woodland and Chao Zhang. The Development of the Cambridge University Alignment Systems for the Multi-Genre Broadcast Challenge. In proceedings of ASRU, Scottsdale, Arizona, USA, 2015: 647-653.
  27. Penny Karanasou, Mark J. F. Gales, Pierre Lanchantin, Xunying Liu, Yanmin Qian, Linlin Wang, Philip C. Woodland and Chao Zhang. Speaker Diarisation and Longitudinal Linking in Multi-Genre Broadcast Data. In proceedings of ASRU, Scottsdale, Arizona, USA, 2015: 660-666.
  28. Wei Deng, Yanmin Qian, Yuchen Fan, Tianfan Fu and Kai Yu. Stochastic Data Sweeping for Fast DNN Training. In proceedings of ICASSP, Florence, Italy, 2014: 240-244.
  29. Tianxing He, Yuchen Fan, Yanmin Qian, Tian Tan, and Kai Yu. Reshaping Deep Neural Network for Fast Decoding by Node-pruning. In proceedings of ICASSP, Florence, Italy, 2014: 245-249.
  30. Suliang Bu, Yanmin Qian, Khe Chai Sim, Yongbin You, and Kai Yu. Second Order Vector Taylor Series Based Robust Speech Recognition. In proceedings of ICASSP, Florence, Italy, 2014: 1788-1792.
  31. Yuan Liu, Tianfan Fu, Yuchen Fan, Yanmin Qian and Kai Yu. Speaker Verification with Deep Features. In proceedings of IJCNN, Beijing, China, 2014: 747-753.
  32. Jianwei Niu, Yanmin Qian and Kai Yu. Acoustic Emotion Recognition using Deep Neural Network. In proceedings of ISCSLP, Singapore, 2014: 128-132.
  33. Tianfan Fu, Yanmin Qian, Yuan Liu and Kai Yu. Tandem Deep Features for Text-Dependent Speaker Verification. In proceedings of InterSpeech, Singapore, 2014: 1327-1331.
  34. Suliang Bu, Yanmin Qian and Kai Yu. A Novel Dynamic Parameters Calculation Approach For Model Compensation. In proceedings of InterSpeech, Singapore, 2014: 2744-2748.
  35. Sibo Tong, Nanxin Chen, Yanmin Qian and Kai Yu. Evaluating VAD For Automatic Speech Recognition. In proceedings of ICSP, Hangzhou, China, 2014: 2308-2314.
  36. Yanmin Qian and Jia Liu. MLP-HMM Two-Stage Unsupervised Training for Low-Resource Languages on Conversational Telephone Speech Recognition. In proceedings of InterSpeech, Lyon, France, 2013: 1816-1820.
  37. Yanmin Qian, Kai Yu and Jia Liu. Combination of Data Borrowing Strategies for Low-Resource LVCSR. In proceedings of ASRU, Olomouc, Czech Republic, 2013: 404-409. 
  38. Daniel Povey, Mirko Hannemann, Gilles Boulianne, Lukáš Burget, Arnab Ghoshal, Miloš Janda, Martin Karafiát, Stefan Kombrink, Petr Motlíček, Yanmin Qian, Korbinian Riedhammer, Karel Veselý, Ngoc Thang Vu. Generating exact lattices in the WFST framework. In proceedings of ICASSP, Kyoto, Japan, 2012:4213-4216.
  39. Yanmin Qian, Jia Liu. Cross-Lingual and ensemble MLPs strategies for low-resource speech recognition. In proceedings of InterSpeech, Portland, Oregon, USA, 2012: 2582-2585.
  40. Yanmin Qian, Jia Liu. Articulatory feature based multilingual MLPs for low-resource speech recognition. In proceedings of InterSpeech, Portland, Oregon, USA, 2012: 2602-2605.
  41. Ji Xu, Yanmin Qian, Jia Liu. Utilizing auxiliary data in phoneme recognition based on articulatory feature. In proceedings of ICCSN, Xi’an, China, 2011:363-366.
  42. Daniel Povey, Arnab Ghoshal, Gilles Boulianne, Lukas Burget, Ondrej Glembek, Nagendra Goel, Mirko Hannemann, Petr Motlicek, Yanmin Qian, Petr Schwarz, Jan Silovsky, Georg Stemmer, Karel Vesely. The Kaldi speech recognition toolkit. In proceedings of ASRU, Honolulu, Hawaii, USA, 2011.
  43. Yanmin Qian, Daniel Povey, Jia Liu. State-Level data borrowing for low-resource speech recognition based on Subspace GMM Models. In proceedings of InterSpeech, Florence, Italy, 2011:553-556.
  44. Yanmin Qian, Ji Xu, Daniel Povey, Jia Liu. Strategies for using MLP based features with limited target-language training data. In proceedings of ASRU, Honolulu, Hawaii, USA, 2011:354-358.
  45. Yanmin Qian, Jia Liu. Phone model and combining discriminative training for Mandarin-English bilingual speech recognition. In proceedings of ICASSP, Dallas, TX, USA, 2010:4918-4921.
  46. Yanmin Qian, Jia Liu. Mandarin-English bilingual phone modeling and combining MPE based Discriminative training for cross-language speech recognition. In proceedings of ISCSLP, Tainan, Taiwan, 2010:103-108.
  47. Yan Deng, Weiqiang Zhang, Yanmin Qian, Jia Liu. Integration of complementary phone recognizers for phonotactic language recognition. In proceedings of ICICA, Tangshan, China, 2010:237-244.

List of Patents

  1. Qian Yanmin, Chen Nanxin, Yu Kai (2016).
    Speaker Anti-spoofing Detection using Deep Learning.
    Applied ID: CN201610478041.0, Published ID: CN105869630A, Published Date: 2016-08-17.
  2. Chen Nanxin, Ge Linting, Gu Hao, Chang Xuankai, Qian Yanmin, Yu Kai (2015).
    Text-dependent Speaker Verification using Multi-Task Deep Learning.
    Applied ID: CN201510107647.9, Published ID: CN104732978A, Published Date: 2015-06-24.
  3. Liu Jia, Qian Yanmin (2010).
    Embedded Large Vocabulary Mandarin Commands Speech Recognition Approach.
    Applied ID: CN200910242404.0, Published ID: CN101751924A, Published Date: 2010-06-23.
  4. Liu Jia, Qian Yanmin (2010).
    Embedded Mandarin-English Bilingual Speech Recognition Approach.
    Applied ID: CN200910242406.X, Published ID: CN101727901A, Published Date: 2010-06-09.