Technical Program of SJTU Young Researchers Forum on Speech and Language Processing

Technical Program

S1A: Novel Training Methods

Time: Saturday, March 21, 9:20-11:20

Location: 3-200, SEIEE Building

Session Chair: Tian Tan

S1A.1: Echo State Network Random Projections and Ensemble Diversity

Jeremy Wong: University of Cambridge

S1A.2: Data selection from multiple ASR systems' hypotheses for unsupervised acoustic model training

Sheng Li: Kyoto University, Yoshida Campus

S1A.3: Semi-supervised training of Deep Neural Networks

Karel Vesely: Brno University of Technology

S1A.4: The speaker code based adaptation of RNN-BLSTM

Zhiying Huang: University of Science and Technology of China


S1B: Rich Audio Analysis

Time: Saturday, March 21, 9:20-11:20

Location: 3-410, SEIEE Building

Session Chair: Shuai Wang

S1B.1: Phone-aware local variability vector for speaker verification of short utterances

Liping Chen: University of Science and Technology of China

S1B.2: Acoustic Multi-Feature Analysis for the Classification of Snore Sounds

Christoph Janott: Technische Universität München

S1B.3: Compact Convolutional Neural Network Transfer Learning for Small-scale Image Classification

Zengxi Li: University of Science and Technology of China

S1B.4: Multi-scale approaches to dynamic music emotion prediction

Xinxing Li: Tsinghua University


S2A: Scalable NN Training

Time: Saturday, March 21, 12:00-14:00

Location: 3-200, SEIEE Building

Session Chair: Zhehuai Chen

S2A.1: Deep Big Audio Data Learning in High Performance Computing System: A Case Study of Big Bird Sounds Data Classification

Kun Qian: Technische Universität München

S2A.2: Artificial Neural Network Acoustic Models in HTK 3.5

Chao Zhang: University of Cambridge

S2A.3: A New Approach to Scalable Training of Deep Learning Machines

Kai Chen: MSRA

S2A.4: Long Short-Term Memory Recurrent Neural Networks for Audio Engineering

Erik Marchi: Technische Universität München


S2B: Speech Synthesis

Time: Saturday, March 21, 12:00-14:00

Location: 3-410, SEIEE Building

Session Chair: Bo Chen

S2B.1: Speaker and Language Factorization in DNN-based TTS Synthesis

Yuchen Fan: Shanghai Jiao Tong University

S2B.2: A KL Divergence And DNN Approach to Cross-Lingual TTS

Fenglong Xie: Harbin Institute of Technology

S2B.3: Initial investigation of speech synthesis based on complex-valued neural networks

Qiong Hu: Edinburgh University

S2B.4: Simple Multi Frame Analysis methods for estimation of Amplitude Spectral Envelope estimation in Singing Voice

Gilles Degottex: University of Cambridge


S3A: Novel NN Model

Time: Saturday, March 21, 14:00-16:00

Location: 3-200, SEIEE Building

Session Chair: Qi Liu

S3A.1: Highway LSTM for Speech Recognition

Yu Zhang: Massachusetts Institute of Technology

S3A.2: Listen, Attend and Spell

William Chan: Carnegie Mellon University

S3A.3: End-to-End Speech Recognition using Deep LSTMs, CTC Training and WFST Decoding

Yajie Miao: Carnegie Mellon University

S3A.4: Exploiting LSTM Structure in Deep Neural Networks for Speech Recognition

Tianxing He: Shanghai Jiao Tong University


S3B: Speech and Language Technology

Time: Saturday, March 21, 14:00-16:00

Location: 3-410, SEIEE Building

Session Chair: Lu Chen

S3B.1: Speech-Driven Visual Speech Synthesis

Dawei Zhang: Institute of Automation of Chinese Academy of Sciences

S3B.2: Exemplar-based sparse representation of timbre and prosody for voice conversion

Ming Huaiping: Agency for Science, Technology and Research

S3B.3: Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogue Systems

Yun-Nung (Vivian) Chen: Carnegie Mellon University

S3B.4: Spoken language understanding at University of West Bohemia

Adam Chýlek: University of West Bohemia, Faculty of Applied Sciences, Department of Cybernetics