Speech Recognition of Under-resourced Languages Using Mismatched Crowdsourcing
A major challenge in automatic speech recognition for under-resourced languages (such as Vietnamese or Tamil) is the scarcity of word transcriptions. Mismatched crowdsourcing is a cost-effective solution to this problem: transcribers who do not speak the under-resourced language of interest write down what they hear as nonsense words in their native language. These transcriptions are called mismatched transcriptions. In this talk, we present how we use mismatched transcriptions for speech recognition. Specifically, mismatched transcriptions are used to adapt an initial deep neural network. Experimental results show that mismatched transcriptions significantly improve speech recognition performance under limited training data conditions.
Van Hai DO is a Postdoctoral Researcher at the Advanced Digital Sciences Center (ADSC) in Singapore, a joint research center of the University of Illinois at Urbana-Champaign (UIUC) and A*STAR. Van Hai received his PhD degree from the School of Computer Science and Engineering, Nanyang Technological University. His research interests include speech recognition and speech search.