5 June 2009, 13:46 • Isabel Maria Martins Trancoso
June 5th, 15h, Room Va4
Speech Synthesis: past, present and future and how it mirrors speech processing development in general
Alan W Black
This talk will look at the past, present and future of speech synthesis and how it relates to speech processing development in general. Specifically, I will outline the advances in synthesis technology, drawing analogies to developments in other speech and language processing fields (e.g. ASR and SMT), where knowledge-based techniques gave way to data-driven techniques, which in turn first pushed machine learning technologies forward and later led to the re-introduction of techniques for including higher-level knowledge in our data-driven approaches.
We will give overviews of diphone, unit selection and statistical parametric synthesis, of voice morphing technologies, and of how synthesis can be optimized for the desired task. We will also address issues of evaluation, both in isolation and when embedded in real tasks. While widening our view of speech processing, we will also present the publicly used Let's Go Spoken Dialog System (and its evaluation platform, Let's Go Lab); our rapid language adaptation system (CMUSPICE), which allows people who are not speech experts to construct ASR and TTS support in new languages; and our hands-free, real-time, two-way speech-to-speech translation system, showing how system integration can drive cross-technology innovation.