Announcements

Lab Grades

18 June 2014, 12:32 Isabel Maria Martins Trancoso

Review: 20/06/2014, 10:00-11:00, INESC-ID, Office 231


Talk: Behavioral Signal Processing: Enabling human-centered behavioral informatics

11 June 2014, 16:55 Isabel Maria Martins Trancoso

Prof. Shrikanth Narayanan, University of Southern California, USA

Behavioral Signal Processing: Enabling human-centered behavioral informatics
30 June 2014, 2:30 p.m., IST, Abreu Faro Amphitheatre


[ Abstract ]
Audio-visual data have been a key enabler of human behavioral research and its applications. The confluence of sensing, communication and computing technologies is allowing capture and access to data, in diverse forms and modalities, in ways that were unimaginable even a few years ago. Importantly, these data afford the analysis and interpretation of multimodal cues of verbal and non-verbal human behavior. These signals carry crucial information not only about a person's intent and identity but also about underlying attitudes and emotions. Automatically capturing these cues, although vastly challenging, offers the promise not just of efficient data processing but of tools for discovery that enable hitherto unimagined insights. Recent computational approaches that leverage judicious use of both data and knowledge have yielded significant advances in this regard, for example in deriving rich, context-aware information from multimodal sources including human speech, language, and videos of behavior. This talk will focus on some of the advances and challenges in gathering such data and creating algorithms for machine processing of such cues. It will highlight some of our ongoing efforts in Behavioral Signal Processing (BSP), technology and algorithms for quantitatively and objectively understanding typical, atypical and distressed human behavior, with a specific focus on communicative, affective and social behavior. The talk will illustrate Behavioral Informatics applications of these techniques that contribute to quantifying higher-level, often subjectively described, human behavior in a domain-sensitive fashion. Examples will be drawn from health and well-being realms such as autism, couple therapy and addiction counseling.

[ Bio ]
Shrikanth (Shri) Narayanan is Andrew J. Viterbi Professor of Engineering at the University of Southern California, where he is Professor of Electrical Engineering, Computer Science, Linguistics and Psychology and Director of the Ming Hsieh Institute. Prior to USC he was with AT&T Bell Labs and AT&T Research. His research focuses on human-centered information processing and communication technologies. He is a Fellow of the Acoustical Society of America, the IEEE, and the American Association for the Advancement of Science (AAAS). Shri Narayanan is an Editor for the journal Computer Speech and Language and an Associate Editor for the IEEE Transactions on Affective Computing, the Journal of the Acoustical Society of America and the APSIPA Transactions on Signal and Information Processing, having previously served as an Associate Editor for the IEEE Transactions on Speech and Audio Processing (2000-2004), the IEEE Signal Processing Magazine (2005-2008) and the IEEE Transactions on Multimedia (2008-2012). He is a recipient of several honors, including the 2005 and 2009 Best Transactions Paper awards from the IEEE Signal Processing Society, and served as its Distinguished Lecturer for 2010-11. With his students, he has received a number of best paper awards, including winning the Interspeech Challenges in 2009 (Emotion classification), 2011 (Speaker state classification), 2012 (Speaker trait classification) and 2013 (Paralinguistics/Social Signals). He has published over 600 papers and has been granted 16 U.S. patents. (http://sail.usc.edu/shri.php)


Exam review / Lab 4

10 June 2014, 23:48 Isabel Maria Martins Trancoso

Thursday, 12 June, 14:30-15:30


Exam review / Office hours

28 May 2014, 11:07 Isabel Maria Martins Trancoso

3 June, 10:00-11:00, in the Lab

Questions can be answered at other times, in my office, provided an appointment is arranged in advance by email.


Talk: A Deep Neural Network Approach to Speech Enhancement

23 May 2014, 18:35 Isabel Maria Martins Trancoso

A Deep Neural Network Approach to Speech Enhancement

[ Date ]

- 3:30 p.m., Monday, May 26th, 2014
- QA1.2 (IST Alameda)

[ Speaker ]

- Prof. Chin-Hui Lee, School of Electrical and Computer Engineering, Georgia Institute of Technology

[ Abstract ]

In contrast to conventional minimum mean square error (MMSE) based noise reduction techniques, we formulate speech enhancement as finding a mapping function between noisy and clean speech signals. To handle a wide range of additive noises in real-world situations, a large training set encompassing many possible combinations of speech and noise types is first designed. Next, a deep neural network (DNN) architecture is employed as a nonlinear regression function to ensure powerful modeling capability. Several techniques have also been adopted to improve the DNN-based speech enhancement system, including global variance equalization to alleviate the over-smoothing problem of the regression model, and dropout and noise-aware training strategies to further improve the generalization of DNNs to unseen noise conditions. Experimental results demonstrate that the proposed framework achieves significant improvements in both objective and subjective measures over MMSE-based techniques. It is also interesting to observe that the proposed DNN approach suppresses highly non-stationary noise well, which is difficult to handle in general. Furthermore, the resulting DNN model, trained on artificially synthesized data, is also effective on noisy speech recorded in real-world scenarios, without generating the annoying musical artifacts commonly observed with conventional enhancement methods.
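To make the regression formulation concrete, here is a minimal sketch assuming PyTorch and log-power spectral features; the feature dimension, context window, layer widths and training loop are illustrative assumptions, not the system presented in the talk.

```python
# Minimal sketch of the regression formulation described above: a feed-forward
# DNN mapping stacked noisy log-power spectral frames to a clean frame.
# PyTorch, the feature dimension, context size, layer widths and training
# details are all illustrative assumptions, not the speaker's configuration.
import torch
import torch.nn as nn

N_BINS = 257   # assumed spectral feature dimension (e.g. 512-point FFT)
CONTEXT = 7    # assumed number of stacked context frames around the target

class EnhancementDNN(nn.Module):
    """Nonlinear regression: noisy context window -> enhanced clean frame."""
    def __init__(self, hidden=2048, p_drop=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_BINS * CONTEXT, hidden), nn.ReLU(),
            nn.Dropout(p_drop),          # dropout, one of the cited strategies
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden, N_BINS),   # linear output layer for regression
        )

    def forward(self, noisy_context):
        return self.net(noisy_context)

model = EnhancementDNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
mse = nn.MSELoss()  # regression objective between enhanced and clean features

# One illustrative training step on random stand-in tensors; a real system
# would iterate over a large multi-condition (speech x noise type) corpus.
noisy = torch.randn(32, N_BINS * CONTEXT)  # stacked noisy log-spectra
clean = torch.randn(32, N_BINS)            # parallel clean target frames
optimizer.zero_grad()
loss = mse(model(noisy), clean)
loss.backward()
optimizer.step()
```

The global variance equalization mentioned in the abstract would then act as a post-processing step, rescaling the enhanced features so that their variance matches that of clean speech, which counteracts the over-smoothing typical of MSE-trained regression models.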

[ Bio ]

Chin-Hui Lee is a professor at the School of Electrical and Computer Engineering, Georgia Institute of Technology. Before joining academia in 2001, he had 20 years of industrial experience, ending at Bell Laboratories, Murray Hill, New Jersey, as a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. Dr. Lee is a Fellow of the IEEE and a Fellow of ISCA. He has published over 400 papers and holds 30 patents, and is highly cited for his original contributions, with an h-index of 66. He has received numerous awards, including the Bell Labs President's Gold Award in 1998. He won the IEEE Signal Processing Society's 2006 Technical Achievement Award for "Exceptional Contributions to the Field of Automatic Speech Recognition". In 2012 he was invited by ICASSP to give a plenary talk on the future of speech recognition. In the same year he was awarded the ISCA Medal for Scientific Achievement for "pioneering and seminal contributions to the principles and practice of automatic speech and speaker recognition".