Class Summaries

Class not taught

29 May 2014, 15:30 Isabel Maria Martins Trancoso

This class was not taught because of an international meeting and because the test scheduled for this slot took place outside class hours.


Lab 4

29 May 2014, 14:00 Isabel Maria Martins Trancoso

Isolated word recognition using HTK.

Class given to the students of both lab sections.


Class rescheduled

26 May 2014, 17:00 Isabel Maria Martins Trancoso

This lab section was not taught because of an international meeting. An extra session was scheduled for 3 June at 10:00.


A Deep Neural Network Approach to Speech Enhancement

26 May 2014, 15:30 João Pedro Carvalho

Invited talk by Prof. Chin-Hui Lee, School of Electrical and Computer Engineering, Georgia Institute of Technology

Abstract
In contrast to the conventional minimum mean square error (MMSE) based noise reduction techniques, we formulate speech enhancement as finding a mapping function between noisy and clean speech signals. In order to handle a wide range of additive noises in real-world situations, a large training set, encompassing many possible combinations of speech and noise types, is first designed. Next, a deep neural network (DNN) architecture is employed as a nonlinear regression function to ensure a powerful modeling capability. Several techniques have also been adopted to improve the DNN-based speech enhancement system, including global variance equalization to alleviate the over-smoothing problem of the regression model, and dropout and noise-aware training strategies to further improve the generalization capability of DNNs to unseen noise conditions. Experimental results demonstrate that the proposed framework can achieve significant improvements in both objective and subjective measures over the MMSE-based techniques. It is also interesting to observe that the proposed DNN approach can effectively suppress highly non-stationary noise, which is generally difficult to handle. Furthermore, the resulting DNN model, trained with artificially synthesized data, is also effective in dealing with noisy speech data recorded in real-world scenarios without generating the annoying musical artifacts commonly observed in conventional enhancement methods.


Speech recognition

22 May 2014, 15:30 Isabel Maria Martins Trancoso

Construction of n-gram language models.
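
As a rough illustration of what building an n-gram language model involves, the sketch below estimates a bigram model from a tokenized corpus. It is not taken from the class materials: the function names `train_bigram` and `bigram_prob`, the boundary markers `<s>` and `</s>`, and the choice of add-one (Laplace) smoothing are all assumptions made for the example.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences.

    Each sentence is padded with the (assumed) boundary markers <s> and </s>.
    """
    histories, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        # Count every token that has a successor, i.e. every bigram history.
        histories.update(padded[:-1])
        bigrams.update(zip(padded[:-1], padded[1:]))
    # Words that can be predicted: all corpus tokens plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return histories, bigrams, vocab


def bigram_prob(prev, word, histories, bigrams, vocab):
    """Add-one (Laplace) smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (histories[prev] + len(vocab))
```

For a toy corpus such as `[["the", "cat", "sat"], ["the", "dog", "sat"]]`, calling `bigram_prob("the", "cat", *train_bigram(corpus))` gives the smoothed probability of "cat" following "the". Add-one smoothing is used here only because it is short to write; other smoothing or back-off schemes would replace the single line in `bigram_prob`.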