FenixEdu™

Dissertação

{en_GB=Addressing the Texture-Bias and Domain Generalization Performance of Convolutional Neural Networks for Simulation-Based Learning} {} EVALUATED

Detalhes: {pt=As Redes Convolucionais tornaram-se a solução mais utilizada para resolver problemas de Visão por Computador. Graças a avanços em desempenho computacional e ao desenvolvimento de bibliotecas abertas de grande qualidade para aprendizagem profunda, automatizar uma tarefa de visão simples envolve apenas construir um conjunto de dados representativo, que simula as condições nas quais a rede será aplicada, e treinar uma rede adequada nesse conjunto de dados, com algumas considerações adicionais para evitar sobre-adaptar ou sub-adaptar a rede ao conjunto de dados. Este problema torna-se mais complexo quando o conjunto de dados de treino não representa inteiramente o ambiente no qual a rede deverá funcionar. Por exemplo, uma rede que é treinada num ambiente simulado poderá não funcionar bem quando é testada num ambiente real. Esta situação requer redes que sejam capazes de generalizar de forma mais profunda. As redes deverão ser robustas a alterações no seu ambiente. No caso de uma tarefa de visão, a rede deverá reconhecer a forma dos objetos em vez de reconhecer apenas as suas texturas, para poder reconhecer objetos em diferentes ambientes. Nesta tese iremos explorar o funcionamento das Redes Convolucionais, os detalhes deste problema de generalização, diferentes estratégias para o abordar, e propomos um novo método de texturização aleatória de imagens, que reduz a dependência das redes em texturas e aumenta a sua sensibilidade à forma dos objetos., en=Convolutional neural networks have become the de-facto standard solution for most computer vision problems. Thanks to advances in computer performance and the development of open high-end deep learning libraries, automating a simple vision task only involves gathering a reasonable dataset, that mimics the conditions in which the network will be deployed, and training a suitable network on that dataset, with some extra considerations and tuning to avoid underfitting or overfitting the dataset. This problem becomes more difficult when the dataset on which the network is trained on does not fully represent the scenarios on which the network will be immersed. For example, a network that is trained in a simulated environment may not perform well when it is tested in a real environment, due to the differences between the simulated and real environments. This situation requires networks that are able to generalize at a deeper level. The networks must be robust to changes in their environment. Therefore, in the case of a vision task, they must rely mostly on high-level object shapes rather than low-level image textures to correctly identify objects across environments. Throughout this thesis we will explore the inner workings of convolutional neural networks, the intricacies of this generalization problem, several strategies to tackle it and propose a novel randomized image texturization method that can make make networks rely less on texture and more on the shape of objects.}
Keywords: {pt=Visão por Computador, Aprendizagem Profunda, Redes Convolucionais, Generalização de Domínio, Viés de Texturas, en=Computer Vision, Deep Learning, Convolutional Neural Networks, Domain Generalization, Texture Bias}

Discussão: janeiro 11, 2021, 12:30