Research seminar Jo Vermeulen, University of Calgary, Friday, March 31, 9:30 am

23 March 2017, 10:40 - Ana Maria de Almeida Nogueira Marques

Title: Designing Intelligible Technology

Speaker: Jo Vermeulen, University of Calgary

Date and time: Friday, March 31, 9:30 am

Location: CSE meeting room (Informática II - Alameda), videocast to DSI room in Tagus.

We are currently living in a world of ubiquitous computing – a world beyond the personal computer, in which everyone is interacting daily with many different computing devices in many different form factors. The increasing mobility of computing devices and integration of sensors has radically changed the way we interact with technology. However, in addition to opening new ways of engaging with technology, ubiquitous computing also brought about several new challenges for the field of Human–Computer Interaction. I will illustrate the difficulties people face when dealing with so-called "smart" technologies that act on our behalf, and talk about the dark side of this technology. I will present several strategies and techniques from my research that can help to address these challenges. At the core of my vision is technology that works for us, and that we can understand: the concept of intelligible technology. Finally, I will provide future directions towards designing technology that both empowers us and keeps us in the loop.

Jo Vermeulen has an MSc. and a Ph.D. in Computer Science from Hasselt University in Belgium. He was awarded the 2015 FWO – IBM Innovation Award, a prize given annually to the best Belgian PhD dissertation in computer science. Jo's research received several other awards, including a Best Paper at ACM DIS 2014 and a Best Paper Honorable Mention Award at CHI 2013. He is currently a Postdoctoral Fellow in the InnoVis group in the Interactions Lab at the University of Calgary, where he works with Dr. Sheelagh Carpendale. His research interests lie at the intersection of human–computer interaction, ubiquitous computing and information visualization. A recurring theme in Jo's work is to reveal the invisible aspects of technology. He strongly believes in designing interactive technology that will make people – not just technology – smarter.

Research seminar Pedro Ferreira, IPATIMUP - Porto, Monday, March 27, 15:00

21 March 2017, 10:38 - Ana Maria de Almeida Nogueira Marques

Title: Large-scale data analysis for the characterization of human gene expression patterns across hundreds of individuals and dozens of tissues

Speaker: Pedro Gabriel Dias Ferreira, Ipatimup/i3s

Date and time: Monday, March 27, 15:00

Location: CSE meeting room (Informática II - Alameda), videocast to DSI room in Tagus.

The emergence of high-throughput sequencing has brought substantial advances in human population and cancer genomics research. These advances have made it possible to assay, with high depth of coverage, the DNA and RNA of tumor and normal samples, creating an unprecedented explosion of data. Several large-scale consortia, such as TCGA, ICGC, ENCODE, GEUVADIS and GTEx, are dedicated to the comprehensive sequencing, characterization and analysis of the genomic changes in different types of tumors, cell lines or tissues. While these datasets bring new opportunities for extending our biological knowledge, they also bring new challenges for the computational aspects of data analysis. In this talk, I will describe some of our recent results in the characterization of transcriptional patterns across dozens of tissues and hundreds of individuals within the scope of GTEx and other large-scale sequencing projects. The methodology used for data analysis will be explained. The relevance of these findings for biomedical research and future perspectives will also be discussed.

Pedro G. Ferreira graduated in Systems and Informatics Engineering in 2002 and completed his PhD in Artificial Intelligence at the University of Minho in November 2007 with a scholarship from FCT-Portugal. From 2008 to 2012, he was a Postdoctoral Fellow at the Bioinformatics and Genomics Laboratory, Center for Genomic Regulation (CRG), Barcelona. He was supported by an FCT-Portugal fellowship for the first three years. During this period he worked extensively with next-generation sequencing (NGS) data. He collaborated with several groups in the CRG and the University Pompeu Fabra and with several international consortia. In November 2012, he joined the Functional Population Genomics and Genetics of Complex Traits group, School of Medicine, University of Geneva, as a Postdoctoral Fellow. His research there focused on genomics for personalized health. During his two postdocs, he was involved in four major international consortia: ENCODE, ICGC-CLL, GEUVADIS and GTEx. In August 2014, he joined the start-up company Coimbra Genomics as a senior bioinformatics specialist, where he worked on the design and development of clinical decision support systems for personalized medicine. In late 2014, he was awarded an FCT Investigator ‘Starting grant’ and in May 2015 he started at IPATIMUP. His main research focus is the development of information systems to interpret personal genomics data for clinical diagnosis and precision medicine.

Research seminar – Wang Ling, Friday, March 24, 11:00-11:45 (Alameda - 0.19, Pav Informática II) and 15:00-15:45 (Tagus park - 1.38)

20 March 2017, 13:40 - Ana Maria de Almeida Nogueira Marques

Title: Structured Neural Networks for Natural Language Processing

Speaker: Wang Ling, CMU/IST

Date and time: Friday, March 24, 11:00-11:45 and 15:00-15:45 (Tagus park - 1.38)

Location: CSE meeting room (Informática II - Alameda), videocast to DSI room in Tagus, and Tagus park - 1.38

Recent advances in deep learning have led to a new era in Natural Language Processing where neural networks achieve state-of-the-art results on the majority of its mainstream tasks, such as machine translation, language modeling and parsing. In this talk, I will describe how structure plays an important role in the design of neural models for natural language processing tasks. First, I will describe a class of hierarchical models that employ a word composition model at the character level, in addition to a sentence composition component. By enabling morphological awareness, this class of models has shown remarkable improvements in a multitude of natural language processing tasks, such as language modeling, part-of-speech tagging and machine translation. Secondly, I will describe Latent Predictor Networks, a framework that allows the generation of sequences of tokens with multiple predictors at different levels of granularity. This model is applied to the task of generating programming code from natural language descriptions, where the model learns to generate programming language keywords at the character level, and learns to copy strings and values from the natural language input. Thirdly, I will describe a model that learns to solve high school math problems, where the model is required to understand a question in natural language and generate a natural language rationale describing the solution to the problem. I will conclude by discussing promising directions for future research.


Wang Ling is a research scientist at Google DeepMind. He received his dual-degree PhD in Language Technologies in 2015 from Carnegie Mellon University and the University of Lisbon. His research interests include Machine Translation, Natural Language Processing, Machine Learning and Deep Learning. He has published over 30 articles in top-tier conferences and journals (including Computational Linguistics, ACL and EMNLP).

Research seminar - André Freitas, University of Passau, Germany - Monday, April 3, 14:00-16:30

20 March 2017, 13:21 - Ana Maria de Almeida Nogueira Marques

Title: How to talk to your data: Scalable semantic interpretation techniques for heterogeneous data

Speaker: André Freitas, University of Passau, Germany

Date and time: Monday, April 3, 14:00

Location: CSE meeting room (Informática II - Alameda), videocast to DSI room in Tagus.

The recent evolution of approaches, data resources and tools in the Natural Language Processing (NLP) and Artificial Intelligence (AI) fields brings the opportunity for the construction of data analysis methods and information systems which are able to work over unstructured, complex and semantically heterogeneous data. The ability to systematically structure, integrate, query and operate over unstructured and highly variable data at scale emerges as a strong demand across different fields which are dependent on analytical reasoning. In this talk we will describe contemporary techniques to automatically interpret the meaning of unstructured data at scale and the emerging formal and methodological data science foundations which support addressing the data variety dimension for Big Data scenarios. A particular emphasis will be given to the description of information extraction, knowledge representation and semantic parsing models and how these models can be combined to build systems that perform complex interpretation tasks such as Question Answering under real-world data conditions.

André Freitas is a research group leader and lecturer at the Natural Language Processing & Semantic Computing research group at the University of Passau in Germany. Before joining Passau, he was part of the Digital Enterprise Research Institute (DERI) at the National University of Ireland, Galway where he did his PhD on Schema-agnostic Query Mechanisms for Large-Schema Databases. André holds a BSc. in Computer Science from the Federal University of Rio de Janeiro (UFRJ), Brazil. His main research areas include Question Answering, Schema-agnostic Database Query Mechanisms, Natural Language Query Mechanisms over Large-Schema Databases, Distributional Semantics, Hybrid Symbolic-Distributional Models, Approximate Reasoning and Knowledge Graphs.

Research seminar - André Martins, Unbabel - Monday, March 20, 15:30-16:30

20 March 2017, 12:39 - Ana Maria de Almeida Nogueira Marques

Title: Machine Learning for Natural Language Processing

Speaker: André F. T. Martins, Unbabel

Date and time: Monday, March 20, 15:30

Location: CSE meeting room (Informática II - Alameda), videocast to DSI room in Tagus.


In this talk, I will describe how statistical machine learning can be very effective in addressing various natural language processing tasks, including syntactic parsing, natural language inference, and translation quality estimation. First, I will show how many of these problems fall under the umbrella of structured prediction, a machine learning framework for making predictions involving structured, highly interdependent, and globally constrained output variables. I will illustrate with an application to syntactic parsing ("turbo parser"), which formulates the problem as an integer linear program, represents it as a factor graph, and runs an approximate decoding algorithm to obtain the highest-scoring dependency tree for a given sentence. Then, I will show how recent deep learning models involving attention mechanisms are capable of recognizing input structure. I will describe "sparsemax," a new activation function similar to the traditional softmax, but which promotes sparse probabilities. This is used to design a sparse attention mechanism which, when applied to a natural language entailment problem, is able to identify which words are relevant to detect an entailment or contradiction relation. Finally, I will describe how structured prediction models and neural networks can be combined effectively for assessing the quality of machine-translated sentences. I will end the talk by discussing possible directions for future research: how to marry deep learning and structured prediction to create a new offspring of methods and algorithms capable of performing complex natural language processing tasks?
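
As a rough illustration of the contrast between softmax and sparsemax mentioned in the abstract, the sketch below computes both on the same scores. It assumes the usual description of sparsemax as the Euclidean projection of the score vector onto the probability simplex, computed by sorting and thresholding. The function names and the example scores are illustrative only and are not taken from the talk.

```python
import math


def softmax(z):
    """Standard softmax: every output is strictly positive."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(z):
    """Sketch of sparsemax as a projection onto the probability simplex.

    Sort the scores in decreasing order, find the largest k such that
    1 + k * z_(k) > sum of the top-k scores, then subtract the
    resulting threshold tau and clip negative values to zero.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    k = 0
    support_sum = 0.0
    for j, v in enumerate(z_sorted, start=1):
        cumsum += v
        if 1 + j * v > cumsum:
            k = j
            support_sum = cumsum
    tau = (support_sum - 1) / k
    return [max(v - tau, 0.0) for v in z]


scores = [1.0, 0.8, 0.1]      # hypothetical attention scores
dense = softmax(scores)       # all three entries are non-zero
sparse = sparsemax(scores)    # the lowest-scoring entry is exactly zero
```

Both outputs sum to one, but sparsemax can assign exactly zero probability to low-scoring entries, which is the property that lets a sparse attention mechanism select a subset of the input words.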


André Martins is the Head of Research at Unbabel, an invited professor at Instituto Superior Técnico, and a member of the Instituto de Telecomunicações. He received his dual-degree PhD in Language Technologies in 2012 from Carnegie Mellon University and the University of Lisbon. His research interests include natural language processing, machine translation, machine learning, and deep learning. He has published 30+ papers, with over 1700 citations and an h-index of 21, in top-tier conferences and journals (such as Computational Linguistics, JMLR, PAMI, ACL, ICML, and EMNLP), including a best paper award at the Annual Meeting of the Association for Computational Linguistics (ACL) for his work in natural language syntax. His PhD dissertation was awarded an SCS Honorable Mention at CMU. He won the Portuguese IBM Scientific Prize in 2012. André is one of the co-founders and organizers of the Lisbon Machine Learning Summer School (LxMLS).