
Errata for MP3

18 November 2011, 15:46 Helena Galhardas

There was a typo in the statement of question 4.2, concerning the attributes projected by the SQL query.

There were two typos in the statement of question 5.2: one in the schema mapping that defines the mediator schema in terms of the data sources, and another in the mediator schema G.

All of them have already been corrected in the version available on the page.

Mini-Project 3

7 November 2011, 12:24 Bruno Emanuel Da Graça Martins

The exercises for Mini-Project 3 have been published online, in the "Mini-Projectos" section.

Invited talks in the lecture of 10/11, 9:30

6 November 2011, 19:18 Helena Galhardas

The lecture on Thursday, 10/11, at 9:30 will feature two invited talks, whose titles and abstracts are given below.

Medicine.Ask: an extraction and search system for medicine information

Vasco Mendes, IST 

Health personnel deal with medicines on a daily basis. They need access to comprehensive information about medicines as fast as possible. Several books and web sites are at their disposal, as well as independent software packages with extra search capabilities that can be used on Pocket PCs or mobile phones. The general public is also interested in having quick access to information about medicines. Despite all the electronic resources available nowadays, the search functionality they provide is usually keyword-based or class-oriented (allowing, for instance, a search by laboratory or by ATC classification). Our proposal is to speed up information access by providing a facility to search for information about medicines through a (controlled) set of questions posed in natural language. An example of such a question is: "Which are the medicines for influenza that can be used during pregnancy?"

We designed and implemented Medicine.Ask, a question-answering system about medicines that couples state-of-the-art techniques in Information Extraction and Natural Language Processing. We present the architecture of the system and the main techniques used. Furthermore, we report the experiments that were carried out to validate the modules of the system.

Consolidation of entities in social networks  

André Nunes, IST

The increasing popularity of social networks has led to a situation in which Web users frequently duplicate their identity across these fragmented information spaces. However, for many applications, it would be useful to aggregate the individual contributions that the same user makes across these sites. This raises the need for methods capable of resolving user identities, i.e. detecting identifiers on different social networks that correspond to the same person.

This work presents a machine learning approach for resolving user identities, based on classifying pairs of user identifiers as either corresponding to the same person or not. Experiments were run with different classification algorithms, namely Support Vector Machines, Random Forests and Alternating Decision Trees, and with different combinations of similarity scores for the feature vectors. The results obtained attest to the adequacy of the proposed method for the task of consolidating user identities on the social Web.
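
As a rough illustration of the pairwise formulation described in this abstract, the sketch below turns a pair of user profiles into a feature vector of similarity scores and then labels the pair. It is not the speaker's implementation: the profile fields (username, friends), the three similarity measures, and the fixed-threshold rule are all assumptions made for this example. The threshold rule is only a stand-in for a trained classifier such as the Support Vector Machines, Random Forests or Alternating Decision Trees mentioned above.

```python
from difflib import SequenceMatcher


def char_ngrams(text, n=2):
    """Return the set of character n-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pair_features(profile_a, profile_b):
    """Build a feature vector of similarity scores for a pair of profiles.

    The fields 'username' and 'friends' are hypothetical; real profiles
    would expose whatever attributes each social network provides.
    """
    name_a = profile_a["username"].lower()
    name_b = profile_b["username"].lower()
    return [
        SequenceMatcher(None, name_a, name_b).ratio(),
        jaccard(char_ngrams(name_a), char_ngrams(name_b)),
        jaccard(set(profile_a["friends"]), set(profile_b["friends"])),
    ]


def same_person(features, threshold=0.5):
    """Placeholder classifier: compare the mean similarity to a threshold.

    A trained model would replace this rule in a real system.
    """
    return sum(features) / len(features) >= threshold


# Made-up profiles, for illustration only.
a = {"username": "andre.nunes", "friends": ["ana", "rui", "marta"]}
b = {"username": "AndreNunes", "friends": ["ana", "rui", "joao"]}
c = {"username": "vmendes", "friends": ["pedro"]}

print(same_person(pair_features(a, b)))
print(same_person(pair_features(a, c)))
```

Keeping feature extraction separate from the decision rule makes it straightforward to try different combinations of similarity scores, which is the kind of comparison the abstract reports.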


Lab class tomorrow, 4/11, 8:30

3 November 2011, 12:28 Helena Galhardas

Due to unforeseen circumstances, tomorrow's lab class (Friday, 4/11, 8:30) will not take place.

The 10:00 lab class will take place as scheduled.

As a reminder, this week's lab classes are dedicated to supporting Mini-Project 2.

Document token-list.xml updated for Mini-Project 2

3 November 2011, 12:08 Bruno Emanuel Da Graça Martins

The XML document token-list.xml, which is used in the last exercise from mini-project 2, was updated on the course webpage.

The file previously listed on the course website contained only a very small number of examples for the classification of word tokens. Students should use the new token-list.xml document, which has exactly the same format.