Papers and Projects
Information Management and Retrieval
(2007/2008)
Nº | Name | Paper | Project
48372 | Ricardo Cruz | A.19 | B.14
49616 | Bruno José Saraiva Barreiros | A.4 | B.5
50000 | Humberto Miguel Guerreiro da Glória | A.2 | B.5
50991 | Francisco Miguel Falcão dos Reis | A.5 | B.2
51110 | Vanda Sofia Torres Ribeiro | A.22 | B.16
52304 | Diogo José da Fonseca Simões | ??? | ???
52308 | Alexandre Santos Frazão | A.7 | B.1
52316 | Hugo Jorge de Bento Alberto | A.6 | B.6
52317 | Edgar Ferreira de Oliveira | A.3 | B.1
52387 | Tiago Alexandrino Janela | A.20 | B.4
52411 | Miguel da Silva Ferreira Neiva Vieira | A.12 | B.6
52412 | Marcelo Serrano Ferreira | A.17 | B.12
52421 | Miguel Jorge de Brito Vilhena | A.11 | B.4
52473 | João Miguel Martins Gonçalves | A.8 | B.11
52475 | Tiago Jorge Cabrita dos Santos | A.9 | B.11
53808 | Daniel Enrique Zacarias Silva | A.10 | B.7
53823 | Bruno José Gonçalves Oliveira | A.14 | B.9
53829 | Daniel António Quaresma Costa | A.1 | B.7
53831 | André Moura Garcia Pereira | A.21 | B.13
53837 | Diogo Bagulho Galvão | A.18 | B.13
53939 | Ricardo Daniel Figueiredo Freire | A.16 | B.9
A. Papers (individual works)
A.1. MPEG-21 (Daniel Costa - 53829)
- Survey the current status of the definition and usage of MPEG-21. The work must also identify other standards that cover the same or similar requirements.
A.2. MPEG-7 (Humberto Glória - 50000)
- Survey the current status of the definition and usage of MPEG-7. The work must also identify other standards that cover the same or similar requirements.
A.3. Search Engines (Edgar Oliveira - 52317)
- Identify multiple relevant search engines and classify them according to their common characteristics (features, targeted problem/area, etc.).
A.4. Network Access (Bruno Barreiros)
- Survey the current use of identifiers and resolution services for accessing information. Focus on the Handle System, DOI, OpenURL, etc.
A.5. Syndication (Francisco Falcão Reis - 50991)
- Survey the current usage of services based on syndication protocols (RSS, Atom, …) and propose a comparative analysis of representative ones. A special focus on news aggregator services is recommended.
A.6. Web page ranking (Hugo Alberto - 52316)
- State of the art in web page ranking: the idea is to find out which approaches are being studied in this area. Examples given in class included link analysis, machine learning, relevance feedback, logs (possibly with the geographic location of the access), etc.
A.7. XML retrieval (Alexandre Frazão - 52308)
- Survey the current status of the definition and usage of XML retrieval tools. Use the INEX proceedings as the main reference.
A.8. Image retrieval (João Gonçalves - 52473)
- Survey the current status of the definition and usage of image retrieval techniques and tools. Include content-based tools and metadata-based tools (Flickr, Google Images, etc.).
A.9. Music retrieval (Tiago Santos - 52475)
- Survey the current status of the definition and usage of music retrieval techniques and tools.
A.10. Calendar Information (Daniel Silva - 53808)
- Survey the current status of the definition and usage of data structures and protocols for calendars intended for shared environments.
A.11. Electronic Resource Management Systems (Miguel Vilhena - 52421)
- Survey the current status of the definition and usage of Electronic Resource Management Systems (ERMS). An ERMS is a system intended to manage the descriptions of digital resources, and to support their administrative metadata, in libraries and organisations in general. The term is common but there is no consensus on its definition, so the work must also identify and analyse the possible definitions. Note: this is not the same as an "Enterprise Resource Management System".
A.12. OAI-PMH (Miguel Vieira)
- Survey the current usage of OAI-PMH (identify related open-source tools, services and projects from two perspectives: common cases, and innovative or unusual but relevant cases). Paper related to project B.7.
A.13. Z39.50 and SRU
- Survey the current status of the definition and usage of the Z39.50 and SRU protocols (identify related open-source tools, services and projects from two perspectives: common cases, and innovative or unusual but relevant cases). Paper related to project B.8.
A.14. Names in metadata (Bruno Oliveira - 53823)
- State of the art in techniques to detect occurrences of the name of the same person or organisation, usually as authors, across multiple metadata records. Paper related to project B.9.
A.15. The same citation
- State of the art in techniques to detect multiple occurrences of the same bibliographic citation of a scientific work within a set of citations, considering the most common metadata attributes (authors, title, date and place of publication, etc.). Paper related to project B.10.
A.16. Meta-search engines (Ricardo Freire - 53939)
- Survey the current status of meta-search engines (existing services and their main characteristics). Paper related to project B.11.
A.17. Clustering of web results (Marcelo Ferreira - 52412)
- Survey the current state of the art in clustering techniques for web results. Paper related to project B.12.
A.18. Relevance feedback (Diogo Galvão)
- Survey the current state of the art in relevance feedback techniques. Paper related to project B.13.
A.19. Document Server (Ricardo Cruz - 48372)
- Survey the state of the art in techniques for managing documents in projects, and characterise existing project management tools according to their support for storing, managing, preserving, searching and retrieving documents. Paper related to project B.14.
A.20. Evaluating Web Search Engines (Tiago Janela - 52387)
- Survey the current state of the art in techniques for evaluating web search engines. This paper can be relevant for project B.15.
A.21. Geoparsers (André Pereira - 53831)
- Survey the current state of the art in the definition and usage of geoparsing techniques. The work must also identify cases where geoparsers are used, including services available on the Web.
A.22. Suggesting tags (Vanda Ribeiro - 51110)
- Survey the state of the art in solutions and systems for suggesting tags, and characterise them.
B. Projects (individual or groups of two students)
B.1. Search Trends (Edgar Oliveira - 52317 / Alexandre Frazão - 52308)
- Project: The basic version of this project comprises the development of a solution to process the logs of search services, detect trends, and store those trends in structured XML files. An advanced version will publish the results as indexes, similar to Google Trends ( http://www.google.com/trends ). The logs from PORBASE will be available.
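The basic version described above (process logs, detect trends, store them as XML) could be sketched as follows. This is only an illustration under stated assumptions: the PORBASE log format is not given here, so the input is assumed to be already parsed into hypothetical (period, query) pairs, and a "trend" is arbitrarily defined as a query whose count grew by a factor of at least `min_growth` between the two most recent periods. The XML element and attribute names are also invented for the example.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def detect_trends(log, min_growth=2.0):
    """log: iterable of (period, query) pairs; periods must sort
    chronologically (e.g. '2007-10'). Assumed format, not PORBASE's."""
    per_period = {}
    for period, query in log:
        per_period.setdefault(period, Counter())[query.strip().lower()] += 1
    periods = sorted(per_period)
    if len(periods) < 2:
        return []
    prev, last = per_period[periods[-2]], per_period[periods[-1]]
    trends = []
    for query, count in last.items():
        # Queries unseen in the previous period are compared against 1.
        growth = count / max(prev.get(query, 0), 1)
        if growth >= min_growth:
            trends.append((query, count, growth))
    # Strongest growth first, ties broken alphabetically.
    return sorted(trends, key=lambda t: (-t[2], t[0]))

def trends_to_xml(trends):
    """Serialise the trends into a small, hypothetical XML structure."""
    root = ET.Element("trends")
    for query, count, growth in trends:
        item = ET.SubElement(root, "trend", count=str(count),
                             growth=f"{growth:.2f}")
        item.text = query
    return ET.tostring(root, encoding="unicode")
```

A real solution would also need to parse the actual log files and choose a more robust trend definition (for example, one normalised by total search volume).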
B.2. Metadata Converter Service (Francisco Falcão Reis - 50991)
- Project: Develop a service for bidirectional translation of bibliographic metadata (in practical terms, contribute to a new version of the service http://urn.porbase.org, enriching it with more formats and adding a service that takes uploaded records in any of the supported formats and converts them to any other supported format). NOTE: Support provided by Nuno Freire (sample records, existing code in Java and practical advice).
B.3. Alphabetic Indexes
- Develop a solution to create alphabetic browsing indexes from descriptive metadata (taking at least Dublin Core as a reference format). The work can use the JSP Tag Library Alphabetical Navigation Bar ( http://www.aikiinc.com/alphanavbar/ ) with Java APIs to process XML and generate the Web interfaces (dynamic browsing indexes, where each entry links to the full metadata record). Other references can be explored, such as http://simile.mit.edu/wiki/Longwell .
B.4. Timelines (Miguel Vilhena - 52421 / Tiago Janela - 52387)
- Develop a solution to create time browsing indexes from descriptive metadata (taking at least Dublin Core as a reference format). The work can use the JavaScript SIMILE Timeline ( http://simile.mit.edu/timeline/ ) with Java APIs to process XML and generate the Web interfaces (timelines, where each reference links to a full metadata record). The reference metadata will be sets of records describing old maps (metadata and thumbnails of the maps will be available).
B.5. SmarterPhone (Humberto Glória - 50000 / Bruno Barreiros)
- My master's thesis (supervised by Prof. Daniel Gonçalves) is titled SmarterPhone (http://cgm.dei.ist.utl.pt/propostas/mestrados0708/smarterphone/). The goal is to make a mobile phone/PDA/smartphone more intelligent. To that end, context-sensitive information will be collected from the device so that it can later take intelligent actions without direct user interaction. Information will be collected not only from the phone but also from the user's personal computer, broadening the sources of information into a solid basis for the intelligent actions. One of the features will be document retrieval. For the computer-side application there are already platforms on which I will build, such as the Scroll database (developed at INESC-ID). However, nothing exists for the phone itself, so the proposed project will consist, in its basic version, of the design and development of an indexing solution for the SMS and MMS messages stored on the phone (in principle using the TFxIDF algorithm). The advanced version of the project will consist of developing a ranking solution for the same messages.
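To make the TFxIDF idea concrete, here is a minimal sketch of an index over short text messages. It is not the project's design: the message ids, the tokenisation by `\w+`, and the plain `tf * log(N / df)` weighting are all assumptions chosen for brevity, and the small `search` function is included only to show the weights in use.

```python
import math
import re
from collections import Counter, defaultdict

def build_index(messages):
    """messages: dict of message id -> text (hypothetical input shape).
    Returns (postings, idf): term -> {id: tf} and term -> idf."""
    postings = defaultdict(dict)
    for msg_id, text in messages.items():
        terms = re.findall(r"\w+", text.lower())
        for term, tf in Counter(terms).items():
            postings[term][msg_id] = tf
    n = len(messages)
    # idf = log(N / df); a term present in every message gets weight 0.
    idf = {term: math.log(n / len(docs)) for term, docs in postings.items()}
    return postings, idf

def search(query, postings, idf):
    """Score each message by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in re.findall(r"\w+", query.lower()):
        for msg_id, tf in postings.get(term, {}).items():
            scores[msg_id] += tf * idf[term]
    # Best score first, ties broken by message id.
    return sorted(scores, key=lambda m: (-scores[m], m))
```

On a phone, memory and storage limits would likely call for a more compact representation than nested dictionaries.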
B.6. Document indexing (Miguel Vieira - 52411 / Hugo Alberto - 52316)
- Implementation of a generic document sort-based indexer, using the algorithms presented in the book "Managing Gigabytes". Compression and in-place algorithms are optional. The implementation should be done under the IR-BASE project framework: http://www.bcs.org/server.php?show=ConWebDoc.8762
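The core idea of sort-based inversion can be illustrated in a few lines: emit one (term, document, frequency) triple per distinct term in each document, sort the triples, and read the postings lists off the sorted run. The sketch below is an in-memory simplification under assumed inputs (a list of strings, with list positions as document ids); it does not follow the book's exact algorithm, which sorts runs on disk and merges them, and it ignores compression.

```python
import re
from collections import Counter
from itertools import groupby

def sort_based_index(docs):
    """docs: list of strings; document ids are the list positions."""
    triples = []
    for doc_id, text in enumerate(docs):
        terms = re.findall(r"\w+", text.lower())
        for term, freq in Counter(terms).items():
            triples.append((term, doc_id, freq))
    # Sorting by (term, doc_id) brings each term's postings together,
    # already ordered by document id.
    triples.sort()
    return {term: [(d, f) for _, d, f in group]
            for term, group in groupby(triples, key=lambda t: t[0])}
```

For example, `sort_based_index(["a b a", "b c"])` maps `"b"` to the postings list `[(0, 1), (1, 1)]`.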
B.7. OAI-PMH (Daniel Costa - 53829 / Daniel Silva - 53808)
- Develop a service to "watchdog" the quality of service of OAI-PMH servers. This is important, for example, in scenarios where service providers harvest the metadata from the data providers and use that metadata only to build indexes, without keeping a copy of the records. In these scenarios, if a user of a service later wants to see the full record, the service provider might want to request just that record. Because this is a real-time scenario, being able to predict the behaviour of a data provider is very important.
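A single watchdog probe for the scenario above might look like the following sketch. It builds an OAI-PMH `GetRecord` request and measures how long the data provider takes to answer. The function names, the injectable `fetch` parameter (used so the probe can be exercised without a network), and the example base URL are assumptions made for illustration; a real service would run such probes periodically and store the measurements.

```python
import time
import urllib.parse
import urllib.request

def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build a GetRecord request URL for one record."""
    params = {"verb": "GetRecord",
              "identifier": identifier,
              "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)

def probe(url, fetch=None, timeout=10):
    """Return (ok, seconds) for one request.
    Note: an HTTP 200 response can still carry an OAI-PMH <error>
    element; this sketch does not inspect the response body."""
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u, timeout=timeout).read()
    start = time.monotonic()
    try:
        fetch(url)
        ok = True
    except Exception:
        ok = False
    return ok, time.monotonic() - start
```

A history of such (ok, seconds) samples per data provider is one possible basis for predicting its behaviour.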
B.8. Z39.50 and SRU
- Harvest Z39.50 and SRU: Develop a solution to harvest bibliographic databases which are only available through Z39.50 and SRU. A simple version should focus on local cases, where the harvester and the Z39.50 or SRU server are in the same network and only one server is targeted. A more advanced version can focus on cases where the harvester targets multiple servers on the Internet. Support provided by Nuno Freire (existing code in Java and practical advice).
B.9. Names in metadata (Bruno Oliveira - 53823 / Ricardo Freire - 53939)
- Implementation of techniques to detect occurrences of the name of the same person or organisation, usually as authors, across multiple metadata records. Benchmark study with data from PORBASE.
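As a starting point, a naive name-matching heuristic is sketched below. It is an assumption-laden illustration, not a recommended technique: names are normalised (accents stripped, "Family, Given" forms inverted), the family name must match exactly, and the given names of the shorter form must each match a given name of the longer form, allowing initials. Real data would need far more care (homonyms, transliteration, organisation names).

```python
import unicodedata

def name_tokens(name):
    """Normalise a personal name into lowercase ASCII tokens."""
    if "," in name:
        # "Family, Given" -> "Given Family"
        family, given = name.split(",", 1)
        name = f"{given} {family}"
    ascii_name = (unicodedata.normalize("NFKD", name)
                  .encode("ascii", "ignore").decode())
    return [t.strip(".").lower() for t in ascii_name.split() if t.strip(".")]

def compatible(a, b):
    """Two given-name tokens match if equal or one is the other's initial."""
    return (a == b
            or (len(a) == 1 and b.startswith(a))
            or (len(b) == 1 and a.startswith(b)))

def same_person(a, b):
    ta, tb = name_tokens(a), name_tokens(b)
    if not ta or not tb or ta[-1] != tb[-1]:
        return False  # family names (last tokens) must match
    short, long_ = sorted((ta[:-1], tb[:-1]), key=len)
    return all(any(compatible(s, l) for l in long_) for s in short)
```

Note that this heuristic treats a bare family name as compatible with any fuller form, which will produce false positives on a large catalogue.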
B.10. The same citation
- Implementation of techniques to detect the occurrences of the same bibliographic citation in a set of citations. Benchmark study with data from FENIX and INESC-ID. NOTE: in case of success, the results of this project will be integrated in the FENIX system.
B.11. Meta-search engines (João Gonçalves - 52473 / Tiago Santos - 52475)
- Implementation of a meta-search engine. A meta-search engine is a search engine that, given a user query, uses other existing search-engines to obtain the results. It then combines the multiple lists of results retrieved by the search engines into a single ranking. A description of a meta-search engine can be found in the book "Modern Information Retrieval", chap. 13.6. The implementation should be done under the IR-BASE project framework: http://www.bcs.org/server.php?show=ConWebDoc.8762
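The step of combining several result lists into a single ranking can be sketched with a simple rank-fusion rule. The example below uses reciprocal rank fusion, where each result earns `1 / (k + rank)` from every engine that returned it. This is just one option among many and is not necessarily the method described in the book; the constant `k = 60` and the representation of results as plain URL strings are assumptions.

```python
from collections import defaultdict

def fuse(result_lists, k=60):
    """result_lists: one ranked list of URLs per underlying engine.
    Returns a single list ordered by fused score, best first."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first, ties broken alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], u))
```

Results returned by several engines near the top of their lists rise in the fused ranking, while a result seen by only one engine stays low.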
B.12. Clustering of web results (Marcelo Ferreira - 52412)
- Project: Implement a system that, given a query, uses an existing search engine to obtain the results and clusters the Web pages retrieved according to topic. The result should be something similar to what is shown in http://clusty.com/ . Already existing clustering applications can be used, such as Cluto ( http://glaros.dtc.umn.edu/gkhome/cluto/cluto/overview ). The implementation should be done under the IR-BASE project framework: http://www.bcs.org/server.php?show=ConWebDoc.8762
B.13. Relevance feedback (André Pereira / Diogo Galvão)
- Project: Implement a system that, given a query, uses an existing search engine to obtain the results. The results are then presented to the user, who can select which ones are relevant. The system will then use the user's selection to improve the initial query and re-submit it. Algorithms to be used are proposed in http://inex.is.informatik.uni-duisburg.de:2004/pdf/ker_ruthven_lalmas.pdf . The implementation should be done under the IR-BASE project framework: http://www.bcs.org/server.php?show=ConWebDoc.8762
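One classic way to use the user's selection to improve the query is Rocchio-style query modification, sketched below. It is offered as an illustration only, not as the algorithm the project must use: documents and queries are assumed to be already represented as term-to-weight dictionaries, and the default `alpha`, `beta` and `gamma` values are conventional choices rather than values taken from the referenced paper.

```python
from collections import Counter

def rocchio(query, relevant, non_relevant,
            alpha=1.0, beta=0.75, gamma=0.15):
    """query: dict term -> weight; relevant / non_relevant: lists of
    such dicts for the documents the user did / did not select."""
    new_query = Counter({t: alpha * w for t, w in query.items()})
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        for doc in docs:
            for term, weight in doc.items():
                # Add (or subtract) the centroid of each document set.
                new_query[term] += coeff * weight / len(docs)
    # Terms pushed to zero or below are dropped from the new query.
    return {t: w for t, w in new_query.items() if w > 0}
```

The highest-weighted terms of the returned dictionary can then be sent to the search engine as the expanded query.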
B.14. Document Server (Ricardo Cruz)
- Project: Develop a document server for projects, assuming a generic project management technique in which results are associated with milestones and the work is split into workpackages, which can in turn be split into sub-workpackages and finally into tasks. Issues such as descriptive metadata, document versioning and status (draft documents, approved documents, etc.) must be considered. A fundamental feature of the system must be a search service that searches the content of the documents and presents the references to the documents according to the project structure, the document status, and all the relevant properties. Open-source wiki software can be used, especially to support access control and the management user interface.
B.15. Evaluating B-ON
- Develop and carry out an evaluation of b-on. The work must comprise the identification of the relevant metrics, the execution of the evaluation tasks and the analysis of the results.
B.16. Detecting advertising (Vanda Ribeiro)
- Tag suggestion for the Digg service. The goal is to improve the system built by one of last year's final-year students, by finding a way to detect which parts of a page are advertisements and exclude them from the tag suggestion process, so as to improve the suggested tags.