MEIC-A: Bibliographic Metadata Harvesting to Support the Management of an Institutional Repository

2 November 2012, 18:03 - Fátima Sampaio

Dissertation: Bibliographic Metadata Harvesting to Support the Management of an Institutional Repository

Candidate: No. 62595 Ricardo Miguel Loureiro da Costa

Chair: Professor José Carlos Alves Pereira Monteiro

Advisor: Professor José Luís Brinquete Borbinha

Committee member: Professor Bruno Emanuel da Graça Martins

Date: 6 November 2012, 14:30 / Sala de Reuniões 2 (Meeting Room 2), Pavilhão de Informática II, IST, Alameda.

Abstract: This thesis addresses the problem of automatically harvesting bibliographic metadata records from several indexing services, in the context of populating institutional repositories. Since manually inserting records is a tedious and error-prone task, automating the process is intended to make a repository easier to manage. However, automated harvesting has to deal with the problem of identifying authors and with the need to consolidate duplicate records retrieved from different services. To automate this task, we introduce a system that harvests bibliographic metadata records from different publicly available information sources, identifies and consolidates the retrieved records that are considered duplicates, and makes the results of that consolidation available to interested external parties, such as an institutional repository.
The proposed system was tested with real bibliographic metadata corresponding to the scientific publications of a subset of faculty members at Instituto Superior Técnico. The evaluation results show that, despite the time required to identify and consolidate duplicates, the merged records contain a valid aggregation of all the information available in the system and can be accessed efficiently by external entities through a machine-to-machine interface.

Keywords: Bibliographic Metadata, Automated Harvesting, Institutional Repositories, Duplicate Consolidation.
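
The duplicate consolidation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the method used in the thesis: the matching rule (normalized title plus year), the merge policy (first non-empty value wins), the field names, the service names and the placeholder DOI are all assumptions made for illustration.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation so near-identical titles compare equal."""
    decomposed = unicodedata.normalize("NFKD", title)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", without_accents.lower()).strip()


def record_key(record):
    # Assumption: two records are duplicates when normalized title and year match.
    return (normalize_title(record["title"]), record.get("year"))


def consolidate(records):
    """Merge duplicate records, keeping the first non-empty value seen for each field."""
    merged = {}
    for record in records:
        target = merged.setdefault(record_key(record), {"sources": []})
        for field, value in record.items():
            if field == "source":
                # Remember every service that contributed to the merged record.
                if value not in target["sources"]:
                    target["sources"].append(value)
            elif value and not target.get(field):
                target[field] = value
    return list(merged.values())


# Hypothetical records, as if harvested from two different indexing services.
harvested = [
    {"source": "service_a", "title": "Metadata Harvesting: A Study", "year": 2012, "doi": ""},
    {"source": "service_b", "title": "metadata harvesting - a study", "year": 2012,
     "doi": "10.0000/example"},  # placeholder DOI, not a real identifier
    {"source": "service_a", "title": "Another Paper", "year": 2011},
]
result = consolidate(harvested)
```

In this sketch the first two records collapse into one merged record that lists both services as sources and takes the DOI from the second, while the third record stays separate. A real system would also need the author identification step named in the abstract, which this example does not attempt.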