Below is a list of resources related to the field of information retrieval, in particular software tools used in the development of information retrieval systems.

  • Books, other courses, conferences, and resources.
  • RankLib, a library of learning to rank algorithms.
  • Lucene, an open-source, mature and high-performance retrieval engine in Java.
  • Luke, a development and diagnostic tool for accessing and managing Lucene indexes.
  • LingPipe, a toolkit for processing text using techniques from computational linguistics.
  • Stanford Core NLP, a set of natural language analysis tools.
  • OBSearch, a general similarity search engine.
  • Managing Gigabytes for Java, a free full-text search engine for large document collections written in Java.
  • WebGraph, a framework to study Web graphs.
  • TREC eval, a program to evaluate IR system results using the standard TREC evaluation procedures. 
  • Weka, a general machine learning framework (see the tutorial on text classification).
  • Scikit-learn, a machine learning library in Python.
  • NLTK, a platform for building Python programs to work with human language data.
  • PyLucene, a Python extension for accessing Lucene's text indexing and searching capabilities.
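
To give a feel for the kind of measure an evaluation tool such as TREC eval reports, the following is a minimal sketch of mean average precision (MAP) in plain Python. It is not the tool's own implementation: the function names, the dictionary-based run and judgment formats, and the sample document identifiers are all assumptions made for illustration, and details such as tie handling and duplicate documents are ignored.

```python
def average_precision(ranked_doc_ids, relevant_doc_ids):
    """Average precision of one ranked list against a set of relevant documents."""
    relevant = set(relevant_doc_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    # Dividing by the number of relevant documents penalises relevant items never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(runs, qrels):
    """Mean of average precision over every query that has relevance judgments."""
    scores = [average_precision(runs.get(query, []), rel) for query, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: query id -> ranked results, and query id -> relevant documents.
runs = {"q1": ["d3", "d1", "d7"], "q2": ["d2", "d9"]}
qrels = {"q1": {"d1", "d7"}, "q2": {"d9"}}

if __name__ == "__main__":
    print(mean_average_precision(runs, qrels))
```

For real experiments, the dedicated evaluation program remains the better choice, since results produced with it are comparable with published ones.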