Efficient similarity search in high-dimensional spaces

23 November 2018, 08:00 Bruno Emanuel Da Graça Martins

  • Modeling IR/IE tasks as similarity search in high-dimensional spaces
  • The Jaccard similarity coefficient
  • Min-hashing and Locality Sensitive Hashing
    • Representing documents as sets of shingles
    • The min-hashing scheme for generating signatures for sets of shingles
    • Locality sensitive hashing for finding candidate pairs of similar instances
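
The pipeline outlined above (shingling, Jaccard similarity, min-hash signatures, LSH banding) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the lecture's own code: the function names, the choice of character 5-shingles, the use of CRC32 to map shingles to integers, the hash family (a*x + b) mod p with p = 2^61 - 1, and the band/row settings are all hypothetical choices made for the example.

```python
import random
import zlib
from collections import defaultdict
from itertools import combinations

# Assumption: a Mersenne prime used as the modulus of the hash family.
PRIME = (1 << 61) - 1


def shingles(text, k=5):
    """Return the set of character k-shingles (length-k substrings) of text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |A intersect B| / |A union B| of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_hash_params(n, seed=0):
    """Draw n (a, b) pairs defining hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]


def minhash_signature(shingle_set, params):
    """Min-hash signature: for each hash function, the minimum over the set.

    Assumes shingle_set is non-empty (min() of an empty sequence raises).
    """
    ids = [zlib.crc32(s.encode("utf-8")) for s in shingle_set]
    return [min((a * x + b) % PRIME for x in ids) for a, b in params]


def estimate_similarity(sig_a, sig_b):
    """Fraction of agreeing positions, an estimate of the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidate_pairs(signatures, bands, rows):
    """Split each signature into bands of `rows` values; documents that agree
    on every value of at least one band become a candidate pair."""
    buckets = defaultdict(set)
    for doc_id, sig in signatures.items():
        for band in range(bands):
            key = (band, tuple(sig[band * rows:(band + 1) * rows]))
            buckets[key].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


if __name__ == "__main__":
    docs = {
        "d1": "efficient similarity search in high-dimensional spaces",
        "d2": "efficient similarity search in high dimensional spaces",
        "d3": "a completely unrelated sentence about cooking pasta",
    }
    params = make_hash_params(100)
    sigs = {d: minhash_signature(shingles(t), params) for d, t in docs.items()}
    print(jaccard(shingles(docs["d1"]), shingles(docs["d2"])))
    print(estimate_similarity(sigs["d1"], sigs["d2"]))
    print(lsh_candidate_pairs(sigs, bands=20, rows=5))
```

With b bands of r rows each, two documents whose Jaccard similarity is s become a candidate pair with probability 1 - (1 - s^r)^b, so the band and row counts control where the similarity threshold falls. Candidate pairs would normally be verified afterwards with the exact Jaccard similarity.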