Databases vs. Information Retrieval

Information Retrieval is concerned with the representation, storage, organization of, and access to information items.

  • Focus on automatic processing (indexing, clustering, search) of unstructured data (text, images, audio, ...)

  • Some applications:

    • searching in a library catalog

    • categorizing a collection of articles by area

    • web search engines, etc.
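
As an illustration of the indexing and search steps mentioned above, here is a minimal sketch of an inverted index, a common data structure for text search. The function names `build_inverted_index` and `search`, the whitespace tokenization, and the sample documents are all assumptions made for this example, not something prescribed by these notes.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample collection, invented for illustration.
docs = {
    1: "introduction to information retrieval",
    2: "relational database systems",
    3: "web search engines and information access",
}
index = build_inverted_index(docs)
```

With this sample collection, `search(index, "information")` matches documents 1 and 3, while `search(index, "information retrieval")` matches only document 1. Real systems add steps such as stemming, stop-word removal, and ranking, which this sketch leaves out.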

The main difference between databases and IR is that databases focus on structured data, while IR focuses mainly on unstructured data ("documents") such as web pages, emails, and images.

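Also, databases are concerned with data retrieval, not information retrieval: a database query has precise semantics and returns exactly the records that satisfy it, whereas an IR system returns items ranked by their estimated relevance to a user's information need, which is often stated imprecisely.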
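
The contrast between data retrieval and information retrieval can be sketched in a few lines. This is only an illustration under stated assumptions: the `books` table, its rows, the sample documents, and the term-overlap score are all invented for this example, and real IR systems use far more refined relevance models.

```python
import sqlite3

# Hypothetical structured records, invented for illustration.
BOOKS = [("Book A", 2001), ("Book B", 2005), ("Book C", 2005)]

# Hypothetical unstructured documents, invented for illustration.
DOCUMENTS = [
    "cheap flights to rome",
    "rome travel guide",
    "cooking pasta at home",
]


def data_retrieval(year):
    """Exact-match query: a row either satisfies the condition or it does not."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
    conn.executemany("INSERT INTO books VALUES (?, ?)", BOOKS)
    rows = conn.execute(
        "SELECT title FROM books WHERE year = ? ORDER BY title", (year,)
    ).fetchall()
    conn.close()
    return [title for (title,) in rows]


def information_retrieval(query, documents):
    """Rank documents by the number of distinct query terms they contain."""
    terms = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(terms & set(doc.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc))
    # Highest overlap first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc for _, doc in scored]
```

Here `data_retrieval(2005)` returns exactly the two rows whose year is 2005 and nothing else, while `information_retrieval("rome travel", DOCUMENTS)` returns both Rome-related documents, with the better match listed first even though the other matches only partially.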

Additionally, IR systems are typically used directly by human users, whereas databases are designed as a foundation on which applications are built; those applications are then used by human users.
