Vector Databases

A vector database is a core component of most LLM (Large Language Model) applications. It is a dedicated database designed to store vectorized data, such as documents together with their embeddings, and to quickly retrieve the documents most relevant to a query by finding the stored embeddings that are most similar to the query's embedding.
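To make "most similar" concrete, here is a minimal sketch of cosine similarity, one common way to compare embeddings. The three-dimensional vectors are made-up illustrations (real embedding models produce far more dimensions), and an actual vector database may use a different metric or an approximate index.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, chosen by hand for illustration only.
query = [1.0, 0.0, 1.0]
doc_close = [0.9, 0.1, 0.8]  # points in roughly the same direction as the query
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the query

close_score = cosine_similarity(query, doc_close)
far_score = cosine_similarity(query, doc_far)
print(close_score > far_score)
```

A higher score means the two vectors point in a more similar direction, so the document behind `doc_close` would be ranked above the one behind `doc_far`.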

In simple terms, a vector database lets you store documents from a data source, accept queries from user input, and return the most relevant portions of those documents to the LLM.

A vector database works as follows:

  • Receive documents or text segments from the data loader.

  • Receive queries in text form (from user input or an LLM).

  • Embed the text segments using OpenAI text-embedding-ada-002 embeddings.

  • Query for the most relevant embeddings.

  • Return the most relevant document segments to be sent to the LLM or output node.
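The steps above can be sketched as a small in-memory store. This is an illustration under stated assumptions, not how any particular vector database is implemented: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model such as the OpenAI one, and `ToyVectorStore` is an invented name. A real system would call an embedding API and use an indexed similarity search instead of scanning every item.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model.

    Returns a sparse, L2-normalized bag-of-words vector as {token: weight}.
    """
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {token: c / norm for token, c in counts.items()}


class ToyVectorStore:
    """Stores (segment, embedding) pairs and ranks them against a query."""

    def __init__(self, embed):
        self.embed = embed
        self.items = []  # list of (segment text, sparse vector)

    def add(self, segments):
        # Receive text segments (e.g. from a data loader) and embed each one.
        for segment in segments:
            self.items.append((segment, self.embed(segment)))

    def query(self, text, k=1):
        # Embed the query, score every stored segment, return the top k.
        q = self.embed(text)
        scored = []
        for segment, vec in self.items:
            # Vectors are normalized, so the dot product is the cosine similarity.
            score = sum(weight * vec.get(token, 0.0) for token, weight in q.items())
            scored.append((score, segment))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [segment for _, segment in scored[:k]]


store = ToyVectorStore(toy_embed)
store.add([
    "Vector databases store embeddings for similarity search.",
    "Bananas are rich in potassium.",
    "The data loader splits documents into text segments.",
])

top = store.query("How does a vector database store embeddings?", k=1)
print(top)
```

The returned segments are what would be passed on to the LLM or output node. Because the toy embedding only matches exact words, it misses synonyms and word forms that a learned embedding model would typically capture.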
