A database optimized for storing and querying high-dimensional vector embeddings for semantic search.
A Vector Database stores data as high-dimensional vectors (arrays of numbers) rather than as rows and columns. These vectors are generated by embedding models that capture the semantic meaning of text, images, or other data. This enables similarity search: instead of matching exact keywords, you can find data that is semantically similar to your query. Similarity search is the backbone of modern AI applications such as retrieval-augmented generation (RAG), recommendation systems, and image search. Popular vector databases include Pinecone, Weaviate, Qdrant, and pgvector (a PostgreSQL extension).
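To make "semantically similar" concrete, here is a minimal sketch in plain Python. The 3-dimensional vectors are hypothetical toy stand-ins for real embeddings, which typically have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real models output far more dimensions).
query = [0.9, 0.1, 0.0]   # e.g. "puppy"
doc_a = [0.8, 0.2, 0.1]   # e.g. "dog" -- close in meaning
doc_b = [0.0, 0.1, 0.9]   # e.g. "stock market" -- unrelated

# The semantically closer document scores higher.
print(cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b))  # True
```

Keyword search would find nothing shared between "puppy" and "dog"; in embedding space they sit close together, which is what the similarity score captures.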
Raw data (text, images) is passed through an embedding model that outputs a fixed-size vector representation.
Vectors are stored with metadata and indexed using algorithms such as HNSW (Hierarchical Navigable Small World graphs) or IVF (Inverted File index) for fast approximate nearest-neighbor search.
A query vector is compared against stored vectors using a similarity or distance metric, such as cosine similarity or Euclidean distance.
The most similar vectors are returned with their associated metadata and original content.
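The four steps above can be sketched end to end. This is a deliberately toy illustration: `embed` is a hypothetical character-hashing stand-in for a real embedding model, and the store does exact brute-force search rather than the approximate HNSW/IVF indexing a real vector database would use at scale.

```python
import math

def embed(text):
    # Hypothetical stand-in for an embedding model: hashes characters into
    # a fixed-size vector. Real models produce semantically meaningful
    # vectors; this only illustrates the fixed-size-vector shape.
    vec = [0.0] * 8
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch) / 100.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit-length vector

class ToyVectorStore:
    """In-memory store with exact (brute-force) nearest-neighbor search.
    Real vector databases index with HNSW or IVF so search stays fast
    at millions of vectors, at the cost of being approximate."""

    def __init__(self):
        self.records = []  # (vector, metadata, original content)

    def add(self, text, metadata):
        self.records.append((embed(text), metadata, text))

    def query(self, text, top_k=2):
        q = embed(text)
        # On unit vectors, cosine similarity reduces to a dot product.
        scored = [(sum(a * b for a, b in zip(q, v)), meta, content)
                  for v, meta, content in self.records]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:top_k]  # (score, metadata, content) tuples

store = ToyVectorStore()
store.add("how to train a neural network", {"source": "ml-guide"})
store.add("baking sourdough bread at home", {"source": "cooking-blog"})
results = store.query("neural network training tips", top_k=1)
```

Because the stand-in embedding is not semantic, the ranking here demonstrates the mechanics (embed, store with metadata, score, return top-k with content), not meaningful similarity; swapping `embed` for a real model is what makes the results semantic.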
Search engines that understand intent rather than just keywords, returning contextually relevant results.
Storing document embeddings for retrieval-augmented generation in LLM applications.
Finding similar products, content, or users based on behavioral or content embeddings.
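The RAG use case above can be sketched at the prompt-assembly step. The retrieved chunks and the prompt template here are hypothetical placeholders, and it is assumed a vector-database retrieval step has already returned the most similar passages:

```python
# Hypothetical chunks returned by a similarity search over document embeddings.
retrieved_chunks = [
    "Vector databases index embeddings with HNSW for fast search.",
    "pgvector adds vector similarity search to PostgreSQL.",
]
question = "How do vector databases speed up similarity search?"

# Assemble the retrieved context into a grounding prompt for the LLM.
context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)
# `prompt` would then be sent to an LLM, which grounds its answer in the
# retrieved passages instead of relying only on its training data.
```

The value of the vector database in this loop is that retrieval is semantic: the chunks are fetched because they mean the same thing as the question, not because they share its exact keywords.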
Knowing the definition is step one. Building it into your product is step two. That's where we come in.