Vector search & embeddings

Searching by meaning rather than exact keywords, by comparing vectors in a high-dimensional space.

The idea

AI models convert text (or images) into embeddings: lists of numbers (vectors) that capture meaning. Inputs with similar meanings end up close together in this vector space, so "close" can be measured with an ordinary distance or similarity function.
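As a minimal sketch of what "close together" means, the snippet below compares hand-made 3-dimensional vectors using cosine similarity, one common closeness measure for embeddings. The vectors and their values are invented for illustration; a real embedding model would produce them, typically with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy "embeddings" (assumption: not the output of any real model).
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "car": [0.1, 0.2, 0.95],
}

sim_cat_kitten = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
sim_cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

print(f"cat vs kitten: {sim_cat_kitten:.3f}")
print(f"cat vs car:    {sim_cat_car:.3f}")
```

With these toy values, "cat" scores much closer to "kitten" than to "car", which is the property a search over embeddings relies on.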

To find the most relevant documents for a query, we convert the query into a vector and find its nearest neighbours. Calculating the distance to every document (brute-force KNN) gives exact results, but the cost grows linearly with the collection and becomes too slow for millions of items. Instead, we use Approximate Nearest Neighbour (ANN) algorithms such as HNSW (Hierarchical Navigable Small World), which navigate a graph to zoom in rapidly on the closest points, trading a small amount of accuracy for a large gain in speed.
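The contrast between the two approaches can be sketched in plain Python. The brute-force function below scans every point. The graph search is a deliberately simplified, single-layer greedy walk that only illustrates the navigation idea. It is not HNSW itself, which adds multiple layers and a candidate list to avoid getting stuck in local minima. All function names, the random 2D points, and the parameter `m` are assumptions made for this illustration.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_nearest(query, points):
    """Exact search: one distance computation per stored point, O(N)."""
    return min(range(len(points)), key=lambda i: distance(query, points[i]))


def build_neighbour_graph(points, m=5):
    """Link each point to its m closest points.

    Built by brute force here for simplicity; real indexes build the
    graph incrementally instead.
    """
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: distance(p, points[j]),
        )
        graph[i] = others[:m]
    return graph


def greedy_graph_search(query, points, graph, entry=0):
    """Approximate search: keep hopping to the closest neighbour until none is closer.

    Returns the index found and the number of distance computations used.
    """
    current = entry
    current_dist = distance(query, points[current])
    evaluations = 1
    while True:
        # Score every neighbour of the current node.
        scored = [(distance(query, points[n]), n) for n in graph[current]]
        evaluations += len(scored)
        best_dist, best = min(scored)
        if best_dist >= current_dist:
            # No neighbour is closer: a local minimum, possibly not the true nearest.
            return current, evaluations
        current, current_dist = best, best_dist


random.seed(0)
points = [[random.random(), random.random()] for _ in range(200)]
graph = build_neighbour_graph(points)
query = [0.5, 0.5]

exact = brute_force_nearest(query, points)
approx, evaluations = greedy_graph_search(query, points, graph)
print("exact index:", exact)
print("approximate index:", approx, "using", evaluations, "distance computations")
```

The greedy walk never ends further from the query than where it started, but it can stop at a point that is not the true nearest neighbour. That is the "approximate" in ANN, and reducing such misses is what the extra machinery in production algorithms is for.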

[Figure: 2D representation of vector space]

How it works (RAG retrieval)

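def search_rag(query_text, k=3):
    """Answer a query with retrieval-augmented generation (RAG).

    embedding_model, vector_db, db and llm are placeholders for whichever
    embedding model, vector index, document store and LLM client you use.
    """
    # 1. Convert the query text to a vector using an embedding model
    query_vector = embedding_model.encode(query_text)

    # 2. Use an ANN index (e.g. HNSW) to find the nearest neighbours fast:
    #    roughly O(log N) per query in practice, versus O(N) for brute force
    top_k_ids = vector_db.ann_search(query_vector, k=k)

    # 3. Fetch the actual text for those IDs
    context = db.get_documents(top_k_ids)

    # 4. Pass the retrieved context to an LLM along with the original query
    return llm.generate_answer(query_text, context)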
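To see the four steps run end to end without external services, here is a toy version in which every component is a hypothetical stand-in. The "embedding" is just a word count (so it matches keywords, not meaning), the index is an exact scan over three documents, and the "LLM" only assembles the prompt it would receive. None of these names correspond to a real library.

```python
import math
from collections import Counter

# Hypothetical document store (assumption: three hand-written documents).
DOCUMENTS = {
    1: "HNSW builds a layered graph for approximate nearest neighbour search",
    2: "Brute-force KNN compares the query with every stored vector",
    3: "Sourdough bread needs a long slow fermentation",
}


def embed(text):
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query_vector, k):
    """Exact scan standing in for an ANN index; fine for three documents."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc_id: cosine(query_vector, embed(DOCUMENTS[doc_id])),
        reverse=True,
    )
    return ranked[:k]


def generate_answer(query_text, context):
    """Placeholder for the LLM call: returns the prompt an LLM would be given."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query_text


def search_rag(query_text, k=2):
    query_vector = embed(query_text)             # 1. embed the query
    top_k_ids = search(query_vector, k)          # 2. nearest neighbours
    context = [DOCUMENTS[i] for i in top_k_ids]  # 3. fetch the text
    return top_k_ids, generate_answer(query_text, context)  # 4. build the LLM input


ids, prompt = search_rag("how does approximate nearest neighbour search work")
print(ids)
print(prompt)
```

Swapping `embed` for a real embedding model and `search` for a real ANN index would leave the shape of `search_rag` unchanged, which is the point of separating the four steps.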