Retrieval Recipes

HyDE Retrieval Pattern

HyDE — Hypothetical Document Embeddings — closes the semantic gap between short user questions and long source passages by first asking the model to imagine the answer, then embedding that imagined answer instead of the raw question. On Meridian, the pattern is a two-call recipe: one cheap chat completion followed by one embeddings call, both billed at the same 20% markup as every other route.

1. Generate a hypothetical passage

Send the user query to a fast chat model with a system prompt that says “write a passage that answers the question.” Do not constrain length — the goal is a plausible answer-shaped string, not a correct one. Hallucinations here are fine because the draft is never shown to the user.

2. Embed the hypothetical, not the query

The draft passage is written in the same style and vocabulary as the real source passages, so its embedding tends to land closer to theirs: cosine similarity against relevant passages is usually higher for the draft than for the bare question. Meridian recommends azure/text-embedding-3-large for English corpora and the multilingual variant otherwise.
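
To make the similarity comparison concrete, here is a minimal sketch of cosine similarity over plain number arrays. The three vectors are made-up toy values standing in for real embeddings, and `cosineSimilarity` is a helper written for this illustration, not part of any SDK.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Cosine is undefined for a zero vector; this sketch treats it as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Toy 3-dimensional vectors (invented values, not real embeddings).
const sourcePassage = [0.9, 0.1, 0.3];
const bareQuestion = [0.2, 0.9, 0.1];
const hypotheticalDraft = [0.8, 0.2, 0.4];

console.log("question vs passage:", cosineSimilarity(bareQuestion, sourcePassage));
console.log("draft vs passage:", cosineSimilarity(hypotheticalDraft, sourcePassage));
```

In practice the vector store computes this for you; the helper is mainly useful for spot-checking a few queries by hand to see whether the draft really scores closer than the question on your corpus.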

3. Retrieve and rerank

Search the vector store with the hypothetical embedding, then pass the top-k chunks through a cross-encoder rerank step before handing them to the final answering model. HyDE typically lifts recall@10 by 15–30% on sparse FAQ corpora and is cheap enough to run on every query.
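
The rerank pass can be sketched as a function that rescores each retrieved chunk against the query and keeps the best few. Everything named here is an assumption made for illustration: the `Chunk` shape, the `PairScorer` signature, and `wordOverlapScorer`, which is only a stand-in so the sketch runs locally and is not a cross-encoder. In a real pipeline the scorer would call whatever rerank model you use.

```typescript
// Hypothetical chunk shape; adapt to what your vector store returns.
interface Chunk {
  id: string;
  text: string;
}

// Hypothetical scorer signature: a relevance score for one (query, passage) pair.
type PairScorer = (query: string, passage: string) => Promise<number>;

// Score every chunk against the query, sort descending, keep the top `keep`.
async function rerank(
  query: string,
  chunks: Chunk[],
  score: PairScorer,
  keep: number,
): Promise<Chunk[]> {
  const scored = await Promise.all(
    chunks.map(async (chunk) => ({ chunk, score: await score(query, chunk.text) })),
  );
  scored.sort((a, b) => b.score - a.score);
  return scored.slice(0, keep).map((entry) => entry.chunk);
}

// Stand-in scorer for local testing only: the fraction of query words
// that appear in the passage. Replace with a real cross-encoder call.
const wordOverlapScorer: PairScorer = async (query, passage) => {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const text = passage.toLowerCase();
  return words.filter((word) => text.includes(word)).length / words.length;
};
```

This sketch scores chunks against the original user question rather than the hypothetical draft. That is a design choice: the draft may contain invented details, so the question is the safer reference for judging relevance.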

// HyDE: Hypothetical Document Embeddings
import { meridian } from "@meridian/sdk";

async function hydeRetrieve(query: string) {
  // Step 1: generate a hypothetical answer
  const hypothetical = await meridian.chat.completions.create({
    model: "azure/gpt-4.1",
    messages: [
      { role: "system", content: "Write a passage that answers the question." },
      { role: "user", content: query },
    ],
  });

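  // Fall back to the raw query if the model returns no text
  // (assumes `content` can be null, as in OpenAI-style responses).
  const draft = hypothetical.choices[0].message.content ?? query;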

  // Step 2: embed the hypothetical, not the query
  const embedding = await meridian.embeddings.create({
    model: "azure/text-embedding-3-large",
    input: draft,
  });

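  // Step 3: search the vector store with that embedding.
  // `vectorStore` is your own vector store client, assumed to be defined elsewhere;
  // the rerank pass from step 3 of the text is not shown here.
  return await vectorStore.search(embedding.data[0].embedding, { topK: 8 });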
}
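
The recall lift quoted for step 3 will vary by corpus, so it is worth measuring on your own data. A minimal sketch of recall@k follows, assuming you have a small labeled set that maps each query to the IDs of its relevant chunks. The `recallAtK` helper and the sample IDs are hypothetical and exist only for this illustration.

```typescript
// recall@k: the share of known-relevant chunk IDs that appear
// in the first k retrieved IDs.
function recallAtK(retrievedIds: string[], relevantIds: string[], k: number): number {
  if (relevantIds.length === 0) return 0;
  const topK = new Set(retrievedIds.slice(0, k));
  const hits = relevantIds.filter((id) => topK.has(id)).length;
  return hits / relevantIds.length;
}

// Made-up example: the same query retrieved two ways.
const relevant = ["faq-12", "faq-40"];
const baselineIds = ["faq-03", "faq-12", "faq-77"]; // embedding the raw query
const hydeIds = ["faq-12", "faq-40", "faq-03"]; // embedding the hypothetical draft

console.log("baseline recall@3:", recallAtK(baselineIds, relevant, 3));
console.log("HyDE recall@3:", recallAtK(hydeIds, relevant, 3));
```

Averaging this value over the labeled queries, once with the raw query embedding and once with the HyDE embedding, gives a like-for-like comparison before committing to the extra chat call on every request.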