Use case: embeddings & feedback tuning

The vector leg of hybrid search is only as good as its embeddings. little big brain computes embeddings for you — you pick a model per graph, and the service embeds your data and your queries with it. This guide covers configuring managed embeddings and closing the loop with graded feedback so retrieval improves on your data.

Configure a per-graph embedding model. Once set, queries auto-embed with the same model at query time, ingest can embed on write, and you can backfill existing entities.

  • Pick a model per graph (e.g. BAAI/bge-small-en-v1.5, all-MiniLM-L6-v2, e5-small-v2, Qwen3-Embedding-0.6B). The console Models → Embeddings view lists the available models and their dimensions.
  • Backfill existing entities to populate vectors for data written before you enabled the model.
  • Enable embed-on-ingest so new facts get vectors automatically.
  • Promote a freshly built run so search starts using it.

Confirm the vector leg is using the model you configured — add explain=true to a search and check explain.vector_model_id:

curl "https://db.eu.littlebigbrain.com/v1/search?graph=<g>&explain=true&query=..." \
  -H "Authorization: Bearer $LBB_API_KEY"
# explain.vector_model_id should equal the model you set, e.g. "BAAI/bge-small-en-v1.5"

If explain.vector_model_id isn’t the model you expect, check your graph’s embedding configuration in Retrieval → Embeddings.
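
The same check can be scripted, for example as a smoke test after changing a graph's model. The sketch below is a minimal illustration, not the SDK's API: the endpoint, the explain=true parameter, and the explain.vector_model_id field come from this guide, while the helper names (buildExplainUrl, usesExpectedModel, checkVectorModel) and the response typing are assumptions made for the example.

```typescript
// Assumed response shape: only explain.vector_model_id is documented in this guide.
interface ExplainedSearchResponse {
  explain?: { vector_model_id?: string };
}

// Build the search URL with explain=true (hypothetical helper).
function buildExplainUrl(graph: string, query: string): string {
  const url = new URL("https://db.eu.littlebigbrain.com/v1/search");
  url.searchParams.set("graph", graph);
  url.searchParams.set("explain", "true");
  url.searchParams.set("query", query);
  return url.toString();
}

// True when the vector leg reports the model we expect (hypothetical helper).
function usesExpectedModel(body: ExplainedSearchResponse, expected: string): boolean {
  return body.explain?.vector_model_id === expected;
}

// Run one explained search and compare the reported model id.
async function checkVectorModel(
  graph: string,
  query: string,
  expected: string,
  apiKey: string,
): Promise<boolean> {
  const res = await fetch(buildExplainUrl(graph, query), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`search failed with HTTP ${res.status}`);
  const body = (await res.json()) as ExplainedSearchResponse;
  return usesExpectedModel(body, expected);
}
```

Keeping the comparison in a separate pure function makes it easy to unit-test without network access.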

Retrieval quality is domain-specific. little big brain captures graded relevance feedback and exports it as qrels (query-relevance judgments) that you can use to fine-tune the embedding model:

  1. Grade results. Call searchFeedback (TS) / search_feedback (Python) to grade a result for a query: 3 = relevant, 1 = partial, 0 = irrelevant. The console Models → Feedback and Labeled data views provide a UI for this, including judging against public sets like BEIR · SciFact or MS MARCO · dev.
  2. Export qrels. searchFeedbackExport / search_feedback_export emits the accumulated judgments as customer qrels (exportable as JSONL from the console).
  3. Fine-tune. Feed the qrels into embedding fine-tuning; the resulting model becomes a managed model you can activate per graph. The console Models → Training view drives this.

// Grade a result during or after a search session.
await lbb.searchFeedback({
  query: "which systems store customer identity data",
  target: { entity_type: "DATABASE", name: "user-db" },
  grade: 3,
});

// Later: export the accumulated qrels.
const qrels = await lbb.searchFeedbackExport();

This is the same loop an agent can drive automatically — see Close the loop in the agent-memory guide.
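
Before feeding exported judgments into fine-tuning, it can help to load them and group them per query, for instance to spot queries with very few grades. The following is a rough sketch only: this guide does not document the fields of the exported JSONL, so the record shape (query, target, grade) and both helper names are hypothetical and should be adapted to the actual export.

```typescript
// Hypothetical record shape: the real export fields are not documented in this guide.
interface QrelRecord {
  query: string;
  target: string; // assumed identifier of the judged result
  grade: number;  // 3 = relevant, 1 = partial, 0 = irrelevant
}

// Parse JSONL text (one JSON object per line), skipping blank lines.
function parseQrelsJsonl(jsonl: string): QrelRecord[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as QrelRecord);
}

// Group judgments as query -> (target -> grade); a later grade overwrites an earlier one.
function groupByQuery(records: QrelRecord[]): Map<string, Map<string, number>> {
  const byQuery = new Map<string, Map<string, number>>();
  for (const record of records) {
    const grades = byQuery.get(record.query) ?? new Map<string, number>();
    grades.set(record.target, record.grade);
    byQuery.set(record.query, grades);
  }
  return byQuery;
}
```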

Track recall/nDCG as you change models or tune. The console Search & eval view runs accuracy/latency checks so you can compare configurations, and grading against a public set (BEIR · SciFact, MS MARCO · dev) in Models → Feedback gives you a comparable baseline.

On BEIR · SciFact, the managed bge-small path reaches recall@10 ≈ 0.84 / nDCG ≈ 0.71 — a realistic baseline before any domain fine-tuning.
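
For intuition about what these numbers measure, recall@k and nDCG@k can be computed from graded judgments and a ranked result list roughly as follows. This is a generic sketch, not the console's implementation: the function names are made up for the example, any grade above 0 is assumed to count as relevant for recall, and the DCG uses linear gain (grade / log2(rank + 1)), which may differ from the variant the service reports.

```typescript
// grades: result id -> graded relevance (0 = irrelevant, higher = more relevant).

// Fraction of all relevant results (grade > 0) that appear in the top k.
function recallAtK(ranked: string[], grades: Map<string, number>, k: number): number {
  const relevant = Array.from(grades.entries())
    .filter(([, grade]) => grade > 0)
    .map(([id]) => id);
  if (relevant.length === 0) return 0;
  const topK = new Set(ranked.slice(0, k));
  const hits = relevant.filter((id) => topK.has(id)).length;
  return hits / relevant.length;
}

// Discounted cumulative gain with linear gain; index 0 corresponds to rank 1.
function dcg(gains: number[]): number {
  return gains.reduce((sum, gain, i) => sum + gain / Math.log2(i + 2), 0);
}

// DCG of the actual ranking divided by the DCG of the ideal ranking, both cut at k.
function ndcgAtK(ranked: string[], grades: Map<string, number>, k: number): number {
  const gains = ranked.slice(0, k).map((id) => grades.get(id) ?? 0);
  const idealGains = Array.from(grades.values())
    .sort((a, b) => b - a)
    .slice(0, k);
  const idealDcg = dcg(idealGains);
  return idealDcg === 0 ? 0 : dcg(gains) / idealDcg;
}
```

Averaging these per-query values over all graded queries gives a single figure to compare across models or configurations.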