Use case: ingest → index → search
This is the canonical little big brain pipeline: take some domain knowledge, write it as typed facts, build search indexes, and answer natural-language questions over it. We’ll model a small “systems and data” knowledge graph — the kind of thing you’d feed an agent that answers questions about your infrastructure.
1. Model the domain. Decide your entity types and relations. Here: `SERVICE`, `DATABASE`, and `DATASET` entities, connected by `WRITES_TO`, `READS_FROM`, and `STORES` relations. On the hosted product the default ontology already covers generic AI-context vocabulary; for a bespoke schema, define a graph ontology first (see SPARQL & SHACL or the MCP `define_ontology`).
2. Write facts as triplets. Each triplet carries a `confidence` and an `evidence` string, so the provenance stays attached to the assertion.

   ```ts
   import { LbbClient } from "@lbb/client";

   const lbb = new LbbClient({
     baseUrl: "https://db.eu.littlebigbrain.com",
     apiKey: process.env.LBB_API_KEY,
   });

   await lbb.graph("main").facts.create(
     {
       triplets: [
         { source: { type: "SERVICE", name: "auth-service" }, relation: "WRITES_TO",
           target: { type: "DATABASE", name: "user-db" },
           confidence: 0.93, evidence: "auth-service writes identity records to user-db" },
         { source: { type: "SERVICE", name: "billing-service" }, relation: "READS_FROM",
           target: { type: "DATABASE", name: "user-db" },
           confidence: 0.9, evidence: "billing reads customer identity for invoicing" },
         { source: { type: "DATABASE", name: "user-db" }, relation: "STORES",
           target: { type: "DATASET", name: "customer identity data" },
           confidence: 0.97, evidence: "user-db is the system of record for customer identity" },
       ],
     },
     { idempotencyKey: "systems-import-v1" },
   );
   ```

   ```python
   import os

   from lbb import LbbClient

   lbb = LbbClient("https://db.eu.littlebigbrain.com", api_key=os.environ["LBB_API_KEY"])

   lbb.graph("main").facts.create({
       "triplets": [
           {"source": {"type": "SERVICE", "name": "auth-service"}, "relation": "WRITES_TO",
            "target": {"type": "DATABASE", "name": "user-db"},
            "confidence": 0.93, "evidence": "auth-service writes identity records to user-db"},
           {"source": {"type": "SERVICE", "name": "billing-service"}, "relation": "READS_FROM",
            "target": {"type": "DATABASE", "name": "user-db"},
            "confidence": 0.9, "evidence": "billing reads customer identity for invoicing"},
           {"source": {"type": "DATABASE", "name": "user-db"}, "relation": "STORES",
            "target": {"type": "DATASET", "name": "customer identity data"},
            "confidence": 0.97, "evidence": "user-db is the system of record for customer identity"},
       ],
   }, idempotency_key="systems-import-v1")
   ```

   For large loads (thousands+ of facts), use bulk NDJSON ingest: see Hybrid retrieval and the `POST /v1/graph/import` endpoint in the HTTP API.
3. Build indexes. BM25, vector/ANN, and adjacency runs are derived from the snapshot. `wait: true` blocks until they’re ready.

   ```ts
   await lbb.indexes.run({ wait: true });
   ```

   ```python
   lbb.indexes.run(wait=True)
   ```
4. Ask a question. Hybrid search fuses lexical + BM25 + vector + ontology + graph-neighborhood signals. The query never has to match keywords exactly: “customer identity” surfaces `user-db` and the assertions that connect it.

   ```ts
   const results = await lbb.search.hybrid(
     "which systems store customer identity data",
     { topK: 5, source: "persisted", consistency: "strong", targets: ["entities", "assertions"] },
   );

   for (const a of results.assertions ?? []) {
     console.log(a.relation?.name, "→", a.target?.name, a.score);
   }
   ```

   ```python
   results = lbb.search.hybrid(
       "which systems store customer identity data",
       top_k=5, source="persisted", consistency="strong",
       targets=["entities", "assertions"],
   )

   for a in results.get("assertions", []):
       print(a["relation"]["name"], "→", a["target"]["name"], a["score"])
   ```
5. Follow the graph. Once search finds an anchor entity, traverse its neighborhood or ask why an assertion exists. The evidence you wrote in step 2 comes back as provenance.

   ```ts
   await lbb.traverse({ entity: { entity_type: "DATABASE", name: "user-db" }, relation: "READS_FROM", direction: "in" });
   await lbb.why({ /* the assertion you want to justify */ });
   ```

   ```python
   lbb.traverse({"entity": {"entity_type": "DATABASE", "name": "user-db"}, "relation": "READS_FROM", "direction": "in"})
   lbb.why({ ... })  # pass the assertion you want to justify
   ```
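For the bulk path mentioned in step 2, the triplets have to be serialized as NDJSON, meaning one JSON object per line. The sketch below does that with the standard library only. It assumes each line carries one triplet in the same shape that `facts.create` accepts above; the line schema that `POST /v1/graph/import` actually expects is not shown on this page, so check the HTTP API reference before relying on it. `to_ndjson` is a hypothetical helper name.

```python
import json


def to_ndjson(triplets):
    """Serialize triplets as NDJSON: one compact JSON object per line."""
    # Assumption: one triplet per line, same shape as the facts.create payload.
    # ensure_ascii=False keeps non-ASCII evidence text readable in the payload.
    return "".join(
        json.dumps(t, ensure_ascii=False, separators=(",", ":")) + "\n"
        for t in triplets
    )


triplets = [
    {
        "source": {"type": "SERVICE", "name": "auth-service"},
        "relation": "WRITES_TO",
        "target": {"type": "DATABASE", "name": "user-db"},
        "confidence": 0.93,
        "evidence": "auth-service writes identity records to user-db",
    },
    {
        "source": {"type": "DATABASE", "name": "user-db"},
        "relation": "STORES",
        "target": {"type": "DATASET", "name": "customer identity data"},
        "confidence": 0.97,
        "evidence": "user-db is the system of record for customer identity",
    },
]

payload = to_ndjson(triplets)
```

Because every line is a complete JSON document, a large export can be written and uploaded in chunks without holding the whole set in memory.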
Why this shape
- Facts are append-only. Re-running the import with the same idempotency key
  is a no-op; new evidence on an existing edge is recorded, not overwritten. You
  get a temporal record for free: read `history` to see how a relationship changed.
- Indexes are disposable. If you change your embedding model or tokenizer, rebuild: the graph is the source of truth, and indexes are derived.
- Consistency is explicit. `strong` overlays any facts committed after the last index run, so search never lags a write you just made.
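To make the append-only and idempotency behavior concrete, here is a toy in-memory model. It is not the product's storage engine: `FactLog`, `commit`, and `history` are hypothetical names, the second evidence string is invented for the example, and the code only sketches the semantics described in the first bullet.

```python
from dataclasses import dataclass, field


@dataclass
class FactLog:
    """Toy append-only fact log. Illustrative only."""

    entries: list = field(default_factory=list)  # every committed fact, in commit order
    seen_keys: set = field(default_factory=set)  # idempotency keys already applied

    def commit(self, triplets, idempotency_key):
        """Append triplets unless this key was already applied. Returns the count added."""
        if idempotency_key in self.seen_keys:
            return 0  # replayed import: nothing is written
        self.seen_keys.add(idempotency_key)
        self.entries.extend(triplets)
        return len(triplets)

    def history(self, source, relation, target):
        """Return every recorded assertion for one edge, oldest first."""
        return [
            t for t in self.entries
            if (t["source"], t["relation"], t["target"]) == (source, relation, target)
        ]


log = FactLog()
fact = {
    "source": "auth-service", "relation": "WRITES_TO", "target": "user-db",
    "confidence": 0.93, "evidence": "auth-service writes identity records to user-db",
}

first = log.commit([fact], "systems-import-v1")
replay = log.commit([fact], "systems-import-v1")  # same key, so this is a no-op

# New evidence on the same edge is appended under a new key, not overwritten.
newer = dict(fact, confidence=0.99, evidence="confirmed in schema review")
log.commit([newer], "systems-import-v2")

edge_history = log.history("auth-service", "WRITES_TO", "user-db")
```

After these calls `edge_history` holds both assertions in commit order, which is the temporal record the bullet refers to.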
Keeping it fresh
As new facts arrive, commit them and either let the hosted product rebuild base
indexes automatically or call `indexes.delta` / `indexes.run` yourself. Use
`indexes.gc` to retire superseded runs. For a scheduled sync, a small script
that commits a day’s facts and refreshes indexes is enough; this is exactly how
the project dogfoods its own product-development graph.
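The scheduled sync described above can be sketched as a single function. It uses only calls shown earlier on this page (`graph(...).facts.create` with an idempotency key, and `indexes.run(wait=True)`). The function name `sync_day`, the key format, and passing the client in as a parameter are assumptions made for illustration.

```python
import datetime


def sync_day(lbb, triplets, day, graph="main"):
    """Commit one day's facts, then refresh indexes. Returns the idempotency key used."""
    # Hypothetical key format: one key per calendar day, so a retried job
    # for the same day does not commit its facts twice.
    key = f"daily-sync-{day.isoformat()}"
    if triplets:
        lbb.graph(graph).facts.create({"triplets": triplets}, idempotency_key=key)
        # Rebuild derived indexes and block until they are ready.
        lbb.indexes.run(wait=True)
    return key
```

A cron job or CI schedule could call `sync_day(lbb, todays_triplets, datetime.date.today())`, where `todays_triplets` is whatever your own extraction step produced for that day.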
- Hybrid retrieval, filters & facets — narrow and bucket results.
- SPARQL, structured query & SHACL — analytical and graph-shaped questions.
- Agent long-term memory — the same pipeline driven by an AI agent through MCP.