
Use case: SPARQL, structured query & SHACL

When you need exact answers — counts, aggregates, path patterns, validation — reach past hybrid search for the query surfaces. Three complementary APIs sit over the same object-storage graph, all snapshot-consistent with search:

  • Structured query — a typed SPARQL-subset SELECT/aggregate you build as JSON. Best from application code.
  • SPARQL 1.1 text — the conformant engine; also a native /sparql Protocol endpoint for external tools.
  • SHACL — validation, shape selection, property paths, and rule-based inference.

The structured surface is a SELECT/ASK over a basic graph pattern with typed filters, GROUP BY, aggregates, HAVING, and ORDER BY. It’s the ergonomic choice when you’re assembling a query in code and want typed attribute predicates without writing RDF IRIs.

“Commits per area per month” — one server-side query, not N fetches bucketed by hand:

{
  "patterns": [
    { "subject": { "var": "c" }, "predicate": "committed_to", "object": { "var": "repo" } }
  ],
  "group_keys": [
    { "date_bucket": { "var": "c", "field": "committed_at", "granularity": "month", "as": "m" } },
    { "property": { "var": "c", "field": "area", "as": "area" } }
  ],
  "aggregates": [{ "func": "count", "as": "n" }],
  "order_by": [{ "var": "m" }]
}
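To make the grouping concrete, here is a sketch of what the query asks the server to compute, written over an in-memory array. It illustrates the semantics only and is not how the engine runs. The `CommitRecord` type, the sample data, and the `YYYY-MM` bucket label are assumptions made for this example.

```typescript
// Hypothetical in-memory stand-in for commit entities; field names mirror the query.
interface CommitRecord {
  committed_at: string; // ISO 8601 datetime
  area: string;
}

interface CommitBucket {
  m: string; // month bucket, assumed here to look like "2024-03"
  area: string;
  n: number; // count aggregate
}

function commitsPerAreaPerMonth(commits: CommitRecord[]): CommitBucket[] {
  const counts = new Map<string, CommitBucket>();
  for (const c of commits) {
    const m = c.committed_at.slice(0, 7); // "YYYY-MM" prefix of the ISO datetime
    const key = `${m}|${c.area}`;
    const bucket = counts.get(key) ?? { m, area: c.area, n: 0 };
    bucket.n += 1;
    counts.set(key, bucket);
  }
  // Equivalent of order_by: [{ var: "m" }]
  return Array.from(counts.values()).sort((a, b) => a.m.localeCompare(b.m));
}

const sampleCommits: CommitRecord[] = [
  { committed_at: "2024-03-02T10:00:00Z", area: "api" },
  { committed_at: "2024-03-15T09:30:00Z", area: "api" },
  { committed_at: "2024-03-20T14:00:00Z", area: "docs" },
  { committed_at: "2024-04-01T08:00:00Z", area: "api" },
];
const commitBuckets = commitsPerAreaPerMonth(sampleCommits);
```

The structured query moves this loop to the server, so the client receives one row per (month, area) pair instead of every commit.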

GROUP BY keys can be entity identity, a typed scalar attribute ({ property: { var, field, as } }), or a calendar bucket of a datetime attribute ({ date_bucket: … }). Those same attribute fields work in filters and as SUM/AVG/MIN/MAX operands. A filters/having entry has the shape { compare: { op: eq|ne|lt|le|gt|ge, left, right } } (or and/or/not), where each operand is { var }, { property: { var, field } }, or a typed { value: { str | i64 | f64 | bool | date_time | entity } }.
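The operand and comparison shapes described above can be written down as TypeScript types, which lets the compiler catch a malformed filter before it is sent. This is a sketch, not the SDK's own typings: the type names and the `compare`/`prop` helpers are hypothetical, and the assumption that `and`/`or` take arrays while `not` takes a single expression is not confirmed by the text.

```typescript
// Typed literal values, one key per value kind.
type TypedValue =
  | { str: string }
  | { i64: number }
  | { f64: number }
  | { bool: boolean }
  | { date_time: string }
  | { entity: string };

type Operand =
  | { var: string }
  | { property: { var: string; field: string } }
  | { value: TypedValue };

type CompareOp = "eq" | "ne" | "lt" | "le" | "gt" | "ge";

// Assumption: and/or take a list of expressions, not takes one.
type FilterExpr =
  | { compare: { op: CompareOp; left: Operand; right: Operand } }
  | { and: FilterExpr[] }
  | { or: FilterExpr[] }
  | { not: FilterExpr };

// Hypothetical helpers to keep call sites short.
const compare = (op: CompareOp, left: Operand, right: Operand): FilterExpr => ({
  compare: { op, left, right },
});
const prop = (v: string, field: string): Operand => ({ property: { var: v, field } });

// filters: commits in the "api" area since the start of 2024.
const filters: FilterExpr[] = [
  {
    and: [
      compare("eq", prop("c", "area"), { value: { str: "api" } }),
      compare("ge", prop("c", "committed_at"), { value: { date_time: "2024-01-01T00:00:00Z" } }),
    ],
  },
];

// having: keep only buckets with at least 5 commits ("n" is the aggregate alias).
const having: FilterExpr[] = [compare("ge", { var: "n" }, { value: { i64: 5 } })];
```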

Discover the queryable attribute field names with an ontology view (property_defs) or a schema read. Attribute pattern predicates are case-insensitive here; add combinators (union/optional/minus/exists/not_exists) for group-graph-pattern legs.

// Typed attribute filters without RDF IRIs.
await lbb.entities.filterByAttributes({
  patterns: [{ subject: { var: "service" }, predicate: "WRITES_TO", object: { var: "db" } }],
  where: [
    { field: "slo", op: "ge", value: 0.99 },
    { var: "db", field: "tier", value: "prod" },
  ],
  select: ["service"],
});

// Full structured SELECT/aggregate.
await lbb.sparql(/* SparqlSelectRequest body */);

Run conformant SPARQL SELECT/ASK and get parsed rows back — no zipping head.vars with binding values:

const { vars, rows } = await lbb.sparqlRows({
  query: `SELECT ?service ?db WHERE {
    ?service <https://littlebigbrain.com/r/writes_to> ?db
  } LIMIT 10`,
  reason: true, // fold in rule-derived edges
});
for (const row of rows) console.log(row.service, "->", row.db);

const exists = (await lbb.sparqlRows({ query: "ASK { ?s ?p ?o }" })).boolean;
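For comparison, this is roughly the zipping you would write yourself against a raw response in the standard SPARQL 1.1 Query Results JSON format. It is a sketch of the manual work that parsed rows save you, not a description of how `sparqlRows` is implemented. The `zipBindings` name and the sample response are made up for the example.

```typescript
// W3C SPARQL 1.1 Query Results JSON shapes, reduced to what is needed here.
interface SparqlTerm {
  type: "uri" | "literal" | "bnode";
  value: string;
  datatype?: string;
  "xml:lang"?: string;
}
interface SparqlJsonResults {
  head: { vars?: string[] };
  results?: { bindings: Array<Record<string, SparqlTerm>> };
  boolean?: boolean; // present for ASK queries
}

type BindingRow = Record<string, string | undefined>;

function zipBindings(res: SparqlJsonResults): { vars: string[]; rows: BindingRow[] } {
  const vars = res.head.vars ?? [];
  const bindings = res.results?.bindings ?? [];
  const rows = bindings.map((b) => {
    const row: BindingRow = {};
    for (const v of vars) row[v] = b[v]?.value; // unbound variables stay undefined
    return row;
  });
  return { vars, rows };
}

// Sample response with one fully bound row and one row where ?db is unbound.
const rawResponse: SparqlJsonResults = {
  head: { vars: ["service", "db"] },
  results: {
    bindings: [
      {
        service: { type: "uri", value: "https://example.org/svc/checkout" },
        db: { type: "uri", value: "https://example.org/db/orders" },
      },
      { service: { type: "uri", value: "https://example.org/svc/search" } },
    ],
  },
};
const zipped = zipBindings(rawResponse);
```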

Engine extensions are options/keyword args: reason folds rule-derived edges, entailment: "none" disables the default rdfs:subClassOf closure, and as_of_valid_time / as_of_commit_seq pin a snapshot for time-travel queries.
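The effect of the rdfs:subClassOf closure is easiest to see on a small class hierarchy. The sketch below is an illustration of the idea only: the class names, the `subClassOf` map, and the `useClosure` flag are hypothetical, and the engine's real implementation is not shown in this guide.

```typescript
// Direct subclass edges: class -> its direct superclasses. Names are made up.
const subClassOf: Record<string, string[]> = {
  PostgresCluster: ["Database"],
  Database: ["DataStore"],
  DataStore: [],
};

// All superclasses reachable from `cls`, including `cls` itself.
function superClassClosure(cls: string, edges: Record<string, string[]>): Set<string> {
  const seen = new Set<string>([cls]);
  const stack = [cls];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    for (const parent of edges[current] ?? []) {
      if (!seen.has(parent)) {
        seen.add(parent);
        stack.push(parent);
      }
    }
  }
  return seen;
}

// Does an entity asserted as `assertedType` match a query for `queriedType`?
// With the closure off, only the exact asserted type matches.
function matchesType(assertedType: string, queriedType: string, useClosure: boolean): boolean {
  if (!useClosure) return assertedType === queriedType;
  return superClassClosure(assertedType, subClassOf).has(queriedType);
}
```

With the closure on, a query for `DataStore` also returns an entity typed `PostgresCluster`; with it off, the same query returns only entities typed `DataStore` directly.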

A standalone stack serves the native SPARQL 1.1 Protocol at /sparql so YASGUI, Protégé, and RDFLib’s SPARQLWrapper connect directly — GET ?query=, POST form or application/sparql-query body, Accept-negotiated JSON/XML/CSV/TSV. See the HTTP API.
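If you are calling the endpoint from your own code rather than from one of those tools, the two most common request forms from the SPARQL 1.1 Protocol look like this. The sketch only builds request descriptions and does no network I/O. The helper names are hypothetical, and the base URL in the usage note is a placeholder for wherever your stack is listening.

```typescript
interface SparqlHttpSpec {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Standard media types for SPARQL result formats.
const RESULT_MEDIA_TYPES = {
  json: "application/sparql-results+json",
  xml: "application/sparql-results+xml",
  csv: "text/csv",
  tsv: "text/tab-separated-values",
} as const;
type ResultFormat = keyof typeof RESULT_MEDIA_TYPES;

// Query via GET: the query text goes URL-encoded into ?query=.
function sparqlGet(endpoint: string, query: string, format: ResultFormat = "json"): SparqlHttpSpec {
  return {
    method: "GET",
    url: `${endpoint}?query=${encodeURIComponent(query)}`,
    headers: { Accept: RESULT_MEDIA_TYPES[format] },
  };
}

// Query via POST directly: the unencoded query text is the request body.
function sparqlPostDirect(endpoint: string, query: string, format: ResultFormat = "json"): SparqlHttpSpec {
  return {
    method: "POST",
    url: endpoint,
    headers: {
      Accept: RESULT_MEDIA_TYPES[format],
      "Content-Type": "application/sparql-query",
    },
    body: query,
  };
}
```

Either description can be handed to `fetch(spec.url, { method: spec.method, headers: spec.headers, body: spec.body })`, for example with `sparqlGet("http://localhost:8080/sparql", "ASK { ?s ?p ?o }")`. GET suits short queries; POST avoids URL length limits for long ones.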

SHACL evaluates node shapes over the graph:

  • select returns focus nodes that match a shape — a graph-shaped query.
  • validate returns a conformance report — did the data satisfy the constraints?

It’s the home of property paths (inverse, sequence, alternative, one_or_more/zero_or_more/zero_or_one), literal constraints (datatype, minInclusive…, pattern, length), unique cross-node keys, closed nodes, and logical and/or/not/xone.

await lbb.shacl({ /* node shape: select matching focus nodes, or validate */ });
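To show what the property path operators mean, here is a small reference evaluator over an in-memory edge list. It is a sketch of the semantics, not the engine. The operator names follow the list above, but the JSON encoding (in particular the `predicate` key), the `GraphEdge` type, and `evalPath` are assumptions made for this example.

```typescript
type GraphEdge = { s: string; p: string; o: string };

type PropertyPath =
  | { predicate: string } // assumed key for a single edge label
  | { inverse: PropertyPath }
  | { sequence: PropertyPath[] }
  | { alternative: PropertyPath[] }
  | { one_or_more: PropertyPath }
  | { zero_or_more: PropertyPath }
  | { zero_or_one: PropertyPath };

// Returns the nodes reachable from `start` along `path`.
// `reversed` is true while evaluating inside an odd number of inverse wrappers.
function evalPath(edges: GraphEdge[], start: Set<string>, path: PropertyPath, reversed = false): Set<string> {
  const out = new Set<string>();
  if ("predicate" in path) {
    const label = path.predicate;
    for (const e of edges) {
      if (e.p !== label) continue;
      if (!reversed && start.has(e.s)) out.add(e.o);
      if (reversed && start.has(e.o)) out.add(e.s);
    }
    return out;
  }
  if ("inverse" in path) return evalPath(edges, start, path.inverse, !reversed);
  if ("sequence" in path) {
    // Walking a sequence backwards means taking its steps in reverse order.
    const steps = reversed ? path.sequence.slice().reverse() : path.sequence;
    let current = start;
    for (const step of steps) current = evalPath(edges, current, step, reversed);
    return current;
  }
  if ("alternative" in path) {
    for (const alt of path.alternative) evalPath(edges, start, alt, reversed).forEach((n) => out.add(n));
    return out;
  }
  if ("zero_or_one" in path) {
    start.forEach((n) => out.add(n));
    evalPath(edges, start, path.zero_or_one, reversed).forEach((n) => out.add(n));
    return out;
  }
  // one_or_more / zero_or_more: expand the frontier until no new node appears.
  const inner = "one_or_more" in path ? path.one_or_more : path.zero_or_more;
  if ("zero_or_more" in path) start.forEach((n) => out.add(n));
  let frontier = start;
  while (frontier.size > 0) {
    const next = new Set<string>();
    evalPath(edges, frontier, inner, reversed).forEach((n) => {
      if (!out.has(n)) {
        out.add(n);
        next.add(n);
      }
    });
    frontier = next;
  }
  return out;
}
```

For example, with edges `team1 owns a`, `a depends_on b`, and `b depends_on c`, the path `{ one_or_more: { predicate: "depends_on" } }` from `a` reaches `b` and `c`.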

SHACL-AF rules run to a bounded fixpoint and derive new edges. Preview the derived edges first (they’re never written), then store a rule set so validation and selection can run over inferred facts:

  • Preview: lbb_query mode infer (MCP) / the inference request in the SDKs.
  • Store rules: lbb_configure action define_rules (MCP) or the CLI/SDK schema path. Replacing the stored set with an empty array requires an explicit confirm.
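"Bounded fixpoint" means the rules are applied in rounds until a round derives nothing new or a round limit is hit. The sketch below shows that loop with one transitive rule and returns the derived edges as a preview without touching the asserted input. The rule representation, the `inferPreview` name, and the default of 10 rounds are all assumptions; real rules are written in SHACL-AF, not as TypeScript functions.

```typescript
type DerivedEdge = { s: string; p: string; o: string };

// Hypothetical rule shape: look at the current edges, propose new ones.
type InferenceRule = (edges: DerivedEdge[]) => DerivedEdge[];

const edgeKey = (e: DerivedEdge): string => `${e.s}|${e.p}|${e.o}`;

// depends_on is transitive: a -> b and b -> c entails a -> c.
const transitiveDependsOn: InferenceRule = (edges) => {
  const deps = edges.filter((e) => e.p === "depends_on");
  const proposed: DerivedEdge[] = [];
  for (const x of deps) {
    for (const y of deps) {
      if (x.o === y.s) proposed.push({ s: x.s, p: "depends_on", o: y.o });
    }
  }
  return proposed;
};

function inferPreview(
  asserted: DerivedEdge[],
  rules: InferenceRule[],
  maxRounds = 10,
): { derived: DerivedEdge[]; rounds: number; converged: boolean } {
  const known = new Set(asserted.map(edgeKey));
  const all = asserted.slice(); // working copy; the input is never modified
  const derived: DerivedEdge[] = [];
  for (let round = 1; round <= maxRounds; round++) {
    let added = 0;
    for (const rule of rules) {
      for (const e of rule(all)) {
        if (known.has(edgeKey(e))) continue;
        known.add(edgeKey(e));
        all.push(e);
        derived.push(e);
        added++;
      }
    }
    if (added === 0) return { derived, rounds: round, converged: true };
  }
  return { derived, rounds: maxRounds, converged: false }; // bound hit first
}
```

The round limit is what keeps a badly written rule set from looping forever; a `converged: false` result is a signal to inspect the rules rather than raise the bound.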

Rule inference is one half of little big brain’s reasoning layer (the other is RDFS type closure). Both fold into ordinary queries via the reason and entailment controls — the Reasoning & inference guide covers this end to end.

Treat RDF/SHACL schema changes as preview → publish:

  1. Read current schema and stored rules.
  2. Preview a proposed SHACL/RDF bundle — you get a preview_digest, a compatibility verdict, allowed publish modes, and an audit summary.
  3. Publish with the exact source you previewed plus the preview_digest; restrictive warn publishes require confirm_restrictive: true.

The SDK schema.* methods expose this flow — schema.preview, schema.publish, and schema.audit.
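The gating in steps 2 and 3 can be pictured with a small in-memory mock. This is not the SDK: `mockPreview`, `mockPublish`, and the digest function are hypothetical stand-ins, and the real digest algorithm is not documented here. Only the `preview_digest` and `confirm_restrictive` names come from the flow above.

```typescript
type Compatibility = "compatible" | "restrictive";

interface SchemaPreview {
  preview_digest: string;
  compatibility: Compatibility;
}

// Stand-in digest (length plus 32-bit FNV-1a); the real service's digest is unknown.
function standInDigest(source: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < source.length; i++) {
    h ^= source.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return `${source.length}:${h.toString(16)}`;
}

// Mock preview: the caller says whether the change is restrictive, for illustration.
function mockPreview(source: string, restrictive: boolean): SchemaPreview {
  return {
    preview_digest: standInDigest(source),
    compatibility: restrictive ? "restrictive" : "compatible",
  };
}

// Mock publish: refuses anything that is not exactly what was previewed.
function mockPublish(
  source: string,
  preview: SchemaPreview,
  opts: { confirm_restrictive?: boolean } = {},
): string {
  if (standInDigest(source) !== preview.preview_digest) {
    throw new Error("source differs from the previewed bundle; preview again");
  }
  if (preview.compatibility === "restrictive" && opts.confirm_restrictive !== true) {
    throw new Error("restrictive publish requires confirm_restrictive: true");
  }
  return "published";
}
```

The point of the digest is that an edit made between preview and publish, however small, invalidates the preview, so what gets published is always something you have seen a verdict for.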

little big brain’s graph is backed by RDF terms and validated with SHACL. That’s not academic box-ticking — the two standards happen to solve problems that are acute for AI agents. Retrieval (covered in agent memory) governs what the model reads; RDF and SHACL govern what it writes and let the system reason.

RDF: a write model that matches how LLMs produce knowledge

  • Atomic, self-describing facts. A triple is subject–predicate–object. An agent emits knowledge one claim at a time and the store composes those claims into a graph — no table to design, no columns to pick, no migration to represent a relationship it hadn’t seen before. This is a much better fit for a model’s incremental output than a fixed relational schema.
  • Stable global identifiers. An entity is the same entity everywhere it’s referenced. Facts from turn 1 and turn 900 — or from two different agents — merge on identity instead of drifting into near-duplicate rows.
  • Open-world and additive. Adding a fact never rewrites the others, and missing information means “unknown”, not “false” — exactly the epistemic state an agent is usually in. It can record what it learns without having to complete a schema first.
  • Typed literals. Values carry datatypes (numbers, dates, booleans), so downstream questions are real comparisons and aggregations, not string matching.
  • Standards and portability. The same graph is queryable with SPARQL and readable by off-the-shelf RDF tooling, so an agent’s memory isn’t trapped in a bespoke format.

SHACL: guardrails and a self-correction loop for writes

An LLM will occasionally write something malformed, incomplete, or hallucinated. SHACL turns “hope the model got it right” into an enforceable contract.

  • Shapes are a contract. Declare what a well-formed Ticket or Customer looks like — required properties, datatypes, value ranges, allowed values (sh:in), cardinality, uniqueness, property paths, closed nodes.
  • Validation is a checkpoint in the loop. Validate an agent’s proposed write against the shapes; if it doesn’t conform, you catch the bad fact before it pollutes memory.
  • The report is machine-readable feedback. A conformance report names the focus node, the failing constraint, and a message — structured enough for the agent to repair its own output and retry. That is a grounding loop on writes, the counterpart to the retrieval-feedback loop on reads.
    propose fact ──▶ SHACL validate ──▶ conforms? ──yes──▶ commit
         │                                  │
         └─── report ◀── no ────────────────┘
    agent reads the violation
    (node, constraint, message)
    revises the fact ──▶ retry
  • Inference derives what you shouldn’t ask the model to re-derive. SHACL-AF rules run to a bounded fixpoint and entail new edges deterministically (transitive relationships, role derivations, classifications). The graph gets richer as facts accumulate, using auditable rules instead of ad-hoc LLM reasoning that varies run to run. Preview the derived edges before storing the rules.
  • Preview-then-publish keeps schema evolution safe. When an agent needs to change the shapes themselves, the gated preview/publish flow (digest + compatibility verdict) stops a careless change from silently invalidating existing memory.
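The validate, report, repair, retry loop from the bullets above can be sketched in a few lines. Everything here is a stand-in: `validateTicket` is a hand-written check playing the role of a SHACL shape for a Ticket, `repair` represents handing the report back to the agent, and the type names are hypothetical. The report fields (focus node, constraint, message) follow the description above.

```typescript
interface Violation {
  focusNode: string;
  constraint: string;
  message: string;
}
interface ConformanceReport {
  conforms: boolean;
  violations: Violation[];
}
type ProposedFact = { id: string; type: string; props: Record<string, unknown> };

// Hypothetical Ticket contract: `priority` is required and must be an allowed value.
const ALLOWED_PRIORITIES = ["low", "medium", "high"];

function validateTicket(fact: ProposedFact): ConformanceReport {
  const violations: Violation[] = [];
  const priority = fact.props["priority"];
  if (priority === undefined) {
    violations.push({ focusNode: fact.id, constraint: "minCount", message: "priority is required" });
  } else if (typeof priority !== "string" || ALLOWED_PRIORITIES.indexOf(priority) === -1) {
    violations.push({
      focusNode: fact.id,
      constraint: "in",
      message: `priority must be one of ${ALLOWED_PRIORITIES.join(", ")}`,
    });
  }
  return { conforms: violations.length === 0, violations };
}

// Validate, and on failure let `repair` revise the fact using the report.
function writeWithRepair(
  proposed: ProposedFact,
  repair: (fact: ProposedFact, report: ConformanceReport) => ProposedFact,
  maxAttempts = 3,
): { committed: ProposedFact | null; attempts: number } {
  let fact = proposed;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const report = validateTicket(fact);
    if (report.conforms) return { committed: fact, attempts: attempt };
    fact = repair(fact, report); // in practice: the agent reads the report and retries
  }
  return { committed: null, attempts: maxAttempts }; // give up; nothing is written
}
```

Capping the attempts matters: a fact the agent cannot repair is dropped rather than committed in a non-conforming state.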

An agent on little big brain runs two improvement loops over the same graph:

| Loop | Signal | Effect |
| --- | --- | --- |
| Read: retrieval feedback | grade what search returns (3/1/0) | fine-tune embeddings → better recall next time |
| Write: SHACL validation | conformance report on proposed facts | repair on the report → only clean, consistent facts persist; rules entail the rest |

Together the agent retrieves better over time and writes cleaner, self-consistent knowledge — with provenance and a temporal history behind both. See the agent long-term memory guide for the read loop in code.

| You want… | Use |
| --- | --- |
| Counts/aggregates from app code, typed attributes, no RDF | Structured query / filterByAttributes |
| Standard SPARQL, path patterns, external tools | SPARQL 1.1 text / /sparql |
| Validation, shape selection, property paths, inference | SHACL |
| Fuzzy, natural-language recall | Hybrid search |