Why SutraDB uses RDF-star — the most expressive superset of the W3C data model.
The RDF data model has evolved through three stages. Each is a strict superset of the one before:
Legacy RDF is the original W3C standard. Data is organized as triples: (subject, predicate, object). The model is simple and powerful, but it has one major limitation: to make a statement about a statement you need reification, which is verbose and awkward:
-- The fact:
:Alice :knows :Bob .
-- To say "Alice knows Bob with 90% confidence" in legacy RDF:
:statement1 rdf:type rdf:Statement .
:statement1 rdf:subject :Alice .
:statement1 rdf:predicate :knows .
:statement1 rdf:object :Bob .
:statement1 :confidence 0.9 .
-- That's 5 extra triples to annotate one fact (6 in total, counting the fact itself)!
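The expansion above can be sketched in a few lines of Python. This is only an illustration: `reify_legacy` is a hypothetical helper, not part of SutraDB or any RDF library, and terms are modelled as plain strings rather than IRIs and typed literals.

```python
# Hypothetical helper: expands one annotated fact into legacy reification triples.
# Terms are plain strings here; a real store would use IRIs and typed literals.

def reify_legacy(fact, node, annotations):
    """Return the triples needed to annotate `fact` via rdf:Statement reification."""
    subject, predicate, obj = fact
    triples = [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    ]
    # Each annotation becomes one more triple hanging off the statement node.
    triples.extend((node, key, value) for key, value in annotations.items())
    return triples


fact = (":Alice", ":knows", ":Bob")
reified = reify_legacy(fact, ":statement1", {":confidence": 0.9})
for triple in reified:
    print(triple)
print(len(reified), "triples, not counting the fact itself")
```

Four of the five triples exist only to describe the statement node; the annotation itself is the last one.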
RDF 1.2 is the W3C's in-progress update to RDF. It introduces triple terms: a triple can be referenced as a value, but only in the object position:
-- The fact:
:Alice :knows :Bob .
-- In RDF 1.2, annotate via rdf:reifies:
:statement1 rdf:reifies << :Alice :knows :Bob >> .
:statement1 :confidence 0.9 .
-- Better (3 triples), but the triple term can only be an object.
You cannot write << :Alice :knows :Bob >> :confidence 0.9 directly; you need the intermediate rdf:reifies node.
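To make the object-only rule concrete, here is a minimal Python sketch of a position check. The function names are hypothetical, and the model is deliberately simplified: terms are strings and a triple term is a nested 3-tuple.

```python
# Sketch only: terms are strings, a triple term is a 3-tuple nested inside a triple.

def is_triple_term(term):
    return isinstance(term, tuple) and len(term) == 3


def valid_in_rdf12(triple):
    """Check the rule described above: a triple term may appear only as the object."""
    subject, predicate, obj = triple
    if is_triple_term(subject) or is_triple_term(predicate):
        return False
    # A nested triple term in object position must itself obey the rule.
    return valid_in_rdf12(obj) if is_triple_term(obj) else True


fact = (":Alice", ":knows", ":Bob")
via_reifier = (":statement1", "rdf:reifies", fact)
direct = (fact, ":confidence", 0.9)

print(valid_in_rdf12(via_reifier))  # True
print(valid_in_rdf12(direct))       # False
```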
RDF-star removes the restriction. Triple terms can appear in the subject position as well as the object position (predicates remain plain IRIs). This is the most natural way to annotate triples:
-- The fact:
:Alice :knows :Bob .
-- In RDF-star, annotate directly:
<< :Alice :knows :Bob >> :confidence 0.9 .
-- Just 2 triples. The triple IS the subject.
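A toy in-memory model shows why this is convenient: if a triple can be a subject, annotations are an ordinary lookup keyed by the triple itself. `StarGraph` below is a hypothetical illustration, not SutraDB's storage layer.

```python
from collections import defaultdict


class StarGraph:
    """Toy graph in which a triple can itself be the subject of another triple."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(dict)

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))
        # Single-valued predicates keep the sketch short.
        self.by_subject[subject][predicate] = obj

    def annotations(self, triple):
        """Return every predicate/object pair whose subject is `triple`."""
        return dict(self.by_subject.get(triple, {}))


graph = StarGraph()
fact = (":Alice", ":knows", ":Bob")
graph.add(*fact)                     # the fact
graph.add(fact, ":confidence", 0.9)  # the annotation, with the triple as subject

print(len(graph.triples))        # 2
print(graph.annotations(fact))   # {':confidence': 0.9}
```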
This is especially powerful for vector embeddings on edges:
-- An embedding on a relationship (RDF-star)
<< :paper42 :discusses :TransformerArchitecture >>
:hasEmbedding "0.23 -0.11 0.87 ..."^^sutra:f32vec .
<< :paper42 :discusses :TransformerArchitecture >>
:confidence 0.91 .
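As a rough sketch of how such a literal might be consumed, the snippet below parses a vector and keys it by the edge it annotates. It assumes the lexical form of sutra:f32vec is a whitespace-separated list of decimals, as the example suggests; that is an assumption, and `parse_f32vec` is a hypothetical helper. The three-component value is a shortened stand-in, not a real embedding.

```python
from array import array


def parse_f32vec(lexical):
    """Parse whitespace-separated decimals into a float32 array (assumed format)."""
    return array("f", (float(part) for part in lexical.split()))


# The edge is the key; the embedding and confidence are its annotations.
edge = (":paper42", ":discusses", ":TransformerArchitecture")
edge_annotations = {
    edge: {
        ":hasEmbedding": parse_f32vec("0.23 -0.11 0.87"),  # shortened stand-in vector
        ":confidence": 0.91,
    }
}

vec = edge_annotations[edge][":hasEmbedding"]
print(len(vec), "components,", vec.itemsize, "bytes each")
```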
| Feature | Legacy RDF | RDF 1.2 | RDF-star |
|---|---|---|---|
| Basic triples (S, P, O) | Yes | Yes | Yes |
| Triple terms as objects | No | Yes | Yes |
| Triple terms as subjects | No | No | Yes |
| Direct edge annotation | Reification (verbose) | Via rdf:reifies | Native |
| Triples to state and annotate one fact | 6 | 3 | 2 |
SutraDB's primary use case is storing vector embeddings alongside graph structure. The << s p o >> :hasEmbedding ... pattern is the natural representation for "this edge has an embedding." RDF 1.2's object-only restriction would require an extra indirection node for every annotated edge.
- << :paper :cites :other >> :hasEmbedding "..."^^sutra:f32vec (one annotation triple instead of two)
- << :Alice :birthDate "1990" >> :source :Wikipedia (attaches a source directly to the fact)
- << :model :predicts :outcome >> :confidence 0.95 (annotates a prediction with a score)
- << :company :ceo :person >> :startDate "2020" ; :endDate "2023"
Quoted triples are handled efficiently using content-addressed hashing:
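- << S P O >> is hashed as xxHash3(S_id, P_id, O_id) → u64.
- Collision probability for a 64-bit xxHash3 digest is negligible at typical graph sizes: by the birthday bound it is roughly n²/2^65 for n distinct quoted triples, or about 3 × 10^-8 at one million.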
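The idea can be sketched in Python as follows. Two caveats: xxHash3 is not in the Python standard library, so an 8-byte BLAKE2b digest stands in for it here, and `TermDictionary`, `quoted_triple_id`, and the byte layout of the hashed payload are hypothetical choices for illustration, not SutraDB's actual implementation.

```python
import hashlib
import struct


class TermDictionary:
    """Assigns a stable integer id to each distinct term (hypothetical helper)."""

    def __init__(self):
        self._ids = {}

    def id_for(self, term):
        return self._ids.setdefault(term, len(self._ids) + 1)


def quoted_triple_id(s_id, p_id, o_id):
    """Hash three u64 term ids into one u64. BLAKE2b stands in for xxHash3."""
    payload = struct.pack("<QQQ", s_id, p_id, o_id)  # assumed little-endian layout
    digest = hashlib.blake2b(payload, digest_size=8).digest()
    return int.from_bytes(digest, "little")


def collision_probability(n, bits=64):
    """Birthday-bound approximation of any collision among n distinct keys."""
    return n * n / 2 ** (bits + 1)


terms = TermDictionary()
ids = [terms.id_for(t) for t in (":Alice", ":knows", ":Bob")]
key = quoted_triple_id(*ids)

# Same content always yields the same id, so no separate id allocation is needed.
print(key == quoted_triple_id(*ids))
print(f"{collision_probability(1_000_000):.1e}")
```

Because the id is derived from the content, the same quoted triple resolves to the same key wherever it appears. The approximation only holds while the result is small; it is not meaningful once n approaches 2^32.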
Oxigraph, the closest existing Rust triplestore, migrated from RDF-star to RDF 1.2 in v0.5. Their rationale was alignment with the W3C working draft. SutraDB took the opposite position: RDF-star's direct edge annotation is essential for the vector embedding use case, and since it's a superset, RDF 1.2 compatibility comes for free.