SPARQL & Vector Extensions

Backward compatible: SPARQL 1.1 queries that use only the features marked as implemented below run unchanged on SutraDB. The extensions are purely additive: they add new operators that standard SPARQL cannot express.

Standard SPARQL 1.1 Support

SutraDB implements the core of SPARQL 1.1 Query Language:

Feature	Status
SELECT with projection, *, DISTINCT	Implemented
Basic Graph Patterns (triple patterns)	Implemented
PREFIX declarations	Implemented
FILTER (=, !=, <, >, bound, !bound)	Implemented
OPTIONAL	Implemented
UNION	Implemented
ORDER BY (ASC/DESC)	Implemented
LIMIT / OFFSET	Implemented
`a` shorthand for rdf:type	Implemented
Typed literals, integer literals	Implemented
GROUP BY / HAVING / aggregates	Planned
Property paths (+, *, ?)	Planned
Subqueries	Planned
CONSTRUCT / ASK / DESCRIBE	Planned
String functions (CONTAINS, REGEX, etc.)	Planned

Vector Extensions

SutraDB adds two new operators to SPARQL for approximate nearest neighbor (ANN) search over vector embeddings. These operators integrate directly into graph patterns — you can traverse the graph, jump into vector space, and come back in a single query.

VECTOR_SIMILAR

Finds entities whose vector embeddings are similar to a query vector, above a cosine similarity threshold.

# Basic usage: find entities similar to a query vector
SELECT ?entity WHERE {
  VECTOR_SIMILAR(?entity :hasEmbedding
    "0.23 -0.11 0.87 ..."^^sutra:f32vec, 0.85)
}

# With an ef_search hint (higher = better recall, slower)
VECTOR_SIMILAR(?entity :hasEmbedding
    "..."^^sutra:f32vec, 0.85, ef:=200)

# Top-K mode (return the K nearest; the threshold is ignored)
VECTOR_SIMILAR(?entity :hasEmbedding
    "..."^^sutra:f32vec, 0.0, k:=10)

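To make the two modes concrete, here is a minimal Python sketch of what threshold mode and top-K mode compute. It is a brute-force stand-in, not SutraDB's HNSW search: the function names, the toy data, and the choice of `>=` for "above the threshold" are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def vector_similar(embeddings, query, threshold, k=None):
    """Exhaustive scan standing in for the approximate index (hypothetical helper).

    embeddings maps an entity name to its vector.
    """
    scored = sorted(
        ((cosine(vec, query), entity) for entity, vec in embeddings.items()),
        reverse=True,  # most similar first
    )
    if k is not None:
        # Top-K mode: keep the K nearest and ignore the threshold.
        return [entity for _, entity in scored[:k]]
    # Threshold mode: keep everything at or above the cutoff (assumed inclusive).
    return [entity for score, entity in scored if score >= threshold]

# Toy data, invented for this example.
embeddings = {
    "paper1": [1.0, 0.0],
    "paper2": [0.8, 0.6],
    "paper3": [0.0, 1.0],
}
query = [1.0, 0.0]

above = vector_similar(embeddings, query, 0.85)     # threshold mode
top2 = vector_similar(embeddings, query, 0.0, k=2)  # top-K mode
```

Here `paper2` has a similarity of roughly 0.8 to the query, so it misses the 0.85 cutoff in threshold mode but is still returned as the second-nearest result in top-K mode.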
VECTOR_SCORE

Returns the similarity score for ranking in ORDER BY clauses:

SELECT ?paper WHERE {
  ?paper a :Paper .
  VECTOR_SIMILAR(?paper :hasEmbedding "..."^^sutra:f32vec, 0.7)
} ORDER BY DESC(VECTOR_SCORE(?paper :hasEmbedding "..."^^sutra:f32vec))
LIMIT 10
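Because the query vector appears twice in a ranked query (once in VECTOR_SIMILAR, once in VECTOR_SCORE), it can help to generate the query text from client code. The sketch below is one way to do that in Python. The helper names are hypothetical, the literal format is inferred from the space-separated examples on this page, and the PREFIX declarations for `:` and `sutra:` are left out because their IRIs are not given here.

```python
def f32vec_literal(values):
    """Format numbers as a space-separated sutra:f32vec literal (format assumed
    from the examples on this page)."""
    return '"' + " ".join(repr(float(v)) for v in values) + '"^^sutra:f32vec'

def ranked_query(rdf_class, predicate, values, threshold, limit):
    """Build a filter-then-rank query that reuses one vector literal."""
    vec = f32vec_literal(values)
    return (
        "SELECT ?paper WHERE {\n"
        f"  ?paper a {rdf_class} .\n"
        f"  VECTOR_SIMILAR(?paper {predicate} {vec}, {threshold})\n"
        f"}} ORDER BY DESC(VECTOR_SCORE(?paper {predicate} {vec}))\n"
        f"LIMIT {limit}"
    )

query = ranked_query(":Paper", ":hasEmbedding", [0.23, -0.11, 0.87], 0.7, 10)
print(query)
```

Generating both occurrences from the same value avoids the two literals drifting apart when the query vector changes.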

Wormhole Queries

"Wormhole" queries traverse between graph space and vector space in a single query. This is SutraDB's distinguishing feature: standard SPARQL 1.1 has no way to express these patterns.

Graph → Vector → Graph

Start with a graph pattern, jump into vector space to find similar entities, then traverse back into the graph:

# Find papers by a specific author, then find papers with similar
# embeddings, then get their titles
SELECT ?similar_paper ?title WHERE {
  # Graph: find papers by Vaswani
  ?original ex:author ex:Vaswani .

  # Vector: find papers with similar embeddings
  VECTOR_SIMILAR(?similar_paper ex:hasEmbedding
    "..."^^sutra:f32vec, 0.8)

  # Graph: get the title of the similar paper
  ?similar_paper ex:title ?title .
}

Vector → Graph

Start in vector space, then use graph patterns to filter or enrich:

# Find entities similar to a query vector, then look up their types
SELECT ?entity ?type WHERE {
  VECTOR_SIMILAR(?entity ex:hasEmbedding "..."^^sutra:f32vec, 0.9)
  ?entity a ?type .
}

How Vectors Live in the Graph

Vectors are not a separate system — they are primitive objects in the graph, like strings or integers. An entity connects to its vector via an ordinary triple:

# A vector is the object of a triple
:paper42 :hasEmbedding "0.23 -0.11 0.87 ..."^^sutra:f32vec .

# Multiple entities can share the same vector (disambiguation)
:bank_financial :hasEmbedding "0.5 0.3 ..."^^sutra:f32vec .
:bank_river     :hasEmbedding "0.5 0.3 ..."^^sutra:f32vec .

When you run VECTOR_SIMILAR, it searches the HNSW index for matching vector objects, then follows the triples backward to find which entities connect to those vectors. This means:

A vector never exists without at least one triple pointing to it
The number of triples in a database is always greater than or equal to the number of vectors
Vector search inherently does entity resolution — if two entities share a vector, both are returned
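The reverse lookup described above can be sketched in a few lines of Python. This is a toy in-memory model, not SutraDB's storage layout: the vector ids (`v1`, `v2`), the two-dimensional vectors, and the exhaustive scan used in place of the HNSW index are all assumptions made to keep the example small.

```python
import math

# Triples whose object is a (hypothetical) vector id.
triples = [
    (":bank_financial", ":hasEmbedding", "v1"),
    (":bank_river", ":hasEmbedding", "v1"),  # shares v1 with :bank_financial
    (":paper42", ":hasEmbedding", "v2"),
]

# Each distinct vector is stored once.
vectors = {"v1": [0.5, 0.3], "v2": [-0.3, 0.5]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def vector_similar(predicate, query, threshold):
    # Step 1: search vector space (exhaustive scan standing in for HNSW).
    hits = {vid for vid, vec in vectors.items() if cosine(vec, query) >= threshold}
    # Step 2: follow triples backward from the matching vectors to their subjects.
    return sorted(s for s, p, o in triples if p == predicate and o in hits)

entities = vector_similar(":hasEmbedding", [0.5, 0.3], 0.9)
print(entities)
```

A single matching vector (`v1`) yields two entities, which is the entity-resolution behaviour noted in the list above. The toy data also satisfies the other two properties: every vector is the object of at least one triple, and there are three triples for two vectors.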

Query Planner

The query planner automatically decides execution order based on pattern selectivity. Each pattern is assigned a weight, and patterns with lower weights run first:

Situation	Strategy
Subject unbound at VECTOR_SIMILAR	Execute vector search first (top-K), then evaluate graph patterns over candidates. Vector pattern gets weight 1 (highest priority).
Subject bound before VECTOR_SIMILAR	Execute graph patterns first to bind the subject, then filter by vector similarity. Vector pattern gets weight 5.
FILTER expressions	Pushed after their dependent variable bindings (weight 10).
OPTIONAL clauses	Always last (weight 20).
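The weighting scheme in the table can be illustrated with a small sorting sketch in Python. This is a simplification under stated assumptions, not the planner's actual code: the `Pattern` type is invented, the `subject_bound` flag is supplied by the caller rather than derived from the query, and the weight of 3 for plain triple patterns is a placeholder (the table gives no value for them, only that they run after an unbound vector search and before a bound one).

```python
from typing import NamedTuple

class Pattern(NamedTuple):
    kind: str                    # "triple", "vector", "filter", or "optional"
    subject_bound: bool = False  # only meaningful for vector patterns

def weight(p):
    if p.kind == "vector":
        return 5 if p.subject_bound else 1  # weights from the table
    if p.kind == "filter":
        return 10                           # from the table
    if p.kind == "optional":
        return 20                           # from the table
    return 3  # ASSUMPTION: placeholder weight for plain triple patterns

def plan(patterns):
    # sorted() is stable, so patterns with equal weight keep their query order.
    return sorted(patterns, key=weight)

unbound = plan([Pattern("optional"), Pattern("filter"),
                Pattern("triple"), Pattern("vector", subject_bound=False)])
bound = plan([Pattern("optional"), Pattern("filter"),
              Pattern("vector", subject_bound=True), Pattern("triple")])

print([p.kind for p in unbound])
print([p.kind for p in bound])
```

With an unbound subject the vector search is scheduled first; once a graph pattern can bind the subject, the vector pattern moves behind it and acts as a similarity filter. A real planner would also need to check that each FILTER's variables are bound before it runs, which a plain sort by weight does not guarantee.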

SPARQL Protocol

SutraDB implements the SPARQL 1.1 Protocol over HTTP, alongside SutraDB-specific endpoints for loading data and checking server health:

Endpoint	Method	Description
`/sparql`	GET	Query via `?query=` URL parameter
`/sparql`	POST	Query in request body
`/triples`	POST	Bulk insert N-Triples data
`/vectors/declare`	POST	Declare a vector predicate with HNSW parameters
`/vectors`	POST	Insert a vector embedding (creates triple + HNSW entry)
`/health`	GET	Health check
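As a rough client-side illustration, the two `/sparql` variants can be built with Python's standard library as shown below. The base URL is an assumption (this page does not state a host or port), and the `application/sparql-query` content type is what the SPARQL 1.1 Protocol specifies for a direct POST; whether SutraDB requires it is not confirmed here. The requests are only constructed, not sent.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # ASSUMPTION: adjust to your server

def sparql_get(query):
    """GET /sparql with the query in the `query` URL parameter."""
    url = BASE_URL + "/sparql?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, method="GET")

def sparql_post(query):
    """POST /sparql with the query text as the request body."""
    return urllib.request.Request(
        BASE_URL + "/sparql",
        data=query.encode("utf-8"),
        # Content type taken from the SPARQL 1.1 Protocol (assumed to apply).
        headers={"Content-Type": "application/sparql-query"},
        method="POST",
    )

query = "SELECT ?s WHERE { ?s a ?type } LIMIT 5"
get_req = sparql_get(query)
post_req = sparql_post(query)

# To actually send a request against a running server:
#   with urllib.request.urlopen(post_req) as resp:
#       body = resp.read()
```

GET is convenient for short queries. Queries that embed long vector literals can exceed practical URL length limits, so POST is usually the safer choice for VECTOR_SIMILAR queries.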