Semio: Semantic Types

The type system that makes cross-system workflows deterministic

Semio is DataGrout’s semantic interface layer. It gives tools a typed contract — declaring what kind of data they consume and produce — so the planning engine can verify workflow compatibility before anything runs, and route data between systems without LLM guesswork.

The full formal treatment is in the lab paper: Semio: A Semantic Interface Layer for Tool-Oriented AI Systems.

The Problem Semio Solves

A customer in Salesforce looks different from a customer in QuickBooks. Different field names, different IDs, different schemas. Bridging them traditionally means:

Writing hand-coded glue for every pair of systems (on the order of N² integrations for N systems)
Or asking an LLM to figure it out at runtime (probabilistic, fragile, token-expensive)

Semio takes a third path: tools declare their semantic types, and the planner reasons about compatibility symbolically before execution. The LLM describes intent; Semio handles schema matching.

Semantic Types

Every Semio type follows the pattern:

<family>.<entity>@<version>

Examples:

Type	Meaning
`crm.lead@1`	A CRM lead record
`billing.invoice@1`	A billing invoice
`billing.customer@1`	A billing customer
`core.email@1`	An email address (primitive)
`crm.lead.list@1`	A list of CRM leads

These types exist independently of any vendor. A Salesforce lead and a HubSpot contact are both crm.lead@1.
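
To make the naming pattern concrete, here is a minimal parsing sketch. It is not part of Semio itself: the `SemioType` tuple, the `parse_type` helper, and the exact character rules in the regular expression are assumptions made for illustration. It also assumes that in a name like `crm.lead.list@1` the first segment is the family and the remaining dotted segments form the entity.

```python
import re
from typing import NamedTuple

# Assumed grammar: lowercase dotted segments, "@", then an integer version.
_TYPE_RE = re.compile(
    r"([a-z][a-z0-9_]*)\.([a-z][a-z0-9_]*(?:\.[a-z][a-z0-9_]*)*)@(\d+)"
)


class SemioType(NamedTuple):
    """Hypothetical in-memory form of '<family>.<entity>@<version>'."""
    family: str
    entity: str
    version: int


def parse_type(text: str) -> SemioType:
    """Split a type name into family, entity, and integer version."""
    match = _TYPE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Semio type: {text!r}")
    family, entity, version = match.groups()
    return SemioType(family, entity, int(version))


lead = parse_type("crm.lead@1")
lead_list = parse_type("crm.lead.list@1")
```

Under these assumptions, `lead` has family `crm`, entity `lead`, and version `1`, while `lead_list` keeps `lead.list` as its entity.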

Tool Contracts

Tools declare their inputs and outputs using Semio types. When you look at a tool in the Playground or via discovery.discover, you see its semantic contract:

tool: salesforce@1/get_lead@1
outputs:
  - type: crm.lead@1
    keys: [id, email]

tool: quickbooks@1/create_invoice@1
inputs:
  - name: customer
    type: billing.customer@1
    required: true
outputs:
  - type: billing.invoice@1

The planning engine uses these contracts to verify that workflow steps connect — that what one tool outputs is compatible with what the next tool expects.
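
As a rough illustration of that check, the sketch below models the two contracts above as plain data and lists which required inputs of one tool are not covered by the outputs of another. The `Port` and `ToolContract` classes and the `unmet_inputs` function are hypothetical names, and exact string equality stands in for whatever compatibility rules the real planner applies.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Port:
    """One typed input or output of a tool (hypothetical model)."""
    type: str
    name: str = ""
    required: bool = True


@dataclass(frozen=True)
class ToolContract:
    tool: str
    inputs: Tuple[Port, ...] = ()
    outputs: Tuple[Port, ...] = ()


def unmet_inputs(producer: ToolContract, consumer: ToolContract) -> List[str]:
    """Required input types of `consumer` that `producer` does not output."""
    produced = {port.type for port in producer.outputs}
    return [
        port.type
        for port in consumer.inputs
        if port.required and port.type not in produced
    ]


get_lead = ToolContract(
    "salesforce@1/get_lead@1",
    outputs=(Port("crm.lead@1"),),
)
create_invoice = ToolContract(
    "quickbooks@1/create_invoice@1",
    inputs=(Port("billing.customer@1", name="customer"),),
    outputs=(Port("billing.invoice@1"),),
)

gaps = unmet_inputs(get_lead, create_invoice)
```

Here `gaps` contains `billing.customer@1`: the two example tools do not connect directly, because one produces a lead and the other expects a customer.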

Adapters: Type Bridges

When two tools use related but different types, Semio adapters bridge the gap. An adapter declares that one type can be transformed into another, using a shared identity key (like email).

crm.lead@1 ──[adapter]──▶ billing.customer@1
                anchor: email

When the planner finds a workflow that requires billing.customer@1 but you only have crm.lead@1, it inserts the adapter automatically. No LLM reasoning needed at execution time.
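
The insertion step might look something like the following sketch. The `Adapter` record and the `bridge` function are assumptions for illustration, and the lookup only considers a single adapter hop; the actual planner may chain adapters or apply richer rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Adapter:
    """Declares that `source` can be transformed into `target` (hypothetical)."""
    source: str
    target: str
    anchor: str  # identity key shared by both types


ADAPTERS = [Adapter("crm.lead@1", "billing.customer@1", anchor="email")]


def bridge(have: str, need: str, adapters=ADAPTERS) -> Optional[List[Adapter]]:
    """Return [] if the types already match, [adapter] if one bridges
    them, or None if no single adapter applies."""
    if have == need:
        return []
    for adapter in adapters:
        if adapter.source == have and adapter.target == need:
            return [adapter]
    return None


steps = bridge("crm.lead@1", "billing.customer@1")
```

In this sketch `steps` holds the one lead-to-customer adapter, anchored on `email`. Because the lookup is a table scan over declared adapters, the same inputs always yield the same result.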

How This Affects Planning

When you call discovery.plan or use flow.into, the planner works with Semio types:

Input types: What data do you have to start with?
Goal type: What type does the final step need to produce?
Path search: Find a chain of tools and adapters that bridges the gap
Verification: Check type safety at every step before execution

This is what allows Cognitive Trust Certificates to assert “type safe” as a compile-time proof — the planner checked every type transformation before any tool ran.

Example: the goal is to create an invoice given only an email address:

core.email@1
  → [salesforce@1/get_lead@1]
  → crm.lead@1
  → [adapter: crm.lead@1 → billing.customer@1]
  → billing.customer@1
  → [quickbooks@1/create_invoice@1]
  → billing.invoice@1

The entire path is verified before execution begins.
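
One simple way to picture the path search is a breadth-first search over a graph whose nodes are types and whose edges are tools or adapters. The sketch below reproduces the example path under that assumption. The source does not describe the planner's actual algorithm, so treat `find_path` and the `EDGES` table as illustrative only; each edge is also simplified to a single input type.

```python
from collections import deque

# Each edge: (label, input type, output type). Taken from the example path;
# real tools may have several inputs, which this sketch ignores.
EDGES = [
    ("salesforce@1/get_lead@1", "core.email@1", "crm.lead@1"),
    ("adapter: crm.lead@1 → billing.customer@1", "crm.lead@1", "billing.customer@1"),
    ("quickbooks@1/create_invoice@1", "billing.customer@1", "billing.invoice@1"),
]


def find_path(start, goal, edges=EDGES):
    """Breadth-first search from `start` to `goal`.

    Returns the shortest list of edge labels, or None if the goal type
    cannot be reached from the start type.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, path = queue.popleft()
        if current == goal:
            return path
        for label, src, dst in edges:
            if src == current and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [label]))
    return None


plan = find_path("core.email@1", "billing.invoice@1")
```

With these edges, `plan` lists the Salesforce lookup, the adapter, and the QuickBooks call in that order. A `None` result means no type-safe chain exists, which is known before any tool runs.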

Type Tiers

Fields within a Semio type are categorized into tiers that guide planning and PII handling:

Tier	Meaning
Core	Required for basic operations (`id`, `name`, `email`)
Useful	Enhance workflows but aren’t strictly required (`company`, `status`)
PII	Personally identifiable — triggers Dynamic Redaction (`email`, `phone`)
Index	Optimized for search and lookup (`email`, `company`)

The planner uses tiers to request only the fields a workflow actually needs, and to flag when PII fields require policy clearance.
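
A small sketch of how tiers could drive field selection follows. The `LEAD_TIERS` mapping simply reuses the example field names from the table and is not an actual Semio schema; `fields_to_request` is a hypothetical helper.

```python
# Hypothetical tier assignment, reusing the example field names from the table.
LEAD_TIERS = {
    "core": {"id", "name", "email"},
    "useful": {"company", "status"},
    "pii": {"email", "phone"},
    "index": {"email", "company"},
}


def fields_to_request(needed, tiers=LEAD_TIERS):
    """Return (fields to fetch, PII fields that need policy clearance).

    Only the fields the workflow asked for are requested; any of them
    that sit in the PII tier are reported separately.
    """
    known = set().union(*tiers.values())
    unknown = set(needed) - known
    if unknown:
        raise KeyError(f"fields not declared on the type: {sorted(unknown)}")
    requested = sorted(set(needed))
    needs_clearance = sorted(set(needed) & tiers["pii"])
    return requested, needs_clearance


requested, needs_clearance = fields_to_request({"id", "email", "status"})
```

Here the workflow asks for three fields, and only `email` is flagged, since it is the one requested field in the PII tier. A field can belong to more than one tier, as `email` does in the table.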

Identity Anchors (Keys)

Cross-system entity resolution uses shared keys, not system-specific IDs. When a Salesforce lead and a QuickBooks customer represent the same person, they’re matched via a shared key like email — not their respective internal IDs.

Salesforce lead: { id: "00Q...", email: "jane@acme.com" }
QuickBooks customer: { id: "cust_99", email: "jane@acme.com" }
        └── matched via email anchor ──┘

This is why Semio adapters specify an anchor key: it’s the identity field that survives the type transformation.
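
Anchor-based matching can be sketched as a join on the shared key. The `match_by_anchor` function below is hypothetical, and comparing emails case-insensitively is an assumption of this sketch rather than documented Semio behavior.

```python
def match_by_anchor(left, right, anchor="email"):
    """Pair records from two systems whose anchor values are equal.

    Assumption: anchor values are strings compared case-insensitively.
    Records without the anchor field are skipped.
    """
    index = {rec[anchor].lower(): rec for rec in right if rec.get(anchor)}
    pairs = []
    for rec in left:
        key = rec.get(anchor)
        if key and key.lower() in index:
            pairs.append((rec, index[key.lower()]))
    return pairs


leads = [{"id": "00Q...", "email": "jane@acme.com"}]
customers = [
    {"id": "cust_99", "email": "jane@acme.com"},
    {"id": "cust_100", "email": "bob@example.com"},  # made-up non-matching record
]

pairs = match_by_anchor(leads, customers)
```

The system-specific `id` values never take part in the comparison; only the anchor does, so the lead pairs with `cust_99` and nothing else.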

Automatic Enrichment

When a workflow step needs a field that the previous step didn’t return, the planner searches for enrichment tools automatically. If you have a lead with only {id, email} but the next step needs status, the planner finds a tool that can look up status by id and inserts it before proceeding.

This happens transparently: you describe the goal, and the planner works out which data needs to be filled in.
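
The enrichment lookup might be sketched as follows. Everything here is illustrative: the `Enricher` record, the `plan_enrichment` function, and the tool name `example@1/get_lead_status@1` are made up for the example, and the sketch makes a single pass over the available enrichers rather than chaining them.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Enricher:
    """A tool that looks up extra fields from one key (hypothetical)."""
    tool: str
    lookup_key: str
    provides: FrozenSet[str]


# Made-up enrichment tool: looks up `status` by `id`.
ENRICHERS = [
    Enricher("example@1/get_lead_status@1", "id", frozenset({"status"})),
]


def plan_enrichment(have, need, enrichers=ENRICHERS):
    """Return the tools to insert so that every field in `need` is present.

    Raises LookupError if some field cannot be supplied by any enricher
    whose lookup key is already available.
    """
    have = set(have)
    missing = set(need) - have
    steps = []
    for enricher in enrichers:
        if not missing:
            break
        if enricher.lookup_key in have and enricher.provides & missing:
            steps.append(enricher.tool)
            have |= enricher.provides
            missing -= enricher.provides
    if missing:
        raise LookupError(f"no enrichment tool provides: {sorted(missing)}")
    return steps


steps = plan_enrichment(have={"id", "email"}, need={"email", "status"})
```

For a lead with only `{id, email}` and a next step that needs `status`, the sketch returns the one lookup tool to insert. If nothing can supply a missing field, it fails at planning time instead of at execution time.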

Core Concepts – How discovery and planning use Semio types
Discovery Tools – Semantic search over the type graph
Cognitive Trust Certificates – Type safety as a compile-time proof
Lab paper: Semio – Full formal treatment of the type system