Semio: Semantic Types
The type system that makes cross-system workflows deterministic
Semio is DataGrout’s semantic interface layer. It gives tools a typed contract — declaring what kind of data they consume and produce — so the planning engine can verify workflow compatibility before anything runs, and route data between systems without LLM guesswork.
The full formal treatment is in the lab paper: Semio: A Semantic Interface Layer for Tool-Oriented AI Systems.
The Problem Semio Solves
A customer in Salesforce looks different from a customer in QuickBooks. Different field names, different IDs, different schemas. Bridging them traditionally means:
- Writing hand-coded glue for every pair of systems (O(N²) integrations for N systems)
- Or asking an LLM to figure it out at runtime (probabilistic, fragile, token-expensive)
Semio takes a third path: tools declare their semantic types, and the planner reasons about compatibility symbolically before execution. The LLM describes intent; Semio handles schema matching.
Semantic Types
Every Semio type follows the pattern:
<family>.<entity>@<version>
Examples:
| Type | Meaning |
|---|---|
| crm.lead@1 | A CRM lead record |
| billing.invoice@1 | A billing invoice |
| billing.customer@1 | A billing customer |
| core.email@1 | An email address (primitive) |
| crm.lead.list@1 | A list of CRM leads |
These types exist independently of any vendor. A Salesforce lead and a HubSpot contact are both crm.lead@1.
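To make the naming pattern concrete, here is a minimal parsing sketch. The allowed character set, and the choice to treat everything between the first dot and the `@` as the entity (so that `crm.lead.list@1` parses), are assumptions made for illustration; `SemioType` and `parse_semio_type` are hypothetical names, not part of a documented API.

```python
import re
from dataclasses import dataclass

# Assumed grammar: lowercase dot-separated segments, then "@" and an integer version.
_TYPE_RE = re.compile(
    r"^([a-z][a-z0-9_]*)\.([a-z][a-z0-9_]*(?:\.[a-z][a-z0-9_]*)*)@(\d+)$"
)


@dataclass(frozen=True)
class SemioType:
    """Hypothetical in-memory form of a <family>.<entity>@<version> string."""
    family: str
    entity: str
    version: int

    def __str__(self) -> str:
        return f"{self.family}.{self.entity}@{self.version}"


def parse_semio_type(text: str) -> SemioType:
    """Split a type string into family, entity, and version, or raise ValueError."""
    match = _TYPE_RE.match(text)
    if match is None:
        raise ValueError(f"not a <family>.<entity>@<version> string: {text!r}")
    family, entity, version = match.groups()
    return SemioType(family, entity, int(version))


lead = parse_semio_type("crm.lead@1")
lead_list = parse_semio_type("crm.lead.list@1")
```

Under these assumptions, `lead_list.entity` is `lead.list`, and converting a parsed value back with `str()` returns the original string.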
Tool Contracts
Tools declare their inputs and outputs using Semio types. When you look at a tool in the Playground or via discovery.discover, you see its semantic contract:
tool: salesforce@1/get_lead@1
outputs:
  - type: crm.lead@1
    keys: [id, email]

tool: quickbooks@1/create_invoice@1
inputs:
  - name: customer
    type: billing.customer@1
    required: true
outputs:
  - type: billing.invoice@1
The planning engine uses these contracts to verify that workflow steps connect — that what one tool outputs is compatible with what the next tool expects.
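The compatibility check can be sketched as a comparison between one tool's output types and the next tool's required input types. The `Port` and `ToolContract` classes and the `unmet_inputs` helper below are hypothetical, and the check matches on exact type strings only; how the real planner represents contracts is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """One declared input or output of a tool (hypothetical structure)."""
    type: str
    name: str = ""
    required: bool = True


@dataclass(frozen=True)
class ToolContract:
    """A tool's semantic contract: what it consumes and what it produces."""
    tool: str
    inputs: tuple = ()
    outputs: tuple = ()


def unmet_inputs(producer: ToolContract, consumer: ToolContract) -> list:
    """Return the consumer's required inputs that the producer does not output directly."""
    produced = {port.type for port in producer.outputs}
    return [p for p in consumer.inputs if p.required and p.type not in produced]


# The two contracts shown above, transcribed by hand.
get_lead = ToolContract("salesforce@1/get_lead@1", outputs=(Port("crm.lead@1"),))
create_invoice = ToolContract(
    "quickbooks@1/create_invoice@1",
    inputs=(Port("billing.customer@1", name="customer"),),
    outputs=(Port("billing.invoice@1"),),
)

gaps = unmet_inputs(get_lead, create_invoice)
```

Here `gaps` contains the `customer` input, because `get_lead` produces `crm.lead@1` rather than `billing.customer@1`. A direct connection is therefore not type-compatible in this sketch.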
Adapters: Type Bridges
When two tools use related but different types, Semio adapters bridge the gap. An adapter declares that one type can be transformed into another, using a shared identity key (like email).
crm.lead@1 ──[adapter]──▶ billing.customer@1
anchor: email
When the planner finds a workflow that requires billing.customer@1 but you only have crm.lead@1, it inserts the adapter automatically. No LLM reasoning needed at execution time.
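A rough sketch of automatic adapter insertion, assuming adapters are stored as simple (source, target, anchor) records and that only a single adapter hop is considered. The `Adapter` class, the `ADAPTERS` list, and the `bridge` function are illustrative names, not the real registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adapter:
    """Declares that `source` can be transformed into `target`, keyed on `anchor`."""
    source: str
    target: str
    anchor: str


# The one adapter described in the text above.
ADAPTERS = [Adapter("crm.lead@1", "billing.customer@1", anchor="email")]


def bridge(have: str, need: str, adapters=ADAPTERS) -> list:
    """Return the adapters needed to turn `have` into `need`.

    An empty list means the types already match. Only one hop is tried;
    chaining several adapters is left out of this sketch.
    """
    if have == need:
        return []
    for adapter in adapters:
        if adapter.source == have and adapter.target == need:
            return [adapter]
    raise LookupError(f"no adapter from {have} to {need}")


inserted = bridge("crm.lead@1", "billing.customer@1")
```

The lookup is purely symbolic: it compares type strings and never inspects record contents, which is why no LLM call is involved at this stage.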
How This Affects Planning
When you call discovery.plan or use flow.into, the planner works with Semio types:
- Input types: What data do you have to start with?
- Goal type: What type does the final step need to produce?
- Path search: Find a chain of tools and adapters that bridges the gap
- Verification: Check type safety at every step before execution
This is what allows Cognitive Trust Certificates to assert “type safe” as a compile-time proof — the planner checked every type transformation before any tool ran.
Example: Goal is to create an invoice given only an email address:
core.email@1
→ [salesforce@1/get_lead@1]
→ crm.lead@1
→ [adapter: crm.lead@1 → billing.customer@1]
→ billing.customer@1
→ [quickbooks@1/create_invoice@1]
→ billing.invoice@1
The entire path is verified before execution begins.
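The path search for this example can be approximated with a breadth-first search over a graph whose nodes are types and whose edges are tools or adapters. This is a simplified sketch, not the planner's actual algorithm: each step is assumed to take exactly one input type, and `get_lead` is assumed to accept `core.email@1`, as the example path implies.

```python
from collections import deque

# (step label, input type, output type); single-input steps are a simplification.
STEPS = [
    ("salesforce@1/get_lead@1", "core.email@1", "crm.lead@1"),
    ("adapter: crm.lead@1 -> billing.customer@1", "crm.lead@1", "billing.customer@1"),
    ("quickbooks@1/create_invoice@1", "billing.customer@1", "billing.invoice@1"),
]


def plan(start: str, goal: str, steps=STEPS):
    """Breadth-first search from `start` to `goal`.

    Returns the shortest list of step labels, or None if the goal type is
    unreachable. Every edge is followed only when its input type equals the
    current type, so any returned path is type-checked end to end.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, path = queue.popleft()
        if current == goal:
            return path
        for label, src, dst in steps:
            if src == current and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [label]))
    return None


route = plan("core.email@1", "billing.invoice@1")
```

Because the search finishes before any step is executed, a `None` result can be reported as a planning failure without calling a single tool.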
Type Tiers
Fields within a Semio type are categorized into tiers that guide planning and PII handling. A field can belong to more than one tier (email appears under Core, PII, and Index):
| Tier | Meaning |
|---|---|
| Core | Required for basic operations (id, name, email) |
| Useful | Enhance workflows but aren’t strictly required (company, status) |
| PII | Personally identifiable; triggers Dynamic Redaction (email, phone) |
| Index | Optimized for search and lookup (email, company) |
The planner uses tiers to request only the fields a workflow actually needs, and to flag when PII fields require policy clearance.
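As an illustration of tier-driven field selection, the sketch below uses the example fields from the tier table. The `TIERS` mapping and `select_fields` helper are hypothetical; the real tier assignments live in the type definitions, and the policy clearance step itself is not modeled.

```python
# Tier membership copied from the table; a field may sit in several tiers.
TIERS = {
    "core": {"id", "name", "email"},
    "useful": {"company", "status"},
    "pii": {"email", "phone"},
    "index": {"email", "company"},
}


def select_fields(needed):
    """Split the fields a workflow needs into (fields to request, PII fields to clear).

    Only the needed fields are requested; anything in the PII tier is also
    returned separately so a policy check can run before execution.
    """
    needed = set(needed)
    known = set().union(*TIERS.values())
    unknown = needed - known
    if unknown:
        raise KeyError(f"fields not assigned to any tier: {sorted(unknown)}")
    return sorted(needed), sorted(needed & TIERS["pii"])


requested, needs_clearance = select_fields({"id", "email", "status"})
```

In this run, `phone` and `company` are never requested because the workflow did not ask for them, and `email` is the only field flagged for clearance.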
Identity Anchors (Keys)
Cross-system entity resolution uses shared keys, not system-specific IDs. When a Salesforce lead and a QuickBooks customer represent the same person, they’re matched via a shared key like email — not their respective internal IDs.
Salesforce lead: { id: "00Q...", email: "jane@acme.com" }
QuickBooks customer: { id: "cust_99", email: "jane@acme.com" }
└── matched via email anchor ──┘
This is why Semio adapters specify an anchor key: it’s the identity field that survives the type transformation.
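Anchor-based matching can be sketched as a join on the anchor field. The `match_by_anchor` function is a hypothetical helper, and normalizing the value with `strip()` and `lower()` is an assumption that suits email addresses but may not suit other anchors.

```python
def _normalize(value):
    """Assumed normalization for anchor values: trim whitespace, ignore case."""
    return (value or "").strip().lower()


def match_by_anchor(left, right, anchor="email"):
    """Pair records from two systems whose anchor values are equal.

    System-specific `id` fields are ignored entirely; records with a missing
    or empty anchor are never matched.
    """
    index = {}
    for record in right:
        key = _normalize(record.get(anchor))
        if key:
            index[key] = record
    pairs = []
    for record in left:
        key = _normalize(record.get(anchor))
        if key and key in index:
            pairs.append((record, index[key]))
    return pairs


# Records as shown above; the Salesforce id is elided in the source.
leads = [{"id": "00Q...", "email": "jane@acme.com"}]
customers = [{"id": "cust_99", "email": "jane@acme.com"}]
pairs = match_by_anchor(leads, customers)
```

The two records pair up even though their `id` values have nothing in common, which is the property an adapter relies on when it names an anchor.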
Automatic Enrichment
When a workflow step needs a field that the previous step didn’t return, the planner searches for enrichment tools automatically. If you have a lead with only {id, email} but the next step needs status, the planner finds a tool that can look up status by id and inserts it before proceeding.
This happens transparently: you describe the goal, and the planner works out which data needs to be filled in.
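The enrichment step can be pictured as a set difference followed by a catalogue lookup. Everything named below is hypothetical, including the `crm_status_lookup` tool and the `ENRICHERS` mapping; the sketch only shows the shape of the reasoning for the `{id, email}` and `status` example.

```python
# Hypothetical enrichment catalogue: missing field -> (tool name, lookup key).
ENRICHERS = {"status": ("crm_status_lookup", "id")}


def plan_enrichment(available, required, enrichers=ENRICHERS):
    """Return the enrichment tools to insert so every required field is present.

    A tool is usable only if the key it looks up by is already available.
    Raises LookupError when a missing field cannot be filled.
    """
    have = set(available)
    steps = []
    for field in sorted(set(required) - have):
        if field not in enrichers:
            raise LookupError(f"no enrichment tool provides {field!r}")
        tool, key = enrichers[field]
        if key not in have:
            raise LookupError(f"{tool} needs {key!r}, which is not available")
        steps.append(tool)
        have.add(field)
    return steps


steps = plan_enrichment(available={"id", "email"}, required={"email", "status"})
```

If the required fields are already present, the returned list is empty and the workflow proceeds without any extra step.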
Related
- Core Concepts – How discovery and planning use Semio types
- Discovery Tools – Semantic search over the type graph
- Cognitive Trust Certificates – Type safety as a compile-time proof
- Lab paper: Semio – Full formal treatment of the type system