ClawSeed Architecture Overview¶
Overview¶
ClawSeed is an AI agent runtime written in Rust. It connects to LLM providers (Anthropic, Gemini, Bedrock, OpenAI-compatible endpoints, and more), acts through pluggable tools, and serves clients over HTTP/WebSocket.
Core design principle: runtime, not application. ClawSeed provides crates that applications assemble — it does not bundle channels, dashboards, or integrations. See Runtime vs Application below.
Runtime vs Application¶
An agent runtime should do exactly three things: receive messages, call an LLM, and execute tools. Everything else — where messages come from, how results are displayed, which integrations are wired up — belongs to the application layer.
ClawSeed is a runtime. Applications built on it decide:
- How users interact (CLI, mobile app, chat bot, web dashboard)
- Which channels to connect (Discord, Telegram, email — or none)
- Which tools to expose (built-in, remote from mobile, custom)
- How to handle security and approval flows
# A Discord bot application
[dependencies]
clawseed-agent = "0.7"
clawseed-providers = "0.7"
serenity = "0.12" # App chooses its own Discord SDK
# An Android application
[dependencies]
clawseed-gateway = "0.7"
clawseed-agent = "0.7"
# A CLI tool
[dependencies]
clawseed-agent = "0.7"
clawseed-tools = "0.7"
This is the fundamental architectural split from ZeroClaw. ZeroClaw bundled 40+ channel adapters, hardware peripherals, a TUI, a web dashboard, and an SOP engine into a single binary — making it an application, not a runtime. Adding a new channel meant modifying the runtime. Adding a new integration meant understanding the entire system.
ClawSeed's approach: the runtime provides crates with stable traits; applications compose them. When a new need arises, you write a new application — you don't modify the runtime.
Architecture Overview¶
┌──────────────────────────────────────────────────────────┐
│ gateway (REST / WebSocket) │
│ ↓ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ Agent (stable core) │ │
│ │ turn → LLM → dispatch → execute → loop │ │
│ └──┬──────────┬──────────┬──────────┬─────────────┘ │
│ │ │ │ │ │
│ provider tools memory hooks │
│ (dyn) (dyn) (dyn) (pipeline) │
│ │ │ │ │ │
│ Anthropic 25+ SQLite security │
│ Gemini built-in vector audit │
│ Bedrock search approval │
│ OpenAI* + remote ──→ mobile client │
│ Ollama │
│ DeepSeek │
│ Groq │
└──────────────────────────────────────────────────────────┘
* and any OpenAI-compatible endpoint
Dependency Flow¶
Dependencies flow one-way, forming a clean layered architecture:
clawseed-api (zero deps, trait definitions only)
↑
├← clawseed-tools (tool implementations)
├← clawseed-memory (storage backends)
├← clawseed-providers (LLM providers)
└← clawseed-agent (agent core + runtime assembly)
↑
└← clawseed-config (config loading)
↑
└← clawseed-gateway (HTTP/WS server + remote tool bridge)
↑
└← clawseed (binary entry point)
Key rule: clawseed-api is the only crate with broad dependencies, and it depends on no other crate. Core never imports extensions.
Note: The arrows above show crate-level import direction. At runtime,
Agent::from_config_with_registry()directly instantiates provider, memory, and tools from their respective crates — the agent crate is not a pure orchestration layer, it also owns runtime assembly. The gateway usesAgent::from_config_with_shared_components()to reuse sharedAppStatecomponents (provider, memory, observer) across connections instead of creating new ones per connection.
Core Abstractions¶
All extension points in ClawSeed are traits:
| Trait | Purpose | How to extend |
|---|---|---|
Provider |
LLM inference backend | Implement in clawseed-providers, or register a custom ProviderFactory |
Tool |
Agent-callable capability | Implement in clawseed-tools, or register remote tools via WebSocket |
ToolRegistry |
Unified tool registration and lookup | DefaultToolRegistry in clawseed-agent; supports BuiltIn / MCP / Remote sources. MCP is defined in the enum and registry infrastructure supports it, but the actual MCP protocol client is not yet implemented — see "MCP Status" below |
Hook |
Tool call interceptor | Implement before_tool_call / after_tool_call, or create declaratively via HookFactory from config |
Memory |
Conversation memory backend | Implement in clawseed-memory |
Agent Assembly & Loop¶
Agent::from_config_with_registry() is the primary constructor for CLI/embedded use. It does runtime assembly — directly instantiates provider (via ProviderFactoryRegistry), memory (via clawseed_memory::create_memory()), and tools (via clawseed_tools::registry::all_tools()), then selects a dispatcher based on provider.supports_native_tools(). Tools depend on memory being constructed first; dispatcher depends on provider capabilities. All components are passed to Agent::builder() for final construction.
Agent::from_config_with_shared_components() is the constructor for gateway use. It accepts pre-built Arc<dyn Provider>, Arc<dyn Memory>, Arc<dyn Observer>, model name, temperature, and shared_builtin_tools: Arc<[Arc<dyn Tool>]> from AppState — these shared components are reused across all WebSocket/webhook connections. BuiltIn tools are no longer re-created per connection; the shared Arc<dyn Tool> instances are registered into each agent's per-connection DefaultToolRegistry via register_all_arc(). HookRunner remains per-connection (SecurityPolicy rate limits and remote tools must be isolated). The provider field is Arc<dyn Provider> (not Box); AgentBuilder.provider() wraps Box→Arc, and shared_provider() accepts Arc directly.
The agent's core is a turn loop, triggered by each user message:
User message
↓
Build system prompt (prompt.rs)
↓
Call LLM (Provider::chat())
↓
Parse response (ToolDispatcher::parse_response())
├── NativeToolDispatcher: extract directly from provider's native tool_calls
└── XmlToolDispatcher: try ◁▷ format first, fallback to multi-format parser (12+ formats)
├── Text-only response → return to user
└── Contains tool calls → enter tool loop
↓
For each tool call:
1. before_hook interception (can cancel/modify)
2. Tool::execute()
3. after_hook observation
↓
Format tool results, send back to LLM
↓
Return to "parse response" step until LLM returns text-only
Remote Tool Calls¶
Mobile clients register tools over WebSocket. The gateway wraps each spec as a RemoteTool (implementing the Tool trait). Remote tool registration is a three-step flow:
- Register to shared registry —
state.tool_registry.register_or_replace(tool, ToolSource::Remote { session })so/api/toolsreflects the tool globally - Inject into per-connection Agent —
agent.add_remote_tools(tools, session)before processing each message - Cleanup on disconnect —
state.tool_registry.unregister_by_source(&ToolSource::Remote { session })
The agent has no branching for remote vs. local tools:
┌──────────────┐ register_tools ┌──────────────┐
│ Mobile │ ───────────────────────→ │ Gateway │
│ Client │ │ │
│ │ ←── tool_call_request ── │ Agent │
│ (executes │ ──── tool_result ──────→ │ calls it │
│ on device) │ │ like any │
│ │ ←── result_acknowledged─ │ other tool │
└──────────────┘ └──────────────┘
Tool Context¶
Tools receive runtime dependencies (Memory, etc.) via constructor injection. The ToolContext trait provides the workspace directory for file operations:
// Constructor injection — tools receive dependencies at creation time
let tool = MemoryStoreTool::new(Arc::clone(&memory));
// Workspace directory from context
let workspace = ctx.workspace_dir();
Tool Registry¶
The Agent manages all tool sources through the ToolRegistry trait (defined in clawseed-api):
// Three tool sources
pub enum ToolSource {
BuiltIn, // Built-in tools
Mcp { server: String }, // MCP server tools
Remote { session: String }, // Remote client tools (e.g., Android)
}
// Registration and lookup
registry.register(tool, ToolSource::BuiltIn);
registry.register_or_replace(tool, ToolSource::Remote { session });
let tool = registry.get_tool("shell");
let specs = registry.tool_specs(); // Cached ToolSpec list
DefaultToolRegistry (in clawseed-agent) uses DashMap for lock-free concurrent access, with glob pattern-based tool filtering (allowed_tools / denied_tools) and per-MCP-server filtering. In addition to register()/register_all() (which take Box<dyn Tool>), it provides register_arc()/register_all_arc() (which take Arc<dyn Tool>) for reusing shared tool instances without re-construction.
Dual Tool Registry & Shared Components¶
At runtime there are two independent ToolRegistry instances with different scopes:
| Registry | Scope | Created in | Purpose |
|---|---|---|---|
AppState.tool_registry |
Gateway-wide (shared) | clawseed-gateway/src/lib.rs |
/api/tools endpoint visibility, global tool listing |
Agent.tool_registry |
Per-connection (isolated) | clawseed-agent/src/agent.rs (Agent::builder().build()) |
Actual tool dispatch during agent turns |
Implications:
- /api/tools may show tools (from remote connections) that a given agent cannot actually invoke
- Remote tools must be registered in both registries to be both visible and executable
- In single-connection scenarios (current Android demo), the two registries are effectively in sync
Shared components: AppState holds Arc<dyn Provider>, Arc<dyn Memory>, Arc<dyn Observer>, model: String, temperature: f64, and shared_builtin_tools: Arc<[Arc<dyn Tool>]>. Gateway connections use from_config_with_shared_components() to reuse these, avoiding per-connection provider (HTTP connection pools), memory (SQLite connections), and BuiltIn tool duplication. The shared Arc<dyn Tool> instances are registered into each agent's per-connection DefaultToolRegistry (with connection-specific filters) via register_all_arc(), so each agent still has its own registry with independent filtering while sharing the underlying tool objects. HookRunner remains per-connection (SecurityPolicy rate limits and remote tools must be isolated). Config updates via /api/config do not rebuild shared components — restart the gateway for provider/model/temperature/memory/BuiltIn-tool changes to take effect.
MCP Status (planned, not yet implemented)¶
The ToolSource::Mcp enum variant and McpConfig schema exist, and DefaultToolRegistry supports per-server tool filtering. However, all MCP types in crates/clawseed-agent/src/tools.rs (McpRegistry, DeferredMcpToolSet, McpToolWrapper, ToolSearchTool) are stubs — they return empty collections or errors. There is no MCP protocol client library. The gateway has wiring that calls McpRegistry::connect_all(), but it returns immediately without connecting. Do not treat MCP as a usable capability.
Runtime Init Chain¶
The initialization flow from entry point to running agent:
CLI (clawseed/src/main.rs)
└→ Gateway: run_gateway() (clawseed-gateway/src/lib.rs)
├─ Creates AppState with shared provider, memory, observer, model, temperature, shared_builtin_tools, tool_registry
└─ Each WebSocket connection (clawseed-gateway/src/ws.rs):
├─ Agent::from_config_with_shared_components() — reuses shared components
│ ├─ Reuses state.provider, state.mem, state.observer, state.model, state.temperature, state.shared_builtin_tools
│ ├─ Creates per-connection hooks, dispatcher, skill index; BuiltIn tools use shared Arc instances
│ └─ Agent::builder().build() — creates agent-local tool_registry (shared tool objects, per-connection filters)
├─ Remote tools: register to shared registry + inject into agent
└─ Message loop: agent.chat() / agent.run()
Webhook (clawseed-gateway/src/handlers.rs)
└→ Agent::from_config_with_shared_components() — same shared components, per-request Agent
Chat mode (clawseed/src/main.rs)
└→ Agent::from_config() directly — creates own provider/memory, no gateway layer
Provider Factory¶
Providers register through the ProviderFactory trait + ProviderFactoryRegistry:
// Custom provider factory
impl ProviderFactory for MyFactory {
fn name(&self) -> &str { "my-provider" }
fn aliases(&self) -> &[&str] { &["my-alias"] }
fn create(&self, name: &str, api_key: Option<&str>,
base_url: Option<&str>, options: &ProviderRuntimeOptions
) -> Result<Box<dyn Provider>> { /* ... */ }
}
// Register in the registry
let mut reg = ProviderFactoryRegistry::new();
reg.register(MyFactory);
// Create Agent with a custom registry
Agent::from_config_with_registry(&config, Some(Arc::new(reg))).await?;
Replaces the previous 300+ line match chain. Android/embedded scenarios can pass a minimal provider set.
Security Model¶
- Autonomy levels:
ReadOnly/Supervised/Full - SecurityPolicy: Injected as a Hook — implements the
Hooktrait to globally intercept tool calls before execution (checking autonomy level, rate limits, command allowlists, path guards); always the first hook in the pipeline - Command allowlists:
allowed_commandsvalidates shell commands - Path guards: Blocks access to sensitive paths (
/etc/passwd,/root/.ssh, etc.) - Rate limiting:
max_actions_per_hourlimits actions per session - Hook pipeline:
Hook::before_tool_call()can cancel or modify any tool call; SecurityPolicy is always the first hook in the pipeline - Tool filtering:
allowed_tools/denied_toolsglob patterns,mcp_tool_filtersper MCP server
History Management¶
Each agent turn appends messages to a conversation history (Vec<ChatMessage>) that is sent to the LLM on every request. Unbounded history growth causes token overflow and cost escalation, so the agent applies automatic trimming:
trim_history()— Drops the oldest non-system messages when history exceedsmax_history(default 50), always preserving the system prompt at position 0truncate_tool_result()— Truncates oversized tool output tomax_chars, keeping the head (2/3) and tail (1/3) with a[... N characters truncated ...]markerestimate_history_tokens()— Rough token count estimation (content.len() / 4 + 4per message) for budget decisions
System prompt (always kept)
↓
User message ─→ LLM response ─→ tool result ─→ ...
↑ │
└──── trim_history() removes oldest ─────────┘
This ensures long-running sessions remain within token budgets without losing the system prompt.
Memory System¶
History is the short-term conversation context sent to the LLM; Memory is the long-term knowledge store that persists across sessions. They serve different purposes:
| History | Memory | |
|---|---|---|
| Scope | Current session | Cross-session, persistent |
| Storage | In-memory Vec<ChatMessage> |
SQLite database |
| Lifecycle | Cleared when session ends | Survives restarts |
| Access | Automatic (sent to LLM each turn) | Explicit (tools call memory.recall()) |
| Content | Full conversation text | Structured entries with metadata |
Memory is backed by clawseed-memory, implementing the Memory trait from clawseed-api:
┌─────────────────────────────────────┐
│ Memory trait │
│ store / recall / get / list / │
│ forget / count / health_check │
└─────────────┬───────────────────────┘
│
┌────────┴────────┐
│ │
┌────┴─────┐ ┌─────┴──────┐
│SqliteMemory│ │ NoneMemory │
│ (default) │ │ (fallback) │
└────┬──────┘ └────────────┘
│
┌────┴──────────────────────────────┐
│ Retrieval Engine │
│ ┌──────────────┐ ┌─────────────┐ │
│ │ Vector │ │ BM25 │ │
│ │ Similarity │ │ Keyword │ │
│ │ (embedding) │ │ Search │ │
│ └──────┬───────┘ └──────┬──────┘ │
│ └────┬───────────┘ │
│ ↓ │
│ Hybrid Ranking │
└────────────────────────────────────┘
Key features:
- Hybrid search: Combines vector similarity (semantic) and BM25 (keyword) with configurable weights; controlled by SearchMode enum (Hybrid / Embedding / Bm25)
- Memory categories: Core (persistent knowledge), Daily (ephemeral), Conversation (context), Custom(String) (user-defined)
- Consolidation: Heuristic two-phase extraction after each agent turn — creates timestamped Daily entries and promotes high-importance content (≥ 0.6) to Core memory
- Hygiene: Cadence-gated pruning (12-hour cycle) of stale Conversation/Daily entries; Core memories are never pruned
- Snapshot: Exports Core memories to MEMORY_SNAPSHOT.md with auto-hydration on cold boot if brain.db is missing
- Conflict detection: Jaccard similarity on word overlap to find contradictory Core entries; marks older as [SUPERSEDED by 'newer_key']
- Namespace isolation: recall_namespaced() filters by namespace for multi-tenant or per-user separation
- Export: export() with ExportFilter supports filtering by namespace, session, category, and time range
- Graceful degradation: If SQLite initialization fails, NoneMemory is used as a no-op fallback — tools that depend on memory simply skip the feature
Design Principles¶
- Explicit over implicit —
all_tools()lists every tool; the full capability set is visible at a glance - Declarative over imperative — Config drives composition, not code changes
- Traits at boundaries — Core depends on abstractions; implementations live outside
- Graceful degradation — Missing capability → tool skips the feature; failed memory → NoneMemory fallback; flaky provider → ReliableProvider retries
Crate Overview¶
| Crate | Role | Depends on api | Depends on agent |
|---|---|---|---|
clawseed-api |
Trait definitions only | — | — |
clawseed-agent |
Agent loop, hooks, dispatch, parsing, runtime assembly | yes | — |
clawseed-tools |
25+ built-in tools | yes | no |
clawseed-providers |
LLM provider implementations | yes | no |
clawseed-memory |
SQLite-backed memory + vector search | yes | no |
clawseed-config |
TOML config schema and loading | yes | no |
clawseed-gateway |
Axum HTTP/WS server + remote tool bridge | yes | yes |
clawseed |
Binary (CLI) | — | — |