Operations & Packaging Guide (Draft)
Doc status: Draft — normative for migration, trust policy, and config precedence. Last updated: 2025-11-02.
This guide covers operational tasks: configuration layering, KB migrations, vector index maintenance, packaging targets, and trust management.
1. Configuration precedence (normative)
Runloop merges configuration from several layers, listed here from highest to lowest precedence. Highest precedence wins unless a system policy forbids the change.
- CLI flags (rlp run --model=…, :budget … inline overrides)
- Environment variables (RUNLOOP_*)
- User config ~/.runloop/config.yaml
- System config /etc/runloop/config.yaml
- Built-in defaults
1.1 Policy overlays
System config may define policy.* keys that represent hard limits (e.g.,
policy.max_tokens, policy.providers.allowlist,
policy.confirm_external_actions = true). Lower layers may only tighten these
values. Attempts to exceed policy MUST cause the command to fail with a
descriptive error.
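The hard-limit rule can be sketched as a small check. This is an illustrative sketch only: the function name enforce_max_tokens and the error text are assumptions, not part of the Runloop code base.

```rust
/// Hypothetical helper: enforce a system-level `policy.max_tokens` hard limit.
/// `policy_max` is the cap from system config (None means no policy is set);
/// `requested` is the value proposed by another layer (user config, env, CLI).
pub fn enforce_max_tokens(policy_max: Option<u64>, requested: u64) -> Result<u64, String> {
    match policy_max {
        // Exceeding the policy must fail with a descriptive error.
        Some(cap) if requested > cap => Err(format!(
            "requested max_tokens {requested} exceeds policy.max_tokens {cap}"
        )),
        // Tightening (or matching) the limit is accepted as-is.
        _ => Ok(requested),
    }
}
```

Returning an error instead of silently clamping mirrors the MUST-fail wording above.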
1.2 Merge semantics
| Type | Rule |
|---|---|
| Scalars | Last writer wins (respecting precedence). |
| Maps | Deep merge; map entries follow precedence per key. |
| Lists | Replace entirely (last writer). Exceptions: models.providers unions entries before applying allow/deny lists. |
| Capability sets / allowlists | Intersect with policy first, then apply precedence. |
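A minimal sketch of the scalar, map, and list rules from the table, assuming a simplified Value type invented for illustration. It deliberately omits the models.providers union exception and the policy intersection for allowlists.

```rust
use std::collections::BTreeMap;

/// Simplified config value (hypothetical; the real config types may differ).
#[derive(Clone, Debug, PartialEq)]
pub enum Value {
    Scalar(String),
    List(Vec<Value>),
    Map(BTreeMap<String, Value>),
}

/// Merge `high` (higher-precedence layer) over `low`.
pub fn merge(low: Value, high: Value) -> Value {
    match (low, high) {
        // Maps: deep merge, applying precedence per key.
        (Value::Map(mut lo), Value::Map(hi)) => {
            for (key, hi_val) in hi {
                let merged = match lo.remove(&key) {
                    Some(lo_val) => merge(lo_val, hi_val),
                    None => hi_val,
                };
                lo.insert(key, merged);
            }
            Value::Map(lo)
        }
        // Scalars and lists (and mismatched types): the higher layer replaces
        // the lower one entirely.
        (_, hi) => hi,
    }
}
```

Folding the layers from built-in defaults up to CLI flags with merge yields the effective configuration under these simplified rules.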
Environment variables mirror YAML paths (upper case, underscores). Examples:
RUNLOOP_MODELS_DEFAULT=local:llama3.1-8b
RUNLOOP_MODELS_BUDGETS_SYSTEM_TOKENS_HARD=750000
RUNLOOP_SECURITY_CONFIRM_EXTERNAL_ACTIONS=true
RUNLOOP_CONFIG=/custom/path/config.yaml
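The naming rule can be sketched as follows. The function name env_var_name is hypothetical, and the YAML paths in the usage note (models.default, security.confirm_external_actions) are inferred from the examples above rather than taken from a schema.

```rust
/// Hypothetical helper: derive the environment variable that mirrors a
/// dotted YAML path (upper case, dots become underscores, RUNLOOP_ prefix).
pub fn env_var_name(yaml_path: &str) -> String {
    let body: String = yaml_path
        .chars()
        .map(|c| if c == '.' { '_' } else { c.to_ascii_uppercase() })
        .collect();
    format!("RUNLOOP_{body}")
}
```

Under these assumptions, env_var_name("models.default") yields RUNLOOP_MODELS_DEFAULT. Note that the mapping is not reversible, because underscores inside a key and path separators both become underscores.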
1.5 Runtime socket & discovery (MVP, normative)
- Runloop uses a single Unix domain socket for both the bus and control plane.
- Default naming and discovery precedence:
  - If runtime.socket_path is non-empty, use it and error immediately if unreachable (no probing).
  - Else if runtime.sockets_dir is set, use ${runtime.sockets_dir}/rmp.sock.
  - Else ~/.runloop/sock/rmp.sock.
  - Else /run/runloop/rmp.sock.
Examples:
runtime:
  socket_path: "/run/runloop/rmp.sock" # overrides discovery; short-circuits probing
  sockets_dir: "/var/run/runloop"      # used only when socket_path is empty
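The discovery order might be implemented roughly as below. This is a sketch under assumptions: the function name resolve_socket is invented, and the choice between the last two candidates is modelled as "use the per-user socket if it exists, otherwise the system socket", which the text does not spell out. The existence check is injected so the logic can be exercised without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical resolver for the runtime socket path.
pub fn resolve_socket(
    socket_path: Option<&str>,
    sockets_dir: Option<&str>,
    home: &Path,
    exists: impl Fn(&Path) -> bool,
) -> PathBuf {
    // 1. An explicit socket_path wins; the caller must fail if it is unreachable.
    if let Some(p) = socket_path.filter(|p| !p.is_empty()) {
        return PathBuf::from(p);
    }
    // 2. Otherwise derive the path from sockets_dir.
    if let Some(dir) = sockets_dir.filter(|d| !d.is_empty()) {
        return Path::new(dir).join("rmp.sock");
    }
    // 3. Otherwise prefer the per-user socket (assumed: only if it exists).
    let user = home.join(".runloop/sock/rmp.sock");
    if exists(&user) {
        return user;
    }
    // 4. Fall back to the system-wide socket.
    PathBuf::from("/run/runloop/rmp.sock")
}
```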
Optional:
- runtime.strict_fs_caps: true forces agent spawn to fail if a declared fs capability root is missing or not a directory (default false, which warns and skips the preopen).
The CLI refuses to silently fall back to local execution when the daemon is
unavailable. It fails fast with guidance to start the daemon (or re-run with
--local).
1.3 Model broker configuration (MVP)
- models.broker.providers lists named backends. kind may be local, http (OpenAI-compatible completions), http_openai_chat (OpenAI chat), http_anthropic (Claude /v1/messages), http_ollama (local Ollama), or http_gemini (Google Gemini generateContent). These HTTP kinds accept base_url, secret_id, and optional static headers.
- models.broker.route is an ordered array of { pattern, provider, target_model? } entries; the first matching pattern wins. Legacy map syntax like { "*": "local" } (or the legacy key routing) still deserialises into the same shape.
- models.broker.cache exposes ttl_ms and capacity for the in-memory LRU. Requests may override TTL via cache_ttl_ms; 0 disables caching for that call.
- models.broker.budgets retains default_tokens, per_request_tokens_cap, and hard_cap_usd. Per-request budgets clamp to the stricter of the request and config-provided values.
- Provider secret_id values resolve at runtime via the configured secret store; raw API keys should never be stored in YAML.
- To use Gemini, add a provider entry with kind: http_gemini, base_url: https://generativelanguage.googleapis.com, and a secret_id such as runloop/models/gemini (the runtime will also look for the environment variable RUNLOOP_MODELS_GEMINI). Agents that invoke Gemini still need model = true in policy.caps; automation agents that shell out (e.g., to manage tmux) also require exec = true plus explicit filesystem whitelists for any touched configs.
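First-match-wins routing can be sketched as below. The pattern syntax is an assumption: the only pattern the text shows is "*", so this sketch treats a trailing * as a prefix wildcard and everything else as an exact match. The type and function names are illustrative, and model names such as gemini-pro in the usage note are placeholders.

```rust
/// One entry of the ordered route array (field names follow the text above).
pub struct Route {
    pub pattern: String,
    pub provider: String,
    pub target_model: Option<String>,
}

/// Assumed pattern syntax: an optional trailing `*` matches any suffix.
fn pattern_matches(pattern: &str, model: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => model.starts_with(prefix),
        None => pattern == model,
    }
}

/// Return (provider, model to send) for the first matching route, if any.
pub fn resolve_route(routes: &[Route], model: &str) -> Option<(String, String)> {
    routes
        .iter()
        .find(|r| pattern_matches(&r.pattern, model))
        .map(|r| {
            // target_model, when present, replaces the requested model name.
            let target = r.target_model.clone().unwrap_or_else(|| model.to_string());
            (r.provider.clone(), target)
        })
}
```

With routes [gemini-* → gemini, * → local with target_model llama3.1-8b], a request for gemini-pro resolves to the gemini provider, and any other model falls through to local. Because order matters, a catch-all "*" entry belongs last.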
1.4 Runtime readiness gate (normative)
- Agents only become visible to supervisors after a two-sided readiness handshake: Wasmtime instantiates, the bus mailbox subscribes, tracing context is seeded, and the guest either calls the hostcall runloop::notify_ready() or enters its mailbox_recv loop (fallback for pre-ready binaries).
- runtime.spawn_ready_timeout_ms (default 5000 ms) controls how long the runtime waits for that handshake. Per-agent overrides live in AgentSpec::spawn_ready_timeout_ms; environment variable RUNLOOP_SPAWN_READY_TIMEOUT_MS is the lowest-precedence fallback.
- When the timeout elapses, callers receive Error::ReadyTimeout, the runtime emits runloop.runtime.spawn.ready_timeouts_total, and it tears down any partially created bus subscriptions/audit state to prevent ghost agents.
- Treat notify_ready as part of the minimum agent ABI going forward; older agents that cannot be rebuilt should block on mailbox_recv immediately so the fallback signal still fires.
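The timeout precedence described above can be sketched as a simple fallback chain. The function name and parameter shapes are assumptions; the ordering (per-agent override, then runtime config, then environment variable, then the 5000 ms default) follows the text.

```rust
use std::time::Duration;

/// Default from the text above.
pub const DEFAULT_SPAWN_READY_TIMEOUT_MS: u64 = 5000;

/// Hypothetical resolver: pick the effective readiness timeout.
/// Each argument is Some(ms) when that source provides a value.
pub fn spawn_ready_timeout(
    agent_spec: Option<u64>,     // AgentSpec::spawn_ready_timeout_ms
    runtime_config: Option<u64>, // runtime.spawn_ready_timeout_ms
    env_var: Option<u64>,        // RUNLOOP_SPAWN_READY_TIMEOUT_MS (lowest precedence)
) -> Duration {
    let ms = agent_spec
        .or(runtime_config)
        .or(env_var)
        .unwrap_or(DEFAULT_SPAWN_READY_TIMEOUT_MS);
    Duration::from_millis(ms)
}
```

Note that the environment variable ranks below the config file here, unlike the general layering in section 1.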
2. Knowledge Base (POG) operations (normative)
The POG consists of two SQLite files and a derived vector index.
- ~/.runloop/pog/events.sqlite: append-only ledger (WAL, synchronous=FULL)
- ~/.runloop/pog/pog.sqlite: materialized views (WAL, synchronous=NORMAL)
- ~/.runloop/pog/vectors/: HNSW index files (derived; safe to rebuild)

runloopd runs a background materializer that tails the ledger and updates the views. Progress is tracked in the singleton row pog.sqlite.materializer_state with columns:
- id INTEGER PRIMARY KEY CHECK (id = 1)
- watermark INTEGER NOT NULL
2.1 Migration workflow
rlp kb migrate orchestrates upgrades across both stores.
- Ensure runloopd is stopped (command refuses to run if sockets are open; override with --force).
- Create timestamped backups of both DBs.
- Apply schema migrations to events.sqlite (rare; append-only).
- Rebuild pog.sqlite by replaying events (events.sqlite → views). Use --inplace only for emergency SQL patches.
- Rebuild vector index using the VectorStore::rebuild path.
- Set meta.dirty = 0, record new schema_version, and create a snapshots entry.
- Update materializer_state.watermark with the highest applied ledger id.
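The ordering of the workflow can be captured as a plan that is computed before anything is touched. This is a sketch only: the Step enum and plan_migration are illustrative names and do not describe how rlp kb migrate is actually structured.

```rust
/// Steps of the migration workflow, in the order listed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Step {
    BackupDatabases,
    MigrateLedgerSchema,
    RebuildViews,
    RebuildVectorIndex,
    RecordSnapshot,
    UpdateWatermark,
}

/// Hypothetical planner: refuse to run while the daemon holds sockets open
/// unless the operator passed --force, then return the ordered steps.
pub fn plan_migration(daemon_running: bool, force: bool) -> Result<Vec<Step>, String> {
    if daemon_running && !force {
        return Err("runloopd appears to be running; stop it or pass --force".to_string());
    }
    Ok(vec![
        Step::BackupDatabases,
        Step::MigrateLedgerSchema,
        Step::RebuildViews,
        Step::RebuildVectorIndex,
        Step::RecordSnapshot,
        Step::UpdateWatermark,
    ])
}
```

Keeping the backup first means every later step can be rolled back by restoring the timestamped copies.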
Supporting commands:
- rlp kb verify: referential integrity, hashes, BLAKE3 checks
- rlp kb backup: consistent hot backup (uses SQLite backup APIs)
- rlp kb vacuum: optional compaction (requires exclusive lock)
- rlp kb why <entity>: print ordered source events for a materialized entity key.
- Redaction: by default, emails are masked at read time for all interfaces (CLI, hostcalls, backups). Operators can set kb.redaction.allow_unredacted_admin=true to allow privileged reads and should set a deployment-specific kb.redaction.salt. Agents must declare kb_read.contacts_raw to bypass masking; such reads should be audited.
2.2 Metadata tables
Both databases include meta(schema_version TEXT, dirty INTEGER, ts DATETIME).
pog.sqlite also tracks the snapshots table with columns:
- id INTEGER PRIMARY KEY
- ts DATETIME
- events_high_watermark INTEGER
- comment TEXT
2.3 Retention
- Ledger retains all events; corrections produce new StateDelta entries.
- Operators can archive older events by copying subsets elsewhere; never delete rows in-place.
- Materialized views compact automatically during rebuild; configure retention by emitting StateDelta events that mark artifacts/contacts inactive.
3. Vector index lifecycle (normative)
- Implementation milestone 1 uses a pure-Rust HNSW crate (hnsw_rs class). Keyword search uses SQLite FTS5; results fuse via Reciprocal Rank Fusion (RRF).
- Embeddings are stored in pog.sqlite (blob column) with metadata. The vector index is derived and can be discarded/rebuilt.
- VectorStore trait (conceptual):
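    // Conceptual: ItemId, Embedding, Meta, MetaFilter, Hit, and Result are defined elsewhere.
    trait VectorStore {
        /// Insert or replace the embedding and metadata stored for `id`.
        fn upsert(&self, id: ItemId, embedding: &[f32], meta: &Meta) -> Result<()>;
        /// Remove `id` from the index.
        fn delete(&self, id: ItemId) -> Result<()>;
        /// Return up to `k` nearest hits for query vector `q`, subject to `filter`.
        fn search(&self, q: &[f32], k: usize, filter: &MetaFilter) -> Result<Vec<Hit>>;
        /// Discard the current index and rebuild it from the supplied items.
        fn rebuild(&self, iter: impl Iterator<Item = (ItemId, Embedding, Meta)>) -> Result<()>;
    }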
- Provenance filters (confirmed_only, agent_allowlist) run before final scoring.
- Future milestone may integrate Tantivy; implementations must conform to the same trait.
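Reciprocal Rank Fusion scores each item as the sum of 1 / (k + rank) over every ranking in which it appears, with ranks starting at 1. A minimal sketch follows; the function name rrf_fuse, the use of u64 as the item id, and the constant k = 60 in the usage note are assumptions (60 is a commonly used value, and the text does not state Runloop's choice).

```rust
use std::collections::HashMap;

/// Fuse several ranked id lists (best first) with Reciprocal Rank Fusion.
/// Returns (id, score) pairs sorted by descending fused score.
pub fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the top item contributes 1 / (k + 1).
            let rank = idx as f64 + 1.0;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

For example, fusing a vector ranking [1, 2, 3] with a keyword ranking [3, 1, 4] at k = 60 orders the ids 1, 3, 2, 4: items found by both searches outrank items found by only one.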
4. Packaging targets (informative)
4.1 Debian 13 (trixie) packages
- Assets live under packaging/systemd/ (systemd unit, tmpfiles definition, default config, README, maintainer scripts). cargo-deb consumes them directly; no in-tree debian/ directory is required.
- Build requirements: build-essential, cargo, rustc, pkg-config, libssl-dev, libsqlite3-dev, systemd, and cargo-deb (cargo install cargo-deb). The just deb recipe runs the three builds in sequence and rebuilds the canonical WASM bundles:

    just deb
    # equivalent to:
    just build-agents-wasm
    cargo deb -p runloopd
    cargo deb -p rlp
    cargo deb -p agtop

- Artifacts land under each crate's target/debian/ directory (e.g., crates/runloopd/target/debian/runloopd_0.1.0_amd64.deb). Install with sudo apt install crates/<crate>/target/debian/<pkg>_<ver>_<arch>.deb.
- runloopd package duties: install /usr/bin/runloopd, systemd service, tmpfiles definition, /etc/runloop/config.yaml (as a conffile), and docs. The maintainer scripts create the runloop system user, chown /var/lib/runloop and /var/log/runloop, call systemd-tmpfiles --create, run systemctl daemon-reload, and enable but do not start runloopd.service on a first-time install so operators can edit config before launching. They should also create /var/lib/runloop/agents and /var/lib/runloop/openings (owned by runloop:runloop) so rlp agent install --root /var/lib/runloop/agents is immediately usable. The package also ships the default agent bundles under /usr/lib/runloop/agents and the compose_email and smoke_exec openings under /etc/runloop/openings/compose_email.yaml and /etc/runloop/openings/smoke_exec.yaml. It seeds writable copies into /var/lib/runloop/{agents,openings} and refreshes them on upgrade only when they remain unmodified (tracked via a directory hash). Upgrades record whether the daemon was running before dpkg stopped it and automatically restart runloopd.service once the new bits are configured, so CLI/agent traffic resumes without operator intervention.
- The CLI (rlp) and monitor (agtop) ship as independent packages so they can be updated without restarting the daemon; they depend only on ca-certificates plus transitive Rust runtime libraries.
- Purging the daemon package (sudo apt purge runloopd) removes /etc/runloop, /var/lib/runloop, /var/log/runloop, and the runloop system user/group; CLI/TUI packages only drop their binaries/docs.
4.2 Additional artifacts
| Artifact | Location | Status |
|---|---|---|
| Live ISO | packaging/live-build/ | Folders exist; scripts TBD after .deb packaging. |
| Dev container | packaging/container/ | README tracks mounts, base image expectations. |
5. Trust policy & agent signatures (normative)
Runloop enforces signatures on agent bundles before install/launch.
Status: rlp agent install is available for local bundles and validates manifest digests + tools.json schema, but signature verification and rlp trust update are still landing with the packaging milestone. Edit trust policy files manually per the steps below. rlp agent list shows discovered bundles plus digest status.
- Algorithm: Ed25519 detached signature over manifest.toml (canonicalized) and referenced files.
- Bundle layout:
agent.bundle/
├─ manifest.toml # includes digests of contents
├─ policy.caps
├─ tools.json # optional host tool attachments
├─ agent.wasm
├─ schemas/… (optional)
├─ SBOM/spdx.json (optional)
└─ SIGNATURES/manifest.sig
- Tool attachments: When present, tools.json MUST follow the schema in docs/tool-attachments.md; its digest appears in manifest.toml so the signature covers attachment metadata alongside binaries.
- Trust policy file: ~/.runloop/trust-policy.toml
[anchors]
runloop_release = "ed25519:ABCD..."
# TOML 1.0 inline tables must stay on a single line.
dev = { key = "ed25519:DEAD...", allow_dev = true }

[rules]
runloop_release = { allow_caps = "any", allow_net = "any", allow_exec = false }
dev = { allow_caps = ["kb_read", "kb_write"], allow_net = [], allow_exec = false }
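One way such rules could be evaluated is sketched below. Everything here is an assumption for illustration: the type names, the reading of "any" as an unrestricted allowlist, and the treatment of allow_net entries as opaque strings. Enforcement of the trust policy is still pending, so this does not describe shipped behaviour.

```rust
/// Either the literal "any" or an explicit allowlist, as in the example rules.
pub enum Allow {
    Any,
    Only(Vec<String>),
}

impl Allow {
    fn permits(&self, item: &str) -> bool {
        match self {
            Allow::Any => true,
            Allow::Only(list) => list.iter().any(|allowed| allowed == item),
        }
    }
}

/// Hypothetical in-memory form of one [rules] entry.
pub struct Rule {
    pub allow_caps: Allow,
    pub allow_net: Allow,
    pub allow_exec: bool,
}

/// Check what a bundle requests against the rule for its signing anchor.
pub fn check_request(rule: &Rule, caps: &[&str], net: &[&str], exec: bool) -> Result<(), String> {
    if let Some(cap) = caps.iter().find(|c| !rule.allow_caps.permits(c)) {
        return Err(format!("capability not allowed: {cap}"));
    }
    if let Some(target) = net.iter().find(|t| !rule.allow_net.permits(t)) {
        return Err(format!("network target not allowed: {target}"));
    }
    if exec && !rule.allow_exec {
        return Err("exec not allowed".to_string());
    }
    Ok(())
}
```

Under the dev rule above, a bundle requesting only kb_read would pass, while one requesting any network access or exec would be rejected.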
- Lifecycle:
  - First-party releases signed with Runloop Release key (private material stored outside repo).
  - Third-party vendors sign with their key; operators add the corresponding anchor.
  - rlp trust update fetches keysets/CRLs.
  - Install flow: rlp agent install bundle.tar → verify digests → (signature verification pending) → enforce trust policy (pending) → stage bundle.
  - Launch flow re-verifies manifest + signature as defense in depth.
  - If security.allow_unsigned_agents=false, rlp agent install will refuse bundles until signature verification is implemented.
- Parameter schemas: agent manifests embed JSON Schemas under [schemas.with] so tooling can validate with payloads before execution. Each schema becomes part of the signed manifest; CLI and daemon consumers load them via the shared agent registry.
- Revocation: increment keyset version or publish revocation list; runtime refuses to start bundles signed by revoked keys.
6. Secrets backends (summary)
See docs/security-model.md for secret-store details. Ops tasks:
Status: rlp secrets ... tooling is being wired up; use your platform's secret store CLI until the native commands ship. Default provider is stub (in-memory, for dev) but it will consult environment variables first to preserve existing env-only setups. Prefer a real backend for anything sensitive.
- Planned: rlp secrets init --backend=secret-service|pass|age
- Planned: rlp secrets put runloop/mail/smtp_api_key (reads from stdin)
- Planned: rlp secrets list and rlp secrets delete for maintenance
7. Observability (summary)
- Default logging: JSON (ndjson) with keys ts, level, service.name, trace_id, opening_id, agent_id.
- Tracing & metrics via OpenTelemetry OTLP. Configure endpoint, protocol, and sampling under observability in config.
- Bus/TUI metrics snapshots: observability.metrics_interval_ms (default 1000, allowed 100–60000) controls how often runloopd publishes CT_METRICS_SNAPSHOT frames to rlp/sys/metrics and rlp/agents/<agent_id>/metrics (TTL = 2× interval, minimum interval+250 ms). System frames include queue depth/capacity, drop counters, and broker/hostcall totals; agent frames include RSS/CPU and mailbox depth, with a final zeroed snapshot on teardown.
- Model broker exports runloop_broker_calls_total, runloop_broker_cache_hits_total, and runloop_broker_errors_total{kind=*} counters for dashboards.
- agtop pane + rlp trace rely on the metrics exported by agents.
- Capability audit volume is gated by security.caps.audit_on_allow and security.caps.audit_on_deny; the latter defaults to true so denied hostcalls land in the KB as cap.audit events.
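The snapshot TTL arithmetic can be worked through in a few lines. This sketch assumes "TTL = 2× interval, minimum interval+250 ms" means the larger of the two values; the function name and the choice to reject (rather than clamp) out-of-range intervals are also assumptions.

```rust
/// Hypothetical helper: TTL in milliseconds for a metrics snapshot frame.
pub fn snapshot_ttl_ms(interval_ms: u64) -> Result<u64, String> {
    // observability.metrics_interval_ms must fall in the allowed range.
    if !(100..=60_000).contains(&interval_ms) {
        return Err(format!(
            "metrics_interval_ms {interval_ms} is outside the allowed range 100-60000"
        ));
    }
    // Twice the interval, but never less than interval + 250 ms.
    Ok((2 * interval_ms).max(interval_ms + 250))
}
```

At the default interval of 1000 ms the TTL is 2000 ms. The floor only matters below 250 ms: an interval of 100 ms gives 350 ms instead of 200 ms.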
8. Message bus topics (normative)
- Only UI/TUI processes may publish action.decision; the bus rejects other publishers and emits an audit event.
- The runtime publishes drop notices (DropNotice) on rlp/sys/drops whenever TTL expiry or duplicate suppression occurs. Operators should scrape this topic for reliability dashboards.
8.0 Control plane
- rlp/ctrl carries CT_CTRL_REQ and CT_CTRL_RESP. Submit requests use a 30s TTL; the CLI waits up to 2s for acceptance.
- After acceptance, the daemon publishes CT_RUN_EVENT to rlp/runs/<trace_id>/events. This is a live-only stream; historical events are persisted in the KB.
8.1 Bus publisher ACL (configuration)
Configure publisher kinds allowed to emit specific schemas:
bus:
  auth:
    publishers:
      action_decision:
        allowed_kinds: ["ui", "tui"]
Defaults permit only ui and tui. Publishers establish identity at connect
time (connect_as). runloopd validates the list at startup; unknown strings
or empty entries cause the daemon to fail fast so operators notice
misconfigurations.
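Startup validation and the publish-time check could look roughly like this. The set of known publisher kinds is hypothetical apart from ui and tui, which are the only kinds the text names; the function names are likewise illustrative.

```rust
/// Hypothetical set of recognised publisher kinds. Only "ui" and "tui"
/// come from the text; the others are placeholders.
const KNOWN_KINDS: &[&str] = &["ui", "tui", "agent", "cli"];

/// Validate an allowed_kinds list at startup: empty or unknown entries fail fast.
pub fn validate_allowed_kinds(kinds: &[&str]) -> Result<(), String> {
    for kind in kinds {
        if kind.is_empty() {
            return Err("empty publisher kind in allowed_kinds".to_string());
        }
        if !KNOWN_KINDS.iter().any(|known| known == kind) {
            return Err(format!("unknown publisher kind: {kind}"));
        }
    }
    Ok(())
}

/// Check whether a publisher that connected as `connect_as` may emit the schema.
pub fn may_publish(allowed_kinds: &[&str], connect_as: &str) -> bool {
    allowed_kinds.iter().any(|kind| *kind == connect_as)
}
```

Validating once at startup keeps the per-message check to a simple membership test.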
Appendix A. Repo admin checklist
Branch protection (owner: @release-eng)
- Protect main: require PRs, 1+ code owner review, dismiss stale reviews on changes.
- Require status checks: build, test, clippy, fmt, docs-check, commitlint.
- Require branch to be up to date before merging.
- Disallow force-push to main.
Security features (owner: @release-eng)
- Enable Dependabot alerts & updates.
- Enable secret scanning & push protection.
- Enable code scanning (CodeQL or equivalent).
Labels (owner: @pm)
- Create: bug, feature, task, docs, infra, security, design, good-first-issue, epic, phase:g.
CI secrets (owner: @release-eng)
- CRATES_IO_TOKEN (future), signing keys, release GPG key (optional).
Release gates (owner: @pm, @release-eng)
- Tag pattern v0.x.y.
- Required checks green.
- CHANGELOG updated.
- SBOM/signatures attached (when implemented).
Further reading: