Postgres DAG Operations Guide
Postgres with pgvector, pg_trgm, and pgcrypto is Randal's active source of truth for memory, chat, graph/DAG recall, mesh registry, jobs, checkpoints, annotations, delegation audit, and artifact metadata.
Quick Paths
| Path | Use when | First commands |
|---|---|---|
| New local install | You are starting on a laptop or dev box | randal setup, then cd any-project && opencode |
| Managed Postgres | You run on Railway, Supabase, Neon, RDS, Cloud SQL, or similar | Set DATABASE_URL, run randal db migrate --config <config>, then smoke test /health. |
| Database move | You need backup, restore, or copy | Back up first, then use the exact randal db dump/verify-dump/restore/copy/scoped-copy commands in Dump, Restore, And Copy Runbook. |
Expected healthy local status:
Postgres: ok
Schema: public (exists)
Readable: yes
Writable: yes
Migration version: 003
Extensions missing: none
Runtime memory store: postgres
Local Postgres Setup
Use plain Postgres plus pgvector locally. The local path does not require Supabase or randal serve.
- Run setup once from a stable Randal checkout.
randal setup - Start OpenCode from any project.
cd any-project opencode
randal setup starts the bundled Docker Postgres service for the default local profile, applies idempotent migrations, verifies write health, writes generated OpenCode MCP config, and symlinks ~/.config/opencode to the durable generated config. Generated MCP env includes the local database URL/profile/schema and the repo root, so direct OpenCode sessions can recover after a reboot if the local container is stopped.
Manual database commands remain useful for diagnostics:
- Start the bundled database service.
randal db start - Apply idempotent migrations.
randal db migrate - Confirm read/write health, migrations, and extensions.
randal db status --write-check - Start the gateway only for Console/API/channel mode.
randal serve
The compose service uses pgvector/pgvector:pg16. If port 5432 is already in use during default local setup, randal setup automatically selects another local host port, starts Docker Postgres with that port, verifies read/write database health on the selected port, and only then writes the selected URL into generated OpenCode MCP config. If the fallback container starts but the selected port is not reachable, setup fails instead of writing unusable MCP config. If you set an explicit non-default local port yourself, Randal honors it and surfaces startup errors normally. If authentication fails, rotate local credentials and re-run randal db status --write-check; diagnostics redact database credentials.
Docker Compose uses the deterministic project name randal for the bundled local Postgres stack, even when setup is run from a temporary worktree. Docker Desktop may show the service container as randal-postgres-1; this is intentional. Randal does not set exact container_name: randal because Docker container names are global, and an exact name would conflict with any unrelated user container already named randal.
When setup retries after a default 5432 bind failure, it only removes/recreates the postgres service in the randal Compose project. It does not stop or remove unrelated projects, including older worktree-derived projects such as postgres-dag-source-of-truth-pr.
Useful local conflict commands:
- Inspect the deterministic Randal project.
docker compose -p randal -f docker-compose.postgres.yml ps - Remove a stale Randal local Postgres service without deleting the volume.
docker compose -p randal -f docker-compose.postgres.yml rm --stop --force postgres - Remove an unrelated exact-name container only after confirming it is safe.
docker ps -a --filter name='^/randal$' docker rm randal
Local Reset Warning
randal db reset --force --confirm reset-postgres drops and recreates the configured schema. Back up first if the database has any data you need.
Configuration Reference
New installs should set memory.store: postgres and configure the database block or DATABASE_URL/RANDAL_DATABASE_URL.
memory:
store: postgres
database:
profile: local
url: "${DATABASE_URL}"
schema: public
ssl: disable
migrationMode: manual
orgId: "00000000-0000-4000-8000-000000000001"
projectId: "00000000-0000-4000-8000-000000000001"
pool:
min: 0
max: 10
idleTimeoutMillis: 30000
connectionTimeoutMillis: 5000
Use profile: generic for portable managed Postgres. Use profile: supabase only when enabling Supabase-specific overlays. Runtime roles should be least-privilege and scoped to the Randal schema; service-role credentials are for migrations/administration only and must not be printed in logs.
Identity fields:
| Field | Local default | Cloud use |
|---|---|---|
database.orgId | Singleton local org | Tenant organization ID |
database.projectId | Singleton local project | Project/workspace ID |
instanceId or generated instance identity | One local agent | Per-service or per-worker identity |
Artifact storage and retention are represented by artifact rows containing URI, hash, size, media type, retention class, and redaction status. Large object payloads may live outside Postgres; operators must copy those stores separately when moving databases.
Managed Postgres And Supabase
Randal's core SQL path is provider-portable. Required extensions are vector, pg_trgm, and pgcrypto.
Generic managed Postgres checklist:
- Provision Postgres and enable required extensions.
- Store
DATABASE_URLorRANDAL_DATABASE_URLas a secret. - Run
randal db migrate --config <config>with migration/admin credentials. - Run runtime with a least-privilege role that can read/write Randal tables and execute required extension-backed queries.
- Schedule provider backups and test
randal db dump/verify-dumpon a non-production target. - Run smoke tests with memory, chat, DAG recall, jobs, mesh status, and
/health.
Supabase-only optional overlays:
| Overlay | Purpose | Caveat |
|---|---|---|
| Auth/RLS | Tenant isolation enforced by policies | Requires tenant context and policy tests before production |
| Realtime | Push updates for selected tables | Do not rely on realtime for source-of-truth writes |
| Storage | Artifact payload storage | Signed URLs must be redacted and rotated |
Provider caveats: Neon may require connection-pool tuning for serverless suspend/resume; RDS and Cloud SQL may require extension/admin approval; Railway supplies managed Postgres but may restrict owner-level restore permissions, so use --no-owner --no-acl dump/restore paths.
Railway Deployment Guide
Deployment order:
- Provision Railway managed Postgres or attach a compatible external Postgres URL.
- Set
DATABASE_URLorRANDAL_DATABASE_URLas a Railway secret. - Set
RANDAL_REQUIRE_MEMORY=trueso startup and/healthfail closed when Postgres is unavailable. - Run migrations from a one-off shell or CI job.
randal db migrate --config randal.config.railway.yaml randal db status --config randal.config.railway.yaml --write-check - Deploy Randal with
memory.store: postgres. - Run smoke tests:
/health, memory add/search, chat log/search, job submit/status, mesh status, and a representative DAG recall.
Railway copy and backup examples:
randal db dump --file ./db-dumps/randal.dump --output-dir ./db-dumps
randal db verify-dump --file ./db-dumps/randal.dump
randal db restore --file ./db-dumps/randal.dump --force --confirm restore-postgres
randal db copy --target-url "$TARGET_DATABASE_URL" --force --confirm copy-postgres
randal db scoped-export --org-id "$ORG_ID" --project-id "$PROJECT_ID" --file ./project.json
randal db scoped-import --file ./project.json --force --confirm import-scoped-postgres
randal db scoped-copy --target-url "$TARGET_DATABASE_URL" --force --confirm copy-scoped-postgres
Never paste raw database URLs into tickets, logs, or docs. Use environment variables and redacted manifests. Railway permissions may block owner/ACL restore operations; Randal uses no-owner/no-acl flags for managed-database compatibility.
Postgres DAG Conceptual Guide
Randal stores durable entities as source tables and graph projection rows. The graph explains relationships and provenance; the source tables remain authoritative.
memory ──mentions──▶ project decision
│ │
│ derived_from │ verified_by
▼ ▼
job event ──produced──▶ artifact
│
└──delegated_to──▶ mesh instance
Graph nodes can represent memories, chat messages, summaries, jobs, job steps, artifacts, skills, instances, annotations, delegations, and operational findings. Graph edges encode relationships such as depends_on, derived_from, found_in, remediated_by, verified_by, supersedes, produced_artifact, and loose reference edges.
Strict DAG edges reject cycles so dependency/path explanations remain bounded. Loose/reference edges can represent cross-links that are useful for search but not part of strict dependency ordering. Traversal APIs must use depth and row limits; long-running rebuilds use resumable checkpoints.
Hybrid ranking combines FTS, vector similarity, trigram matching, recency, graph neighborhood signals, tenant scope, and reliability annotations. Tenant isolation is enforced by org/project/instance columns and, where enabled, optional provider RLS overlays.
Memory, Chat, Jobs, And Audit User Guide
Memory rows are scoped by org/project/instance/global fields and deduped by content hash. Chat messages persist by thread; summaries compress long threads; pending actions remain explicit until resolved. Job tables answer what ran, when it started, which steps/events/tool invocations occurred, which checkpoints were written, which artifacts were produced, and how a stop/failure/retry was handled.
User-facing expectations:
| Surface | What Postgres stores |
|---|---|
| Memory tools | Scoped memories, categories, sources, hashes, embeddings, metadata |
| Chat tools | Threads, messages, summaries, pending action state |
| Job views | Jobs, steps, event stream, tool summaries, checkpoints |
| Audit/delegation | Delegation records, routing explanation, redacted payloads |
| Artifacts | URI, content hash, byte size, media type, retention class, redaction status |
Agent Tool Behavior
Ordinary memory/chat interactions remain stable: users can ask Randal to remember something, search prior work, recover a chat thread, or list recent context without learning graph internals. Randal uses memory_search, memory_store, memory_recent, chat_search, chat_thread, chat_recent, and chat_log for that default flow.
Graph-aware context appears when the answer depends on structure or provenance. Users can ask for a graph trace, why a result was related, what prior job/artifact a decision came from, or which remediation verified a finding. Randal should answer with bounded path summaries and node/job/artifact IDs, not raw transcripts or unrelated neighborhoods.
Execution audit uses separate job-ledger tools. Users can ask what ran for a job, where a build stopped, which checkpoint is safe to resume from, or which artifacts were produced. Randal should summarize job_timeline, job_events, checkpoints, and artifact metadata with redaction intact.
Dump, Restore, And Copy Runbook
Full database/schema moves use pg_dump/pg_restore and are appropriate for backups, disaster recovery, or replacing an entire target. Scoped moves use application-level JSON snapshots and are appropriate for one org/project.
Full path:
- Back up the source and target.
- Run
randal db dump --file ./db-dumps/randal.dump --output-dir ./db-dumps. - Run
randal db verify-dump --file ./db-dumps/randal.dump. - Restore only into an empty, compatible, or known seed-only target.
- Use
--force --confirm restore-postgresfor destructive restore. - Verify row counts, checksums, migrations, extensions, search, graph traversal, and job timelines.
Scoped path:
- Export with
randal db scoped-export --org-id "$ORG_ID" --project-id "$PROJECT_ID" --file ./project.json. - Review redacted counts/checksums.
- Import with
randal db scoped-import --file ./project.json --force --confirm import-scoped-postgres. - Use
randal db scoped-copy --target-url "$TARGET_DATABASE_URL" --force --confirm copy-scoped-postgresonly when source and target compatibility has been checked.
Dump manifests include redacted database identity, migration version, extension posture, row counts, checksums, artifact references, and redaction status. Artifact payloads outside Postgres must be copied separately based on their URI/hash inventory.
Troubleshooting
| Symptom | Check | Fix |
|---|---|---|
| Cannot connect | randal db status --write-check | Verify secret, host, port, SSL mode, network allowlist |
| Missing extension | Status shows extension missing | Enable vector, pg_trgm, pgcrypto, then rerun migrations |
| Migration drift | Status or restore reports version mismatch | Stop writes, inspect schema_migrations, restore compatible backup or migrate target |
| Pool exhaustion | Gateway latency and DB max connections high | Lower worker concurrency or tune pool/provider limits |
| Failed writes | Job or memory operations error | Treat as source-of-truth failure; do not silently fall back to legacy search |
| Lexical degradation | FTS/trigram results weak | Rebuild indexes/projections and confirm extensions |
| Embedding degradation | Vector ranking absent | Check embedding provider and stored vector dimensions; FTS still works |
| RLS failures | Cloud-only permission errors | Verify tenant context, policies, runtime role grants, and service-role migration separation |
Security, Redaction, And Retention
Stored data includes memory content, chat content, summaries, job metadata, redacted tool args/results, delegation payload summaries, annotations, graph relationships, artifact metadata, and retention/redaction status. Operators must protect database credentials, service-role keys, API tokens, artifact stores, backups, and migration snapshots.
Randal redacts database URLs, password/token-like fields, service credentials, signed artifact URLs, and secret-like payloads in manifests, health output, migration reports, and tool errors. Redaction is defense-in-depth; do not put raw secrets in docs, tickets, screenshots, or chat messages.
Retention classes include ephemeral, standard, audit, and legal-hold. Artifact/transcript payloads may be retained outside Postgres, but their metadata and redaction status remain queryable. Exports can include metadata for externally stored payloads; copy the external payload store separately when required.
Mesh And Posse Operations
The mesh registry is Postgres-backed. Instances write identity, role, expertise, capabilities, model availability, health, active job count, heartbeat timestamps, routing inputs, reliability annotations, delegation records, and delegation events into tenant-scoped rows.
HTTP delegation remains the transport for MVP peer work. Postgres records the registry and audit trail; it is not a hidden permanent queue. Future DB-backed queues must preserve idempotency keys, redaction, retention, and routing explanation IDs.
Stale detection compares heartbeat timestamps and health state. Routing considers expertise, reliability, load, model match, tenant scope, and recent delegation outcomes. Delegation audit rows should explain why a peer was selected without storing raw secrets or full private transcripts.
Release Notes And Operator Checklist
Before release:
- Implementation steps 1-10 are committed and reviewed.
- Step 11 docs inventory and static checks pass.
- Migrations apply to an empty pgvector database.
- Local smoke passes with Postgres.
- Railway deployment has managed Postgres, env vars, migrations, and smoke tests documented.
- Dump/restore/copy commands have redaction, compatibility, checksum/count, and destructive confirmation coverage.
- Security docs state what is stored, redacted, retained, exported, and protected.
- Operator signs off cleanup gates: backups tested, rollback window defined, and docs examples checked.