PostgreSQL as Source of Truth
chris-os uses one PostgreSQL database for everything. Not “one database per service.” Not “one database for the important stuff.” One database. 210 tables across 5 schemas, serving every component in the stack: n8n workflow state, Authelia sessions, Grafana metadata, Home Assistant recorder history, semantic memory vectors, and all personal data.
This is a deliberate choice with real consequences, both good and bad.
Why One Database
Section titled “Why One Database”Cross-domain queries are trivial. When the morning briefing workflow needs to correlate yesterday’s calendar events with health metrics and message patterns, that is a single SQL query with joins. In a multi-database architecture, it would be three API calls, data marshaling, and application-level joining.
Schema evolution is atomic. A migration that adds a column to messages and creates a new view joining messages with calendar_events runs in a single transaction. Either both changes apply or neither does. Cross-database schema changes cannot make that guarantee.
Backups are simple. One pg_dump captures the entire system state. Restore it, and every service comes back with consistent data. Multi-database backups require coordinating dumps across services to avoid inconsistencies.
Monitoring is centralized. One postgres-exporter scrapes one database. Connection counts, query latency, table sizes, and WAL stats for the entire system appear in one Grafana dashboard.
The cost is coupling. If PostgreSQL goes down, everything goes down. But for a single-user system, this is an acceptable tradeoff. The database is the most monitored, most backed-up, and most protected component in the stack. It gets 6GB of the Pi’s 16GB RAM and 2 of its 4 CPU cores.
Schemas
Section titled “Schemas”Five schemas organize the 210 tables by domain and access pattern:
| Schema | Purpose | Primary Role |
|---|---|---|
public | All personal data: messages, health, calendar, contacts, transactions, workouts, documents, config | app |
memory | Semantic memory server: memories, embeddings, entities, relations, clusters | app |
therapy | Clinical and therapy data (Tier 1, isolated) | therapy_app |
n8n | n8n workflow execution state | n8n_role |
ha | Home Assistant recorder history | ha_role |
Schemas provide logical separation within the single database. The therapy schema is fully isolated: only the dedicated therapy role can access it, and that role cannot access any other schema. This enforces the Tier 1 privacy boundary at the database level.
The Canonical View Pattern
Section titled “The Canonical View Pattern”When the same kind of data lives in multiple tables (because it comes from different sources), chris-os uses canonical views to present a unified interface.
The primary example is all_messages, a view created in migration 216 that unions three message sources:
| Source Table | Platform | Key Fields |
|---|---|---|
messages | iMessage | 367K messages, direction (‘inbound’/‘outbound’) |
social_messages | Via WAHA webhook ingestion | |
discord_messages | Discord | Via discord-bot direct writes |
The all_messages view normalizes these into a common schema: sender, recipient, body, timestamp, platform. Any analytics that need cross-platform message data query the view, never the individual tables.
Timezone Discipline
Section titled “Timezone Discipline”Time handling in a database that stores data from multiple sources and displays it in a specific timezone is a common source of bugs. chris-os solves this with a strict rule enforced by convention and code review:
| Column Type | Default Function | Use Case |
|---|---|---|
TIMESTAMPTZ | NOW() | Event timestamps, created_at, updated_at. Stores UTC, renders in any timezone. |
TIMESTAMP (no tz) | pacific_now() | Display-oriented timestamps where Pacific time is the intended meaning. |
DATE | pacific_today() | Calendar dates in Pacific time. |
The critical rules:
- Never use
pacific_now()for aTIMESTAMPTZcolumn. The function returns a bare timestamp; inserting it into a TIMESTAMPTZ column causes PostgreSQL to assume UTC, producing a time that is 7-8 hours wrong. - Never use
CURRENT_DATEfor aDATEcolumn.CURRENT_DATEreturns the date in the server’s timezone (UTC on most deployments). After midnight UTC but before midnight Pacific, it returns tomorrow’s date.pacific_today()returns the correct Pacific date.
These rules are documented in migration 181 and enforced through code review. There is no runtime enforcement; it relies on discipline.
The Migration System
Section titled “The Migration System”chris-os manages schema evolution through numbered migration files. 302 migrations have been applied as of the current state. Each migration is a SQL file that modifies the database schema.
Key properties:
- Append-only. Migrations are never modified after they are applied. If a migration was wrong, a new migration corrects it.
- Sequential. Migrations are numbered (000, 001, … 302) and applied in order. The
schema_migrationstable tracks which migrations have been applied. - Baseline-consolidated. Migrations 000 through 002 are consolidated baselines that establish the initial schema. Everything after 002 is incremental.
- Idempotent where possible. Many migrations use
CREATE TABLE IF NOT EXISTS,ALTER TABLE ... ADD COLUMN IF NOT EXISTS, or similar guards. But this is a convention, not a guarantee.
Migrations are applied via make db-migrate (local Mac) or make pi-db-migrate (production Pi). The migration runner checks schema_migrations, finds unapplied migrations, and runs them in order within a transaction.
Role-Based Access
Section titled “Role-Based Access”Five database roles enforce the principle of least privilege. No application connects as the superuser.
| Role | SELECT | INSERT | UPDATE | DELETE | DDL | Used By |
|---|---|---|---|---|---|---|
superuser | All | All | All | All | All | Admin tasks only (manual psql) |
app | All | All | All | All | No | Dashboard API, n8n, discord-bot, memory server |
readonly | All | No | No | No | No | Dashboard reads, reporting, postgres-exporter |
remote_rw | All | Approved tables | Approved tables | No | No | Remote MCP access (Claude) |
therapy_app | therapy.* | therapy.* | therapy.* | therapy.* | No | Therapy-specific applications |
The remote_rw role deserves special attention. It is the role that AI assistants use when they access the database through MCP. It can SELECT from any table in the public schema (so Claude can answer questions about your data), and it can INSERT and UPDATE on approved tables (so Claude can store memories and update documents). But it cannot DELETE rows, cannot modify the schema, and cannot access the therapy schema at all.
This role is the architectural boundary between “what AI can see” and “what AI can do.” Expanding its permissions is a deliberate decision, not an accident.
pgvector and Semantic Search
Section titled “pgvector and Semantic Search”PostgreSQL is not just a relational database in chris-os. With the pgvector extension, it also serves as a vector database for the semantic memory server.
The memory server stores embeddings as 4096-dimensional vectors generated by the qwen3-embedding:8b model running on Atlas. These vectors enable similarity search: “find memories related to this query” translates to a cosine distance calculation against the vector index.
The memory schema uses binary-quantized HNSW indexes (migration 261) for fast approximate nearest-neighbor search at scale. A two-stage retrieval process first filters candidates via the quantized index, then reranks using full-precision vectors. This keeps memory footprint low while maintaining search quality.
Hybrid search combines three signals via Reciprocal Rank Fusion (RRF): vector similarity (semantic meaning), full-text search via tsvector/GIN indexes (exact keyword matching), and recency (newer memories rank higher). The combination outperforms any single signal alone.