Multi-Tenancy Design (Memory API v1, Phase 3)

Status: Design draft, 2026-05-07. Owner: Memory API pivot. No code yet.

The Memory API exposes durable memory + grounded citations to many agents. Single-tenant is the wrong abstraction for "the layer Claude Code, Cursor, Codex and ChatGPT plug into" — every consumer needs an isolated workspace with its own vault, decisions, memory, and event log. This document scopes the workspace model, the threading strategy, the auth surface, and the migration path from today's single-tenant install.

Goals

  1. Tenant isolation — vault, knowledge DB, memory DB, vector collections, and event log are scoped per workspace. A query in workspace A cannot see a chunk from workspace B.
  2. Cheap to run — a single sb serve process serves N workspaces. We do not require N daemons.
  3. Migration-clean — existing single-tenant installs upgrade transparently to workspace=default.
  4. Audit-clean — every event log entry, decision record, and trajectory carries a workspace_id that can never be forged by a header.
  5. API-clean — workspace identity is determined by the bearer token, not by request body. Tokens cannot cross workspaces.

Non-goals (v1)

  • Cross-workspace federation — no "search across all my workspaces" yet.
  • Per-workspace quotas / rate limits — observable in the event log; not enforced in v1. Phase 5 (eval as product) brings quotas.
  • Per-actor permissions inside a workspace — the actor header stays attribution-only for v1. Role-based access is post-v1.

Storage map (from 2026-05-07 survey)

Two roots gate everything:

| Root | What lives there | Resolver |
|---|---|---|
| state_dir | 6 SQLite DBs (runtime, memory, knowledge, work, travel, antahkarana) + Chroma collections | brain.config.get_state_dir() / brain.db.topology.resolve_db_path() |
| vault_dir | Markdown vault (00_inbox, 01_projects, 03_decisions, …) | brain.config.get_vault_dir() |

Critical insight from the survey: there are 1,428+ direct call sites that open the knowledge DB and 509+ direct vault path consumers. Threading a workspace_id parameter through all of them would be a multi-month rewrite. We don't need to. We can scope at the resolver layer instead.

Architecture: ContextVar-scoped resolvers

HTTP middleware                Resolvers                       Storage
┌──────────────────┐          ┌──────────────────┐           ┌─────────────┐
│ AuthMiddleware   │  ──set──▶│ get_state_dir()  │  ──reads──▶│ state/<ws>/ │
│  - validates     │          │ get_vault_dir()  │  ──reads──▶│ vault/<ws>/ │
│    token         │          │ resolve_db_path()│           │             │
│  - sets          │          │ get_settings()   │           │             │
│    workspace_ctx │          └──────────────────┘           └─────────────┘
└──────────────────┘                  ▲
                          (read by all 1,428 call sites,
                           which need NO change)

A contextvars.ContextVar[str] named workspace_ctx holds the active workspace_id for the current request. The resolvers read it. Every existing call site that goes through resolve_db_path() or get_vault_dir() automatically sees the right path — no code change.

Sites that bypass the resolvers (e.g., hardcoded Path("vault") or direct sqlite3.connect("/path/runtime.db")) need patching, but the survey suggests this is < 50 sites — a tractable cleanup.
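
To make the resolver-layer scoping concrete, here is a minimal sketch. It assumes the roots are plain relative paths, that resolve_db_path() takes a logical DB name, and that an unset context maps to "default"; the real resolvers in brain.config and brain.db.topology may differ on all three points.

```python
from contextvars import ContextVar
from pathlib import Path

# Assumption: fixed relative roots; the real resolvers read configuration.
STATE_ROOT = Path("state")
VAULT_ROOT = Path("vault")

# Active workspace_id for the current request or CLI command.
# Assumption: an unset context resolves to the "default" workspace.
workspace_ctx: ContextVar[str] = ContextVar("workspace_ctx", default="default")


def get_state_dir() -> Path:
    """Return state/<workspace_id>/ for the active workspace."""
    return STATE_ROOT / workspace_ctx.get()


def get_vault_dir() -> Path:
    """Return vault/<workspace_id>/ for the active workspace."""
    return VAULT_ROOT / workspace_ctx.get()


def resolve_db_path(name: str) -> Path:
    """Map a logical DB name such as "knowledge" to its per-workspace file."""
    return get_state_dir() / f"{name}.db"


def knowledge_db_call_site() -> Path:
    # Stand-in for an existing call site: it only calls the resolver,
    # so it needs no workspace parameter.
    return resolve_db_path("knowledge")


# Usage: scope one unit of work to a workspace, then restore the context.
reset_token = workspace_ctx.set("team-platform")
try:
    scoped_path = knowledge_db_call_site()
finally:
    workspace_ctx.reset(reset_token)
unscoped_path = knowledge_db_call_site()
```

The call site is identical in both cases; only the ContextVar value changes which directory it resolves to.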

Workspace data model

from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class Workspace:
    workspace_id: str                # slug, e.g. "team-platform"
    display_name: str
    state_dir: Path                  # state/<workspace_id>/
    vault_dir: Path                  # vault/<workspace_id>/
    created_at: datetime
    metadata: dict[str, Any]         # tags, owner email, etc.

Workspaces are stored in a registry DB (single global SQLite), separate from per-workspace state:

state/
  _registry.db                  # Workspace catalog + token bindings (global)
  default/                      # Today's single-tenant data, auto-renamed
    runtime.db
    memory.db
    knowledge.db
    work.db
    travel.db
    antahkarana.db
    chroma/
  team-platform/                # New workspace
    runtime.db
    ...
vault/
  default/                      # Today's vault
    00_inbox/
    01_projects/
    ...
  team-platform/
    ...

Token model

Authorization: Bearer ws_<random_32_bytes>

Tokens are SHA-256-hashed in the registry DB; the prefix ws_ is the only human-readable part. A token row binds:

(token_hash, workspace_id, allowed_actor_ids[], created_at, last_used_at,
 revoked_at, scopes[])

x-sb-actor-id is auth-bound: if the header value is not in allowed_actor_ids, the request is rejected with 403. (Today's single optional actor header is attribution-only — that's a security gap closed by this change.)
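
The token binding can be sketched with the standard library alone. Everything below is illustrative: the table name, the JSON encoding of allowed_actor_ids and scopes, the hex encoding of the random bytes, and the helper names issue_token, bind_token and lookup are assumptions, not the registry's actual schema or API.

```python
import hashlib
import json
import secrets
import sqlite3
from datetime import datetime, timezone
from typing import Optional

TOKEN_PREFIX = "ws_"


def issue_token() -> str:
    """Return "ws_" plus 32 random bytes (hex-encoded here, an assumption)."""
    return TOKEN_PREFIX + secrets.token_hex(32)


def token_hash(token: str) -> str:
    """SHA-256 hex digest; only this value is stored, never the token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def init_registry(conn: sqlite3.Connection) -> None:
    # Hypothetical schema mirroring the binding tuple in the text.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tokens (
               token_hash        TEXT PRIMARY KEY,
               workspace_id      TEXT NOT NULL,
               allowed_actor_ids TEXT NOT NULL,
               created_at        TEXT NOT NULL,
               last_used_at      TEXT,
               revoked_at        TEXT,
               scopes            TEXT NOT NULL DEFAULT '[]')"""
    )


def bind_token(conn: sqlite3.Connection, token: str, workspace_id: str,
               actors: list[str]) -> None:
    """Store the hash of a freshly issued token with its workspace binding."""
    conn.execute(
        "INSERT INTO tokens (token_hash, workspace_id, allowed_actor_ids, created_at)"
        " VALUES (?, ?, ?, ?)",
        (token_hash(token), workspace_id, json.dumps(actors),
         datetime.now(timezone.utc).isoformat()),
    )


def lookup(conn: sqlite3.Connection, token: str) -> Optional[tuple[str, list[str]]]:
    """Return (workspace_id, allowed_actor_ids) for a live token, else None."""
    row = conn.execute(
        "SELECT workspace_id, allowed_actor_ids FROM tokens"
        " WHERE token_hash = ? AND revoked_at IS NULL",
        (token_hash(token),),
    ).fetchone()
    if row is None:
        return None
    return row[0], json.loads(row[1])


# Usage against an in-memory registry.
conn = sqlite3.connect(":memory:")
init_registry(conn)
token = issue_token()
bind_token(conn, token, "team-platform", ["claude-code"])
binding = lookup(conn, token)
unknown = lookup(conn, issue_token())
```

Because the workspace is read from the row keyed by the token hash, a caller has no field in the request through which to name a different workspace.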

CLI for token management:

sb workspace create <slug> [--display-name "..."]
sb workspace list
sb workspace token issue --workspace <slug> [--actor <id>]
sb workspace token revoke <token_id>

Middleware

Single FastAPI middleware, runs first:

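from fastapi.responses import JSONResponse

class WorkspaceMiddleware:
    # Register with app.middleware("http")(WorkspaceMiddleware()).
    async def __call__(self, request, call_next):
        token = _extract_bearer(request)
        if token is None:
            # Return responses directly: an HTTPException raised inside
            # middleware bypasses the exception handlers and becomes a 500.
            return JSONResponse({"detail": "missing workspace token"}, status_code=401)
        binding = registry.lookup(token_hash(token))
        if binding is None or binding.revoked_at:
            return JSONResponse({"detail": "invalid or revoked token"}, status_code=401)
        actor = request.headers.get("x-sb-actor-id")
        if actor and actor not in binding.allowed_actor_ids:
            return JSONResponse({"detail": "actor not allowed for this token"}, status_code=403)
        ws_reset = workspace_ctx.set(binding.workspace_id)
        actor_reset = actor_var.set(actor or binding.allowed_actor_ids[0])
        try:
            return await call_next(request)
        finally:
            # Restore both ContextVars so nothing leaks past this request.
            actor_var.reset(actor_reset)
            workspace_ctx.reset(ws_reset)

Resolvers consult workspace_ctx.get() to compute paths.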
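
The set/reset pattern relies on each asyncio task running in its own copy of the context. The sketch below checks that property with two overlapping fake requests; handle_request is a hypothetical stand-in for the middleware plus a handler, not part of the codebase.

```python
import asyncio
from contextvars import ContextVar

workspace_ctx: ContextVar[str] = ContextVar("workspace_ctx")


async def handle_request(workspace_id: str, delay: float) -> str:
    """Set the workspace, yield to the event loop, then read it back."""
    reset_token = workspace_ctx.set(workspace_id)
    try:
        # Sleeping lets the other request run while this one is suspended.
        await asyncio.sleep(delay)
        return workspace_ctx.get()
    finally:
        workspace_ctx.reset(reset_token)


async def main() -> list[str]:
    # gather() wraps each coroutine in a Task; each Task gets a copy of the
    # current context, so the two set() calls cannot see each other.
    return await asyncio.gather(
        handle_request("default", 0.02),
        handle_request("team-platform", 0.01),
    )


results = asyncio.run(main())
# The outer context was never modified by either task.
leaked = workspace_ctx.get(None)
```

Each request reads back its own workspace even though the second finishes while the first is still suspended, and nothing is left set once both complete.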

Migration from single-tenant

One-time migration on first launch after upgrade:

  1. Detect legacy layout (state files at state/runtime.db rather than state/<ws>/runtime.db).
  2. Move state/* → state/default/.
  3. Move vault/* (except vault/_* system dirs) → vault/default/.
  4. Initialize state/_registry.db with one workspace default.
  5. Issue an initial token via existing SB_SERVE_TOKEN env (so existing bearers keep working).
  6. Print: "Multi-tenant mode enabled. Existing data is now in workspace default. Issue additional workspace tokens with sb workspace token issue."

The migration is idempotent and reversible by symlink (we keep the original directory locations as symlinks to default/ for one release cycle to ease rollback).
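
A rough sketch of the detection and state-move steps, assuming runtime.db is the legacy marker and that underscore-prefixed entries such as _registry.db stay global. It deliberately omits the vault move, the registry setup and the rollback symlinks; migrate_state is a hypothetical helper name.

```python
import shutil
import tempfile
from pathlib import Path

LEGACY_MARKER = "runtime.db"


def is_legacy_layout(state_root: Path) -> bool:
    """Legacy installs keep runtime.db directly under state/."""
    return (state_root / LEGACY_MARKER).is_file()


def migrate_state(state_root: Path, workspace_id: str = "default") -> list[str]:
    """Move legacy entries into state/<workspace_id>/ and return their names.

    Re-running is a no-op: once the marker has moved, no legacy layout
    is detected.
    """
    if not is_legacy_layout(state_root):
        return []
    target = state_root / workspace_id
    target.mkdir(exist_ok=True)
    entries = [
        entry for entry in sorted(state_root.iterdir())
        if entry != target
        and not entry.name.startswith("_")   # assumption: global files stay put
        and entry.name != LEGACY_MARKER
    ]
    # Move the marker last so an interrupted run is still detected as legacy.
    entries.append(state_root / LEGACY_MARKER)
    moved = []
    for entry in entries:
        shutil.move(str(entry), str(target / entry.name))
        moved.append(entry.name)
    return moved


# Usage on a throwaway directory.
root = Path(tempfile.mkdtemp()) / "state"
root.mkdir()
for name in ("runtime.db", "memory.db", "_registry.db"):
    (root / name).write_text("")
(root / "chroma").mkdir()
first_run = migrate_state(root)
second_run = migrate_state(root)
```

Moving the marker file last is one cheap way to keep the detection step honest after a crash; it is not a substitute for the copy-then-flip approach listed in the risk register.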

Feature flag rollout

Phase 3 lands behind SB_MULTI_TENANT=1. Default off. With the flag off:

  • No middleware, no registry DB.
  • Resolvers fall back to the original single-tenant behavior.
  • Existing test suite continues to pass without changes.

Once the flag flips to default-on (Phase 3.5), the migration runs on first launch.

Threading audit (the 1,428 call sites)

The survey identified 5 storage layers. Here is the per-layer plan:

| Layer | Sites | Action |
|---|---|---|
| SQLite via resolve_db_path() | ~1,400 | No change needed: resolver reads ContextVar |
| Chroma via VectorStore.__init__() | ~30 primary + 3 secondary | Cache by workspace_id; lazy per-workspace clients |
| Vault paths via get_vault_dir() | ~509 | No change needed: resolver reads ContextVar |
| Direct hardcoded paths (bypass resolvers) | ~30–50 | Manual sweep: replace with resolver calls |
| Event log | 10 instantiations | Single resolver change: EventLog(get_runtime_db_path()) |

Estimated PR count: 4–6 (registry, middleware, VectorStore caching, hardcoded-path sweep, migration script, CLI). All small enough to review individually.
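
For the VectorStore row, a per-workspace client cache might look like the sketch below. The class name, the factory callback, and the 900-second idle window are assumptions; the cap of 16 comes from the risk register. The sketch is not thread-safe and does not close evicted clients.

```python
import time
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class WorkspaceClientCache(Generic[T]):
    """LRU cache of lazily created per-workspace clients with idle eviction."""

    def __init__(self, factory: Callable[[str], T], max_size: int = 16,
                 idle_seconds: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._factory = factory
        self._max_size = max_size
        self._idle_seconds = idle_seconds
        self._clock = clock
        # workspace_id -> (client, last_used); most recently used at the end.
        self._entries: OrderedDict = OrderedDict()

    def get(self, workspace_id: str) -> T:
        now = self._clock()
        self._evict_idle(now)
        if workspace_id in self._entries:
            client, _ = self._entries.pop(workspace_id)
        else:
            client = self._factory(workspace_id)  # created on first use
        self._entries[workspace_id] = (client, now)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # drop the least recently used
        return client

    def _evict_idle(self, now: float) -> None:
        stale = [ws for ws, (_, used) in self._entries.items()
                 if now - used > self._idle_seconds]
        for ws in stale:
            del self._entries[ws]

    def __contains__(self, workspace_id: str) -> bool:
        return workspace_id in self._entries

    def __len__(self) -> int:
        return len(self._entries)


# Usage: a cap of 2 makes the LRU behaviour visible.
cache = WorkspaceClientCache(factory=lambda ws: f"client:{ws}", max_size=2)
cache.get("a")
cache.get("b")
cache.get("a")   # "a" becomes the most recently used
cache.get("c")   # exceeds the cap, so "b" is evicted
```

A VectorStore constructor could then call cache.get(workspace_ctx.get()) instead of building a client directly, which keeps the roughly 30 call sites unchanged.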

Auditability invariants

Every persisted artifact (event_log row, decision_catalog row, memory row, ContextItem) gains a workspace_id column. The constraint is enforced at the storage layer:

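def write(...):
    # .get(None) avoids a LookupError when the ContextVar was never set.
    ws = workspace_ctx.get(None)
    if not ws:
        # Explicit raise: a bare assert is stripped under python -O.
        raise RuntimeError("no workspace context: refusing to persist")
    conn.execute("INSERT INTO ... (workspace_id, ...) VALUES (?, ...)", (ws, ...))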

If workspace_ctx is unset (CLI path with no middleware), the writer falls back to workspace_id = "default" and logs a warning. CLI paths set the ContextVar at command entry from --workspace flag or SB_WORKSPACE env.
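
One way to read the CLI behaviour is that the fallback is applied once at command entry, so the storage layer always sees a workspace. The sketch below assumes that reading and assumes the flag takes precedence over the environment variable; resolve_cli_workspace and main are hypothetical names.

```python
import argparse
import logging
import os
from contextvars import ContextVar
from typing import Mapping, Optional, Sequence

log = logging.getLogger("sb.workspace")
workspace_ctx: ContextVar[str] = ContextVar("workspace_ctx")


def resolve_cli_workspace(flag_value: Optional[str], env: Mapping[str, str]) -> str:
    """Pick the workspace: --workspace, then SB_WORKSPACE, then "default"."""
    if flag_value:
        return flag_value
    if env.get("SB_WORKSPACE"):
        return env["SB_WORKSPACE"]
    log.warning("no workspace given; falling back to 'default'")
    return "default"


def main(argv: Optional[Sequence[str]] = None,
         env: Optional[Mapping[str, str]] = None) -> str:
    parser = argparse.ArgumentParser(prog="sb")
    parser.add_argument("--workspace", default=None)
    # parse_known_args leaves subcommand arguments untouched in this sketch.
    args, _rest = parser.parse_known_args(argv)
    workspace_id = resolve_cli_workspace(
        args.workspace, os.environ if env is None else env
    )
    reset_token = workspace_ctx.set(workspace_id)
    try:
        # A real command would run here; every write it makes would read
        # workspace_ctx. The sketch just returns the value it would see.
        return workspace_ctx.get()
    finally:
        workspace_ctx.reset(reset_token)


from_flag = main(["--workspace", "team-platform"], {})
from_env = main([], {"SB_WORKSPACE": "ops"})
fallback = main([], {})
```

Passing env explicitly keeps the resolution logic testable without touching the process environment.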

Open questions

  1. Per-workspace embedding model? Different workspaces might prefer different embedders (e.g., one uses Voyage, another uses bge-m3). Likely yes for v1.1; v1 keeps a single global model.
  2. Trajectory storage cross-workspace? A grounded trajectory is the single most-audited artifact. Probably co-located with the workspace that ran it.
  3. Decision namespace ownership? Today namespace keys are global (travel.*, support.meeting.*). Per-workspace namespace might be useful but complicates federation. Out of scope for v1.

Implementation tasks (not yet ticketed)

  • T-301 Registry DB schema + migrations
  • T-302 ContextVar plumbing in resolvers (state_dir, vault_dir, db_path)
  • T-303 WorkspaceMiddleware + token validation
  • T-304 VectorStore per-workspace caching
  • T-305 Hardcoded-path sweep
  • T-306 Migration script + rollback symlinks
  • T-307 CLI: sb workspace create|list|token issue|revoke
  • T-308 workspace_id columns on event_log, decision_catalog, memory tables
  • T-309 Tests: workspace isolation, token rejection across workspaces, middleware enforcement, migration idempotency

Risk register

  • Risk: ContextVar leakage across async tasks. Mitigation: always use with / try/finally to reset; test with concurrent requests.
  • Risk: Hardcoded paths in 3rd-party-integration code (Notion sync, Apple Mail) reach into vault/ directly. Mitigation: sweep + add a pre-commit grep for raw vault/ strings in brain/.
  • Risk: Migration corrupts data if interrupted. Mitigation: migration is two-phase (copy then symlink-flip); failure leaves originals intact.
  • Risk: Per-workspace Chroma clients balloon memory. Mitigation: LRU cache with idle eviction; default cap of 16 active workspaces per process.