Multi-Tenancy Design (Memory API v1, Phase 3)
Status: Design draft, 2026-05-07. Owner: Memory API pivot. No code yet.
The Memory API exposes durable memory + grounded citations to many agents. Single-tenant is the wrong abstraction for "the layer Claude Code, Cursor, Codex and ChatGPT plug into" — every consumer needs an isolated workspace with its own vault, decisions, memory, and event log. This document scopes the workspace model, the threading strategy, the auth surface, and the migration path from today's single-tenant install.
Goals
- Tenant isolation: vault, knowledge DB, memory DB, vector collections, and event log are scoped per workspace. A query in workspace A cannot see a chunk from workspace B.
- Cheap to run: a single `sb serve` process serves N workspaces. We do not require N daemons.
- Migration-clean: existing single-tenant installs upgrade transparently to `workspace=default`.
- Audit-clean: every event log entry, decision record, and trajectory carries a `workspace_id` that can never be forged by a header.
- API-clean: workspace identity is determined by the bearer token, not by request body. Tokens cannot cross workspaces.
Non-goals (v1)
- Cross-workspace federation — no "search across all my workspaces" yet.
- Per-workspace quotas / rate limits — observable in the event log; not enforced in v1. Phase 5 (eval as product) brings quotas.
- Per-actor permissions inside a workspace — the actor header stays attribution-only for v1. Role-based access is post-v1.
Storage map (from 2026-05-07 survey)
Two roots gate everything:
| Root | What lives there | Resolver |
|---|---|---|
| `state_dir` | 6 SQLite DBs (runtime, memory, knowledge, work, travel, antahkarana) + Chroma collections | `brain.config.get_state_dir()` / `brain.db.topology.resolve_db_path()` |
| `vault_dir` | Markdown vault (00_inbox, 01_projects, 03_decisions, …) | `brain.config.get_vault_dir()` |
Critical insight from the survey: there are 1,428+ direct call sites that
open the knowledge DB and 509+ direct vault path consumers. Threading a
workspace_id parameter through all of them would be a multi-month rewrite.
We don't need to. We can scope at the resolver layer instead.
Architecture: ContextVar-scoped resolvers
HTTP middleware                 Resolvers                       Storage
┌─────────────────────┐         ┌───────────────────┐           ┌─────────────┐
│ WorkspaceMiddleware │ ──set──▶│ get_state_dir()   │ ──reads──▶│ state/<ws>/ │
│  - validates token  │         │ get_vault_dir()   │ ──reads──▶│ vault/<ws>/ │
│  - sets             │         │ resolve_db_path() │           │             │
│    workspace_ctx    │         │ get_settings()    │           │             │
└─────────────────────┘         └───────────────────┘           └─────────────┘
                                          ▲
                                          │
                           (read by all 1,428 call sites,
                            which need NO change)
A contextvars.ContextVar[str] named workspace_ctx holds the active
workspace_id for the current request. The resolvers read it. Every existing
call site that goes through resolve_db_path() or get_vault_dir()
automatically sees the right path — no code change.
Sites that bypass the resolvers (e.g., hardcoded Path("vault") or direct
sqlite3.connect("/path/runtime.db")) need patching, but the survey
suggests this is < 50 sites — a tractable cleanup.
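A minimal sketch of what resolver-level scoping could look like. The root constants, the `use_workspace` helper, and the `resolve_db_path(name)` signature are assumptions made for illustration; the real resolvers live in `brain.config` and `brain.db.topology` and may differ.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from pathlib import Path
from typing import Iterator, Optional

# Hypothetical roots; the real resolvers would read these from configuration.
STATE_ROOT = Path("state")
VAULT_ROOT = Path("vault")

# Active workspace for the current request or CLI command; None when unset.
workspace_ctx: ContextVar[Optional[str]] = ContextVar("workspace_ctx", default=None)


def get_state_dir() -> Path:
    """Return state/<ws>/ when a workspace is active, else the legacy root."""
    ws = workspace_ctx.get()
    return STATE_ROOT / ws if ws else STATE_ROOT


def get_vault_dir() -> Path:
    """Return vault/<ws>/ when a workspace is active, else the legacy root."""
    ws = workspace_ctx.get()
    return VAULT_ROOT / ws if ws else VAULT_ROOT


def resolve_db_path(name: str) -> Path:
    """Callers keep passing only a DB name; the workspace stays implicit."""
    return get_state_dir() / f"{name}.db"


@contextmanager
def use_workspace(workspace_id: str) -> Iterator[None]:
    """Scope a block to one workspace and always restore the previous value."""
    token = workspace_ctx.set(workspace_id)
    try:
        yield
    finally:
        workspace_ctx.reset(token)
```

Inside `with use_workspace("team-platform"):` a call to `resolve_db_path("knowledge")` yields `state/team-platform/knowledge.db`; outside the block the same call yields the legacy `state/knowledge.db`, which is also the behavior a disabled feature flag would want.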
Workspace data model
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Any

@dataclass(frozen=True)
class Workspace:
    workspace_id: str          # slug, e.g. "team-platform"
    display_name: str
    state_dir: Path            # state/<workspace_id>/
    vault_dir: Path            # vault/<workspace_id>/
    created_at: datetime
    metadata: dict[str, Any]   # tags, owner email, etc.
Workspaces are stored in a registry DB (single global SQLite), separate from per-workspace state:
state/
  _registry.db          # Workspace catalog + token bindings (global)
  default/              # Today's single-tenant data, auto-renamed
    runtime.db
    memory.db
    knowledge.db
    work.db
    travel.db
    antahkarana.db
    chroma/
  team-platform/        # New workspace
    runtime.db
    ...
vault/
  default/              # Today's vault
    00_inbox/
    01_projects/
    ...
  team-platform/
    ...
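Because the workspace slug becomes a directory name under both roots, the registry is a natural place to validate it. The following is a sketch only: the table layout, the slug rule, and the function names are assumptions, not the planned schema.

```python
import re
import sqlite3

# Assumed slug rule: lowercase letters, digits, and hyphens. A leading
# underscore is rejected so a slug cannot collide with _registry.db, and
# dots or slashes are rejected so it cannot escape its directory.
SLUG_RE = re.compile(r"[a-z0-9][a-z0-9-]{0,62}")

# Hypothetical schema; state_dir and vault_dir are derived from the slug,
# so they are not stored.
SCHEMA = """
CREATE TABLE IF NOT EXISTS workspaces (
    workspace_id TEXT PRIMARY KEY,
    display_name TEXT NOT NULL,
    created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    metadata     TEXT NOT NULL DEFAULT '{}'
);
"""


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and if needed initialize) the global registry database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def create_workspace(conn: sqlite3.Connection, slug: str, display_name: str = "") -> None:
    """Insert a workspace row, rejecting slugs that are unsafe as directory names."""
    if not SLUG_RE.fullmatch(slug):
        raise ValueError(f"invalid workspace slug: {slug!r}")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO workspaces (workspace_id, display_name) VALUES (?, ?)",
            (slug, display_name or slug),
        )
```

The primary key makes a duplicate slug fail with `sqlite3.IntegrityError`, so `sb workspace create` could surface that as a user-facing error instead of silently reusing a directory.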
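Token model
Tokens are SHA-256-hashed in the registry DB; the prefix ws_ is the only
human-readable part. A token row binds the token hash to at least a
workspace_id, a list of allowed_actor_ids, and a revoked_at timestamp; these
are the fields the middleware reads.
x-sb-actor-id is auth-bound: if the header value is not in
allowed_actor_ids, the request is rejected with 403. Today's single
optional actor header is unvalidated, attribution-only metadata; binding it
to the token closes that security gap. The header still grants no per-actor
permissions inside a workspace.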
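The hashing scheme above can be sketched with the standard library alone. The function names and the dict standing in for the registry table are illustrative assumptions; only the `ws_` prefix and the SHA-256 hash come from the design.

```python
import hashlib
import secrets
from typing import Optional, Tuple

TOKEN_PREFIX = "ws_"


def token_hash(token: str) -> str:
    """Hex SHA-256 of the full token string, prefix included."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_token() -> Tuple[str, str]:
    """Return (plaintext, digest). Only the digest would be persisted."""
    plaintext = TOKEN_PREFIX + secrets.token_urlsafe(32)
    return plaintext, token_hash(plaintext)


def lookup(registry: dict, presented: str) -> Optional[dict]:
    """Find a binding by hash; the plaintext is never stored or compared."""
    if not presented.startswith(TOKEN_PREFIX):
        return None
    return registry.get(token_hash(presented))
```

An unsalted SHA-256 is a reasonable fit here because the tokens are long random strings rather than user-chosen passwords, so there is no small dictionary of candidates to brute-force. The plaintext is shown once at issue time and cannot be recovered from the registry afterwards.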
CLI for token management:
sb workspace create <slug> [--display-name "..."]
sb workspace list
sb workspace token issue --workspace <slug> [--actor <id>]
sb workspace token revoke <token_id>
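Middleware
Single FastAPI middleware, runs first:
from fastapi import Request
from fastapi.responses import JSONResponse

class WorkspaceMiddleware:
    # Callable with the (request, call_next) signature of an HTTP middleware.
    async def __call__(self, request: Request, call_next):
        # Failures are returned as responses: an HTTPException raised inside
        # middleware bypasses FastAPI's exception handlers and becomes a 500.
        token = _extract_bearer(request)
        if token is None:
            return JSONResponse({"detail": "missing workspace token"}, status_code=401)
        binding = registry.lookup(token_hash(token))
        if binding is None or binding.revoked_at:
            return JSONResponse({"detail": "invalid or revoked token"}, status_code=401)
        actor = request.headers.get("x-sb-actor-id")
        if actor and actor not in binding.allowed_actor_ids:
            return JSONResponse({"detail": "actor not allowed for this token"}, status_code=403)
        # Keep the tokens returned by set() so each variable can be restored.
        ws_reset = workspace_ctx.set(binding.workspace_id)
        # Default to the first allowed actor, or None if the token lists none.
        actor_reset = actor_var.set(actor or next(iter(binding.allowed_actor_ids), None))
        try:
            return await call_next(request)
        finally:
            actor_var.reset(actor_reset)
            workspace_ctx.reset(ws_reset)
Resolvers consult workspace_ctx.get() to compute paths.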
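The set-then-reset pattern relies on each request running in its own context. A small, self-contained asyncio sketch shows the property the middleware depends on; `handle_request` is a stand-in invented for this example, not part of the design.

```python
import asyncio
from contextvars import ContextVar
from typing import List, Optional, Tuple

workspace_ctx: ContextVar[Optional[str]] = ContextVar("workspace_ctx", default=None)


async def handle_request(workspace_id: str, delay: float) -> Tuple[str, Optional[str]]:
    """Stand-in for middleware plus handler: set, await, read back, reset."""
    token = workspace_ctx.set(workspace_id)
    try:
        await asyncio.sleep(delay)  # yield so the other requests interleave
        return workspace_id, workspace_ctx.get()
    finally:
        workspace_ctx.reset(token)


async def run_concurrent() -> List[Tuple[str, Optional[str]]]:
    # gather() wraps each coroutine in its own Task, and every Task runs in a
    # copy of the caller's context, so a set() in one is invisible to the others.
    return await asyncio.gather(
        handle_request("default", 0.02),
        handle_request("team-platform", 0.01),
        handle_request("sandbox", 0.0),
    )
```

Running `asyncio.run(run_concurrent())` returns pairs in which the workspace each task set matches the workspace it read back after the await, even though the three tasks finish in the opposite order from how they started. A test of this shape against the real middleware would cover the concurrent-request case.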
Migration from single-tenant
One-time migration on first launch after upgrade:
- Detect legacy layout (state files at `state/runtime.db` rather than `state/<ws>/runtime.db`).
- Move `state/*` → `state/default/`.
- Move `vault/*` (except `vault/_*` system dirs) → `vault/default/`.
- Initialize `state/_registry.db` with one workspace `default`.
- Issue an initial token via existing `SB_SERVE_TOKEN` env (so existing bearers keep working).
- Print: "Multi-tenant mode enabled. Existing data is now in workspace `default`. Issue additional workspace tokens with `sb workspace token issue`."
The migration is idempotent and reversible by symlink (we keep the
original directory locations as symlinks to default/ for one release cycle
to ease rollback).
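The detection and move steps can be sketched as follows. This is a simplified illustration: it moves entries directly and leaves out the copy-then-symlink-flip safety phase, the rollback symlinks, and the registry and token steps. All function names are hypothetical.

```python
import shutil
from pathlib import Path

DEFAULT_WS = "default"


def is_legacy_layout(state_root: Path) -> bool:
    """Legacy installs keep runtime.db directly under state/."""
    return (state_root / "runtime.db").is_file()


def migrate_root(root: Path, keep_prefix: str = "_") -> list:
    """Move every top-level entry of root into root/default/.

    Entries whose name starts with keep_prefix (for example _registry.db or
    vault/_* system dirs) and the default/ directory itself stay in place.
    Returns the names that were moved.
    """
    target = root / DEFAULT_WS
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(root.iterdir()):  # sorted() snapshots the listing first
        if entry.name == DEFAULT_WS or entry.name.startswith(keep_prefix):
            continue
        shutil.move(str(entry), str(target / entry.name))
        moved.append(entry.name)
    return moved


def migrate(state_root: Path, vault_root: Path) -> bool:
    """Run the one-time migration; return False when there is nothing to do."""
    if not is_legacy_layout(state_root):
        return False  # fresh or already-migrated install, so reruns are no-ops
    migrate_root(state_root)
    migrate_root(vault_root)
    return True
```

Keying idempotency on the presence of `state/runtime.db` is the simplest check, but it means an interrupted run of this sketch would not resume; that gap is what a copy-first phase is meant to close.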
Feature flag rollout
Phase 3 lands behind SB_MULTI_TENANT=1. Default off. With the flag off:
- No middleware, no registry DB.
- Resolvers fall back to the original single-tenant behavior.
- Existing test suite continues to pass without changes.
Once the flag flips to default-on (Phase 3.5), the migration runs on first launch.
Threading audit (the 1,428 call sites)
The survey identified 5 storage layers. Here is the per-layer plan:
| Layer | Sites | Action |
|---|---|---|
| SQLite via `resolve_db_path()` | ~1,400 | No change needed: resolver reads ContextVar |
| Chroma via `VectorStore.__init__()` | ~30 primary + 3 secondary | Cache by workspace_id; lazy per-workspace clients |
| Vault paths via `get_vault_dir()` | ~509 | No change needed: resolver reads ContextVar |
| Direct hardcoded paths (bypass resolvers) | ~30–50 | Manual sweep: replace with resolver calls |
| Event log | 10 instantiations | Single resolver change: `EventLog(get_runtime_db_path())` |
Estimated PR count: 4–6 (registry, middleware, VectorStore caching, hardcoded-path sweep, migration script, CLI). All small enough to review individually.
Auditability invariants
Every persisted artifact (event_log row, decision_catalog row, memory row,
ContextItem) gains a workspace_id column. The constraint is enforced at
the storage layer:
def write(...):
    ws = workspace_ctx.get(None)
    if not ws:
        # Explicit raise rather than assert: asserts are stripped under python -O.
        raise RuntimeError("no workspace context; refusing to persist")
    conn.execute("INSERT ... workspace_id = ?, ...", (ws, ...))
The writer itself never guesses a workspace. On CLI paths, where no
middleware runs, the command entry point sets workspace_ctx from the
--workspace flag or the SB_WORKSPACE env; if neither is given, it falls back
to workspace_id = "default" and logs a warning.
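A runnable sketch of the strict variant of this invariant, in which the writer refuses to persist without a workspace. The trimmed-down `event_log` table and the `write_event` name are assumptions for illustration, not the real schema.

```python
import sqlite3
from contextvars import ContextVar
from typing import Optional

workspace_ctx: ContextVar[Optional[str]] = ContextVar("workspace_ctx", default=None)

# Hypothetical, trimmed-down table. NOT NULL acts as a second line of defense
# if some writer ever bypasses the check below.
SCHEMA = """
CREATE TABLE event_log (
    id           INTEGER PRIMARY KEY,
    workspace_id TEXT NOT NULL,
    event_type   TEXT NOT NULL,
    payload      TEXT NOT NULL
);
"""


def write_event(conn: sqlite3.Connection, event_type: str, payload: str) -> int:
    """Persist one event, taking workspace_id from the context only."""
    ws = workspace_ctx.get()
    if not ws:
        raise RuntimeError("no workspace context; refusing to persist")
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO event_log (workspace_id, event_type, payload) VALUES (?, ?, ?)",
            (ws, event_type, payload),
        )
    return cur.lastrowid
```

Note that `workspace_id` is not a parameter of `write_event`: a caller cannot pass one in, so a request body or header has no path to influence the stamped value.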
Open questions
- Per-workspace embedding model? Different workspaces might prefer different embedders (e.g., one uses Voyage, another uses bge-m3). Likely yes for v1.1; v1 keeps a single global model.
- Trajectory storage cross-workspace? A grounded trajectory is the single most-audited artifact. Probably co-located with the workspace that ran it.
- Decision namespace ownership? Today namespace keys are global (`travel.*`, `support.meeting.*`). Per-workspace namespace might be useful but complicates federation. Out of scope for v1.
Implementation tasks (not yet ticketed)
- T-301 Registry DB schema + migrations
- T-302 ContextVar plumbing in resolvers (state_dir, vault_dir, db_path)
- T-303 WorkspaceMiddleware + token validation
- T-304 VectorStore per-workspace caching
- T-305 Hardcoded-path sweep
- T-306 Migration script + rollback symlinks
- T-307 CLI: `sb workspace create|list|token issue|revoke`
- T-308 workspace_id columns on event_log, decision_catalog, memory tables
- T-309 Tests: workspace isolation, token rejection across workspaces, middleware enforcement, migration idempotency
Risk register
- Risk: ContextVar leakage across async tasks. Mitigation: always use `with` or `try`/`finally` to reset; test with concurrent requests.
- Risk: Hardcoded paths in 3rd-party-integration code (Notion sync, Apple Mail) reach into vault/ directly. Mitigation: sweep + add a pre-commit grep for raw `vault/` strings in `brain/`.
- Risk: Migration corrupts data if interrupted. Mitigation: migration is two-phase (copy then symlink-flip); failure leaves originals intact.
- Risk: Per-workspace Chroma clients balloon memory. Mitigation: LRU cache with idle eviction; default cap of 16 active workspaces per process.
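The capped per-workspace client cache from the last mitigation could look roughly like this. It is a sketch under assumptions: the class name and factory callback are invented here, the client type is left generic rather than tied to Chroma, and time-based idle eviction and client shutdown on eviction are left out.

```python
import threading
from collections import OrderedDict
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class WorkspaceClientCache(Generic[T]):
    """LRU cache keyed by workspace_id; clients are built lazily on first use."""

    def __init__(self, factory: Callable[[str], T], max_active: int = 16) -> None:
        self._factory = factory
        self._max_active = max_active
        self._clients: "OrderedDict[str, T]" = OrderedDict()
        self._lock = threading.Lock()  # handlers may run on worker threads

    def get(self, workspace_id: str) -> T:
        with self._lock:
            if workspace_id in self._clients:
                self._clients.move_to_end(workspace_id)  # mark most recently used
                return self._clients[workspace_id]
            client = self._factory(workspace_id)
            self._clients[workspace_id] = client
            if len(self._clients) > self._max_active:
                self._clients.popitem(last=False)  # evict least recently used
            return client

    def active(self) -> List[str]:
        """Workspace ids currently cached, least recently used first."""
        with self._lock:
            return list(self._clients)
```

A caller would pass a factory that opens the vector store under the workspace's `chroma/` directory, then call `cache.get(workspace_ctx.get())` wherever a client is needed. Holding the lock while the factory runs keeps the sketch simple at the cost of serializing cold starts.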