Company Brain

Company Brain is a workspace-scoped layer on top of the SecondBrain substrate, not a parallel runtime. Each workspace owns a dedicated SQLite storage directory under <state_dir>/company_brain/<workspace_id>/. Storage is physically isolated: it is never shared with the global SecondBrain memory store or with other workspaces. The shared classes (MemoryStore, MemoryRetriever, DecisionCatalog, ContextPackService, kernel Runner) are reused; only the file path differs per workspace.

The legacy standalone datastore.py / query.py / skill_runner.py / review.py modules have been removed. Every read and write now flows through CompanyBrainAdapter.

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│ CLI: sb company-brain {init,workspaces,ingest,answer,context-build,…} │
│ HTTP: /company-brain/{workspaces,search,ingest,context,decisions}      │
│ MCP:  company_brain.{list-workspaces,search,ingest,context-pack,…}     │
└──────────┬──────────────────────────────────────────────────────────────┘
           │
           ▼
┌──────────────────────────────────────────────────────────────────────────┐
│ brain/company_brain/                                                     │
│   workspace.py     — registry + workspace_storage_dir helpers          │
│   policies.py      — SourceAuthority + SensitivityLevel (composes      │
│                      brain/security/content_safety + redaction)        │
│   adapters.py      — CompanyBrainAdapter (per-workspace façade with    │
│                      its own MemoryStore + DecisionCatalog)            │
│   compile_bridge.py — mirror compile reports into the substrate        │
│   mcp_tools.py     — register_company_brain_tools(registry)            │
└──────────┬──────────────────────────────────────────────────────────────┘
           │ instantiates per-workspace stores pointed at isolated paths
           ▼
┌──────────────────────────────────────────────────────────────────────────┐
│ Per-workspace storage: <state_dir>/company_brain/<workspace_id>/         │
│   memory.db        — MemoryStore (LTM + SourceEvidenceUnit + FTS +      │
│                      semantic + content-hash dedup + supersession)      │
│   knowledge.db     — DecisionCatalog (company.* namespace)              │
│   vectors/         — VectorStore persist root                           │
│   baselines/       — compile-baseline JSON snapshots                    │
└──────────┬──────────────────────────────────────────────────────────────┘
           │ classes reused; only the file path differs per workspace
           ▼
┌──────────────────────────────────────────────────────────────────────────┐
│ Shared SecondBrain primitives                                            │
│   brain/memory/store.py + retriever.py — MemoryStore + MemoryRetriever  │
│   brain/decisions/catalog.py            — DecisionCatalog               │
│   brain/context_packs/                  — ContextPackService            │
│   brain/kernel/runner.py + yaml_loader.py — kernel Runner + skill YAML  │
│   brain/ingest/parser.py + chunker.py   — multi-format parse + chunk    │
│   brain/security/content_safety + redaction — PII / injection scans    │
│   brain/state/vector_store.py           — pluggable vector backends    │
└──────────────────────────────────────────────────────────────────────────┘

Each workspace's adapter opens its own MemoryStore and DecisionCatalog pointed at the workspace's SQLite files. Writes to workspace A never appear in workspace B's store, and the global <state_dir>/memory.db is untouched.
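The isolation property can be illustrated with plain SQLite files. This is a minimal sketch, not the adapter itself: `workspace_storage_dir` here is a hypothetical re-implementation of the helper named in the diagram, and the `notes` table is an invented stand-in for whatever schema MemoryStore actually uses.

```python
import sqlite3
import tempfile
from pathlib import Path


def workspace_storage_dir(state_dir: Path, workspace_id: str) -> Path:
    """Hypothetical helper: create <state_dir>/company_brain/<workspace_id>/."""
    path = state_dir / "company_brain" / workspace_id
    path.mkdir(parents=True, exist_ok=True)
    return path


def open_memory_db(state_dir: Path, workspace_id: str) -> sqlite3.Connection:
    """Stand-in for a per-workspace MemoryStore: one memory.db per workspace."""
    conn = sqlite3.connect(workspace_storage_dir(state_dir, workspace_id) / "memory.db")
    # Invented schema, for illustration only.
    conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT NOT NULL)")
    return conn


def count_notes(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]


def demo_isolation() -> tuple[int, int, bool]:
    """Write to workspace A, then report (rows in A, rows in B, global db exists)."""
    with tempfile.TemporaryDirectory() as tmp:
        state_dir = Path(tmp)
        a = open_memory_db(state_dir, "workspace-a")
        b = open_memory_db(state_dir, "workspace-b")
        a.execute("INSERT INTO notes (body) VALUES (?)", ("refund policy v2",))
        a.commit()
        result = (count_notes(a), count_notes(b), (state_dir / "memory.db").exists())
        a.close()
        b.close()
        return result
```

Because each connection points at a different file, the write to `workspace-a` is invisible to `workspace-b`, and no global `<state_dir>/memory.db` is created.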

Reuse map (gap → existing module)

Gap	Reused primitive
Connectors (Git, GitHub, Notion, Slack, Gmail, Calendar, files)	`brain/connectors/*` + `brain/ingest/git_ingest.py`
PDF / DOCX / HTML / XLSX / PPTX / CSV / EPUB / Markdown parsing	`brain/ingest/parser.py`
Chunking + dedup	`brain/ingest/chunker.py` + `MemoryStore.content_hash`
Vector index	`brain/state/vector_store.py` (LanceDB / sqlite-vec / Qdrant / Chroma)
Hybrid retrieval (BM25 + semantic + RRF)	`brain/memory/retriever.py`
Decisions	`brain/decisions/catalog.py` + canonical `company.*` namespace
Entity graph	`brain/graph/` + `brain/ontology/`
Skills (tool binding, approvals, retry, eval)	`brain/kernel/{tool,approvals,retry_policy,runner,eval,yaml_loader}`
Secret / PII / injection redaction	`brain/security/redaction.py` + `brain/security/content_safety.py` + `brain/kernel/secret_redactor.py`
Context packs (task-aware, conflict, open loops)	`brain/context_packs/service.py`
MCP server + governance	`brain/mcp/`
FastAPI surface	`brain/serve/app.py` + `brain/serve/routers/`
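The hybrid retrieval row mentions RRF (reciprocal rank fusion). As background, here is a generic sketch of that fusion step; it is not the code in `brain/memory/retriever.py`, and the constant `k = 60` and the document IDs are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: score(d) = sum of 1 / (k + rank), with rank starting at 1."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a lexical (BM25) and a semantic search.
bm25_hits = ["adr-12", "slack-88", "doc-3"]
semantic_hits = ["doc-3", "adr-12", "email-5"]
fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
```

Documents that rank well in both lists (`adr-12`, `doc-3`) rise above documents that appear in only one, without requiring the two scoring scales to be comparable.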

Decision namespace

Company Brain decisions live under the canonical company.* namespace, registered in brain/decisions/namespace.py:

company.source.ingest             company.skill.execution
company.source.authority          company.skill.approval
company.workspace.create          company.context.compilation
company.workspace.scope           company.context.injection
company.knowledge.promotion       company.retrieval.policy
company.knowledge.supersession    company.sensitivity.gate
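A caller that wants to reject unknown decision keys before recording them could check against this list. The sketch below simply copies the twelve keys above into a set; the helper names are hypothetical, and the actual registration mechanism in `brain/decisions/namespace.py` is not shown here.

```python
# The twelve keys listed above, copied verbatim.
COMPANY_DECISION_KEYS = frozenset({
    "company.source.ingest",
    "company.source.authority",
    "company.workspace.create",
    "company.workspace.scope",
    "company.knowledge.promotion",
    "company.knowledge.supersession",
    "company.skill.execution",
    "company.skill.approval",
    "company.context.compilation",
    "company.context.injection",
    "company.retrieval.policy",
    "company.sensitivity.gate",
})


def is_company_decision(key: str) -> bool:
    """Hypothetical check: True only for a key in the canonical company.* list."""
    return key in COMPANY_DECISION_KEYS


def group_by_area(keys: frozenset[str]) -> dict[str, list[str]]:
    """Group keys by their middle segment, e.g. 'company.source.ingest' -> 'source'."""
    grouped: dict[str, list[str]] = {}
    for key in sorted(keys):
        _, area, _ = key.split(".")
        grouped.setdefault(area, []).append(key)
    return grouped
```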

Source authority + sensitivity

brain/company_brain/policies.py defines an 8-tier SourceAuthority (SCRATCH < CHAT < EMAIL < ISSUE < DOC < DECISION < ADR < POLICY) and a 4-tier SensitivityLevel (PUBLIC < INTERNAL < CONFIDENTIAL < RESTRICTED). The adapter:

Calls authority_for(source_kind) to score each ingest.
Calls classify_sensitivity(), which composes the existing content_safety.scan() and redaction.redact_text(); no new pattern lists are introduced.
At search time, re-weights MemoryRetriever hits by authority so an ADR-grounded match beats a Slack-grounded match at equal raw score.
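The tier ordering and the re-weighting step can be sketched as follows. Only the tier names and their order come from the text above; the integer values, the `Hit` dataclass, and the multiplicative `boost` formula are assumptions made for illustration, not the adapter's actual scoring.

```python
from dataclasses import dataclass
from enum import IntEnum


class SourceAuthority(IntEnum):
    # Ordering from the text; the numeric values are an assumption.
    SCRATCH = 0
    CHAT = 1
    EMAIL = 2
    ISSUE = 3
    DOC = 4
    DECISION = 5
    ADR = 6
    POLICY = 7


class SensitivityLevel(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Hit:
    """Hypothetical retrieval hit: a source reference, a raw score, and its authority tier."""
    source_ref: str
    raw_score: float
    authority: SourceAuthority


def reweight(hits: list[Hit], boost: float = 0.1) -> list[Hit]:
    """Scale each raw score by (1 + boost * tier), then sort best-first."""
    return sorted(
        hits,
        key=lambda h: h.raw_score * (1.0 + boost * h.authority),
        reverse=True,
    )


hits = [
    Hit("slack:#refunds", 0.80, SourceAuthority.CHAT),
    Hit("adr:0012-refunds", 0.80, SourceAuthority.ADR),
]
ranked = reweight(hits)
```

With equal raw scores, the ADR-grounded hit sorts ahead of the Slack-grounded one. A multiplicative boost keeps raw relevance dominant, so a much stronger chat match can still outrank a weak ADR match.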

CLI surface

# Workspace lifecycle.
sb company-brain init <id> --name … --root <dir> --connectors file,github
sb company-brain workspaces list
sb company-brain workspaces remove <id>

# Ingest.
sb company-brain ingest run <id> <source-ref> --kind doc --from-file path.md
sb company-brain ingest from-dir <id> <dir> --glob "**/*.md"

# Retrieve.
sb company-brain answer <id> "what is the refund process?"
sb company-brain context-build run <id> --task "draft a refund policy update"

# Decisions.
sb company-brain compile <dir> --mirror-to-workspace <id>

# Skills.
sb company-brain skills validate path/to/skill.yaml

The legacy search, query, context, db *, skills list/get/run subcommands and the standalone datastore have all been removed. Workspace data lives in per-workspace SQLite files; use answer, context-build, ingest, and workspaces instead. compile --mirror-to-workspace <id> keeps the old compile pipeline alive while pushing its output into the new workspace storage.

HTTP surface

Mounted by brain/serve/app.py:

Method	Path	Notes
`GET`	`/company-brain/workspaces`	List workspaces
`POST`	`/company-brain/workspaces`	Upsert
`DELETE`	`/company-brain/workspaces/{id}`	Remove
`POST`	`/company-brain/search`	Workspace-scoped hybrid search
`POST`	`/company-brain/ingest`	Ingest one source unit
`POST`	`/company-brain/context`	Build a ContextPackV2
`POST`	`/company-brain/decisions`	Record decision
`GET`	`/company-brain/decisions?workspace_id=…`	List decisions
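A client for these routes might build its requests as sketched below, using only the standard library. The base URL is an assumption about where the app is served. The `workspace_id` query parameter comes from the table above, but the JSON field names in the search body (`workspace_id`, `query`) are hypothetical, so check the router's request models before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the FastAPI app from brain/serve/app.py is reachable here.
BASE_URL = "http://localhost:8000"


def list_decisions_request(workspace_id: str) -> Request:
    """Build GET /company-brain/decisions?workspace_id=<id>."""
    query = urlencode({"workspace_id": workspace_id})
    return Request(f"{BASE_URL}/company-brain/decisions?{query}", method="GET")


def search_request(workspace_id: str, query: str) -> Request:
    """Build POST /company-brain/search; the body field names are hypothetical."""
    body = json.dumps({"workspace_id": workspace_id, "query": query}).encode("utf-8")
    return Request(
        f"{BASE_URL}/company-brain/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Neither function sends anything; pass the returned object to `urllib.request.urlopen` once a server is running.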

MCP surface

register_company_brain_tools(registry) (see brain/company_brain/mcp_tools.py) attaches:

company_brain.list-workspaces (READ_ONLY)
company_brain.search (READ_ONLY)
company_brain.ingest (LOCAL_WRITE)
company_brain.context-pack (READ_ONLY)
company_brain.record-decision (LOCAL_WRITE)
company_brain.list-decisions (READ_ONLY)

Drop the registrar into the MCP server bootstrap alongside register_ontology_tools.
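The registrar pattern can be pictured with a toy registry. `ToolRegistry`, `Permission`, and the placeholder handler below are hypothetical stand-ins, not the classes in `brain/mcp/`; only the tool names and their READ_ONLY / LOCAL_WRITE levels are taken from the list above.

```python
from enum import Enum
from typing import Callable


class Permission(Enum):
    READ_ONLY = "READ_ONLY"
    LOCAL_WRITE = "LOCAL_WRITE"


class ToolRegistry:
    """Hypothetical stand-in for the MCP tool registry."""

    def __init__(self) -> None:
        self.tools: dict[str, tuple[Permission, Callable[..., object]]] = {}

    def register(self, name: str, permission: Permission, handler: Callable[..., object]) -> None:
        if name in self.tools:
            raise ValueError(f"tool already registered: {name}")
        self.tools[name] = (permission, handler)


def _placeholder(**kwargs: object) -> object:
    """Placeholder handler; a real one would delegate to the workspace adapter."""
    raise NotImplementedError


def register_company_brain_tools_sketch(registry: ToolRegistry) -> None:
    """Attach the six tools listed above with their documented permission levels."""
    for name, permission in [
        ("company_brain.list-workspaces", Permission.READ_ONLY),
        ("company_brain.search", Permission.READ_ONLY),
        ("company_brain.ingest", Permission.LOCAL_WRITE),
        ("company_brain.context-pack", Permission.READ_ONLY),
        ("company_brain.record-decision", Permission.LOCAL_WRITE),
        ("company_brain.list-decisions", Permission.READ_ONLY),
    ]:
        registry.register(name, permission, _placeholder)


registry = ToolRegistry()
register_company_brain_tools_sketch(registry)
write_tools = sorted(n for n, (p, _) in registry.tools.items() if p is Permission.LOCAL_WRITE)
```

Keeping the permission level next to each tool name lets a governance layer filter or gate the write-capable tools without inspecting handlers.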

Extending

New skills are kernel AgentDefinition YAMLs validated by sb company-brain skills validate <yaml>. New evals are brain/kernel/eval.py YAML suites consumed by sb agent eval. Both feed the same kernel Runner that all other agents use, so guardrails, approvals and retry policies apply for free.

Storage isolation

Each workspace is materialised the first time you call sb company-brain init <id> (or implicitly on first adapter use). Layout:

<state_dir>/
└── company_brain/
    ├── workspaces.json          # registry (workspace_id, name, root_path, …)
    └── <workspace_id>/
        ├── memory.db            # MemoryStore SQLite (FTS + semantic + LTM)
        ├── knowledge.db         # DecisionCatalog SQLite (company.* namespace)
        ├── vectors/             # VectorStore persist root
        └── baselines/           # compile-baseline JSON snapshots

To wipe a workspace and its storage:

sb company-brain workspaces remove <id> --delete-storage

Without --delete-storage the registry entry is removed but the SQLite files are preserved for forensic / archival use.
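The two removal modes can be sketched as below. This is an illustration under stated assumptions, not the CLI's implementation: it assumes `workspaces.json` holds a JSON list of objects with a `workspace_id` field, and `remove_workspace` is a hypothetical function name.

```python
import json
import shutil
import tempfile
from pathlib import Path


def remove_workspace(state_dir: Path, workspace_id: str, delete_storage: bool = False) -> None:
    """Drop the registry entry; delete the workspace directory only when asked to."""
    root = state_dir / "company_brain"
    registry_path = root / "workspaces.json"
    # Assumed registry shape: a JSON list of {"workspace_id": ...} objects.
    entries = json.loads(registry_path.read_text(encoding="utf-8"))
    remaining = [e for e in entries if e["workspace_id"] != workspace_id]
    registry_path.write_text(json.dumps(remaining, indent=2), encoding="utf-8")
    if delete_storage:
        shutil.rmtree(root / workspace_id, ignore_errors=True)


def demo_removal() -> tuple[bool, bool, list]:
    """Remove one workspace in each mode, then report what survives on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        state_dir = Path(tmp)
        root = state_dir / "company_brain"
        for wid in ("keep-files", "wipe-files"):
            (root / wid).mkdir(parents=True)
            (root / wid / "memory.db").touch()
        (root / "workspaces.json").write_text(
            json.dumps([{"workspace_id": "keep-files"}, {"workspace_id": "wipe-files"}]),
            encoding="utf-8",
        )
        remove_workspace(state_dir, "keep-files")
        remove_workspace(state_dir, "wipe-files", delete_storage=True)
        registry = json.loads((root / "workspaces.json").read_text(encoding="utf-8"))
        return (
            (root / "keep-files" / "memory.db").exists(),
            (root / "wipe-files").exists(),
            registry,
        )
```

In the demo, both registry entries disappear, but only the workspace removed with `delete_storage=True` loses its directory.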

What's intentionally not here

No bespoke BM25 / embeddings layer (use MemoryRetriever).
No regex-based skill runner (use the kernel Runner + YAML).
No bespoke PII / secret patterns (compose brain/security/*).
No new vector backend (use brain/state/vector_store.py).
No standalone Company Brain datastore (datastore.py, query.py, review.py, skill_runner.py and the db * CLI subapp are all removed). Each workspace owns its own SQLite files under <state_dir>/company_brain/<id>/.