Company Brain Readiness Checklist

YC's Company Brain framing is specific: it is not just company-wide search or a chatbot over documents. A real Company Brain should pull fragmented company knowledge together, structure it, keep it current, and convert it into an executable skills file for AI.

For SecondBrain, treat the implementation as:

brain/company_brain/

This checklist is an audit document for the current SecondBrain implementation. The status column is intentionally conservative: a capability is marked done only when it is working end-to-end, tested, and documented in the current tree.

Current Readiness Snapshot

Current module shape:

brain/company_brain/
  __init__.py
  compiler.py       # deterministic Markdown compiler and skill draft renderer
  datastore.py      # standalone SQLite datastore
  models.py         # compile, datastore, search, skill, and run models
  query.py          # standalone FTS search and context-pack assembly
  review.py         # focused Antahkarana/Buddhi reuse for evidence review
  skill_runner.py   # dry-run planning for stored procedure skills
  store.py          # baseline report persistence

The current implementation is a credible local-first seed. It can compile local Markdown into cited operating knowledge; store source documents, evidence receipts, knowledge atoms, procedure skills, compile runs, and skill dry-run traces in a separate company_brain.db; search those atoms with SQLite FTS; build evidence-carrying context packs; and surface contradiction review through focused Antahkarana reuse.

It should not yet be called a complete Company Brain. The main gaps are Git ingestion, incremental sync, first-class entity/relationship/decision/open-loop tables, vector retrieval, permission-aware retrieval, redaction, feedback-driven memory mutation, eval suites, HTTP API, MCP tools, and real policy-gated action execution.

Target Internal Structure

Recommended long-term structure:

brain/company_brain/
  models.py
  store.py
  ingest/
    pipeline.py
    parsers.py
    dedupe.py
    extractors.py
    freshness.py
  connectors/
    local_files.py
    git.py
    github.py
    notion.py
    gmail.py
    slack.py
    calendar.py
  graph/
    entities.py
    relationships.py
    resolver.py
  skills/
    model.py
    compiler.py
    validator.py
    executor.py
    exporter.py
  context/
    compiler.py
    packer.py
    redactor.py
  evals/
    datasets.py
    judges.py
    suites.py
  api.py
  cli.py

Scoring

Score	Meaning
✅ Done	Working end-to-end, tested, documented
🟡 Partial	Exists but incomplete, brittle, or not validated
❌ Missing	Not implemented
⚠️ Risk	Implemented but unsafe, unscalable, or unclear

1. Core Product Fit

#	Capability	Priority	Vetting Question	Pass Criteria	Status
1	Not just RAG/chatbot	P0	Can the system do more than answer questions over docs?	It supports memory, context compilation, decisions, skills, and governed actions.	🟡 Partial - context packs, decisions-as-atoms, skills, and dry-runs exist; governed action execution is not real yet.
2	Company knowledge map	P0	Does it model how work actually happens?	People, projects, systems, decisions, workflows, tools, and dependencies are represented.	🟡 Partial - workflows/decisions/open loops are extracted as atoms; people/projects/systems/dependencies are not first-class.
3	Living memory	P0	Does knowledge update over time?	New docs/conversations/events update memory without full manual rebuild.	🟡 Partial - repeated local store runs update deterministic records; no incremental watcher or connector sync.
4	Evidence-backed answers	P0	Does every answer show receipts?	Source document, line/span, timestamp, and confidence are visible.	✅ Done - search/query/context preserve evidence receipts from local Markdown.
5	Executable skills	P0	Can it convert knowledge into reusable agent instructions?	SOPs/runbooks/checklists can become structured skills.	🟡 Partial - process/policy sections become skill drafts and stored procedure skills; schema is still light.
6	Governed execution	P0	Can agents safely act, not just reason?	Side effects go through policy and approval gates.	🟡 Partial - dry-run flags approval-required steps and disables side effects; no human approval workflow or executor.
7	Human correction loop	P0	Can users correct wrong memory?	Corrections are stored, traceable, and affect future retrieval.	❌ Missing - no feedback/mutation path for Company Brain records.
8	Agent-ready interface	P1	Can other agents/tools use it?	CLI/API/MCP endpoints expose search, context, skills, memory, and audit.	🟡 Partial - CLI exists; API and MCP tools are missing.
9	Local-first mode	P1	Can it run privately on local machine/company infra?	Core ingestion, retrieval, memory, and skills work without mandatory cloud dependency.	✅ Done - current compiler, store, search, context, and dry-run are offline-safe.
10	Multi-workspace support	P2	Can different projects/companies stay isolated?	Separate workspaces with isolated indexes, policies, and audit logs.	🟡 Partial - workspace IDs exist in the standalone DB; indexes/policies/audit are not fully isolated.

2. Source Ingestion

#	Capability	Priority	Vetting Question	Pass Criteria	Status
11	Local file ingestion	P0	Can it ingest Markdown, TXT, PDF, JSON, YAML, code docs?	`sb ingest` or `sb company-brain ingest` works on local folders with stable document IDs.	🟡 Partial - local Markdown works through `db store`; other formats and `ingest run` surface are missing.
12	Git repo ingestion	P0	Can it understand source repos?	Reads README, docs, ADRs, code comments, tests, configs, commit metadata.	❌ Missing - no Company Brain Git connector yet.
13	Incremental ingestion	P0	Does it avoid reprocessing everything?	Changed files are detected by hash, mtime, version, or Git diff.	🟡 Partial - content hashes and stable IDs exist; no incremental pipeline.
14	Document versioning	P0	Does it track changes?	Same document has version history and supersession metadata.	🟡 Partial - source documents have `version` and timestamps; no version history table.
15	Conversation import	P1	Can it ingest chat transcripts?	Markdown/JSON chat import extracts decisions, facts, open loops.	❌ Missing.
16	GitHub/GitLab connector	P1	Can it ingest PRs/issues/releases?	PR discussions, issue status, labels, owners, and links are indexed.	❌ Missing.
17	Email connector	P1	Can it ingest email threads safely?	Thread-aware ingestion with sender, date, attachments, and redaction.	❌ Missing.
18	Calendar/meeting connector	P1	Can it ingest meeting context?	Participants, agenda, notes, actions, decisions are captured.	❌ Missing.
19	Notion/Confluence/GDocs connector	P1	Can it ingest canonical company docs?	Page hierarchy, owner, last edited timestamp, and permissions are preserved.	❌ Missing.
20	Slack/Teams connector	P2	Can it handle noisy channels?	Channel allowlist, thread extraction, decision detection, noise filtering.	❌ Missing.
21	Connector registry	P1	Are connectors pluggable?	Each connector has type, config, sync mode, permission model, and status.	❌ Missing.
22	Failed ingestion handling	P0	What happens when parsing fails?	Failures are logged, retryable, and visible in diagnostics.	🟡 Partial - compiler warnings exist; no retryable ingestion diagnostics.
23	Source authority metadata	P0	Does system know which source is canonical?	ADR > official docs > merged PR > chat message, or configurable ranking.	❌ Missing.
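
Item 13 asks for change detection by hash. The following is a minimal sketch of that idea, assuming a previous run left behind a mapping of relative path to content hash. The function names `content_hash` and `plan_incremental_ingest` are hypothetical and are not part of the current `brain/company_brain/` tree.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_incremental_ingest(root: Path, known: dict[str, str]) -> dict[str, list[str]]:
    """Compare Markdown files under `root` with hashes stored by the last run.

    `known` maps a relative POSIX path to the hash recorded previously
    (an assumed storage shape, for illustration only).
    """
    current = {
        p.relative_to(root).as_posix(): content_hash(p)
        for p in sorted(root.rglob("*.md"))
        if p.is_file()
    }
    return {
        "added": [k for k in current if k not in known],
        "changed": [k for k in current if k in known and known[k] != current[k]],
        "removed": [k for k in known if k not in current],
        "unchanged": [k for k in current if known.get(k) == current[k]],
    }
```

Only the `added` and `changed` lists would need recompilation; `removed` paths are candidates for tombstoning rather than silent deletion.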

3. Knowledge Modeling

#	Capability	Priority	Vetting Question	Pass Criteria	Status
24	SourceDocument object	P0	Is every source stored as a canonical object?	Has ID, source URI, checksum, owner, timestamp, permissions.	🟡 Partial - ID, URI/path, hash, timestamps, metadata exist; owner/permissions are not complete.
25	KnowledgeAtom object	P0	Are facts extracted below document level?	Important facts are stored as atomic, evidence-linked units.	✅ Done - local decisions, processes, policies, and open loops become evidence-linked atoms.
26	Entity model	P0	Are people/projects/systems modeled?	Entities have aliases, type, description, evidence, and relationships.	❌ Missing.
27	Relationship model	P0	Can it represent dependencies?	Supports owns, depends_on, decided_by, supersedes, blocks, related_to.	❌ Missing.
28	DecisionRecord model	P0	Are decisions first-class?	Stores decision, rationale, alternatives, owner, date, evidence, status.	🟡 Partial - decisions are atoms/context refs; no first-class decision table.
29	OpenLoop model	P1	Can it track pending work/questions?	Extracts unresolved items, owner, due date, source, status.	🟡 Partial - open loops are extracted as atoms; no owner/due/status object.
30	Skill model	P0	Are workflows represented structurally?	Skill has trigger, inputs, steps, tools, policy, approval rules, expected output.	🟡 Partial - stored skills have trigger/instructions/evidence/safety; inputs/tools/outputs/version policy are missing.
31	EvidenceReceipt model	P0	Are claims traceable?	Every fact/answer/skill step points to one or more receipts.	✅ Done for current local flows.
32	MemoryMutation model	P0	Are memory changes auditable?	Add/update/merge/deprecate/delete events are append-only.	❌ Missing.
33	Freshness metadata	P0	Does knowledge age correctly?	Every object has created_at, updated_at, valid_from, valid_until, freshness_score.	🟡 Partial - atoms have these fields; not every object does, and ranking does not fully use them.
34	Confidence score	P0	Does it distinguish weak vs strong facts?	Confidence combines source quality, extraction certainty, recency, and corroboration.	🟡 Partial - confidence fields exist; score is mostly deterministic/static.
35	Supersession model	P0	Can old knowledge be replaced?	Objects can supersede or be superseded without silent deletion.	🟡 Partial - atom supersession fields exist; no operational mutation/deprecation flow.
36	Sensitivity labels	P0	Is sensitive data marked?	Objects carry sensitivity level: public/internal/confidential/secret/PII.	🟡 Partial - atom `sensitivity_level` exists; no classifier or enforcement.
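
Items 25, 31, and 35 together describe atoms that carry receipts and can be superseded without silent deletion. A rough sketch of that shape follows, using standard-library dataclasses rather than the Pydantic models in `models.py`. All class and field names here are assumptions for illustration, apart from `sensitivity_level`, which the checklist mentions.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class EvidenceReceipt:
    source_id: str
    line_start: int
    line_end: int
    captured_at: datetime


@dataclass(frozen=True)
class KnowledgeAtom:
    atom_id: str
    kind: str  # e.g. "decision", "process", "policy", "open_loop"
    text: str
    receipts: tuple[EvidenceReceipt, ...]
    confidence: float = 0.5
    sensitivity_level: str = "internal"
    valid_until: datetime | None = None
    superseded_by: str | None = None


def supersede(old: KnowledgeAtom, new: KnowledgeAtom, at: datetime) -> KnowledgeAtom:
    """Return a retired copy of `old` that points at `new`; nothing is deleted."""
    if not new.receipts:
        raise ValueError("a superseding atom must carry at least one evidence receipt")
    return replace(old, valid_until=at, superseded_by=new.atom_id)
```

Keeping the retired atom around, with `superseded_by` set, is what would let retrieval later warn that a hit is stale instead of dropping history.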

4. Retrieval and Answer Quality

#	Capability	Priority	Vetting Question	Pass Criteria	Status
37	Hybrid retrieval	P0	Does retrieval combine semantic and keyword search?	Vector + BM25/keyword + recency + authority ranking are used.	🟡 Partial - FTS/keyword scoring exists; vector, recency, and authority ranking are missing.
38	Entity-aware retrieval	P0	Does it use graph/entity context?	Query for a project/system pulls related owners, decisions, docs, issues.	❌ Missing.
39	Temporal retrieval	P0	Can user ask "latest", "last month", "as of X"?	Time filters and freshness-aware ranking work.	❌ Missing.
40	Permission-aware retrieval	P0	Can retrieval filter restricted data?	Unauthorized docs/facts never enter context.	❌ Missing.
41	Authority-aware ranking	P0	Does canonical knowledge rank higher?	Official docs/ADRs beat stale chats or random notes.	❌ Missing.
42	Conflict detection	P0	Does it detect contradictory sources?	Answer says "sources disagree" and shows both.	🟡 Partial - Buddhi contradiction review exists for retrieved hits; not integrated with all answer flows.
43	Stale knowledge warning	P0	Does it warn on old evidence?	Uses "possibly stale" marker when source is outdated or superseded.	❌ Missing.
44	Citation/receipt preservation	P0	Does answer preserve source links?	Final answer includes source references for key claims.	✅ Done for current search/query/context commands.
45	No-evidence behavior	P0	What happens when answer is unsupported?	It says insufficient evidence instead of hallucinating.	✅ Done for deterministic `query` fallback.
46	Multi-hop retrieval	P1	Can it follow dependencies?	Example: service -> owner -> runbook -> recent incident -> fix PR.	❌ Missing.
47	Query decomposition	P1	Can it break complex questions?	Multi-part questions are decomposed into subqueries with trace.	❌ Missing.
48	Retrieval diagnostics	P1	Can developer inspect why result was chosen?	Shows score components: vector, keyword, recency, authority, graph, permission.	🟡 Partial - result score and matched terms exist; component diagnostics are missing.
49	Golden query suite	P0	Do you test retrieval quality?	Fixed benchmark questions run in CI or local eval.	🟡 Partial - CLI/search behavior tests exist; no dedicated quality eval suite.
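
Items 37, 41, and 48 call for ranking that combines keyword, recency, and authority signals and exposes the components. The sketch below shows one way that could look. The weights, the half-life, and the authority values are illustrative assumptions (the ordering follows the "ADR > official docs > merged PR > chat message" example in item 23); none of this reflects the current scoring in `query.py`.

```python
from datetime import datetime

# Hypothetical authority values; only the ordering comes from the checklist.
AUTHORITY = {"adr": 1.0, "official_doc": 0.8, "merged_pr": 0.6, "chat": 0.3}
# Hypothetical weights for combining the components into one score.
WEIGHTS = {"keyword": 0.5, "recency": 0.2, "authority": 0.3}


def recency_score(updated_at: datetime, now: datetime, half_life_days: float = 180.0) -> float:
    """Exponential decay: 1.0 for fresh content, 0.5 after one half-life."""
    age_days = max((now - updated_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def score_hit(keyword_score: float, source_type: str, updated_at: datetime, now: datetime) -> dict:
    """Return the weighted total together with each component for diagnostics."""
    components = {
        "keyword": keyword_score,
        "recency": recency_score(updated_at, now),
        "authority": AUTHORITY.get(source_type, 0.0),
    }
    total = sum(WEIGHTS[name] * value for name, value in components.items())
    return {"total": round(total, 4), "components": components}
```

Returning the component dictionary alongside the total is what makes "why was this result chosen?" answerable without re-running the ranker.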

5. Executable Skills Layer

#	Capability	Priority	Vetting Question	Pass Criteria	Status
50	Manual skill authoring	P0	Can user define a skill by hand?	YAML/Markdown skill file can be loaded and validated.	🟡 Partial - generated `SKILL.md` output exists; no Company Brain skill import/validation command.
51	Skill schema validation	P0	Are skills strongly typed?	Required fields, inputs, outputs, tools, and approvals are validated.	🟡 Partial - Pydantic models exist; schema is not full and no validation CLI exists.
52	Skill extraction from docs	P1	Can docs/runbooks become skills?	System proposes structured skill from SOP/runbook with evidence.	✅ Done for local Markdown process/policy sections.
53	Skill versioning	P0	Can skills evolve safely?	Every skill has version, changelog, author, evidence, and rollback path.	❌ Missing.
54	Skill dry run	P0	Can skill be simulated?	`--dry-run` shows planned steps without side effects.	✅ Done behaviorally through `sb company-brain skills run`; an explicit `--dry-run` flag is still a CLI polish item.
55	Skill execution trace	P0	Is every step traceable?	Each step logs inputs, tools, outputs, evidence, policy decision.	🟡 Partial - dry-run traces persist planned steps, inputs, evidence, and policy controls; tool outputs are not implemented.
56	Tool binding	P0	Can skill call tools?	Skill steps can bind to CLI/API/MCP tools.	❌ Missing.
57	Approval gates	P0	Are risky steps blocked?	Write/send/delete/deploy/payment actions require explicit approval.	🟡 Partial - risky steps are flagged; no approval workflow gates execution.
58	Skill failure handling	P1	What happens when a step fails?	Failure is classified, logged, and recoverable.	❌ Missing.
59	Skill feedback loop	P1	Can failed runs improve the skill?	Failed traces generate suggested skill patches.	❌ Missing.
60	Skill export	P1	Can skills be exported for agents?	Exports to `SKILLS.md`, MCP resource, or agent-specific instructions.	🟡 Partial - `skills-write` exports `SKILL.md`; MCP/agent-specific exports are missing.
61	Skill marketplace/templates	P2	Can users reuse common skills?	Includes templates for incident response, PR review, release, customer escalation.	❌ Missing.
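
Items 54 and 57 describe a dry run that lists planned steps and flags risky ones without side effects. A minimal sketch of that behavior is shown here, assuming skill instructions are plain strings and that risk is detected by a crude keyword match. `plan_dry_run`, `PlannedStep`, and the verb list are hypothetical and do not mirror `skill_runner.py`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Verb stems taken from the pass criteria in item 57; matching is a crude prefix check.
RISKY_VERBS = ("write", "send", "delete", "deploy", "pay")
_RISKY = re.compile(r"\b(" + "|".join(RISKY_VERBS) + r")", re.IGNORECASE)


@dataclass(frozen=True)
class PlannedStep:
    index: int
    instruction: str
    requires_approval: bool
    reason: Optional[str]


def plan_dry_run(instructions: list[str]) -> list[PlannedStep]:
    """Build a plan only; nothing is executed and no tool is called."""
    plan = []
    for index, instruction in enumerate(instructions, start=1):
        match = _RISKY.search(instruction)
        reason = f"matched risky verb '{match.group(1).lower()}'" if match else None
        plan.append(PlannedStep(index, instruction, match is not None, reason))
    return plan
```

A prefix match will over-flag (for example "sender" matches "send"), which is arguably the safer failure direction for an approval gate; a real executor would still need a policy layer rather than a regex.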

6. Context Compiler

#	Capability	Priority	Vetting Question	Pass Criteria	Status
62	Task-aware context pack	P0	Does it compile context for a specific task?	`sb company-brain context build --task ...` returns focused context, not raw dump.	🟡 Partial - `sb company-brain context "task"` builds focused packs; target `context build --task` surface is missing.
63	Role-aware context	P1	Does context differ by role?	Engineer, founder, PM, support agent receive different context.	🟡 Partial - role is captured in CLI payload; selection strategy does not change by role.
64	Token budget control	P0	Can it fit within model limits?	Context has hard budget, prioritization, and truncation strategy.	✅ Done - `ContextPackV2.enforce_budget` is used.
65	Evidence-carrying context	P0	Do facts retain receipts inside context?	Context pack includes source IDs, spans, timestamps, and confidence.	✅ Done for current local evidence receipts.
66	Freshness-aware packing	P0	Does current info beat stale info?	Latest valid decisions/docs are preferred.	❌ Missing.
67	Conflict-aware packing	P0	Does it include conflict warnings?	Conflicting context is explicitly marked.	🟡 Partial - contradictions from review are included as context conflicts when detected.
68	Permission-aware packing	P0	Does it exclude restricted material?	Context compiler applies same ACL as retrieval.	❌ Missing.
69	Agent-specific export	P1	Can it export for Claude/Cursor/CLI/MCP?	Same task can be formatted for different consuming agents.	🟡 Partial - Markdown/JSON context exists; agent-specific formats are missing.
70	Context diffing	P2	Can user see what changed between packs?	Pack versions can be compared.	❌ Missing.
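
Items 64 and 65 require a hard token budget while keeping receipts attached to each fact. The sketch below illustrates greedy packing under a budget. It is not the `ContextPackV2.enforce_budget` implementation; `ContextItem`, `estimate_tokens`, and `pack_context` are hypothetical names, and the word-count token estimate is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    text: str
    source_id: str  # receipt pointer travels with the text
    score: float


def estimate_tokens(text: str) -> int:
    """Crude estimate: one token per whitespace-separated word."""
    return len(text.split())


def pack_context(items: list[ContextItem], budget_tokens: int) -> list[ContextItem]:
    """Greedily keep the highest-scoring items that still fit in the budget."""
    packed: list[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.score, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost > budget_tokens:
            continue  # skip this item; a smaller, lower-ranked one may still fit
        packed.append(item)
        used += cost
    return packed
```

Because each `ContextItem` carries its `source_id`, dropping an item for budget reasons never separates a claim from its evidence.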

7. Security, Governance, and Safety

#	Capability	Priority	Vetting Question	Pass Criteria	Status
71	Source allowlist	P0	Can user control what is ingested?	Only configured sources are indexed.	🟡 Partial - commands ingest explicit local paths; no connector allowlist registry.
72	Secret detection	P0	Are tokens/API keys caught?	Common secrets are redacted before indexing.	❌ Missing.
73	PII detection	P0	Is personal data handled safely?	PII is labeled, redacted, or permission-restricted.	❌ Missing.
74	Policy engine	P0	Is there centralized policy evaluation?	Read/write/action decisions go through policy layer.	❌ Missing for Company Brain.
75	Read vs write separation	P0	Are side effects separated from search?	Search cannot accidentally mutate external systems.	✅ Done for current commands; search/query/context are read-only, and skill run is dry-run-only.
76	Approval workflow	P0	Can humans approve risky actions?	Approval is required for configured actions.	❌ Missing.
77	Audit log	P0	Is every retrieval/action recorded?	Query, retrieved docs, answer, tool call, approval, mutation are logged.	🟡 Partial - compile runs and skill dry-run traces persist; retrieval/query audit is missing.
78	Workspace isolation	P1	Can two workspaces stay separate?	Indexes, policies, logs, and secrets are isolated.	🟡 Partial - workspace filtering exists; policy/log/secret isolation is missing.
79	Data deletion	P0	Can user delete/tombstone knowledge?	Delete request removes or tombstones source and derived atoms.	🟡 Partial - low-level delete helpers exist for some records; no safe tombstone/delete workflow.
80	Model routing policy	P1	Can sensitive context stay local?	Policy controls which model/provider can see which data.	❌ Missing for Company Brain.
81	Prompt injection defense	P0	Are source instructions treated as untrusted?	Ingested docs cannot override system/tool policy.	🟡 Partial - no model execution in current path; future action runner needs explicit defense.
82	Tool sandboxing	P1	Are tools constrained?	Tools run with least privilege and clear allow/deny rules.	❌ Missing for Company Brain skill execution.
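
Item 72 asks that common secrets be redacted before indexing. As a rough illustration, the sketch below applies a few regular expressions for widely documented token shapes. The pattern set is an assumption and far from exhaustive, and `redact_secrets` is a hypothetical helper, not an existing Company Brain function.

```python
import re

# Illustrative patterns only; a production scanner would need a much larger set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact_secrets(text: str) -> tuple[str, list[str]]:
    """Replace matches with a labeled placeholder and report which labels fired."""
    findings: list[str] = []
    for label, pattern in SECRET_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED:{label}]", text)
        if count:
            findings.append(label)
    return text, findings
```

Running this before atoms are written would keep matched strings out of both the canonical store and the FTS index; the returned labels could feed ingestion diagnostics.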

8. Evaluation and Observability

#	Capability	Priority	Vetting Question	Pass Criteria	Status
83	Trace ID per run	P0	Can every answer/action be debugged?	Every request returns a trace ID.	🟡 Partial - compile runs and skill dry-runs have IDs; search/query/context requests do not persist trace IDs.
84	Retrieval evals	P0	Do you measure precision/recall?	Golden questions verify expected source retrieval.	❌ Missing dedicated eval suite.
85	Faithfulness evals	P0	Do you detect unsupported claims?	Answer claims are checked against retrieved evidence.	❌ Missing.
86	Freshness evals	P0	Do you test latest/current behavior?	Queries like "current policy" prefer latest valid source.	❌ Missing.
87	Permission evals	P0	Do you test access leaks?	Restricted docs never appear for unauthorized user/agent.	❌ Missing.
88	Conflict evals	P1	Do you test contradictory sources?	System detects and reports conflict instead of merging silently.	🟡 Partial - contradiction unit/CLI tests exist; no eval suite.
89	Skill evals	P1	Do you test skill execution quality?	Skills have test cases with expected plan/tool/policy behavior.	🟡 Partial - dry-run behavior tests exist; no broader skill eval dataset.
90	Regression suite	P0	Can changes break existing behavior?	CI/local test suite catches retrieval, policy, schema, and skill regressions.	🟡 Partial - focused tests cover current compile/store/search/context/dry-run; policy/schema breadth is incomplete.
91	Latency tracking	P0	Do you measure performance?	Ingestion, retrieval, context build, answer generation timings are logged.	❌ Missing.
92	Cost tracking	P1	Do you track model/token cost?	Prompt/completion tokens and model calls are recorded.	❌ Missing; current Company Brain path is model-free.
93	Feedback analytics	P1	Can you see repeated failures?	Wrong answers and corrections are grouped by source/entity/skill.	❌ Missing.
94	Observability dashboard	P2	Is there a visual view?	Dashboard shows runs, latency, retrieval quality, failed skills, policy blocks.	❌ Missing.
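
Item 84 describes golden questions that verify expected source retrieval. A minimal sketch of such a check follows, assuming each case lists the source IDs that should appear in the top results and that `search` is any callable returning ranked source IDs. The case format and function names are hypothetical.

```python
from typing import Callable


def recall_at_k(retrieved: list[str], expected: set[str], k: int) -> float:
    """Fraction of expected sources that appear in the top-k retrieved results."""
    if not expected:
        raise ValueError("a golden case needs at least one expected source")
    return len(set(retrieved[:k]) & expected) / len(expected)


def run_golden_suite(
    cases: list[dict],
    search: Callable[[str], list[str]],
    k: int = 5,
) -> dict[str, float]:
    """Run every golden query through `search` and report recall@k per query."""
    return {
        case["query"]: recall_at_k(search(case["query"]), set(case["expected_sources"]), k)
        for case in cases
    }
```

A CI gate could then fail the build when any query drops below an agreed recall threshold.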

9. CLI, API, and MCP Readiness

#	Capability	Priority	Vetting Question	Pass Criteria	Status
95	CLI init	P0	Can user start easily?	`sb company-brain init` creates workspace/config/index.	🟡 Partial - `sb company-brain db init` exists; top-level `init` does not.
96	CLI ingest	P0	Can user ingest sources?	`sb company-brain ingest run` works with logs and summary.	🟡 Partial - `db store` / `db ingest-local` exist; target `ingest run` surface is missing.
97	CLI search	P0	Can user search memory?	`sb company-brain search "..."` returns ranked evidence.	✅ Done.
98	CLI answer	P0	Can user ask grounded questions?	`sb company-brain answer "..."` returns answer plus receipts.	🟡 Partial - command is named `query`, not `answer`.
99	CLI context build	P0	Can user build context packs?	`sb company-brain context build --task ...` works.	🟡 Partial - `context "task"` exists; target nested surface is missing.
100	CLI skills	P0	Can user list/validate/run skills?	`sb company-brain skills list/validate/run --dry-run` works.	🟡 Partial - list/get/run exist; validate and explicit `--dry-run` flag are missing.
101	HTTP API	P1	Can external apps call it?	Search, answer, context, skills, feedback, audit endpoints exist.	❌ Missing.
102	MCP server	P1	Can agent clients use it?	Exposes `company_brain.search`, `company_brain.context_pack`, `company_brain.get_skill`, `company_brain.run_skill`.	❌ Missing.
103	Stable contracts	P1	Are payloads typed and versioned?	Pydantic/JSON Schema contracts exist for core objects.	🟡 Partial - Pydantic models and schema strings exist; no exported JSON Schema contracts.
104	Developer docs	P1	Can others integrate quickly?	README explains setup, examples, contracts, and extension points.	🟡 Partial - how-to and this audit exist; extension docs are missing.

10. Storage and Indexing

#	Capability	Priority	Vetting Question	Pass Criteria	Status
105	SQLite source of truth	P0	Is canonical state durable?	Documents, atoms, entities, skills, decisions, mutations stored in SQLite.	🟡 Partial - documents, atoms, evidence, skills, compile runs, and skill runs are durable; entities/relationships/decisions/mutations are missing.
106	Vector index	P0	Is semantic search supported?	Documents, atoms, skills, decisions indexed for vector retrieval.	❌ Missing for Company Brain.
107	Keyword index	P0	Is exact-match search supported?	BM25/ripgrep/FTS works for names, APIs, errors, commands.	✅ Done - SQLite FTS5 powers atom search.
108	Graph edges	P1	Are relationships queryable?	Entity relationships stored and traversable.	❌ Missing.
109	Index rebuild	P0	Can index be rebuilt safely?	Rebuild does not corrupt canonical store.	🟡 Partial - schema init backfills FTS; no explicit rebuild command.
110	Backup/restore	P1	Can user preserve brain state?	Workspace export/import or DB backup works.	❌ Missing.
111	Migration support	P1	Can schema evolve?	Alembic/simple migrations with version tracking.	🟡 Partial - schema version is tracked and `CREATE ... IF NOT EXISTS` statements add new tables; no migration framework.
112	Large corpus behavior	P2	Does it scale beyond toy data?	Tested on thousands to hundreds of thousands of chunks with acceptable latency.	❌ Missing.
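
Items 107 and 109 concern an FTS5 keyword index that can be rebuilt without touching canonical rows. The sketch below shows that separation with an external-content FTS5 table and SQLite's built-in `'rebuild'` command. The table names `atom` and `atom_fts` are assumptions, the actual `company_brain.db` schema may differ, and the example requires a Python build whose SQLite includes FTS5.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a canonical table and an FTS5 index over it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS atom (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
        CREATE VIRTUAL TABLE IF NOT EXISTS atom_fts
            USING fts5(text, content='atom', content_rowid='id');
        """
    )
    return conn


def rebuild_index(conn: sqlite3.Connection) -> None:
    """Re-derive the FTS index from the canonical `atom` table."""
    conn.execute("INSERT INTO atom_fts(atom_fts) VALUES ('rebuild')")
    conn.commit()


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple[int, str]]:
    """Return (id, text) rows ordered by FTS5 relevance (best match first).

    `query` is passed as FTS5 query syntax; user input would need escaping.
    """
    return conn.execute(
        "SELECT rowid, text FROM atom_fts WHERE atom_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
```

Since the index is derived entirely from `atom`, a rebuild can be repeated at any time; the canonical rows are only read, never modified.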

11. Open-Source Product Quality

#	Capability	Priority	Vetting Question	Pass Criteria	Status
113	One-command setup	P0	Can a new user run it quickly?	Install + init + ingest + ask works from README.	🟡 Partial - separate commands work; one-command setup is not documented.
114	Sample workspace	P0	Is there demo data?	Includes fake company brain dataset with docs, decisions, skills, evals.	🟡 Partial - tests create fake data; no committed demo Company Brain workspace/evals.
115	Clear positioning	P0	Does README say what this is?	"Local-first Company Brain for evidence-backed memory and agent skills."	🟡 Partial - docs describe this; top-level README positioning needs review.
116	Architecture diagram	P1	Can users understand internals?	README has flow: connectors -> ingestion -> memory/graph/skills -> context -> agents.	❌ Missing.
117	Extension guide	P1	Can contributors add connectors/skills?	Documented plugin interfaces.	❌ Missing.
118	Test coverage	P0	Are core flows tested?	Tests cover ingestion, retrieval, policy, context, skills, mutations.	🟡 Partial - compile/store/search/context/dry-run tests exist; policy/mutations/connectors missing.
119	MIT license	P0	Is licensing clear?	LICENSE file present and package metadata aligned.	🟡 Partial - verify at release; not audited in this checklist pass.
120	No private/org leakage	P0	Is repo safe to publish?	No company names, secrets, internal URLs, personal data, or private examples.	⚠️ Risk - requires dedicated repository scan before release.

Minimum Bar To Call It A Real Company Brain

Gate	Must Pass	Current Status
Ingest	Local files + Git repo ingestion with incremental updates	🟡 Partial - local Markdown only; no Git/incremental pipeline.
Store	Durable SQLite-backed source documents, knowledge atoms, decisions, skills, evidence receipts	🟡 Partial - separate DB stores docs/atoms/evidence/skills/runs; decisions are atoms, not records.
Retrieve	Hybrid search with recency, authority, and evidence	🟡 Partial - FTS + evidence; no vector/recency/authority.
Answer	Grounded answers with citations/receipts and no-evidence fallback	✅ Done for deterministic local query.
Model	Entities, relationships, decisions, open loops, skills	🟡 Partial - open loops/decisions as atoms and skills exist; entities/relationships missing.
Compile	Task-specific context packs with token budget and evidence	✅ Done for current standalone context command.
Execute	At least manual skills with dry-run and policy-gated execution	🟡 Partial - dry-run exists; policy-gated execution missing.
Govern	Secret/PII redaction, permission-aware retrieval, approval for side effects	❌ Missing.
Observe	Trace ID, audit log, eval suite, feedback loop	🟡 Partial - compile and skill-run traces exist; audit/evals/feedback missing.
Expose	CLI first, then API/MCP	🟡 Partial - CLI exists; API/MCP missing.

Suggested Audit Tracking Format

Use this table in implementation reviews:

ID	Status	Evidence	Gap	Owner	Target
1	🟡	`brain/company_brain/compiler.py`, `query.py`, `skill_runner.py`	No governed action runtime	TBD	P0
37	🟡	`brain/company_brain/query.py` uses SQLite FTS and scoring helpers	No vector, recency, authority, or graph ranking	TBD	P0
57	🟡	`brain/company_brain/skill_runner.py` flags approval reasons	No approval workflow integration	TBD	P0
83	🟡	`company_brain_run` and `procedure_skill_run` tables	Query/context traces are not persisted	TBD	P0

Code-Level Vetting Map

Checklist Area	Suggested Module/File	Current Evidence	What To Validate Next
Core models	`brain/company_brain/models.py`	Workspace, source document, atom, evidence, skill, search, review, run models exist.	Add entity, relationship, decision record, open loop, memory mutation, skill version models.
Durable store	`brain/company_brain/datastore.py`	Standalone SQLite CRUD for documents, atoms, receipts, skills, runs.	Add migrations, entities, relationships, decisions, mutations, tombstones.
Ingestion pipeline	`brain/company_brain/ingest/pipeline.py`	Missing.	Build incremental pipeline with diagnostics.
Parsers	`brain/company_brain/ingest/parsers.py`	Markdown parsing lives inside `compiler.py`.	Split parsers and add TXT/JSON/YAML/PDF/code docs.
Deduplication	`brain/company_brain/ingest/dedupe.py`	Deterministic entry dedupe exists in compiler.	Add hash and semantic dedupe across source versions.
Extraction	`brain/company_brain/ingest/extractors.py`	Process/policy/decision/open-loop extraction exists in compiler.	Split extraction, add entities/relationships/owners.
Freshness	`brain/company_brain/ingest/freshness.py`	Atom freshness fields exist.	Compute and rank with freshness; support supersession.
Connectors	`brain/company_brain/connectors/`	Missing.	Add local files and Git first.
Graph	`brain/company_brain/graph/`	Missing.	Entity resolution, aliases, relationship traversal.
Skills	`brain/company_brain/skills/`	Skill compilation/export in compiler and dry-run in `skill_runner.py`.	Split model/compiler/validator/executor/exporter modules.
Context compiler	`brain/company_brain/context/`	Context assembly exists in `query.py`.	Split compiler/packer/redactor; add role/permission/freshness packing.
Evals	`brain/company_brain/evals/`	Missing.	Retrieval, faithfulness, freshness, permission, skill eval suites.
API	`brain/company_brain/api.py`	Missing.	Search, answer, context, skills, feedback, audit API surface.
CLI	`brain/cli/company_brain.py`	Registered under `sb company-brain`.	Consider moving feature-local CLI assembly or adding aliases/nested target commands.

Recommended CLI Surface

Prefer one consistent naming convention.

Official command:

sb company-brain init
sb company-brain ingest run
sb company-brain search "how do we handle releases?"
sb company-brain answer "why did we choose LanceDB?"
sb company-brain context build --task "debug retrieval issue"
sb company-brain skills list
sb company-brain skills validate skills/release.yaml
sb company-brain skills run release.checklist --dry-run
sb company-brain eval run --suite core
sb company-brain audit show --trace <trace_id>

Short alias:

sb cb init
sb cb ingest run
sb cb search "how do we handle releases?"
sb cb answer "why did we choose LanceDB?"
sb cb context build --task "debug retrieval issue"
sb cb skills list
sb cb eval run --suite core

Support both:

company-brain = official command
cb = short alias

Current CLI differences to close:

Target	Current Equivalent	Gap
`sb company-brain init`	`sb company-brain db init`	Add top-level convenience command.
`sb company-brain ingest run`	`sb company-brain db store` / `db ingest-local`	Add connector-oriented ingest namespace.
`sb company-brain answer`	`sb company-brain query`	Add alias or rename with compatibility.
`sb company-brain context build --task`	`sb company-brain context "task"`	Add nested target command.
`sb company-brain skills run --dry-run`	`sb company-brain skills run` dry-runs by default	Add explicit flag before real execution exists.
`sb cb ...`	Missing	Add short alias if desired.
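
The alias and nesting gaps in the table above can be illustrated with the standard library's `argparse`, which supports subcommand aliases directly. This is only a sketch of the command shape: the real CLI in `brain/cli/company_brain.py` may be built on a different framework, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of `sb company-brain ...` with a `cb` alias and nested `context build`."""
    parser = argparse.ArgumentParser(prog="sb")
    top = parser.add_subparsers(dest="group", required=True)

    # `cb` is registered as an alias of the official `company-brain` command.
    brain = top.add_parser("company-brain", aliases=["cb"])
    commands = brain.add_subparsers(dest="command", required=True)

    commands.add_parser("init")

    search = commands.add_parser("search")
    search.add_argument("query")

    # Keep the current `query` name working as an alias of the target `answer`.
    answer = commands.add_parser("answer", aliases=["query"])
    answer.add_argument("question")

    context = commands.add_parser("context")
    context_commands = context.add_subparsers(dest="context_command", required=True)
    build = context_commands.add_parser("build")
    build.add_argument("--task", required=True)

    return parser
```

Note that `argparse` stores whichever name the user typed in `dest`, so handlers would need to treat `cb` and `company-brain` (and `query` and `answer`) as equivalent.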

Recommended MCP Tool Names

Expose these from the SecondBrain MCP server:

company_brain.search
company_brain.answer
company_brain.context_pack
company_brain.get_skill
company_brain.run_skill
company_brain.validate_skill
company_brain.record_decision
company_brain.record_feedback
company_brain.find_open_loops
company_brain.resolve_entity
company_brain.audit_trace

Suggested P0 Implementation Backlog

Step	Work Item	Module	Current Status
1	Define Pydantic models for workspace, source document, atom, entity, relationship, decision record, procedure skill, evidence receipt, memory mutation	`brain/company_brain/models.py`	🟡 Partial - core storage models exist; graph/mutation models missing.
2	Add SQLite tables and repository methods	`brain/company_brain/datastore.py`	🟡 Partial - docs/atoms/evidence/skills/runs exist; graph/mutations missing.
3	Build local files connector	`brain/company_brain/connectors/local_files.py`	🟡 Partial - local Markdown compile exists, connector module missing.
4	Build Git repo connector	`brain/company_brain/connectors/git.py`	❌ Missing.
5	Add ingestion pipeline with chunking, hash-based incremental updates, and error reporting	`brain/company_brain/ingest/pipeline.py`	❌ Missing.
6	Add extraction layer for entities, decisions, facts, open loops	`brain/company_brain/ingest/extractors.py`	🟡 Partial - compiler extracts decisions/open loops/processes/policies; entities missing.
7	Add hybrid retrieval integration with current SecondBrain retriever	`brain/company_brain/api.py` or `query.py`	🟡 Partial - FTS and scoring reuse exist; vector integration missing.
8	Add grounded answer function with receipts and no-evidence fallback	`brain/company_brain/api.py`	🟡 Partial - CLI query has deterministic answer; no API function module.
9	Add context compiler with token budget and evidence preservation	`brain/company_brain/context/compiler.py`	🟡 Partial - implemented in `query.py`; split module later.
10	Add manual skill schema, validation, and dry-run	`brain/company_brain/skills/`	🟡 Partial - dry-run exists; validation/import missing.
11	Add policy checks for retrieval and skill execution	existing policy layer plus `brain/company_brain/`	❌ Missing.
12	Add CLI commands	`brain/cli/company_brain.py`	🟡 Partial - core commands exist; target aliases/namespaces missing.
13	Add eval suites for retrieval, faithfulness, freshness, permissions	`brain/company_brain/evals/`	❌ Missing.
14	Add README/demo dataset	`examples/company_brain/` or docs	❌ Missing.

Final Warning

The most important failure mode is becoming "RAG over my repo." A real Company Brain must also model decisions, ownership, workflows, skills, freshness, policy, and feedback-driven memory mutation.

For SecondBrain, brain/company_brain/ should become the organizational memory and skill compiler module, not merely another retrieval package.