Company Brain Readiness Checklist

YC's Company Brain framing is specific: it is not just company-wide search or a chatbot over documents. A real Company Brain should pull fragmented company knowledge together, structure it, keep it current, and convert it into an executable skills file for AI.

For SecondBrain, treat the following package as the implementation:

brain/company_brain/

This checklist is an audit document for the current SecondBrain implementation. The status column is intentionally conservative: a capability is marked done only when it is working end-to-end, tested, and documented in the current tree.

Current Readiness Snapshot

Current module shape:

brain/company_brain/
  __init__.py
  compiler.py       # deterministic Markdown compiler and skill draft renderer
  datastore.py      # standalone SQLite datastore
  models.py         # compile, datastore, search, skill, and run models
  query.py          # standalone FTS search and context-pack assembly
  review.py         # focused Antahkarana/Buddhi reuse for evidence review
  skill_runner.py   # dry-run planning for stored procedure skills
  store.py          # baseline report persistence

The current implementation is a credible local-first seed. It can compile local Markdown into cited operating knowledge; store source documents, evidence receipts, knowledge atoms, procedure skills, compile runs, and skill dry-run traces in a separate company_brain.db; search those atoms with SQLite FTS; build evidence-carrying context packs; and surface contradiction review through focused Antahkarana reuse.
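
To make the "search atoms with SQLite FTS while keeping receipts" idea concrete, here is a minimal sketch. The table names (atom, atom_fts), column names, and helper functions are assumptions for illustration and are not taken from datastore.py or query.py; it also assumes the local SQLite build includes FTS5.

```python
import re
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory store with two atoms and an FTS5 index over their text."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE atom (
            atom_id TEXT PRIMARY KEY,
            kind TEXT NOT NULL,
            body TEXT NOT NULL,
            source_path TEXT NOT NULL,
            line_start INTEGER NOT NULL,
            line_end INTEGER NOT NULL
        );
        CREATE VIRTUAL TABLE atom_fts USING fts5(atom_id UNINDEXED, body);
        """
    )
    rows = [
        ("a1", "decision", "Releases are cut from main every Tuesday.", "docs/release.md", 10, 12),
        ("a2", "policy", "Production deploys require two approvals.", "docs/policy.md", 4, 6),
    ]
    conn.executemany("INSERT INTO atom VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.executemany(
        "INSERT INTO atom_fts (atom_id, body) VALUES (?, ?)",
        [(row[0], row[2]) for row in rows],
    )
    conn.commit()
    return conn


def to_match_expression(query: str) -> str:
    """Quote each word so punctuation in user input cannot break FTS5 syntax."""
    tokens = re.findall(r"\w+", query.lower())
    return " OR ".join(f'"{token}"' for token in tokens)


def search_atoms(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[dict]:
    """Return ranked atoms, each carrying the receipt that backs it."""
    expression = to_match_expression(query)
    if not expression:
        return []
    cursor = conn.execute(
        """
        SELECT a.atom_id, a.kind, a.body, a.source_path, a.line_start, a.line_end
        FROM atom_fts
        JOIN atom AS a ON a.atom_id = atom_fts.atom_id
        WHERE atom_fts MATCH ?
        ORDER BY bm25(atom_fts)
        LIMIT ?
        """,
        (expression, limit),
    )
    return [
        {
            "atom_id": atom_id,
            "kind": kind,
            "text": body,
            "receipt": {"source_path": path, "lines": (start, end)},
        }
        for atom_id, kind, body, path, start, end in cursor
    ]
```

An empty result list is the natural trigger for an "insufficient evidence" response instead of a generated answer.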

It should not yet be called a complete Company Brain. The main gaps are Git ingestion, incremental sync, first-class entity/relationship/decision/open-loop tables, vector retrieval, permission-aware retrieval, redaction, feedback-driven memory mutation, eval suites, HTTP API, MCP tools, and real policy-gated action execution.

Target Internal Structure

Recommended long-term structure:

brain/company_brain/
  models.py
  store.py
  ingest/
    pipeline.py
    parsers.py
    dedupe.py
    extractors.py
    freshness.py
  connectors/
    local_files.py
    git.py
    github.py
    notion.py
    gmail.py
    slack.py
    calendar.py
  graph/
    entities.py
    relationships.py
    resolver.py
  skills/
    model.py
    compiler.py
    validator.py
    executor.py
    exporter.py
  context/
    compiler.py
    packer.py
    redactor.py
  evals/
    datasets.py
    judges.py
    suites.py
  api.py
  cli.py

Scoring

| Score | Meaning |
|---|---|
| ✅ Done | Working end-to-end, tested, documented |
| 🟡 Partial | Exists but incomplete, brittle, or not validated |
| ❌ Missing | Not implemented |
| ⚠️ Risk | Implemented but unsafe, unscalable, or unclear |

1. Core Product Fit

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 1 | Not just RAG/chatbot | P0 | Can the system do more than answer questions over docs? | It supports memory, context compilation, decisions, skills, and governed actions. | 🟡 Partial - context packs, decisions-as-atoms, skills, and dry-runs exist; governed action execution is not real yet. |
| 2 | Company knowledge map | P0 | Does it model how work actually happens? | People, projects, systems, decisions, workflows, tools, and dependencies are represented. | 🟡 Partial - workflows/decisions/open loops are extracted as atoms; people/projects/systems/dependencies are not first-class. |
| 3 | Living memory | P0 | Does knowledge update over time? | New docs/conversations/events update memory without full manual rebuild. | 🟡 Partial - repeated local store runs update deterministic records; no incremental watcher or connector sync. |
| 4 | Evidence-backed answers | P0 | Does every answer show receipts? | Source document, line/span, timestamp, and confidence are visible. | ✅ Done - search/query/context preserve evidence receipts from local Markdown. |
| 5 | Executable skills | P0 | Can it convert knowledge into reusable agent instructions? | SOPs/runbooks/checklists can become structured skills. | 🟡 Partial - process/policy sections become skill drafts and stored procedure skills; schema is still light. |
| 6 | Governed execution | P0 | Can agents safely act, not just reason? | Side effects go through policy and approval gates. | 🟡 Partial - dry-run flags approval-required steps and disables side effects; no human approval workflow or executor. |
| 7 | Human correction loop | P0 | Can users correct wrong memory? | Corrections are stored, traceable, and affect future retrieval. | ❌ Missing - no feedback/mutation path for Company Brain records. |
| 8 | Agent-ready interface | P1 | Can other agents/tools use it? | CLI/API/MCP endpoints expose search, context, skills, memory, and audit. | 🟡 Partial - CLI exists; API and MCP tools are missing. |
| 9 | Local-first mode | P1 | Can it run privately on local machine/company infra? | Core ingestion, retrieval, memory, and skills work without mandatory cloud dependency. | ✅ Done - current compiler, store, search, context, and dry-run are offline-safe. |
| 10 | Multi-workspace support | P2 | Can different projects/companies stay isolated? | Separate workspaces with isolated indexes, policies, and audit logs. | 🟡 Partial - workspace IDs exist in the standalone DB; indexes/policies/audit are not fully isolated. |

2. Source Ingestion

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 11 | Local file ingestion | P0 | Can it ingest Markdown, TXT, PDF, JSON, YAML, code docs? | sb ingest or sb company-brain ingest works on local folders with stable document IDs. | 🟡 Partial - local Markdown works through db store; other formats and ingest run surface are missing. |
| 12 | Git repo ingestion | P0 | Can it understand source repos? | Reads README, docs, ADRs, code comments, tests, configs, commit metadata. | ❌ Missing - no Company Brain Git connector yet. |
| 13 | Incremental ingestion | P0 | Does it avoid reprocessing everything? | Changed files are detected by hash, mtime, version, or Git diff. | 🟡 Partial - content hashes and stable IDs exist; no incremental pipeline. |
| 14 | Document versioning | P0 | Does it track changes? | Same document has version history and supersession metadata. | 🟡 Partial - source documents have version and timestamps; no version history table. |
| 15 | Conversation import | P1 | Can it ingest chat transcripts? | Markdown/JSON chat import extracts decisions, facts, open loops. | ❌ Missing. |
| 16 | GitHub/GitLab connector | P1 | Can it ingest PRs/issues/releases? | PR discussions, issue status, labels, owners, and links are indexed. | ❌ Missing. |
| 17 | Email connector | P1 | Can it ingest email threads safely? | Thread-aware ingestion with sender, date, attachments, and redaction. | ❌ Missing. |
| 18 | Calendar/meeting connector | P1 | Can it ingest meeting context? | Participants, agenda, notes, actions, decisions are captured. | ❌ Missing. |
| 19 | Notion/Confluence/GDocs connector | P1 | Can it ingest canonical company docs? | Page hierarchy, owner, last edited timestamp, and permissions are preserved. | ❌ Missing. |
| 20 | Slack/Teams connector | P2 | Can it handle noisy channels? | Channel allowlist, thread extraction, decision detection, noise filtering. | ❌ Missing. |
| 21 | Connector registry | P1 | Are connectors pluggable? | Each connector has type, config, sync mode, permission model, and status. | ❌ Missing. |
| 22 | Failed ingestion handling | P0 | What happens when parsing fails? | Failures are logged, retryable, and visible in diagnostics. | 🟡 Partial - compiler warnings exist; no retryable ingestion diagnostics. |
| 23 | Source authority metadata | P0 | Does system know which source is canonical? | ADR > official docs > merged PR > chat message, or configurable ranking. | ❌ Missing. |
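
Item 13 asks for change detection by hash. A minimal sketch of how that could work follows, assuming a snapshot is a plain mapping from relative path to SHA-256 digest; the function names snapshot and plan_sync are hypothetical and do not exist in the current tree.

```python
import hashlib
from pathlib import Path


def content_hash(data: bytes) -> str:
    """Stable content fingerprint for a source document."""
    return hashlib.sha256(data).hexdigest()


def snapshot(root: Path, pattern: str = "*.md") -> dict[str, str]:
    """Map each matching file (relative POSIX path) to its content hash."""
    return {
        path.relative_to(root).as_posix(): content_hash(path.read_bytes())
        for path in sorted(root.rglob(pattern))
        if path.is_file()
    }


def plan_sync(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths so only added and changed files need reprocessing."""
    return {
        "added": sorted(set(current) - set(previous)),
        "changed": sorted(
            path for path in current if path in previous and previous[path] != current[path]
        ),
        "removed": sorted(set(previous) - set(current)),
        "unchanged": sorted(path for path in current if previous.get(path) == current[path]),
    }
```

Persisting the previous snapshot next to the compile run would let a later ingest command skip the "unchanged" bucket and tombstone the "removed" one.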

3. Knowledge Modeling

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 24 | SourceDocument object | P0 | Is every source stored as a canonical object? | Has ID, source URI, checksum, owner, timestamp, permissions. | 🟡 Partial - ID, URI/path, hash, timestamps, metadata exist; owner/permissions are not complete. |
| 25 | KnowledgeAtom object | P0 | Are facts extracted below document level? | Important facts are stored as atomic, evidence-linked units. | ✅ Done - local decisions, processes, policies, and open loops become evidence-linked atoms. |
| 26 | Entity model | P0 | Are people/projects/systems modeled? | Entities have aliases, type, description, evidence, and relationships. | ❌ Missing. |
| 27 | Relationship model | P0 | Can it represent dependencies? | Supports owns, depends_on, decided_by, supersedes, blocks, related_to. | ❌ Missing. |
| 28 | DecisionRecord model | P0 | Are decisions first-class? | Stores decision, rationale, alternatives, owner, date, evidence, status. | 🟡 Partial - decisions are atoms/context refs; no first-class decision table. |
| 29 | OpenLoop model | P1 | Can it track pending work/questions? | Extracts unresolved items, owner, due date, source, status. | 🟡 Partial - open loops are extracted as atoms; no owner/due/status object. |
| 30 | Skill model | P0 | Are workflows represented structurally? | Skill has trigger, inputs, steps, tools, policy, approval rules, expected output. | 🟡 Partial - stored skills have trigger/instructions/evidence/safety; inputs/tools/outputs/version policy are missing. |
| 31 | EvidenceReceipt model | P0 | Are claims traceable? | Every fact/answer/skill step points to one or more receipts. | ✅ Done for current local flows. |
| 32 | MemoryMutation model | P0 | Are memory changes auditable? | Add/update/merge/deprecate/delete events are append-only. | ❌ Missing. |
| 33 | Freshness metadata | P0 | Does knowledge age correctly? | Every object has created_at, updated_at, valid_from, valid_until, freshness_score. | 🟡 Partial - atoms have these fields; not every object does, and ranking does not fully use them. |
| 34 | Confidence score | P0 | Does it distinguish weak vs strong facts? | Confidence combines source quality, extraction certainty, recency, and corroboration. | 🟡 Partial - confidence fields exist; score is mostly deterministic/static. |
| 35 | Supersession model | P0 | Can old knowledge be replaced? | Objects can supersede or be superseded without silent deletion. | 🟡 Partial - atom supersession fields exist; no operational mutation/deprecation flow. |
| 36 | Sensitivity labels | P0 | Is sensitive data marked? | Objects carry sensitivity level: public/internal/confidential/secret/PII. | 🟡 Partial - atom sensitivity_level exists; no classifier or enforcement. |
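
Items 31, 32, and 35 together describe evidence-linked atoms whose replacement is recorded rather than silently applied. The sketch below shows one possible shape using standard-library dataclasses (the real models are Pydantic, per item 51). Every class, field, and method name here is an illustrative assumption, not the content of models.py.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class EvidenceReceipt:
    source_uri: str
    line_start: int
    line_end: int
    content_hash: str


@dataclass
class KnowledgeAtom:
    atom_id: str
    kind: str  # e.g. "decision", "process", "policy", "open_loop"
    text: str
    receipts: list[EvidenceReceipt]
    confidence: float = 0.5
    sensitivity_level: str = "internal"
    superseded_by: Optional[str] = None


@dataclass(frozen=True)
class MemoryMutation:
    op: str  # "add" or "supersede" in this sketch
    atom_id: str
    reason: str
    at: str  # ISO-8601 UTC timestamp


class AtomStore:
    """In-memory store where every change appends to a mutation log."""

    def __init__(self) -> None:
        self._atoms: dict[str, KnowledgeAtom] = {}
        self._log: list[MemoryMutation] = []

    def _record(self, op: str, atom_id: str, reason: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self._log.append(MemoryMutation(op, atom_id, reason, stamp))

    def add(self, atom: KnowledgeAtom, reason: str) -> None:
        if not atom.receipts:
            raise ValueError("atoms must carry at least one evidence receipt")
        self._atoms[atom.atom_id] = atom
        self._record("add", atom.atom_id, reason)

    def supersede(self, old_id: str, new_atom: KnowledgeAtom, reason: str) -> None:
        old = self._atoms[old_id]  # KeyError if the old atom is unknown
        self.add(new_atom, reason)
        old.superseded_by = new_atom.atom_id  # old atom is kept, not deleted
        self._record("supersede", old_id, reason)

    def get(self, atom_id: str) -> KnowledgeAtom:
        return self._atoms[atom_id]

    def active(self) -> list[KnowledgeAtom]:
        return [atom for atom in self._atoms.values() if atom.superseded_by is None]

    def history(self) -> tuple[MemoryMutation, ...]:
        return tuple(self._log)
```

Keeping the superseded atom in place means an "as of X" query or an audit can still reach the older claim and its receipts.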

4. Retrieval and Answer Quality

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 37 | Hybrid retrieval | P0 | Does retrieval combine semantic and keyword search? | Vector + BM25/keyword + recency + authority ranking are used. | 🟡 Partial - FTS/keyword scoring exists; vector, recency, and authority ranking are missing. |
| 38 | Entity-aware retrieval | P0 | Does it use graph/entity context? | Query for a project/system pulls related owners, decisions, docs, issues. | ❌ Missing. |
| 39 | Temporal retrieval | P0 | Can user ask "latest", "last month", "as of X"? | Time filters and freshness-aware ranking work. | ❌ Missing. |
| 40 | Permission-aware retrieval | P0 | Can retrieval filter restricted data? | Unauthorized docs/facts never enter context. | ❌ Missing. |
| 41 | Authority-aware ranking | P0 | Does canonical knowledge rank higher? | Official docs/ADRs beat stale chats or random notes. | ❌ Missing. |
| 42 | Conflict detection | P0 | Does it detect contradictory sources? | Answer says "sources disagree" and shows both. | 🟡 Partial - Buddhi contradiction review exists for retrieved hits; not integrated with all answer flows. |
| 43 | Stale knowledge warning | P0 | Does it warn on old evidence? | Uses "possibly stale" marker when source is outdated or superseded. | ❌ Missing. |
| 44 | Citation/receipt preservation | P0 | Does answer preserve source links? | Final answer includes source references for key claims. | ✅ Done for current search/query/context commands. |
| 45 | No-evidence behavior | P0 | What happens when answer is unsupported? | It says insufficient evidence instead of hallucinating. | ✅ Done for deterministic query fallback. |
| 46 | Multi-hop retrieval | P1 | Can it follow dependencies? | Example: service -> owner -> runbook -> recent incident -> fix PR. | ❌ Missing. |
| 47 | Query decomposition | P1 | Can it break complex questions? | Multi-part questions are decomposed into subqueries with trace. | ❌ Missing. |
| 48 | Retrieval diagnostics | P1 | Can developer inspect why result was chosen? | Shows score components: vector, keyword, recency, authority, graph, permission. | 🟡 Partial - result score and matched terms exist; component diagnostics are missing. |
| 49 | Golden query suite | P0 | Do you test retrieval quality? | Fixed benchmark questions run in CI or local eval. | 🟡 Partial - CLI/search behavior tests exist; no dedicated quality eval suite. |
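
Items 37, 41, and 48 can be illustrated together: a ranking that blends keyword, recency, and authority signals and returns the per-component breakdown for diagnostics. This is a sketch only. The weights, the 180-day half-life, and the authority values are arbitrary assumptions, and the source-type ordering simply mirrors the "ADR > official docs > merged PR > chat message" example from item 23.

```python
from dataclasses import dataclass
from datetime import date

# Assumed authority weights; only the ordering comes from the checklist.
AUTHORITY = {"adr": 1.0, "official_doc": 0.8, "merged_pr": 0.6, "chat": 0.3}


@dataclass(frozen=True)
class Hit:
    atom_id: str
    keyword_score: float  # assumed to be normalized to the range 0..1
    source_type: str
    updated_on: date


def score_hit(
    hit: Hit,
    today: date,
    half_life_days: float = 180.0,
    weights: tuple[float, float, float] = (0.6, 0.2, 0.2),
) -> dict:
    """Blend the signals and keep each component visible for diagnostics."""
    age_days = max((today - hit.updated_on).days, 0)
    components = {
        "keyword": hit.keyword_score,
        "recency": 0.5 ** (age_days / half_life_days),  # exponential decay
        "authority": AUTHORITY.get(hit.source_type, 0.1),
    }
    w_keyword, w_recency, w_authority = weights
    total = (
        w_keyword * components["keyword"]
        + w_recency * components["recency"]
        + w_authority * components["authority"]
    )
    return {"atom_id": hit.atom_id, "total": total, "components": components}


def rank(hits: list[Hit], today: date) -> list[dict]:
    """Score every hit and sort best-first."""
    return sorted((score_hit(hit, today) for hit in hits), key=lambda r: r["total"], reverse=True)
```

With equal keyword scores, a fresh ADR outranks a year-old chat message, and the components dictionary explains why.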

5. Executable Skills Layer

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 50 | Manual skill authoring | P0 | Can user define a skill by hand? | YAML/Markdown skill file can be loaded and validated. | 🟡 Partial - generated SKILL.md output exists; no Company Brain skill import/validation command. |
| 51 | Skill schema validation | P0 | Are skills strongly typed? | Required fields, inputs, outputs, tools, and approvals are validated. | 🟡 Partial - Pydantic models exist; schema is not full and no validation CLI exists. |
| 52 | Skill extraction from docs | P1 | Can docs/runbooks become skills? | System proposes structured skill from SOP/runbook with evidence. | ✅ Done for local Markdown process/policy sections. |
| 53 | Skill versioning | P0 | Can skills evolve safely? | Every skill has version, changelog, author, evidence, and rollback path. | ❌ Missing. |
| 54 | Skill dry run | P0 | Can skill be simulated? | --dry-run shows planned steps without side effects. | ✅ Done behaviorally through sb company-brain skills run; an explicit --dry-run flag is still a CLI polish item. |
| 55 | Skill execution trace | P0 | Is every step traceable? | Each step logs inputs, tools, outputs, evidence, policy decision. | 🟡 Partial - dry-run traces persist planned steps, inputs, evidence, and policy controls; tool outputs are not implemented. |
| 56 | Tool binding | P0 | Can skill call tools? | Skill steps can bind to CLI/API/MCP tools. | ❌ Missing. |
| 57 | Approval gates | P0 | Are risky steps blocked? | Write/send/delete/deploy/payment actions require explicit approval. | 🟡 Partial - risky steps are flagged; no approval workflow gates execution. |
| 58 | Skill failure handling | P1 | What happens when a step fails? | Failure is classified, logged, and recoverable. | ❌ Missing. |
| 59 | Skill feedback loop | P1 | Can failed runs improve the skill? | Failed traces generate suggested skill patches. | ❌ Missing. |
| 60 | Skill export | P1 | Can skills be exported for agents? | Exports to SKILLS.md, MCP resource, or agent-specific instructions. | 🟡 Partial - skills-write exports SKILL.md; MCP/agent-specific exports are missing. |
| 61 | Skill marketplace/templates | P2 | Can users reuse common skills? | Includes templates for incident response, PR review, release, customer escalation. | ❌ Missing. |
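
Items 54 and 57 describe a dry run that plans steps, flags risky ones, and never performs side effects. A rough sketch of that behavior follows. The keyword heuristic is deliberately naive and is an assumption for illustration, not how skill_runner.py decides; the verb list is taken from the pass criteria of item 57.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Prefixes derived from "write/send/delete/deploy/payment" in item 57.
RISKY_PREFIXES = ("write", "send", "delete", "deploy", "pay")


@dataclass(frozen=True)
class PlannedStep:
    index: int
    instruction: str
    requires_approval: bool
    approval_reason: Optional[str]
    side_effects_enabled: bool = False  # a dry run never enables side effects


def plan_dry_run(instructions: list[str]) -> list[PlannedStep]:
    """Turn skill instructions into a plan, flagging steps that look risky."""
    plan = []
    for index, text in enumerate(instructions, start=1):
        words = re.findall(r"[a-z]+", text.lower())
        matched = next(
            (prefix for prefix in RISKY_PREFIXES if any(w.startswith(prefix) for w in words)),
            None,
        )
        reason = f"step mentions a '{matched}' action" if matched else None
        plan.append(PlannedStep(index, text, matched is not None, reason))
    return plan
```

Prefix matching over-flags on purpose (for example "sender" would trip "send"); for an approval gate, a false positive costs a review while a false negative costs an unreviewed side effect.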

6. Context Compiler

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 62 | Task-aware context pack | P0 | Does it compile context for a specific task? | sb company-brain context build --task ... returns focused context, not raw dump. | 🟡 Partial - sb company-brain context "task" builds focused packs; target context build --task surface is missing. |
| 63 | Role-aware context | P1 | Does context differ by role? | Engineer, founder, PM, support agent receive different context. | 🟡 Partial - role is captured in CLI payload; selection strategy does not change by role. |
| 64 | Token budget control | P0 | Can it fit within model limits? | Context has hard budget, prioritization, and truncation strategy. | ✅ Done - ContextPackV2.enforce_budget is used. |
| 65 | Evidence-carrying context | P0 | Do facts retain receipts inside context? | Context pack includes source IDs, spans, timestamps, and confidence. | ✅ Done for current local evidence receipts. |
| 66 | Freshness-aware packing | P0 | Does current info beat stale info? | Latest valid decisions/docs are preferred. | ❌ Missing. |
| 67 | Conflict-aware packing | P0 | Does it include conflict warnings? | Conflicting context is explicitly marked. | 🟡 Partial - contradictions from review are included as context conflicts when detected. |
| 68 | Permission-aware packing | P0 | Does it exclude restricted material? | Context compiler applies same ACL as retrieval. | ❌ Missing. |
| 69 | Agent-specific export | P1 | Can it export for Claude/Cursor/CLI/MCP? | Same task can be formatted for different consuming agents. | 🟡 Partial - Markdown/JSON context exists; agent-specific formats are missing. |
| 70 | Context diffing | P2 | Can user see what changed between packs? | Pack versions can be compared. | ❌ Missing. |
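
Item 64 calls for a hard budget with prioritization. The sketch below shows one greedy way to do that while keeping each item's source attached. It is not the ContextPackV2.enforce_budget implementation, whose behavior is not shown in this document; the whitespace token estimate and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    atom_id: str
    text: str
    score: float
    source_uri: str  # receipt travels with the item into the pack


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: count whitespace-separated words."""
    return max(1, len(text.split()))


def pack_context(items: list[ContextItem], budget: int) -> dict:
    """Greedily keep the highest-scoring items that fit within the budget."""
    packed: list[ContextItem] = []
    dropped: list[str] = []
    used = 0
    for item in sorted(items, key=lambda i: i.score, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            packed.append(item)
            used += cost
        else:
            dropped.append(item.atom_id)  # report what was cut instead of hiding it
    return {"items": packed, "used_tokens": used, "dropped": dropped}
```

Returning the dropped IDs makes truncation inspectable, which would also give a later context-diffing feature something to compare.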

7. Security, Governance, and Safety

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 71 | Source allowlist | P0 | Can user control what is ingested? | Only configured sources are indexed. | 🟡 Partial - commands ingest explicit local paths; no connector allowlist registry. |
| 72 | Secret detection | P0 | Are tokens/API keys caught? | Common secrets are redacted before indexing. | ❌ Missing. |
| 73 | PII detection | P0 | Is personal data handled safely? | PII is labeled, redacted, or permission-restricted. | ❌ Missing. |
| 74 | Policy engine | P0 | Is there centralized policy evaluation? | Read/write/action decisions go through policy layer. | ❌ Missing for Company Brain. |
| 75 | Read vs write separation | P0 | Are side effects separated from search? | Search cannot accidentally mutate external systems. | ✅ Done for current commands; search/query/context are read-only, and skill run is dry-run-only. |
| 76 | Approval workflow | P0 | Can humans approve risky actions? | Approval is required for configured actions. | ❌ Missing. |
| 77 | Audit log | P0 | Is every retrieval/action recorded? | Query, retrieved docs, answer, tool call, approval, mutation are logged. | 🟡 Partial - compile runs and skill dry-run traces persist; retrieval/query audit is missing. |
| 78 | Workspace isolation | P1 | Can two workspaces stay separate? | Indexes, policies, logs, and secrets are isolated. | 🟡 Partial - workspace filtering exists; policy/log/secret isolation is missing. |
| 79 | Data deletion | P0 | Can user delete/tombstone knowledge? | Delete request removes or tombstones source and derived atoms. | 🟡 Partial - low-level delete helpers exist for some records; no safe tombstone/delete workflow. |
| 80 | Model routing policy | P1 | Can sensitive context stay local? | Policy controls which model/provider can see which data. | ❌ Missing for Company Brain. |
| 81 | Prompt injection defense | P0 | Are source instructions treated as untrusted? | Ingested docs cannot override system/tool policy. | 🟡 Partial - no model execution in current path; future action runner needs explicit defense. |
| 82 | Tool sandboxing | P1 | Are tools constrained? | Tools run with least privilege and clear allow/deny rules. | ❌ Missing for Company Brain skill execution. |
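
For item 72, a pre-indexing redaction pass could start as small as the sketch below. The three patterns are illustrative assumptions and nowhere near a complete secret scanner; a real implementation would likely need a maintained rule set and entropy checks.

```python
import re

# Illustrative patterns only; labels and coverage are assumptions.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "assigned_secret": re.compile(
        r"(?i)\b(api[_-]?key|token|secret|password)\b(\s*[:=]\s*)(\S+)"
    ),
}


def redact_secrets(text: str) -> tuple[str, list[str]]:
    """Replace likely secrets with labeled placeholders and report what was found."""
    findings: list[str] = []
    for label, pattern in SECRET_PATTERNS.items():

        def _replace(match: re.Match, label: str = label) -> str:
            findings.append(label)
            if label == "assigned_secret":
                # Keep the key name and separator; redact only the value.
                return f"{match.group(1)}{match.group(2)}[REDACTED:{label}]"
            return f"[REDACTED:{label}]"

        text = pattern.sub(_replace, text)
    return text, findings
```

Running this before atoms are written means the FTS index and any context pack only ever see the placeholder, and the findings list can feed ingestion diagnostics.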

8. Evaluation and Observability

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 83 | Trace ID per run | P0 | Can every answer/action be debugged? | Every request returns a trace ID. | 🟡 Partial - compile runs and skill dry-runs have IDs; search/query/context requests do not persist trace IDs. |
| 84 | Retrieval evals | P0 | Do you measure precision/recall? | Golden questions verify expected source retrieval. | ❌ Missing dedicated eval suite. |
| 85 | Faithfulness evals | P0 | Do you detect unsupported claims? | Answer claims are checked against retrieved evidence. | ❌ Missing. |
| 86 | Freshness evals | P0 | Do you test latest/current behavior? | Queries like "current policy" prefer latest valid source. | ❌ Missing. |
| 87 | Permission evals | P0 | Do you test access leaks? | Restricted docs never appear for unauthorized user/agent. | ❌ Missing. |
| 88 | Conflict evals | P1 | Do you test contradictory sources? | System detects and reports conflict instead of merging silently. | 🟡 Partial - contradiction unit/CLI tests exist; no eval suite. |
| 89 | Skill evals | P1 | Do you test skill execution quality? | Skills have test cases with expected plan/tool/policy behavior. | 🟡 Partial - dry-run behavior tests exist; no broader skill eval dataset. |
| 90 | Regression suite | P0 | Can changes break existing behavior? | CI/local test suite catches retrieval, policy, schema, and skill regressions. | 🟡 Partial - focused tests cover current compile/store/search/context/dry-run; policy/schema breadth is incomplete. |
| 91 | Latency tracking | P0 | Do you measure performance? | Ingestion, retrieval, context build, answer generation timings are logged. | ❌ Missing. |
| 92 | Cost tracking | P1 | Do you track model/token cost? | Prompt/completion tokens and model calls are recorded. | ❌ Missing; current Company Brain path is model-free. |
| 93 | Feedback analytics | P1 | Can you see repeated failures? | Wrong answers and corrections are grouped by source/entity/skill. | ❌ Missing. |
| 94 | Observability dashboard | P2 | Is there a visual view? | Dashboard shows runs, latency, retrieval quality, failed skills, policy blocks. | ❌ Missing. |
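
Items 49 and 84 ask for golden questions that verify expected source retrieval. A minimal harness might look like the sketch below. The case format, the recall@k metric choice, and the 0.8 threshold are assumptions; the search callable stands in for whatever retrieval function the suite targets.

```python
from typing import Callable


def recall_at_k(retrieved: list[str], expected: list[str], k: int) -> float:
    """Fraction of expected sources that appear in the top-k retrieved sources."""
    wanted = set(expected)
    if not wanted:
        return 1.0
    return len(set(retrieved[:k]) & wanted) / len(wanted)


def run_golden_suite(
    cases: list[dict],
    search: Callable[[str], list[str]],
    k: int = 5,
    threshold: float = 0.8,
) -> dict:
    """Run every golden query and report per-case and mean recall."""
    results = []
    for case in cases:
        recall = recall_at_k(search(case["query"]), case["expected_sources"], k)
        results.append({"query": case["query"], "recall": recall, "passed": recall >= threshold})
    mean = sum(r["recall"] for r in results) / len(results) if results else 0.0
    return {
        "mean_recall": mean,
        "passed": all(r["passed"] for r in results),
        "cases": results,
    }
```

Because the suite takes the search function as a parameter, the same cases can run against FTS-only retrieval today and a hybrid retriever later, giving a before/after comparison.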

9. CLI, API, and MCP Readiness

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 95 | CLI init | P0 | Can user start easily? | sb company-brain init creates workspace/config/index. | 🟡 Partial - sb company-brain db init exists; top-level init does not. |
| 96 | CLI ingest | P0 | Can user ingest sources? | sb company-brain ingest run works with logs and summary. | 🟡 Partial - db store / db ingest-local exist; target ingest run surface is missing. |
| 97 | CLI search | P0 | Can user search memory? | sb company-brain search "..." returns ranked evidence. | ✅ Done. |
| 98 | CLI answer | P0 | Can user ask grounded questions? | sb company-brain answer "..." returns answer plus receipts. | 🟡 Partial - command is named query, not answer. |
| 99 | CLI context build | P0 | Can user build context packs? | sb company-brain context build --task ... works. | 🟡 Partial - context "task" exists; target nested surface is missing. |
| 100 | CLI skills | P0 | Can user list/validate/run skills? | sb company-brain skills list/validate/run --dry-run works. | 🟡 Partial - list/get/run exist; validate and explicit --dry-run flag are missing. |
| 101 | HTTP API | P1 | Can external apps call it? | Search, answer, context, skills, feedback, audit endpoints exist. | ❌ Missing. |
| 102 | MCP server | P1 | Can agent clients use it? | Exposes company_brain.search, company_brain.context_pack, company_brain.get_skill, company_brain.run_skill. | ❌ Missing. |
| 103 | Stable contracts | P1 | Are payloads typed and versioned? | Pydantic/JSON Schema contracts exist for core objects. | 🟡 Partial - Pydantic models and schema strings exist; no exported JSON Schema contracts. |
| 104 | Developer docs | P1 | Can others integrate quickly? | README explains setup, examples, contracts, and extension points. | 🟡 Partial - how-to and this audit exist; extension docs are missing. |

10. Storage and Indexing

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 105 | SQLite source of truth | P0 | Is canonical state durable? | Documents, atoms, entities, skills, decisions, mutations stored in SQLite. | 🟡 Partial - documents, atoms, evidence, skills, compile runs, and skill runs are durable; entities/relationships/decisions/mutations are missing. |
| 106 | Vector index | P0 | Is semantic search supported? | Documents, atoms, skills, decisions indexed for vector retrieval. | ❌ Missing for Company Brain. |
| 107 | Keyword index | P0 | Is exact-match search supported? | BM25/ripgrep/FTS works for names, APIs, errors, commands. | ✅ Done - SQLite FTS5 powers atom search. |
| 108 | Graph edges | P1 | Are relationships queryable? | Entity relationships stored and traversable. | ❌ Missing. |
| 109 | Index rebuild | P0 | Can index be rebuilt safely? | Rebuild does not corrupt canonical store. | 🟡 Partial - schema init backfills FTS; no explicit rebuild command. |
| 110 | Backup/restore | P1 | Can user preserve brain state? | Workspace export/import or DB backup works. | ❌ Missing. |
| 111 | Migration support | P1 | Can schema evolve? | Alembic/simple migrations with version tracking. | 🟡 Partial - schema version is tracked and CREATE IF NOT EXISTS evolves tables; no migration framework. |
| 112 | Large corpus behavior | P2 | Does it scale beyond toy data? | Tested on thousands to hundreds of thousands of chunks with acceptable latency. | ❌ Missing. |
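
Item 111 mentions "simple migrations with version tracking". One lightweight way to do that in SQLite is an ordered statement list keyed to PRAGMA user_version, sketched below. The table definitions are invented for the example, and the sketch assumes the connection is opened in autocommit mode (isolation_level=None) so transactions can be managed explicitly.

```python
import sqlite3

# Ordered, append-only list: entry N upgrades the schema to version N.
# These statements are illustrative, not the real company_brain.db schema.
MIGRATIONS = [
    "CREATE TABLE source_document ("
    "doc_id TEXT PRIMARY KEY, uri TEXT NOT NULL, content_hash TEXT NOT NULL)",
    "ALTER TABLE source_document ADD COLUMN owner TEXT",
    "CREATE TABLE entity ("
    "entity_id TEXT PRIMARY KEY, name TEXT NOT NULL, entity_type TEXT NOT NULL)",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply pending migrations one transaction at a time; return the final version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version <= current:
            continue  # already applied
        conn.execute("BEGIN")
        try:
            conn.execute(statement)
            # PRAGMA does not accept bound parameters; version is a trusted int.
            conn.execute(f"PRAGMA user_version = {version}")
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Unlike CREATE IF NOT EXISTS alone, this handles column additions and makes re-running the initializer a no-op once the database is current.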

11. Open-Source Product Quality

| # | Capability | Priority | Vetting Question | Pass Criteria | Status |
|---|---|---|---|---|---|
| 113 | One-command setup | P0 | Can a new user run it quickly? | Install + init + ingest + ask works from README. | 🟡 Partial - separate commands work; one-command setup is not documented. |
| 114 | Sample workspace | P0 | Is there demo data? | Includes fake company brain dataset with docs, decisions, skills, evals. | 🟡 Partial - tests create fake data; no committed demo Company Brain workspace/evals. |
| 115 | Clear positioning | P0 | Does README say what this is? | "Local-first Company Brain for evidence-backed memory and agent skills." | 🟡 Partial - docs describe this; top-level README positioning needs review. |
| 116 | Architecture diagram | P1 | Can users understand internals? | README has flow: connectors -> ingestion -> memory/graph/skills -> context -> agents. | ❌ Missing. |
| 117 | Extension guide | P1 | Can contributors add connectors/skills? | Documented plugin interfaces. | ❌ Missing. |
| 118 | Test coverage | P0 | Are core flows tested? | Tests cover ingestion, retrieval, policy, context, skills, mutations. | 🟡 Partial - compile/store/search/context/dry-run tests exist; policy/mutations/connectors missing. |
| 119 | MIT license | P0 | Is licensing clear? | LICENSE file present and package metadata aligned. | 🟡 Partial - verify at release; not audited in this checklist pass. |
| 120 | No private/org leakage | P0 | Is repo safe to publish? | No company names, secrets, internal URLs, personal data, or private examples. | ⚠️ Risk - requires dedicated repository scan before release. |

Minimum Bar To Call It A Real Company Brain

| Gate | Must Pass | Current Status |
|---|---|---|
| Ingest | Local files + Git repo ingestion with incremental updates | 🟡 Partial - local Markdown only; no Git/incremental pipeline. |
| Store | Durable SQLite-backed source documents, knowledge atoms, decisions, skills, evidence receipts | 🟡 Partial - separate DB stores docs/atoms/evidence/skills/runs; decisions are atoms, not records. |
| Retrieve | Hybrid search with recency, authority, and evidence | 🟡 Partial - FTS + evidence; no vector/recency/authority. |
| Answer | Grounded answers with citations/receipts and no-evidence fallback | ✅ Done for deterministic local query. |
| Model | Entities, relationships, decisions, open loops, skills | 🟡 Partial - open loops/decisions as atoms and skills exist; entities/relationships missing. |
| Compile | Task-specific context packs with token budget and evidence | ✅ Done for current standalone context command. |
| Execute | At least manual skills with dry-run and policy-gated execution | 🟡 Partial - dry-run exists; policy-gated execution missing. |
| Govern | Secret/PII redaction, permission-aware retrieval, approval for side effects | ❌ Missing. |
| Observe | Trace ID, audit log, eval suite, feedback loop | 🟡 Partial - compile and skill-run traces exist; audit/evals/feedback missing. |
| Expose | CLI first, then API/MCP | 🟡 Partial - CLI exists; API/MCP missing. |

Suggested Audit Tracking Format

Use this table in implementation reviews:

| ID | Status | Evidence | Gap | Owner | Target |
|---|---|---|---|---|---|
| 1 | 🟡 | brain/company_brain/compiler.py, query.py, skill_runner.py | No governed action runtime | TBD | P0 |
| 37 | 🟡 | brain/company_brain/query.py uses SQLite FTS and scoring helpers | No vector, recency, authority, or graph ranking | TBD | P0 |
| 57 | 🟡 | brain/company_brain/skill_runner.py flags approval reasons | No approval workflow integration | TBD | P0 |
| 83 | 🟡 | company_brain_run and procedure_skill_run tables | Query/context traces are not persisted | TBD | P0 |

Code-Level Vetting Map

| Checklist Area | Suggested Module/File | Current Evidence | What To Validate Next |
|---|---|---|---|
| Core models | brain/company_brain/models.py | Workspace, source document, atom, evidence, skill, search, review, run models exist. | Add entity, relationship, decision record, open loop, memory mutation, skill version models. |
| Durable store | brain/company_brain/datastore.py | Standalone SQLite CRUD for documents, atoms, receipts, skills, runs. | Add migrations, entities, relationships, decisions, mutations, tombstones. |
| Ingestion pipeline | brain/company_brain/ingest/pipeline.py | Missing. | Build incremental pipeline with diagnostics. |
| Parsers | brain/company_brain/ingest/parsers.py | Markdown parsing lives inside compiler.py. | Split parsers and add TXT/JSON/YAML/PDF/code docs. |
| Deduplication | brain/company_brain/ingest/dedupe.py | Deterministic entry dedupe exists in compiler. | Add hash and semantic dedupe across source versions. |
| Extraction | brain/company_brain/ingest/extractors.py | Process/policy/decision/open-loop extraction exists in compiler. | Split extraction, add entities/relationships/owners. |
| Freshness | brain/company_brain/ingest/freshness.py | Atom freshness fields exist. | Compute and rank with freshness; support supersession. |
| Connectors | brain/company_brain/connectors/ | Missing. | Add local files and Git first. |
| Graph | brain/company_brain/graph/ | Missing. | Entity resolution, aliases, relationship traversal. |
| Skills | brain/company_brain/skills/ | Skill compilation/export in compiler and dry-run in skill_runner.py. | Split model/compiler/validator/executor/exporter modules. |
| Context compiler | brain/company_brain/context/ | Context assembly exists in query.py. | Split compiler/packer/redactor; add role/permission/freshness packing. |
| Evals | brain/company_brain/evals/ | Missing. | Retrieval, faithfulness, freshness, permission, skill eval suites. |
| API | brain/company_brain/api.py | Missing. | Search, answer, context, skills, feedback, audit API surface. |
| CLI | brain/cli/company_brain.py | Registered under sb company-brain. | Consider moving feature-local CLI assembly or adding aliases/nested target commands. |

Prefer one consistent naming convention.

Official command:

sb company-brain init
sb company-brain ingest run
sb company-brain search "how do we handle releases?"
sb company-brain answer "why did we choose LanceDB?"
sb company-brain context build --task "debug retrieval issue"
sb company-brain skills list
sb company-brain skills validate skills/release.yaml
sb company-brain skills run release.checklist --dry-run
sb company-brain eval run --suite core
sb company-brain audit show --trace <trace_id>

Short alias:

sb cb init
sb cb ingest run
sb cb search "how do we handle releases?"
sb cb answer "why did we choose LanceDB?"
sb cb context build --task "debug retrieval issue"
sb cb skills list
sb cb eval run --suite core

Support both:

company-brain = official command
cb = short alias
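
Supporting both names does not require duplicating command definitions. The sketch below uses argparse subparser aliases to register cb alongside company-brain, and query alongside the target name answer. It assumes nothing about how brain/cli/company_brain.py is actually built; the parser layout and the handler labels are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an 'sb' parser where 'cb' is an alias of 'company-brain'."""
    parser = argparse.ArgumentParser(prog="sb")
    groups = parser.add_subparsers(dest="group", required=True)

    company_brain = groups.add_parser(
        "company-brain", aliases=["cb"], help="Company Brain commands"
    )
    # Record the canonical group name regardless of which alias was typed.
    company_brain.set_defaults(group_name="company-brain")
    commands = company_brain.add_subparsers(dest="command", required=True)

    search = commands.add_parser("search", help="search memory for ranked evidence")
    search.add_argument("query")
    search.set_defaults(handler="search")

    # 'query' stays available as a compatibility alias for the target name 'answer'.
    answer = commands.add_parser("answer", aliases=["query"], help="grounded answer")
    answer.add_argument("question")
    answer.set_defaults(handler="answer")

    return parser
```

Dispatching on the handler default rather than on the typed command name keeps downstream code independent of which spelling the user chose.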

Current CLI differences to close:

| Target | Current Equivalent | Gap |
|---|---|---|
| sb company-brain init | sb company-brain db init | Add top-level convenience command. |
| sb company-brain ingest run | sb company-brain db store / db ingest-local | Add connector-oriented ingest namespace. |
| sb company-brain answer | sb company-brain query | Add alias or rename with compatibility. |
| sb company-brain context build --task | sb company-brain context "task" | Add nested target command. |
| sb company-brain skills run --dry-run | sb company-brain skills run (dry-runs by default) | Add explicit flag before real execution exists. |
| sb cb ... | Missing | Add short alias if desired. |

Expose these from the SecondBrain MCP server:

company_brain.search
company_brain.answer
company_brain.context_pack
company_brain.get_skill
company_brain.run_skill
company_brain.validate_skill
company_brain.record_decision
company_brain.record_feedback
company_brain.find_open_loops
company_brain.resolve_entity
company_brain.audit_trace

Suggested P0 Implementation Backlog

| Step | Work Item | Module | Current Status |
|---|---|---|---|
| 1 | Define Pydantic models for workspace, source document, atom, entity, relationship, decision record, procedure skill, evidence receipt, memory mutation | brain/company_brain/models.py | 🟡 Partial - core storage models exist; graph/mutation models missing. |
| 2 | Add SQLite tables and repository methods | brain/company_brain/datastore.py | 🟡 Partial - docs/atoms/evidence/skills/runs exist; graph/mutations missing. |
| 3 | Build local files connector | brain/company_brain/connectors/local_files.py | 🟡 Partial - local Markdown compile exists, connector module missing. |
| 4 | Build Git repo connector | brain/company_brain/connectors/git.py | ❌ Missing. |
| 5 | Add ingestion pipeline with chunking, hash-based incremental updates, and error reporting | brain/company_brain/ingest/pipeline.py | ❌ Missing. |
| 6 | Add extraction layer for entities, decisions, facts, open loops | brain/company_brain/ingest/extractors.py | 🟡 Partial - compiler extracts decisions/open loops/processes/policies; entities missing. |
| 7 | Add hybrid retrieval integration with current SecondBrain retriever | brain/company_brain/api.py or query.py | 🟡 Partial - FTS and scoring reuse exist; vector integration missing. |
| 8 | Add grounded answer function with receipts and no-evidence fallback | brain/company_brain/api.py | 🟡 Partial - CLI query has deterministic answer; no API function module. |
| 9 | Add context compiler with token budget and evidence preservation | brain/company_brain/context/compiler.py | 🟡 Partial - implemented in query.py; split module later. |
| 10 | Add manual skill schema, validation, and dry-run | brain/company_brain/skills/ | 🟡 Partial - dry-run exists; validation/import missing. |
| 11 | Add policy checks for retrieval and skill execution | existing policy layer plus brain/company_brain/ | ❌ Missing. |
| 12 | Add CLI commands | brain/cli/company_brain.py | 🟡 Partial - core commands exist; target aliases/namespaces missing. |
| 13 | Add eval suites for retrieval, faithfulness, freshness, permissions | brain/company_brain/evals/ | ❌ Missing. |
| 14 | Add README/demo dataset | examples/company_brain/ or docs | ❌ Missing. |

Final Warning

The most important failure mode is becoming "RAG over my repo." A real Company Brain must also model decisions, ownership, workflows, skills, freshness, policy, and feedback-driven memory mutation.

For SecondBrain, brain/company_brain/ should become the organizational memory and skill compiler module, not merely another retrieval package.