Ingest And Context

This guide covers the current ingestion path, promoted knowledge flow, and context-compilation surfaces.

Purpose

  • bring local files into the system
  • understand the difference between raw capture, promotion, and compiled runtime context
  • verify the state used by sb ask, sb chat, and sb data-agent

Who It Is For

  • operators building a local knowledge base
  • contributors changing ingest or retrieval behavior
  • users debugging why an answer did or did not see a document

Core Commands

sb capture note "Remember this"
sb capture file path/to/file.pdf
sb capture web https://example.com
sb capture meeting --title "Weekly Sync" --file transcript.md
sb capture status
sb knowledge shravan add "I learned ..." --source book --title "..."
sb knowledge manan reflect <shravan_id>
sb knowledge nididhyasan implement <manan_id>
sb knowledge status
sb ingest <path>
sb ingest-pipeline <path>
sb ingest-refresh
sb ingest-stats
sb ingest-inspect <artifact_id>
sb context index
sb context sources
sb context compile "query"
sb context annotate
sb context cards

Use sb <command> --help for exact options and newer flags.

1. Ingest local material

sb capture file path/to/doc.pdf
sb ingest vault/

Use sb capture ... for one-off notes, web references, meetings, clipboard content, and single files. Use sb ingest for folders, which is the normal path. It can register sources for later refresh and uses the promotion-aware capture pipeline by default.
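
The split between single-file capture and folder ingest can be sketched as a small dispatcher. This is a minimal illustration, not part of sb: the ingest_path function and the SB variable are hypothetical names, and only the two commands shown above are used.

```shell
# Sketch only: SB and ingest_path are hypothetical helpers, not part of sb.
# Set SB=echo to print the commands instead of running them.
SB="${SB:-sb}"

ingest_path() {
  # $1: a file or directory to bring into the system
  if [ -d "$1" ]; then
    # Folders go through the normal ingest path.
    "$SB" ingest "$1"
  elif [ -f "$1" ]; then
    # Single files are treated as one-off captures.
    "$SB" capture file "$1"
  else
    echo "ingest_path: no such file or directory: $1" >&2
    return 1
  fi
}
```

For example, ingest_path vault/ would run sb ingest vault/, while ingest_path path/to/doc.pdf would run sb capture file path/to/doc.pdf.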

For source-aware intake that should be deliberately assimilated, start a knowledge lifecycle record:

sb knowledge shravan add "Agents need eval harnesses before production" \
  --source engineering_discussion \
  --title "Agent eval discussion" \
  --authority secondary \
  --trust-score 0.8

This records what was heard or read, the source authority, provenance, why it matters, and open questions before the idea is promoted into long-term memory.

2. Run the explicit promotion pipeline when you need step visibility

sb ingest-pipeline vault/

This is the best debugging path when you want to reason about add, normalize, score, and promote as separate phases.
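
When debugging a pipeline run, it can help to bracket it with sb ingest-stats so the before and after state sit side by side. The wrapper below is a sketch under that assumption: pipeline_with_stats and SB are hypothetical names, and what the stats output contains is not specified here.

```shell
# Sketch only: SB and pipeline_with_stats are hypothetical helpers, not part of sb.
# Set SB=echo to print the commands instead of running them.
SB="${SB:-sb}"

pipeline_with_stats() {
  # $1: folder to run through the explicit pipeline
  echo "== stats before =="
  "$SB" ingest-stats || return 1
  echo "== pipeline =="
  "$SB" ingest-pipeline "$1" || return 1
  echo "== stats after =="
  "$SB" ingest-stats
}
```

Running pipeline_with_stats vault/ stops at the first failing command, so a broken pipeline run does not print a misleading "after" section.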

3. Build the optional offline index

sb context index
sb context sources

sb context index compiles the offline context index from promoted knowledge plus bounded evidence; sb index runs the same operation. Use sb context sources when you want to see which files, docs, or persisted records actually contributed rows to the compiled index.
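
A quick way to confirm that one document reached the compiled index is to search the sources listing for its path. The sketch below assumes that sb context sources prints contributing paths as plain text, which is not confirmed here; source_in_index and SB are hypothetical names.

```shell
# Sketch only: SB and source_in_index are hypothetical helpers, not part of sb.
# Assumes the sources listing is plain text that includes file paths.
SB="${SB:-sb}"

source_in_index() {
  # $1: path fragment to look for, matched as a fixed string (not a regex)
  "$SB" context sources | grep -F -- "$1" > /dev/null
}
```

The function returns success when the fragment appears in the listing, so it can be used as source_in_index "vault/report.pdf" || echo "not in compiled index".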

4. Preview the runtime context for a question

sb context compile "What changed this week?"

Use this when you want to inspect what the runtime would assemble before you ask through chat or Data Agent.
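
To see whether a new ingest changed what the runtime would assemble, one option is to save the compiled preview before and after and compare the two files. This is a sketch, not an sb feature: snapshot_context, context_changed, and SB are hypothetical names, and the comparison is only meaningful if the compiled output is stable between runs, which is an assumption.

```shell
# Sketch only: SB, snapshot_context, and context_changed are hypothetical helpers.
# Set SB=echo to print the command instead of running it.
SB="${SB:-sb}"

snapshot_context() {
  # $1: question to compile, $2: file to write the preview to
  "$SB" context compile "$1" > "$2"
}

context_changed() {
  # Succeeds when the two snapshot files differ byte for byte.
  ! cmp -s "$1" "$2"
}
```

A typical sequence would be to snapshot, ingest the new material, snapshot again with the same question, and then call context_changed on the two files.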

Annotation Cards

The context subsystem also supports annotation cards under ops/cards/:

sb context annotate
sb context cards

Use these when you want a lightweight, curated description of a dataset, tool, or subsystem to be visible to retrieval and runtime context assembly.

Mental Model

Think of the current flow as three layers:

  1. raw capture and normalization
  2. promoted knowledge and curated annotations
  3. compiled runtime context for a specific question or task

sb ask, sb chat, and sb data-agent all benefit from the promoted/compiled layers, not just raw file presence.

The knowledge assimilation lifecycle adds a maturity path inside this model:

  1. Shravan: source-aware intake with provenance, authority, and open questions
  2. Manan: reflection that extracts claims, doubts, counterpoints, connections, and principles
  3. Nididhyasan: practice loops that turn reflected knowledge into habits, tasks, rules, checklists, projects, or operating principles

Move an item through the lifecycle and review progress with:

sb knowledge manan reflect <shravan_id>
sb knowledge nididhyasan implement <manan_id>
sb knowledge review --weekly

sb knowledge status shows how many items are only captured, reflected but not practiced, or already converted into practice.
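
The three status buckets map onto the lifecycle stages as sketched below. This is purely illustrative: knowledge_bucket is a hypothetical function, and describing an item by two yes/no flags is an assumption made for the example, not a description of how sb stores or counts items.

```shell
# Illustration only: knowledge_bucket is hypothetical and does not call sb.
knowledge_bucket() {
  # $1: "yes" if the item has a Manan reflection, otherwise "no"
  # $2: "yes" if the item has a Nididhyasan practice record, otherwise "no"
  if [ "$2" = "yes" ]; then
    echo "converted into practice"
  elif [ "$1" = "yes" ]; then
    echo "reflected but not practiced"
  else
    echo "only captured"
  fi
}
```

An item with a reflection but no practice record, knowledge_bucket yes no, falls into the "reflected but not practiced" bucket.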

Common Checks

sb ingest-stats
sb ingest-inspect <artifact_id>
sb data-agent status
sb context compile "query" --json

Use these when answers look sparse or when a newly ingested source is not showing up where expected.
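
When filing a bug or comparing runs, it can be convenient to collect all four checks into one directory. The helper below is a sketch that uses only the commands listed above; collect_checks, SB, and the output file names are hypothetical choices.

```shell
# Sketch only: SB, collect_checks, and the file names are hypothetical.
# Set SB=echo to print the commands into the files instead of running sb.
SB="${SB:-sb}"

collect_checks() {
  # $1: question to compile, $2: artifact id to inspect, $3: output directory
  mkdir -p "$3" || return 1
  # Each check writes to its own file so the outputs can be read separately.
  "$SB" ingest-stats                > "$3/ingest-stats.txt"
  "$SB" ingest-inspect "$2"         > "$3/ingest-inspect.txt"
  "$SB" data-agent status           > "$3/data-agent-status.txt"
  "$SB" context compile "$1" --json > "$3/context.json"
}
```

The function's exit status is that of the final compile step, so check the individual files if an earlier command may have failed.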