Documentation
Code intelligence for AI agents and the humans alongside them.
Prism turns a codebase into a queryable knowledge layer. A drop-in replacement for grep and “read the whole file” that pays for its tokens many times over.
Overview
Two surfaces, one index
Instead of giving an AI assistant raw grep and a few thousand tokens of guesswork, Prism gives it a hosted index that understands your repo: its symbols, its imports, its architectural shape, and the natural-language meaning of every function in it.
The same gateway, the same tokens, the same index. Two ergonomics for two audiences.
prism
Coding agents
Claude Code, Cursor, Continue, and any editing agent call the CLI directly. Automatic git-overlay is on by default, so unpushed local edits are visible to search and refactor tools before they are committed.
Hosted Gateway
Non-editing agents
JWT-authenticated, per-tenant Postgres-schema-isolated. Wiki agents, architect agents, ChatGPT, Claude Desktop, and custom internal tooling consume it over Streamable HTTP.
What makes Prism different
Four-leg hybrid retrieval
Most “hybrid” tools fuse vector + BM25. Prism fuses four legs via Reciprocal Rank Fusion (k=60): vector, BM25, description-vector, and exact ILIKE.
Architecture intelligence
PageRank + Leiden clustering means Prism knows which files matter and which files belong together. get_module_map and get_architecture return ranked components, not file lists.
Live unpushed-file overlay
The CLI runs git status, attaches dirty files as an overlay payload, and the gateway re-parses them in-memory before answering. No one else does this without forcing the agent to ship full-file payloads on every call.
Adaptive response shaping
Compact responses run ~200–400 tokens and responses with snippets ~500–3000; large result sets are windowed automatically. Confidence and provenance are stamped on every response so agents know when to act vs. reformulate.
Reuse-first discipline as a tool
check_exists matches a natural-language intent against LLM descriptions to surface existing helpers before the agent writes a duplicate. prepare_to_edit bundles source, top callers, test files, and rules in one call.
Per-tenant schema isolation
Not row-level filtering, not application-layer auth: physical schema separation as tenant_{id}. One SQL injection in a tenant query cannot escape the tenant.
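To make the tenant_{id} naming concrete, here is a minimal sketch of deriving a schema name from a tenant id and scoping a session to it. The helper names, the allowed character set, and the use of search_path are assumptions for illustration, not Prism's actual implementation.

```python
import re

# Assumption: tenant ids are lowercase alphanumerics/underscores. The length cap
# keeps "tenant_" + id within PostgreSQL's 63-byte identifier limit.
_TENANT_ID = re.compile(r"[a-z0-9_]{1,56}")


def tenant_schema(tenant_id: str) -> str:
    """Return the schema name for a tenant, rejecting anything that is not a plain id."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"


def search_path_sql(tenant_id: str) -> str:
    """Build a statement that pins a session to one tenant's schema (hypothetical usage)."""
    return f'SET search_path TO "{tenant_schema(tenant_id)}"'


print(search_path_sql("acme_42"))
```

Validating the id before it ever reaches SQL means the schema name is built only from a known-safe alphabet.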
The Indexing Pipeline
Eleven steps, one knowledge layer
The pipeline is a topologically-ordered DAG of self-registering steps (Kahn's algorithm). Each step is independently re-runnable; structural-change detection automatically flips the pipeline between delta and full-rebuild modes per step.
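As a rough illustration of the ordering described above, the following sketch runs Kahn's algorithm over a dependency map. The step names come from this page, but the dependency edges in STEP_DEPS are assumed for the example and may not match the real pipeline.

```python
from collections import deque


def kahn_order(deps: dict[str, set[str]]) -> list[str]:
    """Topologically order steps; deps maps each step to the steps that must run first."""
    indegree = {step: len(reqs) for step, reqs in deps.items()}
    dependents: dict[str, list[str]] = {step: [] for step in deps}
    for step, reqs in deps.items():
        for req in reqs:
            dependents[req].append(step)

    # Start with steps that have no prerequisites; sorting keeps the order deterministic.
    ready = deque(sorted(step for step, degree in indegree.items() if degree == 0))
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in sorted(dependents[step]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(deps):
        raise ValueError("cycle detected in step dependencies")
    return order


# Hypothetical dependency edges between a subset of the steps.
STEP_DEPS = {
    "clone": set(),
    "scan": {"clone"},
    "chunk_and_embed": {"scan"},
    "build_repo_map": {"scan"},
    "persist_edges": {"build_repo_map"},
    "analyze_architecture": {"persist_edges"},
    "enrich_chunks": {"chunk_and_embed", "analyze_architecture"},
}

order = kahn_order(STEP_DEPS)
print(order)
```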
clone
orchestration
Shallow git clone into a per-job tempdir, with the working tree wiped before each run.
→ Safe parallel indexing across many tenants on shared infrastructure.
scan
orchestration
SHA-256 hash per file, diffed against file_index_state. Tags every file as created, modified, deleted, or unchanged.
→ A typical PR-sized push reindexes in seconds, not minutes.
chunk_and_embed
vector embedding
Tree-sitter AST chunking at function, class, and method boundaries. Wave-bounded concurrent embedding via Vertex AI gemini-embedding-001 (2000-dim). COPY bulk store.
→ Function-level search hits instead of random text windows. Large repos index reliably under quota turbulence.
index_rules
orchestration
Parses .cursor/rules/*.mdc and .prule/* files at section granularity, then embeds each section.
→ An agent can pull a single rule section without loading the whole 200-line rule file. Saves ~80–90% of the context.
populate_symbol_refs
orchestration
Tree-sitter scope analysis across 11 languages. Extracts every symbol reference and tags it as definition, import, call, or assignment.
→ Refactoring is safe. “Find all callers of User.save” returns the real blast radius, not noise from imports and prose.
generate_descriptions
LLM
Anthropic Batch API generates a one-sentence summary per chunk, embeds the summary, and stores it as a second embedding column alongside the code embedding.
→ Natural-language queries work as well as exact-keyword queries. search_code("rate limiting sliding window") finds code that doesn't contain those words.
build_repo_map
graph algorithm
Import-graph extraction across all source languages, then PageRank over the graph. Outputs per-module importance_score (0–100) and tier (core / significant / peripheral).
→ The same algorithm that built Google Search, applied to your codebase. The foundational files automatically rise to the top.
persist_edges
orchestration
Persists import edges to the database. Two modes: full rebuild on structural change, delta update for content-only edits. The pipeline picks the right mode automatically.
→ Webhook-driven incremental indexing stays fast even on monorepos with thousands of modules.
analyze_architecture
graph algorithm
Leiden clustering over the import graph (modern community-detection algorithm, successor to Louvain). Produces components with LLM-generated summaries, capability index, and divergence detection.
→ Architects see the system as it really behaves, not as the folder tree suggests. Drift is flagged before it degrades the architecture.
enrich_chunks
orchestration
UPDATE-JOIN propagating component_id from module_summaries back into code_chunks.
→ Search results know which architectural component they belong to. Agents scope queries to "the auth component" without knowing the folder structure.
(post-pipeline)
orchestration
PostgreSQL tsvector triggers maintain the BM25 full-text index automatically as data flows in.
→ The exact-keyword and BM25 legs of search are always in sync with the latest pipeline run, with zero operational overhead.
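The build_repo_map step above ranks modules with PageRank over the import graph. The sketch below shows the idea with plain power iteration on a toy graph. The module names, the damping factor, and the scaling of scores so the top module gets 100 are assumptions, not necessarily how Prism computes importance_score.

```python
def pagerank(imports: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank. An edge A -> B means module A imports module B."""
    nodes = sorted(set(imports) | {t for targets in imports.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by modules that import nothing is spread evenly over all modules.
        dangling = sum(rank[node] for node in nodes if not imports.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src, targets in imports.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def importance_scores(rank: dict[str, float]) -> dict[str, int]:
    """Scale ranks so the most important module scores 100 (assumed scaling)."""
    top = max(rank.values())
    return {node: round(100 * value / top) for node, value in rank.items()}


# Toy import graph with hypothetical module names.
IMPORTS = {
    "api.py": ["db.py", "auth.py"],
    "auth.py": ["db.py"],
    "cli.py": ["db.py"],
    "db.py": [],
}

rank = pagerank(IMPORTS)
scores = importance_scores(rank)
print(sorted(scores.items(), key=lambda item: -item[1]))
```

In this toy graph db.py is imported by everything else, so it ends up with the highest score.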
Hybrid Retrieval
Four legs, one ranked answer
Every query is evaluated against four independent retrieval strategies, then fused with Reciprocal Rank Fusion (k=60, Cormack et al. 2009).
Vector cosine
Code embeddings over tree-sitter AST chunks. Finds code by structural and lexical similarity.
BM25 (tsvector)
PostgreSQL full-text index over identifiers, keywords, and comments. Updated automatically by triggers.
Description vector
LLM-summarized chunks indexed as a second embedding. Lets natural-language queries match what the code does, not what it says.
Exact ILIKE
Case-insensitive substring matching over the indexed source, covering literal grep-style lookups so agents do not need to fall back to text search.
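The fusion of the four legs can be sketched as follows. This is a minimal Reciprocal Rank Fusion with k=60, where each leg contributes 1 / (k + rank) for every result it returns. The leg names mirror this page; the symbol names and ranked lists are made-up sample data.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: dict[str, list[str]], k: int = 60):
    """Fuse ranked lists. Returns (id, score, contributing_legs) tuples, best first."""
    scores: dict[str, float] = defaultdict(float)
    legs: dict[str, list[str]] = defaultdict(list)
    for leg, ranked_ids in rankings.items():
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
            legs[doc_id].append(leg)
    return sorted(
        ((doc_id, score, legs[doc_id]) for doc_id, score in scores.items()),
        key=lambda item: item[1],
        reverse=True,
    )


# Sample per-leg rankings (hypothetical results, best match first).
rankings = {
    "vector": ["RateLimiter.allow", "Throttle.check", "Cache.get"],
    "bm25": ["Throttle.check", "RateLimiter.allow"],
    "description": ["RateLimiter.allow", "Quota.consume"],
    "ilike": [],
}

fused = reciprocal_rank_fusion(rankings)
for doc_id, score, contributing in fused:
    print(f"{doc_id:20s} {score:.5f} {contributing}")
```

Because only ranks are used, the legs need no score normalization, and a result that several legs agree on rises above one that a single leg ranks first.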
Every result carries provenance
Responses include a confidence band (high / medium / low), the legs that contributed to each result, and a next_actions hint so the agent knows when to act vs. reformulate.
{
  "confidence": "high",
  "provenance": ["vector", "description", "bm25"],
  "results": [
    {
      "symbol": "RateLimiter.allow",
      "file": "src/middleware/rate-limit.ts",
      "line": 88,
      "description": "Sliding-window rate limiter keyed by tenant id."
    }
  ],
  "next_actions": ["prism prepare-edit RateLimiter.allow"],
  "indexed_at": "2026-05-06T09:14:22Z",
  "index_age_seconds": 142
}
Always-Fresh Index
Incremental, branch overlays, webhooks
A code index that's a day stale is wrong. Prism is engineered so the index reflects the repo within seconds of a push.
Incremental indexing
The scan step hashes every file and compares the hashes to file_index_state. The pipeline only re-chunks, re-embeds, and re-describes the delta. PageRank rebuilds only on structural change; content-only edits skip the global rebuild. On a 40k-chunk repo, the update for a typical PR completes in seconds.
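The hash-and-compare idea behind the scan step might look like the sketch below. A plain dict stands in for the file_index_state table, and the function names are illustrative assumptions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def classify_files(current: dict[str, bytes], previous: dict[str, str]) -> dict[str, str]:
    """Tag each path as created, modified, deleted, or unchanged.

    current maps path -> file bytes; previous maps path -> stored hash
    (a stand-in for file_index_state).
    """
    status = {}
    for path, data in current.items():
        if path not in previous:
            status[path] = "created"
        elif previous[path] != sha256_hex(data):
            status[path] = "modified"
        else:
            status[path] = "unchanged"
    for path in previous:
        if path not in current:
            status[path] = "deleted"
    return status


previous = {"a.py": sha256_hex(b"print(1)\n"), "b.py": sha256_hex(b"x = 1\n"),
            "old.py": sha256_hex(b"")}
current = {"a.py": b"print(1)\n", "b.py": b"x = 2\n", "new.py": b"pass\n"}
print(classify_files(current, previous))
```

Only paths tagged created or modified would need re-chunking and re-embedding; deleted paths only need their rows removed.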
Branch overlays
Long-lived feature branches get a delta-only overlay: only files that differ from the default branch are stored, tagged with the branch name. Search results scoped to the branch combine base rows with overlay rows, with the overlay winning on changed files.
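A minimal sketch of the "overlay wins on changed files" rule, assuming each result row is a dict with a file key. The row shape is hypothetical, and files deleted on the branch are not handled here.

```python
def merge_with_overlay(base_rows: list[dict], overlay_rows: list[dict]) -> list[dict]:
    """Combine default-branch rows with branch-overlay rows; the overlay wins per file."""
    overlaid_files = {row["file"] for row in overlay_rows}
    kept = [row for row in base_rows if row["file"] not in overlaid_files]
    return kept + list(overlay_rows)


base = [
    {"file": "a.py", "symbol": "f"},
    {"file": "b.py", "symbol": "g"},
]
overlay = [{"file": "b.py", "symbol": "g_renamed"}]

merged = merge_with_overlay(base, overlay)
print(merged)
```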
Three-layer cleanup keeps overlays from accumulating: GitHub delete webhook (fast), PR-merge webhook (backup), and a 14-day staleness reaper (final safety net). Agents create and delete overlays via create_branch_index / delete_branch_index.
Webhook triggers
GitHub push events hit a Cloud Run webhook that runs the pipeline in incremental mode, scoped to changed files. When Vertex AI returns HTTP 429, the pipeline backs off for 90s and retries up to 2 times. The Console exposes POST /api/v1/repos/{repo_id}/backfill-descriptions for re-running just the LLM-description step over historical chunks.
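The 429 handling described above, a 90-second back-off with up to 2 retries, could be sketched like this. RateLimited and call_with_backoff are hypothetical names, and the sleep function is injectable so the example does not actually wait.

```python
import time


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from the embedding service."""


def call_with_backoff(fn, retries: int = 2, delay_s: float = 90.0, sleep=time.sleep):
    """Call fn(); on RateLimited, wait delay_s and retry, at most `retries` times."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == retries:
                raise  # retries exhausted, surface the error
            sleep(delay_s)


# Demo with a fake sleep that only records the requested delays.
attempts = []
waits = []


def flaky_embed():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimited()
    return "embedded"


print(call_with_backoff(flaky_embed, sleep=waits.append), waits)
```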
Freshness telemetry
Every API response carries indexed_at and index_age_seconds. Agents that get back an index_age_seconds > 3600 know to either trust-with-skepticism or trigger a reindex.
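An agent-side check on these fields might look like the following sketch. The field names come from the response example on this page; the returned labels and the rule that a low confidence band means "reformulate" are assumptions.

```python
def next_step(response: dict, max_age_seconds: int = 3600) -> str:
    """Decide what to do with a response based on freshness and confidence."""
    age = response.get("index_age_seconds")
    if age is not None and age > max_age_seconds:
        return "reindex_or_verify"  # index may be stale: reindex or double-check results
    if response.get("confidence") == "low":
        return "reformulate"  # assumed policy: weak matches mean the query needs rewording
    return "act"


fresh = {"confidence": "high", "index_age_seconds": 142}
stale = {"confidence": "high", "index_age_seconds": 7200}
weak = {"confidence": "low", "index_age_seconds": 30}
print(next_step(fresh), next_step(stale), next_step(weak))
```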
The CLI
prism — for coding agents
Auto-overlay is on by default for editing-relevant commands. Agents call the CLI exactly like they would call grep, but they get a JSON response from the hosted gateway with the agent's local edits factored in.
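To illustrate how local edits could be collected for an overlay, the sketch below parses the output of git status --porcelain and reads the dirty files. The payload shape ({"overlay": {path: content}}) and the function names are assumptions; the real CLI's wire format is not documented here. Quoted or unusual paths are not handled.

```python
import subprocess
from pathlib import Path


def dirty_paths(porcelain: str) -> list[str]:
    """Extract changed or untracked paths from `git status --porcelain` (v1) output."""
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]
        if "D" in status:
            continue  # deleted files have no content to attach
        if " -> " in path:
            path = path.split(" -> ", 1)[1]  # renames: keep the new path
        paths.append(path)
    return paths


def build_overlay(repo_root: str) -> dict:
    """Read every dirty file under repo_root into a hypothetical overlay payload."""
    output = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout
    files = {}
    for rel_path in dirty_paths(output):
        full_path = Path(repo_root) / rel_path
        if full_path.is_file():  # skips untracked directories
            files[rel_path] = full_path.read_text(encoding="utf-8", errors="replace")
    return {"overlay": files}


sample = " M src/a.py\n?? new.py\nD  old.py\nR  b.py -> c.py\n"
print(dirty_paths(sample))
```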
prism search "<query>"
alias: grep
4-leg hybrid search. Returns ranked results with confidence and provenance.
prism find-refs "<symbol>"
alias: refs
Scope-aware references grouped by kind. --kind callers for refactoring, --kind imports for dependency analysis.
prism def "<symbol>"
Go-to-definition. ~200 tokens. The cheapest navigation tool. Returns suggestions on miss.
prism body "<symbol>"
Full source of one function, class, or method. Use after def when you need the whole implementation.
prism outline <file>
All symbols in a file, nested by class hierarchy. ~50 tokens. Use before reading a large file.
prism module-map [path]
alias: map
PageRank-ranked architecture map. With --query, re-ranks by relevance × PageRank (RRF fusion). The first call when exploring a new area.
prism deps <path>
Module-level import graph (depends-on / depended-on-by).
prism prepare-edit "<symbol>"
One call: source + top 3 callers + test files + relevant rules + public-API warnings. Replaces the 4–5-call recipe before editing.
prism check "<intent>"
Reuse-first guard. “I need to parse an ISO 8601 duration” returns up to 5 existing helpers ranked by description similarity.
prism find <pattern>
Glob-style file finder.
prism ping
Gateway health check.
prism list-repos
All repos visible to the current tenant.
Global flags
--repo Owner/Name
--branch <name>
--auto-overlay / --no-overlay
--pretty
MCP Gateway
prism-hosted — for non-editing agents
Same gateway, MCP-protocol surface. JWT RS256 auth. Per-tenant default repo via the X-Prism-Repo header. Streamable HTTP transport (no SSE session state to lose).
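As a rough sketch of what a tool call over this surface could look like, the helper below assembles the headers and a JSON-RPC tools/call body in the shape MCP uses. The argument name query is an assumption, no endpoint URL is given because none is documented here, and a real MCP client would also perform the protocol's initialize handshake first.

```python
import json


def build_tool_call(token: str, repo: str, tool: str, arguments: dict,
                    request_id: int = 1) -> tuple[dict, bytes]:
    """Return (headers, body) for a single MCP tools/call request."""
    headers = {
        "Authorization": f"Bearer {token}",
        "X-Prism-Repo": repo,
        "Content-Type": "application/json",
        # Streamable HTTP clients accept either a JSON reply or an event stream.
        "Accept": "application/json, text/event-stream",
    }
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }).encode("utf-8")
    return headers, body


headers, body = build_tool_call(
    token="<jwt-rs256-token>",
    repo="myorg/myrepo",
    tool="search_code",
    arguments={"query": "rate limiting sliding window"},  # argument name is assumed
)
print(headers["X-Prism-Repo"], body.decode("utf-8"))
```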
Tool | Purpose | Cost
search_code | 4-leg hybrid search. Returns confidence + provenance + next_actions. Branch and overlay aware. | ~200–1000 tok
get_module_map | PageRank-ranked file or directory hierarchy. Optional query re-ranks by task-relevance × PageRank. Tier classification. | ~200–500 tok
get_file_outline | All symbols in a file, classes nested with their methods. | ~300–1500 tok
get_function_body | Full source of a specific symbol. | ~100–800 tok
get_symbol_definition | Go-to-definition. Batch up to 20 symbols. Fuzzy-match suggestions on miss. | ~100–800 tok
find_references | Refs grouped by definitions / imports / callers / assignments. Batch up to 10 symbols. source_only strips test noise (~80%). | ~200–500 tok
get_dependencies | Module-level import graph with configurable depth. | ~20–400 tok
get_architecture | System overview (mode 1) or component drill-down (mode 2). LLM-summarized capabilities, divergence flags, top entry points. | ~200–4000 tok
get_repo_context | Vision / architecture / readme / decisions docs. Filterable by section. | ~80–800 tok
get_coding_standards | .cursor/rules index, semantic search, or specific section fetch. Audience and category filters. | ~100–500 tok
search_docs | Search restricted to markdown / .mdc files only. Same shape as search_code. | ~200–500 tok
prepare_to_edit | Pre-edit bundle: source + top callers + test files + relevant rules + warnings. | ~500–2000 tok
check_exists | Intent-matched existing-helper finder. | ~300–1000 tok
create_branch_index / delete_branch_index | Overlay lifecycle for long-lived feature branches. | minimal
list_repos / ping | Discovery and health. | ~20–500 tok
Tool profiles
The ?profile= query parameter lets agent frameworks scope which tools are advertised. The wiki profile exposes only navigation, architecture, and docs tools and hides the editing-flow tools.
architect | devlead | developer | reviewer | wiki | full
Authentication
Every request requires a JWT RS256 bearer token. Tokens are scoped to a tenant and a default repository.
Authorization: Bearer <jwt-rs256-token>
X-Prism-Repo: myorg/myrepo
Beyond Code
Wiki Agent + Architect Agent
Turn understanding a codebase from a months-long apprenticeship into a 10-minute conversation. Both agents are built on Prism tool profiles and cite every claim back to a file and line.
Wiki Agent
profile=wiki
“What does this codebase do? How are the major components connected? Walk me through how a request flows from the API to the database.”
Tools called
- get_architecture
- get_module_map
- get_dependencies
- search_docs
- get_function_body
Output
A guided tour with file:line citations. Every claim links back to a real location in the repo.
Perfect for
Onboarding, code-review prep, handover from a departing engineer, executive questions.
Architect Agent
profile=architect
“We're adding rate-limiting per tenant. What's the shape of the existing middleware layer? Are there patterns I should follow? What components would this touch?”
Tools called
- get_architecture
- search_code
- get_coding_standards
- get_dependencies
- get_repo_context
Output
A placement recommendation with reasoning that an Architecture Board would actually accept, anchored in existing patterns and prior ADRs.
Perfect for
Design reviews, refactor scoping, tech-debt triage, reuse-vs-rebuild decisions.
The pitch in one sentence
Prism is the only code-intelligence platform that combines hybrid 4-leg semantic retrieval, PageRank-ranked architecture awareness, LLM-summarized chunk descriptions, scope-aware cross-language refs, branch overlays, and live git-overlay-via-CLI, served from a hosted multi-tenant gateway that scales from a single agent in your editor to a whole organization's AI workforce.