The architecture has been updated
This commit is contained in:
parent
805f7a017e
commit
a01257ead9
1119 changed files with 226 additions and 352 deletions
|
|
@ -1,89 +0,0 @@
|
|||
---
|
||||
sidebar_position: 5
|
||||
title: "Prompt Assembly"
|
||||
description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
|
||||
---
|
||||
|
||||
# Prompt Assembly
|
||||
|
||||
Hermes deliberately separates:
|
||||
|
||||
- **cached system prompt state**
|
||||
- **ephemeral API-call-time additions**
|
||||
|
||||
This is one of the most important design choices in the project because it affects:
|
||||
|
||||
- token usage
|
||||
- prompt caching effectiveness
|
||||
- session continuity
|
||||
- memory correctness
|
||||
|
||||
Primary files:
|
||||
|
||||
- `run_agent.py`
|
||||
- `agent/prompt_builder.py`
|
||||
- `tools/memory_tool.py`
|
||||
|
||||
## Cached system prompt layers
|
||||
|
||||
The cached system prompt is assembled in roughly this order:
|
||||
|
||||
1. agent identity — `SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py`
|
||||
2. tool-aware behavior guidance
|
||||
3. Honcho static block (when active)
|
||||
4. optional system message
|
||||
5. frozen MEMORY snapshot
|
||||
6. frozen USER profile snapshot
|
||||
7. skills index
|
||||
8. context files (`AGENTS.md`, `.cursorrules`, `.cursor/rules/*.mdc`) — SOUL.md is **not** included here when it was already loaded as the identity in step 1
|
||||
9. timestamp / optional session ID
|
||||
10. platform hint
|
||||
|
||||
When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead.
|
||||
|
||||
## API-call-time-only layers
|
||||
|
||||
These are intentionally *not* persisted as part of the cached system prompt:
|
||||
|
||||
- `ephemeral_system_prompt`
|
||||
- prefill messages
|
||||
- gateway-derived session context overlays
|
||||
- later-turn Honcho recall injected into the current-turn user message
|
||||
|
||||
This separation keeps the stable prefix stable for caching.
|
||||
|
||||
## Memory snapshots
|
||||
|
||||
Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
|
||||
|
||||
## Context files
|
||||
|
||||
`agent/prompt_builder.py` scans and sanitizes project context files using a **priority system** — only one type is loaded (first match wins):
|
||||
|
||||
1. `.hermes.md` / `HERMES.md` (walks to git root)
|
||||
2. `AGENTS.md` (recursive directory walk)
|
||||
3. `CLAUDE.md` (CWD only)
|
||||
4. `.cursorrules` / `.cursor/rules/*.mdc` (CWD only)
|
||||
|
||||
`SOUL.md` is loaded separately via `load_soul_md()` for the identity slot. When it loads successfully, `build_context_files_prompt(skip_soul=True)` prevents it from appearing twice.
|
||||
|
||||
Long files are truncated before injection.
|
||||
|
||||
## Skills index
|
||||
|
||||
The skills system contributes a compact skills index to the prompt when skills tooling is available.
|
||||
|
||||
## Why prompt assembly is split this way
|
||||
|
||||
The architecture is intentionally optimized to:
|
||||
|
||||
- preserve provider-side prompt caching
|
||||
- avoid mutating history unnecessarily
|
||||
- keep memory semantics understandable
|
||||
- let gateway/ACP/CLI add context without poisoning persistent prompt state
|
||||
|
||||
## Related docs
|
||||
|
||||
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
||||
- [Session Storage](./session-storage.md)
|
||||
- [Gateway Internals](./gateway-internals.md)
|
||||
Loading…
Add table
Add a link
Reference in a new issue